Department of Computer Science
Computer Learning Research Centre
Royal Holloway, University of London

PlantProm DB

A Database of Plant Promoter Sequences. Release 2002.01

   PlantProm DB is an annotated, non-redundant collection of proximal 
promoter sequences for RNA polymerase II with experimentally determined 
transcription start site(s), TSS, from various plant species. 
   It was developed by the Department of Computer Science at Royal Holloway, 
University of London, in collaboration with Softberry Inc. (USA) 
and is also available at www.softberry.com. 
   The current release of PlantProm DB contains 305 entries, including 71, 220 
and 14 promoters from monocot, dicot and other plants, respectively. 
The following criteria were applied when collecting plant gene promoters. 
·   There is experimental evidence for the TSS position(s) of the gene, 
        published in the literature. For genes with multiple TSSs, the one 
        nearest to the CDS start position is taken if no additional 
        information on the predominance of one of them is available (the 
        positions of the other TSSs are given in the name line of the 
        sequence, which is written in the FASTA format).
·   The length of the known promoter sequence upstream of the chosen TSS is 
        200 bp or more. All stored promoter sequences have the same length, 
        251 bp, where position 201 corresponds to the TSS; that is, the 
        collected sequences occupy the region [-200 : +51], with the TSS at 
        position +1, and thus represent the proximal promoters mentioned above.
·   Each entry corresponds to a gene mapped on a genomic sequence.
·   Various alleles of a gene are represented in the database by a single entry.
·   Genes with more than one non-allelic copy in the genome, as well as 
        paralogous genes, are taken as different entries.
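
The coordinate convention and the TSS selection rule described above can be sketched in Python. This is only an illustration under stated assumptions, not code distributed with PlantProm DB: the function names (to_relative, to_position, choose_tss) are hypothetical, the sketch assumes the usual convention that there is no coordinate 0 (position 200 is -1 and position 201 is +1), and it assumes that TSS and CDS positions are given on the same strand and in the same coordinate system.

```python
PROMOTER_LENGTH = 251  # every stored sequence has this length
TSS_POSITION = 201     # 1-based position of the TSS within a stored sequence


def to_relative(position):
    """Map a 1-based sequence position (1..251) to a TSS-relative coordinate.

    Assumes there is no coordinate 0: position 200 is -1, position 201 is +1.
    """
    if not 1 <= position <= PROMOTER_LENGTH:
        raise ValueError(f"position {position} is outside 1..{PROMOTER_LENGTH}")
    offset = position - TSS_POSITION
    return offset + 1 if offset >= 0 else offset


def to_position(relative):
    """Map a TSS-relative coordinate (-200..-1 or +1..+51) to a 1-based position."""
    if relative == 0 or not -200 <= relative <= 51:
        raise ValueError(f"relative coordinate {relative} is outside [-200 : +51]")
    if relative > 0:
        return TSS_POSITION + relative - 1
    return TSS_POSITION + relative


def choose_tss(tss_positions, cds_start):
    """Pick the TSS nearest to the CDS start (hypothetical helper).

    Applies only when nothing is known about which TSS predominates.
    On a tie, the first candidate in the list is returned.
    """
    if not tss_positions:
        raise ValueError("at least one TSS position is required")
    return min(tss_positions, key=lambda tss: abs(cds_start - tss))
```

For example, to_relative(1) gives -200 and to_relative(251) gives +51, which matches the stored region [-200 : +51].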

PlantProm DB provides the following information. 

1.  DNA sequences of 305 promoter regions [-200:+51], with the TSS at the fixed 
        position 201, from various plant species, in the FASTA format, including:   
1.1.	71 promoters of monocots, 
1.2.	220 promoters of dicots,  
1.3.	14 promoters from other plants, 
1.4.	175 TATA promoters, consisting of 41 monocot, 131 dicot and 3 other 
        plant species sequences,
1.5.	130 TATA-less promoters, consisting of 30 monocot, 89 dicot 
        and 11 other plant species sequences.

2.   Taxonomic and promoter type classification of promoters, including:   
2.1.	List of species represented in the PlantProm DB,   
2.2.	List of genes/gene products and promoter types  represented in the PlantProm DB.
      

3.   Nucleotide Frequency Matrices for canonical promoter elements 
        (TATA-box, CCAAT-box, and TSS-motif or Initiator element, Inr), including:   
3.1.	TATA-matrices for various promoter collections,   
3.2.	CCAAT-matrices for various promoter collections,   
3.3.	TSS-motif-matrices for various promoter collections.   

4.	Location of TATA-boxes in some of the promoter collections mentioned above, including:   
4.1.	171 unrelated promoters from various plant species,    
4.2.	128 unrelated promoters from dicot plants.   

5.	Location of CCAAT-boxes in some of the promoter collections mentioned above, including:   
5.1. 131 unrelated promoters of both (TATA and TATA-less) types from various 
       plant species,    
5.2. 71 unrelated TATA promoters from various plant species,        
5.3. 60 unrelated TATA-less promoters from various plant species.
      
6.	Location of TSS-motifs in some of the promoter collections mentioned above, including:   
6.1. 70 unrelated promoters of both (TATA and TATA-less) types from monocot 
       plants,  
6.2.	217 unrelated promoters of both (TATA and TATA-less) types from dicot plants, 
6.3.	171 unrelated TATA promoters from various plant species, 
6.4.	30 unrelated TATA-less promoters from various plant species. 
      
7.	Short description of the computation of nucleotide frequency matrices for 
        various promoter elements.   
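
As a rough illustration of what a nucleotide frequency matrix contains, the Python sketch below counts, for each column of a set of aligned motif occurrences, the fraction of sites carrying each base. It is a generic sketch and not the procedure documented in PlantProm DB: the function name frequency_matrix and the example sites are hypothetical, and the sketch assumes the sites are already aligned and of equal length.

```python
from collections import Counter

ALPHABET = "ACGT"


def frequency_matrix(sites):
    """Return one {base: frequency} dict per column of the aligned sites.

    Frequencies are fractions of all sites, so a column containing
    ambiguous symbols (e.g. N) sums to less than 1 over A, C, G and T.
    """
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    matrix = []
    for column in zip(*sites):  # iterate over alignment columns
        counts = Counter(base.upper() for base in column)
        matrix.append({base: counts[base] / len(sites) for base in ALPHABET})
    return matrix


# Hypothetical aligned TATA-box-like sites, for illustration only.
example_sites = ["TATAAA", "TATATA", "TATAAA", "CATAAA"]
example_matrix = frequency_matrix(example_sites)
```

In this toy example, three of the four sites start with T, so the first column has a T frequency of 0.75 and a C frequency of 0.25.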
Reference: Ilham A. Shahmuradov, Alex J. Gammerman, John M. Hancock, Peter M. Bramley and Victor V. Solovyev. PlantProm: A Database of Plant Promoter Sequences. Nucl. Acids Res. (2002) (in publication).
Acknowledgements: PlantProm Database is partially funded by grant 111/BIO14428, "Pattern Recognition Techniques for Gene Identification in Plant Genomic Sequences", from the UK Biotechnology and Biological Sciences Research Council. It is designed and maintained at Royal Holloway in collaboration with Softberry Inc. (USA), which hosts a mirror site at www.softberry.com.
Send questions and comments to: Ilham Shahmuradov (ilham@cs.ehul.ac.uk) or Victor Solovyev (victor@softberry.com).