Table 1

From: Bioinformatic identification of novel regulatory DNA sequence motifs in Streptomyces coelicolor

Matrix | Protein class | Number of ORFs in this set | Number of ORFs in this set belonging to this protein class | P value | Consensus sequence
2083 | 2.1.3 Degradation of polysaccharides | 54 | 10 | 7.557e-10 | See Figure 2A
2318 | 2.2.3 DNA – replication, repair, restriction / modification | 21 | 6 | 7.76e-08 | See Figure 2B
1744 | 1.2.1 Chromosome replication | 24 | 3 | 2.12e-06 | See Figure 2C
1909 | 4.1.7 Gram +ve exported / lipoprotein | 106 | 22 | 9.24e-08 | See Figure 2D
46 | 6.2.1 sigma factor | 116 | 10 | 6.60e-08 | See Figure 2E
2034 | 6.3.13 ArsR | 45 | 4 | 1.89e-06 | See Figure 2F
1853 | 3.3.11 Nucleotide interconversions | 46 | 5 | 4.09e-07 | See Figure 2G
363 | 3.8.0 Secondary metabolism | 9 | 5 | 4.89e-07 | See Figure 2H
571 | 3.8.0 Secondary metabolism | 10 | 5 | 9.621e-07 | See Figure 2H
293 | 3.8.0 Secondary metabolism | 10 | 5 | 9.62e-07 | See Figure 2H
153 | 3.8.0 Secondary metabolism | 18 | 6 | 1.31e-06 | See Figure 2H
Table 1. Position-specific weight matrices (PSWMs) that represent DNA sequence motifs shared by functionally coherent sets of genes in Streptomyces coelicolor. A library of 2497 matrices was generated from alignments of over-represented DNA sequence dyads, as described in the Methods section. Each matrix is essentially a statistical model of a DNA sequence motif [58]. The non-coding regions of the S. coelicolor genome were searched against the matrices to find matches to each of the sequence motifs. The scanning method assigned a score (maximum 100) to each match site, and a minimum score threshold of 80 was used. For each matrix, we recorded the number of genes whose upstream region contains at least one match site. We also recorded how many of those genes belong to each functional category in the protein classification scheme, and calculated a P value to determine whether that functional category was significantly over-represented.
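The scan-and-test procedure in the caption can be illustrated with a minimal Python sketch. Several points are assumptions, not details confirmed by the caption: the 0 to 100 score is modelled here as a min-max rescaling of the summed matrix weights, and the P value is modelled as a one-sided hypergeometric tail probability, a common choice for over-representation that may differ from the test the authors used. The function names (`normalized_score`, `scan`, `hypergeom_tail`) and the toy matrix are hypothetical.

```python
from math import comb


def normalized_score(matrix, site):
    """Score a site against a weight matrix, rescaled to the range 0..100.

    `matrix` is a list of dicts (one per motif position) mapping base -> weight.
    ASSUMPTION: min-max rescaling; the original scanning method may differ.
    """
    raw = sum(col[base] for col, base in zip(matrix, site))
    lowest = sum(min(col.values()) for col in matrix)
    highest = sum(max(col.values()) for col in matrix)
    return 100.0 * (raw - lowest) / (highest - lowest)


def scan(matrix, sequence, threshold=80.0):
    """Return (position, score) for every window scoring at least `threshold`."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = normalized_score(matrix, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


def hypergeom_tail(total_genes, class_size, set_size, class_in_set):
    """P(X >= class_in_set) when `set_size` genes are drawn without replacement
    from `total_genes`, of which `class_size` belong to the protein class.

    ASSUMPTION: hypergeometric model of over-representation.
    """
    upper = min(set_size, class_size)
    favourable = sum(
        comb(class_size, i) * comb(total_genes - class_size, set_size - i)
        for i in range(class_in_set, upper + 1)
    )
    return favourable / comb(total_genes, set_size)


if __name__ == "__main__":
    # Toy 3-position matrix that prefers the site "ACG" (hypothetical weights).
    toy_matrix = [
        {"A": 2, "C": 0, "G": 0, "T": 0},
        {"A": 0, "C": 2, "G": 0, "T": 0},
        {"A": 0, "C": 0, "G": 2, "T": 0},
    ]
    print(scan(toy_matrix, "TTACGTT"))

    # set_size=54 and class_in_set=10 are taken from the first row of Table 1;
    # total_genes and class_size are HYPOTHETICAL placeholders, because the
    # table does not report them.
    print(hypergeom_tail(total_genes=7000, class_size=100,
                         set_size=54, class_in_set=10))
```

Because the genome-wide gene count and the size of each protein class are not given in the table, the sketch cannot be expected to reproduce the reported P values; it only shows the shape of the calculation.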