
Table 2 Results of comparison between protein content similarity and 16S rRNA gene percent identity

From: Analysis and comparison of the pan-genomic properties of sixteen well-characterized bacterial genera

                            Shared proteins          Average unique proteins
Genus           16S range   Range      Slope  R²     Range      Slope  R²
Bacillus        90.4-100%   1741-5204  231    0.83*  248-3000   -176   0.69*
Brucella        99.9-100%   2495-3060  ND     ND     154-454    ND     ND
Burkholderia    93.8-100%   2861-6337  192    0.26*  337-4554   -394   0.67*
Clostridium     80.3-100%   917-3333   38     0.47*  141-2987   -60    0.36*
Lactobacillus   85.8-100%   720-2348   42     0.49*  235-1595   -46    0.19*
Mycobacterium   91.3-100%   1258-4327  99     0.13*  87-2994    -151   0.47*
Neisseria       98.4-100%   1470-1794  -263   0.19   206-753    305    0.03
Pseudomonas     93.1-100%   2368-5339  68     0.06*  383-2847   -129   0.37*
Rhizobium       98.9-99.9%  3482-4690  178    0.03   1296-2095  12     0.00
Rickettsia      97.2-100%   743-1275   92     0.49*  48-556     51     0.07
Shigella        97.4-99.7%  2781-3481  122    0.13   463-1185   -113   0.11
Staphylococcus  97.4-100%   1674-2653  72     0.41*  49-923     -18    0.02
Streptococcus   92.6-100%   929-1954   46     0.28*  84-1028    -35    0.15*
Vibrio          90.9-99.8%  2345-3879  142    0.81*  396-2167   -21    0.03
Xanthomonas     99.8-100%   2802-3982  ND     ND     201-1653   ND     ND
Yersinia        97.2-100%   2675-3825  347    0.94*  216-1319   -27    0.94*
For each genus, the range of 16S rRNA gene percent identities for all pairs of isolates from that genus is listed. Under the "shared proteins" heading, "range" indicates the range of shared proteins in pairs of isolates from that genus. The "slope" column indicates the slope of the regression line when the number of shared proteins in each pair of isolates is plotted against their 16S rRNA gene percent identities. The "R²" column contains the square of the standard correlation coefficient between these two variables, and indicates the strength of their relationship. The data under the "average unique proteins" heading are analogous to those under the "shared proteins" heading. Isolates sharing ≥ 99.5% identity of the 16S rRNA gene were not used in the calculation of slope or R². Values marked with "ND" were not determined; despite having different species names, all isolates with sequenced genomes within these genera shared ≥ 99.5% identity of the 16S rRNA gene. An asterisk (*) beside an R² value indicates that it is statistically significant with P-value < 0.05.
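The slope and R² calculation described in the note can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name `slope_and_r2`, the constant `IDENTITY_CUTOFF`, and the example numbers are all hypothetical. It assumes an ordinary least-squares fit with 16S percent identity on the x-axis, and it does not compute the P-value used for the asterisks.

```python
from typing import Optional, Sequence, Tuple

# Pairs at or above this 16S identity are excluded, per the table note.
IDENTITY_CUTOFF = 99.5


def slope_and_r2(
    identities: Sequence[float],
    counts: Sequence[float],
    cutoff: float = IDENTITY_CUTOFF,
) -> Optional[Tuple[float, float]]:
    """Return (slope, R^2) of counts regressed on identities, or None for "ND"."""
    # Keep only isolate pairs whose 16S identity is below the cutoff.
    kept = [(x, y) for x, y in zip(identities, counts) if x < cutoff]
    if len(kept) < 2:
        return None  # too few pairs left: reported as ND

    n = len(kept)
    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in kept)
    syy = sum((y - mean_y) ** 2 for _, y in kept)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in kept)
    if sxx == 0 or syy == 0:
        return None  # no variation in one variable: fit is undefined

    slope = sxy / sxx                  # least-squares regression slope
    r2 = (sxy * sxy) / (sxx * syy)     # squared Pearson correlation
    return slope, r2


# Made-up pairs: (16S percent identity, number of shared proteins).
example_identities = [90.0, 92.0, 94.0, 96.0, 99.8]
example_shared = [1000, 1200, 1400, 1600, 5000]
print(slope_and_r2(example_identities, example_shared))  # (100.0, 1.0)
```

In this made-up example the pair at 99.8% identity is dropped before fitting, so the remaining four points lie exactly on a line. A genus in which every pair is at or above the cutoff returns `None`, which corresponds to the "ND" entries.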