Table 2 Results of comparison between protein content similarity and 16S rRNA gene percent identity

From: Analysis and comparison of the pan-genomic properties of sixteen well-characterized bacterial genera

| Genus | 16S range | Shared proteins: range | Shared proteins: slope | Shared proteins: R² | Average unique proteins: range | Average unique proteins: slope | Average unique proteins: R² |
|---|---|---|---|---|---|---|---|
| Bacillus | 90.4-100% | 1741-5204 | 231 | 0.83* | 248-3000 | -176 | 0.69* |
| Brucella | 99.9-100% | 2495-3060 | NDᵃ | ND | 154-454 | NDᵃ | ND |
| Burkholderia | 93.8-100% | 2861-6337 | 192 | 0.26* | 337-4554 | -394 | 0.67* |
| Clostridium | 80.3-100% | 917-3333 | 38 | 0.47* | 141-2987 | -60 | 0.36* |
| Lactobacillus | 85.8-100% | 720-2348 | 42 | 0.49* | 235-1595 | -46 | 0.19* |
| Mycobacterium | 91.3-100% | 1258-4327 | 99 | 0.13* | 87-2994 | -151 | 0.47* |
| Neisseria | 98.4-100% | 1470-1794 | -263 | 0.19 | 206-753 | 305 | 0.03 |
| Pseudomonas | 93.1-100% | 2368-5339 | 68 | 0.06* | 383-2847 | -129 | 0.37* |
| Rhizobium | 98.9-99.9% | 3482-4690 | 178 | 0.03 | 1296-2095 | 12 | 0.00 |
| Rickettsia | 97.2-100% | 743-1275 | 92 | 0.49* | 48-556 | 51 | 0.07 |
| Shigella | 97.4-99.7% | 2781-3481 | 122 | 0.13 | 463-1185 | -113 | 0.11 |
| Staphylococcus | 97.4-100% | 1674-2653 | 72 | 0.41* | 49-923 | -18 | 0.02 |
| Streptococcus | 92.6-100% | 929-1954 | 46 | 0.28* | 84-1028 | -35 | 0.15* |
| Vibrio | 90.9-99.8% | 2345-3879 | 142 | 0.81* | 396-2167 | -21 | 0.03 |
| Xanthomonas | 99.8-100% | 2802-3982 | ND | ND | 201-1653 | ND | ND |
| Yersinia | 97.2-100% | 2675-3825 | 347 | 0.94* | 216-1319 | -27 | 0.94* |

  1. For each genus, the range of 16S rRNA gene percent identities for all pairs of isolates from that genus is listed. Under the "shared proteins" heading, "range" indicates the range of shared proteins in pairs of isolates from that genus. The "slope" column indicates the slope of the regression line when the number of shared proteins in each pair of isolates is plotted against their 16S rRNA gene percent identities. The "R²" column contains the square of the standard correlation coefficient between these two variables, and indicates the strength of their relationship. The data under the "average unique proteins" heading are analogous to those under the "shared proteins" heading. Isolates sharing ≥ 99.5% identity of the 16S rRNA gene were not used in the calculation of slope or R². Values marked with "ND" were not determined; despite having different species names, all isolates with sequenced genomes within these genera shared ≥ 99.5% identity of the 16S rRNA gene. An asterisk (*) beside an R² value indicates that it is statistically significant with P-value < 0.05.
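The slope and R² calculation described in the footnote can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `slope_and_r2`, the input layout (a list of `(identity_percent, protein_count)` pairs, one per pair of isolates), and the example numbers are all assumptions. It fits an ordinary least-squares line, drops pairs at or above the 99.5% identity cutoff, and returns `None` where the table would show "ND". The significance test behind the asterisks is not included.

```python
def slope_and_r2(pairs, identity_cutoff=99.5):
    """Least-squares slope and R^2 of protein count versus 16S identity.

    pairs: iterable of (identity_percent, protein_count), one entry per
           pair of isolates (hypothetical input layout).
    Pairs with identity >= identity_cutoff are excluded, following the
    table footnote. Returns None when no fit is possible ("ND").
    """
    kept = [(x, y) for x, y in pairs if x < identity_cutoff]
    if len(kept) < 2:
        return None

    n = len(kept)
    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in kept)
    syy = sum((y - mean_y) ** 2 for _, y in kept)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in kept)

    if sxx == 0 or syy == 0:
        return None  # no variation in one variable; slope or R^2 undefined

    slope = sxy / sxx               # regression slope of y on x
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, r2


# Made-up illustrative data, not taken from the paper.
example_pairs = [(95.0, 1000), (96.0, 1100), (97.0, 1200), (99.8, 5000)]
result = slope_and_r2(example_pairs)
```

In this made-up example the last pair is ignored because its identity exceeds the cutoff. A genus in which every pair shares ≥ 99.5% identity yields `None`, which corresponds to the "ND" entries in the table.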