 
| | | Core proteomes | | | | Unique proteomes | | | |
|---|---|---|---|---|---|---|---|---|---|
| S | $N_{I}$ | ${N}_{C}^{A}$ | ${N}_{C}^{R}$ | $P_{C}$ | ${N}_{C}^{>}$ | ${N}_{U}^{A}$ | ${N}_{U}^{R}$ | $P_{U}$ | ${N}_{U}^{>}$ |
| Bacillus anthracis | 3 | 4941 | 2123 | ** | 0/25 | 168 | 1 | ** | 0/25 |
| Bacillus cereus | 4 | 2881 | 1840 | ** | 0/25 | 2 | 0 | - | 0/25 |
| Bacillus thuringiensis | 2 | 4255 | 2864 | ** | 5/25 | 4 | 7 | n.s. | 7/25 |
| Brucella abortus | 3 | 2699 | 2603 | ** | 6/25 | 2 | 1 | * | 4/25 |
| Brucella suis | 2 | 3025 | 2760 | ** | 2/24 | 5 | 4 | n.s. | 5/24 |
| Burkholderia ambifaria | 2 | 5609 | 3798 | ** | 1/25 | 198 | 17 | ** | 0/25 |
| Burkholderia cenocepacia | 3 | 5908 | 3352 | ** | 0/25 | 168 | 0 | ** | 0/25 |
| Burkholderia mallei | 4 | 3623 | 3086 | ** | 1/25 | 18 | 0 | - | 0/25 |
| Burkholderia pseudomallei | 4 | 4972 | 3086 | ** | 0/25 | 45 | 0 | - | 0/25 |
| Clostridium botulinum | 8 | 1514 | 763 | ** | 0/25 | 10 | 0 | - | 0/25 |
| Clostridium perfringens | 3 | 2110 | 1085 | ** | 0/25 | 298 | 0 | ** | 0/25 |
| Lactobacillus casei | 2 | 2355 | 959 | ** | 0/25 | 593 | 5 | ** | 0/25 |
| Lactobacillus delbrueckii | 2 | 1372 | 959 | ** | 0/25 | 222 | 5 | ** | 0/25 |
| Lactobacillus reuteri | 2 | 1402 | 959 | ** | 0/25 | 120 | 5 | ** | 0/25 |
| Mycobacterium bovis | 2 | 3822 | 2577 | ** | 1/25 | 36 | 38 | n.s. | 3/25 |
| Mycobacterium tuberculosis | 3 | 3724 | 2118 | ** | 0/25 | 26 | 17 | n.s. | 3/25 |
| Neisseria gonorrhoeae | 2 | 1795 | 1560 | ** | 0/8 | 229 | 3 | ** | 0/8 |
| Neisseria meningitidis | 4 | 1547 | 1426 | ** | 0/14 | 75 | 4 | ** | 0/14 |

Column headings are: S, species; $N_{I}$, number of sequenced isolates of species S; ${N}_{C}^{A}$, core proteome size of the sequenced isolates of S; ${N}_{C}^{R}$, average core proteome size of the randomly generated sets; $P_{C}$, probability that the average core proteome size of the randomly generated sets is different from the core proteome size of the sequenced isolates of S; ${N}_{C}^{>}$, fraction of random sets having a core proteome larger than that of S. ${N}_{U}^{A}$, ${N}_{U}^{R}$, $P_{U}$, and ${N}_{U}^{>}$ are analogous to ${N}_{C}^{A}$, ${N}_{C}^{R}$, $P_{C}$, and ${N}_{C}^{>}$, respectively, and refer to the comparisons involving the number of proteins found in all sequenced isolates of S but in no other isolates from the same genus ("unique proteomes"). In some cases, all of the random sets corresponding to a particular species had zero unique proteins. No P-value could be computed for these cases because the standard deviation of these values was zero; the $P_{U}$ column then contains a dash character (-). The averages in columns ${N}_{C}^{R}$ and ${N}_{U}^{R}$ are rounded to the nearest whole number. For certain rows, column ${N}_{U}^{R}$ shows a value of 0; in some cases this value is exact, while in others it is due to rounding. If it is due to rounding, the standard deviation of the random sets is nonzero, and column $P_{U}$ contains a P-value. For columns $P_{C}$ and $P_{U}$, "n.s." means "not significant", a single asterisk indicates a P-value of less than 0.05, and a double asterisk indicates a P-value of less than 0.001. See Table 4 for the continuation of this table.
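
The rounding and dash rules described above can be illustrated with a minimal sketch. The function name `summarize`, the input values, and the one-sample t statistic are assumptions made for illustration only; the footnote does not say which statistical test produced the P-values, and the random-set values below are hypothetical, not data from the study.

```python
from statistics import mean, stdev


def summarize(actual, random_values):
    """Compare an observed proteome size with the sizes from random sets.

    `actual` plays the role of N^A and `random_values` holds one size per
    random set (both hypothetical inputs).
    """
    avg = mean(random_values)
    sd = stdev(random_values)  # sample standard deviation of the random sets
    n_larger = sum(1 for v in random_values if v > actual)
    n_sets = len(random_values)
    # With zero spread no test statistic is defined, which corresponds to
    # the dash in the P column. The t statistic is an assumed stand-in.
    t_stat = None if sd == 0 else (avg - actual) / (sd / n_sets ** 0.5)
    return {
        "rounded_mean": round(avg),              # analogous to N^R
        "fraction_larger": f"{n_larger}/{n_sets}",  # analogous to N^>
        "t_stat": t_stat,
    }


# Every random set has zero unique proteins: exact 0, no statistic.
exact_zero = summarize(2, [0] * 25)

# One random set has a single unique protein: the mean rounds to 0,
# but the spread is nonzero, so a statistic can still be computed.
rounded_zero = summarize(168, [0] * 24 + [1])
```

In the first call the mean is exactly 0 and no statistic exists, while in the second the displayed 0 is a rounding effect and a test remains possible, which mirrors the two kinds of 0 entries in column ${N}_{U}^{R}$.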