
Table 1 Characterization of 454 pyrosequenced libraries from the microbial community of biofilms

From: Metagenome analyses of corroded concrete wastewater pipe biofilms reveal a complex microbial system

| | Top pipe (TP) | Bottom pipe (BP) |
|---|---|---|
| Reads | 1 004 530 | 976 729 |
| Average read length (bp) | 370 | 427 |
| Dataset size (10⁸ bp) | 3.2 | 3.7 |
| Reads for analysis§ | 862 893 | 856 080 |
| CAMERA v2 | | |
| COG hits† | 370 393 | 389 807 |
| Pfam hits† | 338 966 | 352 466 |
| TIGRfam hits† | 579 127 | 607 388 |
| MG-RAST v3 | | |
| Reads matching a taxon† | 629 161 | 641 853 |
| Reads matching a subsystem† | 425 346 | 427 295 |
| No. of subsystems (function level) | 5 633 | 6 117 |
| Annotated proteins (%) [SEED] | | |
| Bacteria | 95.5 | 94.1 |
| Archaea | 0.5 | 1.3 |
| Virus | 0.1 | 0.1 |
| Eukaryota | 0.6 | 0.3 |
| Unclassified | 3.3 | 4.2 |
| Comparative metagenome‡ | | |
| Average genome size (Mb) | 3.3 | 3.3 |
| ESC of COG hits | 369 671 | 390 570 |
  1. §Prior to sequence analysis we implemented a dereplication pipeline to identify and remove clusters of artificially replicated sequences [17].
  2. †E-value cut-off of 1e-05.
  3. ‡Average genome size and effective sequence count (ESC) calculated as described by Beszteri et al. [20].
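
The read counts, read lengths and dataset sizes in Table 1 can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline. It assumes that the dataset size equals the number of reads retained for analysis multiplied by the average read length, a relationship the table does not state but which its values are consistent with. The function and variable names are hypothetical.

```python
# Values copied from Table 1. The formula below is an assumption:
# dataset size ~= reads for analysis x average read length.
TABLE_1 = {
    "TP": {"reads": 1_004_530, "reads_for_analysis": 862_893,
           "avg_read_bp": 370, "size_1e8_bp": 3.2},
    "BP": {"reads": 976_729, "reads_for_analysis": 856_080,
           "avg_read_bp": 427, "size_1e8_bp": 3.7},
}


def dataset_size_1e8_bp(n_reads: int, avg_read_bp: float) -> float:
    """Total sequenced bases in units of 10^8 bp, rounded to one decimal."""
    return round(n_reads * avg_read_bp / 1e8, 1)


def retained_fraction(n_total: int, n_kept: int) -> float:
    """Fraction of raw reads kept after dereplication."""
    return n_kept / n_total


if __name__ == "__main__":
    for library, row in TABLE_1.items():
        size = dataset_size_1e8_bp(row["reads_for_analysis"], row["avg_read_bp"])
        kept = retained_fraction(row["reads"], row["reads_for_analysis"])
        print(f"{library}: computed size {size} x 10^8 bp "
              f"(table: {row['size_1e8_bp']}), retained {kept:.1%} of reads")
```

Under this assumption, the computed sizes can be compared directly with the reported "dataset size" row, and the retained fraction shows how much of each library remained after dereplication.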