Storage conditions of intestinal microbiota matter in metagenomic analysis
© Cardona et al.; licensee BioMed Central Ltd. 2012
Received: 6 March 2012
Accepted: 20 July 2012
Published: 30 July 2012
The structure and function of the human gut microbiota are currently inferred from metagenomic and metatranscriptomic analyses. Recovery of intact DNA and RNA is therefore a critical step in these studies. Here, we evaluated how different storage conditions of fecal samples affect the quality of extracted nucleic acids and the stability of their microbial communities.
We assessed the quality of genomic DNA and total RNA by microcapillary electrophoresis and analyzed the bacterial community structure by pyrosequencing the 16S rRNA gene. DNA and RNA started to fragment when samples were kept at room temperature for more than 24 h. The use of RNAse inhibitors diminished RNA degradation, but this protection was not consistent among individuals. DNA and RNA degradation also occurred when frozen samples were defrosted for a short period (1 h) before nucleic acid extraction. The same conditions that affected DNA and RNA integrity also altered the relative abundance of most taxa in the bacterial community analysis. In this case, intra-individual variability of microbial diversity was larger than inter-individual variability.
Though this preliminary work explored a very limited number of parameters, the results suggest that storage conditions of fecal samples affect the integrity of DNA and RNA and the composition of their microbial community. For optimal preservation, stool samples should be kept at room temperature and brought to the laboratory within 24 h after collection, or be stored immediately at −20°C in a home freezer and transported afterwards in a freezer pack to ensure that they do not defrost at any time. Mixing the samples with RNAse inhibitors outside the laboratory is not recommended since proper homogenization of the stool is difficult to monitor.
The human gut microbiome is a highly dense microbial ecosystem, largely outnumbering our own eukaryotic body cells. Its intimate contact with our digestive system and its potential role in health and disease states make this ecosystem very attractive for a deep characterization of its composition and function. In recent years, high-throughput sequencing has been the catalyst for analyzing microbial population diversity and functions. While a bacterial 16S rRNA gene survey can answer the question “which species are there”, functional metagenomics can also address “what are they doing” by examining the sequences of genomic fragments and by exploiting, for instance, gene expression analysis by metatranscriptomics [2–4]. These approaches allow the characterization not only of individual organisms and their genes, but also of metabolic and regulatory pathways, functional interactions inside a microbial community and crosstalk between a microbial community and its host.
Functional metagenomic projects are highly interdisciplinary and involve numerous procedures, ranging from clinical protocols for sample collection to bioinformatics tools for data interpretation. Strong biases can be introduced in each of these steps. Sample storage, one of the first steps, is critical for downstream analyses. Previous studies have indicated that the storage conditions of stool samples only modestly affect the structure of their microbial community [5–8]. However, little is known about the influence of storage conditions on deeper structural and functional analyses, which require maximal integrity of genomic DNA and RNA. Intact DNA fragments are critical for metagenomic library construction [9–11] and for characterizing intact genetic pathways by either sequence-based or function screening-based approaches [12, 13]. Moreover, excessive degradation of DNA reduces the efficiency of shotgun sequencing. The recovery of total RNA with high integrity is necessary for proper cDNA synthesis and absolutely essential for describing gene expression in a community sample [4, 14–16].
In the present study, we compared the effect of different storage conditions of stool samples on microbial community composition, genomic DNA and total RNA integrity.
Results and discussion
Effect of storage conditions on genomic DNA
In order to investigate the effect of storage conditions on the quality of genomic DNA, we chose a subset of stool samples collected by 4 volunteers (#1, #2, #3 and #4) that had been stored under the following 6 conditions: immediately frozen at −20°C (F); immediately frozen and then unfrozen (UF) for 1 h or 3 h; kept at room temperature (RT) for 3 h, 24 h or 2 weeks. All 24 samples were then kept at −80°C in the laboratory until genomic DNA was extracted and its integrity analyzed using microcapillary electrophoresis.
Table: Percentage of DNA compared to the frozen samples (columns: % degraded DNA, n = 4; p value when compared to frozen samples).
Even though the mechanical disruption of the samples used in our extraction method could damage the integrity of large DNA molecules, we believe that storage conditions, rather than directly degrading DNA during the storage period or the extraction step, disrupt cellular compartments and activate enzymatic activities (i.e. nucleases). Further studies could be designed to test the effect of different extraction methods, including mechanical or non-mechanical disruption, on DNA integrity.
Effect of storage conditions on microbial diversity
Although storage conditions of stool samples greatly affected the integrity of bacterial DNA, this observation alone did not demonstrate an impediment for metagenomic analyses. In order to verify this point, we examined to what extent storage conditions could bias intestinal microbial composition. Using the genomic DNA extracted from the 24 samples obtained from the 4 above-cited volunteers (#1, #2, #3 and #4), we PCR-amplified the V4 region of the 16S rRNA gene and sequenced the products using a GS FLX 454 pyrosequencer. We obtained a total of 127,275 high quality sequences, which we then analyzed using the QIIME pipeline to determine and compare the microbial diversity.
We validated the presence of a bacterial species or taxon when its abundance was higher than 0.2% in at least one sample. Accordingly, we identified a total of 188 taxa after validating an average of 3,400 sequences and 114 taxa per sample (see Additional file 1: Table S1). These 188 species were classified into 48 genera and 4 phyla as follows: Firmicutes (48%), Bacteroidetes (46%), Actinobacteria (5%) and Proteobacteria (1%).
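The validation rule above (a taxon is retained when its relative abundance exceeds 0.2% in at least one sample) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy counts are hypothetical and are not data or code from the study.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions of the sample total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


def validated_taxa(samples, threshold=0.002):
    """Return taxa whose relative abundance exceeds `threshold` (0.2%) in at least one sample."""
    kept = set()
    for counts in samples.values():
        for taxon, fraction in relative_abundance(counts).items():
            if fraction > threshold:
                kept.add(taxon)
    return kept


# Hypothetical toy counts (1,000 reads per sample), not data from the study.
samples = {
    "sample_a": {"Bacteroides": 600, "Prevotella": 399, "RareTaxon": 1},
    "sample_b": {"Bacteroides": 900, "Prevotella": 97, "OtherTaxon": 3},
}
print(sorted(validated_taxa(samples)))
```

In this toy input, RareTaxon never exceeds 0.2% and is dropped, whereas OtherTaxon reaches 0.3% in one sample and is kept.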
Table: Taxonomic comparison for 3 main bacterial taxa between frozen and unfrozen samples (columns: p value F vs UF1h; p value F vs UF3h; rows include Prevotellaceae;uncultured;human gut metagenome).
Table: Taxonomic comparison for 3 main bacterial taxa between frozen and RT samples (columns: p value F vs RT3h; p value F vs RT24h; p value F vs RT2w; rows include Prevotellaceae;uncultured;human gut metagenome).
To further compare the 24 samples, we used the weighted UniFrac UPGMA method to build a clustering tree. The result showed that frozen samples and 3 h and 24 h room temperature samples tended to cluster together and far from the defrosted and 2 weeks room temperature samples (figure 2C). This analysis also indicated that, under these latter conditions, intra-individual variability became higher than inter-individual variability.
The above analyses on the effect of storage conditions on microbial diversity corroborate previous observations showing a relatively stable community composition when stool samples are kept up to 24 h at room temperature. However, our study reveals that under more prolonged conditions (i.e. 2 weeks at room temperature) or after a change of temperature (i.e. unfreezing samples for only 1 or 3 h), the relative abundances of most taxa in the bacterial community can be greatly altered.
Effect of storage conditions on total RNA
In all the conditions tested, the amount of RNA extracted was above 30 μg per 250 mg of stool, which is adequate for downstream analyses such as qRT-PCR and microarray experiments. When samples were frozen immediately after collection, the extracted RNA had average RIN values above 7, which is the acceptable threshold for conducting metatranscriptomic studies [17, 18].
However, thawing these samples for 1 h or 3 h before starting RNA extraction produced strong RNA degradation, as illustrated in figure 1A by the fading of the 23S rRNA band and the appearance of numerous bands below the 16S rRNA. The decrease in RIN was significant after thawing samples for 1 h (p = 0.006, Wilcoxon paired test) and 3 h (p = 0.004, Wilcoxon paired test) compared to frozen samples. Conversely, when samples were kept at room temperature for a few hours (3 h to 24 h) rather than immediately frozen after collection, the total RNA extracted did not show signs of fragmentation and average RIN values were above 7. Longer storage periods at room temperature (more than 24 h) produced a progressive fragmentation of the RNA. Indeed, the decrease in RIN became significant when samples were kept at room temperature for 48 h (p = 0.036, Wilcoxon paired test). Finally, when samples were kept at room temperature in RNAse inhibitor solution, they showed fewer signs of fragmentation even after 4 weeks (figure 3A). Under these conditions, however, there was large RIN variability among individuals (figure 1B).
Thus, our results indicate that the best storage condition for extracting high quality RNA for metatranscriptomic analyses is to keep the stool samples at room (or low) temperature for no more than a few hours (< 24 h) after collection. Alternatively, samples can be kept at −20°C for longer periods as long as defrosting is prevented until the extraction of RNA starts in the laboratory. The RIN variability observed in samples mixed with RNAse inhibitor could reflect insufficient homogenization of hard stools (type 1 or 2 on the Bristol scale). Although the subjects could be asked to mix their stool more thoroughly after collection, this requirement is difficult to monitor. Therefore, the use of RNAse inhibitors may not be the best choice for semi or large-scale studies.
Our study, although limited by a small sample size and other parameters, suggests that storage conditions of stool samples can largely affect the integrity of extracted DNA and RNA and the composition of their microbial community. In light of our observations, our recommendation for semi or large-scale metagenomic and metatranscriptomic projects is to keep the samples at room temperature and to bring them to the laboratory within the initial 24 hours after collection. Alternatively, if bringing the samples during this period is not possible, samples should be stored immediately at −20°C in a home freezer. In this case, samples need to be transported afterwards in freezer packs to ensure that they do not defrost at any time. Mixing the samples with RNAse inhibitors and keeping them at home for longer periods of time (days) is not recommended since proper homogenization of the stool is difficult to monitor outside the laboratory.
Fecal samples were collected from healthy volunteers (n = 11) who had not received antibiotics within the previous three months. Samples were stored following 3 different procedures, which took into account the volunteers’ compliance. In the first procedure, before being frozen at −80°C, each sample was kept at room temperature (RT) for different time periods (3 h, 24 h, 48 h, 72 h and 14 days). Time points before 3 h were not applicable, since volunteers needed this time to bring the samples from home to the laboratory. In the second protocol, samples were immediately frozen by the volunteers in their home freezer at −20°C and later brought to the laboratory in a freezer pack, where they were immediately stored at −80°C. In order to test the effect of freezing and thawing episodes, some aliquots were defrosted for 1 h and 3 h before being stored at −80°C. In the third protocol, some volunteers agreed to collect their samples in tubes containing the RNAse inhibitor RNAlater® (Ambion) as indicated by the manufacturer's instructions. The tubes were kept at room temperature for different time periods (3 h, 24 h, 14 days and 1 month) before RNA extraction. The protocol was approved by the Ethics Committee of the Vall d'Hebron University Hospital and all participants gave informed consent.
Assessing the quantity and quality of total RNA
For total RNA extraction, we modified the protocol described in Zoetendal et al. [15], which utilizes 15 g of fecal sample. Briefly, 200 mg of fecal sample were mixed with 500 μl TE buffer, 0.8 g Zirconia/silica Beads, 50 μl SDS 10% solution, 50 μl sodium acetate and 500 μl acid phenol. Physical disruption was conducted using a FastPrep apparatus. Following centrifugation of the lysate, nucleic acids were recovered from the aqueous phase and re-extracted with chloroform. DNA was selectively digested and the RNA was purified using the RNeasy® mini kit (Qiagen) as described in the manufacturer's instructions. A detailed protocol is provided in the supplementary information (see Additional file 3: Supplementary Methods).
An equivalent of 1 mg of each fecal sample was used for RNA quantification using a NanoDrop ND-1000 Spectrophotometer (Nucliber). The RNA was then examined by microcapillary electrophoresis using an Agilent 2100 Bioanalyzer with the RNA 6000 Nano Kit. The RNA quality was determined by the RNA integrity number (RIN), which is calculated from the relative height and area of the 16S and 23S rRNA peaks and follows a numbering system from 1 to 10, with 1 being the most degraded profile and 10 the most intact [14, 19].
Assessing the quantity and quality of genomic DNA
Aliquots (250 mg) of each fecal sample were suspended in 0.1 M Tris (pH 7.5), 250 μl of 4 M guanidine thiocyanate and 40 μl of 10% N-lauroyl sarcosine. DNA extraction was conducted by mechanical disruption of the microbial cells with glass beads and recovery of nucleic acids from clear lysates by alcohol precipitation, as previously described in Godon et al. [20]. An equivalent of 1 mg of each fecal sample was used for DNA quantification using a NanoDrop ND-1000 Spectrophotometer (Nucliber). DNA integrity was examined by microcapillary electrophoresis using an Agilent 2100 Bioanalyzer with the DNA 12000 kit, which resolves the distribution of double-stranded DNA fragments up to 17,000 bp in length.
Assessment of microbial composition through 16S rRNA gene survey
In order to analyze bacterial composition, the V4 hypervariable region of the 16S rRNA gene was amplified from the genomic DNA extracted from fecal samples using two universal primers: V4F_517_17 (5’-GCCAGCAGCCGCGGTAA-3’) and V4R_805_19 (5’-GACTACCAGGGTATCTAAT-3’). Multiplex identifiers (MIDs), which were used to perform tag pyrosequencing, were included upstream of the forward primer sequence (V4F_517_17). PCR amplification was run in a Mastercycler gradient (Eppendorf) at 94°C for 2 min, followed by 35 cycles of 94°C for 30 sec, 56°C for 20 sec and 72°C for 40 sec, and a final cycle of 72°C for 7 min. PCR products were purified using the PCR Purification kit (Qiagen, Spain) and subsequently sequenced on a 454 Life Sciences (Roche) Genome Sequencer FLX platform (UCTS, Hospital Vall d’Hebron, Barcelona, Spain).
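As a quick sanity check of the amplification setup described above, the following Python sketch computes primer length and GC fraction and sums the programmed hold times of the cycling protocol. It rests on two assumptions that the text does not state: that the trailing number in each primer name encodes its length, and that ramping time between temperatures can be ignored. The helper names are hypothetical.

```python
# Primer names and sequences as given in the text.
PRIMERS = {
    "V4F_517_17": "GCCAGCAGCCGCGGTAA",
    "V4R_805_19": "GACTACCAGGGTATCTAAT",
}


def gc_fraction(seq):
    """Fraction of G or C bases in a primer sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def length_from_name(name):
    """Assumption: the last underscore-separated field of the name is the primer length."""
    return int(name.rsplit("_", 1)[1])


def cycling_seconds(initial, cycles, steps, final):
    """Total programmed hold time in seconds, ignoring ramping between temperatures."""
    return initial + cycles * sum(steps) + final


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt, GC fraction", round(gc_fraction(seq), 2))

# 94°C for 2 min, then 35 cycles of 30 s + 20 s + 40 s, then 72°C for 7 min.
total = cycling_seconds(initial=120, cycles=35, steps=(30, 20, 40), final=420)
print("hold time:", total / 60, "min")
```

Under these assumptions the programmed hold time comes to 61.5 min, and both primer lengths agree with the numbers in their names.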
Sequence analyses were performed using the QIIME pipeline [23]. Sequences were deposited in GenBank (GenBank: SRA055900). UCLUST [24] was used to cluster sequences into OTUs (Operational Taxonomic Units, taxa or species) at 97% sequence identity. Representative sequences for each OTU were aligned using PyNAST against the SILVA 108 release database, and taxonomy was assigned to the detected OTUs using BLAST and the SILVA 108 release taxa mapping file. The results were summarized as the number of times an OTU was found in each sample and the taxonomic prediction for each OTU.
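The summary described above (counts of each OTU per sample plus a taxonomic prediction per OTU) amounts to a simple count table. The sketch below shows one way such a table could be assembled in plain Python; it is not the pipeline's own code, and the sample identifiers, OTU identifiers and taxonomy mapping are hypothetical placeholders.

```python
from collections import Counter, defaultdict


def build_otu_table(assignments):
    """Count reads per OTU and sample from (sample_id, otu_id) pairs, one pair per read."""
    table = defaultdict(Counter)
    for sample_id, otu_id in assignments:
        table[otu_id][sample_id] += 1
    return table


def summarize(table, taxonomy, sample_ids):
    """Return one row per OTU: (otu_id, counts in sample order, taxonomic prediction)."""
    rows = []
    for otu_id in sorted(table):
        counts = [table[otu_id][sample_id] for sample_id in sample_ids]
        rows.append((otu_id, counts, taxonomy.get(otu_id, "Unassigned")))
    return rows


# Hypothetical read assignments and taxonomy labels, for illustration only.
assignments = [
    ("s1", "OTU_1"), ("s1", "OTU_1"), ("s2", "OTU_1"),
    ("s2", "OTU_2"),
]
taxonomy = {"OTU_1": "Prevotellaceae;uncultured;human gut metagenome"}

for row in summarize(build_otu_table(assignments), taxonomy, ["s1", "s2"]):
    print(row)
```

OTUs without an entry in the mapping are reported as "Unassigned" rather than dropped, so the row totals still add up to the number of reads.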
For beta diversity analysis, we sub-sampled to 3,080 sequences per sample to remove sequencing depth bias. A distance matrix was built using the weighted UniFrac method [25], and a hierarchical cluster tree was built using UPGMA (unweighted pair group method with arithmetic mean).
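The two generic steps of this analysis, sub-sampling every sample to a common depth and UPGMA clustering of a distance matrix, can be sketched as follows. This is a minimal standard-library illustration, not the QIIME implementation: the weighted UniFrac computation itself needs a phylogenetic tree and is not reproduced, so the distance matrix is taken as given, and all names, counts and distances below are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Randomly sub-sample `depth` reads without replacement from per-OTU counts."""
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    if len(reads) < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    return Counter(random.Random(seed).sample(reads, depth))


def upgma(labels, matrix):
    """Build a UPGMA tree (nested tuples) from a symmetric distance matrix."""
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Distances between active clusters, keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j]
            for i in range(len(labels)) for j in range(i + 1, len(labels))}
    next_id = len(labels)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        del dist[(i, j)]
        for k in clusters:
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            # Size-weighted mean, so every original leaf pair counts equally.
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical counts and distances, for illustration only.
sub = rarefy({"OTU_1": 5000, "OTU_2": 1000}, depth=3080)
print(sum(sub.values()))
print(upgma(["s1", "s2", "s3"], [[0, 0.1, 0.5], [0.1, 0, 0.6], [0.5, 0.6, 0]]))
```

In the toy matrix, s1 and s2 are merged first because they are closest, and s3 joins that pair at the averaged distance of 0.55.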
The Kolmogorov-Smirnov test was used to check the normality of data distribution. Normally distributed data were compared using Student's t-test, paired for intra-group comparisons and unpaired for inter-group comparisons; otherwise the Wilcoxon signed rank test was used for paired data, and the Mann–Whitney U test for unpaired data. When the dataset was small (n < 5), we fitted a Poisson regression model using the glm (generalized linear model) function of R with the following formula: glm(formula = z ~ group + pair, family = poisson). This model is appropriate for modeling paired count data. P values < 0.05 were considered significant.
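To illustrate how the paired non-parametric comparison behaves with very few pairs, the sketch below computes an exact two-sided Wilcoxon signed rank p value by enumerating all sign assignments. It is a standard-library Python illustration rather than the R functions used in the study, it assumes no zero differences and no tied absolute differences, and the input values are hypothetical.

```python
from itertools import product


def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided p value for paired data (no zero or tied |differences| allowed)."""
    diffs = [a - b for a, b in zip(x, y)]
    magnitudes = [abs(d) for d in diffs]
    if 0 in magnitudes or len(set(magnitudes)) != len(magnitudes):
        raise ValueError("zero or tied differences are not handled in this sketch")
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(len(diffs)), key=lambda i: magnitudes[i])
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Under the null hypothesis each of the 2^n sign patterns is equally likely.
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** len(ranks)


# Hypothetical paired values for 4 subjects, every difference positive.
print(wilcoxon_signed_rank_exact([7.5, 8.0, 7.9, 8.4], [5.0, 4.2, 6.1, 3.0]))
```

With n = 4 pairs there are only 2^4 = 16 sign patterns, so even when all differences point the same way the smallest attainable two-sided exact p value is 2/16 = 0.125.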
We thank Ricardo Gonzalo, Francisca Gallego and Rosario M Prieto from the Scientific and Technical Support Unit (STSU) for their technical assistance. This work was supported by the FIS PI10/00902 grant (Ministerio de Ciencia e Innovacion, Spain) and the European Community’s Seventh Framework Programme (FP7/2007-2013): International Human Microbiome Standards (IHMS), grant agreement HEALTH.2010.2.1.1-2. Ciberehd is funded by the Instituto de Salud Carlos III (Spain).
- Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R: Bacterial community variation in human body habitats across space and time. Science. 2009, 326 (5960): 1694-1697. 10.1126/science.1177486.
- Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, Nielsen T, Pons N, Levenez F, Yamada T, Mende DR, Li J, Xu J, Li S, Li D, Cao J, Wang B, Liang H, Zheng H, Xie Y, Tap J, Lepage P, Bertalan M, Batto JM, Hansen T, Le Paslier D, Linneberg A, Nielsen HB, Pelletier E, Renault P, et al: A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010, 464 (7285): 59-65. 10.1038/nature08821.
- Arumugam M, Raes J, Pelletier E, Le Paslier D, Yamada T, Mende DR, Fernandes GR, Tap J, Bruls T, Batto JM, Bertalan M, Borruel N, Casellas F, Fernandez L, Gautier L, Hansen T, Hattori M, Hayashi T, Kleerebezem M, Kurokawa K, Leclerc M, Levenez F, Manichanh C, Nielsen HB, Nielsen T, Pons N, Poulain J, Qin J, Sicheritz-Ponten T, Tims S, et al: Enterotypes of the human gut microbiome. Nature. 2011, 473 (7346): 174-180. 10.1038/nature09944.
- Gosalbes MJ, Durban A, Pignatelli M, Abellan JJ, Jimenez-Hernandez N, Perez-Cobas AE, Latorre A, Moya A: Metatranscriptomic approach to analyze the functional human gut microbiota. PLoS One. 2011, 6 (3): e17447. 10.1371/journal.pone.0017447.
- Dolfing J, Vos A, Bloem J, Ehlert PA, Naumova NB, Kuikman PJ: Microbial diversity in archived soils. Science. 2004, 306 (5697): 813.
- Klammer S, Mondini C, Insam H: Microbial community fingerprints of composts stored under different conditions. Ann Microbiol. 2005, 55: 299-305.
- Roesch LF, Casella G, Simell O, Krischer J, Wasserfall CH, Schatz D, Atkinson MA, Neu J, Triplett EW: Influence of fecal sample storage on bacterial community diversity. Open Microbiol J. 2009, 3: 40-46. 10.2174/1874285800903010040.
- Lauber CL, Zhou N, Gordon JI, Knight R, Fierer N: Effect of storage conditions on the assessment of bacterial community structure in soil and human-associated samples. FEMS Microbiol Lett. 2010, 307 (1): 80-86. 10.1111/j.1574-6968.2010.01965.x.
- Bertrand H, Poly F, Van VT, Lombard N, Nalin R, Vogel TM, Simonet P: High molecular weight DNA recovery from soils prerequisite for biotechnological metagenomic library construction. J Microbiol Methods. 2005, 62 (1): 1-11. 10.1016/j.mimet.2005.01.003.
- Liles MR, Williamson LL, Rodbumrer J, Torsvik V, Parsley LC, Goodman RM, Handelsman J: Isolation and cloning of high-molecular-weight metagenomic DNA from soil microorganisms. Cold Spring Harb Protoc 2009. 2009, 8: pdb.prot5271.
- Reigstad LJ, Bartossek R, Schleper C: Preparation of high-molecular weight DNA and metagenomic libraries from soils and hot springs. Methods Enzymol. 2011, 496: 319-344.
- Gloux K, Berteau O, El Oumami H, Beguet F, Leclerc M, Dore J: A metagenomic beta-glucuronidase uncovers a core adaptive function of the human intestinal microbiome. Proc Natl Acad Sci U S A. 2011, 108 (Suppl 1): 4539-4546.
- Lakhdari O, Cultrone A, Tap J, Gloux K, Bernard F, Ehrlich SD, Lefevre F, Dore J, Blottiere HM: Functional metagenomics: a high throughput screening method to decipher microbiota-driven NF-kappaB modulation in the human gut. PLoS One. 2010, 5 (9).
- Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, Gassmann M, Lightfoot S, Menzel W, Granzow M, Ragg T: The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol. 2006, 7: 3. 10.1186/1471-2199-7-3.
- Zoetendal EG, Booijink CC, Klaassens ES, Heilig HG, Kleerebezem M, Smidt H, de Vos WM: Isolation of RNA from bacterial samples of the human gastrointestinal tract. Nat Protoc. 2006, 1 (2): 954-959. 10.1038/nprot.2006.143.
- Wang P, Qi M, Barboza P, Leigh MB, Ungerfeld E, Selinger LB, McAllister TA, Forster RJ: Isolation of high-quality total RNA from rumen anaerobic bacteria and fungi, and subsequent detection of glycoside hydrolases. Can J Microbiol. 2011, 57 (7): 590-598. 10.1139/w11-048.
- Fleige S, Pfaffl MW: RNA integrity and the effect on the real-time qRT-PCR performance. Mol Aspects Med. 2006, 27 (2–3): 126-139.
- Strand C, Enell J, Hedenfalk I, Ferno M: RNA quality in frozen breast cancer samples and the influence on gene expression analysis–a comparison of three evaluation methods using microcapillary electrophoresis traces. BMC Mol Biol. 2007, 8: 38. 10.1186/1471-2199-8-38.
- Imbeaud S, Graudens E, Boulanger V, Barlet X, Zaborski P, Eveno E, Mueller O, Schroeder A, Auffray C: Towards standardization of RNA quality assessment using user-independent classifiers of microcapillary electrophoresis traces. Nucleic Acids Res. 2005, 33 (6): e56. 10.1093/nar/gni054.
- Godon JJ, Zumstein E, Dabert P, Habouzit F, Moletta R: Molecular microbial diversity of an anaerobic digestor as determined by small-subunit rDNA sequence analysis. Appl Environ Microbiol. 1997, 63 (7): 2802-2813.
- Wilmotte A, Van der Auwera G, De Wachter R: Structure of the 16S ribosomal RNA of the thermophilic cyanobacterium Chlorogloeopsis HTF ('Mastigocladus laminosus HTF') strain PCC7518, and phylogenetic analysis. FEBS Lett. 1993, 317 (1–2): 96-100.
- Dalby AB, Frank DN, St Amand AL, Bendele AM, Pace NR: Culture-independent analysis of indomethacin-induced alterations in the rat gastrointestinal microbiota. Appl Environ Microbiol. 2006, 72 (10): 6707-6715. 10.1128/AEM.00378-06.
- Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Pena AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE, Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R: QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010, 7 (5): 335-336. 10.1038/nmeth.f.303.
- Edgar RC: Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010, 26 (19): 2460-2461. 10.1093/bioinformatics/btq461.
- Lozupone C, Hamady M, Knight R: UniFrac–an online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics. 2006, 7: 371. 10.1186/1471-2105-7-371.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.