High taxonomic level fingerprint of the human intestinal microbiota by Ligase Detection Reaction - Universal Array approach

Background: Imbalances of the human intestinal microbiota at high taxonomic levels, which affect the core functional microbiome, have recently been associated with specific diseases such as obesity, inflammatory bowel diseases, and intestinal inflammation. Results: In order to specifically monitor the microbiota imbalances that impact human physiology, we developed and validated an original DNA microarray (HTF-Microbi.Array) for the high taxonomic level fingerprint of the human intestinal microbiota. Based on the Ligase Detection Reaction-Universal Array (LDR-UA) approach, the HTF-Microbi.Array enables specific detection and approximate relative quantification of 16S rRNAs from 30 phylogenetically related groups of the human intestinal microbiota. The HTF-Microbi.Array was used in a pilot study of the faecal microbiota of eight young adults. Cluster analysis revealed good reproducibility of the high taxonomic level microbiota fingerprint obtained for each subject. Conclusion: The HTF-Microbi.Array is a fast and sensitive tool for the high taxonomic level fingerprint of the human intestinal microbiota in terms of presence/absence of the principal groups. Moreover, analysis of the relative fluorescence intensity for each probe pair of our LDR-UA platform can provide an estimate of the relative abundance of the microbial target groups within each sample. Because its phylogenetic resolution is focused at the division, order and cluster levels, the HTF-Microbi.Array is blind to inter-individual variability at the species level.


Background
Human beings have recently been reconsidered as superorganisms in co-evolution with an immense microbial community living in the gastrointestinal tract (GIT), the human intestinal microbiota [1,2]. By providing important metabolic functions that we have not evolved on our own [3], the intestinal microbiota plays a fundamental role in human health and well-being [4,5]. Several of our physiological features, such as nutrient processing, maturation of the immune system, pathogen resistance, and development of the intestinal architecture, strictly depend on the mutualistic symbiotic relationship with the intestinal microbiota [6]. On the basis of its global impact on human physiology, the intestinal microbiota has been considered an essential organ of the human body [7].
The composition of the adult intestinal microbiota has been determined in three large-scale 16S rRNA sequence surveys [7-11]. The phylogenetic analysis of a total of 45,000 bacterial 16S rRNA sequences from 139 adults revealed that, at the phylum level, only a small fraction of the known bacterial diversity is represented in our GIT. The vast majority of bacteria in the human intestinal microbiota (>99%) belong to six bacterial phyla: Firmicutes, Bacteroidetes, Actinobacteria, Proteobacteria, Fusobacteria and Verrucomicrobia. The two dominant divisions are Firmicutes and Bacteroidetes, which together represent up to 90% of the total microbiota, with relative abundances of 65% and 25%, respectively. Actinobacteria, Proteobacteria, Verrucomicrobia and Fusobacteria are the subdominant phyla, with relative abundances of up to 5, 8, 2 and 1%, respectively. In contrast, at lower taxonomic levels the bacterial diversity in the human GIT is enormous. At least 1,800 genera [≥ 90% of sequence identity (ID)] and 16,000 phylotypes at the species level (≥ 97% ID) have been identified so far, and an even greater diversity at the species level is predicted [8]. Since 70% of these phylotypes are subject-specific, and no phylotype is present at more than 0.5% abundance in all subjects [12], the intestinal microbiota of each individual has been shown to consist of a subject-specific complement of hundreds of genera and thousands of species. However, the large degree of functional redundancy between species and genera has allowed the identification of a core microbiome at the gene level, which is shared among all individuals [12]. Because it encodes genes involved in important metabolic functions, this core functional microbiome is fundamental to supporting the mutualistic symbiotic relationship with the human host.
Recently, 16S rRNA sequence-based studies have attempted to describe disease-associated imbalances of the human intestinal microbiota. Whereas variability at the species level was associated with inter-individual variability, phylum-level changes of the intestinal microbiota were associated with specific diseases. In particular, obesity was characterized by a higher proportion of Firmicutes and Actinobacteria with respect to Bacteroidetes and by an overall reduced bacterial diversity [12,13]. In contrast, inflammatory bowel diseases (IBD) were characterized by a marked reduction of bacterial diversity in Clostridium clusters IV and XIVa, belonging to Firmicutes, a decline in Bacteroidetes biodiversity, and a corresponding increase in Proteobacteria and Bacillus [14,15]. Analogously, intestinal inflammation has generally been related to a marked increase in Enterobacteriaceae and a corresponding decrease in members of the resident colonic bacteria [16,17]. In the light of these findings, it has recently been hypothesized that these high taxonomic level imbalances of the human intestinal microbiota can cause deviations from the core functional microbiome, with a final impact on the host physiological state [12,18,19].
Since more than 75% of the phylotypes detected in the human GIT do not correspond to cultured species [20], phylogenetic DNA microarrays have been recognized as a valuable tool for high-throughput, quantitative and systematic analysis of the human intestinal microbiota [21]. Recently, three different small ribosomal subunit RNA (SSU rRNA) based high-density phylogenetic microarrays for studying the human microbiota have been developed [22-24]. Targeting thousands of bacterial phylotypes, these DNA microarrays have been successfully applied in studies for the deep phylogenetic characterization of the human intestinal microbiota.
In order to specifically monitor the microbiota imbalances that impact human physiology, independently of inter-individual variability, we developed an original DNA microarray for the high taxonomic level fingerprint of the human intestinal microbiota, called HTF-Microbi.Array (High Taxonomic Fingerprint Microbiota Array). The relatively low number of targets made it possible to use the Ligase Detection Reaction (LDR) technology [25,26] for the development of the HTF-Microbi.Array. This enzymatic in vitro reaction, based on the discriminative properties of the DNA ligation enzyme, requires the design of a pair of adjacent oligonucleotides specific for each target sequence: a probe specific for the variation (called "Discriminating Probe", or DS), which carries a 5'-fluorescent label, and a second probe, named "Common Probe" (or CP), which starts one base 3'-downstream of the DS and carries a 5'-phosphate group and a unique sequence named cZipCode at its 3'-end. The oligonucleotide probe pairs and a thermostable DNA ligase are used in an LDR with previously PCR-amplified DNA fragments. This reaction is cycled to increase product yield. The LDR products, which the DNA ligase forms only in the presence of a perfectly matching template, are addressed to a precise location on a Universal Array (UA), where a set of artificial sequences, called Zip-codes, is arranged. These products carry both the fluorescent label and a unique cZipCode sequence, and can be detected by laser scanning and identified according to their location within the array. The LDR approach is a highly specific and sensitive assay for detecting single nucleotide variations; thus, differences of a single base along the 16S rRNA gene can be employed to distinguish among different microbial lineages. The HTF-Microbi.Array was successfully tested in a pilot study for the characterization of the faecal microbiota of eight healthy young adults.
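The ligation logic described above can be illustrated with a toy model. The sketch below is not the published protocol: it assumes, for simplicity, that probes and template are written on the same strand (no reverse complementation) and that a product forms only when the DS immediately followed by the CP matches the template perfectly. The function name `ldr_product`, the label string and all sequences are hypothetical.

```python
def ldr_product(template, ds, cp, czip, label="FLUOR"):
    """Toy model of one LDR probe pair acting on one template.

    Assumptions (not from the paper): all sequences are written on the same
    strand, and ligation needs a perfect match of DS immediately followed
    by CP. Returns the labelled, cZipCode-tagged product, or None when no
    ligation occurs.
    """
    if template.find(ds + cp) == -1:
        return None
    return f"{label}-{ds}{cp}-{czip}"


# Hypothetical sequences, for illustration only.
DS = "ACGTACG"
CP = "TTTGACC"
CZIP = "cZip01"
target = "GG" + DS + CP + "AA"
# Same template with a single mismatch at the base facing the DS 3' end.
non_target = "GG" + DS[:-1] + "A" + CP + "AA"

print(ldr_product(target, DS, CP, CZIP))
print(ldr_product(non_target, DS, CP, CZIP))
```

In this simplified picture a single-base difference at the DS 3' end is enough to suppress the product, which is the property the array relies on to separate closely related lineages.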

Target selection and probe design
The rational selection of the HTF-Microbi.Array targets was carried out using a phylogenetic approach. To this aim, we integrated the 16S rRNA database of the ARB Project (release February, 2005) with the 16S rRNA gene database of the RDP available at the time, and a phylogenetic tree was constructed. Based on the tree nodes, 30 phylogenetic groups of the human intestinal microbiota were rationally selected as the target groups for the HTF-Microbi.Array (Additional file 1). Fig. 1 reports the phylogenetic tree of the 16S rRNA sequences of the HTF-Microbi.Array positive set. The selected groups belonged to different phylogenetic levels (species, genus, family, cluster, or group of species indicated by the wording "et rel."). The entire list of the array targets is given in Table 1. For part of the division Firmicutes, the target [...]
[Figure 1 legend: For each node we report the number of sequences used from our ARB 16S rRNA sequence database. The dimension of each triangle is proportional to the number of sequences clustered together. The phylogenetic tree was obtained by using the neighbour-joining algorithm for the sequence alignment in ARB software.]
Based on an original phylogenetic design, the entire probe set of the HTF-Microbi.Array covers up to 95% of the bacterial groups belonging to the human intestinal microbiota [28]. Specificity and coverage of each candidate probe were assessed by using the Probe Match tool of the RDP database. The probe pairs selected for the HTF-Microbi.Array were required to match the sequences of the positive set perfectly and to have at least one mismatch at the 3' end of the discriminating probe with respect to the entire negative set. The designed probe pairs had an average melting temperature (Tm) of 67.8 ± 0.9°C (n = 60) and an average length of 35.6 ± 4.9 nucleotides. Sixteen of the 30 probe pairs had no degenerate bases, whereas only one probe pair (i.e. the one for Clostridium cluster I and II) had 4 and 3 ambiguous bases on DS and CP, respectively (Additional file 2).

LDR probe pair specificity
The specificity of the designed LDR probe pairs was tested by using 16S rRNA PCR amplicons from 28 microorganisms that are members of the human intestinal microbiota. Amplicons were prepared by amplification of genomic DNA extracted from DSMZ cultures or of genomic DNA from the ATCC collection. All the 16S rRNA amplicons were properly recognized in separate LDR hybridization reactions with the entire probe set of the array, demonstrating the specificity of the HTF-Microbi.Array. Two independent replicate LDR-UA experiments were performed, with optimal reproducibility (Additional file 3). For each 16S rRNA template, only the group-specific spots and the spots corresponding to the hybridization controls showed positive signals (P < 0.01) (Table 2). As a negative control, we performed two independent PCR-LDR-UA experiments using double-distilled water, instead of genomic DNA, as sample. As expected, no positive signal was detected. The ratio between the signal intensities of the specific probes and the blank intensity (SNR_s) averaged 206.9 ± 185.7, whereas the ratio between all the other probes and the blank intensity (SNR_ns) averaged 2.1 ± 1.4. Therefore, the ratio between specific and non-specific probes was more than 100-fold on average.
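The two signal-to-noise ratios defined above can be computed as in the following minimal sketch. It assumes that each probe is summarised by a single mean fluorescence intensity and that the blank is a single background value; the function name `snr_summary`, the probe names and the intensities are hypothetical and are not data from this study.

```python
from statistics import mean


def snr_summary(intensities, specific, blank):
    """Return (SNR_s, SNR_ns) for one hybridization.

    intensities: dict mapping probe name -> mean spot intensity
    specific:    set of probe names expected to be positive
    blank:       background (blank) intensity, assumed > 0
    """
    snr = {probe: value / blank for probe, value in intensities.items()}
    snr_s = mean(v for p, v in snr.items() if p in specific)
    snr_ns = mean(v for p, v in snr.items() if p not in specific)
    return snr_s, snr_ns


# Hypothetical values, for illustration only.
example = {"probe_A": 2000.0, "probe_B": 20.0, "probe_C": 10.0}
snr_s, snr_ns = snr_summary(example, specific={"probe_A"}, blank=10.0)
print(snr_s, snr_ns, snr_s / snr_ns)
```

Note that the fold difference between specific and non-specific probes can be obtained either as the ratio of the two averages, as here, or as the average of per-experiment ratios; the two generally differ, and the text does not state which was used.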

Evaluation of the LDR sensitivity and relative abundance detection level
In order to define the detection limits of the HTF-Microbi.Array, LDR-UA experiments were carried out with different concentrations of an artificial mix of 16S rRNA amplicons from 6 members of the human intestinal microbiota. The 16S rRNA amplicons from Bacillus cereus, Lactobacillus casei, Bifidobacterium adolescentis, Ruminococcus albus, Prevotella and Y. enterocolitica were all specifically recognized in a range of concentrations from 0.7 to 75 fmol (P < 0.01), demonstrating the high sensitivity and specificity of the array (Fig. 2).
[Table legend: For each probe, the spot number, the phylogenetic level, the phylogeny of the target group, and the ecology in the gastrointestinal ecosystem [mutualistic (M), opportunistic (O), pathogen (P)] are indicated. The relative abundance of the principal microbial groups in a healthy gut ecosystem is also indicated.]

Discussion
In recent years, 16S rRNA microarrays have emerged as a sensitive and efficient way to screen complex bacterial communities. Here we describe and validate the HTF-Microbi.Array, a new phylogenetic DNA microarray designed for the high taxonomic level fingerprint of the human intestinal microbial community. The HTF-Microbi.Array is based on the LDR-UA approach, a fast tool for the characterization of complex microbial communities with high sensitivity and specificity [25,26]. This molecular technique overcomes the major limitations of DNA microarrays whose discriminative power is based on hybridization. In fact, a) optimization of the hybridization conditions for each probe set is not required; b) problems due to the secondary structures of the target DNA are minimized; c) steric hindrances of differentially sized nucleic acid hybrids formed on the array after the hybridization are decreased [29]. The final probe set of the HTF-Microbi.Array allows a high taxonomic level fingerprint of the human intestinal microbiota, with good coverage of the major and minor components, as well as of some of the most important pathogens and opportunistic bacteria [30]. The LDR probes were designed by choosing DS oligonucleotides whose 3' end allowed perfect discrimination of the target species from the non-target ones on the basis of our 16S rRNA sequence database. The definition of accurate and specific negative sets of gut microbiota sequences with the ORMA tool [31] allowed the selection of maximally discriminative probe pairs. Probe specificity was confirmed against all the known 16S rRNA gene sequences by the RDP Probe Match tool. This requirement is fundamental, since the primer set used for the PCR amplification was the "universal" 16S rRNA primer set designed by Edwards and co-workers [32].
The HTF-Microbi.Array recognized without ambiguity the 16S rRNA amplicons obtained from 28 members of the intestinal microbiota belonging to Bacteroides/Prevotella, Clostridium clusters IV, IX, XIVa, XI, I and II, Bifidobacteriaceae, Lactobacillaceae, Bacillus, Enterococcus, Enterobacteriaceae and Campylobacter, demonstrating the specificity of all the probe pairs. The sensitivity of the HTF-Microbi.Array was evaluated by using different concentrations of an artificial mix of 16S rRNA amplicons obtained from 6 microorganisms that are members of the human intestinal microbiota. To compensate for the possible drop in signal at very low target concentrations, lower than 0.7 fmol (i.e. less than 1.5% of the commonly used quantity of 50 fmol), a slightly relaxed significance criterion for the t-test (α = 0.05) was chosen. All PCR products were specifically recognized in a concentration range from 75 to 0.7 fmol, showing high array sensitivity. The efficiency of the HTF-Microbi.Array in detecting a particular target in a complex DNA environment was also determined. According to our data, the array is able to detect a specific DNA target down to 0.02% of the total 16S rRNA, which is comparable to the values obtained by Rajilic-Stojanovic et al. [23] and Palmer et al. [21]. Thus, the HTF-Microbi.Array has the potential to sense low-abundance species of the gastrointestinal microbiota, enabling the detection of the 16S rRNA of a particular target group present at a fractional abundance <0.1% in an artificial mixture.
The HTF-Microbi.Array was used in a pilot study to characterize the faecal microbiota of eight young adults. Faecal microbiota was chosen as the DNA source since sample collection is not invasive, samples contain large amounts of microbes and, most importantly, faecal microbiota is representative of interpersonal differences in distal gut microbial ecology [33]. In order to have a good representation of the less abundant species of the intestinal microbial community, LDR reactions were performed starting from 50 fmol of PCR product. Cluster analysis of the presence-absence probe profiles enabled the identification of a reproducible high taxonomic level microbiota fingerprint for each subject. As expected, the intestinal microbial community of the volunteers in the study resembled the typical fingerprint of healthy adults [28]. According to our data, the faecal microbiota of the enrolled subjects was dominated by major mutualistic symbionts. In fact, members of Bacteroidetes and of Clostridium clusters IV, IX and XIVa were all represented in 100% of the subjects. On the other hand, minor mutualistic symbionts, such as Lactobacillaceae, B. subtilis et rel., Fusobacterium and Cyanobacteria, were detected in 55, 37, 50, and 63% of the subjects, respectively. Opportunistic pathogens, such as E. faecalis et rel., members of the Clostridium cluster I and II and Enterobacteriaceae, were represented in only 43, 25 and 12% of the subjects, respectively. Most importantly, enteropathogens such as C. difficile, C. perfringens, E. faecium et rel., B. cereus et rel., and Campylobacter were never detected. A discrepancy between our data and the literature is the relatively low prevalence of the health-promoting Bifidobacteriaceae in our samples (only 13% of samples). However, a low prevalence of bifidobacteria is a typical bias of several phylogenetic DNA microarrays [22,23].
This is probably due to the intrinsically low efficiency of amplification of the bifidobacterial genome with universal primer sets for the 16S rRNA gene [8]. Surprisingly, a high prevalence was obtained for the minor mutualistic symbiont B. clausii et rel. (100% of samples) and for the opportunistic pathogen Proteus (50% of samples). For each subject, the relative fluorescence intensity (IF) contributions of the probes were calculated, obtaining an approximate evaluation of the relative abundance of the principal microbial groups of the faecal microbiota. In general agreement with previous metagenomic studies [7-11] and SSU rRNA phylogenetic microarray investigations [22,23], mutualistic symbionts such as Bacteroidetes and Clostridium clusters IV, IX and XIVa largely dominated the faecal microbiota, contributing 65 to 80% of the total microbiota, depending on the subject. In contrast, with an overall contribution ranging from 10 to 30%, minor mutualistic symbionts such as B. clausii et rel., Bifidobacteriaceae, Lactobacillaceae, B. subtilis et rel., Fusobacterium, and Cyanobacteria were largely subdominant. Opportunistic pathogens represented only a small fraction of the intestinal microbiota. Although the subjects under study showed a common trend when the ratios between the relative IF of major, minor and opportunistic components were considered, differences in the relative IF contribution of single probes were detectable and subject-specific profiles were identified. For instance, subject 1 showed a higher relative fluorescence for probes targeting major mutualistic symbionts, and a lower relative fluorescence for minor mutualistic symbionts and opportunistic pathogens, than subjects 4 and 15. On the other hand, subjects 15 and 17 were characterized by a lower Bacteroidetes/Firmicutes ratio than all the other subjects.
It is tempting to hypothesize that differences in relative IF contribution within samples could represent an approximation of differences in relative abundances of the targeted groups in the faecal microbiota. However, caution must be taken when microarray based methods for the relative quantification of bacterial groups in complex microbial communities are used. In fact, biases are introduced at several levels of the experimental procedure: DNA extraction and purification, PCR amplification of the 16S rRNA gene, and interspecies variation of the rRNA gene copy number [21].
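The relative IF contribution discussed above amounts to a simple normalisation of probe intensities within one sample. The sketch below illustrates it under stated assumptions: each probe is summarised by one background-corrected intensity, and probes are assigned to ecological groups by a user-supplied mapping. The function names, probe names, group assignments and numbers are hypothetical and are not data from this study.

```python
def relative_if(intensities):
    """Express each probe intensity as a percentage of the sample total."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return {probe: 100.0 * value / total for probe, value in intensities.items()}


def group_contribution(relative, groups):
    """Sum relative contributions over the probes assigned to each group."""
    return {
        name: sum(relative[p] for p in probes if p in relative)
        for name, probes in groups.items()
    }


# Hypothetical sample, for illustration only.
sample = {
    "Bacteroidetes": 300.0,
    "Clostridium_IV": 300.0,
    "Lactobacillaceae": 300.0,
    "Proteus": 100.0,
}
groups = {
    "major": ["Bacteroidetes", "Clostridium_IV"],
    "minor": ["Lactobacillaceae"],
    "opportunistic": ["Proteus"],
}
rel = relative_if(sample)
print(group_contribution(rel, groups))
```

Because of the biases listed above (DNA extraction, PCR amplification, rRNA gene copy number), such percentages are best read as within-sample fingerprints rather than absolute abundances.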

Conclusion
The HTF-Microbi.Array proved to be a fast and sensitive tool for the high taxonomic level fingerprint of the human intestinal microbiota in terms of presence/absence of the principal groups. Since the flexibility of the universal array platform allows the addition of new probe pairs without further optimization of the hybridization conditions [25,26], the HTF-Microbi.Array can easily be extended with new probe pairs targeting emerging microbial groups of the human intestinal microbiota, such as the mucin-degrading bacterium Akkermansia muciniphila [34]. The evaluation of the relative abundance of the target groups on the basis of the relative IF response of the probes still has some limitations. However, bearing in mind all the possible biases typical of the microarray technology (i.e. DNA extraction/purification, PCR, copy number variations, etc.), analysis of IFs from our LDR-UA platform can be useful for estimating the relative abundance of the target groups within each sample. Because its phylogenetic resolution is focused at the division, order and cluster levels, the HTF-Microbi.Array is blind to inter-individual variability at the species level. Its potential to characterize the high-order taxonomic imbalances of the human intestinal microbiota associated with specific diseases will be assessed in further studies.

Recruitment
Eight healthy Italian individuals, 30 years of age, were enrolled in the study. No dietary restrictions were imposed, except that the subjects had not taken antibiotics, probiotics or functional foods for at least 4 weeks prior to sampling. None of the selected subjects had a history of gastrointestinal disorders at the time of sampling. The study protocol was approved by the Ethical Committee of Sant'Orsola-Malpighi Hospital (Bologna, Italy) and informed consent was obtained from each enrolled subject. Faeces were collected from each subject and stored at -20°C.

Target selection and consensus extraction
A database of 16S rRNA sequences was created by integrating the 16S rRNA database of the ARB Project (release February, 2005) (http://www.arb-home.de; [35]) with the database of the Ribosomal Database Project (RDP; release September, 2007) (http://rdp.cme.msu.edu/; [36,37]). A phylogenetic tree was obtained in the ARB software by applying the neighbour-joining algorithm to the sequence alignment. The tree was used for the rational selection of phylogenetically related groups of bacteria belonging to the human intestinal microbiota, which correspond to nodes of the phylogenetic tree (Additional file 1). Group-specific consensus sequences were extracted with a cut-off of 75% for base calling. Nucleotides which occurred at lower frequencies were replaced by the appropriate IUPAC ambiguity code.
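The consensus extraction step can be sketched as follows. This is a minimal illustration, not the ARB implementation: it assumes gap-free aligned sequences of equal length, calls a base when its column frequency is at least the cut-off, and otherwise emits the IUPAC code covering every base observed in that column (one possible reading of "the appropriate IUPAC ambiguity code"). The function name `consensus` and the example sequences are hypothetical.

```python
from collections import Counter

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases covered.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(aligned, cutoff=0.75):
    """Column-wise consensus of aligned, gap-free sequences (A/C/G/T only)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    called = []
    for column in zip(*aligned):
        counts = Counter(column)
        base, count = counts.most_common(1)[0]
        if count / len(column) >= cutoff:
            called.append(base)  # majority base reaches the cut-off
        else:
            called.append(IUPAC[frozenset(counts)])  # assumption: cover all observed bases
    return "".join(called)


# Hypothetical alignments, for illustration only.
print(consensus(["ACGT", "ACGT", "ACGT", "ACGA"]))
print(consensus(["ACGT", "ACGA", "ACGT", "ACGA"]))
```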

Probe design
Multiple alignment of the selected sequences was performed in ClustalW [38]. Since the taxonomic classification of the 30 groups selected for the probe design varied from species to phylum level, the sequences were carefully grouped for the multiple alignment step: (a) for higher-level probes, only family/phylum consensus sequences were used as a negative set for probe design; (b) for genus/species-level probes, only sequences belonging to other families/phyla were selected. All the LDR probe pairs were designed using ORMA [31]. Both DS and CP were required to be between 25 and 60 bases long, with a Tm of 68 ± 1°C and a maximum of 4 degenerate bases. An in silico check against a publicly available database (i.e. RDP) was then performed to assess probe pair specificity.
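The three design constraints stated above (length, Tm window, number of degenerate bases) can be expressed as a simple filter. The sketch below is not part of ORMA: the function name `probe_passes` is hypothetical, and the Tm is taken as an input computed elsewhere, since the Tm formula used is not specified here.

```python
def probe_passes(seq, tm, min_len=25, max_len=60,
                 tm_target=68.0, tm_tol=1.0, max_degenerate=4):
    """Check one probe (DS or CP) against the stated design constraints.

    seq: probe sequence, possibly containing IUPAC ambiguity codes
    tm:  melting temperature in degrees C, computed elsewhere (assumption)
    """
    degenerate = sum(1 for base in seq.upper() if base not in "ACGT")
    return (
        min_len <= len(seq) <= max_len
        and abs(tm - tm_target) <= tm_tol
        and degenerate <= max_degenerate
    )


# Hypothetical probes, for illustration only.
print(probe_passes("ACGT" * 8, tm=67.5))           # 32 nt, no ambiguity codes
print(probe_passes("ACGT" * 5, tm=68.0))           # 20 nt: too short
print(probe_passes("ACGTRYSWK" + "ACGT" * 6, tm=68.2))  # 5 ambiguity codes
```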

DNA extraction
Total DNA was extracted from 10^9 bacterial cells by using the DNeasy Tissue Kit 50 (Qiagen, Düsseldorf, Germany) following the manufacturer's instructions. Bacterial DNA was also extracted from lyophilized bacterial cells of the following DSMZ (Braunschweig, Germany) collection strains: Clostridium leptum DSM73, Ruminococcus albus DSM20455, Eubacterium siraeum DSM15700, C. viride DSM6836, Megasphaera micronuciformis DSM17226, Bacillus clausii DSM2515, B. subtilis DSM704, B. cereus DSM21, and Proteus mirabilis DSM4479. Lyophilized bacterial cells were suspended in 1 ml of lysis buffer (500 mM NaCl, 50 mM Tris-HCl pH 8, 50 mM EDTA, 4% SDS) and DNA was extracted with the same procedure used for genomic DNA from faecal samples, described below. Total DNA from faecal material was extracted using the QIAamp DNA Stool Mini Kit (Qiagen) with a modified protocol. 250 mg of faeces were suspended in 1 ml of lysis buffer. Four 3 mm glass beads and 0.5 g of 0.1 mm zirconia beads were added, and the samples were treated in a FastPrep instrument (MP Biomedical, Irvine, CA, USA) at 5.5 m/s for 3 min. Samples were heated at 95°C for 15 min, then centrifuged for 5 min at full speed to pellet stool particles. Supernatants were collected and 260 μl of 10 M ammonium acetate were added, followed by incubation on ice for 5 min and centrifugation at full speed for 10 min. One volume of isopropanol was added to each supernatant, followed by incubation on ice for 30 min. The precipitated nucleic acids were collected by centrifugation for 15 min at full speed and washed with 70% ethanol. Pellets were resuspended in 100 μl of TE buffer and treated with 2 μl of DNase-free RNase (10 mg/ml) at 37°C for 15 min. Protein removal by Proteinase K treatment and DNA purification with QIAamp Mini Spin columns were performed following the kit protocol. 200 μl of TE buffer were used for DNA elution.
The final DNA concentration was determined by using a NanoDrop ND-1000 (NanoDrop Technologies, Wilmington, DE). The bacterial DNA from the following 11 ATCC strains was obtained directly from the ATCC: Bacteroides fragilis ATCC25285, B. thetaiotaomicron ATCC29148, Prevotella melaninogenica ATCC25845, Veillonella parvula ATCC10790, C. difficile ATCCBAA1382, C. acetobutylicum ATCC824, C. perfringens ATCC13124, Enterococcus faecalis ATCC700802, E. faecium ATCC51559, Campylobacter jejuni ATCC33292, R. productus 23340.

Polymerase Chain Reaction (PCR)
All the oligonucleotide primers and probe pairs were synthesized by Thermo Electron (Ulm, Germany). PCR amplifications were performed with Biometra Thermal Cycler II and Biometra Thermal Cycler T Gradient (Biometra, Germany). PCR products were purified by using the Wizard SV Gel and PCR Clean-Up System purification kit (Promega Italia, Milan, Italy) according to the manufacturer's instructions, eluted in 20 μl of sterile water, and quantified with the DNA 7500 LabChip Assay kit and BioAnalyzer 2100 (Agilent Technologies, Palo Alto, CA, USA). 16S rRNA was amplified using universal forward primer 16S27F (5'-AGAGTTTGATCMTGGCTCAG-3') and reverse primer r1492 (5'-TACGGYTACCTTGTTACGACTT-3'), following the protocol described in Castiglioni et al. [25], except for using 50 ng of starting DNA and 0.5 U of DyNAzyme II DNA polymerase (Finnzymes, Espoo, Finland).
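Both universal primers contain one IUPAC ambiguity code (M in the forward primer, Y in the reverse primer), so each is in practice a mixture of two sequences. The short sketch below, with the hypothetical helper name `expand_degenerate`, enumerates the concrete sequences encoded by a degenerate primer; it is an illustration only and plays no role in the experimental protocol.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = [IUPAC_BASES[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


FORWARD_16S27F = "AGAGTTTGATCMTGGCTCAG"
REVERSE_R1492 = "TACGGYTACCTTGTTACGACTT"

for name, primer in (("16S27F", FORWARD_16S27F), ("r1492", REVERSE_R1492)):
    print(name, expand_degenerate(primer))
```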

LDR/Universal Array approach
Phenylene-diisothiocyanate (PDITC) activated chitosan glass slides were used as surfaces for the preparation of universal arrays [39], comprising a total of 49 Zip-codes. Hybridization controls (cZip 66 oligonucleotide, complementary to zip 66, 5'-Cy3-GTTACCGCTGGTGCTGCCGCCGGTA-3') were used to locate the submatrices during scanning. The entire experimental procedure for both the chemical treatment and the spotting is described in detail in Consolandi et al. [40]. An overview of the Universal Array layout and ZipCodes is provided as Additional file 6. Ligase Detection Reaction and hybridization of the products on the universal arrays were performed according to the protocol described in Castiglioni et al. [25], except for the probe annealing temperature, which was set at 60°C.
The LDRs were carried out in a final volume of 20 μl with different quantities of purified PCR products: a) all LDRs for specificity tests were performed on 50 fmol of initial PCR product, so that the amount of target was not limiting; b) sensitivity tests were performed with decreasing quantities of PCR product, from 75 to 0.7 fmol; c) relative abundance tests were performed on 1 fmol of E. coli PCR amplicon mixed with human genomic DNA extracted from whole blood, at decreasing concentrations from 4% down to 0.02%; d) LDR experiments on the eight faecal samples were performed on 50 fmol of PCR product.