BMC Microbiology BioMed Central Methodology article BMC 22002,

Background The high degree of sequence heterogeneity found in Hepatitis C virus (HCV) isolates makes robust nucleic acid-based assays difficult to generate. Polymerase chain reaction based techniques require efficient and specific sequence recognition. Generation of robust primers capable of recognizing a wide range of isolates is a difficult task. Results A position weight matrix (PWM) and a consensus sequence were built for each region of HCV and subsequently assembled into a whole genome consensus sequence and PWM. For each of the 10 regions, the number of occurrences of each base at a given position was compiled. These counts were converted to frequencies that were used to calculate log odds scores. Using over 100 complete and 14,000 partial HCV genomes from GenBank, a consensus HCV genome sequence was generated along with a PWM reflecting heterogeneity at each position. The PWM was used to identify the most conserved regions for primer design. Conclusions This approach allows rapid identification of conserved regions for robust primer design and is broadly applicable to sets of genomes with all levels of genetic heterogeneity.


Background
The detection, identification, and characterization of bacterial populations is an important goal in analytical microbiology. Culture-independent techniques represent a rapid and flexible means to study bacterial communities; in fact, the use of 16S rRNA as a molecular marker has become routine for microbial ecologists. The most comprehensive strategy to characterize bacterial populations probably consists of sequencing 16S rDNA clones followed by phylogenetic reconstruction [1]. However, analysis of individual clones in multiple libraries is expensive and time consuming and therefore not suited to large-scale screenings. Other methods to assess the molecular composition of an environmental DNA sample, such as thermal or denaturing gradient gel electrophoresis (DGGE) [2], single stranded conformational polymorphism (SSCP) [3], heteroduplex analysis [4,5], or terminal restriction fragment (T-RFLP or TRF) analysis [6][7][8][9], are more rapid and therefore amenable to large-scale experiments.
Moreover, the employment of group-specific DNA probes complementary to 16S rRNA has provided a framework to study microbial populations in complex systems. The recent development of the DNA microarray technology has added a high throughput experimental format, potentially with great sensitivity [10][11][12].
In the microarray format, the most commonly used procedure is the differential hybridization of a fluorescently labelled target, often a PCR product, with microarray-immobilized oligonucleotide probes. This method, in order to gain good probe specificity, requires very careful probe design and optimized experimental set up.
Here we present our results using a different approach that combines polymerase chain reaction and a cycled ligase detection reaction (LDR) with hybridization on a Universal DNA chip [13,14]. As described by Barany and coworkers, this procedure, based on the discriminative properties of the DNA ligation reaction, requires the design of two adjacent probes specific for each target sequence. One oligo carries a 5' fluorescent label and the other a unique 3' sequence named cZipCode. Ligated fragments, obtained in the presence of a perfectly matching template by the action of a DNA ligase, are addressed to the location on the microarray where the ZipCode sequence has been spotted. These fragments carry both the fluorescent label and their unique cZipCode sequence and can therefore be detected by laser scanning of the array and identified by their location within the array (Fig. 1A,1B).
This approach presents some advantages. The ligase detection reaction has been shown to be a sensitive assay for detecting Single Nucleotide Polymorphisms [14]; therefore, a difference in a single nucleotide along the 16S rRNA can be employed to distinguish between sequences of different microorganisms. Moreover, the system maintains the positive characteristics of the microarray format without requiring the optimization of the hybridization conditions for each probe set. Using such an approach we targeted the 16S rRNA genomic region using 223 sequences of cyanobacteria, 987 of actinomycetes, 284 of clostridia, 281 of bacilli, 69 of myxobacteria and 270 of pseudomonads, selected from the Ribosomal Database Project to identify consensus sequences for each group. Group-specific consensi were used to design selective molecular probes. These probes, used in an LDR on DNA from pure bacterial cultures, gave excellent selectivity for the target group and sensitivity down to 10 fmol of amplified 16S DNA.

Sequence analysis of 16S rDNA and ligation probes design
We used the ARB software [www.arb-home.de] to perform the sequence alignment of 16S rDNA. The ARB database we used contains 223 sequences of cyanobacteria, 987 of actinomycetes, 284 of clostridia, 281 of bacilli, 69 of myxobacteria and 270 of pseudomonads. These sequences were aligned and clustered according to their phylogenetic lineages, yielding 6 "group-specific" consensus sequences. Then, the 6 group consensi were imported into GCG Omiga 2.0 (Oxford Molecular Ltd.). The Omiga software is a graphically oriented package that permits the identification of "group-specific" nucleotide polymorphisms. Thus, the probes were designed complementary to polymorphic regions on the basis of a final alignment among group-specific consensi. The selection process was conducted in several steps. Firstly, we considered the ligase reaction features. As shown in Fig. 1A, after hybridization of a common probe and a discriminating oligo to the target sequence, ligation occurs only if there is perfect complementarity at the junction between the two oligos. For this reason, to obtain ligase discrimination, we selected discriminating oligos with a 3' position unique to each group. Common probes were designed immediately 3' to the discriminating oligo from the group-specific consensus. However, if the common probe fell in a region of poor sequence conservation within the consensus, the common probe, and consequently the discriminating oligo, were discarded. Examples of oligo pairs are illustrated in Fig. 2 (for bacilli and pseudomonads). The number of potential pairs identified at this stage is reported in Table 1, column A.
Secondly, among this set of probes, we selected only those pairs that differed from all representatives of the other five groups at least at the 3' terminal position of the discriminating oligo, while being invariant in all members of their own group. This second criterion significantly reduced the number of actinomycete, clostridium and cyanobacteria-specific probe pairs, as shown in Table 1, column B.
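
The first selection criterion can be illustrated with a minimal sketch. It assumes the group consensi are already aligned to equal length and simply lists the alignment columns where one group carries an unambiguous base that no other group consensus shares; such columns are candidates for the 3' terminal base of a discriminating oligo. The function name and the toy sequences are hypothetical and are not taken from the ARB or Omiga software or from real 16S data.

```python
def discriminating_positions(consensi, target):
    """Return 0-based alignment columns where `target` has an unambiguous
    base (A, C, G or T) that no other group consensus shares."""
    target_seq = consensi[target]
    others = [seq for name, seq in consensi.items() if name != target]
    hits = []
    for i, base in enumerate(target_seq):
        if base not in "ACGT":
            continue  # skip gaps and ambiguous consensus symbols
        if all(other[i] != base for other in others):
            hits.append(i)
    return hits


# Toy aligned consensi (hypothetical, for illustration only)
toy_consensi = {
    "bacilli":       "ACGTAGGCTA",
    "pseudomonads":  "ACGTCGGCTA",
    "cyanobacteria": "ACGTTGGCAA",
}

for group in toy_consensi:
    print(group, discriminating_positions(toy_consensi, group))
```

A real pipeline would additionally check that the flanking region chosen for the common probe is well conserved within the group, as described above.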

Figure 1
Schematic representation of LDR applied to microbial diversity. A) Each microbial group of interest is identified by a Common Probe and a Discriminating Oligo. The common probe is phosphorylated on its 5' end and contains a unique cZip Code affixed to its 3' end. The discriminating oligo carries a fluorescent label (Cy3) on its 5' end, and a discriminating base at its 3' terminal position. The two probes hybridize adjacently to each other on the template DNA (PCR-amplified rDNA) and the nick between the two oligos is sealed by the ligase only if there is perfect complementarity at the junction. The reaction can be thermally cycled. B) The presence of a microbial group is determined by hybridizing the content of an LDR to an addressable DNA Universal Array, where unique Zip Code sequences have been spotted.

Finally, in order to discard potentially nonspecific probe pairs, we analyzed each common probe and discriminating oligo using the Probe Match tool on the RDPII database, which allows probes to be checked against all the bacterial sequences not considered in our alignments [15]. This analysis significantly reduced the number of pseudomonad and myxobacteria-specific probe pairs. Furthermore, the identification of a clostridium-specific probe was not possible, while more than one was found for some of the other groups (Table 1, column C).
For the subsequent experimental work, we decided to select just one probe pair for each group of interest (Table 2). When more than one base was present at the same position of the consensus, we included inosine at these degenerate positions during oligo synthesis.
In order to have a positive control for the Ligation Detection Reaction, a universal probe pair, matching all the studied groups, was designed according to the process described above, and the corresponding Zip code was included in the Universal Array.

Zip Codes assignment and quality control of the universal microarray
We randomly selected six Zip code sequences from those described by Barany [14]. Each Zip code was randomly assigned to a single bacterial group. Each common probe was synthesized so as to have the corresponding Zip code complement (cZip code) affixed to its 3' end (Table 2). The corresponding sequences (cZip code + common probe) were checked against the RDPII database in order to avoid problems arising from false hybridisation (although specificity is ensured by the ligation reaction). No significant self-annealing of the six common probe-cZip sequences was detected by computer analysis (data not shown).
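
The addressing scheme can be sketched in a few lines. The example below assumes that the cZip code is the reverse complement of the spotted Zip code and that it is appended to the 3' end of the common probe; the sequences are short hypothetical placeholders, not the Zip codes of reference [14] or the probes of Table 2.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def common_probe_with_czip(common_probe, zip_code):
    """Append the cZip code (complement of the spotted Zip code) to the
    3' end of the common probe."""
    return common_probe + reverse_complement(zip_code)


def spot_for_product(product, zip_codes):
    """Return the group whose Zip code spot would capture a ligation product,
    or None if no cZip code is found at its 3' end."""
    for group, zip_code in zip_codes.items():
        if product.endswith(reverse_complement(zip_code)):
            return group
    return None


# Hypothetical, shortened Zip codes (illustration only)
zip_codes = {"bacilli": "GATTACAGGC", "pseudomonads": "TTGACCGTAA"}

# A ligation product = discriminating oligo + common probe + cZip code
product = "CCGT" + common_probe_with_czip("AAAC", zip_codes["bacilli"])
print(spot_for_product(product, zip_codes))
```

Because the Zip codes are independent of the target sequences, the same lookup works for any probe set, which is the sense in which the array is universal.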
In order to verify the quality of deposition of the Zip Code oligos on the slides, we performed hybridizations with Cy5-labeled poly(dT), which is complementary to the poly(dA)10

In the presence of the proper DNA template, the Universal Chip behaved as expected: only group-specific spots and universal spots showed a positive signal (Fig. 3, panels B through F). In addition, LDR assays were conducted on amplicons obtained from E. coli or C. perfringens genomic DNA: no spots were detected other than the strong universal signal (Fig. 3, panels G and H). LDR was also tested without template, yielding no signals, as expected.
This result indicated that, in the absence of the perfectly matching PCR product, the probes present in the LDR mix do not generate false signals. Thus, the LDR proceeds in a template-specific manner.
In order to establish the detection limit of the technique we performed LDRs starting from three different amounts of the same substrate. After purification and quantification of the PCR product, we performed the Ligation Detection Reaction starting from 100, 10 and 1 fmol of substrate. The detected signal progressively decreased: a barely visible signal was still detected using 1 fmol of substrate (Fig. 4), corresponding to 1 ng of a 1500 bp product (600 million copies of target molecules).
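
The figures quoted for 1 fmol can be checked with a short calculation. The sketch below assumes an average mass of about 650 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in the text; the function names are illustrative.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
AVG_BP_MASS = 650.0           # g/mol per base pair (approximation, assumed)


def copies_from_fmol(fmol):
    """Number of molecules in a given amount of substance (fmol)."""
    return fmol * 1e-15 * AVOGADRO


def mass_ng_from_fmol(fmol, length_bp):
    """Approximate mass (ng) of a double-stranded DNA fragment."""
    grams = fmol * 1e-15 * length_bp * AVG_BP_MASS
    return grams * 1e9


print(f"{copies_from_fmol(1):.3g} copies")           # about 6.02e+08
print(f"{mass_ng_from_fmol(1, 1500):.3f} ng")        # about 0.975 ng
```

Under this approximation, 1 fmol of a 1500 bp amplicon is roughly 0.98 ng and about 6 x 10^8 molecules, consistent with the rounded values given above.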

Use of complex molecular targets
In order to determine the efficiency of the LDR technique, we carried out different assays varying the complexity of the molecular target. In detail, we used artificial mixes of genomic DNA samples or mixes of PCR products.
In a first assay configuration, we mixed equimolar amounts of amplicons derived from the DNA of the five groups of interest, obtained from separate PCR reactions.

Figure 4
Sensitivity of LDR. The reaction was performed using 100 fmol (panel A), 10 fmol (panel B), 1 fmol (panel C) of purified PCR product from P. putida DNA. All images were acquired setting both PMT gain and laser power to 85%.

As shown in Fig. 5A, in the presence of mixed PCR products all the expected signals were detected. In a second configuration, we mixed equal quantities of genomic DNA belonging to the five groups of interest, and then performed a single PCR amplification. As shown in Fig. 5B, we were able to detect all the molecular targets.
Similar experiments were performed with balanced (1:1) and unbalanced (1:10) mixes of two out of five groups yielding the expected results (data not shown).
Finally, we considered DNA extracted from environmental water samples from two European lakes that have been fully characterized by TGGE for their content of cyanobacteria (S. Ventura, personal communication). Each DNA sample was divided into two aliquots: one was amplified using the Universal primers, the other using cyanobacteria-specific primers. The PCR products were used in LDR assays. When a selective amplification of the cyanobacterial 16S rDNA was performed, positive signals were obtained only from the rows corresponding to the Universal and cyanobacteria positions (Fig. 6, panels A and C). In contrast, when we employed amplicons obtained with the Universal primers, some of the other rows also showed a positive signal (Fig. 6, panels B and D), suggesting the presence of bacteria belonging to the groups under scrutiny in one or both environmental samples.

Discussion
The main goal of this work was the development of a flexible method to detect bacterial groups. The Ligation Detection Reaction, combined with a Universal Microarray, appeared to be an interesting approach suited to this application [13,14]. It requires PCR amplification of a target region, in this case the bacterial 16S rDNA, which is then subjected to a multiplex cycled LDR. The LDR is achieved using Pfu DNA ligase, a thermostable enzyme which seals the nick between two adjacent oligonucleotides (the common probe and the discriminating oligo), annealed to a complementary target, only if the oligonucleotides are perfectly base-paired, in particular at the junction site (Fig. 1A). Therefore, a single mismatch at the 3' terminal position of the discriminating oligo is able to prevent ligation, thus conferring total selectivity [14]. This feature gives the assay a high resolving power and reduces the effort needed to find stringent hybridisation conditions. As shown in Fig. 1B, the presence of a specific target is determined by hybridizing the content of an LDR to an addressable DNA universal array, on which every single spot contains oligos with a unique Zip Code. A complementary cZip code is affixed to the 3' end of each common probe. During hybridization, the cZip Code drives the LDR product to the corresponding Zip Code on the chip surface. As every discriminating oligo carries a Cy3 molecule on its 5' end, detection of hybridized LDR products can be accomplished by laser scanning.
Probe design can be considered a crucial point: during the definition of subgroup and group-specific consensi, a cut off of 75% preserved as much sequence information as possible, but required the inclusion of some degenerate positions in the probe sequences. Probes containing too many ambiguous residues were discarded, while a limited number of inosine residues was included in the oligos. Furthermore, we adopted a three-step selection process to ensure as much specificity as possible, and this involved the rejection of about 80% of the probes identified after the first step. Due to these stringent criteria and the requirements of the ligation assay, only a low number of suitable group-specific probes was identified.
In fact, at this level of phylogenetic resolution, it was very difficult to find unique positions, able to discriminate between groups, that fell in conserved regions inside the group itself. In our experience, if the target groups are phylogenetically less distant, like members of the same order, probe design can be much more fruitful. In this case, a relevant part of the diversity can be eliminated by the use of more specific PCR primers, instead of the "universal" primers F27-R1492. Moreover, in a more closely related subgroup we found that the 16S sequences are more similar, thus simplifying probe construction (Castiglioni et al. unpublished results). This suggests that the approach is particularly appealing for "fishing out" certain bacterial species within a complex microbial community.
The combined use of selective probes and LDR gave satisfactory results. LDR combines the specificity of hybridization base pairing with the selectivity introduced by the enzymatic reaction [14], resulting in good power of discrimination, as demonstrated by the presented results. It should be emphasized that perfect pairing at the 3' terminus of the discriminating oligo and the 5' terminus of the common probe is crucial for ligation. In contrast, mismatches placed along the remaining part of the two sequences are easily tolerated by the ligase, conferring a certain flexibility in probe design for tests on complex samples.
As described, we were able to detect the presence of different groups in balanced and unbalanced mixes (1:1 and 1:10 molar ratio, respectively).
The optimized LDR method can be performed starting from low amounts of substrate. As little as 1 fmol of PCR-amplified material can be observed under our conditions. Below this limit, not enough signal can be observed even when increasing the amount of probes and the number of cycles (data not shown). Apart from any consideration regarding overall sensitivity and the feasibility of our procedure for quantitation, these results (sensitivity down to 1 fmol and proper results with unbalanced mixes) suggest the possibility of detecting a low amount of a specific 16S molecular fragment within complex 16S molecular mixtures.

Conclusions
We think this approach is particularly appealing for several reasons. First of all, since the Zip Code sequences are not tied to a specific molecular analysis, they remain constant and their complements can be appended to any set of LDR primers. In this sense, the array can be described as Universal. Moreover, the optimization of hybridization conditions for each probe set is not required; the Universal chip therefore becomes a versatile tool, as new probe pairs can be added to the system without further optimization, thus reducing costs and set-up time. The presented results suggest that a combination of careful probe design, PCR and LDR can be a valuable tool for the detection of bacterial groups in the environment, although an intensive validation is required in order to ascertain potential interferences in complex natural samples.

Methods
All chemicals and solvents were purchased from Sigma-Aldrich (Italy) and used without further purification. Oligonucleotides were purchased from Interactiva Biotechnologie GmbH (Germany).

Ligation probe design
The probes for the Ligation Detection Reaction were designed to be specific to the 16S rDNA sequences of six different bacterial groups: actinomycetes, bacilli, clostridia, cyanobacteria, myxobacteria, and pseudomonads.
For each of these groups, a substantial number of 16S rRNA sequences (see Table 1), chosen among those available in the Ribosomal Database Project II, release 8.0 [http://rdp.cme.msu.edu/html/], were imported into GCG Omiga 2.0 (Oxford Molecular Ltd.). In every group, adopting the RDP taxonomic classification, the sequences were assembled in sub-groups and aligned using the Clustal W algorithm, yielding a consensus sequence with a cut off of 75% (meaning that 3 out of 4 sequences determined the consensus at a given position). Then, sub-group consensi were aligned within each group to extract a "group-specific" consensus, adopting the same cut off of 75%. Group-specific probe design was carried out on these "group-specific" consensi. The specificity of each probe pair (common probe and discriminating oligo) was checked against the RDP II database, using the Probe Match tool. All oligos were designed to have melting temperature (Tm) values between 64 and 70°C. Discriminating oligos were purchased with a Cy3 molecule at their 5' terminal position, and common probes with a phosphate in the same position.
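
The 75% cut off can be made concrete with a small sketch. It assumes a simple per-column majority rule on an existing alignment: a base is written into the consensus when it accounts for at least 75% of the sequences at that position, and an ambiguity symbol is written otherwise. How Omiga treats gaps and ties is not described in the text, so those details (and the choice of "N" as the ambiguity symbol) are assumptions.

```python
from collections import Counter


def consensus(aligned, cutoff=0.75, ambiguous="N"):
    """Column-wise consensus of equal-length aligned sequences.

    A symbol is kept when its frequency in the column is >= cutoff;
    otherwise the `ambiguous` symbol is emitted (assumed convention)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    out = []
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        out.append(symbol if count / len(column) >= cutoff else ambiguous)
    return "".join(out)


# Hypothetical toy alignments (not real 16S data)
subgroup_a = ["ACGT", "ACGA", "ACGT", "ACTT"]
subgroup_b = ["ACGT", "ACGA", "ACTA", "ACTT"]

print(consensus(subgroup_a))  # 3 of 4 agree at every position -> ACGT
print(consensus(subgroup_b))  # last two columns are split 2:2 -> ACNN
```

The same function can be applied a second time to the sub-group consensi to mimic the two-level procedure; positions left ambiguous at that stage correspond to the degenerate positions where inosine was used in the oligos.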

Universal microarray preparation
Each of our universal arrays consists of six rows, each corresponding to a group and containing ten replicas. Microarrays were prepared using CodeLink™ activated slides (Motorola Life Sciences), designed to covalently immobilize NH2-modified oligonucleotides. 5' amino-modified Zip Code oligonucleotides, carrying an additional poly(dA)10 tail at their 5' end, were diluted to 25 µM in Printing Buffer (pH 8.5). Spotting was performed using a non-contact piezo-driven dispensing system (Nanoplotter, GeSim, Germany). Printed slides were left overnight in a saturated NaCl chamber with a relative humidity of 75% (obtained by adding as much solid NaCl to water as needed to form a 1 cm deep slurry in the bottom of a plastic container with an airtight lid). Slides were subsequently placed for 20 minutes in a pre-warmed solution (50°C) containing 50 mM ethanolamine, 0.1 M Tris pH 9, 0.1% SDS. They were rinsed twice with water and washed on a shaker for 40 min in 4X SSC/0.1% SDS at 50°C. Finally, they were rinsed twice in distilled water and centrifuged at 800 rpm using microplate carriers.
Quality control of printed surfaces was performed by sampling one slide from each deposition batch. The printed slide was hybridized with 1 µM 5' Cy5-labeled poly(dT)10 in a solution containing 5X SSC and 0.1 mg/ml salmon sperm DNA at room temperature for 3 h, then washed for 15 min in 1X SSC. The fluorescent signal was checked by laser scanning following the procedures described in "Array hybridization and detection".
After thermal cycling was complete, 1 µl of proteinase K (1 mg/ml) was added, and the reaction was heated at 70°C for 10 min and then quenched at 94°C for 15 min. After this, PCR products were purified with the GFX PCR DNA purification kit (Amersham Pharmacia Biotech Inc, Piscataway, NJ), eluted in 50 µl of autoclaved water and quantified with a spectrophotometer.

Ligation detection reaction
The ligation reaction was carried out in a final volume of 20 µl containing 20 mM Tris-HCl (pH 7.5), 20 mM KCl, 10 mM MgCl2, 0.1% NP40, 0.01 mM ATP, 1 mM DTT, 2 pmol of each discriminating oligo, 2 pmol of each common probe and 1-500 fmol of purified PCR products. The reaction mixture was preheated for 2 min at 94°C and centrifuged in a microcentrifuge for 1 min; then 1 µl of 4 U/µl Pfu DNA ligase (Stratagene, La Jolla, California) was added. The LDR was cycled for 40 rounds of 94°C for 30 sec and 64°C for 4 min in a PCR Express thermal cycler (Hybaid, England).
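
Two quantities implied by this protocol are easy to derive: the total hold time of the cycling programme and the molar excess of each probe over the template. The sketch below uses only the numbers given above; ramp times between temperatures are ignored, which is an assumption, and the function names are illustrative.

```python
def ldr_cycling_minutes(cycles=40, denature_s=30, anneal_ligate_s=240):
    """Total hold time of the cycled LDR in minutes (ramp times ignored)."""
    return cycles * (denature_s + anneal_ligate_s) / 60


def probe_to_template_ratio(probe_pmol, template_fmol):
    """Molar excess of one probe over the PCR template."""
    return probe_pmol * 1000 / template_fmol


print(ldr_cycling_minutes())                 # 180.0 minutes
print(probe_to_template_ratio(2, 1))         # 2000.0-fold excess at 1 fmol
print(probe_to_template_ratio(2, 500))       # 4.0-fold excess at 500 fmol
```

Across the stated template range, each probe therefore remains in molar excess over the template, from about 4-fold at 500 fmol to about 2000-fold at 1 fmol.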

Array hybridization and detection
In a 0.5-ml microcentrifuge tube, the LDR mix (20 µl) was diluted to obtain 65 µl of hybridization mixture containing 5X SSC and 0.1 mg/ml salmon sperm DNA. The mix, after heating at 94°C for 2 min and chilling on ice, was applied onto the slide under an EasiSeal chamber of 2.4 cm² (Hybaid, England). Hybridization was carried out in the dark at 65°C for one and a half hours, in a temperature-controlled water bath. After removal of the chamber, the microarray was washed for 15 min in pre-warmed (65°C) 1X SSC, 0.1% SDS. Finally, the slide was spun at 800 rpm for 3 min.
The fluorescent signal was detected at 5 µm resolution using a ScanArray® 4000 laser scanning system (Packard GSI Lumonics, Billerica, MA) with the green laser for Cy3 dye (λex 543 nm/λem 570 nm). Both the laser power and the photomultiplier tube (PMT) gain were set at 70-95%.
The QuantArray® quantitative microarray analysis software (Packard GSI Lumonics) was employed to quantify the fluorescence intensity of the spots.