Utilization of tmRNA sequences for bacterial identification

Background: Ribosomal RNA molecules are widely used for phylogenetic and in situ identification of bacteria. Nevertheless, their use to distinguish microorganisms within a species is often restricted by the high degree of sequence conservation and limited probe accessibility to the target in fluorescence in situ hybridization (FISH). To overcome these limitations, we examined the use of tmRNA for in situ identification. In E. coli, this stable 363-nucleotide RNA is encoded by the ssrA gene, which is involved in the degradation of truncated proteins.
Results: Conserved sequences at the 5'- and 3'-ends of tmRNA genes were used to design universal primers that could amplify the internal part of ssrA from Gram-positive bacteria having low G+C content, i.e. genera Bacillus, Enterococcus, Lactococcus, Lactobacillus, Leuconostoc, Listeria, Streptococcus and Staphylococcus. Sequence analysis of tmRNAs showed that this molecule can be used for phylogenetic assignment of bacteria. Compared to 16S rRNA, the tmRNA nucleotide sequences of some bacteria, for example Listeria, display considerable divergence between species. Using E. coli as an example, we have shown that bacteria can be specifically visualized by FISH with tmRNA targeted probes.
Conclusions: Features of tmRNA, including its presence in phylogenetically distant bacteria, conserved regions at gene extremities and a potential to serve as target for FISH, make this molecule a possible candidate for identification of bacteria.


Background
In recent years, molecular approaches based on nucleotide sequences of ribosomal RNA (rRNA) have become widely used tools for identification of bacteria [1][2][3][4]. The high degree of evolutionary conservation makes 16S and 23S rRNA molecules very suitable for phylogenetic studies above the species level [3][4][5]. More than 16,000 sequences of 16S rRNA are presently available in public databases [4,6]. The 16S rRNA sequences are commonly used to design fluorescently labeled oligonucleotide probes. Fluorescence in situ hybridization (FISH) with these probes, followed by observation with epifluorescence microscopy, allows the identification of a specific microorganism in a mixture with other bacteria [2][3][4]. By shifting probe target sites from conserved to increasingly variable regions of rRNA, it is possible to adjust the probe specificity from kingdom to species level.
Nevertheless, 16S rRNA sequences of closely related strains, subspecies, or even of different species are often identical and therefore cannot be used as differentiating markers [3]. Another restriction concerns the accessibility of target sites to the probe in FISH experiments. The presence of secondary structures, or protection of rRNA segments by ribosomal proteins in fixed cells, can limit the choice of variable regions as in situ targets for oligonucleotide probes [7,8]. One way to overcome these limitations is to use molecules other than rRNA for phylogenetic identification of bacteria, whose nucleotide sequences are sufficiently divergent to design species-specific probes and which are more accessible to oligonucleotide probes. For this purpose we investigated the possibility of using tmRNA (also known as 10Sa RNA; [9][10][11]). This molecule was discovered in E. coli and described as a small stable RNA, present at ~1,000 copies per cell [9,11]. The high copy number is an important prerequisite for FISH, which works best with naturally amplified target molecules. In E. coli, tmRNA is encoded by the ssrA gene, is 363 nucleotides long and has properties of both tRNA and mRNA [12,13]. tmRNA was shown to be involved in the degradation of truncated proteins: it associates with ribosomes stalled on mRNAs lacking stop codons, which results in the addition of a C-terminal peptide tag to the truncated protein. The peptide tag directs the abnormal protein to proteolysis [14,15]. So far (August 2001; The tmRNA Website: [http://www.indiana.edu/~tmrna/]), 165 tmRNA sequences have been determined [16,17]. tmRNA is likely to be present in all bacteria and has also been found in algal chloroplasts, the cyanelle of Cyanophora paradoxa and the mitochondrion of the flagellate Reclinomonas americana [10,17,18].
In this study we present the first data on the suitability of tmRNA for phylogenetic affiliation and in situ identification of bacteria.

Analysis of ssrA sequences from the species Lactococcus lactis
Lactococcus lactis was chosen to investigate the variability of tmRNA within one species. We used oligonucleotides B1 and B2, designed according to conserved tmRNA regions of B. subtilis (Table 1; [13]), to PCR-amplify the internal part of the ssrA gene (named by analogy with the E. coli tmRNA gene; [12]) from L. lactis strain IL1403. The nucleotide sequence of the amplified fragment was determined using the same primers. Inverse PCR was used to obtain the sequence of the complete gene and of flanking regions [19]. For that purpose, chromosomal DNA of IL1403 was digested separately with restriction endonucleases BclI, BglII, BspHI, HindIII, NsiI or SpeI. The digested lactococcal DNA from each restriction reaction was ligated and PCR-amplified using primers L3 and L4 (Fig. 1, Table 1). A PCR fragment of approximately 2,000 bp was obtained from HindIII-digested and religated DNA. The fragment was sequenced using primers L3, L4, L11 and L12 (Table 1) to obtain the full sequence of ssrA and surrounding regions (Fig. 1). The limits of the lactococcal ssrA gene were determined after alignment of this sequence with that of the B. subtilis ssrA gene [13].
The lactococcal tmRNA gene is 356 bp long and surrounded by typical regulatory sequences, suggesting a monocistronic operon structure (Fig. 1). The terminal 3'-CCA triplet of tmRNA is not encoded by the gene, and may be added posttranscriptionally, as in the case of B. subtilis [13]. The conserved peptide tag sequence, which lies between UAA triplets, can be distinguished in the central part of the gene. This sequence allows us to predict a 12 amino acid peptide (AKNNTQTYAMAA; see Fig. 1), which resembles the ssrA-encoded E. coli tmRNA tag sequence (ANDENYALAA; [14,15]). These similar features indicate that in Gram-positive bacteria, as in E. coli, the ssrA gene product may be involved in the turnover of truncated proteins.

Figure 1
Nucleotide sequence of the L. lactis IL1403 chromosomal fragment encoding the ssrA gene. In capital letters, ssrA gene; in italics, presumed tag sequence, followed by a stop codon (the putative peptide is shown above the DNA sequence); in bold, variable positions 156, 208 and 235; underlined, regulatory regions (-10, -35 and terminator). Localization of primers 3L, 4L, 11L and 12L is indicated by arrows.
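Predicting the peptide tag amounts to translating the short reading frame inside ssrA up to its stop codon. The following Python sketch illustrates that step under stated assumptions: the nucleotide string is a hypothetical coding sequence chosen only so that it encodes the reported peptide (it is not the actual IL1403 sequence), and the function name `translate_tag` is introduced here for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_tag(dna: str) -> str:
    """Translate a DNA reading frame codon by codon until the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":  # stop codon ends the tag
            break
        peptide.append(aa)
    return "".join(peptide)


# Hypothetical reading frame (NOT the real IL1403 sequence); spaces only
# separate codons for readability and are removed before translation.
tag_frame = "GCT AAA AAT AAC ACA CAA ACT TAC GCT ATG GCT GCA TAA".replace(" ", "")
print(translate_tag(tag_frame))  # prints AKNNTQTYAMAA
```

With a real ssrA sequence, the reading frame would first have to be located, for example by alignment with a tmRNA whose tag region is already known.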
To design universal primers for PCR amplification of tmRNA genes from different Gram-positive bacteria, we aligned tmRNA sequences of L. lactis and B. subtilis. The primers 11L and 12L corresponded to the conserved 5' and 3' ends of the tmRNA molecule, which are identical in L. lactis and B. subtilis (Fig. 1). These primers were used for PCR amplification and sequencing of internal segments of ssrA from different L. lactis strains. Alignment of tmRNA sequences of 16 L. lactis strains showed that there were three variable positions in the gene (156, 208 and 235; Fig. 1). These differences between L. lactis tmRNA sequences allowed us to distinguish three groups within the species, which corresponded to L. lactis ssp. lactis, L. lactis ssp. lactis biovar. diacetylactis and L. lactis ssp. cremoris (Table 1). Strain IL890 is an exception: by ssrA sequence it belongs to the L. lactis ssp. lactis group, although it is classified as biovar. diacetylactis. IL2961 differs from the other L. lactis ssp. lactis strains by a T in the variable position 235 and a C in position 167 (Fig. 1). These differences of IL2961 are not surprising, since this strain has an unusual RAPD pattern [20]. Although the number of strains tested and the divergence of lactococcal tmRNA sequences are too low to support a significant clustering, these results show that biovariety diacetylactis may be distinguished by differences in chromosomal sequences not related to diacetyl production or to a citrate permease encoding plasmid, which is in line with previous reports [21][22][23]. Comparison of tmRNA sequences from a larger collection of lactococcal strains will allow us to confirm this finding.
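Finding variable positions and grouping strains by the bases they carry there can be illustrated with a short Python sketch. This is not the procedure used in the study, only one simple way to do it. The strain names and six-base sequences are invented toy data, and the function names are assumptions introduced for illustration.

```python
from collections import defaultdict


def variable_positions(aligned: dict[str, str]) -> list[int]:
    """Return 1-based alignment columns at which the sequences differ."""
    seqs = list(aligned.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def group_by_variable_positions(aligned: dict[str, str]) -> dict[str, list[str]]:
    """Group strain names by the bases they carry at the variable columns."""
    positions = variable_positions(aligned)
    groups = defaultdict(list)
    for name, seq in aligned.items():
        signature = "".join(seq[p - 1] for p in positions)
        groups[signature].append(name)
    return dict(groups)


# Toy alignment (hypothetical strains and sequences, for illustration only).
toy_alignment = {
    "strain_A": "ACGTAC",
    "strain_B": "ACGTAC",
    "strain_C": "ATGTAC",
    "strain_D": "ATGTAA",
}
print(variable_positions(toy_alignment))           # [2, 6]
print(group_by_variable_positions(toy_alignment))
```

In the toy data, strains A and B share the signature `CC`, while C (`TC`) and D (`TA`) each form their own group, analogous to separating subspecies by a few diagnostic positions.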

Phylogenetic analysis of ssrA genes from Gram-positive bacteria
We PCR-amplified and sequenced the internal part of the ssrA gene of 24 strains of Gram-positive bacteria with low G+C content, namely of the genera Bacillus, Lactococcus, Lactobacillus, Streptococcus, Staphylococcus, Leuconostoc, Listeria and Enterococcus. The divergence of tmRNA sequences within one genus differed according to the species. For example, we found 5 base differences between the ssrA genes of Listeria innocua and Listeria monocytogenes (1.3% divergence). The 16S rRNA sequences of these species are less divergent (0.5%). In staphylococci, S. saprophyticus tmRNA differs from that of S. xylosus in only one base (0.3% divergence). In contrast, the difference between these two species and S. epidermidis is more marked (more than 20% divergence). The divergence between species within the genera Lactococcus, Lactobacillus, Leuconostoc and Enterococcus is 8.9%, 2.9%, 18% and 6.6%, respectively. Sequence comparisons showed that in some cases, as for Listeria innocua and Listeria monocytogenes, the tmRNAs are more divergent than 16S rRNA, thus making tmRNA a promising tool for species identification. Although the total number of base differences even in a less divergent 16S rRNA sequence might be higher than or at least equal to that of the corresponding tmRNA sequence, the advantage of the latter is that only about 350 bases have to be sequenced, compared to 1,500 bases for the 16S rRNA sequence. For reconstruction of a tmRNA-based phylogenetic tree, we aligned our sequences to other available ssrA sequences [16,17]. Depending on the treeing method and whether a filter was used or not, the resulting trees differed in the branching order of particular clusters, although the clusters themselves remained stable (data not shown). This finding is reflected in the multifurcations drawn in the consensus tree (Fig. 2).
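The divergence percentages quoted above are pairwise comparisons of aligned sequences. A minimal Python sketch of such a calculation is given here. It assumes that divergence is the share of mismatching positions among compared columns and that columns containing a gap are skipped; the study does not state how gaps were treated, so this is an assumption. The function name and the toy sequences are hypothetical.

```python
def percent_divergence(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of mismatching positions between two aligned sequences.

    Columns where either sequence has a gap are ignored (an assumption
    made for this sketch, not a rule taken from the study).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy example: one mismatch over ten aligned bases.
print(percent_divergence("ACGUACGUAC", "ACGUACGAAC"))  # 10.0
```

By the same arithmetic, a handful of differences over a fragment of roughly 350 bases gives a divergence in the range of one to two percent.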
A stable branching order could not be maintained, indicating that the resolution of tmRNA might be restricted to subspecies, species or genus level due to high variability. This was also evident when calculating the 50% conservation filter, which left only 180 bases for further analyses. For a thorough phylogenetic study above the level of intraspecies or intrageneric relationships, the tmRNA sequences lack sufficiently high numbers of more conserved positions, which is one of the prerequisites for phylogenetic markers [24]. However, at the level of genus, species and subspecies they might be an alternative to rRNA sequences that are too conserved for a proper resolution at that level.
The presence of conserved ends in the DNA sequence facilitated the design of universal primers for PCR amplification of the near full-size ssrA gene of numerous Gram-positive bacteria. The small size of the molecule is an attractive characteristic, since it allowed the same two primers to be used for sequence determination. The use of PCR amplification and sequencing of tmRNA genes was also demonstrated for other groups of bacteria [10,25]. Based on the above analyses we propose that tmRNA sequences can be used for identification of Gram-positive bacteria at the species and genus level. For some bacteria, e.g., lactococci, differences in tmRNA sequences make strain identification possible at the subspecies level, especially if sensitive techniques such as DGGE or TGGE are used [26].
FISH with the tmRNA targeted probes
E. coli W3110 and its ∆ssrA derivative were used to test whether it is possible to specifically visualize bacteria by means of FISH with tmRNA targeted probes. Because of the relatively low number of tmRNA molecules per cell (estimated at ~1,000; [11]), the HRP-tyramide signal amplification system was applied [27]. We were able to visualize strain W3110 with the tmRNA targeted HRP-labeled probe (Fig. 4, b). The signal was specific, since the ∆ssrA strain did not give any signal with the same probe (Fig. 4, d). These results demonstrate that tmRNA, fluorescently labeled after HRP-tyramide signal amplification, can be used to distinguish E. coli strains that express the ssrA gene from those that do not. The ability to visualize bacterial cells also indicates that, despite its small size, the tmRNA molecule was not totally washed out of the cell during the FISH procedure. Using a less sensitive technique with fluorescein labeled probes (E1-E5, Table 1), we detected a signal only with strain W3110 ∆ssrA carrying the ssrA gene on plasmid p10SA. No signal was obtained with the wild type strain W3110, even when all five probes were used simultaneously.
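An oligonucleotide probe for FISH is a DNA strand complementary to its RNA target, so its sequence is the reverse complement of the target region. The Python sketch below shows only that relationship. The target string is a made-up example rather than one of the probes E1-E5, and the function name is an assumption for illustration.

```python
# RNA base -> complementary DNA base
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def probe_for_target(rna_target: str) -> str:
    """Return the DNA oligonucleotide (5'->3') complementary to an RNA target (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna_target.upper()))


# Hypothetical 8-base target region, for illustration only.
print(probe_for_target("AUGGCUAC"))  # GTAGCCAT
```

Real probe design would also weigh length, melting temperature and target accessibility, none of which this sketch considers.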
The low efficiency of fluorescein labeled probes may be due to low copy number or poor accessibility of the target RNA. The probes used in our study were targeted to regions with different degrees of accessibility [28]. However, using these probes for FISH with the plasmid carrying strain, we did not observe differences in signal intensity (data not shown). It is likely that the tmRNA copy number per cell is too low to be detected by FISH using fluorescein labeled probes, which may restrict its use as a target molecule.
Besides rRNA and regulatory RNAII of ColE1-related plasmids, tmRNAs are the only molecules in bacteria that have thus far been detected by FISH with fluorophore- or HRP-labeled oligonucleotide probes [29]. All three types of molecules are non-translated RNAs. Labeling of a specific mRNA by FISH did not give any signal under similar conditions despite strong expression of the selected gene (our unpublished results). mRNA was visualized by FISH only with longer, digoxigenin labeled transcript probes followed by antibody detection [30][31][32]. The poor performance of oligonucleotide probes in the detection of mRNA may be attributed to the involvement of mRNA in the translation process.

Conclusions
We determined the sequences of tmRNA genes in numerous Gram-positive bacteria. The presence of conserved regions at the extremities of tmRNA genes that flank divergent sequences, and the possibility of detecting tmRNA by FISH, make this molecule a potential tool for identification of bacteria.

Strains, media and general methods
The bacterial strains used in this study are listed in Table 2. E. coli and B. subtilis were grown on L-Broth and other bacteria on BHI broth (Difco, Detroit, MI). Solid medium contained 1.5% Bacto-Agar (Difco, Detroit, MI). Standard procedures were used for DNA preparation, restriction endonuclease digestion and DNA electrophoresis [33]. Plasmid p10SA, encoding the ssrA gene of E. coli, and strain W3110∆ssrA were kindly provided by Dr. H. Inokuchi [12]. B. subtilis strain 168 is from the collection of Dr. C. Anagnostopulos [34]. Chromosomal DNA of staphylococci was obtained from M. C. Montel (France). Other strains are from the CNRZ collection (INRA, Jouy-en-Josas, France). The Lactococcus lactis ssp. lactis strains were considered to belong to biovariety diacetylactis if they produced more than 4 mg/l diacetyl (P. Tailliez and J. Tremblay, unpublished).

Oligonucleotide probes
The nucleotide sequences of probes are listed in Table 1. These probes were labeled with either fluorescein (F-) or horseradish peroxidase (HRP-). Fluorescent and non-labeled probes were purchased from Eurobio (Les Ulis, France). The probes were labeled with horseradish peroxidase as described before [27]. To obtain the PCR product of corresponding size for strains L. gallinarum CNRZ1931, L. plantarum CNRZ211 and S. xylosus DSM20266, we had to lower the stringency of PCR (annealing temperature 30°C instead of 50°C), which may indicate that the primers were not fully complementary to target sequences. PCR products were run on an agarose gel; the fragment of corresponding size was cut from the gel, purified using the Gene Clean kit (Bio101, Vista, CA) and sequenced with the same primers using a DNA sequencing kit (PE Applied Biosystems, Warrington, UK). DNA sequences were analyzed with the Staden sequence analysis package [35].

Figure 2
Consensus tree based on tmRNA sequences determined in this study. For details on the Lactococcus lactis group see Table 2. The sequence of E. coli was used for setting the root. The bar represents 10% estimated sequence divergence. * Sequences of bacteria presented without a strain number were taken from the tmRNA databases (references 16,17).
For the phylogenetic study, the new tmRNA sequences [16,36] were aligned on the basis of the published alignment [36] with the aid of the ARB software package [http://www.mikro.biologie.tu-muenchen.de/pub/ARB/], which was also used for the following analyses. A 50% conservation filter was calculated to exclude the most variable positions [24]. Three different treeing methods, namely maximum likelihood, maximum parsimony and distance matrix, were applied. For the latter method only the part of the sequences common to all was used, since it was sensitive to different sequence lengths. Finally, a consensus tree was drawn based on the maximum likelihood tree [24].
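The idea behind a 50% conservation filter can be sketched in Python as keeping only those alignment columns in which the most frequent character occurs in at least half of the sequences. This is one plausible reading of such a filter and not a reproduction of the ARB implementation, whose handling of gaps and ambiguous bases may differ. The function names and the toy alignment are assumptions for illustration.

```python
from collections import Counter


def conservation_mask(aligned: list[str], threshold: float = 0.5) -> list[bool]:
    """Mark columns whose most frequent character reaches the threshold frequency.

    Gap characters are counted like any other character here, which is a
    simplification made for this sketch.
    """
    mask = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        mask.append(most_common_count / len(column) >= threshold)
    return mask


def apply_mask(seq: str, mask: list[bool]) -> str:
    """Keep only the positions of seq that pass the filter."""
    return "".join(base for base, keep in zip(seq, mask) if keep)


# Toy alignment (hypothetical): columns 1 and 2 pass, columns 3 and 4 do not.
toy = ["ACGT", "ACTA", "ATCC", "AGAG"]
mask = conservation_mask(toy)
print(mask)                            # [True, True, False, False]
print([apply_mask(s, mask) for s in toy])
```

Tree reconstruction would then be run on the filtered columns only, which is how a filter can reduce an alignment to a smaller number of usable positions.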

DNA sequence accession numbers
Sequences of the tmRNA gene fragment amplified using primers 11L and 12L are available in GenBank (see accession numbers in Table 2). The primer sequences are not included in the deposited sequences.