- Research article
- Open Access
Tandem repeats analysis for the high resolution phylogenetic analysis of Yersinia pestis
BMC Microbiology volume 4, Article number: 22 (2004)
Yersinia pestis, the agent of plague, is a young and highly monomorphic species. Three biovars, each thought to be associated with one of the last three Y. pestis pandemics, have been defined based on biochemical assays. More recently, DNA-based assays, including DNA sequencing, IS typing and DNA arrays, have significantly improved current knowledge of the origin and phylogenetic evolution of Y. pestis. However, these methods suffer either from a lack of resolution or from the difficulty of comparing data between laboratories. Variable number tandem repeats (VNTRs) provide valuable polymorphic markers for genotyping and phylogenetic analysis in a growing number of pathogens, and have given promising results for Y. pestis as well.
In this study we have genotyped 180 Y. pestis isolates by multiple locus VNTR analysis (MLVA) using 25 markers. Sixty-one different genotypes were observed. The three biovars were distributed into three main branches, with some exceptions. In particular, the Medievalis phenotype is clearly heterogeneous, resulting from different mutation events in the napA gene. Antiqua strains from Asia appear to hold a central position compared to Antiqua strains from Africa. A subset of 7 markers is proposed for the quick comparison of a new strain with the collection typed here. This can be easily achieved using a Web-based facility set up specifically for running such identifications.
Tandem-repeat typing may prove to be a powerful complement to the existing phylogenetic tools for Y. pestis. Typing can be achieved quickly at a low cost in terms of consumables, technical expertise and equipment. The resulting data can be easily compared between different laboratories. The number and selection of markers will eventually depend upon the type and aim of investigations.
Within the Y. pestis species, strains are separated into three biovars according to their ability to reduce nitrate and to ferment glycerol. Since Y. pestis was connected to plague by Yersin, strains of biovar Antiqua have generally been isolated from Asia and from East and Central Africa, Medievalis strains from Central Asia, and Orientalis strains worldwide. Y. pestis is thought to have evolved recently from Yersinia pseudotuberculosis, some 1,500 to 20,000 years ago. Based on biochemical assays, Devignat proposed that Antiqua strains, which probably caused the first known pandemic, represent the ancestor. Medievalis strains are suggested to be responsible for the second pandemic, whereas the third pandemic was associated exclusively with Orientalis strains. This overall scenario, although not formally established, is supported by the higher diversity observed in the Antiqua and Medievalis biovars, as measured by IS typing, and by the geographic origin of strains [3–5]. "Pestoides" strains are particular Y. pestis isolates found recently in Russia, which infect unique species of rodents [5, 6].
Several molecular methods have been used for genotyping Y. pestis strains, mostly based on pulsed-field gel electrophoresis (PFGE), insertion sequence polymorphism and ribotyping. Recently, multiple locus variable-number-of-tandem-repeat (VNTR) analysis (MLVA) was shown to be a promising method for genotyping a number of pathogens [7–15]. When applicable, MLVA is of great interest because data produced in different laboratories can be easily exchanged and merged. This is especially relevant for pathogens such as Y. pestis, for which strains cannot be easily exchanged for security reasons. In this context, the availability of standardized typing tools that are easy to set up is important to facilitate research efforts on this pathogen. The complete genomic sequences of Y. pestis CO92, biovar Orientalis, and of strain KIM, biovar Medievalis, have been determined, facilitating the identification of tandem repeats and consequently the selection of primers for MLVA [8, 15].
Until now, only small series of Y. pestis strains have been typed by MLVA [8, 9], and although the method seemed able to genotype strains reproducibly and accurately, a large-scale study seemed necessary. In the present work, a collection of 180 isolates of Y. pestis from different geographical origins, including various Y. pestis reference strains, was typed using the 25 VNTRs previously described.
The 25 loci could be amplified (Figure 1 and Additional file 1) in all 180 isolates (Table 1), with the exception of locus ms06, which failed to yield a PCR amplification product in three strains (corresponding to genotypes 4 and 5, Figure 2) despite numerous attempts. Sixty-one different genotypes were identified in this collection, numbered 1 to 61 (Figure 2). Clustering analysis correctly separates Y. pestis Orientalis isolates, comprising genotypes 9 to 51, from Antiqua (genotypes 2, 3 and 5 to 8) and Medievalis (genotypes 4 and 52 to 61). Antiqua strains of essentially African origin (genotypes 2, 3 and 5, comprising 6 different strains) and the four Antiqua strains from Russia and Asia (genotypes 6 and 7, originating from Russia, and genotype 8, Antiqua strains KUMA and Yokohama, originating from Manchuria and Japan) are clearly separated. Within biovar Orientalis, a group of isolates that includes CO92 (genotype 15) has been assigned to the IS100 typing group O1 using the PCR-based typing method described by Motin et al. (data not shown; results indicated in Figure 2). This demonstrates that the MLVA clustering correlates well with another molecular typing method. Additional PCR-IS100 typing indicates that the other Orientalis strains, mostly from Vietnam, are of the O2a type (Figure 2). This is in agreement with the report from Motin et al. suggesting an association of O2a with South-East Asia. These Orientalis isolates are further separated by MLVA into three main branches comprising genotypes 27 to 51. One representative from each Orientalis genotype (genotypes 9 to 51) was assayed by PCR for the presence of the glpD deletion characterised by Motin et al. All strains yielded a PCR product of the size corresponding to the deleted allele (data not shown).
Most Medievalis strains are also clustered into one major branch (genotypes 52 to 61), with the exception of the strain representing genotype 4. The Pestoides isolate from Georgia (genotype 1) is weakly grouped with the African Antiqua strains. Among Antiqua isolates, the KUMA and Yokohama isolates share the same MLVA genotype (genotype 8) and possess a seemingly specific ms09 allele.
As a complementary analysis, a minimum spanning tree was constructed (Figure 3). This kind of analysis is applicable to categorical data sets. The creation of hypothetical types (open circles) further minimizes the summed distance of all branches of the tree. The numbers indicated in the circles correspond to the genotype numbers listed in Figure 2. The Orientalis strains (genotypes 9 to 51) are closely related and are grouped into a single complex. The O2a strains from Vietnam constitute a well-defined subgroup. The backbone of the O2a cluster is formed by genotypes 28, 40, 46 and 48, which together represent 97 isolates. The two Antiqua strains KUMA and Yokohama from Asia (genotype 8) are clearly located outside the Orientalis group in this analysis, in a position intermediate between the Medievalis group (genotypes 52 to 61) and the Orientalis group. All strains classified as Medievalis based on the nitrate reductase assay fall within the Medievalis group, with one exception: the strain from Kenya defining genotype 4, which is very close to Antiqua genotype 5 (two strains from Kenya and Congo). The six genotypes representing Antiqua strains (African strains, genotypes 2, 3 and 5; Asian strains, genotypes 6 to 8) are very loosely connected to each other, suggesting a very high diversity of this biovar.
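The idea of a minimum spanning tree over categorical MLVA profiles can be illustrated with a minimal Python sketch. It assumes that the distance between two genotypes is simply the number of loci at which their alleles differ, and it uses Prim's algorithm. It does not reproduce the BioNumerics implementation: hypothetical types (missing links) and the single-locus-variant priority rule are left out, and the genotype names and repeat counts below are invented for illustration.

```python
def categorical_distance(profile_a, profile_b):
    """Number of loci at which two MLVA profiles carry different alleles."""
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns a list of (genotype, genotype, distance) edges."""
    names = sorted(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the shortest edge joining a genotype in the tree to one outside it.
        dist, u, v = min(
            (categorical_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Hypothetical repeat counts at four loci; not data from this study.
toy_profiles = {
    "gA": (5, 7, 3, 9),
    "gB": (5, 7, 3, 8),
    "gC": (5, 6, 3, 8),
    "gD": (2, 4, 1, 8),
}

tree = minimum_spanning_tree(toy_profiles)
total_length = sum(dist for _, _, dist in tree)
for u, v, dist in tree:
    print(u, v, dist)
print("total tree length:", total_length)
```

In this toy example gA, gB and gC form a chain of single-locus variants, while gD attaches through a longer branch, which is the kind of loose connection described above for the Antiqua genotypes.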
At least two independent types of Medievalis strains
The strain representing genotype 4 is nitrate-reductase negative, and for this reason has been phenotypically assigned to the biovar Medievalis. However, its position in Figures 2 and 3, very close to genotype 5 (Antiqua strains) and very distinct from the Medievalis cluster, and its geographic origin (Kenya) prompted us to investigate the origin of its nitrate-reductase deficiency in more detail. The Medievalis strain KIM is nitrate-reductase deficient because of a single point mutation in the napA gene. We have analyzed all Medievalis strains from genotypes 52 to 61, and they showed the same point mutation (data not shown). In contrast, the napA gene in the strain representing genotype 4 has been inactivated by a deletion, as shown by the absence of amplification of the gene in spite of the use of different primers from this locus, and by the absence of a hybridisation signal in a Southern blot experiment (data not shown).
A simple MLVA assay comprising 7 markers
We have evaluated the possibility of defining a smaller set of markers for routine typing and comparison of new strains with the data from the present collection. Table 2 lists the main characteristics of the 25 loci. Their name is indicated in the first column. The second column indicates, when relevant, the name of the corresponding marker in Klevytska et al. The third and fourth columns indicate the repeat size and the number of alleles in our collection of strains and that of Klevytska et al. The fifth and sixth columns indicate the size range observed in our collection of strains, and the seventh column contains the polymorphism index of each marker. Seven markers (ms01, ms04, ms06, ms07, ms46, ms62, ms70) have a polymorphism index value above 0.6, seven others have an index value between 0.4 and 0.5, and the last eleven have a polymorphism index below 0.4. Fifty-seven genotypes are resolved with the 7 most polymorphic loci, compared with 61 when using the full set of 25 markers (analysis not shown).
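The comparison between the full panel and a reduced panel amounts to counting distinct allele combinations over a chosen set of loci. The following Python sketch shows that count under simple assumptions: each genotype is a mapping from locus name to repeat number, and two genotypes are merged when they agree at every locus in the subset. The locus names follow the article, but the repeat numbers are invented and the function name is hypothetical.

```python
def count_genotypes(profiles, loci):
    """Count distinct allele combinations when only `loci` are considered."""
    return len({tuple(profile[locus] for locus in loci) for profile in profiles})


# Hypothetical profiles (repeat counts); the values are not from this study.
toy_profiles = [
    {"ms01": 6, "ms04": 8, "ms06": 7, "ms09": 20},
    {"ms01": 6, "ms04": 8, "ms06": 7, "ms09": 21},
    {"ms01": 7, "ms04": 8, "ms06": 7, "ms09": 20},
    {"ms01": 7, "ms04": 9, "ms06": 5, "ms09": 20},
]

full_panel = count_genotypes(toy_profiles, ["ms01", "ms04", "ms06", "ms09"])
reduced_panel = count_genotypes(toy_profiles, ["ms01", "ms04", "ms06"])
print(full_panel, reduced_panel)
```

Dropping a locus can only merge genotypes, never split them, so the reduced count is always less than or equal to the full count.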
In the present report, an MLVA typing assay comprising 25 markers has been applied to a collection of Y. pestis isolates of various origins, but with a strong bias towards Orientalis strains from South-East Asia. One hundred and eighty strains or isolates (Table 1) have been genotyped and 61 different MLVA genotypes were identified (see Additional file 1). Clustering analysis and minimum spanning tree analysis suggest relations between the different strains and biovars which are in excellent agreement with current knowledge. In spite of the very limited number of Antiqua and Medievalis strains which could be investigated here, the data obtained suggest the existence of two groups of Antiqua strains. The first group, from Russia and Asia, represented by genotypes 6 and 7 (from Russia) and 8 (KUMA, Yokohama), holds an intermediate position between the Medievalis and the Orientalis groups. The second group comprises the African Antiqua strains. IS100 typing distinguishes the KUMA and Yokohama isolates, which were typed as A2 and A1b, respectively. They are very similar in DNA microarray studies.
Medievalis and Orientalis strains derive from Antiqua strains by the loss of metabolic functions, namely the capacity to reduce nitrate and to metabolize glycerol, respectively. Whereas all Orientalis strains investigated so far (including those in this report) are derived from a single ancestor carrying a simple deletion in the glpD gene, we report here that the Medievalis phenotype can be associated with at least two independent mutation events in the napA gene. This underlines the fact that the initial biochemical tests should be complemented, or replaced, by direct molecular analyses of the glpD and napA genes.
It is tempting to speculate about possible scenarios suggested by Figure 3 and current knowledge. Pestoides strains originate from Central Asia and are proposed to be an outgroup of the Y. pestis group. Their genetic composition is relatively distinct from Y. pestis, in terms of plasmids and chromosome, as assayed by DNA array analysis for instance. In Figure 3, the Pestoides strain studied is very distantly related to a hypothetical missing link close to the center of the figure. Hypothetical missing links (open circles) are created by the minimum spanning tree software if they result in a reduction of the total tree length. Many missing links are suggested by the software in this area. They enable the connection of the three branches made of the African Antiqua group, the Medievalis group, and the Orientalis group. The strains closest to this central position are the Antiqua strains from Asia. This and the position of the Pestoides strain suggest that all three biovars may originate from Asia. Many more strains from these regions will be needed, however, to test this hypothesis. For instance, strain Nicholinsk51 was shown in an earlier study to share some features of Orientalis strains while being an Antiqua strain from Asia. The authors of that study favored the hypothesis that this strain was a revertant from the Orientalis phenotype. We would rather predict that Nicholinsk51 will be placed by MLVA analysis between the central group of Figure 3 and the Orientalis group.
In total, in our study and that of Klevytska et al., 61 VNTRs were characterized, and still more exist in the Y. pestis genome that can be tested for the selection of an optimal set for MLVA if necessary. Different questions can be addressed by different sets of tandem repeat loci. Global phylogenetic investigations will best be done using loci with a low or moderate mutation rate. Forensics, or local outbreak investigations, may use loci with a higher mutation rate, particularly simple sequence repeats such as the tetranucleotide described by Adair et al. The mutation rate of a tandem repeat locus and its phylogenetic value are very poorly predicted from sequence analysis [15, 22], so they have to be measured experimentally by typing collections of isolates, as done here (Table 2 and Additional file 1).
Once set up, MLVA is a very powerful and reproducible genotyping method, and it is hoped that this simple molecular tool will help unravel the molecular phylogeny of Yersinia pestis when applied to a larger number of isolates. In comparison, MLST analysis proved to be almost non-informative within Y. pestis. DNA array analysis [20, 23] shows a very low resolution, with only a few different genotypes identified so far. IS typing by Southern blotting has a very high discriminatory power, but the resulting data are not easily comparable between different laboratories. PCR-IS typing developed by Motin et al. provides exchangeable data but with a much reduced resolution compared to classical IS typing. MLVA typing may thus turn out to be the method of choice for Y. pestis, once more isolates have been typed, common genotype databases put together, and reference collections of markers selected. As one step in this direction, the data have been made accessible on our Web site http://bacterial-genotyping.igmors.u-psud.fr. This includes not only the full dataset, which can be recovered from Additional file 1, but also the possibility to run queries with new MLVA data. This may be of use for investigators lacking the specialized tools or expertise required to run MLVA clustering analyses. A very satisfying typing resolution (but not a robust phylogenetic analysis) can readily be achieved by PCR amplification of only 7 loci.
Bacterial strains and isolates
Most of the strains are from the collection maintained by the French ministry of defense at Centre d'Etudes du Bouchet (CEB), and others came from French medical military institutions. The Y. pestis strains were isolated mostly in Asia between 1964 and 1979, and some in Africa (Kenya, Congo), Kurdistan and Madagascar. Additional reference strains and DNA isolates of different biovars, as identified by the source laboratories, are from the Institute of Microbiology Federal Armed Forces, Munich (Germany), or were kindly provided by Prof. F. Allerberger (Vienna, Austria), Prof. H. Tschäpe (Wernigerode, Germany), Prof. H. Mollaret, Dr. E. Carniel (Paris, France) and Dr. A. Rakin (Munich, Germany) (Table 1). Thermolysates were prepared by heating a bacterial suspension in water for 30 min at 95°C.
Biovar assignment by molecular tests
The presence of an intact glycerol-3-phosphate dehydrogenase (glpD) gene was tested by PCR as described. Sixty-three isolates were typed by PCR for the presence of four IS100 insertions (vlm04 and vlm05, vlm06, vlm25, vlm28) as described by Motin et al. The single nucleotide change in napA resulting in nitrate reductase deficiency in strain KIM was used in a SNP typing assay. Primers napAFor (5' GCGCTAAAAGAGAAAGGCCCGA 3') and either SNPnapKIM (5' AGAGCACGAAGGCATCGGCTTA 3') or SNPnapCO92 (5' AGAGCACGAAGGCATCGGCTTC 3') were used in two PCR reactions at an annealing temperature of 58°C, producing a 230 bp amplicon in either reaction.
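The logic of the two-reaction SNP assay can be sketched in Python. The two allele-specific primer sequences are copied from the text above; comparing them shows where they differ. The interpretation rule in call_napa_allele (a product in only one reaction is read as that allele) is an assumption about how such an assay is usually scored, not a procedure stated in the article, and the function names are hypothetical.

```python
# Allele-specific primer sequences as given in the text (5' to 3').
SNP_NAP_KIM = "AGAGCACGAAGGCATCGGCTTA"
SNP_NAP_CO92 = "AGAGCACGAAGGCATCGGCTTC"


def mismatch_positions(primer_a, primer_b):
    """Return 0-based positions at which two equal-length primers differ."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have the same length")
    return [i for i, (a, b) in enumerate(zip(primer_a, primer_b)) if a != b]


def call_napa_allele(kim_amplified, co92_amplified):
    """Interpret the two reactions (assumed scoring rule, see lead-in).

    A product with only one allele-specific primer is read as that allele;
    any other outcome is reported as undetermined.
    """
    if kim_amplified and not co92_amplified:
        return "KIM-type"
    if co92_amplified and not kim_amplified:
        return "CO92-type"
    return "undetermined"


differences = mismatch_positions(SNP_NAP_KIM, SNP_NAP_CO92)
print(differences)  # positions of the discriminating base(s)
print(call_napa_allele(kim_amplified=True, co92_amplified=False))
```

The only difference between the two primers is their 3'-terminal base, which is the usual design for allele-specific PCR: extension is expected to be efficient only when that base matches the template.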
Minisatellite PCR amplification and genotyping
PCR reactions and analyses were performed as described using 25 polymorphic markers (Figure 1). Markers ypms04, ypms05, ypms07, ypms20, ypms45 and ypms62 correspond respectively to markers M58, M59, M37, M51, M42 and M34 in Klevytska et al. (Table 2).
Data management and analyses
Gel images were analyzed using the BioNumerics software package version 3.5 (Applied-Maths, Sint-Martens-Latem, Belgium) as previously described. The number of motifs in each allele was deduced from the amplicon size. The resulting data were analyzed with BioNumerics as a character dataset. Clustering analysis was done using the categorical parameter and the Ward coefficient. The minimum spanning tree was constructed with the following options: (a) in case of equivalent solutions in terms of calculated distances, the priority rule used was to select the tree with the highest number of branches connecting genotypes differing at only one locus ("Highest number of single locus variants" option); (b) the creation of hypothetical types (missing links) reducing the total length of the tree was allowed. Polymorphism indexes for each locus were calculated as 1 minus the sum of the squares of the frequency of each allele within the different genotypes identified. The data produced were made accessible from a Web page http://bacterial-genotyping.igmors.u-psud.fr as previously described, taking advantage of the BNServer application (Applied-Maths, Sint-Martens-Latem, Belgium).
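The two calculations mentioned above can be written out in a short Python sketch. The polymorphism index follows the definition given in the text (1 minus the sum of squared allele frequencies, with one allele entry per genotype). The conversion from amplicon size to repeat number is an assumed formula: it supposes that the amplicon consists of a fixed flanking length plus a whole number of repeat units. The function names, the flanking length and all numeric values are hypothetical.

```python
from collections import Counter


def polymorphism_index(alleles):
    """1 minus the sum of squared allele frequencies at one locus.

    `alleles` holds one allele (repeat number) per genotype.
    """
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def repeat_units(amplicon_bp, flanking_bp, repeat_bp):
    """Assumed conversion: (amplicon size - flanking sequence) / repeat size."""
    return (amplicon_bp - flanking_bp) / repeat_bp


# Hypothetical locus: four genotypes carrying alleles with 5, 5, 6 and 7 repeats.
index = polymorphism_index([5, 5, 6, 7])
print(round(index, 3))

# Hypothetical amplicon: 200 bp product, 80 bp of flanking sequence, 15 bp repeat.
print(repeat_units(200, 80, 15))
```

A locus with a single allele gives an index of 0, and the index approaches 1 as the number of equally frequent alleles grows, which is why the most polymorphic loci were retained for the reduced panel.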
Devignat R: Variétés de l'espèce Pasteurella pestis. Nouvelle hypothèse. Bull W H O. 1951, 4: 247-263.
Yersin A: La peste bubonique à Hong-Kong. Ann Inst Pasteur. 1894, 2: 428-430.
Achtman M, Zurth K, Morelli G, Torrea G, Guiyoule A, Carniel E: Yersinia pestis, the cause of plague, is a recently emerged clone of Yersinia pseudotuberculosis. Proc Natl Acad Sci U S A. 1999, 96: 14043-14048. 10.1073/pnas.96.24.14043.
Guiyoule A, Grimont F, Iteman I, Grimont PA, Lefevre M, Carniel E: Plague pandemics investigated by ribotyping of Yersinia pestis strains. J Clin Microbiol. 1994, 32: 634-641.
Motin VL, Georgescu AM, Elliott JM, Hu P, Worsham PL, Ott LL, Slezak TR, Sokhansanj BA, Regala WM, Brubaker RR, Garcia E: Genetic variability of Yersinia pestis isolates as predicted by PCR-based IS100 genotyping and analysis of structural genes encoding glycerol-3-phosphate dehydrogenase (glpD). J Bacteriol. 2002, 184: 1019-1027. 10.1128/jb.184.4.1019-1027.2002.
Anisimov AP, Lindler LE, Pier GB: Intraspecific Diversity of Yersinia pestis. Clin Microbiol Rev. 2004, 17: 434-464. 10.1128/CMR.17.2.434-464.2004.
Frothingham R, Meeker-O'Connell WA: Genetic diversity in the Mycobacterium tuberculosis complex based on variable numbers of tandem DNA repeats. Microbiology. 1998, 144: 1189-1196.
Le Flèche P, Hauck Y, Onteniente L, Prieur A, Denoeud F, Ramisse V, Sylvestre P, Benson G, Ramisse F, Vergnaud G: A tandem repeats database for bacterial genomes: application to the genotyping of Yersinia pestis and Bacillus anthracis. BMC Microbiol. 2001, 1: 2. 10.1186/1471-2180-1-2.
Klevytska AM, Price LB, Schupp JM, Worsham PL, Wong J, Keim P: Identification and characterization of variable-number tandem repeats in the Yersinia pestis genome. J Clin Microbiol. 2001, 39: 3179-3185. 10.1128/JCM.39.9.3179-3185.2001.
Le Flèche P, Fabre M, Denoeud F, Koeck JL, Vergnaud G: High resolution, on-line identification of strains from the Mycobacterium tuberculosis complex based on tandem repeat typing. BMC Microbiol. 2002, 2: 37. 10.1186/1471-2180-2-37.
Onteniente L, Brisse S, Tassios PT, Vergnaud G: Evaluation of the polymorphisms associated with tandem repeats for Pseudomonas aeruginosa strain typing. J Clin Microbiol. 2003, 41: 4991-4997. 10.1128/JCM.41.11.4991-4997.2003.
Pourcel C, Vidgop Y, Ramisse F, Vergnaud G, Tram C: Characterization of a Tandem Repeat Polymorphism in Legionella pneumophila and Its Use for Genotyping. J Clin Microbiol. 2003, 41: 1819-1826. 10.1128/JCM.41.5.1819-1826.2003.
Lindstedt BA, Heir E, Gjernes E, Kapperud G: DNA fingerprinting of Salmonella enterica subsp. enterica serovar typhimurium with emphasis on phage type DT104 based on variable number of tandem repeat loci. J Clin Microbiol. 2003, 41: 1469-1479. 10.1128/JCM.41.4.1469-1479.2003.
Lindstedt BA, Heir E, Gjernes E, Vardund T, Kapperud G: DNA fingerprinting of Shiga-toxin producing Escherichia coli O157 based on Multiple-Locus Variable-Number Tandem-Repeats Analysis (MLVA). Ann Clin Microbiol Antimicrob. 2003, 2: 12. 10.1186/1476-0711-2-12.
Denoeud F, Vergnaud G: Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains: a Web-based resource. BMC Bioinformatics. 2004, 5: 4. 10.1186/1471-2105-5-4.
Parkhill J, Wren BW, Thomson NR, Titball RW, Holden MT, Prentice MB, Sebaihia M, James KD, Churcher C, Mungall KL, Baker S, Basham D, Bentley SD, Brooks K, Cerdeno-Tarraga AM, Chillingworth T, Cronin A, Davies RM, Davis P, Dougan G, Feltwell T, Hamlin N, Holroyd S, Jagels K, Karlyshev AV, Leather S, Moule S, Oyston PC, Quail M, Rutherford K, Simmonds M, Skelton J, Stevens K, Whitehead S, Barrell BG: Genome sequence of Yersinia pestis, the causative agent of plague. Nature. 2001, 413: 523-527. 10.1038/35097083.
Deng W, Burland V, Plunkett G 3rd, Boutin A, Mayhew GF, Liss P, Perna NT, Rose DJ, Mau B, Zhou S, Schwartz DC, Fetherston JD, Lindler LE, Brubaker RR, Plano GV, Straley SC, McDonough KA, Nilles ML, Matson JS, Blattner FR, Perry RD: Genome sequence of Yersinia pestis KIM. J Bacteriol. 2002, 184: 4601-4611. 10.1128/JB.184.16.4601-4611.2002.
Rakin A, Heesemann J: The established Yersinia pestis biovars are characterized by typical patterns of I-CeuI restriction fragment length polymorphism. Mol Gen Mikrobiol Virusol. 1995, 26-29.
Feil EJ, Li BC, Aanensen DM, Hanage WP, Spratt BG: eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J Bacteriol. 2004, 186: 1518-1530. 10.1128/JB.186.5.1518-1530.2004.
Hinchliffe SJ, Isherwood KE, Stabler RA, Prentice MB, Rakin A, Nichols RA, Oyston PC, Hinds J, Titball RW, Wren BW: Application of DNA microarrays to study the evolutionary genomics of Yersinia pestis and Yersinia pseudotuberculosis. Genome Res. 2003, 13: 2018-2029. 10.1101/gr.1507303.
Adair DM, Worsham PL, Hill KK, Klevytska AM, Jackson PJ, Friedlander AM, Keim P: Diversity in a variable-number tandem repeat from Yersinia pestis. J Clin Microbiol. 2000, 38: 1516-1519.
Denoeud F, Vergnaud G, Benson G: Predicting human minisatellite polymorphism. Genome Res. 2003, 13: 856-867. 10.1101/gr.574403.
Radnedge L, Agron PG, Worsham PL, Andersen GL: Genome plasticity in Yersinia pestis. Microbiology. 2002, 148: 1687-1698.
Hernandez E, Girardet M, Ramisse F, Vidal D, Cavallo JD: Antibiotic susceptibilities of 94 isolates of Yersinia pestis to 24 antimicrobial agents. J Antimicrob Chemother. 2003, 52: 1029-1031. 10.1093/jac/dkg484.
Work on the typing and molecular epidemiology of dangerous pathogens is supported by the French and German ministries of defense and is part of a European defense project. CP is on leave from the Institut Pasteur, Paris, France. We thank Isabelle Rebillat for technical help and H. Tschäpe, F. Allerberger, H. Mollaret, E. Carniel and A. Rakin for their generous gifts of purified genomic DNA or strains. We thank France Denoeud for setting up the genotyping web site.
FAM did most of the typing work and CP and GV did the error checking analysis. CP performed the biovar assignment by molecular tests and the PCR-IS typing. FR was in charge of the CEB strain collection and prepared the DNA samples. HN provided DNA from a number of additional reference strains. GV initiated and managed the project, and was in charge of the Bionumerics database and clustering analyses. CP, HN, and GV wrote the report. All authors read and approved the final manuscript.