Tandem repeats analysis for the high resolution phylogenetic analysis of Yersinia pestis
© Pourcel et al; licensee BioMed Central Ltd. 2004
Received: 27 January 2004
Accepted: 08 June 2004
Published: 08 June 2004
Yersinia pestis, the agent of plague, is a young and highly monomorphic species. Three biovars, each thought to be associated with one of the last three Y. pestis pandemics, have been defined based on biochemical assays. More recently, DNA-based assays, including DNA sequencing, IS typing and DNA arrays, have significantly improved current knowledge of the origin and phylogenetic evolution of Y. pestis. However, these methods suffer either from a lack of resolution or from the difficulty of comparing data. Variable number tandem repeats (VNTRs) provide valuable polymorphic markers for genotyping and performing phylogenetic analyses in a growing number of pathogens, and have given promising results for Y. pestis as well.
In this study we have genotyped 180 Y. pestis isolates by multiple locus VNTR analysis (MLVA) using 25 markers. Sixty-one different genotypes were observed. The three biovars were distributed into three main branches, with some exceptions. In particular, the Medievalis phenotype is clearly heterogeneous, resulting from different mutation events in the napA gene. Antiqua strains from Asia appear to hold a central position compared to Antiqua strains from Africa. A subset of 7 markers is proposed for the quick comparison of a new strain with the collection typed here. This can easily be achieved using a Web-based facility set up specifically for running such identifications.
Tandem-repeat typing may prove to be a powerful complement to the existing phylogenetic tools for Y. pestis. Typing can be achieved quickly at a low cost in terms of consumables, technical expertise and equipment. The resulting data can be easily compared between different laboratories. The number and selection of markers will eventually depend upon the type and aim of investigations.
Within the Y. pestis species, strains are separated into three biovars according to their ability to reduce nitrate and to ferment glycerol. Since Y. pestis was connected to plague by Yersin, strains of biovar Antiqua have generally been isolated from Asia and from East and Central Africa, Medievalis strains from Central Asia, and Orientalis strains worldwide. Y. pestis is thought to have evolved recently from Yersinia pseudotuberculosis, some 1,500 to 20,000 years ago. Based on biochemical assays, Devignat proposed that Antiqua strains, which probably caused the first known pandemic, represent the ancestor. Medievalis strains are suggested to be responsible for the second pandemic, whereas the third pandemic was associated exclusively with Orientalis strains. This overall scenario, although not formally established, is supported by the higher diversity observed in the Antiqua and Medievalis biovars as measured by IS typing, and by the geographic origin of strains [3–5]. "Pestoides" strains are particular Y. pestis isolates found recently in Russia, which infect unique species of rodents [5, 6].
Several molecular methods have been used for genotyping Y. pestis strains, mostly based on pulsed-field gel electrophoresis (PFGE), insertion sequence polymorphism and ribotyping. Recently, multiple locus variable-number-of-tandem-repeat (VNTR) analysis (MLVA) was shown to be a promising method for genotyping a number of pathogens [7–15]. When applicable, MLVA is of great interest because data produced in different laboratories can be easily exchanged and merged. This is especially relevant for pathogens such as Y. pestis, for which strains cannot be easily exchanged for security reasons. In this context, the availability of standardized typing tools that are easy to set up is important to facilitate research efforts on this pathogen. The complete genomic sequences of Y. pestis CO92, biovar Orientalis, and of strain KIM, biovar Medievalis, have been determined, facilitating the identification of tandem repeats and consequently the selection of primers for MLVA [8, 15].
Until now only small series of Y. pestis strains had been typed by MLVA [8, 9], and although the method seemed able to genotype reproducibly and accurately, a large-scale study seemed necessary. In the present work, a collection of 180 isolates of Y. pestis from different geographical origins, including various Y. pestis reference strains, was typed using the 25 VNTRs previously described.
List of the Yersinia pestis isolates used for genotyping
Number of isolates
CEB-O-1 to CEB-O-132
CEB-O-133 and O-134
M23, Java9, Java10, TS
195P, CEB-O-135 and CEB-O-136
EV26, EV76 (and 3 replicates), 6/69 (and 2 replicates)
CEB-O-137 to CEB-O-139
CEB-M-1 to CEB-M-5, PKH4, PKR25, KIM (and 5 replicates)
CEB-M-6 and CEB-M-7
IP537, A22, 129M22
Turkey 10/5, Turkey 10/3
Most Medievalis strains are also clustered into one major branch (genotypes 52 to 61) with the exception of the strain representing genotype 4. The Pestoides isolate from Georgia (genotype 1) is (weakly) grouped with the African Antiqua strains. Among Antiqua isolates, the KUMA and Yokohama isolates show the identical MLVA genotype 8, and possess a seemingly specific ms09 allele.
At least two independent types of Medievalis strains
The strain representing genotype 4 is nitrate-reductase negative, and for this reason has been phenotypically assigned to the biovar Medievalis. However, its position in Figures 2 and 3, very close to genotype 5 (Antiqua strains) and very distinct from the Medievalis cluster, together with its geographic origin in Kenya, prompted us to investigate the origin of its nitrate-reductase deficiency in more detail. The Medievalis strain KIM is nitrate-reductase deficient because of a single point mutation in the napA gene. We analyzed all Medievalis strains from genotypes 52 to 61, and they showed the same point mutation (data not shown). In contrast, the napA gene in the strain representing genotype 4 has been inactivated by a deletion, as shown by the absence of amplification of the gene despite the use of different primers from this locus, and by the absence of a hybridisation signal in a Southern blot experiment (data not shown).
A simple MLVA assay comprising 7 markers
Table 2. Main characteristics and behavior of the 25 tandem repeat loci (columns: repeat size (bp); number of alleles; allele size range (units); allele size range (bp))
In the present report, an MLVA typing assay comprising 25 markers has been applied to a collection of Y. pestis isolates of various origins, but with a strong bias towards Orientalis strains from South-East Asia. One hundred and eighty strains or isolates (Table 1) have been genotyped and 61 different MLVA genotypes were identified (see Additional file 1). Clustering analysis and minimum spanning tree analysis suggest relations between the different strains and biovars which are in excellent agreement with current knowledge. In spite of the very limited number of Antiqua and Medievalis strains which could be investigated here, the data obtained suggest the existence of two groups of Antiqua strains. The first group, from Russia and Asia, is represented by genotypes 6 and 7 (from Russia) and 8 (KUMA, Yokohama), and holds an intermediate position between the Medievalis and the Orientalis groups. The second group comprises the African Antiqua strains. IS100 typing distinguishes the KUMA and Yokohama isolates, which were typed as A2 and A1b, respectively. They are very similar in DNA microarray studies.
Medievalis and Orientalis strains derive from Antiqua strains by the loss of metabolic functions: the capacity to reduce nitrate and the capacity to metabolize glycerol, respectively. Whereas all Orientalis strains investigated so far (including those in this report) are derived from a single ancestor carrying a simple deletion in the glpD gene, we report here that the Medievalis phenotype can be associated with at least two independent mutation events in the napA gene. This underlines the fact that the initial biochemical tests should be complemented, or replaced, by direct molecular analyses of the glpD and napA genes.
It is tempting to speculate about possible scenarios suggested by Figure 3 and current knowledge. Pestoides strains originate from Central Asia (reviewed in ), and are proposed to be an outgroup of the Y. pestis group. Their genetic composition is relatively distinct from Y. pestis, in terms of plasmids and chromosome, as assayed by DNA array analysis for instance. In Figure 3, the Pestoides strain studied is very distantly related to a hypothetical missing link close to the center of the figure. Hypothetical missing links (open circles) are created by the Minimum Spanning Tree software if they result in a reduction of the total tree length. Many missing links are suggested by the software in this area. They enable the connection of the three branches made of the African Antiqua group, the Medievalis group, and the Orientalis group. The strains closest to this central position are the Antiqua strains from Asia. This, together with the position of the Pestoides strain, suggests that all three biovars may originate from Asia. Many more strains from these regions will be needed to test this hypothesis. For instance, strain Nicholinsk51 was shown by  to share some features of Orientalis strains while being an Antiqua strain from Asia. The authors favored the hypothesis that this strain was a revertant from the Orientalis phenotype. We would rather predict that Nicholinsk51 will be placed by MLVA analysis between the central group of Figure 3 and the Orientalis group.
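The effect of a hypothetical missing link on total tree length can be illustrated with a small sketch. It assumes that the distance between two genotypes is the number of loci at which they differ (a categorical distance), and it uses three invented three-locus genotypes; neither the genotypes nor the function names come from the study, and the commercial software's own algorithm is not reproduced here.

```python
def categorical_distance(a, b):
    """Number of loci at which two genotypes carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


def mst_length(genotypes):
    """Total length of a minimum spanning tree, built with Prim's algorithm."""
    if not genotypes:
        return 0
    in_tree = {0}  # start from the first genotype
    total = 0
    while len(in_tree) < len(genotypes):
        # Pick the shortest edge joining the tree to a node outside it.
        dist, node = min(
            (categorical_distance(genotypes[i], genotypes[j]), j)
            for i in in_tree
            for j in range(len(genotypes))
            if j not in in_tree
        )
        in_tree.add(node)
        total += dist
    return total


# Invented genotypes: each pair differs at two loci.
observed = [(2, 1, 1), (1, 2, 1), (1, 1, 2)]
# Invented hypothetical type that differs from each observed genotype at one locus.
hypothetical = (1, 1, 1)

print(mst_length(observed))                   # 4
print(mst_length(observed + [hypothetical]))  # 3
```

In this toy case the tree connecting the three observed genotypes has length 4, while adding the unobserved central type shortens it to 3, which is the condition under which a missing link would be introduced.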
In total, in our study and that of Klevytska et al., 61 VNTRs were characterized, and still more exist in the Y. pestis genome that can be tested for the selection of an optimal set for MLVA if necessary. Different questions can be addressed by different sets of tandem repeat loci. Global phylogenetic investigations will best be done using loci with a low or moderate mutation rate. Forensic or local outbreak investigations may use loci with a higher mutation rate, particularly simple sequence repeats such as the tetranucleotide described by Adair et al. The mutation rate of a tandem repeat locus and its phylogenetic value are very poorly predicted from sequence analysis [15, 22], so they have to be measured experimentally by typing collections of isolates, as done here (Table 2 and Additional file 1).
Once set up, MLVA is a very powerful and reproducible genotyping method, and it is hoped that this simple molecular tool will help unravel the molecular phylogeny of Yersinia pestis when applied to a larger number of isolates. In comparison, MLST analysis proved to be almost uninformative within Y. pestis. DNA array analysis [20, 23] shows a very low resolution, with only a few different genotypes identified so far. IS typing by Southern blotting has a very high discriminatory power, but the resulting data are not easily comparable between different laboratories. PCR-IS typing developed by Motin et al. provides exchangeable data but with a much reduced resolution compared to classical IS typing. MLVA typing may thus turn out to be the method of choice for Y. pestis, once more isolates have been typed, common genotype databases put together, and reference collections of markers selected. As one step in this direction, the data have been made accessible on our Web site http://bacterial-genotyping.igmors.u-psud.fr. This includes not only the full dataset, which can be recovered from Additional file 1, but also the possibility to run queries with new MLVA data. This may be of use for investigators lacking the specialized tools or expertise required to run MLVA clustering analyses. A very satisfying typing resolution (but not a robust phylogenetic analysis) can readily be achieved by PCR amplification of only 7 loci.
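Comparing a newly typed strain with a reference collection on a reduced marker set can be sketched as a nearest-genotype lookup. The example below is only an illustration: the locus names, genotype names and allele values are placeholders invented for the sketch, the distance is assumed to be the number of differing loci, and nothing here describes how the Web site actually implements its queries.

```python
# Placeholder names for a 7-locus panel (not the markers selected in the study).
LOCI = ["locus_1", "locus_2", "locus_3", "locus_4", "locus_5", "locus_6", "locus_7"]


def categorical_distance(a, b, loci):
    """Number of loci in `loci` at which two profiles differ."""
    return sum(a[locus] != b[locus] for locus in loci)


def closest_genotypes(query, collection, loci):
    """Return the smallest distance and the names of all genotypes at that distance."""
    scored = {
        name: categorical_distance(query, profile, loci)
        for name, profile in collection.items()
    }
    best = min(scored.values())
    return best, sorted(name for name, dist in scored.items() if dist == best)


# Invented reference profiles: repeat-unit counts per locus.
collection = {
    "genotype_A": dict(zip(LOCI, (3, 5, 2, 7, 4, 6, 1))),
    "genotype_B": dict(zip(LOCI, (3, 5, 2, 8, 4, 6, 1))),
    "genotype_C": dict(zip(LOCI, (4, 6, 2, 8, 5, 6, 2))),
}
# Invented profile of a newly typed strain.
query = dict(zip(LOCI, (3, 5, 2, 8, 4, 6, 2)))

distance, matches = closest_genotypes(query, collection, LOCI)
print(distance, matches)  # 1 ['genotype_B']
```

Returning every genotype tied at the minimum distance, rather than a single best hit, keeps the ambiguity visible when a small panel cannot separate closely related genotypes.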
Bacterial strains and isolates
Most of the strains are from the collection maintained by the French ministry of defense at Centre d'Etudes du Bouchet (CEB), and others came from French medical military institutions. The Y. pestis strains were isolated mostly in Asia between 1964 and 1979, and some in Africa (Kenya, Congo), Kurdistan and Madagascar. Additional reference strains and DNA isolates of different biovars, as identified by the source laboratories, are from the Institute of Microbiology, Federal Armed Forces, Munich (Germany), or were kindly provided by Prof. F. Allerberger (Vienna, Austria), Prof. H. Tschäpe (Wernigerode, Germany), Prof. H. Mollaret and Dr. E. Carniel (Paris, France) and Dr. A. Rakin (Munich, Germany) (Table 1). Thermolysates were prepared by heating a bacterial suspension in water for 30 min at 95°C.
Biovar assignment by molecular tests
The presence of an intact glycerol-3-phosphate dehydrogenase (glpD) gene was tested by PCR as described. Sixty-three isolates were typed by PCR for the presence of four IS100 insertions (vlm04 and vlm05, vlm06, vlm25, vlm28) as described by Motin et al. The single nucleotide change in napA resulting in nitrate reductase deficiency in strain KIM was used in a SNP typing assay. Primers napAFor (5' GCGCTAAAAGAGAAAGGCCCGA 3') and either SNPnapKIM (5' AGAGCACGAAGGCATCGGCTTA 3') or SNPnapCO92 (5' AGAGCACGAAGGCATCGGCTTC 3') were used in two PCR reactions at an annealing temperature of 58°C, producing a 230 bp amplicon in either reaction.
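The two allele-specific primers listed above can be compared directly to see where they differ. The short sketch below does this, and adds one possible way of turning the outcome of the two reactions into an allele call; the calling rule and its labels are assumptions made for illustration, not a description of how the results were read in the study.

```python
# Primer sequences as given in the text (5' to 3').
SNP_NAP_KIM = "AGAGCACGAAGGCATCGGCTTA"
SNP_NAP_CO92 = "AGAGCACGAAGGCATCGGCTTC"


def differing_positions(primer_a, primer_b):
    """Zero-based positions at which two equal-length primers differ."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have the same length")
    return [i for i, (a, b) in enumerate(zip(primer_a, primer_b)) if a != b]


def call_napa_allele(kim_product, co92_product):
    """Assumed rule: the allele is the one whose specific reaction gives a product."""
    if kim_product and not co92_product:
        return "KIM-type"
    if co92_product and not kim_product:
        return "CO92-type"
    if not kim_product and not co92_product:
        return "no amplification"
    return "ambiguous"


positions = differing_positions(SNP_NAP_KIM, SNP_NAP_CO92)
print(positions)                       # [21]
print(len(SNP_NAP_KIM) - 1)            # 21, i.e. the 3' terminal base
print(call_napa_allele(True, False))   # KIM-type
```

The two primers are identical except for their 3' terminal base, which is the usual design for allele-specific amplification. Under the assumed rule, a strain giving no product in either reaction would be reported as "no amplification" rather than assigned an allele.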
Minisatellite PCR amplification and genotyping
PCR reactions and analyses were performed as described using 25 polymorphic markers (Figure 1). Markers ypms04, ypms05, ypms07, ypms20, ypms45 and ypms62 correspond respectively to markers M58, M59, M37, M51, M42 and M34 in Klevytska et al. (Table 2).
Data management and analyses
Gel images were analyzed using the BioNumerics software package version 3.5 (Applied-Maths, Sint-Martens-Latem, Belgium) as previously described. The number of motifs in each allele was deduced from the amplicon size. The resulting data were analyzed with BioNumerics as a character dataset. Clustering analysis was done using the categorical parameter and the Ward coefficient. The minimum spanning tree was constructed with the following options: (a) in case of equivalent solutions in terms of calculated distances, the priority rule used was to select the tree with the highest number of branches connecting genotypes differing at only one locus ("Highest number of single locus variants" option); (b) the creation of hypothetical types (missing links) reducing the total length of the tree was allowed. Polymorphism indexes for each locus were calculated as 1 minus the sum of the squares of the frequency of each allele within the different genotypes identified. The data produced were made accessible from a Web page http://bacterial-genotyping.igmors.u-psud.fr as previously described, taking advantage of the BNServer application (Applied-Maths, Sint-Martens-Latem, Belgium).
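The two calculations described above, deducing the number of repeat units from an amplicon size and computing a polymorphism index as 1 minus the sum of squared allele frequencies, can be sketched as follows. The flanking-sequence length, the repeat size and the allele list are invented values, and the function names are hypothetical; the sketch does not reproduce the software's own implementation.

```python
from collections import Counter


def repeat_units(amplicon_bp, flanking_bp, repeat_bp):
    """Estimate the number of repeat units from an amplicon size.

    flanking_bp is the total length of non-repeat sequence in the amplicon
    (primers plus flanking DNA); it is locus-specific and assumed known.
    """
    return round((amplicon_bp - flanking_bp) / repeat_bp)


def polymorphism_index(alleles):
    """1 minus the sum of squared allele frequencies.

    `alleles` holds one allele per genotype, so that frequencies are taken
    within the set of distinct genotypes rather than over all isolates.
    """
    counts = Counter(alleles)
    n = len(alleles)
    return 1 - sum((count / n) ** 2 for count in counts.values())


# Invented example: a 300 bp amplicon with 120 bp of flanking sequence
# and an 18 bp repeat unit.
print(repeat_units(300, 120, 18))  # 10

# Invented alleles at one locus across four genotypes.
print(polymorphism_index([5, 5, 6, 7]))  # 0.625
```

With a single allele the index is 0, and it approaches 1 as alleles become more numerous and more evenly distributed among genotypes.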
Work on the typing and molecular epidemiology of dangerous pathogens is supported by the French and German ministries of defense and is part of a European defense project. CP is on leave from the Institut Pasteur, Paris, France. We thank Isabelle Rebillat for technical help and H. Tschäpe, F. Allerberger, H. Mollaret, E. Carniel and A. Rakin for their generous gifts of purified genomic DNA or strains. We thank France Denoeud for setting up the genotyping web site.
- Devignat R: Variétés de l'espèce Pasteurella pestis. Nouvelle hypothèse. Bull W H O. 1951, 4: 247-263.
- Yersin A: La peste bubonique à Hong-Kong. Ann Inst Pasteur. 1894, 2: 428-430.
- Achtman M, Zurth K, Morelli G, Torrea G, Guiyoule A, Carniel E: Yersinia pestis, the cause of plague, is a recently emerged clone of Yersinia pseudotuberculosis. Proc Natl Acad Sci U S A. 1999, 96: 14043-14048. 10.1073/pnas.96.24.14043.
- Guiyoule A, Grimont F, Iteman I, Grimont PA, Lefevre M, Carniel E: Plague pandemics investigated by ribotyping of Yersinia pestis strains. J Clin Microbiol. 1994, 32: 634-641.
- Motin VL, Georgescu AM, Elliott JM, Hu P, Worsham PL, Ott LL, Slezak TR, Sokhansanj BA, Regala WM, Brubaker RR, Garcia E: Genetic variability of Yersinia pestis isolates as predicted by PCR-based IS100 genotyping and analysis of structural genes encoding glycerol-3-phosphate dehydrogenase (glpD). J Bacteriol. 2002, 184: 1019-1027. 10.1128/jb.184.4.1019-1027.2002.
- Anisimov AP, Lindler LE, Pier GB: Intraspecific Diversity of Yersinia pestis. Clin Microbiol Rev. 2004, 17: 434-464. 10.1128/CMR.17.2.434-464.2004.
- Frothingham R, Meeker-O'Connell WA: Genetic diversity in the Mycobacterium tuberculosis complex based on variable numbers of tandem DNA repeats. Microbiology. 1998, 144: 1189-1196.
- Le Flèche P, Hauck Y, Onteniente L, Prieur A, Denoeud F, Ramisse V, Sylvestre P, Benson G, Ramisse F, Vergnaud G: A tandem repeats database for bacterial genomes: application to the genotyping of Yersinia pestis and Bacillus anthracis. BMC Microbiol. 2001, 1: 2. 10.1186/1471-2180-1-2.
- Klevytska AM, Price LB, Schupp JM, Worsham PL, Wong J, Keim P: Identification and characterization of variable-number tandem repeats in the Yersinia pestis genome. J Clin Microbiol. 2001, 39: 3179-3185. 10.1128/JCM.39.9.3179-3185.2001.
- Le Flèche P, Fabre M, Denoeud F, Koeck JL, Vergnaud G: High resolution, on-line identification of strains from the Mycobacterium tuberculosis complex based on tandem repeat typing. BMC Microbiol. 2002, 2: 37. 10.1186/1471-2180-2-37.
- Onteniente L, Brisse S, Tassios PT, Vergnaud G: Evaluation of the polymorphisms associated with tandem repeats for Pseudomonas aeruginosa strain typing. J Clin Microbiol. 2003, 41: 4991-4997. 10.1128/JCM.41.11.4991-4997.2003.
- Pourcel C, Vidgop Y, Ramisse F, Vergnaud G, Tram C: Characterization of a Tandem Repeat Polymorphism in Legionella pneumophila and Its Use for Genotyping. J Clin Microbiol. 2003, 41: 1819-1826. 10.1128/JCM.41.5.1819-1826.2003.
- Lindstedt BA, Heir E, Gjernes E, Kapperud G: DNA fingerprinting of Salmonella enterica subsp. enterica serovar typhimurium with emphasis on phage type DT104 based on variable number of tandem repeat loci. J Clin Microbiol. 2003, 41: 1469-1479. 10.1128/JCM.41.4.1469-1479.2003.
- Lindstedt BA, Heir E, Gjernes E, Vardund T, Kapperud G: DNA fingerprinting of Shiga-toxin producing Escherichia coli O157 based on Multiple-Locus Variable-Number Tandem-Repeats Analysis (MLVA). Ann Clin Microbiol Antimicrob. 2003, 2: 12. 10.1186/1476-0711-2-12.
- Denoeud F, Vergnaud G: Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains: a Web-based resource. BMC Bioinformatics. 2004, 5: 4. 10.1186/1471-2105-5-4.
- Parkhill J, Wren BW, Thomson NR, Titball RW, Holden MT, Prentice MB, Sebaihia M, James KD, Churcher C, Mungall KL, Baker S, Basham D, Bentley SD, Brooks K, Cerdeno-Tarraga AM, Chillingworth T, Cronin A, Davies RM, Davis P, Dougan G, Feltwell T, Hamlin N, Holroyd S, Jagels K, Karlyshev AV, Leather S, Moule S, Oyston PC, Quail M, Rutherford K, Simmonds M, Skelton J, Stevens K, Whitehead S, Barrell BG: Genome sequence of Yersinia pestis, the causative agent of plague. Nature. 2001, 413: 523-527. 10.1038/35097083.
- Deng W, Burland V, Plunkett G 3rd, Boutin A, Mayhew GF, Liss P, Perna NT, Rose DJ, Mau B, Zhou S, Schwartz DC, Fetherston JD, Lindler LE, Brubaker RR, Plano GV, Straley SC, McDonough KA, Nilles ML, Matson JS, Blattner FR, Perry RD: Genome sequence of Yersinia pestis KIM. J Bacteriol. 2002, 184: 4601-4611. 10.1128/JB.184.16.4601-4611.2002.
- Rakin A, Heesemann J: The established Yersinia pestis biovars are characterized by typical patterns of I-CeuI restriction fragment length polymorphism. Mol Gen Mikrobiol Virusol. 1995, 26-29.
- Feil EJ, Li BC, Aanensen DM, Hanage WP, Spratt BG: eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J Bacteriol. 2004, 186: 1518-1530. 10.1128/JB.186.5.1518-1530.2004.
- Hinchliffe SJ, Isherwood KE, Stabler RA, Prentice MB, Rakin A, Nichols RA, Oyston PC, Hinds J, Titball RW, Wren BW: Application of DNA microarrays to study the evolutionary genomics of Yersinia pestis and Yersinia pseudotuberculosis. Genome Res. 2003, 13: 2018-2029. 10.1101/gr.1507303.
- Adair DM, Worsham PL, Hill KK, Klevytska AM, Jackson PJ, Friedlander AM, Keim P: Diversity in a variable-number tandem repeat from Yersinia pestis. J Clin Microbiol. 2000, 38: 1516-1519.
- Denoeud F, Vergnaud G, Benson G: Predicting human minisatellite polymorphism. Genome Res. 2003, 13: 856-867. 10.1101/gr.574403.
- Radnedge L, Agron PG, Worsham PL, Andersen GL: Genome plasticity in Yersinia pestis. Microbiology. 2002, 148: 1687-1698.
- Hernandez E, Girardet M, Ramisse F, Vidal D, Cavallo JD: Antibiotic susceptibilities of 94 isolates of Yersinia pestis to 24 antimicrobial agents. J Antimicrob Chemother. 2003, 52: 1029-1031. 10.1093/jac/dkg484.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.