Evaluation and selection of tandem repeat loci for Streptococcus pneumoniae MLVA strain typing
BMC Microbiology volume 5, Article number: 66 (2005)
Precise identification of bacterial pathogens at the strain level is essential for epidemiological purposes. In Streptococcus pneumoniae, the existence of 90 different serotypes makes the typing particularly difficult and requires the use of highly informative tools. Available methods are relatively expensive and cannot be used for large-scale or routine typing of any new isolate. We explore here the potential of MLVA (Multiple Loci VNTR Analysis; VNTR, Variable Number of Tandem Repeats), a method of growing importance in the field of molecular epidemiology, for genotyping of Streptococcus pneumoniae.
Available genome sequences were searched for polymorphic tandem repeats. The loci identified were typed across a collection of 56 diverse isolates, including a group of serotype 1 isolates from Africa. A set of 16 VNTRs was eventually proposed for MLVA typing of S. pneumoniae. These robust markers were sufficient to discriminate 49 genotypes and to aggregate strains on the basis of serotype and geographical origin, although some exceptions were found. Such exceptions may reflect serotype switching or horizontal transfer of genetic material.
We describe a simple PCR-based MLVA genotyping scheme for S. pneumoniae which may prove to be a powerful complement to existing tools for epidemiological studies. Using this technique we uncovered a clonal population of strains, responsible for infections in Burkina Faso. We believe that the proposed MLVA typing scheme can become a standard for epidemiological studies of S. pneumoniae.
S. pneumoniae infections remain the major cause of pneumonia, meningitis and otitis in many countries, and a growing number of isolates appear to be resistant to penicillin. Purulent meningitis due to S. pneumoniae was recognized more than twenty years ago to be a serious problem in African countries and is known to occur in a seasonal pattern in sub-Saharan Africa. Serotype 1 is one of the most common pneumococcal serotypes associated with disease, although its prevalence varies among countries.
Apart from serotyping based on variations of the coat exopolysaccharides, different DNA-based methods exploit genetic polymorphism. Macrorestriction with Pulsed-Field Gel Electrophoresis analysis (PFGE) and Multilocus Sequence Typing (MLST) are the most frequently used genotyping techniques. An MLST typing system was described by Enright et al. together with an online identification page. The rep-PCR or BOX PCR assay was described in 1996 by van Belkum et al. The different techniques have been compared in several studies [7, 8]. Other methods use the sequencing of a PCR product, such as the galU gene, or the PCR restriction profile of the cpsA-cpsB genes.
Although some of these techniques have proven their capacity to discriminate efficiently among the multiple serotypes, the data are not always reproducible between laboratories, some techniques are not amenable to the building of international databases, and others are time-consuming and expensive. Polymorphic tandem repeat sequences, also called Variable Number of Tandem Repeats (VNTR), are an interesting class of genetic markers. Multiple alleles may be present at a single locus, and size differences are easily resolved by electrophoresis of PCR products. Tandem repeat typing has proved to be highly appropriate for the typing of pathogenic bacterial species [11, 12], including species with a very high genetic homogeneity such as the Mycobacterium tuberculosis complex, Bacillus anthracis, and Yersinia pestis [13–15].
The availability of genome sequence data from different S. pneumoniae strains greatly facilitates the search for polymorphic DNA sequences. In this report, we have evaluated the polymorphism of selected tandem repeats, and measured their discrimination power, across a diverse collection of strains.
Selection of VNTRs for MLVA typing
At the onset of this study, the genome sequences of two strains, R6 and TIGR4, were available. Comparison of these genome sequences using the approach described by Denoeud et al. [16, 17] identified 33 tandem repeats with a repeat unit equal to or larger than 12 bp and predicted to display size polymorphism. Preliminary sequence for two additional genomes (Sanger Spanish 23F-1 and TIGR 670-6B) subsequently became available and was used to select PCR primers that would match all four strains. To confirm that the selected markers were indeed polymorphic, a first set of eight isolates, comprising the reference strain R6 and RP28 to RP34 (Table 1 and Table 2), was analyzed (Fig. 1 and data not shown). Eighteen VNTRs were retained to investigate a larger collection; 12 of them, with a 45 bp repeat, belong to the BOX family of repeated elements (Table 3). Spneu19, a 60 bp repeat, encodes the choline-binding domain of pneumococcal protein A, encoded by pcpA. Spneu36, a 45 bp boxB repeat, is fused in strain R6 to the gene trzA, which encodes N-ethylammeline chlorohydrolase, an Atz/Trz family protein. The sequence diversity within the repeat units of the 18 VNTRs was calculated using the Tandem Repeats Finder software and is indicated as percent matches (Table 3).
Typing of the reference strain collection
A larger collection of 53 isolates from different origins and with a variety of serotypes was then genotyped (Table 1 and Table 2). As a control, strain R6 was systematically analysed with each set of 5 isolates. The primers listed in Table 3 were used essentially as previously described. The VNTRs were amplified very efficiently in most of the isolates. For a few isolates, no amplification was obtained with Spneu19 and Spneu36. The size variations of the amplicons corresponded, as expected, to an exact multiple of the repeat unit, except in a few cases. With marker Spneu38, an amplicon of intermediate size, coded "1.5", was observed with strain R6 (Figure 1), and with Spneu27, RP36 had an intermediate-size allele coded "0.5". With marker Spneu25, amplification of RP44 and RP32 (serotype 23F) produced a 1.5 kb amplicon (allele coded "19"). When examining this locus in the different sequenced genomes, we found that the Sanger 23F strain had an Insertion Sequence (IS) inserted in front of the Spneu25 tandem repeat, between the PCR primers. Similarly, for Spneu33, PCR amplification of two isolates, RP43 (serotype 19F) and RP18 (serotype 23F), produced a 2.6 kb amplicon (allele arbitrarily coded "20"), suggesting the presence of an IS element in the repeat. For Spneu38 and Spneu42, alleles coded "0.1" correspond to the absence of a VNTR unit although a PCR product is observed.
Spneu26 has a peculiar configuration in strain R6. In this strain 2 repeated elements are observed, a 49 bp repeat with low internal homogeneity inserted inside the usual 51 bp repeat.
The putative MLVA profiles of 4 fully or partially sequenced genomes were determined and used in the clustering analysis shown in Figure 2. The data for Spneu19 and Spneu36 were not used for this clustering because of the existence of null alleles in some isolates as mentioned. Combining the 52 isolates of this study (not including RP45 alias R6, represented by the sequenced genome) plus the 4 sequenced strains, 49 genotypes are observed. The global Hunter Gaston diversity index (HGDI) for the described MLVA assay is 0.995. The HGDI for each VNTR marker is shown in Table 3.
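The Hunter-Gaston index quoted above can be reproduced from a list of genotype (or single-locus allele) assignments. The following is a minimal sketch, assuming the standard formula D = 1 - [1/(N(N-1))] Σ n_j(n_j-1), where N is the number of strains and n_j the number of strains of type j. The function name and the example allele calls are hypothetical and are not data from this study.

```python
from collections import Counter


def hunter_gaston_index(types):
    """Return the Hunter-Gaston discriminatory index for a list of type labels.

    Each element of `types` is the type assigned to one strain: an allele
    (repeat count) for a single locus, or a tuple of alleles for a full
    MLVA profile.
    """
    n = len(types)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(types)
    # Number of ordered pairs of strains that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical repeat-unit counts at one locus for eight strains.
example_alleles = [7, 7, 5, 3, 3, 3, 9, 2]
print(round(hunter_gaston_index(example_alleles), 3))  # 0.857
```

Passing one tuple of allele values per strain gives the index for the whole marker panel rather than for a single VNTR.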
The remarkable similarity between independent isolates with the same serotype strongly suggests that these markers, with the exception of Spneu39, do not vary at a high frequency. This is supported by the fact that the sizes of the 18 VNTR alleles were the same in two independent R6 isolates and corresponded exactly to those of the sequenced genome.
To assess the reproducibility of the assay, a series of 10 isolates, RP46 to RP55, were genotyped in duplicate, in Orsay and Bordeaux, giving the same fingerprint. The allele size assignment was performed by eye at one site and using the BioNumerics tools at the other site.
Analysis of two small epidemic groups
Some of the isolates in this study correspond to an epidemic situation. They were isolated in the same geographic area in Burkina Faso during outbreaks of meningitis in 2002–2003 and 2004. In 2002–2003, 8 out of 9 isolates were of serotype 1 and one was of serotype 25F. The eight serotype 1 isolates from 2002–2003 have identical alleles at all markers except Spneu39. Interestingly, for this marker 6 different alleles are observed, all corresponding to a large number of repeats (8 to 16). In 2004, isolates of 4 different serotypes were obtained: three serotype 6, three serotype 1, three serotype 5 and two serotype 12 isolates. The serotype of RP04 and RP05 could not be determined. All these isolates cluster according to their serotype. The three serotype 1 isolates are identical except for marker Spneu37. Although clearly distinct, they are grouped with the serotype 1 strains isolated in 2002–2003 (shown in a box on Fig. 2).
A collection of 18 VNTR markers which can be used to genotype S. pneumoniae strains by simple PCR and agarose gel electrophoresis has been identified. Two of these markers, Spneu19 and Spneu36, belonging respectively to pcpA and trzA, were not kept in the clustering analysis because they were absent from some isolates, although they might be useful in specific situations. Spneu19 is not amplified in isolates of serotypes 3 and 6A, suggesting that they lack pcpA. It was proposed that the protein encoded by pcpA could be a surface protein involved in cell adhesion with specific proteins of the human extracellular matrix. PcpA is not essential for bacterial growth, at least under laboratory conditions, as the gene can be knocked out with no noticeable change in the pneumococcal phenotype. However, the polymorphism observed at the choline-binding domain might play a role during infection. The effect of Spneu36 polymorphism on trzA is not clear, as the 45 bp repeat is apparently not fused to this gene in the TIGR4 genome.
Among the 18 selected markers, 16 consist of 45 to 60 bp repeats that vary regularly in size and amplify very efficiently. The size polymorphism can be scored by eye. Two additional markers, with 12 bp and 14 bp repeats, are also described. In the present study, VNTRs with smaller repeat units, of the microsatellite category (1 to 8 bp long), were not investigated. Because of their sometimes relatively high mutation rate, they might improve MLVA resolution when investigating local outbreaks.
Most of the VNTRs correspond to the intergenic 45 bp boxB repeat. They belong to a family of elements, present at multiple loci in the S. pneumoniae genome, and composed of three subunits, boxA, boxB and boxC. Subunit boxB, 45 bp long, is the only one that can be tandemly repeated, and its copies show a high internal sequence similarity. These sequences have been suggested to be regulatory elements shared by coordinately regulated genes.
The polymorphism of these elements has been used to genotype strains in the BOX-PCR assay. However, this assay produces an image (a multi-band pattern) of the added polymorphism of multiple BOX elements, whereas the VNTR assay analyses each locus separately. As a result, the MLVA assay is more informative and reproducible, data interpretation is much easier, and genotyping databases can be easily produced.
We performed an MLVA analysis on a collection of isolates originating from Africa and France. Clustering was observed both on the basis of serotype and geographical origin, although there are some exceptions. Analysis of the sequenced genomes has shown the existence of many gene transfer events, which could explain why strains with the same serotype do not always cluster. Conversely, MLVA aggregates a number of strains of serotypes which are known to be close variants, such as serotypes 19F and 23F, and serotypes 14 and 9V.
All the African serotype 1 isolates are grouped (Figure 2). Interestingly, a cluster of 3 isolates, RP07, RP11 and RP02, isolated in 2004 and differing by only one marker, is linked to a second cluster of 9 isolates recovered in 2002–2003, from which they differ at 9 out of 16 markers. This is a large distance, suggesting that the existing population of strains able to cause outbreaks is very diverse. A second lineage contains serotype 12 and serotype 5 strains differing at 7 out of 16 markers.
Serotype 1 strains seem to have a propensity to cause meningitis in Burkina Faso, as two related clones of this serotype were found in 50% of the analyzed cases. A similar observation was made in Northern Ghana. In contrast to the majority of other serotypes, nasopharyngeal carriage of serotype 1 is only rarely observed. This could be linked to 1) a lesser genetic diversity as compared to other serotypes, due to limited exchanges with other streptococci during colonization, 2) a high invasive potential or 3) high attack rates, as supported by the observations of Leimkugel et al. For serotypes other than serotype 1, distribution across the dendrogram may be associated with nasopharyngeal carriage (e.g. 23F).
This preliminary investigation validates a first set of markers for MLVA investigation. The lethality linked to pneumococcal meningitis is high and an appropriate vaccination is necessary, requiring the identification of virulent lineages. The usefulness of the MLVA typing scheme proposed here can now be further determined by investigating a larger population of isolates from Africa which are currently being collected.
However, in a species with 90 different serotypes, additional studies will clearly be needed. In particular, it will be useful to see how MLVA compares with MLST. In contrast with MLST, the relatively low cost and moderate expertise required for MLVA typing would allow the systematic typing of any new isolate directly by clinical laboratories within hospitals. All markers proposed here are easy to type without sophisticated equipment or software, so that it should in principle be feasible to organize networks of clinical laboratories, each one taking charge of the typing of local isolates. To facilitate such projects, shared internet resources enabling the import and analysis of results could also be set up. Eventually, comparison of isolates on such a large scale will provide a precise measure of the stability of each marker, necessary for the optimized interpretation of MLVA typing data.
A total of 53 isolates were analyzed (Table 1 and Table 2). RP14 to RP24 and RP25 to RP35 were generous gifts from Hubert Chardon (CHPA, Aix en Provence, France) and Christine Grandpré (Hôpital d'Instruction des Armées, HIA Percy, France), respectively. These isolates were obtained from blood or respiratory specimens (i.e. sputum or bronchoalveolar fluid) from patients with severe pneumonia, septicemia or both. RP36 to RP44 are reference strains for serotypes 1, 6B, 9V, 12F, 14, 18C, 19A, 19F and 23F from the Statens Serum Institut (SSI, Denmark), provided to us by the "Centre National de Référence des Pneumocoques" (CNRP, France). RP1 to RP13 and RP46 to RP55 were isolated from the cerebrospinal fluid of patients with meningitis in Bobo-Dioulasso (Burkina Faso) by the "Association pour l'Aide à la Médecine Préventive" (AMP IPP). Strain R6 (ATCC BAA 255) was used as a control under the name RP45. The strain was obtained from two sources, one from the CNRP, originating from the SSI in Denmark, and the second from the Institut Pasteur collection (CIP 105880). DNA was purified using the InstaGene kit (Biorad, Marnes la Coquette, France). The minimum inhibitory concentrations (MICs) of antibiotics were determined by the E-test (AB Biodisk). Interpretive criteria for susceptibility or resistance were as recommended by the CASFM (Comité de l'Antibiogramme de la Société Française de Microbiologie) (last release, January 2005).
Identification of variable number tandem repeats by genomic sequence comparison
The methods previously described [13, 16, 17, 27] were used to identify tandem repeats with a predicted size which differs between the published genome of S. pneumoniae strain R6 , strain TIGR4  and the preliminary genome 23F (Sanger Institute) and 670-6B (TIGR) sequence data (obtained respectively from  and from ).
The different tandem repeat loci are designated using the nomenclature described previously. For instance, Spneu1579_45bp_507bp_7u (Spneu15) is a tandem repeat locus at position 1579 kb in the R6 genome. It has a 45 bp motif and a total PCR product length of 507 bp in the R6 strain when using the primer set indicated in Table 3. This allele size corresponds to 7 units. Its common laboratory name is Spneu15 (Table 3).
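The locus name carries enough information to convert a measured amplicon size into a number of repeat units. The sketch below parses a name of the form given above and derives the flanking length implied by the reference allele. The class, the function names and the rounding to the nearest half unit (suggested by the intermediate alleles coded "0.5" and "1.5" in this study) are assumptions for illustration. Arbitrary codes such as "0.1", "19" or "20" are not handled.

```python
import re
from typing import NamedTuple


class VntrLocus(NamedTuple):
    species: str            # e.g. "Spneu"
    position_kb: int        # position in the reference genome, in kb
    unit_bp: int            # repeat unit length
    ref_amplicon_bp: int    # PCR product length in the reference strain
    ref_units: float        # number of units in the reference strain

    @property
    def flank_bp(self) -> float:
        """Length of the amplicon outside the repeat array."""
        return self.ref_amplicon_bp - self.ref_units * self.unit_bp

    def units_from_size(self, amplicon_bp: float) -> float:
        """Convert an estimated band size to repeat units (nearest 0.5)."""
        raw = (amplicon_bp - self.flank_bp) / self.unit_bp
        return round(raw * 2) / 2


_NAME = re.compile(r"^([A-Za-z]+)(\d+)_(\d+)bp_(\d+)bp_(\d+(?:\.\d+)?)u$")


def parse_locus_name(name: str) -> VntrLocus:
    """Parse a locus name such as 'Spneu1579_45bp_507bp_7u'."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized locus name: {name!r}")
    species, pos, unit, size, units = match.groups()
    return VntrLocus(species, int(pos), int(unit), int(size), float(units))


locus = parse_locus_name("Spneu1579_45bp_507bp_7u")
print(locus.flank_bp)              # 192.0
print(locus.units_from_size(417))  # 5.0
```

For this locus, 7 units of 45 bp account for 315 bp of the 507 bp product, which leaves 192 bp of flanking sequence between the primers and the repeat array.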
PCR (Polymerase Chain Reaction) amplification and genotyping
PCR amplifications were performed in a total volume of 15 µl containing 10 ng of DNA, 1× PCR Reaction Buffer, 1 U of Taq DNA polymerase (Qbiogen, Illkirch, France), 200 µM of each deoxynucleotide triphosphate, and 0.3 µM of each flanking primer. The primers are listed in Table 3. Amplifications were performed in an MJ Research PTC200 thermocycler. An initial denaturation step at 94°C for 5 min was followed by 30 cycles of denaturation at 94°C for 30 s, primer annealing at 60°C for 30 s, and elongation at 72°C for 45 s. The final extension step was at 72°C for 7 min. Three microliters of amplification product were loaded on a 2% standard agarose gel and run until the bromophenol blue had reached the 20 cm position. Gels were stained with ethidium bromide, visualized under UV light, and photographed (Vilber-Lourmat, Marnes-la-Vallée, France). The size markers used were a 100-bp ladder (EZ Load 100 pb PCR Molecular Ruler, Biorad, Marnes la Coquette, France) or a 20-bp ladder (EZ Load 20 pb Molecular Ruler, Biorad, Marnes la Coquette, France), according to the tandem repeat unit length. Gel images were managed using the BioNumerics software package (version 4.0, Applied-Maths, Sint-Martens-Latem, Belgium).
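As a planning aid, the cycling program described above can be written down as data and its total hold time computed. This is only a sketch: the variable and function names are hypothetical, and thermocycler ramp times, which depend on the instrument, are ignored.

```python
# Cycling parameters transcribed from the protocol; ramp times are ignored.
INITIAL_DENATURATION_S = 5 * 60
CYCLES = 30
CYCLE_STEPS_S = {
    "denaturation_94C": 30,
    "annealing_60C": 30,
    "elongation_72C": 45,
}
FINAL_EXTENSION_S = 7 * 60


def total_hold_time_s(initial_s, cycles, cycle_steps_s, final_s):
    """Sum of all programmed hold times, in seconds."""
    return initial_s + cycles * sum(cycle_steps_s.values()) + final_s


total_s = total_hold_time_s(
    INITIAL_DENATURATION_S, CYCLES, CYCLE_STEPS_S, FINAL_EXTENSION_S
)
print(f"{total_s} s = {total_s / 60:.1f} min")  # 3870 s = 64.5 min
```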
Band size estimates were converted to number of units within a character dataset. The VNTR data deduced from the sequenced strains R6, TIGR4, 670-6B and Sanger Spanish 23F-1 were added to the MLVA analysis. Clustering analyses used the categorical coefficient and UPGMA (Unweighted Pair Group Method using Arithmetic averages). The use of the categorical parameter implies that the character states are considered unordered: the same weight is given to a large or a small difference in the number of repeats at a locus. The polymorphism indexes of individual or grouped VNTRs were calculated using the Hunter-Gaston discriminatory index (HGDI).
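The clustering step can be illustrated with a small pure-Python sketch. It treats the categorical coefficient as the fraction of loci at which two profiles carry different alleles, whatever the size of the difference, and applies a naive UPGMA in which the distance between two clusters is the mean of all pairwise distances between their members. The strain names and profiles are hypothetical, and this is not the BioNumerics implementation, whose tie handling and output may differ.

```python
from itertools import combinations


def categorical_distance(a, b):
    """Fraction of loci at which two MLVA profiles differ (unordered states)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(profiles):
    """Cluster named profiles; return (tree, merges).

    `profiles` maps a strain name to a tuple of alleles. The tree is a nested
    tuple of names, and `merges` lists each (subtree, average distance) in the
    order in which clusters were joined.
    """
    if not profiles:
        raise ValueError("no profiles given")
    dist = {
        frozenset((p, q)): categorical_distance(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }
    clusters = [(name, [name]) for name in profiles]  # (subtree, members)
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            mi, mj = clusters[i][1], clusters[j][1]
            avg = sum(dist[frozenset((p, q))] for p in mi for q in mj)
            avg /= len(mi) * len(mj)
            if best is None or avg < best[0]:
                best = (avg, i, j)
        avg, i, j = best
        tree = (clusters[i][0], clusters[j][0])
        members = clusters[i][1] + clusters[j][1]
        merges.append((tree, avg))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((tree, members))
    return clusters[0][0], merges


# Hypothetical four-locus profiles (repeat-unit counts), not study data.
example = {
    "A": (7, 3, 2, 5),
    "B": (7, 3, 2, 6),
    "C": (4, 1, 2, 9),
    "D": (4, 1, 8, 9),
}
tree, merges = upgma(example)
print(tree)           # (('A', 'B'), ('C', 'D'))
print(merges[-1][1])  # 0.875
```

Because the coefficient is categorical, profiles A and B are at distance 0.25 whether their fourth locus differs by one repeat or by ten.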
VNTR: Variable Number of Tandem Repeats
MLVA: Multiple Loci VNTR Analysis
HGDI: Hunter Gaston discriminatory index
Mar ID, Denis F, Cadoz M: [Epidemiologic features of pneumococcal meningitis in Africa. Clinical and serotypical aspects (author's transl)]. Pathol Biol (Paris). 1979, 27: 543-548.
Parent du Chatelet I, Traore Y, Gessner BD, Antignac A, Naccro B, Njanpop-Lafourcade BM, Ouedraogo MS, Tiendrebeogo SR, Varon E, Taha MK: Bacterial meningitis in Burkina Faso: surveillance using field-based polymerase chain reaction testing. Clin Infect Dis. 2005, 40: 17-25. 10.1086/426436.
Brueggemann AB, Spratt BG: Geographic distribution and clonal diversity of Streptococcus pneumoniae serotype 1 isolates. J Clin Microbiol. 2003, 41: 4966-4970. 10.1128/JCM.41.11.4966-4970.2003.
Hall LM, Whiley RA, Duke B, George RC, Efstratiou A: Genetic relatedness within and between serotypes of Streptococcus pneumoniae from the United Kingdom: analysis of multilocus enzyme electrophoresis, pulsed-field gel electrophoresis, and antimicrobial resistance patterns. J Clin Microbiol. 1996, 34: 853-859.
Enright MC, Spratt BG: A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology. 1998, 144 (Pt 11): 3049-3060.
van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW: Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996, 34: 1176-1179.
Hermans PW, Sluijter M, Hoogenboezem T, Heersma H, van Belkum A, de Groot R: Comparative study of five different DNA fingerprint techniques for molecular typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1995, 33: 1606-1612.
Dunne WMJ, Kehl KS, Holland-Staley CA, Brueggemann AB, Pfaller MA, Doern GV: Comparison of results generated by serotyping, pulse-field restriction analysis, ribotyping, and repetitive-sequence PCR used to characterize penicillin-resistant pneumococci from the United States. J Clin Microbiol. 2001, 39: 1791-1795. 10.1128/JCM.39.5.1791-1795.2001.
Mollerach M, Garcia E: The galU gene of Streptococcus pneumoniae that codes for a UDP-glucose pyrophosphorylase is highly polymorphic and suitable for molecular typing and phylogenetic studies. Gene. 2000, 260: 77-86. 10.1016/S0378-1119(00)00468-6.
Lawrence ER, Arias CA, Duke B, Beste D, Broughton K, Efstratiou A, George RC, Hall LMC: Evaluation of serotype prediction by cpsA-cpsB gene polymorphism in Streptococcus pneumoniae. J Clin Microbiol. 2000, 38: 1319-1323.
Lindstedt BA: Multiple-locus variable number tandem repeats analysis for genetic fingerprinting of pathogenic bacteria. Electrophoresis. 2005, 26: 2567-2582. 10.1002/elps.200500096.
Vergnaud G, Pourcel C: Multiple Locus VNTR (Variable Number of Tandem Repeat) Analysis. Molecular Identification, Systematics, and Population Structure of Prokaryotes. Edited by: Stackebrandt E. 2006, Berlin Heidelberg: Springer-Verlag, 83-104.
Le Flèche P, Hauck Y, Onteniente L, Prieur A, Denoeud F, Ramisse V, Sylvestre P, Benson G, Ramisse F, Vergnaud G: A tandem repeats database for bacterial genomes: application to the genotyping of Yersinia pestis and Bacillus anthracis. BMC Microbiol. 2001, 1: 2. 10.1186/1471-2180-1-2.
Le Flèche P, Fabre M, Denoeud F, Koeck JL, Vergnaud G: High resolution, on-line identification of strains from the Mycobacterium tuberculosis complex based on tandem repeat typing. BMC Microbiol. 2002, 2: 37. 10.1186/1471-2180-2-37.
Pourcel C, Andre-Mazeaud F, Neubauer H, Ramisse F, Vergnaud G: Tandem repeats analysis for the high resolution phylogenetic analysis of Yersinia pestis. BMC Microbiol. 2004, 4: 22. 10.1186/1471-2180-4-22.
Denoeud F, Vergnaud G: Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains: a Web-based resource. BMC Bioinformatics. 2004, 5: 4. 10.1186/1471-2105-5-4.
The Microorganisms Tandem Repeats Database. http://minisatellites.u-psud.fr
Martin B, Humbert O, Camara M, Guenzi E, Walker J, Mitchell T, Andrew P, Prudhomme M, Alloing G, Hakenbeck R: A highly conserved repeated DNA element located in the chromosome of Streptococcus pneumoniae. Nucleic Acids Res. 1992, 20: 3479-3483.
Sanchez-Beato AR, Lopez R, Garcia JL: Molecular characterization of PcpA: a novel choline-binding protein of Streptococcus pneumoniae. FEMS Microbiol Lett. 1998, 164: 207-214. 10.1016/S0378-1097(98)00206-7.
Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27: 573-580. 10.1093/nar/27.2.573.
The MLVA web service. http://bacterial-genotyping.igmors.u-psud.fr
Coffey TJ, Enright MC, Daniels M, Morona JK, Morona R, Hryniewicz W, Paton JC, Spratt BG: Recombinational exchanges at the capsular polysaccharide biosynthetic locus lead to frequent serotype changes among natural isolates of Streptococcus pneumoniae. Mol Microbiol. 1998, 27: 73-83. 10.1046/j.1365-2958.1998.00658.x.
McEllistrem MC, Noller AC, Visweswaran S, Adams JM, Harrison LH: Serotype 14 variants of the France 9V(-3) clone from Baltimore, Maryland, can be differentiated by the cpsB gene. J Clin Microbiol. 2004, 42: 250-256. 10.1128/JCM.42.1.250-256.2004.
Leimkugel J, Adams Forgor A, Gagneux S, Pfluger V, Flierl C, Awine E, Naegeli M, Dangy JP, Smith T, Hodgson A, Pluschke G: An Outbreak of Serotype 1 Streptococcus pneumoniae Meningitis in Northern Ghana with Features That Are Characteristic of Neisseria meningitidis Meningitis Epidemics. J Infect Dis. 2005, 192: 192-199. 10.1086/431151.
PubMLST – Multilocus sequence typing (MLST) databases and software. http://pubmlst.org
Société Française de Microbiologie. http://www.sfm.asso.fr
Vergnaud G, Denoeud F: Minisatellites: mutability and genome architecture. Genome Res. 2000, 10: 899-907. 10.1101/gr.10.7.899.
Hoskins J, Alborn WE, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, Estrem ST, Fritz L, Fu DJ, Fuller W: Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol. 2001, 183: 5709-5717. 10.1128/JB.183.19.5709-5717.2001.
Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, Heidelberg J, DeBoy RT, Haft DH, Dodson RJ: Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science. 2001, 293: 498-506. 10.1126/science.1061217.
The Sanger Institute. http://www.sanger.ac.uk/Projects/S_pneumoniae/
The Institute for Genomic Research. http://www.tigr.org/
Hunter PR, Gaston MA: Numerical index of the discriminatory ability of typing systems: an application of Simpson's index of diversity. J Clin Microbiol. 1988, 26: 2465-2466.
We thank Hubert Chardon (CHPA, Aix en Provence, France) and Christine Grandpré (Hôpital d'Instruction des Armées, HIA Percy, France) for providing isolates. Work on the typing and molecular epidemiology of dangerous pathogens is supported by the French ministry of defence. We thank Sanofi Pasteur and the French military health services (Service de Santé des Armées) for their support. Preliminary sequence data for S. pneumoniae strain 670 and strain Spanish 23F-1 was obtained respectively from The Institute for Genomic Research  and the Sanger Institute . Sequencing of S. pneumoniae strain 670 was accomplished with support from NIAID (National Institute of Allergy and Infectious Diseases).
SC and SV did most of the typing work and CP and GV did the error checking analysis. JLK was in charge of the definition of the study collection and preparation of the DNA samples. BMN, EV and KS collected and provided bacterial isolates. CP initiated and managed the project. GV was in charge of the BioNumerics database and clustering analyses. CP, JLK, and GV wrote the report. All authors read and approved the final manuscript.