Multiple-locus variable-number tandem-repeat analysis of Streptococcus pneumoniae and comparison with multiple loci sequence typing
BMC Microbiology volume 12, Article number: 241 (2012)
Streptococcus pneumoniae infections remain a major cause of morbidity and mortality worldwide. The diversity of pneumococci was first evidenced by serotyping of their capsular polysaccharides, which are responsible for virulence, resolving into more than 93 serotypes. Molecular tools have been developed to track the emergence and spread of resistant, hypervirulent or non-vaccine type clones, particularly DNA-based methods using genetic polymorphism. Pulsed-Field Gel Electrophoresis analysis (PFGE) and Multiple Loci Sequence Typing (MLST) are the most frequently used genotyping techniques for S. pneumoniae. MLST is based on sequence comparison of housekeeping genes and clusters isolates into sequence types. The availability of genome sequence data from different S. pneumoniae strains facilitated the search for other classes of genetic markers, such as the polymorphic DNA sequences used in Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA). This study aims at confirming the relevance of MLVA of S. pneumoniae, comparing MLST and MLVA performances when discriminating subgroups of strains belonging to the same Sequence Type (ST), and defining a restricted but universal set of MLVA markers that has at least the same discriminatory power as MLST for S. pneumoniae by applying marker sets used by different authors to 331 isolates selected in the UK.
A minimum spanning tree was built that included the serotype distribution and compared MLVA and MLST results. A total of 220 MLVA types were identified among isolates belonging to 10 Sequence Types (STs). MLVA differentiated ST162 into two clonal complexes. A minimal set of seven markers was defined: the two universal markers ms25 and ms37, together with ms17, ms19, ms33, ms39 and ms40. The selection was based on MLVA markers with a Diversity Index >0.8 and a selection of others depending on the population tested and the aim of the study. This set of 7 MLVA markers yields strain clusters similar to those obtained by MLST.
MLVA can discriminate relevant subgroups among strains belonging to the same ST. MLVA also makes it possible to deduce the ST from the MLVA type. It can be used to investigate local outbreaks or to track the worldwide spread of clones and the emergence of variants.
Streptococcus pneumoniae infections remain a major cause of morbidity and mortality worldwide, causing diseases which range in severity from otitis media and sinusitis to pneumonia, septicaemia and meningitis. S. pneumoniae is a commensal of the human nasopharynx.
The diversity of pneumococci was first evidenced by serotyping of their capsular polysaccharides, resolving into more than 93 serotypes [3, 4]. However, only 16 serotypes cause approximately 90% of invasive disease worldwide [1, 5]. Owing to the natural transformability of the pneumococcus, horizontal recombination allows one serotype to belong to different genotypes, and a single genotype to express different capsule genes, i.e. different serotypes. This phenomenon is known as capsular switching [6, 7]. Capsular serotype may be more important than genotype in the ability of pneumococci to cause invasive disease, but other investigations underline the importance of genotype as well [9–13].
Molecular tools, particularly DNA-based methods using genetic polymorphism, have been developed to track the emergence and spread of resistant or hypervirulent clones, and shifts in serotype distribution detected for both non-invasive and invasive disease, reported before or since the use of the heptavalent protein-polysaccharide pneumococcal conjugate vaccine (PCV7) in different countries [14, 15]. Among them, Pulsed-Field Gel Electrophoresis analysis (PFGE) [16, 17] and Multiple Loci Sequence Typing (MLST) are the most frequently used genotyping methods for S. pneumoniae. PFGE is based on restriction enzyme pattern analysis; MLST is a sequence-based method targeting 7 housekeeping genes. An S. pneumoniae-specific MLST scheme targeting aroE, gdh, gki, recP, spi, xpt, and ddl was developed together with an online identification page at http://www.mlst.net. PFGE and MLST have been extensively compared [15, 17, 19] and both have proven their capacity to discriminate efficiently among genotypes. However, PFGE lacks, to some extent, inter-laboratory reproducibility, and MLST is expensive and thus may not be affordable for large-scale studies. The availability of genome data greatly facilitated the search for polymorphic DNA sequences. Among them, polymorphic tandem repeat sequences, also called Variable Number of Tandem Repeats (VNTR), are an interesting class of genetic markers: multiple alleles may be present at a single locus, and size differences are easily resolved by electrophoresis of PCR products. VNTR analysis has proved to be highly relevant for the typing of pathogenic bacterial species (Haemophilus influenzae; Bacillus anthracis; Yersinia pestis). An S. pneumoniae Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA) scheme was developed with a dedicated web-based database at http://www.mlva.eu. It targets 17 distinct loci and was used initially to characterise pneumococcal isolates from Burkina Faso.
Although the discriminatory power of MLVA has been demonstrated, the large number of loci included in the scheme may limit its use in large-scale studies (cost, timeframe, large number of samples).
This study aims at confirming the relevance of MLVA of S. pneumoniae, comparing MLST and MLVA performances when discriminating subgroups of strains belonging to the same Sequence Type (ST), and defining a restricted but universal set of MLVA markers that has at least the same discriminatory power as MLST by comparing the population genetic structure of S. pneumoniae using different published sets of markers [15, 19, 23, 25, 26].
A total of 331 invasive isolates of Streptococcus pneumoniae from the Health Protection Agency (HPA) collection, London, UK, collected during the period 2002–2006, were selected among the 10 major MLST sequence types (STs) circulating in England and Wales (see  and  for detailed MLST methodology), with approximately 30 isolates per ST. The selection included serotypes commonly associated with these STs and all possible serotype variants (Table 1) identified in the HPA collection. Isolates were serotyped by slide agglutination against the full antisera panel from the Danish Statens Serum Institute (Denmark) as part of the Systemic and Respiratory Infection Laboratory (HPA, London) reference service. The isolates were collected from blood (314), cerebrospinal fluid (13), pleural fluid (2), abscess (1), and bronchial aspirate (1).
MLVA was performed as previously described. The first 17 VNTRs (Spneu15 to Spneu41) were used. The last locus (Spneu42) could not be amplified from the isolates or the reference strains and was therefore excluded from this study. For convenience, the nomenclature “Spneu”, meaning Streptococcus pneumoniae, was replaced by “ms”, meaning minisatellite, in this paper.
The genetic diversity was measured by the Hunter-Gaston Diversity Index (DI) on http://www.hpa-bioinformatics.org.uk/cgi-bin/DICI/DICI.pl. A high DI with a narrow confidence interval (CI) indicates accurate measurement of a highly variable locus. Such loci may be sufficiently variable to be used as an indicator to discriminate between samples or as a starting point for assay development.
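As an illustration of the index used above, the following is a minimal Python sketch of the Hunter-Gaston Diversity Index for a single locus, DI = 1 − Σ n_j(n_j − 1) / (N(N − 1)), where N is the number of isolates and n_j the number of isolates carrying allele j. The function name and the example repeat counts are hypothetical and are not taken from the study; the sketch does not compute the confidence interval reported by the online tool.

```python
from collections import Counter


def hunter_gaston_di(alleles):
    """Hunter-Gaston diversity index for one locus.

    `alleles` holds one allele (e.g. a repeat count) per isolate.
    Returns the probability that two isolates drawn at random
    without replacement carry different alleles.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(alleles)
    # Number of ordered pairs of isolates sharing the same allele
    same_allele_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_allele_pairs / (n * (n - 1))


# Hypothetical repeat counts at one VNTR locus for six isolates
example_locus = [3, 3, 4, 5, 5, 5]
print(round(hunter_gaston_di(example_locus), 3))
```

A locus at which every isolate carries a different allele gives DI = 1, and a locus with a single allele gives DI = 0, which is why a threshold such as DI > 0.8 singles out the most variable markers.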
The genetic distances between two isolates i and j were calculated as follows:
With seven markers, a difference at one marker is equivalent to approximately 15%, and five shared markers out of seven (5/7) to approximately 70% similarity. In our study, under the criteria applied to both MLVA and MLST analysis, two strains are considered similar when they have at least 70% similarity, i.e. when they differ by no more than a DLV. The interest of the method is that it quantifies the difference.
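The percentages quoted above are consistent with a categorical distance, i.e. the proportion of markers at which two profiles differ. The sketch below assumes that reading; the function names, the 0.70 threshold argument and the example profiles are illustrative and are not taken from the study.

```python
def categorical_distance(profile_a, profile_b):
    """Proportion of loci at which two allelic profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    if not profile_a:
        raise ValueError("profiles must contain at least one locus")
    differing = sum(1 for a, b in zip(profile_a, profile_b) if a != b)
    return differing / len(profile_a)


def are_similar(profile_a, profile_b, threshold=0.70):
    """True when similarity (1 - distance) reaches the threshold."""
    return 1.0 - categorical_distance(profile_a, profile_b) >= threshold


# Hypothetical seven-marker profiles (one allele per marker)
reference = (2, 5, 3, 7, 4, 1, 6)
slv = (2, 5, 3, 7, 4, 1, 9)   # differs at one marker
dlv = (2, 5, 3, 7, 4, 8, 9)   # differs at two markers
tlv = (2, 5, 3, 7, 0, 8, 9)   # differs at three markers

print(round(categorical_distance(reference, slv), 3))  # 1/7
print(are_similar(reference, dlv))  # 5/7 shared, above the threshold
print(are_similar(reference, tlv))  # 4/7 shared, below the threshold
```

With seven markers, a double-locus variant shares 5/7 (about 71%) of its alleles with the reference and is therefore still counted as similar, whereas a triple-locus variant is not.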
The minimum spanning trees by MLST, using the 7 housekeeping genes, and by MLVA were constructed using BioNumerics ver. 5.0 with the categorical coefficient. Priority rules were set as follows: i) maximum number of single-locus variants (SLVs); ii) maximum number of SLVs and double-locus variants (DLVs); iii) maximum neighbour distance of two loci (DLV) and minimum cluster size of 2 STs when the seven housekeeping gene markers were used by MLST; iv) maximum neighbour distance of two loci (DLV) and minimum cluster size of 2 MTs when 17 markers were used, and of one locus (SLV) and 2 MTs when 7 markers were used by MLVA.
The congruence between the MLST and MLVA distance matrices was calculated, as the percentage difference in the genetic distance between two isolates depending on the number of markers used, also with BioNumerics ver. 5.0.
The Inter-Matrix Difference (IMD) was calculated using the formula below, where d(i,j) is the genetic distance between i and j, and n the number of isolates. Marker numbers refer to Table 2. The lower the IMD value, the closer the distance matrices given by the two techniques.
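To make the tree construction more concrete, the sketch below builds a minimum spanning tree over allelic profiles with Prim's algorithm, using the number of differing loci as the edge weight. It is only an illustration of the general technique: it does not reproduce the BioNumerics implementation or its priority rules for breaking ties, and the type names and profiles are hypothetical.

```python
def minimum_spanning_tree(profiles):
    """Prim's algorithm over allelic profiles.

    `profiles` maps a type name to a tuple of alleles.
    Returns a list of (parent, child, differing_loci) edges.
    Ties are broken by insertion order, not by SLV/DLV priority rules.
    """
    names = list(profiles)
    if not names:
        return []

    def differing_loci(u, v):
        return sum(a != b for a, b in zip(profiles[u], profiles[v]))

    root = names[0]
    # Cheapest known connection from the tree to each outside node
    best = {n: (differing_loci(root, n), root) for n in names[1:]}
    edges = []
    while best:
        node = min(best, key=lambda n: best[n][0])
        weight, parent = best.pop(node)
        edges.append((parent, node, weight))
        # The newly added node may offer a cheaper link to the rest
        for other in best:
            candidate = differing_loci(node, other)
            if candidate < best[other][0]:
                best[other] = (candidate, node)
    return edges


# Hypothetical three-locus profiles for four MLVA types
example_profiles = {
    "MT1": (1, 1, 1),
    "MT2": (1, 1, 2),
    "MT3": (1, 2, 2),
    "MT4": (3, 3, 3),
}
for parent, child, weight in minimum_spanning_tree(example_profiles):
    print(parent, child, weight)
```

In this toy example MT1, MT2 and MT3 are chained by single-locus differences, while MT4 is attached by an edge of three differing loci and would appear as a distant singleton.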
Results and discussion
The discriminatory power of MLVA was compared to that of MLST by analysing 331 isolates of S. pneumoniae which had been previously serotyped and belonged to 10 sequence types. The discriminatory power was analysed in two steps: first by analysing the population, including its composition and genetic diversity, using 17 markers, then by analysing the genetic diversity of this population using sets of 7 markers described by different authors [19, 25, 26].
The genetic diversity of the 331 isolates of S. pneumoniae was assessed by MLVA using 17 markers (Table 2). A total of 220 MLVA types (MTs) were identified and clustered into 11 clonal complexes and 17 singletons by minimum spanning tree analysis (Figure 1A). DI > 0.8 was achieved for three loci, ms17, ms37 and ms39, which are the most discriminatory. The congruence between MLST and MLVA was estimated at 67% (Figure 1A). By MLST, the locus variation is a DLV between ST227 and ST306 and between ST138 and ST176, and an SLV between ST156 and ST162 (Figure 1B). The other STs differed at 5 loci. MLVA underlines genetic variability within MLST types. ST9, ST65 and ST306 are more clonal than the others, whereas ST176 is much more diversified by MLVA than by MLST, and ST156 and ST162 presented a unique pattern. ST162 is either grouped with ST156 to form a clonal complex or forms a clonal complex by itself with a 3-locus difference. Isolates of ST162 formed two distinct MLVA complexes (MC), one mainly associated with serotype 19F (MC162a) and the other (MC162b) associated with 9V, suggesting independent evolution following divergence from a common ST162 ancestor, combined with a capsular switching event. Moreover, serotype 14, which is an invasive serotype, was shown to be a variant of ST156 and 9V, and was therefore clustered within ST156/162. Other isolates of serotype 14, belonging to ST9, are well separated from ST156/162.
Knowing the MLVA type, it is possible to deduce not only the ST but also the associated serotype, depending on the clonality of the serotype. This is the case for serotype 1 because of its strong clonality, whereas it is not possible for serotype 19F. Moreover, carriage is more frequent for certain serotypes, particularly serotype 19F, meaning that isolates belonging to those serotypes often exchange DNA with other carried strains. Thus the serotype of a pneumococcal strain can change while its other genetic characteristics do not. Indeed, carriage serotypes are distributed along the dendrogram and can belong to very different genotypes.
However, in order to compare identical numbers of MLST and MLVA markers, a set of seven MLVA markers was considered. The set includes three markers with the highest discriminatory power (DI > 0.8), one marker with a low discriminatory power acting as an anchor for the dendrogram, and three others, selected for a low IMD and for their ability to distinguish ST227 and ST306, and based on previous data. The composition of the MLVA set was adapted as follows: ms17, ms19, ms25, ms27, ms33, ms37, ms39.
The comparison between MLST and MLVA using seven markers was obtained by construction of a minimum spanning tree (Figure 2A). Congruence MLST/MLVA was 47.2%.
Then, the congruence between MLST and MLVA obtained with the reduced MLVA scheme was compared to that obtained with Elberse’s seven-marker set (Figure 2C) and Pichon’s seven-marker set (Figure 2B). Elberse’s scheme was designed for studying the population structure of S. pneumoniae, whilst Pichon’s markers were selected as the combination giving the highest discriminatory power for outbreak investigation. The congruence between the genetic distances of the 331 isolates determined by MLST and by MLVA (Figures 2B, 2C and Table 2) was 65.1% with Pichon’s markers and 43.8% with Elberse’s markers. Previously, congruence MLST/MLVA was estimated at 59% when the same set of isolates was analysed using markers ms17, ms19, ms25, ms33, ms37, ms40 and ms41. Pichon’s markers gave a congruence similar to that of the 17-marker set of this study, and the highest MLST/MLVA congruence among the three seven-marker sets (A, B, C), but ST227/ST306 and ST156/ST162 were grouped within the same clonal complex. MLST and MLVA results are coherent: when the genetic distance between two STs is low, it is also low between the two corresponding MTs.
Applying the sets of markers selected in two other studies on S. pneumoniae to the population selected in this study revealed (Table 2) that (i) two markers, ms25 and ms37, are used by all authors, including this study, and presented a high DI whichever strains were used and whatever the aim of the study; (ii) several markers were never used: ms26, ms31 and ms35; (iii) the other markers, ms17, ms19 and ms33, were dependent on the method, i.e. on the capacity to discriminate the clonal complexes; (iv) the capacity of MLVA to discriminate STs varies depending on the set of markers used, and a high percentage of congruence does not mean a better discriminant capacity.
The selection of the markers, except for ms25 and ms37, was dependent on the studied population. MLVA based on the marker sets of this study (A) and of Pichon (B) clustered the study population in accordance with MLST data, whilst Elberse’s marker set (C) gave a lower resolution of the population.
The results suggested that 14 out of the 17 markers previously described for S. pneumoniae can be selected whatever the S. pneumoniae population considered.
In other words, analysis of strains with the same ST but isolated in different countries will give similar results, i.e. many new MLVA types associated with the same ST can be identified, as was observed for Niger strains (Additional file 1). However, the higher the number of markers, the greater the diversity of genotypes observed. Some markers are specific to the bacterial population.
MLVA can discriminate relevant subgroups among strains belonging to the same ST, and offers the possibility to deduce the ST from the MT.
In this study MLST and MLVA were compared for their discriminatory power for S. pneumoniae populations, with the aim of defining a set of markers that can be used whatever the population and the aim of the study.
The study population was composed of 331 isolates belonging to the top 10 STs in England. MLVA using 17 markers yields clustering of the isolates similar to that obtained by MLST. Moreover, MLVA can differentiate clonal complexes within STs, particularly ST156 and ST162. Our study showed that the number of VNTR loci may be reduced to 7 to achieve a cluster pattern similar to that of MLST.
In conclusion, prior to any study, only 14 markers have to be tested. Then, the selection of 7 markers is based on MLVA markers with a DI > 0.8 (including markers ms25 and ms37) and a selection of others including one marker with a low discriminatory power acting as an anchor for the dendrogram, and 4 others depending on the population tested and the aim of the study. The set of markers, whose composition depends on the population studied, could be used either to investigate local outbreaks or to track the worldwide spread of clones and particularly the emergence of variants.
- S. pneumoniae: Streptococcus pneumoniae
- PFGE: Pulsed-Field Gel Electrophoresis analysis
- MLST: Multiple Loci Sequence Typing
- DNA: Deoxyribonucleic acid
- MLVA: Multiple-Locus Variable-Number Tandem-Repeat Analysis
- VNTR: Variable Number of Tandem Repeats
- DI: Hunter-Gaston Diversity Index
Feldman C, Klugman KP: Pneumococcal infections. Curr Opin Infect Dis. 1997, 10: 109-115.
Gray BM, Dillon HC: Clinical and epidemiologic studies of pneumococcal infection in children. Paed Infect Dis. 1986, 5: 201-207.
Park IH, Pritchard DG, Cartee R, Brandao A, Brandileone MC, Nahm MH: Discovery of a new capsular serotype (6C) within serogroup 6 of Streptococcus pneumoniae. J Clin Microbiol. 2007, 45: 1225-1233.
Calix JJ, Nahm MH: A new pneumococcal serotype, 11E, has a variably inactivated wcjE gene. J Infect Dis. 2010, 202: 29-38.
Scott JAG, Hall AJ, Dagan R, Dixon JMS, Eykyn SJ, Fenoll A, Hortal M, Jette LP, Jorgensen JH, Lamothe F, Latorre C, Macfarlane JT, Shlaes DM, Smart LE, Taunay A: Serogroup-specific epidemiology of Streptococcus pneumoniae -associations with age, sex, and geography in 7,000 episodes of invasive disease. Clin Infect Dis. 1996, 22: 973-981.
Coffey T, Daniels M, Enright C, Spratt B: Serotype 14 variants of the Spanish penicillin-resistant serotype 9 V clone of Streptococcus pneumoniae arose by large recombinational replacements of the cpsA-pbp1a region. Microbiol. 1999, 145: 2023-2031.
Jefferies JMC, Smith A, Clarke SC, Dowson C, Mitchell TJ: Genetic analysis of diverse disease-causing pneumococci indicates high levels of diversity within serotypes and capsule switching. J Clin Microbiol. 2004, 42: 5681-5688.
Brueggemann AB, Griffiths DT, Meats E, Peto T, Crook DW, Spratt BG: Clonal relationships between invasive and carriage Streptococcus pneumoniae and serotype- and clone-specific differences in invasive disease potential. J Infect Dis. 2003, 187: 1424-1432.
Enright M, Spratt G: A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiol. 1998, 144: 3049-3060.
Sá-Leão R, Pinto F, Aguiar S, Nunes S, Carriço JA, Frazão N, Gonçalves-Sousa N, Melo-Cristino J, de Lencastre H, Ramirez M: Invasiveness of pneumococcal serotypes and clones circulating in Portugal before widespread use of conjugate vaccines revealing heterogeneous behavior of clones expressing the same serotype. J Clin Microbiol. 2011, 49: 1369-1375.
Hanage WP, Kaijalainen TH, Syrjanen RK, Auranen K, Leinonen M, Makela PH, Spratt BG: Invasiveness of serotypes and clones of Streptococcus pneumoniae among children in Finland. Infect Immun. 2005, 73: 431-435.
Sandgren A, Sjostrom K, Olsson-Liljequist B, Christensson B, Samuelsson A, Kronvall G, Henriques Normark B: Effect of clonal and serotype-specific properties on the invasive capacity of Streptococcus pneumoniae. J Infect Dis. 2004, 189: 785-796.
Sjostrom K, Spindler C, Ortqvist A, Kalin M, Sandgren A, Kuhlmann-Berenzon S, Henriques-Normark B: Clonal and capsular types decide whether pneumococci will act as a primary or opportunistic pathogen. Clin Infect Dis. 2006, 42: 451-459.
Jefferies JM, Smith AJ, Edwards GFS, McMenamin J, Mitchell TJ, Clarke SC: Temporal analysis of invasive pneumococcal clones from Scotland illustrates fluctuations in diversity of serotype and genotype in the absence of pneumococcal conjugate vaccine. J Clin Microbiol. 2010, 48: 87-96.
Elberse KEM, van de Pol I, Witteveen S, van der Heide HGJ, Schot CS, van Dijk A, van der Ende A, Schouls LM: Population structure of invasive Streptococcus pneumoniae in the Netherlands in the pre-vaccination era assessed by MLVA and capsular sequence typing. Plos One. 2011, 6: 1-11.
Lefevre JC, Faucon G, Sicard AM, Gasc AM: DNA fingerprinting of Streptococcus pneumoniae strains by pulsed-field gel electrophoresis. J Clin Microbiol. 1993, 31: 2724-2728.
Hall LM, Whiley RA, Duke B, George RC, Efstratiou A: Genetic relatedness within and between serotypes of Streptococcus pneumoniae from the United Kingdom: analysis of multilocus enzyme electrophoresis, pulsed-field gel electrophoresis, and antimicrobial resistance patterns. J Clin Microbiol. 1996, 34: 853-859.
Aanensen DM, Spratt BG: The multilocus sequence typing network: mlst.net. Nuc Acids Res. 2005, 33: W728-W733.
Koeck J, Underwood A, Brunetaud J, Leroy P, Granger-Ferbos A, Koeck J, Pichon B: Multiple-Locus variable-number tandem-repeat analysis of Streptococcus pneumoniae and comparison with MLST. 2008, Zakopane, Poland: Presented at European Study Group for Epidemiological Markers 8th International Meeting on Microbial Epidemiological Markers, 76-Poster
van Belkum A, Scherer S, van Leeuwen W, Willemse D, van Alphen L, Verbrugh H: Variable number of tandem repeats in clinical strains of Haemophilus influenzae. Infect Immun. 1997, 65: 5017-5027.
Keim P, Price LB, Klevytska AM, Smith KL, Schupp JM, Okinaka R, Jackson PJ, Hugh-Jones ME: Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis. J Bacteriol. 2000, 182: 2928-2936.
Pourcel C, André-Mazeaud F, Neubauer H, Ramisse F, Vergnaud G: Tandem repeat analysis for the high resolution phylogenetic analysis of Yersinia pestis. BMC Microbiol. 2004, 4: 22.
Koeck JL, Njanpop-Lafourcade BM, Cade S, Varon E, Sangare L, Valjevac S, Vergnaud G, Pourcel C: Evaluation and selection of tandem repeat loci for Streptococcus pneumoniae MLVA strain typing. BMC Microbiol. 2005, 5: 66.
Yaro S, Lourd M, Traore Y, Njanpop-Lafourcade BM, Sawadogo A, Sangare L, Hien A, Ouedraogo MS, Sanou O, Du Chatelet I P, Koeck JL, Gessner BD: Epidemiological and molecular characteristics of a highly lethal pneumococcal meningitis epidemic in Burkina Faso. Clin Infect Dis. 2006, 43: 693-700.
Elberse KEM, Nunes S, Sa-leao R, van der Heide HGJ, Schouls LM: Multiple-locus variable number tandem repeat analysis for Streptococcus pneumoniae: comparison with PFGE and MLST. Plos One. 2011, 6: 1-8.
Pichon P, Moyce L, Sheppard C, Slack M, Turbitt D, Pebody R, Spencer DA, Edwards J, Krahe D, George R: Molecular typing of pneumococci for investigation of linked cases of invasive pneumococcal disease. J Clin Microbiol. 2010, 48: 1926-1928.
Pichon B, Bennett HV, Efstratiou A, Slack MP, George RC: Genetic characteristics of pneumococcal disease in elderly patients before introducing the pneumococcal conjugate vaccine. Epidemiol Infect. 2009, 137: 1049-1056.
Platt S, Pichon B, George R, Green J: A bioinformatics pipeline for high-throughput microbial multilocus sequence typing (MLST) analyses. Clin Microbiol Infect. 2006, 12: 1144-1146.
Coffey TJ, Enright MC, Daniels M, Morona JK, Morona R, Hryniewicz W, Paton JC, Spratt BG: Recombinational exchanges at the capsular polysaccharide biosynthetic locus lead to frequent serotype changes among natural isolates of Streptococcus pneumoniae. Mol Microbiol. 1998, 27: 73-83.
Amadou Hamidou A, Djibo S, Boisier P, Varon E, Dubrous P, Chanteau S, Koeck JL: Diversité génétique de souches de pneumocoque isolées de cas de méningite au Niger, 2003–2006. 2007, Marseille, France: Actualités du Pharo, Poster
MLST testing was funded by a UK Department of Health Grant. MLVA testing was funded by the French Military Health Service. The authors declare no financial competing interests: no stocks or shares are held in an organization that may in any way gain or lose financially from the publication of this manuscript, either now or in the future; no patents relating to the content of the manuscript are held or currently being applied for; and no reimbursements, fees, funding, or salary have been received from an organization that holds or has applied for patents relating to the content of the manuscript. The authors declare no non-financial competing interests (political, personal, religious, ideological, academic, intellectual, commercial or any other).
HvC participated in the methodology comparison and drafted the manuscript. BP participated in the design of the study, performed the MLST, provided the isolates and revised the manuscript critically for important intellectual content. PL conducted and carried out the MLVA protocol. AGF carried out MLVA and molecular genetic data analysis and helped to draft the manuscript. AU performed the statistical analysis and revised the manuscript. BS revised the manuscript critically for important intellectual content. JLK conceived the study, and participated in its design and coordination. All authors read and approved the final manuscript.
Cite this article
van Cuyck, H., Pichon, B., Leroy, P. et al. Multiple-locus variable-number tandem-repeat analysis of Streptococcus pneumoniae and comparison with multiple loci sequence typing. BMC Microbiol 12, 241 (2012). https://doi.org/10.1186/1471-2180-12-241
- S. pneumoniae
- Universal marker set
- Population structure