Streptococcus pneumoniae infections remain a major cause of morbidity and mortality worldwide. The diversity of pneumococci was first evidenced by serotyping of their capsular polysaccharides, which are responsible for virulence, resolving more than 93 serotypes. Molecular tools have been developed to track the emergence and spread of resistant, hypervirulent or non-vaccine-type clones, particularly DNA-based methods using genetic polymorphism. Pulsed-Field Gel Electrophoresis (PFGE) analysis and Multilocus Sequence Typing (MLST) are the most frequently used genotyping techniques for S. pneumoniae. MLST is based on sequence comparison of housekeeping genes and clusters isolates into sequence types. The availability of genome sequence data from different S. pneumoniae strains facilitated the search for another class of genetic markers, polymorphic DNA sequences suitable for Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA). This study aims to confirm the relevance of MLVA for S. pneumoniae, to compare the performance of MLST and MLVA in discriminating subgroups of strains belonging to the same Sequence Type (ST), and to define a restricted but universal set of MLVA markers with at least the same discriminatory power as MLST for S. pneumoniae, by applying marker sets used by different authors to 331 isolates selected in the UK.
A minimum spanning tree was built, including the serotype distribution, to compare MLVA and MLST results. A total of 220 MLVA types were identified, grouped in 10 Sequence Types (ST). MLVA split ST162 into two clonal complexes. A minimal set was defined: ms25 and ms37, ms17, ms19, ms33, ms39, and ms40, including two universal markers. The selection was based on MLVA markers with a Diversity Index >0.8, complemented by others chosen according to the population tested and the aim of the study. This set of 7 MLVA markers yields strain clusters similar to those obtained by MLST.
MLVA can discriminate relevant subgroups among strains belonging to the same ST. MLVA offers the possibility of deducing the ST from the MLVA type. It can be used to investigate local outbreaks or to track the worldwide spread of clones and the emergence of variants.
S. pneumoniae, MLST, MLVA, Universal marker set, Population structure
Streptococcus pneumoniae infections remain a major cause of morbidity and mortality worldwide, causing diseases which range in severity from otitis media and sinusitis to pneumonia, septicaemia and meningitis. S. pneumoniae is a commensal of the human nasopharynx.
The diversity of pneumococci was first evidenced by serotyping of their capsular polysaccharides, resolving more than 93 serotypes [3, 4]. However, only 16 serotypes cause approximately 90% of invasive disease worldwide [1, 5]. Because the pneumococcus is naturally transformable, horizontal recombination allows one serotype to belong to different genotypes, and a single genotype to express different capsule genes, i.e. different serotypes. This phenomenon is known as capsular switching [6, 7]. Capsular serotype may be more important than genotype in the ability of pneumococci to cause invasive disease, but other investigations underline the importance of genotype as well [9–13].
Molecular tools, particularly DNA-based methods using genetic polymorphism, have been developed to track the emergence and spread of resistant or hypervirulent clones, and shifts in serotype distribution detected for both non-invasive and invasive disease, reported before or since the introduction of the heptavalent protein-polysaccharide pneumococcal conjugate vaccine (PCV7) in different countries [14, 15]. Among them, Pulsed-Field Gel Electrophoresis (PFGE) analysis [16, 17] and Multilocus Sequence Typing (MLST) are the most frequently used genotyping methods for S. pneumoniae. PFGE is based on restriction enzyme pattern analysis; MLST is a sequence-based method targeting 7 housekeeping genes. An S. pneumoniae-specific MLST scheme targeting aroE, gdh, gki, recP, spi, xpt, and ddl was developed, together with an online identification page at http://www.mlst.net. PFGE and MLST have been extensively compared [15, 17, 19] and both have proven their capacity to discriminate efficiently among genotypes. However, PFGE lacks, to some extent, inter-laboratory reproducibility, and MLST is expensive and thus may not be affordable for large-scale studies. The availability of genome data has greatly facilitated the search for polymorphic DNA sequences. Among them, polymorphic tandem repeat sequences, also called Variable Number of Tandem Repeats (VNTR), are an interesting class of genetic markers: multiple alleles may be present at a single locus, and size differences are easily resolved by electrophoresis of PCR products. VNTR analysis has proved highly relevant for typing pathogenic bacterial species (Haemophilus influenzae; Bacillus anthracis; Yersinia pestis). An S. pneumoniae Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA) scheme was developed, with a dedicated web-based database at http://www.mlva.eu. It targets 17 distinct loci and was used initially to characterise pneumococcal isolates from Burkina Faso.
Although the discriminatory power of MLVA has been demonstrated, the large number of loci included in the scheme may limit its use in large-scale studies (cost, timeframe, large number of samples).
This study aims at confirming the relevance of MLVA of S. pneumoniae, comparing MLST and MLVA performances when discriminating subgroups of strains belonging to the same Sequence Type (ST), and defining a restricted but universal set of MLVA markers that has at least the same discriminatory power as MLST by comparing the population genetic structure of S. pneumoniae using different published sets of markers [15, 19, 23, 25, 26].
A total of 331 invasive isolates of Streptococcus pneumoniae from the Health Protection Agency (HPA) collection, London, UK, collected during the period 2002–2006, were selected from among the 10 major MLST sequence types (STs) circulating in England and Wales (see  and  for detailed MLST methodology), with approximately 30 isolates per ST. The selection included serotypes commonly associated with these STs and all possible serotype variants (Table 1) identified in the HPA collection. Isolates were serotyped by slide agglutination against the full antisera panel from the Statens Serum Institut (Denmark) as part of the Systemic and Respiratory Infection Laboratory (HPA, London) reference service. The isolates were collected from blood (314), cerebrospinal fluid (13), pleural fluid (2), abscess (1), and bronchial aspirate (1).
MLVA was performed as previously described. The first 17 VNTRs (Spneu15 to Spneu41) were used. The last one (Spneu42) failed to amplify from either the isolates or the reference strains and was therefore excluded from this study. For convenience, the prefix “Spneu” (for Streptococcus pneumoniae) was replaced by “ms” (for minisatellite) in this paper.
Genetic diversity was measured by the Hunter-Gaston Diversity Index (DI) using the tool at http://www.hpa-bioinformatics.org.uk/cgi-bin/DICI/DICI.pl. A high DI with a narrow confidence interval (CI) indicates accurate measurement of a highly variable locus. Such loci may be sufficiently variable to be used to discriminate between samples or as a starting point for assay development.
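For readers who wish to reproduce the index without the web tool, the Hunter-Gaston index is usually written as D = 1 - [1/(N(N-1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j the number of isolates carrying allele j. The following minimal Python sketch implements that standard formula; the function name and the example repeat counts are illustrative assumptions, not study data, and the confidence interval is not computed.

```python
from collections import Counter


def hunter_gaston_di(alleles):
    """Hunter-Gaston diversity index for a list of allele (or type) labels.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(alleles).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical repeat counts observed at one locus for six isolates.
example_locus = [7, 7, 8, 9, 9, 9]
print(round(hunter_gaston_di(example_locus), 3))  # prints 0.733
```

A locus at which every isolate carries the same allele gives D = 0, and a locus at which every isolate differs gives D = 1.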
The genetic distance between two isolates i and j was calculated as follows:
A difference at one marker out of seven corresponds to a distance of approximately 15%; a difference at five markers out of seven corresponds to approximately 70%. In our study, for both MLVA and MLST analysis, two strains were considered similar when they shared at least 70% similarity, i.e. when they differed at no more than two loci (DLV). The value of the method is that it quantifies the difference.
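The percentages quoted above are consistent with a simple categorical distance, i.e. the proportion of loci at which two profiles differ. The sketch below assumes that form; the function name and the seven-locus profiles are hypothetical and are not taken from the study.

```python
def categorical_distance(profile_a, profile_b):
    """Proportion of loci at which two allelic profiles differ (0.0 to 1.0)."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    differing = sum(1 for a, b in zip(profile_a, profile_b) if a != b)
    return differing / len(profile_a)


# Hypothetical seven-locus profiles (repeat numbers or allele numbers).
isolate_1 = (7, 3, 5, 2, 9, 4, 6)
isolate_2 = (7, 3, 5, 2, 9, 4, 8)  # single-locus variant of isolate_1
isolate_3 = (7, 3, 5, 2, 9, 1, 8)  # double-locus variant of isolate_1

d_slv = categorical_distance(isolate_1, isolate_2)  # 1/7, about 14%
d_dlv = categorical_distance(isolate_1, isolate_3)  # 2/7, about 29%
print(f"SLV distance: {d_slv:.1%}, DLV similarity: {1 - d_dlv:.1%}")
```

Under this assumption a double-locus variant retains 5/7 (about 71%) similarity, which matches the 70% threshold used for considering two strains similar.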
The minimum spanning trees for MLST (using the 7 housekeeping genes) and for MLVA were constructed using BioNumerics ver. 5.0 with the categorical coefficient. Priority rules were fixed as follows: maximum number of i) single-locus variants (SLVs); ii) SLVs and double-locus variants (DLVs); iii) maximum neighbour minimum cluster size of two loci (DLV) and 2 STs, when the seven housekeeping gene markers were used by MLST; iv) maximum neighbour minimum cluster size of two loci (DLV) and 2 MTs when 17 markers were used, and of one locus (SLV) and 2 MTs when 7 markers were used, by MLVA.
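As an illustration of how a minimum spanning tree can be derived from categorical differences between profiles, the following sketch applies Prim's algorithm to hypothetical allelic profiles. It does not reproduce the BioNumerics priority rules described above (ties are simply broken by isolate order), and the profiles are invented for the example.

```python
def minimum_spanning_tree(profiles):
    """Prim's algorithm on the number of differing loci between profiles.

    Returns a list of edges (i, j, n_differing_loci) linking all profiles.
    """
    n = len(profiles)
    if n == 0:
        return []
    if len({len(p) for p in profiles}) != 1:
        raise ValueError("all profiles must have the same number of loci")

    def n_diff(i, j):
        return sum(1 for a, b in zip(profiles[i], profiles[j]) if a != b)

    in_tree = {0}  # start from the first profile
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree; ties broken by index.
        d, i, j = min(
            (n_diff(i, j), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j, d))
        in_tree.add(j)
    return edges


# Hypothetical three-locus profiles for four types.
types = [(1, 1, 1), (1, 1, 2), (1, 2, 2), (3, 3, 3)]
for i, j, d in minimum_spanning_tree(types):
    print(f"type {i} -- type {j}: {d} locus difference(s)")
```

In this toy example the first three types are chained by single-locus differences, whereas the fourth type differs from all others at three loci and would therefore fall outside an SLV or DLV clonal complex.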
The congruence between the MLST and MLVA distance matrices was calculated, as the percentage difference in genetic distance between two isolates depending on the number of markers used, also using BioNumerics ver. 5.0.
The Inter-Matrix Difference (IMD) was calculated using the formula below, where d(i,j) is the genetic distance between isolates i and j, and n is the number of isolates. Marker numbers refer to Table 2. The lower the IMD value, the closer the distance matrices given by the two techniques.
Genetic diversity of the 331 isolates of S. pneumoniae
The marker name contains all the numbers necessary to characterize the marker with reference to a given sequenced genome (reference strain, R6). For example, in “ms15_507bp_45bp_7U”, ms means minisatellite, 507 bp is the size of the amplification product of this marker, 45 bp is the size of the repeat unit, and 7 is the number of repeats. Markers used by each author are indicated by a cross (+); the seven-marker sets are labelled as follows: (A) this paper, (B) Pichon’s and (C) Elberse’s. The MLST/MLVA congruence in percent for each author is indicated at the bottom of the table.
* DI: diversity index. † CI: confidence interval.
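The naming convention described in the legend can be decoded programmatically. The sketch below parses a marker name of that form and, under the common MLVA assumption that the flanking sequence is constant so that size differences correspond to whole repeat units, estimates the repeat number from an observed PCR product size. The function names and the observed size of 597 bp are hypothetical.

```python
import re

# Pattern for names such as "ms15_507bp_45bp_7U".
MARKER_RE = re.compile(r"^ms(\d+)_(\d+)bp_(\d+)bp_(\d+)U$")


def parse_marker(name):
    """Split a marker name into locus number, amplicon size, unit size, repeats."""
    match = MARKER_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised marker name: {name!r}")
    locus, amplicon_bp, unit_bp, ref_repeats = map(int, match.groups())
    return {
        "locus": locus,
        "amplicon_bp": amplicon_bp,   # product size in the reference strain
        "unit_bp": unit_bp,           # size of one repeat unit
        "ref_repeats": ref_repeats,   # repeat number in the reference strain
    }


def repeats_from_size(name, observed_bp):
    """Estimate the repeat number of an allele from its PCR product size.

    Assumes constant flanking regions (an assumption of this sketch).
    """
    info = parse_marker(name)
    extra_units = round((observed_bp - info["amplicon_bp"]) / info["unit_bp"])
    return info["ref_repeats"] + extra_units


print(parse_marker("ms15_507bp_45bp_7U"))
print(repeats_from_size("ms15_507bp_45bp_7U", 597))  # prints 9
```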
Results and discussion
The discriminatory power of MLVA was compared with that of MLST by analysing 331 isolates of S. pneumoniae which had previously been serotyped and which belonged to 10 sequence types. The discriminatory power was analysed in two steps: first by analysing the population, including its composition and genetic diversity, using 17 markers, then by analysing the genetic diversity of this population using sets of 7 markers described by different authors [19, 25, 26].
The genetic diversity of the 331 isolates of S. pneumoniae was assessed by MLVA using 17 markers (Table 2). A total of 220 MLVA types (MTs) were identified and clustered into 11 clonal complexes and 17 singletons by minimum spanning tree analysis (Figure 1A). A DI > 0.8 was achieved for three loci, ms17, ms37 and ms39, which were the most discriminatory. The congruence between MLST and MLVA was estimated at 67% (Figure 1A). By MLST, ST227 and ST306, and ST138 and ST176, are DLVs of each other, and ST156 and ST162 are SLVs (Figure 1B). The other STs differed at 5 loci. MLVA highlights genetic variability within MLST types. ST9, ST65 and ST306 are more clonal than the others, whereas ST176 is much more diversified by MLVA than by MLST, and ST156 and ST162 presented a unique pattern. ST162 is either grouped with ST156 to form a clonal complex or forms a clonal complex by itself with a 3-locus difference. Isolates of ST162 formed two distinct MLVA complexes (MC), one mainly associated with serotype 19F (MC162a) and the other (MC162b) associated with 9V, suggesting independent evolution following divergence from a common ST162 ancestor, combined with a capsular switching event. Moreover, serotype 14, which is an invasive serotype, was shown to be a variant of ST156 and 9V, and was therefore clustered within ST156/162. Other serotype 14 isolates, belonging to ST9, are well separated from ST156/162.
Knowing the MLVA type, it is possible to deduce not only the ST but also the associated serotype, depending on the clonality of the serotype. This is the case for serotype 1 because of its strong clonality, whereas it is not possible for serotype 19F. Moreover, carriage is more frequent for certain serotypes, particularly serotype 19F, meaning that isolates belonging to those serotypes often exchange DNA with other carried strains. The serotype of a pneumococcal strain can therefore change while its other genetic characteristics do not. Indeed, carriage serotypes are distributed along the dendrogram and can belong to very different genotypes.
However, in order to compare identical numbers of MLST and MLVA markers, a set of seven MLVA markers was considered. The set includes three markers with the highest discriminatory power (DI > 0.8), one marker with a low discriminatory power acting as an anchor for the dendrogram, and three others selected for a low IMD, for their ability to distinguish ST227 and ST306, and on the basis of previous data. The composition of the MLVA set was as follows: ms17, ms19, ms25, ms27, ms33, ms37, ms39.
The comparison between MLST and MLVA using seven markers was obtained by construction of a minimum spanning tree (Figure 2A). Congruence MLST/MLVA was 47.2%.
The congruence between MLST and MLVA for the reduced MLVA scheme was then compared with that obtained using Elberse’s seven-marker set (Figure 2C) and Pichon’s seven-marker set (Figure 2B). Elberse’s scheme was designed for studying the population structure of S. pneumoniae, whilst Pichon’s markers were selected as the combination giving the highest discriminatory power for outbreak investigation. The congruence between the genetic distances among the 331 isolates determined by MLST and by MLVA (Figures 2B, 2C and Table 2) was 65.1% with Pichon’s markers and 43.8% with Elberse’s markers. Previously, MLST/MLVA congruence was estimated at 59% when the same set of isolates was analysed using markers ms17, ms19, ms25, ms33, ms37, ms40 and ms41. Pichon’s markers gave a congruence similar to that of the 17-marker set of this study, and the highest MLST/MLVA congruence among the three seven-marker sets (A, B, C), but ST227/ST306 and ST156/ST162 were grouped within the same clonal complex. MLST and MLVA results are coherent: when the genetic distance between two STs is low, the distance between the corresponding MTs is also low.
Applying the marker sets selected in two other studies of S. pneumoniae to the population selected in this study revealed (Table 2) that (i) two markers, ms25 and ms37, are used by all authors, including in this study, and presented a high DI whichever strains were used and whatever the aim of the study; (ii) several markers were never used: ms26, ms31 and ms35; (iii) the other markers, ms17, ms19 and ms33, were dependent on the method, i.e. on the capacity to discriminate the clonal complexes; (iv) the capacity of MLVA to discriminate STs varies with the set of markers used, and a high percentage of congruence does not imply a better discriminatory capacity.
The selection of markers other than ms25 and ms37 was dependent on the studied population. MLVA based on the marker sets of this study (A) and of Pichon (B) clustered the study population in accordance with MLST data, whilst Elberse’s marker set (C) gave a lower resolution of the population.
The results suggest that 14 of the 17 markers previously described for S. pneumoniae can be selected whatever the S. pneumoniae population considered.
In other words, analysis of strains with the same ST but isolated in different countries will give similar results, i.e. many new MLVA types associated with the same ST can be identified, as was observed for strains from Niger (Additional file 1). However, the higher the number of markers, the greater the diversity of genotypes observed. Some markers are specific to the bacterial population.
MLVA can discriminate relevant subgroups among strains belonging to the same ST, and offers the possibility to deduce the ST from the MT.
In this study, MLST and MLVA were compared for their discriminatory power on S. pneumoniae populations, with the purpose of defining a set of markers that can be used whatever the population and the aim of the study.
The study population was composed of 331 isolates belonging to the top 10 STs in England. MLVA using 17 markers yields a clustering of the isolates similar to that obtained by MLST. Moreover, MLVA can differentiate distinct clonal complexes within STs, particularly ST156 and ST162. Our study showed that the number of VNTR loci may be reduced to 7 while achieving a cluster pattern similar to that of MLST.
In conclusion, prior to any study, only 14 markers have to be tested. The selection of 7 markers is then based on MLVA markers with a DI > 0.8 (including markers ms25 and ms37) and a selection of others, including one marker with a low discriminatory power acting as an anchor for the dendrogram and 4 others depending on the population tested and the aim of the study. The set of markers, whose composition depends on the population studied, could be used either to investigate local outbreaks or to track the worldwide spread of clones and particularly the emergence of variants.
PFGE: Pulsed-Field Gel Electrophoresis
MLST: Multilocus Sequence Typing
DNA: Deoxyribonucleic acid
MLVA: Multiple-Locus Variable-Number Tandem-Repeat Analysis
VNTR: Variable Number of Tandem Repeats
DI: Hunter-Gaston Diversity Index
HIA Robert Picqué
Health Protection Agency, Microbiology Services – Colindale
Gray BM, Dillon HC Jr: Clinical and epidemiologic studies of pneumococcal infection in children. Paed Infect Dis 1986, 5:201–207.
Park IH, Pritchard DG, Cartee R, Brandao A, Brandileone MC, Nahm MH: Discovery of a new capsular serotype (6C) within serogroup 6 of Streptococcus pneumoniae. J Clin Microbiol 2007, 45:1225–1233.
Calix JJ, Nahm MH: A new pneumococcal serotype, 11E, has a variably inactivated wcjE gene. J Infect Dis 2010, 202:29–38.
Scott JAG, Hall AJ, Dagan R, Dixon JMS, Eykyn SJ, Fenoll A, Hortal M, Jette LP, Jorgensen JH, Lamothe F, Latorre C, Macfarlane JT, Shlaes DM, Smart LE, Taunay A: Serogroup-specific epidemiology of Streptococcus pneumoniae: associations with age, sex, and geography in 7,000 episodes of invasive disease. Clin Infect Dis 1996, 22:973–981.
Coffey T, Daniels M, Enright C, Spratt B: Serotype 14 variants of the Spanish penicillin-resistant serotype 9V clone of Streptococcus pneumoniae arose by large recombinational replacements of the cpsA-pbp1a region. Microbiol 1999, 145:2023–2031.
Jefferies JMC, Smith A, Clarke SC, Dowson C, Mitchell TJ: Genetic analysis of diverse disease-causing pneumococci indicates high levels of diversity within serotypes and capsule switching. J Clin Microbiol 2004, 42:5681–5688.
Brueggemann AB, Griffiths DT, Meats E, Peto T, Crook DW, Spratt BG: Clonal relationships between invasive and carriage Streptococcus pneumoniae and serotype- and clone-specific differences in invasive disease potential. J Infect Dis 2003, 187:1424–1432.
Enright M, Spratt G: A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiol 1998, 144:3049–3060.
Sá-Leão R, Pinto F, Aguiar S, Nunes S, Carriço JA, Frazão N, Gonçalves-Sousa N, Melo-Cristino J, de Lencastre H, Ramirez M: Invasiveness of pneumococcal serotypes and clones circulating in Portugal before widespread use of conjugate vaccines revealing heterogeneous behavior of clones expressing the same serotype. J Clin Microbiol 2011, 49:1369–1375.
Hanage WP, Kaijalainen TH, Syrjanen RK, Auranen K, Leinonen M, Makela PH, Spratt BG: Invasiveness of serotypes and clones of Streptococcus pneumoniae among children in Finland. Infect Immun 2005, 73:431–435.
Sandgren A, Sjostrom K, Olsson-Liljequist B, Christensson B, Samuelsson A, Kronvall G, Henriques Normark B: Effect of clonal and serotype-specific properties on the invasive capacity of Streptococcus pneumoniae. J Infect Dis 2004, 189:785–796.
Sjostrom K, Spindler C, Ortqvist A, Kalin M, Sandgren A, Kuhlmann-Berenzon S, Henriques-Normark B: Clonal and capsular types decide whether pneumococci will act as a primary or opportunistic pathogen. Clin Infect Dis 2006, 42:451–459.
Jefferies JM, Smith AJ, Edwards GFS, McMenamin J, Mitchell TJ, Clarke SC: Temporal analysis of invasive pneumococcal clones from Scotland illustrates fluctuations in diversity of serotype and genotype in the absence of pneumococcal conjugate vaccine. J Clin Microbiol 2010, 48:87–96.
Elberse KEM, van de Pol I, Witteveen S, van der Heide HGJ, Schot CS, van Dijk A, van der Ende A, Schouls LM: Population structure of invasive Streptococcus pneumoniae in the Netherlands in the pre-vaccination era assessed by MLVA and capsular sequence typing. Plos One 2011, 6:1–11.
Lefevre JC, Faucon G, Sicard AM, Gasc AM: DNA fingerprinting of Streptococcus pneumoniae strains by pulsed-field gel electrophoresis. J Clin Microbiol 1993, 31:2724–2728.
Hall LM, Whiley RA, Duke B, George RC, Efstratiou A: Genetic relatedness within and between serotypes of Streptococcus pneumoniae from the United Kingdom: analysis of multilocus enzyme electrophoresis, pulsed-field gel electrophoresis, and antimicrobial resistance patterns. J Clin Microbiol 1996, 34:853–859.
Aanensen DM, Spratt BG: The multilocus sequence typing network: mlst.net. Nuc Acids Res 2005, 33:W728-W733.
Koeck J, Underwood A, Brunetaud J, Leroy P, Granger-Ferbos A, Koeck J, Pichon B: Multiple-Locus variable-number tandem-repeat analysis of Streptococcus pneumoniae and comparison with MLST. Zakopane, Poland: Presented at European Study Group for Epidemiological Markers 8th International Meeting on Microbial Epidemiological Markers; 2008:76. Poster
van Belkum A, Scherer S, van Leeuwen W, Willemse D, van Alphen L, Verbrugh H: Variable number of tandem repeats in clinical strains of Haemophilus influenzae. Infect Immun 1997, 65:5017–5027.
Keim P, Price LB, Klevytska AM, Smith KL, Schupp JM, Okinaka R, Jackson PJ, Hugh-Jones ME: Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis. J Bacteriol 2000, 182:2928–2936.
Pourcel C, André-Mazeaud F, Neubauer H, Ramisse F, Vergnaud G: Tandem repeat analysis for the high resolution phylogenetic analysis of Yersinia pestis. BMC Microbiol 2004, 4:22.
Koeck JL, Njanpop-Lafourcade BM, Cade S, Varon E, Sangare L, Valjevac S, Vergnaud G, Pourcel C: Evaluation and selection of tandem repeat loci for Streptococcus pneumoniae MLVA strain typing. BMC Microbiol 2005, 5:66.
Yaro S, Lourd M, Traore Y, Njanpop-Lafourcade BM, Sawadogo A, Sangare L, Hien A, Ouedraogo MS, Sanou O, Du Chatelet I P, Koeck JL, Gessner BD: Epidemiological and molecular characteristics of a highly lethal pneumococcal meningitis epidemic in Burkina Faso. Clin Infect Dis 2006, 43:693–700.
Elberse KEM, Nunes S, Sa-leao R, van der Heide HGJ, Schouls LM: Multiple-locus variable number tandem repeat analysis for Streptococcus pneumoniae: comparison with PFGE and MLST. Plos One 2011, 6:1–8.
Pichon P, Moyce L, Sheppard C, Slack M, Turbitt D, Pebody R, Spencer DA, Edwards J, Krahe D, George R: Molecular typing of pneumococci for investigation of linked cases of invasive pneumococcal disease. J Clin Microbiol 2010, 48:1926–1928.
Pichon B, Bennett HV, Efstratiou A, Slack MP, George RC: Genetic characteristics of pneumococcal disease in elderly patients before introducing the pneumococcal conjugate vaccine. Epidemiol Infect 2009, 137:1049–1056.
Platt S, Pichon B, George R, Green J: A bioinformatics pipeline for high-throughput microbial multilocus sequence typing (MLST) analyses. Clin Microbiol Infect 2006, 12:1144–1146.
Coffey TJ, Enright MC, Daniels M, Morona JK, Morona R, Hryniewicz W, Paton JC, Spratt BG: Recombinational exchanges at the capsular polysaccharide biosynthetic locus lead to frequent serotype changes among natural isolates of Streptococcus pneumoniae. Mol Microbiol 1998, 27:73–83.
Amadou Hamidou A, Djibo S, Boisier P, Varon E, Dubrous P, Chanteau S, Koeck JL: Diversité génétique de souches de pneumocoque isolées de cas de méningite au Niger, 2003–2006. Marseille, France: Actualités du Pharo; 2007. Poster
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.