Effectiveness of the standard and an alternative set of Streptococcus pneumoniae multi locus sequence typing primers
BMC Microbiology volume 14, Article number: 143 (2014)
Multi-locus sequence typing (MLST) is a portable, broadly applicable method for classifying bacterial isolates at an intra-species level. This methodology provides clinical and scientific investigators with a standardized means of monitoring evolution within bacterial populations. MLST uses the DNA sequences from a set of genes such that each unique combination of sequences defines an isolate’s sequence type. In order to reliably determine the sequence of a typing gene, matching sequence reads for both strands of the gene must be obtained. This study assesses the ability of both the standard, and an alternative set of, Streptococcus pneumoniae MLST primers to completely sequence, in both directions, the required typing alleles.
The results demonstrated that for five (aroE, recP, spi, xpt, ddl) of the seven S. pneumoniae typing alleles, the standard primers were unable to obtain the complete forward and reverse sequences. This is due to the standard primers annealing too closely to the target regions, and current sequencing technology failing to sequence the bases that are too close to the primer. The alternative primer set described here, which includes a combination of primers proposed by the CDC and several designed as part of this study, addresses this limitation by annealing to highly conserved segments further from the target region. This primer set was subsequently employed to sequence type 105 S. pneumoniae isolates collected by the Canadian Immunization Monitoring Program ACTive (IMPACT) over a period of 18 years.
The inability of several of the standard S. pneumoniae MLST primers to fully sequence the required region was consistently observed and is the result of a shift in sequencing technology occurring after the original primers were designed. The results presented here introduce clear documentation describing this phenomenon into the literature, and provide additional guidance, through the introduction of a widely validated set of alternative primers, to research groups seeking to undertake S. pneumoniae MLST based studies.
Accurate, reproducible isolate characterization data help epidemiologists, scientists, physicians, public health officials, and many other professionals better monitor and manage endemic and epidemic infectious disease trends. Historically, bacterial typing schemes have been based on immunological and electrophoretic approaches. Immunologically based schemes classify strains by the specificity of antibodies raised against antigenic bacterial components. This approach has been widely applied in the form of capsular serotyping, whereby the antigenic specificity of different intra-species capsule types is used to classify the bacteria [3, 4].
However, many globally significant bacterial pathogens, such as Streptococcus pneumoniae and Neisseria meningitidis, are readily able to incorporate environmental genetic material into their genomes, allowing for rapid genetic variation and interchange of immunogenic components, including those on which serotyping is based. This phenomenon has been observed recently with S. pneumoniae capsular typing following the introduction of the seven-valent pneumococcal conjugate vaccine (PCV7). As a result of the component specificity of immunologically based typing methods, it has become well recognized that strains possessing the same serotype are not necessarily clonally related, nor expected to possess the same repertoire of virulence factors. Immunogenic approaches are now used in more focused ways to explore specific factors, particularly those relevant to guiding vaccine evaluation and development, as was demonstrated with a recent serogroup B meningococcal vaccine investigation.
Multi-locus enzyme electrophoresis (MLEE) is another typing method, and is based on the relative electrophoretic mobility of a set of ubiquitously present bacterial enzymes. This approach is not dependent on a single immunogenic component and as such is less influenced by horizontal exchange or positive selection events. However, it is complicated to perform, and it is difficult to compare the resulting electrophoretic types between different groups. Similar to MLEE, pulsed-field gel electrophoresis (PFGE) classifies individual strains based on the gel electrophoretic mobility of bacterial components: in this case, the relative mobility of DNA fragments obtained through restriction enzyme digestion. PFGE has been widely used for typing and has been considered a gold standard for some epidemiological studies; however, there have been challenges in standardizing protocols between different research groups.
Multi-locus sequence typing (MLST) is a classification scheme whereby isolates are typed based on the nucleotide sequences from a set of housekeeping genes that are necessary for the maintenance of basic cellular functions. The nucleotide sequences for each of these housekeeping genes are used to define a unique bacterial sequence type. Each gene is sequenced from individual strains and then compared against existing sequences in a publicly accessible, globally maintained database. Submitted sequences that match ones already in the database are assigned the gene type number of the matching database sequence; if a novel sequence is submitted, the curator of the database assesses the sequencing results and assigns an appropriate gene number. While this approach does address several of the limitations encountered by other typing methods, the cost of sequencing can be a barrier to large scale typing projects. In particular, because of the potential for error in sequencing reads, the standard for determining a gene type requires matching forward and reverse sequences. The S. pneumoniae typing system is based on the partial sequence of seven genes coding for the housekeeping proteins: shikimate dehydrogenase (aroE), glucose-6-phosphate dehydrogenase (gdh), glucose kinase (gki), transketolase (recP), signal peptidase I (spi), xanthine phosphoribosyltransferase (xpt), and D-alanine-D-alanine ligase (ddl).
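The way a sequence type follows from the seven gene type numbers can be illustrated with a minimal Python sketch. The function name, the profile table, and all allele and sequence type numbers below are hypothetical placeholders chosen for illustration; real assignments come only from the curated MLST database.

```python
from typing import Dict, Optional, Tuple

# Locus order of the S. pneumoniae scheme as listed in the text.
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")

Profile = Tuple[int, ...]


def assign_sequence_type(alleles: Dict[str, int],
                         known_profiles: Dict[Profile, int]) -> Optional[int]:
    """Return the sequence type for an allelic profile, or None if the
    combination of gene type numbers is not in the lookup table."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return known_profiles.get(profile)


# Hypothetical lookup table: the numbers are invented for this example.
known = {
    (1, 2, 3, 4, 5, 6, 7): 101,
    (1, 2, 3, 4, 5, 6, 8): 102,
}

isolate = {"aroE": 1, "gdh": 2, "gki": 3, "recP": 4,
           "spi": 5, "xpt": 6, "ddl": 8}
print(assign_sequence_type(isolate, known))
```

A single differing gene type number (here at ddl) is enough to place two isolates in different sequence types, which is why an incomplete read at any one locus prevents a sequence type from being assigned.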
Preliminary results, and information provided by the curator of the S. pneumoniae MLST database, indicated that several of the provided MLST sequencing primers were unable to obtain the full sequence required in each direction. As a result, in cases where a novel gene type is identified based on sequences from the standard primers (Table 1), the investigators are required to design new primers and re-sequence the particular gene (Cynthia Bishop, personal communication, May 2012). In these circumstances, investigators must expend additional time and resources developing new primers, purchasing additional sequencing, and validating results. While several investigators in the field are aware of this issue, and all sequences in the MLST database have been verified through subsequent primer redesign and re-sequencing, this limitation has not been specifically addressed in the literature [12, 13] (Cynthia Bishop, personal communication, May 2012).
The lack of description of this limitation in the literature is evidenced by the prevalence of recent studies that reference only the original primers and do not discuss the sequencing challenge [6, 14–18]. The purpose of this study is to systematically identify the primers unable to obtain the correct sequence, describe an alternative set of primers, and introduce documentation to the literature offering additional guidance to groups undertaking S. pneumoniae MLST studies. In this investigation, the standard MLST sequencing primers and an alternative set of primers were evaluated for their ability to completely sequence, in both directions, the appropriate typing regions of each gene.
This analysis consistently observed that the forward and reverse sequences obtained with the standard MLST primers completely covered the typing region for only two of the seven genes: gki and gdh. The reverse primers for the aroE and recP genes failed to sequence the last 21 and 10 bases of their respective typing regions (Figure 1A and B). The forward spi and xpt MLST primers did not sequence the first 6 and 17 bases of their respective typing regions (Figure 1C and D). In the case of ddl, the forward primer was unable to sequence the first 8 bases (Figure 1E) and the reverse primer did not sequence the last 26 bases (Figure 1F). These observations were consistent across all of the different isolates, both sequencing services, and each replicate. In each case where the full sequence was not obtained, alignment of the primers with publicly available genomic sequences for S. pneumoniae showed that those primers annealed less than 30 base pairs from the required typing region (Figure 1).
A partial set of modified MLST primers for S. pneumoniae was designed and introduced by the US Centers for Disease Control (CDC). The CDC primers for aroE, the reverse primer for recP, and the forward primer for ddl each annealed within the coding sequence of the gene containing the typing region, and were able to completely cover the required sequence. However, the CDC forward primer for recP, and both sets of spi and xpt primers, annealed to regions of genomic DNA outside of their target gene. While these primers successfully sequenced the correct region, the highly plastic nature of the pneumococcal genome suggests that these genes may not always be in the same region, in which case primers that bind outside of the gene may not always be specific to the target region of the genome. To address sequencing errors potentially resulting from this phenomenon, the recP CDC forward primer was replaced with the standard MLST recP forward primer, as this primer annealed within the recP gene and can correctly sequence the typing region. Novel primers that annealed within the gene were also designed to replace the spi and xpt CDC primers. Lastly, the CDC reverse primer for ddl bound only 19 base pairs away from the typing region, and a modified primer binding 57 base pairs from the typing region was designed as a replacement. Analysis of the alternative primer sets (Table 2) using the same five test isolates revealed that each primer set that annealed sufficiently far upstream or downstream of the typing region was able to correctly amplify and sequence the appropriate DNA fragment (Figure 2). The effectiveness of the alternative primer set was subsequently validated through sequence typing of 105 diverse isolates collected by the Canadian Immunization Monitoring Program ACTive (IMPACT) surveillance network (Additional file 1: Table S1).
In all cases investigated in this study, the modified primers were able to obtain the complete typing sequence, in both directions, for the gene/primer combinations not obtained by the standard primers.
These results demonstrate that the current inability of the standard sequencing primers to effectively sequence the S. pneumoniae MLST typing regions is a result of how close the primers anneal to the typing region of the gene. When sequencing by Sanger chain termination with capillary separation is employed, the base pairs immediately after the sequencing primer will not be clearly sequenced. This is a characteristic of the size-separation technology used by chain termination sequencing. When the terminated segments are separated based on size, there is poor resolution between the smaller fragments at the start of the sequence. This results in unclear and ambiguous sequencing results for approximately the first 20 to 50 base pairs of the sequence.
Next generation sequencing approaches such as 454, Illumina, and ABI function by determining the sequence for overlapping segments of 35 to 200 base pairs, depending on the specific method, and then assembling these segments into the complete sequence. These next generation techniques have recently been applied to MLST with some success; however, the assembly process can be hindered by highly repetitive sequence in the overlapping sections of the sequence reads. This can potentially result in inaccurate assemblies within sequence typing regions. Additionally, the infrastructure and expertise required to employ next generation sequencing technologies still limit their accessibility to many research groups [21, 22]. Given these limitations, and noting the number of recent studies still making unaltered reference to the standard primers, it remains valuable for researchers in this field to be more aware of the limitations presented by the standard MLST sequencing primers.
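The relationship between primer position and lost sequence can be expressed as simple arithmetic. The following Python sketch is an illustration only: the function name is hypothetical, and the unreadable window is treated as a fixed number of bases, whereas in practice it varies between runs within roughly the 20 to 50 base range described above. The two gap values are the distances reported for the CDC and replacement ddl reverse primers.

```python
def unread_typing_bases(primer_gap_bp: int, dead_zone_bp: int) -> int:
    """Number of typing-region bases expected to be unreadable.

    primer_gap_bp: bases between the primer and the nearest edge of the
                   typing region.
    dead_zone_bp:  bases immediately after the primer that are assumed
                   to be poorly resolved (an assumption, not a constant).
    """
    if primer_gap_bp < 0 or dead_zone_bp < 0:
        raise ValueError("distances must be non-negative")
    return max(0, dead_zone_bp - primer_gap_bp)


# Compare the 19 bp and 57 bp ddl reverse primer distances at both ends
# of the assumed unreadable range.
for gap in (19, 57):
    for dead_zone in (20, 50):
        lost = unread_typing_bases(gap, dead_zone)
        print(f"gap={gap} bp, dead zone={dead_zone} bp -> {lost} bases unread")
```

Under these assumptions, a primer 57 base pairs from the typing region loses no typing bases anywhere in the 20 to 50 base range, whereas a primer 19 base pairs away loses between 1 and 31.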
The alternative primer set described here addresses the limitation of the standard S. pneumoniae MLST primers by annealing sufficiently far from the target region such that the sequence for the correct segment is consistently obtained. Clear documentation defining the limitations of the standard S. pneumoniae MLST primers and describing an effective set of alternative primers is of particular importance as automated Sanger capillary sequencing remains a highly optimized, and still widely employed method for S. pneumoniae MLST based studies.
Streptococcus pneumoniae strains and genomic preparation
Evaluation of the standard and alternative MLST primers was carried out on five randomly selected isolates from strains collected and provided by the Canadian Immunization Monitoring Program ACTive (IMPACT) [23–26]. Isolates were obtained from patients aged 0 to 16 years presenting with invasive pneumococcal infection at tertiary care centres across Canada; diagnosis was confirmed by positive S. pneumoniae culture from normally sterile body fluid (blood/cerebrospinal fluid). The IMPACT surveillance study has research ethics board approval at each participating centre to obtain demographic, clinical and microbiologic information on all cases without the requirement for written informed consent. S. pneumoniae strains were verified and serotyped as part of IMPACT’s routine surveillance protocol. The investigation described here was undertaken using IMPACT's 19A invasive strains, collected with ethical approval between 1991 and 2009. Strains were grown overnight in 5% CO2 on Columbia Blood Agar plates (prepared according to the manufacturer’s instructions; Becton Dickinson and Company, Difco™, Sparks, Maryland, USA), with optochin disk susceptibility (disks used according to the manufacturer’s instructions; Sigma-Aldrich, Oakville, Ontario, Canada) and the presence of alpha hemolysis used for species verification. Genomic DNA was then isolated with the QIAamp DNA Mini Kit (used according to the manufacturer’s instructions; Qiagen, Toronto, Ontario, Canada).
Each of the seven typing alleles was evaluated with both the standard (Table 1) and alternative (Table 2) MLST primers. PCR solutions were prepared for each primer set, consisting of: 11 μl sterile distilled water, 2.5 μl of 10× reaction buffer (5 ml 1 M KCl, 5 ml 1 M (NH4)2SO4, 5 ml 2 M Tris–HCl pH 8.8, 5 ml 200 mM MgSO4, 5 ml 10% Triton X-100, water to 50 ml), 2.5 μl of 2 mM dNTPs, 2.5 μl of each primer at 5 μM, 1 unit Pfu enzyme (Thermo Scientific, Ottawa, Ontario, Canada) and 2 μl of genomic DNA template at 50 to 300 ng/μl. All PCRs were performed in a BioRad (Mississauga, Ontario, Canada) Thermocycler with annealing temperatures specific to each primer set (Tables 1 and 2). Amplification was verified by visualizing gene products with gel electrophoresis on a 1% agarose gel stained with ethidium bromide, run at 110 V for 25 minutes. Verified PCR products were purified with the E.Z.N.A Cycle Pure Kit (used according to the manufacturer’s instructions; OMEGA Biotek, Norcross, Georgia, USA). Purified products were subsequently verified via spectrophotometry (NanoDrop 1000 Spectrophotometer, used according to the manufacturer’s instructions; Thermo Scientific, Ottawa, Ontario, Canada). Purified samples with a concentration greater than 3 ng/μl and 260 nm/280 nm absorbance ratios between 1.0 and 2.0 were accepted for sequencing. Sequencing was carried out at both Macrogen Corporation, Rockville, USA, and the University of Calgary DNA Core Services facility, Calgary, Canada.
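The reaction set-up and acceptance criteria above lend themselves to a short calculation. The Python sketch below scales the stated per-reaction volumes to a batch for one primer set and applies the stated purity thresholds. The function names, the 10% pipetting overage, and the inclusive treatment of the 1.0 and 2.0 ratio limits are assumptions made for illustration and are not part of the published protocol.

```python
# Per-reaction volumes in microlitres, copied from the protocol text.
# Template DNA is added per tube and the enzyme is specified in units
# rather than volume, so both are left out of the shared mix.
PER_REACTION_UL = {
    "water": 11.0,
    "10x buffer": 2.5,
    "2 mM dNTPs": 2.5,
    "forward primer (5 uM)": 2.5,
    "reverse primer (5 uM)": 2.5,
}


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale per-reaction volumes to n reactions plus a pipetting
    overage (the 10% default is an assumption)."""
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in PER_REACTION_UL.items()}


def passes_qc(conc_ng_per_ul: float, a260_a280: float) -> bool:
    """Apply the acceptance thresholds stated in the text: concentration
    above 3 ng/ul and a 260/280 ratio between 1.0 and 2.0 (limits
    assumed inclusive)."""
    return conc_ng_per_ul > 3.0 and 1.0 <= a260_a280 <= 2.0


for reagent, volume in master_mix(10).items():
    print(f"{reagent}: {volume} ul")
print(passes_qc(12.4, 1.85))
```

Because each primer pair has its own annealing temperature, a separate mix of this kind would be prepared per primer set rather than one mix for all seven genes.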
Assessing sequence coverage
The sequencing results were manually inspected for quality with the open source program 4Peaks, and sequence coverage was inspected using the Multiple Sequence Alignment by Fast Fourier Transform (MAFFT) program, available through the European Bioinformatics Server. MAFFT was used to align the forward and reverse sequence reads from each test primer set and isolate, along with 5 known typing regions from the MLST database. The annealing site of each primer was identified by running a BLAST search of the primer’s sequence against publicly accessible S. pneumoniae genomic sequences available through the National Center for Biotechnology Information [28, 29]. These results identified where each primer annealed relative to the typing region, and whether the sequence obtained from the primer consistently covered the required region. This full process was replicated twice for each primer set and each test isolate to confirm the reproducibility of the observations.
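The coverage criterion, that both strands must fully span the typing region, can be illustrated with a simplified check. The Python sketch below is not the MAFFT-based procedure used in this study: it substitutes exact substring matching for alignment, so it would only accept reads identical to the reference typing region, and all sequences and function names are invented for illustration.

```python
# Translation table for complementing DNA bases (N is kept as N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def covers_both_strands(forward_read: str, reverse_read: str,
                        typing_region: str) -> bool:
    """True when the forward read and the reverse-complemented reverse
    read each contain the whole typing region (exact match only)."""
    region = typing_region.upper()
    return (region in forward_read.upper()
            and region in reverse_complement(reverse_read))


# Toy sequences, invented for this example.
region = "ATGGCTTACGGA"
forward = "CC" + region + "TT"
full_reverse = reverse_complement("GG" + region + "AA")
short_reverse = reverse_complement(region[:-4])  # last 4 bases missing

print(covers_both_strands(forward, full_reverse, region))
print(covers_both_strands(forward, short_reverse, region))
```

The second call mimics the situation reported for several standard reverse primers, where the read stops short of the end of the typing region and the gene type therefore cannot be confirmed in both directions.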
Maiden MC, Bygraves JA, Feil E, Morelli G, Russell JE, Urwin R, Zhang Q, Zhou J, Zurth K, Caugant DA, Feavers IM, Achtman M, Spratt BG: Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci U S A. 1998, 95 (6): 3140-3145. 10.1073/pnas.95.6.3140.
Urwin R, Maiden MCJ: Multi-locus sequence typing: a tool for global epidemiology. Trends Microbiol. 2003, 11 (10): 479-487. 10.1016/j.tim.2003.08.006.
Bentley SD, Aanensen DM, Mavroidi A, Saunders D, Rabbinowitsch E, Collins M, Donohoe K, Harris D, Murphy L, Quail MA, Samuel G, Skovsted IC, Kaltoft MS, Barrell B, Reeves PR, Parkhill J, Spratt BG: Genetic analysis of the capsular biosynthetic locus from all 90 pneumococcal serotypes. PLoS Genet. 2006, 2 (3): e31. 10.1371/journal.pgen.0020031.
Satzke C, Turner P, Virolainen-Julkunen A, Adrian PV, Antonio M, Hare KM, Henao-Restrepo AM, Leach AJ, Klugman KP, Porter BD, Sá-Leão R, Scott JA, Nohynek H, O’Brien KL: Standard method for detecting upper respiratory carriage of Streptococcus pneumoniae: Updated recommendations from the World Health Organization Pneumococcal Carriage Working Group. Vaccine. 2013, 32 (1): 165-179. 10.1016/j.vaccine.2013.08.062.
Gupta S, Maiden MCJ: Exploring the evolution of diversity in pathogen populations. Trends Microbiol. 2001, 9 (4): 181-185. 10.1016/S0966-842X(01)01986-2.
Pillai D, Shahinas D, Buzina A, Pollock R, Lau R, Khairnar K, Wong A, Farrell D, Green K, McGeer A, Low D: Genome-wide dissection of globally emergent multi-drug resistant serotype 19A Streptococcus pneumoniae. BMC Genomics. 2009, 10 (1): 642. 10.1186/1471-2164-10-642.
Frosi G, Biolchi A, Sapio ML, Rigat F, Gilchrist S, Lucidarme J, Findlow J, Borrow R, Pizza M, Giuliani MM, Medini D: Bactericidal antibody against a representative epidemiological meningococcal serogroup B panel confirms that MATS underestimates 4CMenB vaccine strain coverage. Vaccine. 2013, 31 (43): 4968-4974. 10.1016/j.vaccine.2013.08.006.
Selander RK, Caugant DA, Ochman H, Musser JM, Gilmour MN, Whittam TS: Methods of multilocus enzyme electrophoresis for bacterial population genetics and systematics. Appl Environ Microbiol. 1986, 51 (5): 873-884.
Hunter SB, Vauterin P, Lambert-Fair MA, Van Duyne MS, Kubota K, Graves L, Wrigley D, Barrett T, Ribot E: Establishment of a universal size standard strain for use with the PulseNet standardized pulsed-field gel electrophoresis protocols: converting the national databases to the new size standard. J Clin Microbiol. 2005, 43 (3): 1045-1050. 10.1128/JCM.43.3.1045-1050.2005.
Han H, Zhou H, Li H, Gao Y, Lu Z, Hu K, Xu B: Optimization of Pulse-Field Gel Electrophoresis for Subtyping of Klebsiella pneumoniae. Int J Environ Res Pub Health. 2013, 10 (7): 2720-2731. 10.3390/ijerph10072720.
Enright MC, Spratt BG: A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology. 1998, 144 (11): 3049-3060. 10.1099/00221287-144-11-3049.
Alternative MLST Primers for S. pyogenes and S. pneumoniae. [http://www.cdc.gov/ncidod/biotech/strep/alt-MLST-primers.htm] 
Enright MC, Knox K, Griffiths D, Crook DWM, Spratt BG: Molecular typing of bacteria directly from cerebrospinal fluid. EJCMID. 2000, 19 (8): 627-630. 10.1007/s100960000321.
Marimon JM, Ercibengoa M, García-Arenzana JM, Alonso M, Pérez-Trallero E: Streptococcus pneumoniae ocular infections, prominent role of unencapsulated isolates in conjunctivitis. Clin Microbiol Infect. 2013, 19 (7): E298-E305. 10.1111/1469-0691.12196.
Hanage WP, Bishop CJ, Lee GM, Lipsitch M, Stevenson A, Rifas-Shiman SL, Pelton SI, Huang SS, Finkelstein JA: Clonal replacement among 19A Streptococcus pneumoniae in Massachusetts, prior to 13 valent conjugate vaccination. Vaccine. 2011, 29 (48): 8877-8881. 10.1016/j.vaccine.2011.09.075.
Xu Q, Kaur R, Casey JR, Adlowitz DG, Pichichero ME, Zeng M: Identification of Streptococcus pneumoniae and Haemophilus influenzae in culture-negative middle ear fluids from children with acute otitis media by combination of multiplex PCR and multi-locus sequencing typing. Int J Pediatr Otorhinolaryngol. 2011, 75 (2): 239-244. 10.1016/j.ijporl.2010.11.008.
Elberse KEM, Nunes S, Sá-Leão R, van der Heide HGJ, Schouls LM: Multiple-locus variable number tandem repeat analysis for Streptococcus pneumoniae: comparison with PFGE and MLST. PLoS One. 2011, 6 (5): e19668. 10.1371/journal.pone.0019668.
Scott JR, Hanage WP, Lipsitch M, Millar EV, Moulton LH, Hinds J, Reid R, Santosham M, O’Brien KL: Pneumococcal sequence type replacement among American Indian children: a comparison of pre- and routine-PCV7 eras. Vaccine. 2012, 30 (13): 2376-2381. 10.1016/j.vaccine.2011.11.004.
Croucher NJ, Walker D, Romero P, Lennard N, Paterson GK, Bason NC, Mitchell AM, Quail MA, Andrew PW, Parkhill J, Bentley SD, Mitchell TJ: Role of conjugative elements in the evolution of the multidrug-resistant pandemic clone Streptococcus pneumoniae Spain 23 F ST81. J Bacteriol. 2009, 191 (5): 1480-1489. 10.1128/JB.01343-08.
Ewing B, Hillier L, Wendl MC, Green P: Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 1998, 8 (3): 175-185. 10.1101/gr.8.3.175.
Morozova O, Marra MA: Applications of next-generation sequencing technologies in functional genomics. Genomics. 2008, 92 (5): 255-264. 10.1016/j.ygeno.2008.07.001.
Boers SA, van der Reijden WA, Jansen R: High-throughput multilocus sequence typing: bringing molecular typing to the next level. PLoS One. 2012, 7 (7): e39630. 10.1371/journal.pone.0039630.
Scheifele DW, Halperin SA: Immunization monitoring program, active: a model of active surveillance of vaccine safety. Semin Pediatr Infect Dis. 2003, 14 (3): 213-219. 10.1016/S1045-1870(03)00036-0.
Scheifele DW, Halperin SA, Pelletier L, Talbot J: Invasive pneumococcal infections in Canadian children, 1991–1998: implications for new vaccination strategies. Clin Infect Dis. 2000, 31 (1): 58-64. 10.1086/313923.
Bettinger JA, Scheifele DW, Kellner JD, Halperin SA, Vaudry W, Law B, Tyrrell G, for Members of the Canadian Immunization Monitoring Program, Active (IMPACT): The effect of routine vaccination on invasive pneumococcal infections in Canadian children, Immunization Monitoring Program, Active 2000 – 20. Vaccine. 2010, 28 (9): 2130-2136. 10.1016/j.vaccine.2009.12.026.
Bettinger JA, Scheifele DW, Halperin DW, Kellner JD, Tyrrell G, Members of the Canadian Paediatric Society’s Immunization Monitoring Program, Active (IMPACT): Invasive pneumococcal infections in Canadian children, 1998 – 2003. Can J Pub Health. 2007, 98 (2): 111-115.
Katoh K, Misawa K, Kuma K, Miyata T: MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002, 30 (14): 3059-3066. 10.1093/nar/gkf436.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215 (3): 403-410. 10.1016/S0022-2836(05)80360-2.
Sayers EW, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Feolo M, Geer LY, Helmberg W, Kapustin Y, Landsman D, Lipman DJ, Madden TL, Maglott DR, Miller V, Mizrachi I, Ostell J, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Shumway M, Sirotkin K, Souvorov A, Starchenko G, Tatusova TA: Database resources of the national center for biotechnology information. Nucleic Acids Res. 2009, 37 (suppl 1): D5-D15.
The authors would like to acknowledge the Canadian Immunization Monitoring Program Active Investigators for collecting the S. pneumoniae isolates that made this project possible. The Canadian Immunization Monitoring Program Active is a national surveillance initiative managed by the Canadian Pediatric Society (CPS) and conducted by the IMPACT investigators on behalf of the Public Health Agency of Canada’s (PHAC) Centre for Immunization and Respiratory Infectious Diseases. The authors would also like to acknowledge Cynthia Bishop for providing her guidance during this investigation and her permission to reference the personal communications between herself and the authors’ research team.
Funding for collection of the pneumococcal isolates used in this study was provided by an unrestricted grant to CPS from Wyeth Pharmaceuticals (1991–2005), and the PHAC (2005–2009). Funding to support the laboratory analysis was provided by Pfizer Canada through an investigator-initiated research grant in aid to Dr. James D. Kellner.
The authors declare no competing financial or personal interests with respect to the presentation of these results.
PA contributed to the study’s conception, conducted the experiments, drafted the manuscript, and approved the final submission. Dr. OV is the IMPACT site co-investigator in Calgary, Alberta, and was involved with the conception and design of the study, as well as the acquisition of the data. He also revised and approved the submitted manuscript. Dr. JK was involved in the conception and design of the study, and assisted in data acquisition. Dr. JK also revised and approved the submitted manuscript. Dr. AS participated in the development of the project, provided technical support, and assisted in the acquisition of data and analysis of results. He revised and approved the submitted manuscript. Dr. JB is the IMPACT epidemiologist; she was involved in the conception and design of the study, provided the data and supervised the data analysis. She revised and approved the submitted manuscript. Dr. JA contributed substantially to the conception, implementation, and interpretation of the results presented in this study. Dr. JA also revised and approved the submitted manuscript. All authors read and approved the final manuscript.
Keywords
- Multi-locus sequence typing
- Invasive pneumococcal disease
- Molecular epidemiology
- Streptococcus pneumoniae
- Bacterial typing