Serotyping of sub-Saharan Africa Salmonella Strains Isolated from Different Sources Using Multiplex PCR and Capillary Electrophoresis Analysis and Whole Genome Sequencing


 The authors have withdrawn this preprint due to author disagreement.

financial resources and cannot implement well-established yet complex nucleic acid analysis systems or laboratory-developed tests through a network of centralized laboratories. These molecular methods require specific and complex equipment, sensitive reagents, dedicated infrastructure, and deep technical knowledge, which are not available in many low- and middle-income countries (LMICs). Therefore, researchers from LMICs often need to collaborate with high-income countries and test available modern techniques for Salmonella serotyping to determine whether a method can be adapted for their country. In this study, high-throughput molecular determination of Salmonella enterica serovars by the Salmonella Multiplex Assay for Rapid Typing (SMART) PCR with capillary electrophoresis was compared with whole genome sequencing (WGS) to determine their accuracy in identifying serotypes of NTS isolated from Burkina Faso. The SMART method was developed by Leader

Bacterial strains
The 225 isolates used in this study were obtained from the Laboratoire de Biologie Moléculaire, d'épidémiologie et de surveillance des bactéries et virus transmissible par les aliments (LaBESTA)/Université Joseph KI-ZERBO, Burkina Faso. The strains were isolated from different sources, and the serotype of each was confirmed following the methodology described in International Organization for Standardization standard 6579-1:2017 (ISO 6579-1, 2017). The isolates were collected from clinical, veterinary, and food samples: 28 from diarrheic patients, 105 from poultry feces, 22 from guinea fowl feces, 19 from poultry carcasses, 17 from eggs, 30 from fish, and 4 from sandwiches.
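As a quick consistency check on the breakdown above, the per-source counts can be tallied and converted to shares of the collection. This is only an illustrative sketch; the dictionary keys and the helper name `percent_share` are labels chosen here, not terms from the study.

```python
# Isolate counts by source, as reported in the text above.
isolates_by_source = {
    "diarrheic patients": 28,
    "poultry feces": 105,
    "guinea fowl feces": 22,
    "poultry carcasses": 19,
    "eggs": 17,
    "fish": 30,
    "sandwiches": 4,
}


def percent_share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Sum of all per-source counts; this should match the stated collection size.
total = sum(isolates_by_source.values())

# Share of the collection contributed by each source.
shares = {
    source: percent_share(count, total)
    for source, count in isolates_by_source.items()
}

if __name__ == "__main__":
    print(f"total isolates: {total}")
    for source, share in shares.items():
        print(f"{source}: {share}%")
```

The counts sum to the 225 isolates stated in the text, with poultry feces contributing the largest share.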

High-throughput molecular determination of Salmonella enterica serovars
We used the SMART method developed by Leader et al. (2009), with slight modification. Salmonella strains were streaked onto blood agar and incubated for 18-20 h at 36 °C. One colony from each plate was then cultured in 5 mL of Luria Bertani (LB) broth (Difco™, Becton Dickinson and Company, Sparks, MD) and incubated for 18 h at 37 °C with shaking. Genomic DNA was isolated from the overnight culture using the GenElute bacterial genomic DNA kit (Sigma-Aldrich, St. Louis, MO, USA) following the kit instructions. Once extractions were completed, DNA quality was assessed on a NanoDrop 2000 by measuring the 260/280 nm absorbance ratio. All DNA samples were then stored at -20 °C until ready for PCR and library preparation. PCR amplification. GeneMapper software v3.5 (Applied Biosystems, Foster City, CA, USA) was used to analyze the sizes of the resulting PCR products according to the protocol developed by Leader et al. (2009). Scoring was based on the presence of a PCR product that corresponded to the predicted amplicon size, as detected in control reactions with DNA from S. Typhimurium, S. Typhi, and S. Enteritidis. Each PCR product detected was given a number (1 through 16) according to the size of the amplicon (Leader et al., 2009). The amplicons detected for each isolate were combined to create a SMART code that corresponds to serotypes previously screened by this method.
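The scoring logic described above (match each detected fragment to a predicted amplicon size, number it 1 through 16, and combine the numbers into a code) might be sketched as follows. Everything specific here is an assumption made for illustration: the expected sizes in `EXPECTED_SIZES_BP`, the ±2 bp tolerance, the hyphen-joined code format, and the single database entry are all hypothetical and are not taken from Leader et al. (2009).

```python
from typing import Dict, Iterable, List, Optional

# HYPOTHETICAL expected amplicon sizes (bp) for markers 1..16, ordered by size.
# The real sizes are defined by the published assay and are not reproduced here.
EXPECTED_SIZES_BP: Dict[int, float] = {
    marker: 100.0 + 25.0 * (marker - 1) for marker in range(1, 17)
}

# HYPOTHETICAL code-to-serotype table standing in for the SMART database.
SMART_DATABASE: Dict[str, str] = {"1-4-9": "example serotype A"}


def call_markers(fragment_sizes_bp: Iterable[float],
                 tolerance_bp: float = 2.0) -> List[int]:
    """Return the sorted marker numbers whose expected size matches a fragment."""
    detected = set()
    for size in fragment_sizes_bp:
        for marker, expected in EXPECTED_SIZES_BP.items():
            if abs(size - expected) <= tolerance_bp:
                detected.add(marker)
    return sorted(detected)


def smart_code(markers: Iterable[int]) -> str:
    """Combine marker numbers into one code string (assumed format: '1-4-9')."""
    return "-".join(str(marker) for marker in sorted(markers))


def lookup_serotype(code: str) -> Optional[str]:
    """Return the serotype for a known code, or None for a new code."""
    return SMART_DATABASE.get(code)


if __name__ == "__main__":
    # Fragment sizes as they might be reported by capillary electrophoresis.
    fragments = [100.4, 174.1, 301.0, 512.0]
    code = smart_code(call_markers(fragments))
    print(code, lookup_serotype(code))
```

A code absent from the table returns `None`, which corresponds to the "new code without serotype prediction" outcome reported for many isolates in this study.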
Whole genome sequencing of Salmonella strains
Extracted DNA was quantified using the Qubit double-stranded DNA high-sensitivity assay kit according to the manufacturer's instructions (Life Technologies Corp., Carlsbad, CA, USA). The Illumina libraries were prepared using the Nextera XT DNA library preparation kit and Nextera XT index primers (Illumina, San Diego, CA, USA). The library fragment size distribution was checked using the Bioanalyzer 2100 with an Agilent HS DNA kit (Agilent Technologies, Santa Clara, CA, USA), and libraries were quantified using a Qubit DNA HS assay kit in a Qubit fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). The libraries were then sequenced on the MiSeq platform (Illumina) using MiSeq version 2 reagent kits with 500 or 300 cycles, giving paired-end read lengths of 2 × 250 bp and 2 × 150 bp, respectively. Read quality metrics were assessed with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The sequences were then assembled using the A5-miseq assembler (Coil et al., 2015), and the genome sequence was annotated via the NCBI
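The read quality check and assembly steps named above could be scripted per isolate roughly as follows. This is a sketch under assumptions, not the authors' pipeline: the file names are hypothetical, the command lines follow the commonly documented usage of FastQC (`fastqc --outdir DIR FILES...`) and A5-miseq (`a5_pipeline.pl READ1 READ2 BASENAME`) and should be checked against the installed versions, and `run_step` defaults to a dry run so nothing is executed unless requested.

```python
import shlex
import subprocess
from typing import List


def fastqc_command(read1: str, read2: str, out_dir: str) -> List[str]:
    """Build a FastQC call for one paired-end sample (out_dir must exist)."""
    return ["fastqc", "--outdir", out_dir, read1, read2]


def a5_command(read1: str, read2: str, out_base: str) -> List[str]:
    """Build an A5-miseq call; out_base is the prefix for output files."""
    return ["a5_pipeline.pl", read1, read2, out_base]


def run_step(command: List[str], dry_run: bool = True) -> str:
    """Return the shell-quoted command line; execute it only if dry_run is False."""
    line = shlex.join(command)
    if not dry_run:
        subprocess.run(command, check=True)
    return line


if __name__ == "__main__":
    # Hypothetical file names for a single isolate.
    r1, r2 = "isolate01_R1.fastq.gz", "isolate01_R2.fastq.gz"
    print(run_step(fastqc_command(r1, r2, "qc_reports")))
    print(run_step(a5_command(r1, r2, "isolate01")))
```

Building the commands as lists rather than shell strings avoids quoting problems with unusual file names and makes each step easy to log before it runs.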

Results
The SMART PCR generated some new codes different from those found in the United States
Two hundred and twenty-five samples were analyzed to determine the serotypes of Salmonella strains isolated from Burkina Faso using the SMART PCR database developed for the 50 most common Salmonella serovars found in the U.S. (Leader et al., 2009). Among the 225 Salmonella isolates, 48 (21.33%) were assigned a serotype based on comparison with the panel of SMART codes. One hundred and seventy-six isolates were assigned new codes not included in the SMART PCR database. SMART PCR assigned some isolates new codes without serotype predictions, whereas SeqSero assigned them a serotype. For example, SMART PCR produced a new code for one isolate, and SeqSero predicted Takoradi or Bargny because both share the same antigenic profile "8:i:1,5" (Table 1).
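The ambiguity in the example above arises because a prediction is keyed on the antigenic formula (O antigen : H phase 1 : H phase 2), and two serotypes can map to the same key. A minimal sketch of that lookup, together with the percentage arithmetic for the reported counts, is given below. The single table entry is the one reported in the text; the type and function names are illustrative, and this is not how SeqSero is implemented.

```python
from typing import Dict, List, NamedTuple


class AntigenicFormula(NamedTuple):
    """O antigen, phase 1 H antigen, and phase 2 H antigen of a formula."""
    o_antigen: str
    h1: str
    h2: str


def parse_formula(text: str) -> AntigenicFormula:
    """Split a formula such as '8:i:1,5' into its three colon-separated parts."""
    parts = [part.strip() for part in text.split(":")]
    if len(parts) != 3:
        raise ValueError(f"expected O:H1:H2, got {text!r}")
    return AntigenicFormula(*parts)


# The only entry is the case reported in the text: two serotypes, one formula.
CANDIDATES: Dict[AntigenicFormula, List[str]] = {
    AntigenicFormula("8", "i", "1,5"): ["Takoradi", "Bargny"],
}


def candidate_serotypes(formula_text: str) -> List[str]:
    """Return every serotype listed for the formula (empty list if unknown)."""
    return CANDIDATES.get(parse_formula(formula_text), [])


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to two decimal places."""
    return round(100.0 * part / whole, 2)


if __name__ == "__main__":
    print(candidate_serotypes("8:i:1,5"))
    print(percent(48, 225), percent(176, 225))
```

With the counts given in the text, 48 of 225 isolates corresponds to 21.33% and 176 of 225 to 78.22%.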

Discussion
Serotyping is an important tool for monitoring foodborne outbreaks and for understanding the diversity and distribution of serotypes within populations, flocks, and herds. However, serotyping by traditional methods remains inaccessible for many LMICs. In this study, we investigated two molecular serotyping methods for identifying the serotypes of 225 isolates from Burkina Faso. The goal was to serotype Salmonella enterica using rapid and accessible molecular methods as opposed to immunologic approaches to antigen characterization. The traditional method of serotyping using the Kauffmann-White scheme is expensive and time consuming, and requires training to read results accurately. The SMART PCR is faster and cheaper than traditional serotyping, and the reading of results can be automated. For example, it is possible to test two 96-well plates of Salmonella strains in one SMART PCR run. The reagents required are also less expensive than antisera and are available from many different vendors worldwide (Leader et al., 2009). Both factors would benefit outbreak control in the developing world. In the present study, 176 isolates were assigned new codes not previously included in the SMART PCR database. This result will benefit LMICs by extending the original SMART code database with serotypes more prevalent in other parts of the world, particularly sub-Saharan Africa, and could be used in future studies to analyze the diversity of Salmonella serotypes. Moreover, Salmonella infections can circulate globally, and a serotype from one region can emerge as a common, persistent serotype in places other than where it was first reported. For example, Wong et al. (2015) demonstrated that a multidrug-resistant (MDR) S. Typhi H58 lineage that emerged in South Asia spread to many locations around the world, including countries in Southeast Asia, western Asia, and East Africa.
However, a limitation of SMART PCR is the identification of new codes without an assigned serotype; previously, pulsed-field gel electrophoresis patterns were used to complement SMART codes. The SMART assay was initially developed to identify the 50 most common clinical serotypes found in the Northwestern USA (Leader et al., 2009). Our study identified many new SMART codes associated with uncommon serotypes, and there is an urgent need to extend the database to include more Salmonella serotypes as classified by the Kauffmann-White scheme (Kauffmann, 1971). This will greatly increase the usability of the SMART PCR around the world. The original SMART codes should be renewed every five years because Salmonella infections are in constant flux and any serotype can emerge as a top serotype at any time. Moreover, the capillary electrophoresis machine is very sensitive to power surges and needs to be protected appropriately, which is a challenge for laboratories in developing countries.
Furthermore, the widespread application of NGS tools will at some point render capillary electrophoresis redundant. We initially used SeqSero 1.0, which returned unknown serotypes for many strains. When we used SeqSero 2.0, there was a significant improvement in serotyping results (data not shown). SeqSero 2.0 is therefore a very powerful tool for determining serotypes from WGS data. However, this tool must be constantly updated to take account of new serotypes that are identified by the Kauffmann-White scheme (Kauffmann, 1971). Whole genome sequencing is a powerful tool for understanding Salmonella epidemiology and the distribution of disease. However, sequencing is very expensive and time consuming, and requires data storage capacity and staff with strong technical and bioinformatic skills. For some isolates, SeqSero 2 analysis provided two possible serotypes sharing the same antigenic profile (or formula) but differing in minor O antigenic factors; in other cases it gave no serotype. These unpredicted serovars may not be included in the SeqSero database and could be shared as part of the iterative construction of the serotype database for use in the next version, 3.0.
In this study, we observed that SeqSero and SMART PCR can be complementary for the determination of certain serotypes. For example, SMART PCR predicted some isolates as S. Duesseldorf, and these same isolates were predicted by SeqSero 2 as S. Albany or S. Duesseldorf.
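One way to picture this complementarity is to treat each method's output as a set of candidate serotypes and keep the candidates both methods agree on. The sketch below is an illustration only, with hypothetical function names; it is not a procedure described in this study. When the two sets do not overlap, or one method gives no call, it returns all candidates so the isolate can be reviewed manually.

```python
import re
from typing import Optional, Set


def parse_prediction(text: Optional[str]) -> Set[str]:
    """Turn 'S. Albany or Duesseldorf' into {'Albany', 'Duesseldorf'}."""
    if not text or not text.strip():
        return set()
    names = set()
    for name in re.split(r"\s+or\s+", text.strip()):
        name = name.strip()
        # Drop the leading genus abbreviation if present.
        if name.startswith("S. "):
            name = name[3:]
        names.add(name)
    return names


def resolve(smart: Optional[str], seqsero: Optional[str]) -> Set[str]:
    """Combine two predictions: shared candidates if any, otherwise all of them."""
    smart_set = parse_prediction(smart)
    seqsero_set = parse_prediction(seqsero)
    common = smart_set & seqsero_set
    if common:
        return common
    # No overlap, or one method gave no call: keep everything for review.
    return smart_set | seqsero_set


if __name__ == "__main__":
    print(resolve("S. Duesseldorf", "S. Albany or Duesseldorf"))
```

For the example in the text, the SMART PCR call narrows the two SeqSero 2 candidates to Duesseldorf.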

Conclusions
Salmonella remains a worldwide public health problem. The results of this study highlight the accuracy of modern molecular methods and, at the same time, the need for less expensive methods for rapid serotyping of Salmonella in developing countries. SeqSero 2.0 is very accurate for Salmonella serotyping, but WGS is very expensive, especially for LMICs; one role for NGS may be as a resource shared with other needs such as monitoring antimicrobial drug resistance, a growing risk to public health with grave consequences. Researchers should therefore continue developing less expensive and accurate methods of Salmonella serotyping that are accessible worldwide. However, both methods require a clonal culture isolate, so while these molecular tools offer great accuracy, classical microbiology is still needed to first culture and identify the pathogen from its sample matrix.