Genomic analysis of methicillin-resistant Staphylococcus aureus strain SO-1977 from Sudan

Background: Methicillin-resistant Staphylococcus aureus (MRSA) is a leading cause of morbidity and mortality. Investigating MRSA virulence and resistance mechanisms with high-throughput whole-genome sequencing (WGS) and molecular diagnostic assays remains a continuing concern in controlling this burden. The objective of the present study is to perform whole-genome sequencing of MRSA isolated from Sudan using the Illumina next-generation sequencing (NGS) platform. Results: The genome of MRSA strain SO-1977 consists of 2,827,644 bp with 32.8% G + C, 59 RNAs and 2629 predicted coding sequences (CDSs). The genome has 26 subsystems, one of the major ones being virulence, disease and defence. A total of 83 genes were annotated to the virulence, disease and defence category, some of which encode functional proteins. Based on genome analysis, it is speculated that the SO-1977 strain carries genes conferring resistance to Teicoplanin, Fluoroquinolones, Quinolone, Cephamycins, Tetracycline, Acriflavin and Carbapenems. The results indicate that strain SO-1977, isolated from Sudan, has a wide range of antibiotic resistance compared to related strains. Conclusion: This study reports, for the first time, the whole-genome sequence of an MRSA isolate from Sudan. The release of the genome sequence of strain SO-1977 makes it available in public databases for further investigation of the evolution of resistance mechanisms and the dissemination of MRSA resistance genes.


Background
Staphylococcus aureus (S. aureus) is a human pathogen known to cause both nosocomial and community-acquired infections [1]. Like several other groups of bacteria, it has developed resistance to a number of antibiotics. One of the resistant forms that has emerged is methicillin-resistant Staphylococcus aureus (MRSA), a leading cause of life-threatening infections even in countries with advanced health surveillance and maintenance systems [1,2]. In Sudan, the incidence of MRSA has increased dramatically and has been reported to be associated with wound infection, which is a substantial source of high morbidity and mortality [3]. The emergence of such resistant strains is attributed to the overuse of existing antibiotics, which ultimately creates real challenges for treatment. Therefore, there is an urgent need to uncover the genetic basis of their virulence and resistance mechanisms, both to understand them better and to identify potentially effective drug targets. Over the last decades, whole-genome sequencing (WGS) technologies have produced large volumes of data, including data on mutant genes, cancer-causing genes and genes predisposing to certain diseases. Moreover, advanced bench-top sequencers, now applied in regular clinical laboratories [4], may bring enormous diagnostic developments as well as challenges [5]. The genomes of S. aureus strains have been studied to understand the mechanisms and virulence factors responsible for staphylococcal antibiotic resistance. The first S. aureus genomes sequenced were those of MRSA strains N315 and Mu50 [6], followed by nine other strains [7,8]. These studies revealed that staphylococcal genomes are about 2.8 Mbp long with low GC content. Most regions of staphylococcal genomes are well conserved, although many large sequence blocks show high variability [8].
Although a considerable number of MRSA isolates resistant to antimicrobials including Methicillin, Ofloxacin, Penicillin, Amikacin and Vancomycin have been reported in Sudan [9], molecular investigations that help in understanding the mechanism of MRSA epidemics at the whole-genome level are still limited. The present study aims to analyse the whole-genome sequence of strain SO-1977 and to evaluate the genomic diversity and genotypic prediction of antimicrobial resistance of MRSA isolated from a patient in Sudan.

Genome project history
The genome sequence of strain SO-1977 was deposited in GenBank® (WGS database). The results are summarized in Table 1.

Genomic features of strain SO-1977
As shown in Table 2, the draft genome sequence of S. aureus strain SO-1977 consisted of 2,827,644 bp with a GC content of 32.8%. The numbers of predicted coding sequences (CDSs), tRNAs and rRNAs were 2629, 51 and 4, respectively. The final assembly contained 151 contigs with an N50 of 62,783 bp. The largest assembled contig was 146,886 bp long.
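The assembly statistics reported above (N50 and GC content) can be recomputed from any set of contigs. The following is a minimal sketch of the standard definitions, not the tooling used in this study; the function names and the toy inputs are assumptions made for illustration and are not the SO-1977 contigs.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def gc_percent(sequences):
    """Return the percentage of G and C bases over all sequences.
    Every character (including ambiguous bases) counts in the total."""
    gc = 0
    total = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        total += len(seq)
    if total == 0:
        raise ValueError("no sequence data given")
    return 100.0 * gc / total


# Toy inputs (hypothetical, for illustration only).
toy_lengths = [100, 80, 60, 40, 20]
toy_contigs = ["ATGC", "AATT"]

print("N50:", n50(toy_lengths))
print("GC%:", gc_percent(toy_contigs))
```

Applied to the contig lengths and sequences of a real assembly, the same two functions would give values comparable to those in Table 2, provided the same contigs are used as input.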

Genome annotation using RAST
Whole-genome annotation of MRSA strain SO-1977 on the RAST server (Fig. 1) revealed a total of 1970 genes belonging to 26 subsystems, such as Cofactors, Vitamins, Prosthetic Groups, Pigments; Cell Wall and Capsule; and Virulence, Disease and Defense. The graphical circular map of the SO-1977 genome is shown in Fig. 2.

Genes involved in virulence, disease and defence
The results revealed 83 genes assigned to virulence, disease and defence: 28 genes were annotated as responsible for adhesion, 32 for resistance to antibiotics and toxic compounds, 14 for bacteriocins (ribosomally synthesized antibacterial peptides) and 9 for invasion and intracellular resistance (Fig. 3). The functional proteins encoded by some of these genes include fibronectin-binding protein, chaperonin, two-component response regulator BceR, folylpolyglutamate synthase, acetyl-coenzyme A carboxyl transferase beta chain, colicin V production protein, MerR family protein, multidrug resistance protein, mercuric ion reductase and arsenate reductase. Within the cell wall and capsule category, the peptidoglycan biosynthesis subsystem contained two genes associated with Methicillin resistance and one gene associated with Penicillin resistance.

Phages, prophages, transposable elements, plasmids
The analysis revealed 35 genes encoding phages, prophages, transposable elements and plasmids, of which 33 were annotated as belonging to phages, prophages and pathogenicity islands (Table 3).

[Table fragment: pseudo genes (multiple problems), 13; genes assigned to SEED, 1698]

Resistance genes based on comparative genomic analysis
Genome annotation and comparison on the RAST server showed that strain SO-1977 possesses 29 genes that may be related to multi-drug resistance (Table 4). The comparison among the strains showed that 23 resistance genes were present in all strains, while two genes conferring resistance to Tetracycline were found only in strain SO-1977. Furthermore, SO-1977 was the only strain carrying the norA gene, which provides resistance to Quinolone, in addition to six genes of the MarR family. Four genes responsible for Methicillin resistance (LytH, MecI, Mec and MurE) were found only in strain MRSA252. The results also showed that MRSA252 and MSSA476 share a single common gene for Methicillin resistance (HmrB).
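The shared and strain-specific gene counts described above amount to set operations on per-strain gene lists. The sketch below illustrates that logic under stated assumptions: the gene memberships are a small toy subset built from names mentioned in the text plus two placeholder genes (`core_1`, `core_2`), and the helper names are hypothetical. It does not reproduce Table 4.

```python
# Toy gene sets (illustrative and incomplete; core_1/core_2 are placeholders).
strains = {
    "SO-1977": {"core_1", "core_2", "norA"},
    "MRSA252": {"core_1", "core_2", "LytH", "MecI", "MurE", "HmrB"},
    "MSSA476": {"core_1", "core_2", "HmrB"},
}


def core_genes(gene_sets):
    """Genes present in every strain."""
    return set.intersection(*gene_sets.values())


def unique_genes(gene_sets, name):
    """Genes present in the named strain and in no other strain."""
    others = set()
    for other_name, genes in gene_sets.items():
        if other_name != name:
            others |= genes
    return gene_sets[name] - others


def shared_only_by(gene_sets, names):
    """Genes present in all of the named strains and absent elsewhere."""
    inside = set.intersection(*(gene_sets[n] for n in names))
    outside = set()
    for other_name, genes in gene_sets.items():
        if other_name not in names:
            outside |= genes
    return inside - outside


print("core:", sorted(core_genes(strains)))
print("SO-1977 only:", sorted(unique_genes(strains, "SO-1977")))
print("MRSA252 + MSSA476 only:",
      sorted(shared_only_by(strains, ["MRSA252", "MSSA476"])))
```

With the full resistance gene lists exported from the annotation server in place of the toy sets, the same three functions would yield the core, strain-specific and pairwise-shared groups discussed in the text.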

Phylogenetic analysis of nucleotide sequence of strain SO-1977
Phylogenetic analysis of the 16S rRNA gene (MK713975) showed that strain SO-1977 has the highest similarity to other S. aureus strains (Table 5, Fig. 4).

Discussion
The present study reports the first genome sequence of an S. aureus (MRSA) isolate from Sudan, together with its phylogenetic placement based on the 16S rRNA gene to represent the evolutionary relationships of the bacteria. Phylogenetic analysis of the complete 16S rRNA gene sequence of strain SO-1977 (MK713975) showed that the strain should be assigned to the genus Staphylococcus. The annotated draft genome sequence of strain SO-1977 was 2,827,644 bp in length and contained 2629 coding sequences (CDSs). Moreover, the WGS data were used to investigate antimicrobial resistance and virulence mechanisms. The multi-drug resistance of this isolate might arise either from the ability of these bacteria to accumulate, within a single cell, multiple genes on resistance (R) plasmids, each coding for resistance to a single drug, or from increased expression of genes that code for multi-drug efflux pumps, which extrude a wide range of drugs [10]. In this study, the MRSA isolate from Sudan was found to possess different resistance mechanisms, which can be attributed to the resistance genes TcaR, TcaA, TcaB, TetR, TetM and PBP2a (MecA), or to the production of proteins (DNA gyrase subunit A, DNA gyrase subunit B, topoisomerase IV subunit A, topoisomerase IV subunit B and beta-lactamase repressor) allowing it to use the efflux pump mechanism. In addition, six putative MarR family transcriptional regulators were identified in the SO-1977 genome. These are recognised as a widely conserved group of multiple antibiotic resistance regulators that respond to a wide range of antibiotics [11]. The characteristic MRSA phenotype is due to the presence of mecA, which encodes a penicillin-binding protein (PBP), PBP2a, with reduced affinity for β-lactams. The mecA gene is embedded in a large heterologous chromosomal cassette, the SCCmec element.
Some MRSA strains carry, upstream of the mecA gene, the regulatory genes mecI-mecR1, which encode a repressor and a sensor/inducer of mecA expression, respectively [12]. In this study, MecA and

Sample preparation
A wound swab specimen was collected from a patient at Soba Hospital, Khartoum, inoculated on sheep blood agar and mannitol salt agar, and incubated at 37°C for 24 h. Colonies were identified using standard procedures and tests, including Gram stain, catalase, coagulase and DNase tests [13]. Cultures positive for S. aureus were then suspended to a turbidity equivalent to the 0.5 McFarland standard and streaked on Mueller-Hinton agar (MHA). Oxacillin (6 μg/ml) and cefoxitin (30 μg/ml) antimicrobial discs were positioned at suitable distances on the bacterial lawns on MHA, and the plates were incubated at 33°C for 24 h. The growth inhibition zones were then measured with a calliper according to the standard Kirby-Bauer disc diffusion method and NCCLS guidelines [14]. The measurements indicated that the colonies were resistant, consistent with an MRSA strain. Resistance profiling of the strain against a broader range of antibiotics was not performed, which is a limitation of the study.

Genomic DNA extraction and sequencing library preparation
Bacterial DNA was extracted using a Qiagen kit following the manufacturer's instructions. The concentration and purity of the resulting DNA were determined photometrically using a NanoDrop (Thermo Fisher®). About 5 μg of genomic DNA (A260/280 = 1.93) was used for library preparation, and 4 nM of genomic DNA was used as input for the Nextera XT kit (Illumina). Samples were then barcoded using forward (N702) and reverse (N702) primers in 12 cycles of PCR amplification. Libraries were quantified on the Bioanalyzer (Agilent Technologies) and pooled in an equimolar mixture. Finally, 0.19 ng/ml was used as input for next-generation sequencing (NGS), and the libraries were sequenced in a single run on the Illumina MiSeq instrument (250 bp paired-end reads).

Bacterial genome sequencing and assembly
Poor-quality and adaptor-containing reads were filtered and trimmed using BBTools version 36 [15]. Good-quality sequencing reads were assembled using SPAdes version 3.5.0. For the prediction of tRNA and rRNA genes, ARAGORN 1.2.34 and RNAmmer 1.2 were used, respectively [16,17]. Protein-coding genes were then predicted using Prodigal 2.60 [18], and their functions were assigned using BLASTN 2.2.25+ [19], followed by detection of sequence homologs by searching various sequence domain databases using HMMER 3.0 (http://hmmer.org/).

Genome annotation
The final draft genome sequence of S. aureus SO-1977 was annotated using RAST [20] and the NCBI Prokaryotic Genome Annotation Pipeline [21]. The annotated genes were exported from the RAST server into an Excel table and manually compared for genomic features. The antibiotic resistance genes of S. aureus SO-1977, S. aureus MRSA252 (PRJNA265) and S. aureus MSSA476 (PRJNA116329) were retrieved from the RAST server and then compared [22]. The graphical circular map of the genome was produced with the CGView server [23]. The Seq Match score was computed as the number of unique 7-base oligomers shared between the query sequence and a given RDP sequence, divided by the lowest number of unique oligomers in either of the two sequences. Phylogenetic analysis was performed using the Maximum Likelihood method implemented in MEGA6 [24]; all positions containing gaps and missing data were eliminated.
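The Seq Match score defined above can be written directly from its description. The following is a minimal sketch of that definition only, assuming plain nucleotide strings as input; it is not the RDP implementation, and the function names and toy sequences are hypothetical.

```python
def unique_kmers(seq, k=7):
    """Return the set of distinct k-base oligomers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def seqmatch_score(query, reference, k=7):
    """Shared unique k-mers divided by the smaller unique k-mer count
    of the two sequences, as described in the text."""
    query_kmers = unique_kmers(query, k)
    reference_kmers = unique_kmers(reference, k)
    denominator = min(len(query_kmers), len(reference_kmers))
    if denominator == 0:
        raise ValueError("each sequence must contain at least k bases")
    return len(query_kmers & reference_kmers) / denominator


# Toy sequences (hypothetical, far shorter than a 16S rRNA gene).
query = "ACGTACGTAC"
reference = "ACGTACGTTT"
print(seqmatch_score(query, reference))
```

Because the denominator is the smaller of the two unique oligomer counts, a short query that is fully contained in a longer reference still scores 1.0, which is the behaviour implied by the definition.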

Sequence data access
The genomic data of this study were deposited publicly in DDBJ/ENA/GenBank® under Accession: NFZY00000000, BioProject: PRJNA385553 and BioSample: SAMN06894057.