Virulence factors and molecular characteristics of Shigella flexneri isolated from calves with diarrhea

Background The natural hosts of Shigella are typically humans and other primates, but it has been shown that the host range of Shigella has expanded to many animals. Although Shigella is becoming a major threat to animals, there is limited information on the genetic background of local strains. The purpose of this study was to assess the presence of virulence factors and the molecular characteristics of S. flexneri isolated from calves with diarrhea. Results Fifty-four S. flexneri isolates from Gansu, Shanxi, Qinghai, Xinjiang and Tibet obtained during 2014 to 2016 possessed four typical biochemical characteristics of Shigella. The prevalences of ipaH, virA, ipaBCD, ial, sen, set1A, set1B and stx were 100 %, 100 %, 92.59 %, 77.78 %, 79.63 %, 48.15 %, 48.15 % and 0 %, respectively. Multilocus variable number tandem repeat analysis (MLVA) based on 8 variable number of tandem repeat (VNTR) loci discriminated the isolates into 39 different MLVA types (MTs), pulsed field gel electrophoresis (PFGE) based on NotI digestion divided the 54 isolates into 31 PFGE types (PTs), and multilocus sequence typing (MLST) based on 15 housekeeping genes differentiated the isolates into 7 MLST sequence types (STs). Conclusions The findings from this study enrich our knowledge of the molecular characteristics of S. flexneri collected from calves with diarrhea, which will be important for addressing clinical and epidemiological issues regarding shigellosis. Supplementary Information The online version contains supplementary material available at 10.1186/s12866-021-02277-0.


Background
Shigellosis, or bloody dysentery, is widespread in underdeveloped and developing regions with poor hygiene and limited access to clean drinking water and has become a serious threat to public health [1,2]. Shigellosis is caused by nonmotile, facultative anaerobic gram-negative bacilli of the Enterobacteriaceae family, including S. dysenteriae, S. flexneri, S. boydii, and S. sonnei [3][4][5]. Shigella species possess highly effective invasion systems that enable the bacteria to invade and multiply within the human intestinal epithelium, ultimately leading to severe inflammatory colitis, which is referred to as bacillary dysentery or shigellosis [4].
Various virulence factors located on chromosomes or large virulence inv plasmids are recognized as crucial factors related to the pathogenesis of shigellosis [6]. Moreover, these different virulence factors are associated with the colonization of intestinal cells and intracellular invasion, which may partly explain why various manifestations are detected in the clinic, such as intestinal inflammatory responses and watery diarrhea [1]. Bacterial cell-to-cell movement and dissemination within epithelial cells of the intestine are enabled by the ipaH gene, which is encoded by chromosomal DNA and/or recombinant plasmids, while ial (invasion-associated locus), which is encoded by plasmids, enables Shigella bacteria to penetrate intestinal epithelial tissues [7,8]. The chromosomal genes set1A and set1B encode Shigella enterotoxin 1 (ShET-1) [9,10], which is easily detected in all S. flexneri 2a isolates. Shigella enterotoxin 2 (ShET-2) is encoded by the sen gene, which is located on a large plasmid associated with the virulence of Shigella and is found in most Shigella of different serotypes and in enteroinvasive Escherichia coli (EIEC) [11,12]. In addition to their enterotoxic activity, ShET-1 and ShET-2 play significant roles in the transport of electrolytes and water in the intestine [12]. VirA, which is located on large virulence plasmids, has a great impact on intercellular spreading and invasion [13]. In addition, the type III secretion system (T3SS) is regarded as an important component for bacterial entry and is composed of several proteins, including a needle-shaped oligomer anchored in the protein complex that connects the inner and outer bacterial membranes. The tip of the needle is an oligomer composed of the invasion plasmid antigens IpaB, IpaC, and IpaD [14][15][16]. Furthermore, the region upstream of ipaB is often used as a marker to detect the ipaBCD gene.
The natural hosts of Shigella are typically humans and other primates [4], but monkeys, rabbits, calves, fish, chickens and piglets were recently reported to be infected with Shigella and are thus considered new hosts [4][17][18][19][20][21]. In recent years, S. dysenteriae, S. flexneri, and S. sonnei have been isolated from cows. Although Shigella is becoming a major threat to animals, there is limited information on the genetic background of the isolated strains. Therefore, to identify molecular genotypes and determine the genetic relatedness and diversity of local S. flexneri strains, we performed analyses using multilocus sequence typing (MLST), multilocus variable number tandem repeat analysis (MLVA) and pulsed field gel electrophoresis (PFGE).

Virulence factors
The frequencies of the virulence factor profiles of the S. flexneri isolates are shown in Fig. 1. A total of seven virulence factors, including ipaH (100 %), virA (100 %), ipaBCD (92.59 %), ial (77.78 %), sen (79.63 %), set1A (48.15 %) and set1B (48.15 %), were detected in the isolates. None of the studied strains possessed the stx gene. The Shigella enterotoxin genes set1A and set1B were present only in S. flexneri 2a, and all isolates of this serotype were positive for both genes.
Based on differences in the distribution of the virulence factors, the 54 S. flexneri isolates fell into seven virulence gene profile types (VTs) (Table 2). Among these VTs, VT4 (positive for ipaH, virA, ipaBCD, ial, and sen) and VT6 (positive for ipaH, virA, ipaBCD, ial, sen, set1A, and set1B) were the most common, accounting for 29.63 and 44.44 % of all isolates, respectively. Furthermore, 92.59 % of the isolates carried two or more virulence factors. In addition, the virulence factor types were associated with the S. flexneri serotype. VT1 was found only in 4a, and VT4 was present in isolates from each serotype, except 2a. S. flexneri 2a isolates mainly belonged to VT6 (24/26, 92.31 %).
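The percentages above can be cross-checked against the collection size. The following is a minimal sketch, assuming every percentage refers to the full set of 54 isolates; the isolate counts it prints are implied by the reported percentages and are not stated in the text, and the function names are illustrative only.

```python
TOTAL_ISOLATES = 54  # number of S. flexneri isolates in the study

# Prevalences (%) as reported in the text.
reported_percent = {
    "ipaH": 100.0, "virA": 100.0, "ipaBCD": 92.59, "ial": 77.78,
    "sen": 79.63, "set1A": 48.15, "set1B": 48.15, "stx": 0.0,
}


def percent_to_count(percent, total=TOTAL_ISOLATES):
    """Nearest whole number of isolates matching a reported percentage."""
    return round(percent * total / 100)


def count_to_percent(count, total=TOTAL_ISOLATES):
    """Prevalence as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


# Counts implied by the percentages (an inference, not reported data).
implied_counts = {gene: percent_to_count(p) for gene, p in reported_percent.items()}

for gene, count in implied_counts.items():
    print(f"{gene}: {count}/{TOTAL_ISOLATES} = {count_to_percent(count):.2f} %")
```

Under the same assumption, the VT4 and VT6 shares of 29.63 % and 44.44 % correspond to 16 and 24 of the 54 isolates, which agrees with the 24/26 serotype 2a isolates assigned to VT6.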

MLST-based genotype analysis
MLST was performed to analyze the genotypic diversity of S. flexneri isolates based on 15 housekeeping genes. The 54 isolates were divided into seven STs: ST68, ST100, ST103, ST120, ST124, ST135 and ST227. Among them, ST227 was novel, while the other six STs have been reported previously. These seven STs belonged to several clonal complexes (CCs): CC10 (ST100 and ST103), CC26 (ST68), and others (ST120, ST124, ST135 and ST227). The clustering tree (Fig. 2) based on the MLST data showed that ST68 was a singleton type and that the other six STs contained two or more isolates. The most common ST was ST100 (n = 33, 61.11 %), which included isolates of serotypes 1a, 2a, and Xv. All the isolates of ST124 and ST227 belonged to S. flexneri 6 and 4a, respectively. The cluster tree indicated that isolates belonging to the same serotype were closely clustered based on the province of isolation. In addition, according to the minimum spanning tree (MST) based on the allelic profiles, ST100, ST120 and ST135 were closely related and differed only in aspC (aspartate aminotransferase), whereas ST68, ST124 and ST227 were very different from the other STs (Fig. 3).

MLVA-based genotype analysis
MLVA based on eight VNTR loci was performed to further characterize the isolated S. flexneri strains. The copy numbers of the eight VNTR loci are listed in Fig. 4. Overall, the 54 isolates were discriminated into 39 different MLVA types (MTs) based on their unique MLVA profiles. Among them, twenty-eight MTs belonged to the singleton type, and the other ten MTs contained no more than three isolates. The MLVA cluster tree showed that the isolates were divided into five clusters, designated A to E, with a low coefficient of similarity of 20 to 60 % (Fig. 4). Each cluster was further divided into many subclusters. MLVA clustered strains of different serotypes separately and also distinguished between strains of the same serotype. The main cluster, cluster C, contained the S. flexneri 2a isolates and was further divided into 15 MTs. Additionally, clusters A (except GBSF1502176), D and E contained only the Xv, 2b, and 6 serotype strains, respectively. Within the same serotype, the results showed differences based on geographical origin and time of isolation.

PFGE-based genotype analysis
The genotypes and the genetic relatedness and diversity of the 54 S. flexneri isolates were assessed by PFGE. NotI-digested S. flexneri chromosomal DNA generated 31 reproducible unique PFGE patterns (PTs), each with 11-16 bands (Fig. 5). Eleven patterns were represented by more than one isolate, with PT20 (n = 8) containing the most isolates, followed by PT18 (n = 5). The dendrogram showed low similarity (40-60 %) among the isolates, which could be classified into three major clusters on the basis of their serotypes: clusters A, B and C. Isolates belonging to the same serotype but recovered in different years showed clear relatedness, as indicated by their grouping in the same clusters. The majority of serotype 2a isolates, with the exception of isolate QYSF1511395, grouped together in cluster B. The QYSF1511395 strain isolated from Qinghai Province clustered independently in cluster C. Isolates of serotypes 1a, 2b and Xv clustered into cluster B and were closely related to the serotype 2a isolates. In contrast, the isolates of serotypes 4a and 6 were assigned to cluster A and were relatively closely related, although strains of the two serotypes clustered separately.

Discussion
Shigella is an important invasive enteric pathogen known for its sporadic, epidemic and pandemic spread [3]. It remains a leading cause of inflammatory diarrhea and dysentery and thus poses a serious challenge to public health, particularly in middle-income countries and regions with substandard hygiene and poor-quality water supplies [22]. All four Shigella species can cause shigellosis, but S. flexneri is the most common causative agent [23]. The traditional hosts of this pathogen are limited to primates; however, the range of hosts has been extended to many animals in recent decades [4]. The symptoms of shigellosis in humans include diarrhea (100 %), headache (100 %), fever (100 %), nausea (99 %), abdominal cramping (97 %), vomiting (95 %), and bloody stools (51 %); however, the symptoms in animals are unclear [24]. A better understanding of the hosts of Shigella is needed to assess their potential effects on animal health; otherwise, preventing Shigella from causing disease is a challenge. The pathogenesis of Shigella is attributed to the organism's ability to invade, replicate and spread intercellularly within the colonic epithelium. Virulence factors enable pathogenic Shigella to invade intestinal epithelial cells, leading to dysentery and other intestinal clinical symptoms in the host [25], and the pathogenesis is often multifactorial and coordinated [26]. Virulence factors have become important indicators of pathogenic bacteria.
Based on the detection of virulence factors, the Shigella isolates used in the present study showed considerable genetic diversity. Our results showed that ipaH and virA were found in each strain. Arabshahi et al. similarly showed that ipaH was present in all Shigella isolates and that virA was harbored by 88.9 % (8/9) of S. flexneri isolates [27], which agrees with a previous study that demonstrated that ipaH is carried by all four Shigella species as well as by enteroinvasive E. coli (EIEC). Multiple copies (ipaH1.4, ipaH2.5, ipaH4.5, ipaH7.8 and ipaH9.8) on large plasmids and chromosomes may explain why the ipaH gene tested positive in all isolates. Therefore, the ipaH gene is often an appealing diagnostic target for detecting Shigella, even in the absence of a plasmid [28]. VirA was initially thought to promote Shigella invasion; however, a structural analysis showed that VirA lacks papain-like protease activity to promote tubulin division. VirA belongs to the GTPase-activating protein family, which is involved in the cleavage of a single membrane into vacuoles. Previous studies have shown that VirA is often present in Shigella and is an important terminal point for bacteria to invade host cells and nucleate actin at one end of bacteria [9,29].
Among the virulence factors, the T3SS is considered essential for host cell invasion and intracellular survival, and IpaB, IpaC, and IpaD are key factors of virulent Shigella [9,30,31]. Unlike the ipaH gene, the ial gene is not present in all isolates. The ial gene is located only on the inv plasmid, and compared with chromosomal genes, the stability of the inv plasmid during storage and subculturing is poor [6][7][8]. Our results show that the ial gene has a high positivity rate in the isolates studied. It should be noted that the ial gene is involved in the invasion of intestinal cells and that the higher positivity rate of this gene in S. flexneri might indicate stronger invasiveness.
The Shigella enterotoxins ShET-1 and ShET-2, which alter electrolyte and water transport in the small intestine, can cause diarrhea and dehydration [22]. ShET-1 is encoded by the chromosomal set1 genes (A and B subunits), is found almost exclusively in S. flexneri serotype 2 isolates and is rarely found in other serotypes [32]. Consistent with previous studies, our study showed that set1A and set1B were detected only in the S. flexneri 2a strains. ShET-2, which is encoded by the plasmid-borne sen gene, is an enterotoxin hemolysin that causes an inflammatory response during Shigella invasion [12,22]. It has been reported that there is a close relationship among sen, set enterotoxins and bloody diarrhea [22], which implies that sen and set enterotoxins are pathogenic factors of bloody diarrhea. However, unlike ShET-1, ShET-2 could be harbored by other species of Shigella.
Fig. 3 Minimum spanning tree of the 54 S. flexneri isolates from calves with diarrhea based on multilocus sequence typing (MLST). The minimum spanning tree was constructed using the 7 identified STs obtained from the 54 isolates using BioNumerics software. Each circle corresponds to a single ST. The shaded zones in different colors correspond to different serotypes. The size of the circle is proportional to the number of isolates, and the color within the circles represents the serotype of the isolates. The corresponding color, serotype, number of isolates and background information are shown to the right of the minimum spanning tree.
Fig. 4 Relationship of S. flexneri isolates isolated from calves with diarrhea based on MLVA. Isolates were analyzed using an eight-VNTR locus MLVA scheme. A dendrogram was constructed using UPGMA. The corresponding MLVA types with the copy numbers of the eight VNTRs, serotype, and background information are shown to the right of the dendrogram. The letters A-E represent 5 clusters.
The molecular characterization of strains is significant for epidemiological studies. However, few reports are available to systematically understand the molecular characteristics of S. flexneri isolated from animals. Several useful genotyping tools with higher discriminatory power than traditional tests, including MLST [33], PFGE [34] and MLVA [35], have recently been applied to explore and analyze the characteristics of Shigella isolates. Phylogenetic analysis, as an important method for supporting strain isolation, is based on differences in strain genetics.
MLST provides sequence data for comparative genetics and is thus a tool for exploring molecular evolution among bacteria [36]. Because it relies on 15 housekeeping genes and the EcMLST database, the advantage of MLST is that data from different laboratories can be compared. Our results suggested that the predominant ST was ST100, which has previously been found in human S. flexneri isolates [37,38]. Notably, isolates belonging to the same serotype often shared one ST, indicating the low ability of MLST to discriminate closely related strains within a specific serotype due to the high sequence conservation of housekeeping genes.
Compared with MLST, MLVA and PFGE may be more powerful tools that can provide a satisfactory level of discrimination. However, MLVA is poorly suited to the phylogenetic analysis of different bacterial species or serotypes [39]. Nevertheless, MLVA is a commonly used typing tool that has been used to establish genetic relatedness and perform phylogenetic analysis among strains of monomorphic species. In our study, with approximately 20 % similarity, the 54 S. flexneri isolates were divided into 39 different MTs and clustered into 5 groups. Previous studies have also shown the high resolving power of MLVA in closely related strains [40][41][42]. Although applied to a limited collection of S. flexneri isolates, this study indicates the high discriminatory power of the MLVA method for subtyping strains with the same serotype.
Owing to its robustness and widespread use, PFGE is also an applicable typing tool available in the laboratory for discriminating several enteric bacteria, such as Shigella. PFGE has a high degree of intra- and interlaboratory reproducibility when standardized protocols are followed [43]. The thirty-one unique PFGE patterns with low similarity confirmed the existence of diverse S. flexneri clones and the usefulness of PFGE in local epidemiological studies.
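The comparison of discriminatory power among the three methods can be made concrete with a short calculation. The sketch below first computes the number of types per isolate from the figures reported in this study (39 MTs, 31 PTs and 7 STs among 54 isolates). It then shows the Hunter-Gaston discriminatory index, a measure commonly used for typing schemes that was not used in this study and is included here only as an illustration. That index needs the number of isolates in every type, which is not fully given in the text, so the type-size lists in the example are hypothetical.

```python
def types_per_isolate(n_types, n_isolates):
    """Crude resolution measure: distinct types divided by isolates typed."""
    return n_types / n_isolates


def hunter_gaston_index(type_sizes):
    """Hunter-Gaston discriminatory index.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number of
    isolates in type j and N is the total number of isolates.
    """
    n = sum(type_sizes)
    if n < 2:
        raise ValueError("at least two isolates are required")
    return 1 - sum(s * (s - 1) for s in type_sizes) / (n * (n - 1))


# Type counts reported in the study for the 54 isolates.
reported_types = {"MLVA": 39, "PFGE": 31, "MLST": 7}
ratios = {m: round(types_per_isolate(k, 54), 2) for m, k in reported_types.items()}
print(ratios)

# Hypothetical type-size distributions, for illustration only.
print(hunter_gaston_index([1, 1, 1, 1]))  # every isolate has its own type
print(hunter_gaston_index([4]))           # all isolates share one type
```

The types-per-isolate ratio ignores how evenly isolates are spread across types, which is why an index based on type sizes is usually preferred when those sizes are available.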

Conclusions
This study demonstrated that naturally occurring S. flexneri in calves harbor the same virulence factors as the prevalent isolates in humans. Therefore, these isolates are a potential threat to public health. To systematically understand S. flexneri, the PFGE, MLVA and MLST methods were applied to characterize the 54 isolates genetically. MLVA based on 8 VNTR loci discriminated the 54 isolates into 39 different MTs, PFGE based on NotI digestion differentiated the 54 isolates into 31 PTs, MLST based on 15 housekeeping genes differentiated the 54 isolates into 7 STs, and 1 ST (ST227) was novel. Although MLST provided suitable discrimination in S. flexneri subtyping, PFGE and MLVA might both exhibit a higher discriminatory ability. Overall, the data from this study provide a useful typing resource and a scientific basis for addressing clinical and epidemiological issues regarding S. flexneri.

Bacterial isolates and bacteriological examination
Animal-based active surveillance was conducted in 3321 calves with diarrhea from five provinces (Gansu, Shanxi, Qinghai, Xinjiang and Tibet) in northwestern China from 2014 to 2016. All of the isolates were collected directly from fresh stool samples following plating on Salmonella-Shigella (SS) selective agar and confirmation on MacConkey (MAC) agar at 37°C for 24 h. Colorless, semitransparent, smooth, and moist circular colonies were considered presumptive Shigella for biochemical confirmation. Biochemical tests were performed using API20E test strips (bioMerieux Vitek, Marcy-l'Etoile, France), and the serotype was determined with a commercially available agglutinating antibody kit (Denka Seiken, Tokyo, Japan) according to the manufacturers' recommendations. Information on the S. flexneri isolates in this study is listed in Figs. 2, 4 and 5.

Preparation of DNA templates
The DNA templates for PCR (virulence factors, MLST, MLVA) were directly extracted from bacterial colonies using the boiled lysate method as previously reported [44].

Detection of virulence factors
All 54 strains were tested by PCR for the presence of 8 virulence-associated genes, namely, ipaH, ipaBCD, virA, ial, stx, set1A, set1B, and sen, according to published procedures [15,45,46]. PCRs were performed according to published protocols, and the primer sequences are listed in Table S1.

Multilocus sequence typing
All isolates were subjected to MLST according to the protocols described in the EcMLST database (http://www.shigatox.net/ecmlst). The PCR products were bidirectionally sequenced, and the sequences of the 15 housekeeping genes were edited by using SeqMan 7.0. Each unique allele was assigned a different number, and the allelic profile (string of fifteen allelic loci) was used to define the ST of each isolate [47]. Clustering and minimum spanning tree (MST) analyses were used to infer relationships among the isolates using the fingerprint analysis software BioNumerics (version 7.1).
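The logic of defining an ST from an allelic profile and linking STs in a minimum spanning tree can be sketched as follows. This is a simplified illustration, not the BioNumerics implementation: it uses plain Prim's algorithm on the number of differing loci, whereas BioNumerics applies additional priority rules. The profiles are hypothetical and use 4 loci instead of the 15 in the EcMLST scheme, and the ST labels are placeholders.

```python
def assign_sequence_types(profiles, known_sts=None):
    """Map each isolate's allelic profile to an ST label.

    profiles: dict of isolate name -> sequence of allele numbers.
    known_sts: dict of profile tuple -> ST label (e.g. from a database).
    Profiles not found in known_sts receive a placeholder label "new-N".
    """
    known = dict(known_sts or {})
    assignment = {}
    next_new = 1
    for isolate, profile in profiles.items():
        key = tuple(profile)
        if key not in known:
            known[key] = f"new-{next_new}"
            next_new += 1
        assignment[isolate] = known[key]
    return assignment


def allelic_distance(a, b):
    """Number of loci at which two allelic profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(st_profiles):
    """Prim's algorithm; returns a list of (st_in_tree, st_added, distance)."""
    names = sorted(st_profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge from the tree to any ST outside it; ties are
        # broken alphabetically by the tuple comparison.
        dist, u, v = min(
            (allelic_distance(st_profiles[u], st_profiles[v]), u, v)
            for u in in_tree for v in names if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Hypothetical 4-locus profiles, for illustration only.
example_sts = {
    "ST-A": (1, 1, 1, 1),
    "ST-B": (1, 1, 1, 2),
    "ST-C": (1, 1, 1, 3),
    "ST-D": (5, 6, 7, 8),
}
example_edges = minimum_spanning_tree(example_sts)
print(example_edges)
```

In this toy data set, ST-A, ST-B and ST-C differ at a single locus and are linked by short edges, while ST-D differs at every locus and hangs off the tree on a long edge, which mirrors the kind of pattern described for Fig. 3.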

Multilocus variable number tandem repeat analysis
MLVA of 8 VNTR loci (SF3, SF4, SF6, SF7, SF8, SF9, SF10 and SF25) was performed using a previously described method [35]. The forward primer of each primer set was labeled at its 5' end with one of the ABI-compatible dyes HEX, 6-FAM, TAMRA, and ROX (Table S2). The loci were individually amplified, with each 20 µL PCR mixture containing 1 µL of each primer, 1 µL of DNA template, 10 µL of Taq MasterMix (Takara, Japan) and deionized water to a final volume of 20 µL. PCR was performed with a denaturing step at 94°C for 5 min, followed by 30 cycles of amplification at 94°C for 30 s, 55°C for 45 s, and 72°C for 45 s and a final extension at 72°C for 5 min.
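The reaction setup and cycling program described above imply a water volume and a minimum run time, which the following sketch works out. It assumes that "1 µL of each primer" means one forward and one reverse primer per single-locus reaction, and it counts hold times only, ignoring ramping between temperatures; the variable and function names are illustrative.

```python
REACTION_VOLUME_UL = 20.0

# Assumption: two primers (forward and reverse) per single-locus reaction.
components_ul = {
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "DNA template": 1.0,
    "Taq MasterMix": 10.0,
}


def water_volume_ul(total_ul, components):
    """Deionized water needed to bring the mix to the final volume."""
    water = total_ul - sum(components.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    return water


def total_hold_seconds(initial_s, cycle_steps_s, n_cycles, final_s):
    """Sum of programmed hold times; ramp times are not included."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


water = water_volume_ul(REACTION_VOLUME_UL, components_ul)
# 94°C 5 min; 30 x (94°C 30 s, 55°C 45 s, 72°C 45 s); 72°C 5 min
total_s = total_hold_seconds(5 * 60, [30, 45, 45], 30, 5 * 60)

print(f"water per reaction: {water} µL")
print(f"programmed hold time: {total_s / 60:.0f} min")
```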
The PCR products were analyzed by capillary electrophoresis on an ABI Prism 3730 XL Genetic Analyzer with the GeneScan 500 LIZ Size Standard as previously described [48]. The number of repeat units for each allele was calculated from the length of the amplicon. The copy number of each VNTR locus was subjected to cluster analysis using the MST algorithm and the categorical coefficient provided in BioNumerics software. Each unique allelic string was designated a unique MLVA type. A dendrogram was constructed by UPGMA clustering based on categorical coefficient analysis [35,49].
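The conversion from fragment length to repeat number, and from allelic strings to MLVA types, can be sketched as below. The flanking-region and repeat-unit sizes in the example are hypothetical and do not come from the published primer set, and the MT labels are placeholders assigned in order of first appearance. Laboratories often add locus-specific offsets or binning to correct for capillary sizing bias; that step is omitted here.

```python
def repeat_copy_number(amplicon_bp, flank_bp, repeat_unit_bp):
    """Estimate the VNTR copy number from the amplicon length.

    copies = (amplicon length - flanking length) / repeat unit length,
    rounded to the nearest whole number and never below zero.
    """
    if repeat_unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    copies = (amplicon_bp - flank_bp) / repeat_unit_bp
    return max(0, round(copies))


def assign_mlva_types(copy_strings):
    """Give each unique allelic string its own MLVA type label."""
    types = {}
    assignment = {}
    for isolate, alleles in copy_strings.items():
        key = tuple(alleles)
        if key not in types:
            types[key] = f"MT{len(types) + 1}"
        assignment[isolate] = types[key]
    return assignment


def categorical_similarity(a, b):
    """Fraction of loci with identical copy numbers (categorical coefficient)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must cover the same, non-empty set of loci")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical locus: 180 bp of flanking sequence and a 7 bp repeat unit.
print(repeat_copy_number(amplicon_bp=236, flank_bp=180, repeat_unit_bp=7))

# Hypothetical eight-locus copy-number strings for three isolates.
example_profiles = {
    "isolate_1": (8, 3, 5, 2, 4, 6, 1, 7),
    "isolate_2": (8, 3, 5, 2, 4, 6, 2, 9),
    "isolate_3": (8, 3, 5, 2, 4, 6, 1, 7),
}
print(assign_mlva_types(example_profiles))
print(categorical_similarity(example_profiles["isolate_1"], example_profiles["isolate_2"]))
```

Because the categorical coefficient only asks whether two copy numbers are equal, a difference of one repeat and a difference of ten repeats at a locus count the same.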

Pulsed-field gel electrophoresis
DNA fingerprinting was performed by PFGE with the restriction enzyme NotI (TaKaRa; Japan) according to the international standards set by the CDC. PFGE images were photographed with a Universal Hood II (Bio-Rad; USA) and analyzed with BioNumerics using the Dice similarity coefficient, unweighted pair-group method with the arithmetic mean (UPGMA) and 1.0 % band position tolerance. A PFGE type (PT) was defined as a pattern with one or more DNA bands different from other patterns.
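The Dice coefficient with a band position tolerance can be illustrated with a short sketch. It is not the BioNumerics implementation: band positions are assumed to be expressed as a fraction of the lane length (0 to 1), so that a 1.0 % tolerance becomes 0.01, and bands are paired one-to-one by a simple sorted two-pointer scan. The lane data are hypothetical.

```python
def count_matched_bands(bands_a, bands_b, tolerance=0.01):
    """Count one-to-one band matches between two lanes.

    Bands are normalized positions in [0, 1]; two bands match when their
    positions differ by no more than the tolerance.
    """
    a, b = sorted(bands_a), sorted(bands_b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


def dice_similarity(bands_a, bands_b, tolerance=0.01):
    """Dice coefficient: 2 * shared bands / (bands in A + bands in B)."""
    total = len(bands_a) + len(bands_b)
    if total == 0:
        return 1.0
    return 2 * count_matched_bands(bands_a, bands_b, tolerance) / total


def same_pfge_type(bands_a, bands_b, tolerance=0.01):
    """Two patterns share a PT only if no band differs between them."""
    shared = count_matched_bands(bands_a, bands_b, tolerance)
    return shared == len(bands_a) == len(bands_b)


# Hypothetical normalized band positions for two lanes.
lane_1 = [0.10, 0.25, 0.40, 0.55, 0.70]
lane_2 = [0.105, 0.25, 0.43, 0.55, 0.90]
print(dice_similarity(lane_1, lane_2))
print(same_pfge_type(lane_1, [0.102, 0.252, 0.402, 0.552, 0.702]))
```

A matrix of pairwise Dice values computed this way is the input that UPGMA clustering would use to build a dendrogram.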