Molecular analysis of subtilase cytotoxin genes of food-borne Shiga toxin-producing Escherichia coli reveals a new allelic subAB variant

Background The open reading frames of subAB genes and their flanking regions of 18 food-borne Shiga toxin-producing E. coli (STEC) strains were analyzed. Results All but one of the subAB open reading frames (ORFs) detected in the STEC strains were complete. The subAB1 genes of nine STEC strains were located on large plasmids. The subAB2 allele (here designated subAB2-1), which was recently described by others to be present in the Subtilase-Encoding PAI (SE-PAI), was found in 6 STEC strains. A new chromosomal subAB2 variant, designated subAB2-2, was detected in 6 strains and was linked to a chromosomal gene hypothetically encoding an outer membrane efflux protein (OEP). Three STEC strains contained both subAB2 variants. DNA analysis indicated sequence conservation in the plasmid-located alleles and sequence heterogeneity among the chromosomal subAB2 genes. Conclusions The results of this study have shown that 18 subAB-PCR positive STEC strains contain complete subAB open reading frames. Furthermore, the new allelic variant subAB2-2 was described, which is located at a new chromosomal locus and can occur in addition to subAB2-1.


Background
Shiga toxin-producing E. coli (STEC) can cause serious human infections ranging from uncomplicated watery diarrhea to bloody diarrhea and the hemolytic uremic syndrome (HUS), including neurological complications [1]. The production of Shiga toxins (Stx) is considered to be the major virulence factor of STEC [2]. In addition to the production of Stx, histopathological lesions on host enterocytes, termed attaching and effacing lesions, which are caused by proteins encoded on the locus of enterocyte effacement (LEE), can lead to serious symptoms of disease [3]. The intimin-encoding E. coli attaching and effacing (eae) gene is located on the LEE. Intimin is involved in the intimate attachment of STEC to the enterocytes, and the corresponding eae gene has been used as a marker for the presence of the LEE [4]. However, eae-negative E. coli of various serotypes have also been described to cause serious disease. Examples are the outbreak of HUS caused by a STEC strain of serotype O113:H21 in South Australia in 1998 [5] and, more recently, the serious outbreak of diarrhea and HUS in Germany in 2011 caused by STEC of serotype O104:H4 [6]. Such strains may harbor important virulence markers other than the LEE. Whereas the O104:H4 outbreak strain had an enteroaggregative E. coli backbone, the O113:H21 outbreak strain expressed, in addition to Stx, a subtilase cytotoxin (SubAB) with cytotoxic and apoptotic properties [7]. Paton et al. [8] described this novel AB5 cytotoxin in the eae-negative STEC O113:H21 outbreak strain. The toxin caused cell death in a number of animal and human cells and enhanced survival of pathogenic E. coli strains in macrophages [9]. The initially described subtilase cytotoxin SubAB is encoded by the closely linked subA and subB genes, which are organized in an operon on the megaplasmid pO113 [7,8]. The saa gene, encoding the STEC autoagglutinating adhesin, is also located on pO113, close to the subAB operon [8].
This subtilase cytotoxin consists of a single enzymatically active A-subunit (SubA) and five receptor-binding B-subunits (SubB). SubA comprises 347 amino acids and contains the catalytic triad Asp-52, His-89, and Ser-272 typical of subtilase family serine proteases [8]. The SubB protein is 141 amino acids in length and is responsible for the receptor-mediated cellular uptake. SubA is a serine protease that cleaves the chaperone GRP78/BiP in the endoplasmic reticulum (ER) [10]. This leads to an unfolded protein response and ER stress-induced apoptosis [11]. Moreover, it has been demonstrated that SubAB causes HUS-like symptoms in mice [8,12]. SubB has a high binding specificity for α2-3-linked N-glycolylneuraminic acid (Neu5Gc) and a lower binding specificity for α2-3-linked N-acetylneuraminic acid (Neu5Ac) [13]. Human cells are not able to synthesize Neu5Gc but can generate high-affinity receptors when incubated with this molecule [14]. It has been hypothesized that ingestion of a Neu5Gc-rich diet confers susceptibility to the SubAB toxin [13].
Besides the plasmid-located subAB (subAB1) operon, a chromosomal variant was described in 2010 by Tozzoli et al. [15]. This variant (subAB2) showed only 90.0% sequence identity to the plasmid-located one but was also able to cause cytotoxic effects on Vero cells [15]. The chromosomal subAB2 variant has recently been shown to be harbored on a genomic island. This 8058 bp Subtilase-Encoding PAI (SE-PAI) is positioned between the tRNA gene pheV and the yjhS gene, which putatively encodes a 9-O-acetyl N-acetylneuraminic acid esterase, in E. coli strain ED32. The SE-PAI contains an integrase gene, a shiA gene (homologous to the shiA gene of the Shigella flexneri pathogenicity island SHI-2), a sulphatase gene, the toxigenic invasion locus A (tia) and the subAB operon [16,17].
The aim of the current study was to characterize the subAB genes and their genetic surrounding in a collection of 18 subAB-positive food-borne STEC strains in order to get a more detailed understanding of gene variability, genetic structure, and location.

Molecular methods
Purification of the large STEC plasmids was performed according to Kado et al. [23], with minor modifications. Chromosomal DNA was prepared according to standard methods [24]. Concentration and purity of plasmid and chromosomal DNA were measured by UV-vis spectrophotometry using a Nanodrop 2000 device (Thermo Scientific, Germany).
For detection of the subAB operon on chromosomal or plasmid DNA, a 1066 bp DNA probe spanning the subA and subB gene region was generated by PCR with a Roche PCR DIG probe synthesis kit (Roche Applied Science, Germany) using the primer pair subAB-V-for and subAB-V-rev (Table 2). Strain TS30/08 was used as a template for subAB1 and strain LM27558 stx2 was used for subAB2 with the same primer pair. The specificity of the probes was tested by hybridization with subAB genes cloned in vector pK18 [25] (data not shown). The purified chromosomal and plasmid DNA of subAB-positive strains was separated on a 0.8% agarose gel at 130 V and 4°C for 2 h. Subsequently, DNA was transferred to a nitrocellulose membrane by vacuum blotting (VacuGene XL, GE Healthcare, USA) at 60 mbar, then treated with a 1% (v/v) blocking solution (Roche Applied Science, Germany) and hybridized at 73°C for 20 hours. Detection was performed according to the manufacturer's instructions using sheep anti-digoxigenin-AP Fab fragments (Roche Applied Science, Germany).

PCR-screening, sequencing and sequence analysis
Characterization and sequencing of subAB alleles and detection of the saa and tia genes were performed by PCR amplification with the oligonucleotides shown in Table 2. DNA sequence analysis of subAB open reading frames was carried out by capillary sequencing using a CEQ™ 8000 Genetic Analysis System (Beckman Coulter, Germany) and the CEQ Dye Terminator cycle sequencing (DTCS) quick start kit (Beckman Coulter, Germany) according to the manufacturer's recommendations. Final DNA sequences were obtained by sequencing both complementary strands with at least two-fold coverage. Oligonucleotides for sequencing were designed with the Oligo-Explorer ver. 1.1.2 software (http://www.genelink.com) using nucleotide sequences of E. coli strains 98NK2 (Acc. no. AY258503), ED32 (Acc. no. JQ994271), and 1.2264 (Acc. no. AEZO02000020.1) from the NCBI database. The same sequences were used as reference sequences for phylogenetic analyses and sequence comparison. The sequences obtained for all subAB alleles were submitted to the EBI database and were assigned the consecutive accession numbers HG324027 to HG324047. Editing of raw data and sequence alignments were carried out using Bioedit, version 7.0.5.3 [27]. Phylogenetic analysis of the different subA genes was conducted using Mega 5.1 with a UPGMA algorithm [28].

Genomic localization of subAB genes
In order to characterize the subAB genes of 18 food-borne STEC strains from a previous study, which were positive by PCR targeting a fragment of the subAB operon [19], the strains were initially analyzed for the presence and genetic location of the complete ORFs. Purification and gel electrophoresis of plasmid DNA demonstrated that all 18 STEC strains carried plasmids of various sizes (data not shown). Sixteen strains carried large plasmids with molecular weights larger than that of plasmid pO157 of E. coli O157:H7 strain EDL933 (representative plasmid preparations are shown in Figure 1A). Southern blot hybridization with a specific DNA probe directed to subAB1 showed that 9 strains carried subAB1 on a large plasmid (Figure 1A). None of the other strains reacted with the probe (data not shown). Southern blot hybridization of chromosomal DNA preparations (representative DNA preparations are shown in Figure 1B) with a probe directed to subAB2 demonstrated chromosomal localization of subAB2 in the other 9 strains (Figure 1B).

PCR analysis of subAB and adjacent DNA regions
All STEC strains were analyzed by PCR with specific primers directed to the subAB operon or flanking regions of the two recently described subAB alleles [8,16] (Figure 2). PCR products were confirmed by DNA sequencing. For the detection of plasmid-located subAB1, primer pair subAB-for5/subAB-rev5 (Figure 2A) was used to amplify the complete ORF, including a region 202 bp upstream and 194 bp downstream of subAB1. The nine strains with plasmid-located subAB1 yielded a PCR product of the expected size of 1821 bp, indicating the presence of the subAB1 variant and complete ORFs in these strains (data not shown). Moreover, saa was present in these strains, indicating a genetic arrangement similar to that previously described [8].
Since it has been reported that the chromosomal subAB2 variant of STEC strain ED32 is linked to the tia gene in the chromosomal island SE-PAI [16], corresponding primers were used to test whether the remaining 9 strains contained this particular variant (for a scheme see Figure 2B). In initial experiments, PCR with primers tia_lo and tia_sense (Table 2) was positive in all nine strains and proved the presence of the tia gene. However, PCR products of strains LM27553 stx1 and LM27553 stx2 were larger than expected, indicating insertion of foreign DNA into or close to the tia gene [15] (Table 1).
Following this, the structure of the subAB2 operon and adjacent DNA was analyzed using the primer pair tia_lo/SubAB2-3′tia targeting the region of the tia gene, an intergenic region (linker), subAB2, as well as 316 bp of the downstream region (Figure 2B). The expected PCR product is 3174 bp. In these PCRs, 6 STEC strains were positive (see Figure 3A, lanes 3, 5-9), indicating the presence of subAB2 linked to the tia gene (Table 1). However, the PCR with strain LM27553 stx1 as a template yielded a product of approximately 4500 bp (Figure 3A, lane 3). Since the open reading frames of subA2-1 and subB2-1 in this strain were of the correct size, insertion of foreign DNA between subA2-1 and tia is assumed. PCR of STEC strains LM14603/08, LM16092/08 and LM27553 stx2 with the same primers was negative (Figure 3A, lanes 1, 2, and 4), and therefore direct association of subAB2 with the tia gene could not be demonstrated. Weak bands in Figure 3A, lanes 1, 2, and 4 reflect unspecific amplification products. Due to these negative results, the subAB2 reference sequence of STEC strain ED32 (GenBank Acc. No. JQ994271) was searched with BLAST against the NCBI nucleotide database to evaluate the possibility of further subAB gene loci in these strains. Interestingly, a further subAB operon with different flanking regions was detected in Escherichia coli strain 1.2264 in contig 3905 (Acc. No. AEZO02000020.1) and in Escherichia coli strain 9.0111 in contig 1125855384441 (Acc. No. AEZZ02000028.1), both of which in addition carry the SE-PAI described by Michelacci et al. [16]. The new gene locus carries genes hypothetically encoding parts of a type 1 secretion system (T1SS) and an outer membrane efflux protein (OEP), which are located upstream of subAB2 and are linked to the latter by a 1496 bp sequence (for a scheme see Figure 2C). Downstream of subAB2, the nanR gene, hypothetically encoding the transcriptional regulator of the nan-operon, was present at a distance of 1400 bp in E. coli strain 1.2264 and 3842 bp in E. coli strain 9.0111, where additional putative transposases are inserted (data not shown). In the following, this new gene region is termed OEP-locus.
To test whether the STEC strains investigated here also carry two copies of subAB2, oligonucleotides SubAB5′-OEP and SubA_out were designed (Figure 2C) and used for PCR amplification of all subAB2-positive strains. Six strains were positive with these primers (Figure 3B, lanes 1-6), including the strains LM14603/08, LM16092/08 and LM27553 stx2, which were negative for the SE-PAI (Figure 3A, lanes 1, 2, and 4). Moreover, this demonstrated that STEC strains LM27553 stx1, LM27564 and LM27558 stx2 contained both chromosomal subAB2 loci (Table 1).
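The strain counts reported for the two chromosomal loci can be cross-checked with simple set arithmetic. The following Python sketch is illustrative only. The three strains that carry subAB2 solely in the SE-PAI are not named in this section, so the entries sepai_only_1 to sepai_only_3 are hypothetical placeholders; the remaining strain names are taken from the text.

```python
# Strains positive in the SE-PAI-linked PCR (tia_lo/SubAB2-3'tia).
# "sepai_only_*" are hypothetical placeholders for strains not named in the text.
se_pai = {
    "LM27553 stx1", "LM27564", "LM27558 stx2",
    "sepai_only_1", "sepai_only_2", "sepai_only_3",
}

# Strains positive in the OEP-locus PCR (SubAB5'-OEP/SubA_out).
oep_locus = {
    "LM14603/08", "LM16092/08", "LM27553 stx2",
    "LM27553 stx1", "LM27564", "LM27558 stx2",
}

both_loci = se_pai & oep_locus          # strains carrying both chromosomal loci
chromosomal_strains = se_pai | oep_locus  # strains with at least one chromosomal locus

plasmid_strains = 9  # strains with plasmid-located subAB1, as reported

# Number of chromosomal subAB operons and of all subA genes across the collection.
chromosomal_operons = len(se_pai) + len(oep_locus)
total_subA_genes = plasmid_strains + chromosomal_operons

print(sorted(both_loci))
print(len(chromosomal_strains), chromosomal_operons, total_subA_genes)
```

Under these assumptions, six SE-PAI-positive and six OEP-positive strains with three carrying both loci give nine chromosomal strains, which together with the nine plasmid-positive strains account for all 18 isolates.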

Sequencing of subAB open reading frames
In order to further prove that the subAB operons contained complete ORFs, we determined the nucleotide sequence of the entire subAB open reading frames of the PCR products derived from the three different gene loci. Results of the DNA sequencing were consistent with the PCR data (see above) and confirmed the presence of three loci encoding different alleles of subAB. The different alleles of the chromosomal loci were designated subAB2-1 for the one located in the SE-PAI and subAB2-2 for the new variant located in the OEP-locus. The sequences of the nine subAB1 operons were identical and comprised 1486 bp from the start codon of subA1 to the last base of the stop codon of subB1. Sequences were 99.8% identical to the corresponding subAB operon sequence of strain 98NK2 published by Paton et al. [8].
In all 12 chromosomal DNA sequences the A-subunit genes had the same length as the subA1 genes described above and as that of reference strain 98NK2. All but one of the subB2 genes had the same length as the reference sequence of ED32 but were one triplet shorter at the 3′-end of the gene than subB1. This resulted in the lack of the N-terminal amino acid serine in the putative SubB2 subunits.
Moreover, the subB2-2 sequence of strain LM27553 stx1 contained an insertion of a single thymine, generating a stretch of 5 T's at position 1298-1302, which was not present in the subB2 alleles of the other strains. This resulted in a frame shift in the B-subunit gene and thereby in a stop codon at position 253 of the ORF. This putatively results in a truncated protein of 84 amino acids instead of the 140 amino acids of the full-length SubB2 subunits.
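The relationship between a single-base insertion, the resulting frame shift, and a premature stop codon can be illustrated with a short Python sketch. The toy ORF below is an invented sequence, not the subB2-2 sequence; the function names are likewise hypothetical. The last lines apply the same arithmetic to the stop codon position reported above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_position(orf):
    """Return the 1-based nucleotide position of the first in-frame stop codon, or None."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i + 1
    return None

def codons_before_stop(stop_position):
    """Number of sense codons (amino acids) preceding a stop codon at a 1-based position."""
    return (stop_position - 1) // 3

# Toy wild-type ORF (invented): ATG GCA TTT TAT AAG GCC CTG TGA
wild_type = "ATGGCATTTTATAAGGCCCTGTGA"

# Insert one extra T after the run of four T's (0-based index 10),
# turning TTTT into TTTTT and shifting the reading frame.
mutant = wild_type[:10] + "T" + wild_type[10:]

wt_stop = first_stop_position(wild_type)   # stop codon of the intact toy ORF
mut_stop = first_stop_position(mutant)     # premature stop in the shifted frame

print(wt_stop, codons_before_stop(wt_stop))
print(mut_stop, codons_before_stop(mut_stop))

# Applied to the value reported in the text: a stop codon at position 253 of the ORF.
print(codons_before_stop(253))
```

A stop codon starting at nucleotide 253 is preceded by 252 nucleotides, that is 84 complete codons, which agrees with the predicted truncated protein of 84 amino acids.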
Phylogenetic analysis of all 21 A-subunit genes clearly demonstrated three clusters (Figure 4). Cluster 1 comprises the very homogeneous subA1 genes, cluster 2 the subA2-1 genes, including the reference sequence of ED32, and cluster 3 the subA2-2 genes located in the OEP-locus. In cluster 2 there is a single subA2-2 allele located on the OEP-locus (Figure 4).
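For readers unfamiliar with UPGMA, the clustering principle can be sketched in a few lines of Python. This is a minimal illustration of the general algorithm, not a reproduction of the Mega 5.1 implementation used in this study. The four sequence labels and all distances are invented for demonstration, and the function name upgma is hypothetical.

```python
def upgma(names, dist):
    """Build a UPGMA tree (nested tuples) from a symmetric distance matrix."""
    n = len(names)
    # Each cluster id maps to (subtree, number of leaves in the subtree).
    clusters = {i: (names[i], 1) for i in range(n)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), _ = min(d.items(), key=lambda item: item[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop all distances involving the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]

# Invented labels and distances (fraction of differing sites), for illustration only.
names = ["subA1_x", "subA1_y", "subA2-1_x", "subA2-2_x"]
dist = [
    [0.000, 0.002, 0.100, 0.100],
    [0.002, 0.000, 0.100, 0.100],
    [0.100, 0.100, 0.000, 0.020],
    [0.100, 0.100, 0.020, 0.000],
]

tree = upgma(names, dist)
print(tree)
```

In this toy input the two nearly identical plasmid-type sequences are joined first, followed by the two chromosomal-type sequences, before the two groups are merged at the root.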
Comparing the whole subAB sequences of 1483 bp (sequences were cut to the same length), the subAB2-1 sequences of cluster 2, including subAB2-2 of strain LM27564, were 99.5% identical to each other. The sequence identities of subAB2-1 to the reference strain ED32 were in a range of 99.2-99.5% for the other subAB2-1 alleles.
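Percent identity values such as those above are obtained by counting matching columns in a pairwise alignment of equal length. The following Python sketch shows the calculation on invented toy sequences; the function name percent_identity is hypothetical, and the way gaps were treated in this study is not stated, so the sketch simply compares every aligned column.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical columns between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return 100.0 * matches / len(seq_a)

# Invented 10-nt toy alignment with one mismatch at the last position.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
print(toy_identity)

# Rough number of differing positions implied by 99.5% identity over 1483 bp.
# The reported identity is rounded, so this is only an approximation.
approx_differences = round(1483 * (100 - 99.5) / 100)
print(approx_differences)
```

On this basis, 99.5% identity over the 1483 bp compared corresponds to roughly seven differing positions between two alleles.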
The subAB2-2 genes of the OEP-locus of strain LM27564 showed 99.1% sequence identity to subAB2-1 of strain ED32, and only 89.9% to subAB1 of strain 98NK2 and 97.9% to the OEP-locus of E. coli strain 1.2264. The results of these sequence comparisons show that the sequences of the three alleles are conserved, but heterogeneity is present between the loci.

Figure 4 Sequence analysis and phylogenetic distribution of subA alleles from different genomic loci. Phylogenetic analyses were performed after sequencing and sequence analysis by the software Mega 5.1 using the UPGMA algorithm [28]. Results of the phylogenetic analysis demonstrate clustering of all subA genes according to their genomic loci with exception of subA2-2 of strain LM27564, which is located in the region between subA2-1 and subA2-2.

Discussion
The results of this study have shown that the 18 food-borne STEC strains that had previously been demonstrated to be subAB-positive by PCR [19] carry complete subAB open reading frames. Besides the plasmid locus, as originally described by Paton et al. [8], and the SE-PAI described by Michelacci et al. [16], a new chromosomal region, the OEP-locus, was present in six strains analyzed here and was demonstrated to harbor subAB2-2 operons.
It could be shown that all strains contained intact open reading frames for at least one subAB operon, and the codons specifying the amino acids constituting the catalytic triad were present in all cases (data not shown). From the sequence data obtained in our study, it can be concluded that all strains are able to produce functional SubAB subtilase cytotoxins.
The STEC strains analyzed in our study with subtilase-encoding plasmids did not carry chromosomal subAB genes and vice versa. Up to now we do not know whether this is a basic principle or whether it is only observed in our small strain collection. However, we cannot rule out that chromosome-encoded and plasmid-encoded subAB genes exclude each other or that recombination between plasmids and the chromosome in subAB-carrying strains is low. Phylogenetic analyses of the subA genes clearly differentiated three clusters, the plasmid-located one being the most homogeneous. The chromosomal clusters showed more genetic diversity, indicating a different phylogenetic history (Figure 4). These phylogenetic differences could reflect a different pathogenic potential and toxicity of subAB-positive strains for humans, as was shown for the different Shiga toxin variants [29,30]. Therefore, it could be important to analyze the enzymatic and toxic activity of the variants in different cell culture and animal models. Moreover, it could be important to analyze which subAB variants are associated with serious diseases and whether further variants exist that have not yet been described.
In a former study, it could be shown that the 18 strains used here carried gene fragments of the subtilase cytotoxin [19]. These strains were isolated from different food sources and showed a high serotype heterogeneity, demonstrating the wide spread of subAB in stx-positive E. coli. Genetic analysis of these strains demonstrated that the strains with chromosomally encoded subAB2 were all associated with deer meat, whereas the plasmid-encoded subAB1 could be found in strains from different sources. This association of the chromosomally encoded subAB2 variant with deer was also described in other studies [16,18,31] and suggests the possibility of small ruminants as a reservoir for subAB2-positive STEC.

Conclusions
The results of our analysis have confirmed that subAB should be further considered as a marker for virulence, especially in food-borne STEC strains. The occurrence of more than one subAB allele in particular strains is interesting and raises the question whether multiple gene acquisitions may bear a selective advantage for those strains. The fact that subtilase cytotoxin-producing Escherichia coli have not been frequently involved in outbreaks of human disease could be a hint for a function in other hosts such as small ruminants. Increased detection of subAB in such animals supports this assumption. However, cell culture and animal experiments have shown profound toxic effects on primary human epithelial cells [32]. Therefore, future studies are necessary to investigate the function and expression of the different subAB alleles in more detail.