Virulence gene profiling of enterohemorrhagic (EHEC) and enteropathogenic (EPEC) Escherichia coli strains: a basis for molecular risk assessment of typical and atypical EPEC strains

Background: Enterohaemorrhagic E. coli (EHEC) can cause severe disease such as bloody diarrhoea and haemolytic uraemic syndrome in humans. Besides production of Shiga toxins, the presence of LEE (eae-gene) and non-LEE (nle) encoded effector genes harboured on O-islands OI-122, OI-71 and OI-57 is associated with EHEC virulence and with the frequency of EHEC strains in outbreaks. Genes encoded by the EHEC-plasmid are putative virulence markers of EHEC. EHEC-plasmids, LEE and non-LEE effector genes have also been detected in some strains of enteropathogenic E. coli (EPEC). The objective of this study was to analyze the relationship between EHEC and EPEC with respect to virulence genes encoded by genomic O-islands and by the EHEC-plasmids.
Results: The nle genes ent/espL2, nleB and nleE (OI-122), nleA, nleF and nleH1-2 (OI-71), nleG5-2 and nleG6-2 (OI-57), espK (CP-933N) and the EHEC-plasmid encoded genes ehxA, espP, etpD and katP were searched for in 73 typical and 235 atypical EPEC strains. Typical and atypical EPEC each fell into two clusters. Cluster 1 typical (n = 46) and atypical (n = 129) EPEC strains were characterized by the presence of OI-122 encoded genes and grouped together with 64 investigated EHEC strains. Cluster 2 typical (n = 27) and atypical (n = 106) strains grouped together with 52 LEE-negative, Shiga toxin-producing E. coli (STEC) and with 21 apathogenic E. coli strains. Typical EPEC Cluster 1 strains belonged to serotypes frequently involved in severe illness and outbreaks in children (O111:H2, O114:H2, O55:H6, O127:H6 and O142:H6). Atypical EPEC Cluster 1 strains were characterized by serotypes related to EHEC (O26:H11, O55:H7, O145:H28, O103:H2 and O103:H25).
Conclusion: The OI-122 encoded nleB gene was found to be most closely associated with Cluster 1 strains and may serve as a diagnostic tool for the identification of virulent EHEC and EPEC seropathotypes. OI-71 encoded genes nleA, nleF and nleH1-2 are less associated with Cluster 1 strains. EHEC-plasmid, OI-57 and CP-933 associated genes showed only weak similarities with virulent Cluster 1 EHEC and EPEC strains.


Background
Escherichia coli strains that cause diarrhoea in humans have been divided into different pathotypes according to their virulence attributes and the mechanisms involved in the disease process [1,2]. Five major groups of intestinal pathogenic strains have been established, namely enteropathogenic E. coli (EPEC), enterohemorrhagic E. coli (EHEC), enteroaggregative E. coli (EAEC), enterotoxigenic E. coli (ETEC) and enteroinvasive E. coli (EIEC).
While EPEC is a major cause of infantile diarrhoea in the developing world, EHEC is associated with foodborne outbreaks in the developed world and can cause bloody diarrhoea, haemorrhagic colitis (HC) and the Haemolytic Uraemic Syndrome (HUS) due to the elaboration of Shiga toxin (Stx). More than 400 E. coli serotypes that produce Shiga toxins (STEC) have been described [3]. A small number of these have been shown to be implicated in severe disease such as HC and HUS in humans. A classification scheme has been established to group STEC strains into the five seropathotype groups A-E depending on the severity of disease, the incidence of human infections and the frequency of their involvement in outbreaks [4].
Strains belonging to the EPEC group have been subdivided into typical and atypical EPEC as these differ from each other in their adherence mechanisms to human epithelial cells [5] and in their evolutionary lineages [6]. Typical EPEC adhere in a localized manner mediated by bundle-forming pili that are encoded by EAF (EPEC adherence factor) type plasmids harboured by these strains [5,6]. Atypical EPEC do not carry EAF plasmids and most of these adhere in a localized adherence-like pattern to epithelial cells [5]. Some EPEC strains share similarities with certain EHEC strains in terms of their O:H serotypes, virulence genes and other phenotypic traits [5,7,8].
The chromosomally encoded locus of enterocyte effacement (LEE), which is present in both EPEC and EHEC strains, plays a major role in their pathogenesis. The LEE carries genes for the attaching and effacing phenotype promoting bacterial adhesion and the destruction of human intestinal enterocytes [2,7,9,10]. Besides LEE encoded genes, a large number of non-LEE effector genes have been found on prophages and on integrative elements in the chromosome of the typical EPEC strains B171-8 (O111:NM) [11] and 2348/69 (O127:H6) [12]. In a homology-based search, all non-LEE effector families found in the typical EPEC strains, except cif, were also present in the EHEC O157:H7 Sakai strain [11,12]. On the other hand, some strain specific effectors were only present in EHEC O157:H7 (EspK, EspX) and not in the EPEC strains. Moreover, EPEC O111 and O127 strains were different from each other regarding the presence of some effector genes (EspJ, EspM, EspO, EspV, EspW, NleD, OspB and EspR) [11,12].
It has been shown that EHEC O157:H7 has evolved stepwise from an atypical EPEC O55:H7 ancestor strain [13,14]. Atypical EPEC and EHEC strains of serotypes O26, O103, O111 and O145 have been found to be similar in virulence plasmid encoded genes, tir-genotypes, tccP genes, LEE and non-LEE encoded genes indicating that these are evolutionarily linked to each other [8,15-19]. The classification of these strains into the EPEC or the EHEC group is merely based on the absence or presence of genes encoding Shiga toxins (Stx) 1 and/or 2. In EHEC strains, stx-genes are typically harboured by transmissible lambdoid bacteriophages and the loss of stx-genes has been described to be frequent in the course of human infection with EHEC [20,21]. On the other hand, it has been demonstrated that stx-encoding bacteriophages can convert non-toxigenic O157 and other E. coli strains into EHEC [22,23].
A molecular risk assessment (MRA) concept has been developed to identify virulent EHEC strains on the basis of non-LEE effector gene typing [24] and a number of nle genes such as nleA, nleB, nleC, nleE, nleF, nleG2, nleG5, nleG6, nleH1-2 and ent/espL2 have been found to be significantly associated with EHEC strains causing HUS and outbreaks in humans [4,16,17,24].
We recently investigated 207 EHEC, STEC, EPEC and apathogenic E. coli strains for the presence of nle genes and EHEC virulence plasmid-associated genes. By statistical analysis, two clusters of strains were obtained. OI-122 encoded genes ent/espL2, nleB and nleE were most characteristic for Cluster 1, followed by OI-71 encoded genes nleH1-2, nleA and nleF. EHEC-plasmid encoded genes katP, etpD, ehxA, espP, saa and subA showed only medium to low influence on the formation of clusters. Cluster 1 was formed by all EHEC (n = 44) and by eight of twenty-one EPEC strains investigated, whereas Cluster 2 gathered all LEE-negative STEC (n = 111), apathogenic E. coli (n = 30) and the remaining thirteen EPEC strains [17]. These findings indicate that some EPEC strains share non-LEE encoded virulence properties with O157:H7 and other EHEC strains. Such EPEC strains could be derivatives of EHEC which have lost their stx-genes but could also serve as a reservoir for the generation of new EHEC strains by uptake of stx-phages [16,20,25,26].
To classify strains of the EPEC group according to their relationship to EHEC we have investigated 308 typical and atypical EPEC strains for the presence of nle-genes of O-islands OI-57, OI-71 and OI-122, as well as prophage and EHEC-plasmid-associated genes. OI-122 encoded genes were found to be significantly associated with atypical EPEC strains that showed close similarities to EHEC regarding their serotypes and other virulence traits. In typical EPEC, the presence of O-island 122 was significantly associated with strains which are frequently the cause of outbreaks and severe disease in humans.

Results
Cluster analysis of EHEC, EPEC, STEC and apathogenic E. coli strains
E. coli pathogroups were established as described in the Methods section. The frequencies and associations between virulence genes and E. coli pathogroups are presented in Table 1. The linkage of genes according to their respective PAI or the EHEC-plasmid was 94.7% (230/243) for OI-122, 41.8% (142/340) for OI-71, 46.2% (80/173) for OI-57 and 1.8% (4/220) for the EHEC-plasmid. As not all PAIs were found to be genetically conserved we decided to perform the cluster analysis on single genes. The results from the cluster analysis using thirteen virulence genes that were taken as cluster variables are presented in Table 2. The 445 strains belonging to 151 different serotypes divided into two clusters. Cluster 1 encompassed all 64 EHEC strains, as well as 46 (63%) of the typical and 129 (54.9%) of the atypical EPEC strains. The remaining 133 EPEC strains, as well as all STEC (n = 52) and apathogenic E. coli (n = 21) were grouped into Cluster 2. The distribution of PAIs and the EHEC-plasmid according to E. coli pathogroups is presented in Figure 1.
The influence of the different virulence genes on the formation of the "EHEC related" Cluster 1 was calculated using the similarity measure of "Rogers and Tanimoto" [27]. The results are presented in Table 3. The OI-122 encoded genes nleB, ent/espL2 and nleE were highly characteristic of Cluster 1 strains (similarity measure ≥ 0.947). The OI-71 encoded genes nleH1-2, nleA and nleF, as well as nleG6-2 (OI-57) and espK (CP-933N) were also found to be characteristic of Cluster 1 strains but to a lesser degree (similarity measure 0.511-0.684). The presence of the EHEC-plasmid pO157 associated genes and of nleG5-2 (OI-57) had a minor effect on the formation of Cluster 1 (similarity measure 0.382-0.445).
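As an illustration of the similarity measure, the following minimal Python sketch computes the Rogers and Tanimoto coefficient, (a + d) / (a + d + 2(b + c)), where a and d count the strains in which two binary profiles agree (both positive, both negative) and b and c count the strains in which they disagree. It assumes that the per-gene values compare a gene's presence/absence vector with Cluster 1 membership across strains. That reading, the function name and the eight-strain toy data are assumptions made for illustration; they are not the authors' software or data.

```python
def rogers_tanimoto(x, y):
    """Rogers-Tanimoto similarity between two equal-length binary profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    if len(x) == 0:
        raise ValueError("profiles must not be empty")
    # Matches cover both 1/1 and 0/0 positions; mismatches are weighted twice.
    matches = sum(1 for a, b in zip(x, y) if bool(a) == bool(b))
    mismatches = len(x) - matches
    return matches / (matches + 2 * mismatches)


# Hypothetical data for eight strains: 1 = gene present / Cluster 1 member.
in_cluster_1 = [1, 1, 1, 1, 0, 0, 0, 0]
gene_a = [1, 1, 1, 1, 0, 0, 0, 0]  # tracks cluster membership exactly
gene_b = [1, 1, 0, 0, 1, 0, 0, 0]  # disagrees with membership in three strains

print(rogers_tanimoto(gene_a, in_cluster_1))
print(rogers_tanimoto(gene_b, in_cluster_1))
```

Because mismatches are counted twice, a gene that disagrees with cluster membership in only a few strains is penalized more strongly than under a simple matching coefficient.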

Characteristics of typical EPEC belonging to Clusters 1 and 2
Forty-six (63%) of the 73 typical EPEC strains belonging to nine different serotypes were grouped into Cluster 1. Cluster 2 comprised 27 strains belonging to 12 serotypes (Table 2). Typical EPEC Cluster 1 strains were all positive for OI-122 encoded genes ent/espL2, nleB and nleE (similarity measure 1.0), as well as for nleH1-2 (OI-71) (similarity measure 0.678) (Table 4). These genes were absent in typical EPEC Cluster 2 strains, except for nleH1-2 (23.3% positive). All other genes that were investigated showed only low similarity (< 0.5) to Cluster 1 (Table 4).
Table notes: a) absolute (n) and relative frequencies (%) are shown with the exact 95% confidence interval (95%-CI) [48]; b) five strains have lost the EAF plasmid encoding bfpA upon subculture; c) standardized residuals > 1 indicate a major influence on a significant chi-square test.
The 73 typical EPEC strains encompassed nineteen different serotypes and one strain was O-rough (Tables 5 and 6). A serotype-specific association with Clusters 1 and 2 was observed. Except for EPEC O119:H6, strains belonging to classical EPEC serotypes such as O55:H6, O111:H2, O114:H2 and O127:H6 grouped in Cluster 1 (Table 5), whereas more rarely observed serotypes were predominant among Cluster 2 strains (Table 6). The single O111:H2 and the O126:H27 strain assigned to Cluster 2 were both negative for all OI-122 associated genes. All other 17 serotypes of typical EPEC were associated with only one cluster each.

Characteristics of atypical EPEC belonging to Clusters 1 and 2
A total of 235 atypical EPEC strains were investigated (Table 2). Of these, 129 (54.9%) grouped into Cluster 1. The presence of OI-122 associated genes had the greatest influence on the assignment of atypical EPEC strains to Cluster 1 (similarity measures 0.942-1.0, Table 7). By contrast, only four (3.8%) of the 106 atypical EPEC of Cluster 2 were positive for the OI-122 genes ent/espL2 (one O125:H6 strain) or nleE (one strain each of Ont:H52, O157:H39 and O168:H33), and none of the strains was positive for nleB.

Discussion
The concept of molecular risk assessment [24] has been successfully employed for grouping STEC strains into those that are associated with outbreaks and life-threatening disease in humans and those that cause less severe disease or are not implicated in human disease. The presence of non-LEE effector genes encoded by O-islands OI-122, OI-71 and OI-57 has been shown to be highly associated with EHEC strains that were frequently involved in outbreaks and severe disease in humans [4,16,17,24,28,29]. In a previous work, we were able to associate the presence of OI-122 and OI-71 encoded genes with an "EHEC-Cluster" comprising forty-four EHEC strains as well as eight of twenty-one EPEC strains investigated [17]. This finding indicates that some EPEC strains are more closely related to EHEC in their virulence patterns than others. In order to explore this relationship between EPEC and EHEC more closely, we investigated larger numbers of strains and serotypes of typical and atypical EPEC for thirteen virulence genes associated with EHEC O157 O-islands OI-122, OI-71, OI-57, the EHEC-plasmid and prophage CP-933N. Genes for nleG5-2 and nleG6-2 were included since OI-57 specific genes were previously found to be associated with classical EHEC and also with some EPEC strains [24,28]. The prophage CP-933 associated espK gene was included since its homologues were found in EHEC O157, O26, O103 and O111, in atypical EPEC O55:H7 but not in typical EPEC O127 and O111 strains [11,12,14,30,31].
Our findings indicate that about half of the typical and atypical EPEC strains and serotypes are closely related to EHEC regarding these virulence attributes (Table 2). The presence of OI-122 encoded genes, followed by that of OI-71 encoded genes, was most significant for the assignment of EPEC to the "EHEC-related" Cluster 1, confirming data from our previous study performed on a different collection of strains [17]. The OI-57 encoded genes nleG5-2 and nleG6-2, as well as the espK gene, were not as strongly associated with Cluster 1 as the OI-122 and OI-71 genes. Recently, the OI-57 associated genes adfO and ckf were reported to be present in 30 (71%) of 42 investigated EPEC strains but a high variability of OI-57 associated orfs in EPEC strains was observed [28]. This could explain the results of our study, where the OI-57 associated nleG5-2 gene was found infrequently in all EPEC, whereas the nleG6-2 gene was frequent in atypical EPEC (45.5%) but rarely found in typical EPEC (12.3%) (Table 1). Further work is needed to define the genes of OI-57 that are most suitable for the molecular risk assessment of EHEC and EPEC strains.
In our study, EHEC-plasmids were associated with EHEC, STEC and atypical EPEC, but not with typical EPEC strains. EHEC-plasmids are frequently harboured by classical EHEC but also by many LEE-negative STEC strains [32-34]. Correspondingly, EHEC-plasmid encoded genes ehxA, etpD, katP and espP had only a small influence on Cluster 1 formation, confirming results of previous studies [16,17]. In this study, EHEC-plasmid genes were significantly more associated with atypical EPEC Cluster 1 than with Cluster 2 strains. The high proportion of EHEC-plasmid positives among Cluster 1 strains suggests that many of these may have derived from EHEC by losing stx-genes. A loss of stx-genes was reported to occur frequently in classical EHEC strains [23,26]. EHEC-plasmid genes were found in 23/29 (79.3%) of atypical EPEC Cluster 1 strains belonging to EHEC related serotypes O26:H11, O103:H2, O145:H28 and O157:H7 (data not shown). These 29 EHEC-like strains showed the same virulence characteristics (presence of OI-122 genes) as their homologous EHEC strains. In addition to this, there are epidemiological findings pointing to a closer relationship between "Cluster 1" atypical EPEC and EHEC strains. Significantly (p < 0.05) more typable (78/120 = 65.0%) Cluster 1 strains than Cluster 2 strains belonged to serotypes (18/40 = 45.0%) that are associated with the production of Shiga toxins (38). Only 26.6% (24/90) of the atypical EPEC strains of Cluster 2 showed O:H types (10/46 = 21.7%) previously associated with Stx-production.
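The comparison of Stx-associated serotypes between the two atypical EPEC clusters can be re-checked from the strain counts given above (78/120 versus 24/90). The minimal Python sketch below runs a Pearson chi-square test on that 2 x 2 table and also reports standardized residuals, computed as (observed - expected) / sqrt(expected). Which test was used for this particular comparison is not stated, so the choice of an uncorrected chi-square test, the function name and the table layout are assumptions for illustration.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2 x 2 table.

    Returns the statistic, the p-value for 1 degree of freedom and the
    standardized residuals (observed - expected) / sqrt(expected).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            residual_row.append((observed - expected) / math.sqrt(expected))
        residuals.append(residual_row)
    # Survival function of the chi-square distribution; valid for df = 1 only.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, residuals


# Rows: Cluster 1, Cluster 2 (typable atypical EPEC strains).
# Columns: Stx-associated serotype, other serotype.
table = [[78, 120 - 78], [24, 90 - 24]]
chi2, p_value, residuals = chi_square_2x2(table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
print(residuals)
```

With these counts the p-value falls well below 0.05, in line with the significance stated in the text, and the residual for Cluster 1 strains with Stx-associated serotypes exceeds 1.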
Typical EPEC were also found to split into Cluster 1 and Cluster 2 strains. Cluster 1 was formed by typical EPEC strains of serotypes O55:H6, O114:H2, O111:[H2], O127:H6 and O142:H6, which accounted worldwide for large outbreaks in hospitals, infant wards and day nurseries with a high mortality rate [35-37]. Cluster 2 typical EPEC accounted for serotypes that were more rarely associated with outbreaks, except for EPEC O119:H6, which was frequently associated with infantile diarrhoea in Brazil [38,39]. On the basis of these findings, a seropathotype classification for typical EPEC similar to that described for STEC [4,24] can be established. Typical EPEC strains associated with outbreaks and high mortality are gathered in Cluster 1, which is mainly characterized by the presence of OI-122 associated genes ent/espL2, nleB and nleE. These findings are supported by two clinical studies showing that the presence of OI-122 encoded genes was significantly associated with diarrhoea in patients infected with atypical EPEC [40,41]. The function of nle-genes in pathogenesis of EHEC and EPEC infection is only partially known [30,42,43]. Further work is needed to explore the contribution of OI-122 effectors to the high infectivity and virulence of EPEC and EHEC strains resulting in outbreaks and severe disease in humans.
It has been shown previously that the evolution of typical and atypical EPEC has occurred from LEE positive ancestor strains and divergent phylogenetic groups of EPEC (EPEC1 to EPEC4) and EHEC (EHEC1 and EHEC2) were established [1,6,37]. Virulence genes harboured by EAF-plasmids, EHEC-plasmids and stx-phages were found in phylogenetically unrelated strains indicating that these were acquired several times during evolution [1]. Their horizontal spread to unrelated strains and the frequent loss of plasmid and bacteriophage inherited determinants make these less suitable for identifying clones associated with high infectivity and virulence in humans. The OI-122 inherited nle-genes were found to be significantly associated with highly virulent Cluster 1 strains of EHEC and EPEC. They appear to be more stably inherited than plasmid and phage associated genes and could thus serve as an additional diagnostic tool for the reliable identification of EHEC and EPEC infections in humans and animals and of EHEC contamination of food sources and the environment.

Conclusion
Our results indicate that the OI-122 pathogenicity island is a common attribute that is significantly associated with highly virulent EHEC and EPEC strains. Of the OI-122 encoded genes, nleB was found to be the most conserved and thus represents a suitable marker for genetic screening for human virulent EHEC and EPEC strains. Horizontally transferred genetic elements such as the virulence-plasmids and phages were less significantly associated with the highly virulent clones of EHEC and EPEC strains.

Methods

Bacteria
A total of 445 E. coli strains from the collection of the National Reference Laboratory for Escherichia coli (NRL-E.coli) were investigated. These originated from humans (n = 286), domestic animals (n = 84) and food (n = 70). Five strains were of unknown origin. The 445 strains were grouped into apathogenic E. coli (n = 21), atypical EPEC (n = 235), typical EPEC (n = 73), EHEC (n = 64) and STEC (n = 52) according to the presence or absence of genes encoding Stx (stx1 + stx2), intimin (eae) and bundle forming pili (bfpA). All strains were investigated for their O (lipopolysaccharide) and H (flagellar) serotypes. Non-motile strains were examined for their flagellar (fliC) genotype as previously described [44]. Highly purified total DNA of the strains was prepared from 0.5 ml overnight cultures of bacteria using the RTP® Bacteria DNA Mini Kit (Invitek, Berlin, Germany).
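The grouping by marker genes can be sketched as a small decision rule. The Python example below is an illustrative reading of the criteria, assuming that EHEC are stx-positive and eae-positive, STEC are stx-positive and eae-negative (LEE-negative), typical EPEC carry eae and bfpA without stx, atypical EPEC carry eae only, and apathogenic strains carry none of the three. The function name, argument names and the "unclassified" fallback are hypothetical.

```python
def classify_pathogroup(stx, eae, bfp_a):
    """Assign a pathogroup from the presence (True) or absence (False) of marker genes.

    stx:   at least one Shiga toxin gene (stx1 and/or stx2) detected
    eae:   intimin gene detected
    bfp_a: bundle-forming pilus gene bfpA detected
    """
    if stx and eae:
        return "EHEC"
    if stx:
        return "STEC"
    if eae and bfp_a:
        return "typical EPEC"
    if eae:
        return "atypical EPEC"
    if bfp_a:
        # bfpA without eae or stx is not covered by the five groups in the text.
        return "unclassified"
    return "apathogenic E. coli"


for profile in [(True, True, False), (True, False, False),
                (False, True, True), (False, True, False),
                (False, False, False)]:
    print(profile, "->", classify_pathogroup(*profile))
```

A rule based only on the current PCR result would place a typical EPEC strain that has lost its bfpA-encoding EAF plasmid upon subculture among the atypical EPEC, so such strains need to be assigned from their original characterization.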

Detection of genes by real-time PCR
To investigate the presence of seventeen genes previously described as virulence markers of STEC, EPEC and EHEC, the real-time PCR method was employed using the GeneDisc® array as previously described [17], or the Applied Biosystems 7500 real time PCR system. Standard cycling conditions (40 cycles of 15 sec at 94°C and 1 min at 60°C) were used for the Applied Biosystems 7500 system. The primers and probes for the detection of the following genes (stx1, stx2, eae, ehxA, espP, etpD, katP, nleA, nleF, nleH1-2, ent/espL2, nleB, nleE) have been described previously [16]. Primers and probes for the detection of bfpA, nleG5-2, nleG6-2 and espK were developed for this work (Table 10). The reference strains for STEC and EHEC were used as previously described [16]. Strain E2348/69 (O127:H6) [12] served as a control for typical EPEC and strain CB9615 (O55:H7) [14] as a control for atypical EPEC. E. coli K-12 strain MG1655 [45] served as a negative control for the seventeen virulence markers investigated in this work.

Statistical analysis
The seventeen virulence genes that were investigated in the 445 E. coli strains are listed in Table 1. To analyse the relationship between the seventeen virulence factors investigated in this work and the E. coli pathogroups, the frequency of each virulence factor was calculated per pathogroup (Table 1). Associations between the virulence factors and the E. coli pathogroups were analysed by univariate analysis with a chi-square test. If frequencies were low, Fisher's exact test was used for the calculation. The significance level α was set to 0.05, and all p-values ≤ α were considered statistically significant. To determine which virulence genes were major contributors to the rejection of the null hypothesis, we calculated standardized residuals. An absolute standardized residual greater than 1.00 was taken to indicate that the respective combination of pathogroup and virulence gene had a major influence on a significant chi-square test (Table 1).
A cluster analysis was performed in order to analyse similarities between the E. coli pathogroups. Since the presence or absence of virulence genes is binary scaled, the similarity was calculated according to "Rogers and Tanimoto" [27]. Linkage between groups was selected as the cluster method.
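The clustering step can be illustrated with a minimal pure-Python sketch. It assumes that "linkage between groups" refers to average linkage (the mean of all pairwise distances between members of two clusters) and that the distance is one minus the Rogers and Tanimoto similarity. The function names, the five-gene profiles and the stopping rule of two clusters are hypothetical and do not reproduce the statistical software or the data used in this work.

```python
from itertools import combinations


def rt_distance(x, y):
    """1 - Rogers-Tanimoto similarity for two equal-length, non-empty binary profiles."""
    matches = sum(1 for a, b in zip(x, y) if a == b)
    mismatches = len(x) - matches
    return 1.0 - matches / (matches + 2 * mismatches)


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering that repeatedly merges the two clusters with
    the smallest mean pairwise distance until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for (ia, a), (ib, b) in combinations(enumerate(clusters), 2):
            total = sum(rt_distance(profiles[i], profiles[j]) for i in a for j in b)
            mean_distance = total / (len(a) * len(b))
            if best is None or mean_distance < best[0]:
                best = (mean_distance, ia, ib)
        _, ia, ib = best
        # ia < ib always holds, so deleting ib leaves index ia valid.
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical presence/absence profiles for five genes in six strains.
profiles = [
    (1, 1, 1, 1, 1),
    (1, 1, 1, 1, 0),
    (1, 1, 1, 0, 0),
    (0, 0, 0, 1, 0),
    (0, 0, 0, 0, 1),
    (0, 0, 0, 0, 0),
]
print(average_linkage(profiles, n_clusters=2))
```

In this toy example the three strains carrying the first three genes end up in one cluster and the remaining strains in the other, mirroring how a conserved block of genes can dominate cluster formation.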