Novel multiplex real-time PCR assays reveal a high prevalence of diarrhoeagenic Escherichia coli pathotypes in healthy and diarrhoeal children in the south of Vietnam

Background Diarrhoeagenic Escherichia coli (DEC) infections are common in children in low-middle income countries (LMICs). However, detecting the various DEC pathotypes is complex as they cannot be differentiated by classical microbiology. We developed four multiplex real-time PCR assays to detect virulence markers of six DEC pathotypes; specificity was tested using DEC controls and other enteric pathogens. PCR amplicons from the six E. coli pathotypes were purified and amplified to be used to optimize PCR reactions and to calculate reproducibility. After validation, these assays were applied to clinical samples from healthy and diarrhoeal Vietnamese children and associated with clinical data. Results The multiplex real-time PCRs were found to be reproducible and specific. At least one DEC variant was detected in 34.7% (978/2815) of the faecal samples from diarrhoeal children; EAEC, EIEC and atypical EPEC were most frequent. Notably, 41.2% (205/498) of samples from non-diarrhoeal children were positive for a DEC pathotype. In this population, only EIEC, which was detected in 34.3% (99/289) of diarrhoeal samples vs. 0.8% (4/498) of non-diarrhoeal samples (p < 0.001), was significantly associated with diarrhoea. Multiplex real-time PCR, when applied to clinical samples, is an efficient and high-throughput approach to identifying DEC pathotypes. Conclusions This approach revealed high carriage rates of DEC pathotypes among Vietnamese children. We describe a novel diagnostic approach for DEC, which provides baseline data for future surveillance studies assessing DEC burden in LMICs.


Background
Diarrhoeal illness remains the second-highest cause of mortality and morbidity worldwide [1][2][3]; the main burden of this disease occurs in children in South Asia, Southeast Asia, and Africa [3]. Among the bacterial pathogens associated with diarrhoea in children, Escherichia coli is repeatedly the most common food-borne pathogenic species identified [3][4][5][6]. However, identifying diarrhoea-causing E. coli can be complex, as pathogenic variants cannot be delineated from commensal E. coli solely by microbiological culture.
For ETEC, heat-stable toxin producing strains (ST-ETEC) are among the most important pathogens associated with diarrhoea in children [9][10][11]. Similarly, typical EPEC (possessing both eae and bfp virulence genes) are more strongly associated with diarrhoea in children in developing regions than atypical EPEC strains, which lack bfp [6,12]. EIEC are virtually indistinguishable from Shigella spp., which are essentially an independent genus within the broader E. coli population. STEC are commonly associated with food-borne disease outbreaks in developed countries and have higher mortality than other E. coli pathotypes due to sequelae of haemolytic uraemic syndrome (HUS) [13][14][15]. The epidemiology of STEC in LMICs, particularly in children in Southeast Asia, is not well described.
The proportion of diarrhoeal disease associated with DEC in Vietnam is not well investigated, as measuring the prevalence of these pathogens in diarrhoeal cases and non-diarrhoeal controls is laborious and not routinely performed. Of the limited DEC studies conducted in Vietnam, an investigation originating in Hanoi detected DEC in 22% of stool samples from diarrhoeal cases and 12% of controls using conventional multiplex PCR [16]. Here, we aimed to develop a set of standardized multiplex real-time PCR assays to identify the various DEC in complex samples in a comparatively short turnaround time. To establish the multiplex real-time PCR assays to identify the six DEC pathotypes, we designed new or adapted existing specific primers and probes for nine DEC-associated genes. The real-time PCR assays were optimized and then used to determine the prevalence of DEC in children with and without diarrhoeal disease in Ho Chi Minh City (HCMC), Vietnam. Lastly, we combined these PCR data with available clinical data to identify clinical features in children infected with differing DEC pathotypes and to determine the potential effect of DEC in the stools of diseased and non-diseased children.

Results
Multiplex real-time PCR assay for detecting diarrhoeagenic Escherichia coli
We first validated PCR amplification for ETEC, EAEC, EIEC/Shigella, EPEC, and STEC in monoplex using cloned target sequences and then with genomic DNA extracted from the various E. coli pathovars. The sensitivity of the primer and probe sets was determined by generating a series of standard curves using 10-fold dilutions of control plasmid DNA. The limit of detection for all targets, including the uidA control, was five copies per reaction, with the exception of aggR, which could be detected down to 50 copies per reaction. Each primer and probe set was tested against a panel of commonly isolated pathogens found in stool samples, which included Staphylococcus aureus, Klebsiella pneumoniae, Salmonella spp., Campylobacter coli, Campylobacter jejuni, Shigella sonnei, Shigella flexneri, Enterobacter, Proteus, norovirus, and rotavirus (these viruses were selected as they are the most commonly found viruses in the stools of children with diarrhoea). No amplification was observed in any sample other than those containing E. coli.
Ultimately, the PCR assays were multiplexed into four reactions, and the sensitivity, intra-assay and inter-assay CVs across the nine target sequences were calculated for each multiplexed PCR reaction. The Ct values for each target were equivalent between the monoplex and multiplex reactions, confirming that multiplexing did not impact sensitivity. The intra-assay and inter-assay CVs ranged from 0.01 to 1.54% and from 0.01 to 2.12%, respectively (Table 1). The linear regression coefficients of the standard curves were between 0.992 and 0.999 for all targets tested. The resulting efficiency of the amplification ranged from 90.9 to 105.7%, demonstrating that the multiplex real-time PCR assays were well optimized, reproducible, and specific.
Table 1 notes: (a) intra-assay variation was calculated by measuring the coefficient of variation of the Ct value across three concurrently run assays; (b) inter-assay variation was calculated by comparing variation in Ct value across three independently run assays.
Within the EPEC pathotype, atypical EPEC positive samples (eae positive, bfpA negative) were more prevalent than typical EPEC positive samples (eae positive, bfpA positive); 93.9% (322/343) vs. 6.1% (21/343), respectively. ETEC was detected in 6% (182/2815) of samples, with only a limited number of these samples (8.2%; 15/182) producing an amplicon for heat-stable toxin (estA). Samples from four diarrhoeal patients contained the Shiga toxin genes (stx1/stx2). Among the four cases associated with an STEC positive sample, one was positive for eae and one was positive for both eae and rfbE_O157. Of the two STEC cases that were amplification positive for eae and rfbE_O157, one was additionally positive for eltB (ETEC) and the other was positive for aggR (EAEC).

Clinical manifestations of diarrhoeagenic Escherichia coli mono-infection
To investigate clinical syndromes associated with the various DEC in Vietnam, clinical data associated with the patients were accessed and compared between pathotype groups. Generally, we found that infections associated with DEC positive samples were uncomplicated; > 90% of patients had improved or recovered after 3 days and their median hospital stay was 5 days [IQR 3-7 days]. The use of antimicrobials within this study population was high, with 81.3% (1513/1861) of patients receiving empirical antimicrobial treatment prior to any diagnostic testing, which may affect the detection of various DEC, depending on their susceptibility profile. Fluoroquinolones, specifically ciprofloxacin, were the most commonly used class of antimicrobials in those with a DEC in their stool (957/1512, 63.3%).
Diarrhoeagenic Escherichia coli from faecal specimens of diarrhoeal hospitalized children vs. healthy non-diarrhoeal children
Between March 2016 and August 2016, 498 MC sweeps were additionally collected from faecal samples taken from healthy children residing in HCMC and participating in a cohort study [17]. The majority of healthy children were male (269/498; 54.0%), with their age when sampled ranging from 24 months to 5 years (median age 46.4 months, IQR 35.6-52.5 months). In a comparable manner to the diarrhoeal samples, we screened the MC extractions from these healthy children with the multiplex real-time PCRs to detect DEC. At least one pathotype of DEC was detected in 41.2% (205/498) of samples associated with non-diarrhoeal children (Table 3).
To determine the prevalence and distribution of the various DEC in healthy and diarrhoeal children, we compared the data from the healthy children with a subset of the data from matched children in the diarrhoeal study who were between 2 and 5 years old (319 children; median age 31.5 months, IQR 26.7-38.9 months). The prevalence of ETEC, EAEC, and EHEC_O157 in faecal samples was not significantly different between children with or without diarrhoea (Table 3, Fig. 2). Furthermore, EPEC was detected significantly
The distribution of DEC co-infection among the cases and the controls was complex and highly variable (Fig. 2). The most common co-infections in the diarrhoeal group were EAEC + EIEC/Shigella (3.8%, 12/319) and EAEC + EIEC/Shigella + ETEC (2.2%, 7/319), whereas EPEC + EAEC (3.4%, 17/498) was more common in the healthy control group. Co-infection with more than one DEC was more frequent in children with diarrhoeal disease than in healthy controls (16.3%, 52/319 vs. 9.6%, 48/498, p = 0.005, χ² test). However, due to the predominant presence of EIEC/Shigella in the diarrhoeal group, EIEC/Shigella infection was a potential confounder.
To disaggregate the potential confounding effect of EIEC/Shigella, we performed binary univariate and multivariate logistic regression to identify variables and DEC that were associated with diarrhoeal disease in children aged 24-60 months (Table 4). In the univariate model, co-infection with ETEC, mono-infection with EIEC/Shigella, co-infection with EIEC/Shigella, and co-infection without EPEC, EHEC_O157, and STEC were significantly associated with diarrhoea. However, after controlling for confounders, only mono- or co-infection with EIEC/Shigella and wasting were determined to be significantly associated with diarrhoea. Conversely, mono-infection with ETEC or EAEC, and obesity, were significantly more common in the non-diarrhoeal children.

Discussion
Here, we developed and applied an efficient and robust collection of real-time PCR assays for identifying DEC in MC sweeps isolated from stool samples from a collection of healthy and diarrhoeal children. This approach, in comparison to the traditional method, is straightforward, cost-effective and has a comparatively short turn-around time [18]. Ultimately, the four multiplex real-time PCR assays could detect ten target sequences corresponding with six pathotypes of DEC, which permitted detection of these pathogens with a high degree of accuracy and utility. However, there are some limitations with our approach. Due to their high genetic similarity, we are unable to differentiate between EIEC and Shigella spp. by using real-time PCR, as the invasion plasmid antigen H gene (ipaH) and uidA (the internal control gene for E. coli) are present in both [19]. Further limitations of this approach are associated with how pathotypes such as EPEC, EHEC, and STEC are defined. Through bacterial genomics, we know that organisms lacking either eae or stx or both may still belong to the EHEC group [20,21]. In addition, the stx genes have been found in other pathotypes of E. coli [22,23]. Therefore, it is impossible to definitively assign an E. coli to a DEC pathotype without genome sequencing. However, pathotyping DEC through detecting virulence genes remains useful for assessing the potential prevalence of the various pathogenic forms of E. coli in any given population. In addition to the methodological constraints of the study, as our control samples came from healthy children over 12 months of age, we could not evaluate associations with diarrhoea in children under 1 year of age, and we recognise that co-infection with organisms that were not detected may affect disease presentation.
While ETEC is the most common DEC internationally, the prevalence of ETEC in this setting was found to be considerably lower than in other regions [9,18]. This result is probably due to the study inclusion criteria, as only children presenting with bloody and/or mucoid diarrhoeal illness were enrolled, whereas ETEC is most commonly associated with watery diarrhoea [7]. Here, LT-ETEC were more prevalent than ST-ETEC, which is consistent with earlier studies on ETEC infections in children. However, in these previous studies the association between LT-ETEC infection and diarrhoea was weak [6,9,18]. In contrast, in the Global Enteric Multicentre Study (GEMS), ST-ETEC but not LT-ETEC was identified as a major cause of diarrhoea in all age groups [24]. To determine whether ST-ETEC is an important pathogen in Vietnam, it will be necessary to carry out additional studies focusing on children presenting with watery diarrhoea.
EAEC was the most commonly detected pathotype in children with diarrhoea in this study, which is again consistent with earlier studies that reported high detection rates of EAEC compared to other DEC in Vietnam [16,25]. Several articles have raised the possibility that not all EAEC are pathogenic, and that variants within this group may have different propensities to cause disease [26][27][28][29][30][31]. However, several outbreaks and human volunteer studies have unequivocally shown that some EAEC can cause disease [26][27][28][29][30][31]. Here, one third of EAEC mono-infections required intravenous antimicrobial treatment (i.e. third generation cephalosporins or imipenem; data not shown), which was associated with a more severe disease presentation. Notably, samples from three children in this study generated positive PCR amplicons for both EAEC and stx. These cases may represent mixed infections of EAEC and STEC, or potentially hybrid organisms, such as those associated with an extensive outbreak in Europe in 2011 [23]. Although EAEC was not associated with diarrhoea in children within the 24-60-month age group in this study, it was the most commonly detected pathotype in children with wasting. This observation is consistent with the findings of the recent MAL-ED study, which reported that EAEC infection is associated with growth shortfall, irrespective of disease [32].
EPEC was the most common DEC gene target amplified from faecal samples of diarrhoeal and non-diarrhoeal children. The overwhelming majority of the amplicons generated from both healthy and diseased cohorts were associated with atypical EPEC. These data are again consistent with the EPEC literature, which suggests that typical EPEC is commonly identified in the African continent [18,33], while atypical EPEC tends to predominate in other regions [34]. A case-control study conducted in seven LMICs found that typical EPEC infections were significantly associated with mortality in children under 5 years [6]. The high prevalence of atypical EPEC positive samples in our study group (24-60 months of age) may be partially associated with colonization in the first year of life, as asymptomatic infection with ETEC, EAEC, and EPEC has previously been associated with weaning and the termination of breastfeeding [35].
STEC O157 cause severe diarrhoea and are associated with a high mortality rate in food-borne outbreaks in western countries [36]. EHEC_O157 in this setting had a low prevalence and more than half the positive samples were positive for the rfbE_O157 gene alone, which suggests these are likely to be of lower pathogenicity. Only two samples that tested positive for EHEC_O157 also produced amplicons for the Shiga toxin gene (stx2), suggesting that O157-STEC is not a significant cause of gastrointestinal symptoms in this location. In the age-matched comparison, STEC were detected in children in the healthy group only. This observation is consistent with data originating in Indonesia, where STEC was detected significantly more frequently in non-diarrhoeal children [37].
In previous studies, co-infection with more than one DEC (or with other enteric pathogens) was found to be significantly associated with diarrhoea [26][38][39][40][41]. In this study, we found that co-infection with DEC was not associated with diarrhoea and was also common in healthy children. Notably, only co-infection with EIEC/Shigella was significantly associated with diarrhoeal disease. However, as EIEC/Shigella infection alone was highly significantly associated with diarrhoeal illness, the contribution of other DEC to disease in EIEC/Shigella infection is unclear. In a multivariate logistic regression model, DEC co-infection in the absence of EIEC/Shigella was not associated with diarrhoea. This suggests that EIEC/Shigella is the most important cause of DEC-mediated moderate-to-severe diarrhoea in this setting.

Conclusions
Multiplex real-time PCR is an efficient method for detecting the six major pathotypes of DEC in a collection of clinical samples. This new methodology provides a useful alternative to classical microbiology for large-scale microbiological and epidemiological studies. Using this approach, we found a high prevalence of DEC in the stools of both healthy and diarrhoeal children in Vietnam. EAEC and atypical EPEC were the most commonly detected DEC in both groups; whereas, EIEC/Shigella was the only DEC significantly associated with diarrhoeal disease. This study provides new methodology and baseline data for further clinical, epidemiological, and genomic studies in Vietnam and across Southeast Asia and shows that DEC are highly prevalent but not generally associated with diarrhoeal disease in Vietnam.

Study design
Children aged ≤15 years with diarrhoeal illness admitted to one of the three collaborating tertiary hospitals in HCMC, Vietnam from May 2014 to April 2016 were eligible for enrolment. Diarrhoeal illness (cases) was defined as ≥3 passages of loose stools within a 24-h period along with at least one loose stool containing blood and/or mucus [42]. We excluded children if they had suspected or confirmed intussusception at the time of enrolment [43]. Controls were healthy children aged 12-60 months enrolled in a diarrhoeal disease cohort in District 8 in HCMC from 2014 to 2016 [17]. The enrolled children attended HVH for a routine health check every six months. An anal swab was collected from each healthy child by study nurses at these routine visits.

Primer and probe design
The selected target genes for each pathotype were: ETEC, eltB (heat-labile toxin) and/or estA (heat-stable toxin); EAEC, aggR (transcriptional regulator gene [44]); EIEC, ipaH (secreted protein encoded on pINV [19]); EPEC, eae (encoding the adhesin intimin [7]) and bfpA (encoding a structural component of the bundle-forming pilus [45]); STEC, stx1 and/or stx2 (Shiga toxins [7]); and rfbE_O157 (encoding the lipopolysaccharide O157 antigen, the most common STEC serogroup in regions where surveillance data are available). The uidA gene, which encodes beta-glucuronidase and is present in all E. coli, was used as an internal control to monitor both DNA extraction and PCR amplification. A flowchart of the combined assay strategy is shown in Fig. 1.
We classified the DEC amplification results using the following approach: ETEC positive samples were divided into LT-ETEC (eltB positive only), ST-ETEC (estA positive only), and LT-ST-ETEC (eltB and estA positive). Amplification of aggR was sufficient for classification as EAEC, and a positive amplification for ipaH identified EIEC/Shigella. EPEC positive samples were divided into typical EPEC (carrying both eae and bfpA) and atypical EPEC (the presence of eae only). The STEC pathotype was identified by the presence of stx1 and/or stx2, and the presence or absence of rfbE_O157 was used to differentiate between STEC O157 and the non-O157 STEC serogroups. STEC that have the potential to cause HUS carry additional virulence genes, specifically eae and aggR.
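The classification rules above can be expressed as a short decision function. The following is a minimal sketch in Python, not the authors' analysis code: the function name `classify_dec` and the dictionary input are hypothetical, and the sketch assumes that a sample carrying markers for several pathotypes simply receives several labels. Marker combinations that the rules do not define (for example bfpA or rfbE_O157 without any other marker) are left unlabelled here.

```python
from typing import Dict, List


def classify_dec(positive: Dict[str, bool]) -> List[str]:
    """Map per-gene amplification calls to DEC pathotype labels.

    `positive` maps a gene name (e.g. "eltB") to True if it amplified.
    Genes absent from the dictionary are treated as negative.
    """

    def has(gene: str) -> bool:
        return positive.get(gene, False)

    labels: List[str] = []

    # ETEC: heat-labile toxin (eltB) and/or heat-stable toxin (estA).
    if has("eltB") and has("estA"):
        labels.append("LT-ST-ETEC")
    elif has("eltB"):
        labels.append("LT-ETEC")
    elif has("estA"):
        labels.append("ST-ETEC")

    # EAEC and EIEC/Shigella are each defined by a single marker.
    if has("aggR"):
        labels.append("EAEC")
    if has("ipaH"):
        labels.append("EIEC/Shigella")

    # EPEC: eae with bfpA is typical, eae alone is atypical.
    if has("eae"):
        labels.append("typical EPEC" if has("bfpA") else "atypical EPEC")

    # STEC: stx1 and/or stx2; rfbE_O157 separates O157 from non-O157.
    if has("stx1") or has("stx2"):
        labels.append("STEC O157" if has("rfbE_O157") else "non-O157 STEC")

    return labels


if __name__ == "__main__":
    print(classify_dec({"eae": True, "aggR": True}))
```

A sample positive for both eae and stx would be labelled as both EPEC and STEC by this sketch; whether such samples should be counted once or twice is a design choice that the text does not specify.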

Isolation of nucleic acids and construction of control plasmids
Nucleic acids were purified from prototypic E. coli strains and a variety of other gastrointestinal pathogens using the Wizard Genomic DNA Purification Kit (Promega). PCR amplicons were generated for each of the 11 target genes and ligated into pCR™ 2.1-TOPO® (Invitrogen, Applied Biosystem, UK). Purified plasmids were used as template to optimize PCR reactions and measure assay reproducibility. Plasmid concentrations (ng/μl) were quantified using a Nanodrop spectrophotometer (Thermo-Scientific, UK), and converted to copy number using the URI Genomics and Sequencing Center online tool (http://cels.uri.edu/gsc/cndna.html).
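For readers without access to the online tool, the conversion from plasmid concentration to copy number can be sketched with the widely used formula copies = (mass in g × Avogadro's number) / (length in bp × 650 g/mol per bp). The Python sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA; whether the online tool applies exactly this constant is an assumption, and the function name, the 10 ng/μl concentration and the 4000 bp construct length in the usage example are hypothetical values chosen for illustration.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
MASS_PER_BP = 650.0           # g/mol per base pair of dsDNA (assumed average)


def copies_per_ul(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA concentration (ng/ul) to copies per microlitre."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    grams_per_ul = conc_ng_per_ul * 1e-9          # ng -> g
    moles_per_ul = grams_per_ul / (length_bp * MASS_PER_BP)
    return moles_per_ul * AVOGADRO


def tenfold_series(start_copies: float, steps: int) -> list:
    """Copy numbers along a 10-fold dilution series, highest first."""
    return [start_copies / (10 ** i) for i in range(steps)]


if __name__ == "__main__":
    # Hypothetical example: 10 ng/ul of a 4000 bp plasmid construct.
    stock = copies_per_ul(10.0, 4000)
    print(f"{stock:.3e} copies/ul")
    print(tenfold_series(1e6, 6))
```

The length passed to `copies_per_ul` should be the full construct (vector plus insert), since the whole plasmid contributes to the measured mass.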

Real-time PCR
Multiplex real-time PCR reactions were performed in a 25 μl reaction mixture containing a final concentration of 1X buffer, 0.2 mM deoxynucleoside triphosphates (dNTPs), 3.5 mM of MgCl2, 0.2 μM of each forward and reverse primer, 0.08 μM of each probe and 1 U of Hotstart Taq polymerase (QIAGEN, Germany). Five μl of DNA template was used for each PCR reaction. The real-time PCR cycling conditions were as follows: 95°C for 15 min, followed by 45 cycles of 95°C for 15 s, then 60°C for 60 s, using the Light Cycler 480 II system.

Reproducibility and linearity analysis
The precision and reproducibility of the real-time PCR assays were assessed using the coefficient of variation (CV%), calculated by dividing the standard deviation of the Ct values by the mean Ct value for each selected concentration. The Ct values of three replicates assayed simultaneously were compared to measure intra-assay reproducibility. The inter-assay reproducibility was calculated from data generated on three separate days. Linearity was determined by linear regression, using Ct values produced from 10-fold dilutions of control plasmid DNA.
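The calculations described above can be illustrated with a short Python sketch. It is not the authors' code. It assumes the sample standard deviation (n − 1 denominator) for the CV, ordinary least squares for the standard curve, and the conventional relationship between slope and amplification efficiency, E = 10^(−1/slope) − 1; the text does not state which variants were used. All function names and Ct values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def cv_percent(ct_values: Sequence[float]) -> float:
    """Coefficient of variation (%) of replicate Ct values."""
    return 100.0 * stdev(ct_values) / mean(ct_values)


def standard_curve(log10_copies: Sequence[float],
                   ct_values: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of Ct against log10(copies).

    Returns (slope, intercept, r_squared).
    """
    mx, my = mean(log10_copies), mean(ct_values)
    sxx = sum((x - mx) ** 2 for x in log10_copies)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(log10_copies, ct_values))
    ss_tot = sum((y - my) ** 2 for y in ct_values)
    return slope, intercept, 1.0 - ss_res / ss_tot


def efficiency_percent(slope: float) -> float:
    """Amplification efficiency (%) from the standard-curve slope."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical triplicate Ct values at one plasmid concentration.
    print(f"CV = {cv_percent([20.0, 20.1, 19.9]):.2f}%")
    # Hypothetical 10-fold dilution series (log10 copies 1..6).
    logs = [1, 2, 3, 4, 5, 6]
    cts = [36.1, 32.8, 29.4, 26.0, 22.7, 19.3]
    slope, intercept, r2 = standard_curve(logs, cts)
    print(f"slope = {slope:.3f}, R^2 = {r2:.4f}, "
          f"efficiency = {efficiency_percent(slope):.1f}%")
```

Under this convention a slope of about −3.32 corresponds to perfect doubling per cycle (100% efficiency), which is why efficiencies between roughly 90 and 110% are usually taken to indicate a well-optimized assay.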

Specimen culture and storage
Diarrhoeal faecal specimens were collected in sterile containers and transported to the laboratory within 24 h [43]. Anal swabs from non-diarrhoeal children were also transported to the laboratory within 24 h for processing. Specimens were inoculated onto MacConkey agar (MC, Oxoid), and incubated at 37°C for 18-24 h [43]. Following incubation, a sweep of colonies was taken from the entire MC agar plate and suspended in 20% glycerol in Brain Heart Infusion (BHI) broth, before being stored at −80°C.

Crude DNA extraction
Eighty μl of the stored colony sweep suspension was centrifuged at 4000 rpm for 10 min, and the pellet was resuspended in 80 μl of molecular grade water (Sigma). The resulting suspension was mixed by gently pipetting up and down, before being boiled at 96°C for 10 min and cooled to room temperature. The lysate was centrifuged at 4000 rpm for 10 min to remove cellular debris, and 5 μl of supernatant was subjected to the real-time PCR assays.

Data collection and statistical analysis
Data were exported into Microsoft Excel (Microsoft, USA), and analysed using Stata v11 (StataCorp, College Station TX, USA). Descriptive comparisons between groups were conducted using non-parametric tests, including the χ² test or Fisher's exact test for categorical variables and the Kruskal-Wallis test for continuous data. Growth status of participating patients was assessed using the WHO global database on growth and nutrition and the Prevention and Management of Obesity for Children and Adolescents healthcare guidelines, using the macro package for Stata v11 developed by WHO [50,51]. Due to the age difference between the diarrhoeal and non-diarrhoeal groups, the comparative analyses were performed between all the children in the healthy group and the subset of children in the diarrhoeal group that were aged 24-60 months. Logistic regression was performed to determine the associations with diarrhoea, with each type of infection considered as an independent variable. Infections were classified as mono-infection with each pathotype of DEC or co-infection with each specific pathotype and other pathotypes. Because a sample with multiple pathotypes contributed to more than one co-infection category, the threshold for significance in the univariate model was set at p < 0.01. Multivariable logistic regression models incorporated mono-infections, each specific type of co-infection, gender and growth status, with diarrhoea and non-diarrhoea as binary outcomes. For the latter, a p-value of < 0.05 was considered significant. The figure for mixed infections (Fig. 2) was generated using the UpSetR package and restructured manually to generate the side-by-side bar graphs for comparing two groups [52].
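As an illustration of the categorical comparisons, the co-infection counts reported in the Results (52/319 diarrhoeal vs. 48/498 healthy children) can be tested with a Pearson χ² test on a 2 × 2 table. The Python sketch below is not the Stata analysis used in the study. It assumes no continuity correction, and it obtains the p-value for one degree of freedom from the complementary error function; the function name is hypothetical.

```python
import math
from typing import Tuple


def chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (chi2 statistic, two-sided p-value with 1 degree of freedom).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Rows: diarrhoeal, healthy; columns: co-infected, not co-infected.
    chi2, p = chi2_2x2(52, 319 - 52, 48, 498 - 48)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With these counts the statistic is about 8.0 and the p-value rounds to 0.005, in line with the value reported for this comparison.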