Internalin profiling and multilocus sequence typing suggest four Listeria innocua subgroups with different evolutionary distances from Listeria monocytogenes

Background Ecological, biochemical and genetic resemblance as well as clear differences of virulence between L. monocytogenes and L. innocua make this bacterial clade attractive as a model to examine evolution of pathogenicity. This study attempted to examine the population structure of L. innocua and the microevolution in the L. innocua-L. monocytogenes clade via profiling of 37 internalin genes and multilocus sequence typing based on the sequences of 9 unlinked genes gyrB, sigB, dapE, hisJ, ribC, purM, gap, tuf and betL. Results L. innocua was genetically monophyletic compared to L. monocytogenes, and comprised four subgroups. Subgroups A and B correlated with internalin types 1 and 3 (except the strain 0063 belonging to subgroup C) and internalin types 2 and 4 respectively. The majority of L. innocua strains belonged to these two subgroups. Subgroup A harbored a whole set of L. monocytogenes-L. innocua common and L. innocua-specific internalin genes, and displayed higher recombination rates than those of subgroup B, including the relative frequency of occurrence of recombination versus mutation (ρ/θ) and the relative effect of recombination versus point mutation (r/m). Subgroup A also exhibited a significantly smaller exterior/interior branch length ratio than expected under the coalescent model, suggesting a recent expansion of its population size. The phylogram based on the analysis with correction for recombination revealed that the time to the most recent common ancestor (TMRCA) of L. innocua subgroups A and B was similar. Additionally, subgroup D, which correlated with internalin type 5, branched off from the other three subgroups. All L. innocua strains lacked seventeen virulence genes found in L. monocytogenes (except for the subgroup D strain L43 harboring inlJ and two subgroup B strains bearing bsh) and were nonpathogenic to mice. Conclusions L. innocua represents a young species descending from L. monocytogenes and comprises four subgroups: two major subgroups A and B, and one atypical subgroup D serving as a link between L. monocytogenes and L. innocua in the evolutionary chain. Although subgroups A and B appeared at approximately the same time, subgroup A seems to have experienced a recent expansion of the population size with higher recombination frequency and effect than those of subgroup B, and might represent the possible evolutionary direction towards adaptation to environments. The evolutionary history in the L. monocytogenes-L. innocua clade represents a rare example of evolution towards reduced virulence of pathogens.

Intriguingly, some L. monocytogenes strains tend to lose virulence factors that play critical roles in infection, which has been considered as a rare example of evolution towards reduced virulence of pathogens [4,10]. Certain L. monocytogenes lineage IIIA strains are presumed to have identifiable linkage between L. monocytogenes and L. innocua by possessing many genes common to L. monocytogenes [e.g., Listeria pathogenicity island I (LIPI-1), inlAB locus, bsh and hpt], and sharing many gene deletions similar to L. innocua (e.g., inlC, inlI, inlJ, internalin cluster between ascB and dapE, and arginine deiminase island lmo0036-lmo0041) [11][12][13]. Therefore, the population structure and biodiversity in L. innocua may, from the other side of the coin, provide clues for the evolutionary history in the L. monocytogenes-L. innocua clade. Unfortunately, comprehensive knowledge on the phylogenetic structure of L. innocua is still lacking.
Various strain typing methods have been developed and improved with a general shift from phenotype-based to genotype-based strategies [14]. Given its accuracy, reproducibility and increased speed of DNA sequencing, DNA sequence-based multilocus sequence typing (MLST) has gained more popularity [15], and provided an overview of the population structure of L. monocytogenes [4,16]. Internalin profiling seems to be instrumental to subtype L. monocytogenes strains into different serovars [17]. Moreover, internalin loci are also present in non-pathogenic species, including L. innocua, and seem to play broad roles not merely limited to attachment and invasion of host cells [18][19][20].
In this study, we attempted to delineate a phylogenetic framework based on internalin profiling and MLST analysis of a collection of L. innocua isolates from various food sources, and further to investigate microevolution in the L. innocua-L. monocytogenes clade.

Biochemical patterns of L. innocua and L. monocytogenes strains
All L. innocua and L. monocytogenes strains displayed similar utilization patterns for xylose (negative), mannitol (negative) and glucose (positive), while hemolysis could distinguish these two species with L. monocytogenes showing β-hemolysis and all L. innocua strains being non-hemolytic. With regard to the rhamnose utilization pattern, three L. innocua strains 386, L19 and L103 (3/34, 8.8%) as well as L. monocytogenes sublineages IIIB and IIIC strains covering serovars 4a, 4b and 4c were atypically negative for rhamnose fermentation (Table 1).

MLST correlates with internalin profiling of L. innocua strains
Sixty-four strains in the L. monocytogenes-L. innocua clade were classified into 61 unique sequence types (ST) in the MLST scheme with a high discrimination index (DI = 0.99, 0.76 to 0.98 per gene). The concatenated sequence data showed that L. innocua was genetically monophyletic as compared to L. monocytogenes, with 34 L. innocua and 30 L. monocytogenes strains bearing 391 (6.69%) and 820 (14.03%) polymorphisms respectively. The average nucleotide diversity π of L. innocua was lower than that of L. monocytogenes (1.06% vs 4.38%). However, the nonsynonymous/synonymous mutation rate of L. innocua was higher than that of L. monocytogenes (0.0865 vs 0.0500) (Table 3).
With L. welshimeri as the outgroup species, the phylogenetic tree revealed nine major branches of the L. monocytogenes-L. innocua clade: four corresponding to the recognized L. monocytogenes lineages I, II, IIIA/C and IIIB, one harboring the low-virulent L. monocytogenes lineage IIIA strains reported in our previous study [11], and the other four belonging to L. innocua (Figure 1). The majority of L. innocua strains were placed in two branches: one contained 19 strains (55.9%) representing STs 1, 4, 5, 7, 9-17, 21-23, 25 and 31, and the other harbored 13 strains (38.2%) representing STs 2, 3, 6, 8, 18-20, 24, 26 and 28-30. Remarkably, L. innocua strain L43 (ST27) showed the least genetic distance to the main cluster of L. monocytogenes. Together with the low-virulent L. monocytogenes lineage IIIA strain 54006, this strain seems to serve as an evolutionary intermediate between the L. monocytogenes and L. innocua main clusters. Additionally, L. innocua strain 0063 (ST6) was positioned halfway between the L. innocua main cluster and strain L43 (Figure 1).
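The bookkeeping behind allele and ST counts of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function name `assign_sequence_types` and the toy sequences are hypothetical, and the numbers it issues follow order of first appearance, so they would not reproduce the ST labels shown in Figure 1.

```python
# Minimal sketch of MLST numbering: one allele number per distinct sequence
# at each locus, one ST number per distinct combination of allele numbers.
LOCI = ["gyrB", "sigB", "dapE", "hisJ", "ribC", "purM", "gap", "tuf", "betL"]


def assign_sequence_types(strains):
    """Map strain name -> (ST number, allele profile).

    `strains` maps a strain name to a dict of locus -> nucleotide sequence.
    Numbers are assigned in order of first appearance (an assumption made
    for this sketch; real schemes use curated allele databases).
    """
    allele_ids = {locus: {} for locus in LOCI}  # per-locus sequence -> number
    st_ids = {}                                 # allele profile -> ST number
    result = {}
    for name, seqs in strains.items():
        profile = []
        for locus in LOCI:
            seq = seqs[locus].upper()
            table = allele_ids[locus]
            if seq not in table:
                table[seq] = len(table) + 1
            profile.append(table[seq])
        profile = tuple(profile)
        if profile not in st_ids:
            st_ids[profile] = len(st_ids) + 1
        result[name] = (st_ids[profile], profile)
    return result


# Toy input (hypothetical sequences, far shorter than real MLST fragments).
base = {locus: "ACGT" for locus in LOCI}
toy_strains = {
    "strain_1": dict(base),
    "strain_2": dict(base),                # identical to strain_1
    "strain_3": {**base, "tuf": "ACGA"},   # differs at one locus only
}
toy_result = assign_sequence_types(toy_strains)
for strain, (st, profile) in toy_result.items():
    print(strain, "ST", st, profile)
```

A single nucleotide difference at any one of the nine loci is enough to create a new ST, which is why 64 strains can yield as many as 61 STs.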
Based on the MLST scheme and internalin profiling, L. innocua could be divided into at least four subgroups. The two main subgroups A and B were located in the two major branches of the phylogenetic tree, and correlated with IT1 and IT3 (except strain 0063) (19/34, 55.9%), and IT2 and IT4 (13/34, 38.2%) respectively. In addition, one IT3 strain 0063 and one IT5 strain L43, present in two individual branches, formed subgroups C and D respectively (Table 2).

Phylogeny and population history of L. innocua
As mentioned above, L. innocua was genetically monophyletic (π = 1.06%) as compared to L. monocytogenes (π = 4.38%). When sequence data were analyzed after stratification by subgroups, the number of polymorphisms and the genetic diversity within each subpopulation were reduced (Table 3), suggesting a barrier for genetic exchange between these L. innocua subgroups. Such a barrier was also observed between L. monocytogenes lineages (Table 3), consistent with one previous report [21].
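For readers less familiar with π, the sketch below computes it as the mean pairwise nucleotide difference per site, which is how the statistic is defined in the Data analysis section. It is a simplified illustration: the function name and the toy alignment are hypothetical, the sequences are assumed to be aligned and gap-free, and no substitution-model correction is applied, so it will not necessarily match DnaSP output on real data.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Mean pairwise nucleotide difference per site (uncorrected pi).

    Assumes equal-length, gap-free aligned sequences (a simplification
    made for this sketch).
    """
    n = len(aligned_seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_differences = 0
    for a, b in combinations(aligned_seqs, 2):
        total_differences += sum(x != y for x, y in zip(a, b))
    n_pairs = n * (n - 1) // 2
    return total_differences / (n_pairs * length)


# Toy alignment (hypothetical): pairwise differences are 1, 2 and 1
# over 4 sites, so pi = 4 / (3 pairs * 4 sites).
toy_alignment = ["AAAA", "AAAT", "AATT"]
toy_pi = nucleotide_diversity(toy_alignment)
print(f"pi = {toy_pi:.4f}")
```

Computing the statistic separately for each subgroup, as in the stratified analysis above, only requires calling the function on the corresponding subset of sequences.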
The exterior/interior branch length ratio test demonstrated that L. innocua and its subgroup A as well as L. monocytogenes and its lineage I showed a significantly smaller exterior/interior branch length ratio (p < 0.05) than expected under the coalescent model (Figure 2). This suggests that the contemporary L. innocua population experienced a recent expansion of its population size, consistent with a population bottleneck. Specifically, L. innocua subgroup A underwent expansion of the population size (p = 0.027), while subgroup B did not (p = 0.176) (Figure 2).
The rate of recombination within bacterial species can differ widely from one species to another. In the L. innocua-L. monocytogenes clade, both the relative frequency of occurrence of recombination versus mutation (ρ/θ) and the relative effect of recombination versus point mutation (r/m) were about two to three times higher in L. innocua than in L. monocytogenes (Table 5). L. innocua subgroup A exhibited significantly higher frequency (ρ/θ = 3.7697) and effect (r/m = 12.0359) of recombination than subgroup B (ρ/θ = 0.2818; r/m = 4.8132), consistent with a definite population expansion of subgroup A as mentioned above. However, the higher recombination rate of L. innocua subgroup A did not seem to contribute to nucleotide diversity (π for subgroups A and B is 0.46% and 0.77% respectively) (Table 3 and Table 5). On the other hand, both the frequency and effect of recombination in L. monocytogenes lineage II were higher than those in lineages I and III (Table 5).
The phylogram based on the analysis with correction for recombination revealed that the time to the most recent common ancestor (TMRCA) of L. innocua subgroups A and B was similar (Figure 3), suggesting that these two subgroups appeared at approximately the same time. In addition, our study also showed that the TMRCA of L. monocytogenes lineages I and II was similar, consistent with a recent report [24].
All L. innocua strains were nonpathogenic to ICR mice (Table 1).

Discussion
The ecological, biochemical and genetic resemblance as well as the clear differences of virulence between L. monocytogenes and L. innocua make this bacterial clade attractive as a model to examine the evolution of pathogenicity in the genus Listeria. L. monocytogenes causes life-threatening infections in animals and human populations, and exhibits a diversity of strains with different pathogenicity [25]. L. innocua was once postulated to be the nonpathogenic variant of L. monocytogenes, and holds the key to understanding the evolutionary history of this clade.
Profiling of 37 internalin genes grouped the L. innocua strains into five internalin types, IT1 to IT5, with IT1 and IT2 as the major types (Table 2). The MLST scheme identified two major phylogenetic branches containing the majority of sequence types (29/31, 93.5%), and two others bearing one strain each (Fig 1). Consequently, L. innocua consists of at least four distinct subgroups A, B, C and D. Subgroup A correlates with one of the major branches including all the IT1 and IT3 strains with the exception of one IT3 strain 0063 belonging to subgroup C, while subgroup B correlates with the other major branch covering all the IT2 and IT4 strains (Table 2). Therefore, it is inferred that a given L. innocua subgroup possibly contains several serovars and exhibits different internalin patterns, much as each lineage of L. monocytogenes contains several serovars and exhibits more than one internalin pattern, as exemplified by the internalin island between ascB and dapE in our previous report [17].
The majority of L. monocytogenes lineage I strains harbor inlC2DE, and a small number of 1/2b strains carry inlGC2DE instead. Within L. monocytogenes lineage II strains, the majority of 1/2a and 1/2c strains harbor inlGC2DE and inlGHE respectively. In addition, L. monocytogenes lineage III strains show the greatest level of diversity [8,17]. The L. innocua subgroup A strains either contain a whole set of L. monocytogenes-L. innocua common and L. innocua-specific internalin genes, or lack lin1204 and lin2539, and the L. innocua subgroup B strains either lack lin1204 or lack lin0661, lin0354 and lin2539 instead. Besides, the subgroup D strain L43, which shows the least genetic distance to L. monocytogenes, lacks lin1204 but bears L. monocytogenes-specific inlJ in the counterpart region in L. monocytogenes genomes ( Table 2). We propose that certain internalin genes such as lin0354, lin0661, lin1204 and lin2539 could be potential genetic markers for subgroups of L. innocua.
The phylogenetic tree revealed nine major branches of the L. innocua-L. monocytogenes clade: five belonged to L. monocytogenes, representing lineages I, II, and III, consistent with previous reports [11,24,26], and the other four represented L. innocua subgroups A, B, C and D (Fig 1). Overall, L. innocua is genetically monophyletic compared to L. monocytogenes, and the nucleotide diversity of the L. innocua species is similar to that of L. monocytogenes lineage I but less than those of L. monocytogenes lineages II and III. In evolutionary terms, younger bacterial species have lower levels of genetic diversity [15]. The results from this study offer additional evidence that L. innocua possibly represents a relatively young species as compared to its closest related pathogenic species L. monocytogenes.
Previous studies suggest that L. monocytogenes represents one of the bacterial species with the lowest rate of recombination [4,27]. In this study, strains in the L. innocua-L. monocytogenes clade exhibit ρ/θ values similar to those of the Bacillus anthracis-Bacillus cereus clade [28] and slightly higher than those of Staphylococcus aureus [29], but still considerably lower than those of pathogens such as Clostridium perfringens [30], Neisseria meningitidis [31] and Streptococcus pneumoniae [29]. Both the relative frequency of occurrence of recombination versus mutation (ρ/θ) and the relative effect of recombination versus point mutation (r/m) of L. innocua were higher than those of L. monocytogenes. More strikingly, recombination rates of L. innocua subgroup A were particularly high (Table 5). Wirth et al. [32] proposed from the data for Escherichia coli that epidemic and virulent bacteria face an increased selective pressure for rapid diversification in response to host immune defenses, resulting in higher recombination rates. L. monocytogenes is an opportunistic pathogen with a wide host range as well as a saprotroph found in different environments [2,33]. Though lineage I strains were responsible for almost all major human listeriosis outbreaks and the majority of sporadic cases [6], those of lineage II exhibited a higher recombination rate according to our observation and the findings by Bakker et al. [24]. Bakker et al. [24] proposed that higher recombination in lineage II was not due to selective forces involved in its virulence. Recombination may be critical for lineage II to successfully compete and survive in a broad range of different environments. Lineage II strains are more commonly found, and at higher levels, than lineage I strains in natural environments including foods [24,34]. Similarly, we postulate that the nonpathogenic species L. innocua descending from its pathogenic ancestor has better adaptability to contemporary environmental niches.
Removal of some gene loci related to virulence (e.g., LIPI-1, inlAB and bsh) in Listeria could be regarded as adaptive gene loss, which favors its survival in environmental niches as a saprotroph [9,11].
L. innocua subgroups A and B strains have similar TMRCA and exhibit similar genetic distances to L. monocytogenes, suggesting that these two subgroups appeared at approximately the same time (Fig 3). However, subgroup A experienced a recent expansion of the population size, consistent with the higher recombination frequency (ρ/θ) and effect (r/m) of subgroup A as compared to those of subgroup B. This further implies that these two subgroups have distinct inclinations and adaptive abilities to environments and occupy different habitats, while subgroup A might face increased selective pressures resulting in higher recombination rates. Additional support for this indication is that the majority of subgroup A isolates (belonging to IT1) contain a whole set of L. monocytogenes-L. innocua common and L. innocua-specific internalin genes, which may play broad roles in enhancing the adaptation to various environments. Hence, the L. innocua subgroup A strains might represent the possible evolutionary direction towards adaptation. Interestingly, the higher recombination rate of L. innocua subgroup A did not seem to contribute to nucleotide diversity. One possible explanation is that members of subgroup A recombine only with other members of subgroup A, so that recombination does not introduce novel polymorphisms into subgroup A and therefore does not increase the genetic diversity.
Previous reports indicate that horizontal gene transfer might have occurred earlier to form a more ancestral L. monocytogenes strain, which would then give rise to L. innocua through gene deletion events, possibly via low-virulent L. monocytogenes lineage IIIA strains [11,13]. In this study, L. innocua subgroup D strain L43 exhibits the least genetic distance to L. monocytogenes (Fig 1), and constitutes another evolutionary intermediate between the L. monocytogenes and L. innocua main clusters. Therefore, L. innocua strain L43 and L. monocytogenes strain 54006 [11] might serve as intermediate linkage strains in deciphering the evolution of the L. innocua-L. monocytogenes clade.
Based on the MLST data and its carriage of the L. monocytogenes-specific virulence gene inlJ, strain L43 seems to have a "hybrid" genetic background derived from L. innocua and L. monocytogenes. InlJ is a sortase-anchored adhesin specifically expressed in vivo [35], but its function in atypical L. innocua strains requires further investigation. Another atypical L. innocua strain, PRL/NW 15B95, has been characterized as having the entire LIPI-1 embedded into an otherwise typical L. innocua genetic background [9]. However, we did not detect LIPI-1 in strain L43. PRL/NW 15B95 falls into the main L. innocua cluster based on sequencing of 16S-23S intergenic regions, 16S rRNA and iap genes, and has possibly acquired LIPI-1 by a later transposition event, based on the finding of a 16 bp Tn1545 integration consensus sequence flanking the virulence island [9]. Thus, unlike L43, PRL/NW 15B95 does not constitute an evolutionary intermediate between L. monocytogenes and L. innocua. Complementary transfer of only some of the virulence genes such as LIPI-1 did not change the avirulent character of PRL/NW 15B95 [9]. In this study, all L. innocua strains were nonpathogenic in mouse models (Table 1).

Conclusion
This study reveals that L. innocua is a relatively young species descending from L. monocytogenes. The evolutionary history in the L. monocytogenes-L. innocua clade represents a rare example of evolution towards reduced virulence of pathogens. L. innocua is genetically monophyletic and comprises four subgroups based on internalin profiling and the MLST scheme. The majority of L. innocua strains belong to two major subgroups A and B, and one atypical subgroup might serve as a link between L. monocytogenes and the L. innocua main cluster in the evolutionary chain. While subgroups A and B appeared at approximately the same time, the subgroup A strains seem to represent the possible evolutionary direction towards adaptation to environments. The phylogenetic structure and evolutionary history of L. innocua should become much clearer once a larger strain collection and the whole genome sequences of more representative strains become available.

Bacterial strains
A total of 68 Listeria strains were examined in this study (Table 1). These included 30 L. monocytogenes strains representing three lineages, 34 L. innocua strains (5 from reference collections, 13 from meat, 8 from milk and 8 from seafoods), and 4 L. welshimeri strains. Listeria strains were retrieved from glycerol stocks maintained at -80°C, and cultured in brain heart infusion broth (BHI; Oxoid, Hampshire, England) at 37°C.

Carbohydrate fermentation and hemolytic reactions
The recommended biochemical patterns for differentiating Listeria spp. included L-rhamnose, D-xylose, D-mannitol and glucose utilization and hemolytic reactivity, and were tested by using conventional procedures [36,37].

DNA manipulations
Genomic DNA was extracted using a protocol reported previously [12]. Oligonucleotide primers were synthesized by Invitrogen Biotechnology (Shanghai, China) (Table 6 and Additional file 1; table S2). The duration of extension depended on the expected length of the amplicon (1 min per kb, at 72°C). For DNA sequencing analysis, PCR fragments were purified with the AxyPrep DNA Gel Extraction Kit (Axygen Inc., USA) and their sequences determined by the dideoxy method on an ABI-PRISM 377 DNA sequencer.

Internalin profiling
By sequence comparison of L. monocytogenes strains F2365, H7858 (serovar 4b), EGDe and F6854 (serovar 1/2a) and L. innocua strain CLIP11262, we investigated the presence or absence of 14 L. monocytogenes-L. innocua-common and 4 L. innocua-specific internalin genes as well as 19 L. monocytogenes-specific internalin genes by PCR with specific primers outlined in Additional file 1; table S1. Due to the conserved repeats present in the internalin multigene family [19], primers were designed based on the distinguishable regions identified through sequence comparison. As inlH and inlC2 share nearly identical nucleotide sequences, a common primer set was employed [17].

Detection of virulence genes
Five categories of virulence genes found in L. monocytogenes were assessed by using primers listed in Additional file 1; table S2, including (i) stress response genes conferring tolerance to harsh conditions within the host (e.g. bsh, arcB, arcD, lmo0038 and arcC); (ii) internalin genes responsible for adhesion and invasion of host cells (e.g. inlA, inlB, inlC, inlF and inlJ); (iii) genes involved in escape from vacuole and intracellular multiplication (e.g. plcA, hly, mpl, plcB and hpt); (iv) the gene associated with intracellular and intercellular spread (e.g. actA); and (v) regulatory genes (e.g. prfA).

Mouse infection
The virulence potential of 33 L. innocua strains and 30 L. monocytogenes isolates was assessed in ICR mice by a previously reported protocol [38]. The animal experiment was approved by the Laboratory Animal Management Committee of Zhejiang University, and the mice were handled under strict ethical conditions. Briefly, 5 female ICR mice at 20-22 g (Zhejiang College of Traditional Chinese Medicine, Hangzhou, China) were inoculated intraperitoneally with ~10^8 CFU of each strain in a volume of 0.1 ml. Mice in the control group were injected with 0.1 ml PBS. The mice were observed daily and mortalities recorded until all of the mice inoculated with the virulent EGDe strain died. Relative virulence (%) was calculated by dividing the number of dead mice by the total number of mice tested. On the 15th day post-inoculation, all surviving mice were euthanized.

Data analysis
For each MLST locus, an allele number was given to each distinct sequence variant, and a distinct sequence type (ST) number was given to each distinct combination of alleles of the 9 genes. MEGA 4.0 was used to construct a neighbor-joining tree of L. innocua and L. monocytogenes isolates using the number of nucleotide differences in the concatenated sequences of 9 loci with 1,000 bootstrap tests [39]. L. welshimeri was used as the outgroup species. DnaSP v4.10.9 [40] was used to calculate the number of alleles, number of polymorphic sites, nucleotide diversity (π, mean pairwise nucleotide difference per site), Tajima's D (for testing the hypothesis that all mutations are selectively neutral [41]), numbers of synonymous mutations and nonsynonymous mutations and the rate of nonsynonymous to synonymous changes with a Jukes-Cantor correction. Discrimination index (D.I.) values of selected genes were calculated according to the method previously described [42] on the basis of allelic types [j], numbers of strains belonging to each type [nj], and the total numbers of strains analyzed [N] with the following equation (higher D.I. values indicate better discriminatory power):
$$\mathrm{D.I.} = 1 - \frac{1}{N(N-1)} \sum_{j} n_j (n_j - 1)$$
ClonalFrame v1.1 was employed to track the evolution of ρ/θ and r/m as the chains ran. These two complementary measures were used to assess the relative contribution of recombination and mutation in the creation of the sample population from a common ancestor. Specifically, ρ/θ is the ratio of rates at which recombination and mutation occur, representing a measure of how often recombination happens relative to mutation [43], while r/m is the ratio of probabilities that a given site is altered through recombination and mutation, representing a measure of how important the effect of recombination is in the diversification of the sample population relative to mutation [44].
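The discrimination index can be illustrated with a short Python sketch. It assumes that the index is Simpson's index of diversity in the Hunter-Gaston form, D.I. = 1 − Σ n_j(n_j − 1) / [N(N − 1)], which matches the quantities j, n_j and N defined above. The function name and the toy typing results are hypothetical.

```python
from collections import Counter


def discrimination_index(type_assignments):
    """Hunter-Gaston discrimination index (assumed form, see lead-in).

    `type_assignments` holds one type label (allele number or ST) per strain.
    Returns the probability that two strains drawn at random without
    replacement belong to different types.
    """
    n_strains = len(type_assignments)
    if n_strains < 2:
        raise ValueError("need at least two strains")
    counts = Counter(type_assignments)  # n_j for each type j
    same_type_pairs = sum(n_j * (n_j - 1) for n_j in counts.values())
    return 1.0 - same_type_pairs / (n_strains * (n_strains - 1))


# Toy data (hypothetical): four strains, two of which share a type.
# Sum of n_j(n_j - 1) is 2, so D.I. = 1 - 2 / (4 * 3).
toy_types = ["ST1", "ST1", "ST2", "ST3"]
toy_di = discrimination_index(toy_types)
print(f"D.I. = {toy_di:.4f}")
```

The index reaches 1 when every strain has its own type and falls to 0 when all strains share one type, so values close to 1 indicate that the scheme separates nearly all strains.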
To infer the population history in the L. innocua-L. monocytogenes clade, ClonalFrame GUI, which treated each gene as an independent unit in the input file, was used to calculate the ratio of the sum of the external branches (the ones that connect a leaf of the tree) to the sum of the internal branches (the ones that connect two internal nodes of the tree) [45]. The distribution of these ratios was then compared to the distribution of the external/internal branch length ratio as expected under the coalescent model [46]. If the external/internal branch length ratio is significantly smaller than expected, it means that the inferred genealogy is unexpectedly "starlike", which is consistent with an expansion of the population size.
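The ratio itself is simple to compute once a genealogy is available. The sketch below is not ClonalFrame code; it assumes a hypothetical representation of a rooted tree as a list of `(parent, child, branch_length)` edges and classifies a branch as external when its child is a leaf. Comparing the result with the coalescent expectation is left out.

```python
def external_internal_ratio(edges):
    """Sum of external branch lengths divided by sum of internal ones.

    `edges` is a list of (parent, child, length) tuples describing a rooted
    tree (a representation assumed for this sketch). A branch is external
    if its child never appears as a parent, i.e. the child is a leaf.
    """
    parents = {parent for parent, _, _ in edges}
    external = sum(length for _, child, length in edges if child not in parents)
    internal = sum(length for _, child, length in edges if child in parents)
    if internal == 0:
        raise ValueError("tree has no internal branches; ratio is undefined")
    return external / internal


# Toy genealogy (hypothetical): ((A:1.0, B:1.0):0.5, C:2.0)
toy_edges = [
    ("root", "n1", 0.5),  # internal branch
    ("root", "C", 2.0),   # external branch
    ("n1", "A", 1.0),     # external branch
    ("n1", "B", 1.0),     # external branch
]
toy_ratio = external_internal_ratio(toy_edges)
print(f"external/internal = {toy_ratio:.2f}")
```

In practice the ratio would be evaluated over many sampled genealogies, and the resulting distribution compared with ratios from genealogies simulated under the coalescent model, as described above.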
The chi-squared test was used to test significant associations between L. innocua subgroup and isolate source.
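As an illustration of this test, the following sketch computes the Pearson chi-squared statistic and its degrees of freedom for a subgroup-by-source contingency table. The counts are invented purely for demonstration and are not the study's data, and the function name is hypothetical. The p-value step is omitted to keep the example dependency-free; a library routine such as `scipy.stats.chi2_contingency` could supply it.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom.

    `table` is a list of rows of observed counts, for example
    rows = subgroups and columns = isolate sources.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts, NOT taken from the study:
# rows = subgroups A and B; columns = meat, milk, seafood.
made_up_counts = [
    [8, 5, 4],
    [5, 3, 4],
]
stat, dof = chi_squared_statistic(made_up_counts)
print(f"chi-squared = {stat:.3f} with {dof} degrees of freedom")
```

With samples of this size some expected counts fall below 5, in which case the chi-squared approximation becomes unreliable and an exact test may be preferable.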

Additional material
Authors' contributions
JC conceived and designed the study, performed and interpreted the phylogenetic and statistical analyses, participated in the collection of the sequence data and animal assays, and drafted the manuscript. QC performed the PCR amplification and participated in the collection of the sequence data. LJ participated in evaluation of the results and in revision of the manuscript. CC and FB participated in the PCR amplification, biochemical tests and animal assays. JW and FM participated in the analysis of sequence data. WF supervised the project, participated in the design of the study and data interpretation, and helped draft the manuscript. All authors read and approved the final manuscript.