Common genomic features of Campylobacter jejuni subsp. doylei strains distinguish them from C. jejuni subsp. jejuni

Background Campylobacter jejuni has been divided into two subspecies: C. jejuni subsp. jejuni (Cjj) and C. jejuni subsp. doylei (Cjd). Nearly all of the C. jejuni strains isolated are Cjj; nevertheless, although Cjd strains are isolated infrequently, they differ from Cjj in two key aspects: they are obtained primarily from human clinical samples and are often associated with bacteremia, in addition to gastroenteritis. In this study, we utilized multilocus sequence typing (MLST) and a DNA microarray-based comparative genomic indexing (CGI) approach to examine the genomic diversity and gene content of Cjd strains. Results A geographically diverse collection of eight Cjd strains was examined by MLST and determined to be phylogenetically distinct from Cjj strains. The microarray-based CGI approach also supported this. We were able to demonstrate that Cjd strains exhibited divergence from Cjj strains NCTC 11168 and RM1221 in many of the intraspecies hypervariable regions. Moreover, multiple metabolic, transport and virulence functions (e.g. cytolethal distending toxin) were shown to be absent in the Cjd strains examined. Conclusion Our data demonstrate that Cjd are phylogenetically distinct from Cjj strains. Using the CGI approach, we identified subsets of absent genes from amongst the C. jejuni genes that provide clues as to the potential evolutionary origin and unusual pathogenicity of Cjd.


Background
Campylobacter jejuni consists of two subspecies: C. jejuni subsp. jejuni (Cjj) and C. jejuni subsp. doylei (Cjd). Cjd strains were originally identified as gastric campylobacters from human gastric biopsies [1] and as nitrate-negative campylobacters from pediatric patients with gastroenteritis [2]. The major phenotypic characteristic commonly used to distinguish Cjd strains from Cjj strains is the inability of Cjd to reduce nitrate. Other phenotypic characteristics associated with Cjd include variable growth at 42°C [3], high susceptibility to cephalothin [3], and the absence of γ-glutamyl transferase (GGT) and L-arginine arylamidase activity [4]. However, like Cjj, all Cjd strains are hippuricase positive.
Despite the unusual clinical symptomatology, Cjd is isolated infrequently and few strains exist (compared to Cjj) for this subspecies. One possible reason is that many clinical laboratories do not characterize Campylobacter isolates past the genus level or subspeciate C. jejuni isolates as Cjj or Cjd. Moreover, normal Cjj culturing methods select against Cjd strains, which are susceptible to cephalothin and exhibit variable growth at 42°C. This appears to be supported by the fact that in South Africa, Cjd strains represent a significant proportion of the total campylobacters isolated from human clinical samples: 16% of the non-Cjj/coli Campylobacter isolates were Cjd. These isolates were obtained using the Cape Town Protocol [3,11], which uses passive filtration through a 0.65 µm membrane filter, growth at 37°C and no antibiotic selection.
In this study, we have exploited the sequencing of housekeeping genes (MLST) and DNA microarray-based CGI to examine the genomic relationship between Cjd strains, to determine the relationship between Cjd and Cjj strains, and to identify possible genomic factors that may have contributed to the emergence of Cjd strains. The MLST indicated that Cjd strains in our geographically diverse collection are highly related. Moreover, CGI analysis identified genomic regions that are commonly absent from Cjd strains and may contribute to their pathogenic characteristics.

Results and discussion
Sequence typing of Cjd strains
Multilocus sequence typing (MLST) of the eight Cjd strains listed in Table 1 identified six novel sequence types (STs) (Fig. 1A). Prior to this study, no Cjd STs (or at least no STs from strains identified definitively as Cjd) were present in the Campylobacter jejuni/coli MLST database. It was intriguing that all eight Cjd strains contained the same aspA allele, aspA63, considering the geographical and temporal range of the strains. However, aspA has been shown previously to be associated also with a subgroup of campylobacters (e.g. the "jejuni-like" aspA103 allele in cluster II C. coli strains from turkeys [28]); therefore, it is possible that aspA is linked to a locus important for colonization or virulence, and that conservation at this locus is maintained by interstrain recombination. The aspA63 allele has been identified only three times before, in STs ST-6, ST-1458 and ST-2532. The phylogenetic relationship between the six Cjd STs and representative Cjj and C. coli (Cc) STs was also determined (Fig. 1B). The housekeeping gene sequences of each ST were concatenated and aligned. The Cjd strains form a distinct clade within the resulting dendrogram, supporting their subspecies designation. A similar organization of Cjd strains into a distinct cluster was observed previously using AFLP [29,30] and was also observed in this study by comparative genomic indexing (see below).
The presence of a single Cjd cluster invites the possibility that Cjd strains form a discrete group within the species and that no (or at best minimal) genetic exchange occurs between the two subspecies. Unfortunately, the resolution MLST database that may not have been identified to the subspecies level. Therefore, to determine the nature of the origin and/or evolution of the Cjd subspecies and to determine whether Cjd alleles arise through the accumulation of point mutations or through a more rapid mechanism of exchange and recombination, additional experiments and analyses will be necessary. Such experiments would entail the characterization of a much larger and geographically diverse set of Cjd strains and the further identification of strains within the C. jejuni MLST database to the subspecies level. Characterization of additional Cjd strains might explain also the noteworthy conservation of the aspA63 allele within the subspecies.

Comparative genomic indexing of the Cjd strains
We further examined the genomic diversity of these eight Cjd strains more comprehensively by comparative genomic indexing (CGI) analysis. The CGI analysis allowed the assessment of gene content for each Cjd strain relative to the multi-strain C. jejuni DNA microarray, described previously [23], that comprises 1530 genes from NCTC 11168 and 227 genes from RM1221. Genomic DNAs from both NCTC 11168 and RM1221 were used as a reference DNA mixture and competitively hybridized with genomic DNA from each of the Cjd strains. We observed that 23.4% (418 of 1786) of the genes represented on the microarray were highly divergent or absent (trinary score of 0) in at least one Cjd strain, and 262 of these genes were absent from all 8 of the Cjd strains in this study. In comparison, we observed that 21.5% (385 of 1786) of the genes were highly divergent or absent in at least 1 of 35 geographically and temporally diverse Cjj strains from both humans and animals using this microarray [23]. The CGI data sets as trinary scores for the Cjd strains are available as an additional table [see Additional file 1].
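The two counts reported above (genes absent in at least one strain, and genes absent from every strain) can be derived directly from a table of trinary scores. The following is a minimal sketch, not the analysis code used in this study; the function name `summarize_absence` and the toy gene names and scores are hypothetical and serve only to illustrate the tally.

```python
from typing import Dict, List


def summarize_absence(scores: Dict[str, List[int]]) -> Dict[str, int]:
    """Tally absent genes from trinary scores.

    `scores` maps a gene ID to one trinary score per strain
    (0 = absent, 1 = divergent/unknown, 2 = present).
    """
    absent_in_any = sum(1 for s in scores.values() if any(v == 0 for v in s))
    absent_in_all = sum(1 for s in scores.values() if s and all(v == 0 for v in s))
    return {
        "genes": len(scores),
        "absent_in_any": absent_in_any,
        "absent_in_all": absent_in_all,
    }


# Hypothetical toy data: four genes scored in three strains.
toy = {
    "geneA": [2, 2, 2],
    "geneB": [0, 2, 1],
    "geneC": [0, 0, 0],
    "geneD": [1, 1, 2],
}
summary = summarize_absence(toy)
# Percentage of genes absent in at least one strain.
percent = 100.0 * summary["absent_in_any"] / summary["genes"]
```

Applied to the real data set, the same tally would correspond to the 418 genes (absent in at least one Cjd strain) and 262 genes (absent from all eight) reported in the text.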
Using cluster analysis, we examined the genomic relationship between the Cjd strains and these 35 diverse Cjj strains that were analyzed previously [23]. Figure 2 depicts the relationship among the strains using a standard correlation function and bootstrapping (see Materials and Methods) where the linkage distance between strains is represented by branch lengths in the resulting hierarchical cluster. These microarray-based CGI results at the whole genome level demonstrate that the Cjd strains formed a discrete cluster, correlating well with our MLST dendrogram.

Figure 1 Cjd allelic profiles and phylogenetic relationships among C. jejuni subspecies and C. coli based on MLST. A. Cjd allelic profiles and MLST STs. B. Phylogenetic relationship between the six Cjd STs and representative Cjj and C. coli (Cc) STs. Allele sequences for each ST were first concatenated and then aligned using CLUSTALX. The dendrogram was constructed using the neighbor-joining algorithm and the Kimura 2-parameter distance estimation method. Bootstrap values greater than 75%, generated from 500 replicates, are shown at the nodes. The scale bars represent substitutions per site. Numeric labels represent STs; the STs of the sequenced Cjj and Cc strains are indicated by the strain name in parentheses. Phylogenetic analyses were performed using MEGA version 3.1 [47]. The C. jejuni strains in panel B were identified to the subspecies level using a novel multiplex PCR method [35].
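The Figure 1 caption names the Kimura 2-parameter distance as the basis for the neighbor-joining dendrogram. As an illustration of that distance only (the study itself used MEGA version 3.1), the sketch below computes d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)] for two aligned sequences, where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` is hypothetical, and skipping gapped or ambiguous positions pairwise is an assumption rather than a documented setting of the original analysis.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: positions with gaps or ambiguity codes are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# One transition in ten sites: P = 0.1, Q = 0.
example = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

A matrix of such pairwise distances between concatenated allele sequences is the input that a neighbor-joining implementation would use to build the tree.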

Intraspecies hypervariability regions and genomic elements
As with Cjj strains, the majority of genes that were divergent in the Cjd strains were contained within described regions of intraspecies plasticity (PR)/hypervariability (HV) [e.g. HV regions 1-18 [22,23] and regions PR1-PR7 [21]]. The Cjd strains exhibited identical presence/absence patterns for genes within 11 of the 18 intraspecies hypervariability regions described recently [23]. Interestingly, the genes Cj0728-Cj0734 within HV region 9 that were shown recently to be phosphate-regulated [31] were absent or highly divergent in all of the Cjd strains examined. It should be noted that the Cjd strains were not all identical at the hypervariable regions related to surface structures, including the lipooligosaccharide (LOS) biosynthesis locus (HV region 11) [21-24,32], the flagellar modification locus (HV region 12) [21-23,33], and the capsular polysaccharide biosynthesis locus (HV region 13) [21-23,25]. Furthermore, all of the Cjd strains were distinct from both NCTC 11168 and RM1221 in these three HV regions. This suggests that the Cjd strains do not produce one distinct set of surface molecules, nor do they produce the same surface molecules observed for either NCTC 11168 or RM1221.
We previously demonstrated the occurrence of four Campylobacter jejuni integrated elements (CJIE), similar to those identified in RM1221, in other strains of C. jejuni [23]. CJIE1, CJIE2 and CJIE4 show similarity to bacteriophage while CJIE3 may be an integrated plasmid. For Cjd strains, fewer than 3 genes from the Campylobacter Mu-like phage (CJIE1) or CJIE4 were present. We observed that strain RM4099 possessed a majority of the genes within CJIE2 described in strain RM1221 [23,34], while Cjd strains RM1512, RM1513 and RM4098 possessed a smaller number of genes within CJIE2 (Fig. 2). Although not visible in Figure 2, these same strains (RM4099, RM1512, RM1513 and RM4098) also possessed a cluster of genes from CJIE3 (Cje1093-Cje1100), suggesting the possibility that a plasmid (integrated or otherwise) is present in these strains.

Common deletions in Cjd strains
The Cjd strains also had missing genes outside the C. jejuni intraspecies hypervariable regions. Indeed, 46 genes not identified previously within the C. jejuni hypervariable regions, but missing from at least half of the Cjd strains, are presented in Table 2. Of these, 21 genes were absent from all Cjd strains (e.g. Cj0005c and Cj0091) and 8 genes were detected in only one Cjd strain (e.g. Cj0555 and Cj0636). The largest clusters of highly divergent loci contained three genes (Cj0201c-Cj0203 and Cj1040c-Cj1042c), with the remainder distributed uniformly throughout the chromosome, assuming genomic positions similar to the two sequenced Cjj strains. The majority (32/46; 70%) of the Cjd highly-divergent loci encode proteins with unknown or general function. The remaining genes, such as argBC, napAB, proA, dcuB, ceuCD, and sdaC, mostly encode metabolic or transport proteins. Furthermore, the microarray results suggest that the observed absence of nitrate reductase activity in Cjd strains is most likely due to deletions in napA and/or napB. We recently confirmed these nap mutations by PCR and sequencing, and our evidence strongly suggests that Cjd arose from a single evolutionary event, i.e. the napA deletion, with the divergence at napB occurring subsequently [35]. Despite these results, these metabolic and transport-related genes provide too few clues into their possible role in Cjd pathogenicity.

Figure 2 [...] Table 1. Details of the diverse collection of Cjj strains have been described previously [23]. The genes are represented in the order of their positions from the RM1221 and are followed by the NCTC 11168 genes that include RM1221 pseudogenes. The four CJIEs from RM1221 and the capsular polysaccharide biosynthesis regions (HV 13) from RM1221 and NCTC 11168 are indicated. The gene status is color-coded: blue, present; yellow, divergent/unknown; red, absent and gray, no data. For cutoffs of absence and presence predictions, refer to Materials and Methods. An average linkage hierarchical clustering of the C. jejuni strains was compiled in GeneSpring version 7.3 from the CGI data for each element with the standard correlation and bootstrapping. Cjd strains are labeled in yellow and Cjj strains are labeled in white. The scale for distance score is also shown.

Table 2 [...] Putative molybdenum-containing oxidoreductase; Cj0079c, cytolethal distending toxin subunit CdtA; Sun/nucleolar protein family protein [...]
The total absence of the virulence gene cdtA (and near-total absence of cdtB) among the Cjd strains, however, may be related to pathogenicity (Table 2, Cj0078c and Cj0079c). The cdt genes encode subunits of the cytolethal distending toxin (CDT), indicating that Cjd strains do not produce this toxin. As CDT has been associated with virulence properties in C. jejuni [36-39], and Cjd has been isolated almost exclusively from human clinical samples, it is intriguing to speculate what role loss of a virulence factor plays in Cjd pathogenicity. Specifically, CDT from C. jejuni has been shown to arrest eukaryotic cells in the G2/M phase of the cell cycle [36,38] and to induce the release of the proinflammatory cytokine interleukin-8 (IL-8) [37]. It is possible that, due to the absence of CDT, Cjd strains fail to induce the release of IL-8 and that this may be partly responsible for the much higher incidence of Cjd in blood cultures compared to Cjj. Interestingly, deletions of portions of cdtA and cdtB have recently been reported for three CDT-negative C. jejuni strains that were also isolated from patients with bacteremia [40]. This further supports a relationship between the absence of cdt genes and bacteremia; however, it is not clear if the strains examined in that study were Cjj or Cjd. Additionally, CGH experiments comparing 11 [18], 18 [21], or 51 [22] Cjj strains reported complete conservation of the cdt locus across this subspecies, suggesting that production of CDT may be yet another marker that distinguishes Cjj and Cjd.

Conclusion
Although Cjd strains have been distinguished from Cjj strains by phenotypic characteristics associated with Cjd strains, including the inability to reduce nitrate, high susceptibility to cephalothin and variable growth at 42°C [3], the genomic relationship between these subspecies has not been reported previously. Two types of genomic relationships between the two subspecies were possible: first, that the Cjd strains represented a phylogenetically and phenotypically distinct clade within C. jejuni and second, that the two subspecies were indistinguishable phylogenetically but possessed unique phenotypic characteristics. Initial studies that differentiated campylobacters by amplified fragment length polymorphism (AFLP) profiling [29,30] suggested that the first possibility was the most likely. Consistent with these observations, using MLST and CGI analysis, we confirmed in this study that Cjd strains are phylogenetically distinct from Cjj strains. Indeed, all of the Cjd strains examined from our geographically diverse collection had the same aspA63 allele and possessed a common set of absent or highly variable genes. This set of genes included both genes in the intrastrain HV regions and 21 genes outside these regions. Moreover, we recently demonstrated that the loss of nitrate reductase activity in 27 Cjd strains was the result of an identical 2761 bp deletion in the nitrate reductase large subunit-encoding gene, napA [35]. Also, there is a high frequency of novel MLST alleles among the Cjd strains. Together, these analyses suggest that Cjd strains arose from a common ancestor. Considering C. jejuni is generally a non-clonal organism with a high frequency of gene exchange, it is possible that Cjd strains are ecologically isolated from Cjj strains, with MLST alleles arising mainly by mutation rather than recombination. However, it is also possible that Cjd strains share the same environment with Cjj strains but have a barrier to recombination.
Characterization of additional Cjd strains might provide clues as to the potential evolutionary origin of Cjd and their possible ecological niche.
It is not clear how the common genotypic characteristics of Cjd, identified in this study, relate to the observed clinical characteristics such as bacteremia [3]. Nevertheless, the absence of the virulence-related genes cdtA and cdtB may play a role. It is also possible that there are unidentified virulence-related genes present in the Cjd strains that are absent from Cjj strains. The current CGI analysis measures the gene content of the Cjd strains relative to the NCTC 11168 and RM1221 genes present on the microarray; thus, Cjd-specific genes would not be detected by the CGI analysis. The current genomic sequencing project of Cjd strain 269.97 by the J. Craig Venter Institute (JCVI) may identify such unique Cjd genes [41]. It is also possible that many of the differences between the two subspecies are a result of more subtle differences (e.g. bp changes that result in proteins with a different topology or substrate specificity, or bp changes that affect gene regulation or protein expression) that would not be detected using an amplicon-based DNA microarray with CGI analysis but would be observed from the sequence data. Additional comparative genomic analysis will be possible once the sequence is available.

Bacterial strains, growth conditions and chemicals
Campylobacter strains used in this study are listed in

Multilocus sequence typing of Cjd isolates
The Cjd isolates in Table 1 were typed as described previously [6,42], using the Cj/Cc MLST primer sets. An in-house Perl program, MLSTparser, was used to extract allele sequences as described previously [42]. All allelic sequences were queried against the Campylobacter jejuni/coli MLST database [43]. Alleles already present in the database were assigned those numbers; novel alleles and sequence types (
Microarray hybridization reactions were performed as described previously [23]. Fluorescently-labeled reference and test DNAs were combined in 45 µl Pronto! cDNA hybridization solution (Corning, Corning, NY) and heated to 95°C for 5 min. Fifteen µl of the hybridization mixture was applied to a microarray and sealed with a coverslip. The microarray slide was placed in a hybridization chamber (Corning) and incubated at 42°C for 18 h. Following hybridization, the slides were washed twice in 2 × SSC, 0.1% sodium dodecyl sulfate at 42°C for 10 min, twice in 1 × SSC at room temperature for 10 min, and finally twice in 0.2 × SSC at room temperature for 5 min.
The microarray slides were dried by centrifugation at 300 × g for 10 min before scanning. At least two hybridization reactions were performed for each test strain.

DNA microarray data acquisition and analysis
We recently described the construction and utilization of the multi-strain C. jejuni microarray in a comparative genomic hybridization (CGH) study [23]. Briefly, the multi-strain C. jejuni microarray contains 1530 PCR-amplified sequences from the annotated open reading frames (ORFs) of C. jejuni strain NCTC 11168 and 227 PCR-amplified sequences representing novel C. jejuni strain RM1221 ORFs spotted in duplicate onto UltraGAPS glass slides (Corning Inc., Corning, NY).
DNA microarrays were scanned using an Axon GenePix 4000B microarray laser scanner (Axon Instruments, Inc. Union City, CA). Features and the local background intensities were detected and quantified with GenePix 4.0 software (Axon Instruments, Inc.). Poor features were excluded from further analysis if they contained abnormalities or were within regions of high fluorescent background. The data were filtered so that spots with a reference signal lower than the background plus 2 standard deviations of the background were discarded. Signal intensities were corrected by subtracting the local background, and then the Cy5/Cy3 ratios were calculated. To compensate for unequal dye incorporation, data normalization was performed as described previously [23,45]. Dye-swapping was performed for selected strains and no effects due to differences in dye incorporation on downstream analysis were observed. The presence or absence of genes from Cjj strain NCTC 11168 and Cjj strain RM1221 in the other Cjj and Cjd strains was determined based on a comparison of normalized hybridization signal ratios of the test strain to the combined reference strains Cjj strain NCTC 11168 and Cjj strain RM1221 for the respective gene spots. The NCTC 11168 and RM1221 strain-specific spots hybridized to only half of the reference DNA (Cy5-labeled mixture of NCTC 11168 and RM1221 DNA), increasing the Cy3/Cy5 ratio 2-fold. Therefore, the ratios for these spots were divided by 2 before determining the status of the gene. The ratios for spots of each individual gene were then averaged. As previously described [23], we defined the status of a gene as present when the Cy3/Cy5 (test/reference) intensity ratio was > 0.6, as divergent/unknown when the Cy3/Cy5 intensity ratio was between 0.6 and 0.3, and absent when the Cy3/Cy5 intensity ratio was < 0.3. The presence, divergence and absence status for all genes was converted into trinary scores (present = 2; divergent/unknown = 1; absent = 0).
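The filtering and scoring rules described above can be summarized in a short sketch. This is an illustration under stated assumptions, not the software used in the study: the function names (`passes_reference_filter`, `gene_ratio`, `trinary_score`) are hypothetical, the input ratios are assumed to be already background-corrected and normalized Cy3/Cy5 (test/reference) values, and ratios exactly equal to 0.6 or 0.3 are assumed to fall in the divergent/unknown class.

```python
from statistics import mean
from typing import Iterable

PRESENT_CUTOFF = 0.6  # ratio > 0.6: present
ABSENT_CUTOFF = 0.3   # ratio < 0.3: absent


def passes_reference_filter(ref_signal: float, bg_mean: float, bg_sd: float) -> bool:
    """Keep a spot only if its reference signal reaches background + 2 SD."""
    return ref_signal >= bg_mean + 2.0 * bg_sd


def gene_ratio(spot_ratios: Iterable[float], strain_specific: bool) -> float:
    """Average the test/reference ratios of the replicate spots for one gene.

    Strain-specific spots hybridize to only half of the mixed reference DNA,
    so their ratios are divided by 2 before averaging.
    """
    ratios = list(spot_ratios)
    if not ratios:
        raise ValueError("no usable spots for this gene")
    if strain_specific:
        ratios = [r / 2.0 for r in ratios]
    return mean(ratios)


def trinary_score(ratio: float) -> int:
    """Convert a ratio to a trinary score: 2 present, 1 divergent/unknown, 0 absent."""
    if ratio > PRESENT_CUTOFF:
        return 2
    if ratio < ABSENT_CUTOFF:
        return 0
    return 1


# Hypothetical duplicate-spot ratios for three genes.
score_shared = trinary_score(gene_ratio([1.0, 0.9], strain_specific=False))
score_specific = trinary_score(gene_ratio([1.0, 0.8], strain_specific=True))
score_absent = trinary_score(gene_ratio([0.2, 0.1], strain_specific=False))
```

In this toy example the strain-specific gene is scored as divergent/unknown only because of the division by 2; without that correction its averaged ratio of 0.9 would have been called present.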
The trinary gene scores for each replicate for all strains were analyzed further with GeneSpring microarray analysis software version 7.3 (Agilent Technologies, Redwood City, CA) and subjected to average-linkage hierarchical clustering with the standard correlation and bootstrapping.
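For readers without access to GeneSpring, the clustering step can be approximated with a small sketch of average-linkage hierarchical clustering over trinary score vectors. Several points are assumptions: "standard correlation" is taken here to mean the uncentered correlation (cosine of the angle between score vectors), bootstrapping is not implemented, and the strain names and scores are hypothetical toy data.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Cluster = Tuple[str, ...]


def correlation_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus the uncentered correlation between two score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("all-zero profile has no defined correlation")
    return 1.0 - dot / norm


def average_linkage(profiles: Dict[str, List[int]]) -> List[Tuple[Cluster, Cluster, float]]:
    """Agglomerative clustering; returns the merges in the order they occur."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): correlation_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def linkage(c1: Cluster, c2: Cluster) -> float:
        # Average linkage: mean distance over all between-cluster strain pairs.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters: List[Cluster] = [(n,) for n in names]
    merges: List[Tuple[Cluster, Cluster, float]] = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical trinary profiles (five genes) for four strains.
toy_profiles = {
    "Cjd_1": [0, 0, 2, 2, 2],
    "Cjd_2": [0, 0, 2, 2, 1],
    "Cjj_1": [2, 2, 2, 0, 2],
    "Cjj_2": [2, 2, 2, 0, 1],
}
merges = average_linkage(toy_profiles)
```

With these toy profiles the two hypothetical Cjj strains merge first, the two Cjd strains second, and the two groups join last, mirroring the kind of discrete subspecies cluster described for Figure 2.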