Genotyping of Chlamydophila psittaci using a new DNA microarray assay based on sequence analysis of ompA genes

Background: The currently used genotyping system for the avian zoonotic pathogen Chlamydophila (C.) psittaci has evolved from serology and is based on ompA sequence variations. It includes seven avian and two non-avian genotypes. Restriction enzyme cleavage of the amplified ompA gene and, less frequently, ompA sequencing are being used for examination, but, besides methodological limitations, an increasing number of recently tested strains could not be assigned to any established genotype.
Results: Comprehensive analysis of all available ompA gene sequences has revealed a remarkable genetic diversity within the species C. psittaci, which is only partially covered by the present genotyping scheme. We suggest adjustments and extensions to the present scheme, which include the introduction of subgroups to the more heterogeneous genotypes A, E/B and D, as well as six provisional genotypes representing so far untypable strains. The findings of sequence analysis have been incorporated in the design of a new DNA microarray. The ArrayTube™ microarray-based ompA genotyping assay has been shown to discriminate among established genotypes and identify so far untyped strains. Its high specificity, which allows detection of single-nucleotide polymorphisms, is due to the parallel approach consisting in the use of 35 hybridization probes derived from variable domains 2 and 4 of the ompA gene.
Conclusion: The traditional genotyping system does not adequately reflect the extent of intra-species heterogeneity in ompA sequences of C. psittaci. The newly developed DNA microarray-based assay represents a promising diagnostic tool for tracing epidemiological chains, exploring the dissemination of genotypes and identifying non-typical representatives of C. psittaci.


Background
The obligate intracellular bacterium Chlamydophila (C.) psittaci, the causative agent of psittacosis in birds and humans, is a well-established pathogen responsible for regular outbreaks of disease in psittacine birds and domestic poultry [1], as well as cases of atypical pneumonia in exposed persons [2,3].
The current definition of the species C. psittaci includes the former avian serovars of Chlamydia psittaci, but even under the recently revised taxonomic classification of the family Chlamydiaceae [4] it remains a heterogeneous taxon in terms of host range and virulence. To facilitate epidemiological studies, strains of the agent were subdivided into serovars A, B, C, D, E, and F on the basis of their immune reaction with a panel of monoclonal antibodies (MAbs) recognising specific epitopes of the major outer membrane protein (MOMP). Each serovar was assumed to exhibit more or less stringent host specificity [5][6][7]. Later on, Sayada et al. [8] suggested restriction fragment length polymorphism (RFLP), i.e. PCR amplification of the ompA gene with subsequent restriction enzyme analysis, for differentiation among C. psittaci isolates. Vanrompay et al. [9] were able to demonstrate by comparison of serotyping and PCR-RFLP that the serovars had genetic equivalents in the corresponding genotypes, which were defined by their restriction enzyme cleavage pattern. Thus, nine different genotypes have been generally accepted to date, seven of which are thought to predominantly occur in a particular order or class of Aves and two in non-avian hosts, i.e. genotype A in psittacine birds, B in pigeons, C in ducks and geese, D in turkeys, E in pigeons, ducks and others, F in parakeets, WC in cattle, and M56 in rodents. In addition, Geens et al. [10] suggested the introduction of genotype E/B to represent a group of isolates from ducks. Most of the avian genotypes have also been identified sporadically in isolates from cases of zoonotic transmission to humans.
While serotyping can be conducted with cultured strains only, PCR-RFLP was also used with DNA extracted from clinical samples [9]. However, there are obvious limitations as substantial amounts of a PCR amplicon are needed to produce distinctive and reproducible RFLP patterns on ethidium bromide-stained agarose gels. Related genotypes tend to have quite similar patterns, which may be difficult to distinguish, and typing results based on different enzyme patterns (e.g. AluI vs. MboII) may be contradictory. Genetically aberrant strains cannot be genotyped using the above-mentioned PCR-RFLP procedure. Sequencing of the ompA gene and alignment with type strain sequences can also be used to identify the genotype of C. psittaci strains [10,11], since genotype-specific sites are located in the gene's variable domains (VD) VD2 and VD4. However, all these approaches have been used in a rather pragmatic manner, i.e. with the aim of working out distinctive features of isolates for epidemiological purposes, while disregarding the scope of natural sequence variability and avoiding a molecular definition of individual genotypes at the nucleotide level.
In this situation, DNA microarray technology can be expected to provide added value because of its highly parallel approach, which implies the potential to exploit minor sequence differences at multiple sites for discrimination among samples [12]. Using the ArrayTube™ (AT) platform, we recently demonstrated that the performance parameters of an AT microarray assay for species identification of chlamydiae were comparable to real-time PCR in terms of sensitivity and superior in specificity [13]. This prompted us to take further advantage of the AT technology's high discriminatory capacity [14,12], rapidity and relatively low cost by exploring its suitability for ompA genotyping analysis. However, we immediately realized that a comprehensive analysis of currently known ompA sequences was necessary before considering the definition of specific probes for each genotype.
In the present study, we report the results of an extensive investigation on sequence similarity among all available ompA sequences from species of the genus Chlamydophila and describe the development of a DNA microarray-based assay for ompA genotyping of C. psittaci.

Analysis of ompA sequences of Chlamydophila strains
The NCBI database was searched for ompA sequences by repeatedly running BLAST queries with sequences from already known entries. In the course of this process it became evident that some C. psittaci sequences were more similar to ompA of other Chlamydophila spp. than to any genotype of C. psittaci, which is in line with earlier publications [15][16][17]. Therefore, we extended our analysis to include all sequences from the genus Chlamydophila.
Comparison of all available GenBank entries revealed 63 unique sequences of ompA genes. The alignment of these sequences was the basis for calculation of a sequence similarity matrix (see Additional file 1) and construction of a split network graph (Fig. 1). These graphs are useful tools for presentation of sequence similarity-based relatedness, but are not designed to characterize phylogenetic relationships. We minimized distorting effects due to alignment of differently sized sequences by excluding the shortest items and selecting a 992-nt sequence window which contains all variable domains. The most striking feature of the split network graph is the great diversity among ompA variants of C. psittaci, which clearly exceeds variations among other species of the genus. While C. pneumoniae, C. abortus, C. pecorum, C. felis, and C. caviae each are located on a single branch, several genotypes of C. psittaci form their own separate branches. At least 12 distinct clades belonging to this species can be identified, of which only 5 can be directly attributed to currently accepted genotypes, i.e. C, D, F, M56, and WC. Another remarkable feature is the ABE cluster, i.e. a grouping harboring the closely related genotypes A, B, E, and E/B. Similarity of the underlying ompA sequences is above 98% within the cluster and higher than 99.4% within the clade of genotypes B, E and E/B. To visualize these relationships, a magnified scale was used for presentation in Fig. 1a.
The M56 and Mat116 sequences represent less closely related side branches of the ABE stem branch. Strains of genotypes C, D and F are clearly separated from each other, and a number of strains not assigned to one of the accepted genotypes (strains 1V, YP84, R54, 6N, and CPX0308) are placed fairly isolated, forming their own branches at considerable genetic distances from the established genotypes. Genotype F (represented by strain VS225), as well as strains YP84 and R54, were found to be side branches of the C. abortus branch.
As a result of this sequence analysis, we identified 20 individual type strains, each of which represents a unique ompA sequence. The findings are summarized in Table 1. In this classification, genotypes A, D and E/B can be further divided into subgroups, and the six strains at the bottom of the table represent untyped C. psittaci strains.
All in all, the present sequence analysis has shown that the extent of intra-species variation goes beyond the area covered by currently accepted genotypes.

Figure 1. Split network graph constructed from a global alignment of 63 ompA sequences retrieved from GenBank. Accession numbers are shown for each sequence represented. The length of connecting lines between two items is equivalent to their genetic distance. The scale bar denotes 1 substitution per 100 nucleotides. Clades representing an established genotype of C. psittaci are encircled by a dashed line and designated accordingly in bold print. Provisional genotypes are designated as suggested in Table 1. Clades representing other Chlamydophila spp. are encircled by a solid line and labeled with the species name. Basic data of the strains represented by accession numbers can be found in Additional file 3. Fig. 1a: Detail showing the ABE cluster. Subgroups of genotypes A and E/B are indicated at the respective GenBank accession number.

Selection of hybridization probes
Using the global alignment of ompA sequences (see Additional file 2), we selected a panel of 35 oligonucleotide probes for identification of C. psittaci genotypes, which are shown in Table 2. The probe binding sites are located in VD2 and VD4 of the ompA gene. A compilation showing the number of mismatches between targets of genotypes and hybridization probes is given in Table 3. These data can be used for two purposes: a) to identify the genotype of the target DNA, and b) to predict signal intensities because perfect matches between probe and target will produce the strongest signals.
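A mismatch compilation of the kind given in Table 3 can be illustrated with a minimal Python sketch. The sequences below are hypothetical toy strings, not the probes of Table 2, and the sketch assumes that each target site has already been extracted from the alignment in the same orientation and length as its probe; the function names are illustrative only.

```python
def count_mismatches(probe: str, target_site: str) -> int:
    """Number of positions at which an aligned probe and target site differ."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal length")
    return sum(1 for p, t in zip(probe.upper(), target_site.upper()) if p != t)


def mismatch_table(probes: dict[str, str],
                   targets: dict[str, dict[str, str]]) -> dict[str, dict[str, int]]:
    """Build a genotype x probe table of mismatch counts.

    probes:  probe name -> probe sequence
    targets: genotype -> {probe name -> target site aligned to that probe}
    """
    return {
        genotype: {name: count_mismatches(seq, sites[name])
                   for name, seq in probes.items()}
        for genotype, sites in targets.items()
    }


# Hypothetical toy data (assumption, not taken from Table 2 or Table 3).
probes = {"probe_1": "ACGTTGCA", "probe_2": "ACGTAGCA"}
targets = {
    "type_X": {"probe_1": "ACGTTGCA", "probe_2": "ACGTTGCA"},
    "type_Y": {"probe_1": "ACGTAGCA", "probe_2": "ACGTAGCA"},
}
table = mismatch_table(probes, targets)
```

In this toy case each type matches one probe perfectly and differs from the other by a single nucleotide, which is the situation the assay relies on for SNP-level discrimination.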

Optimization of microarray hybridization
To ensure the availability of sufficient amounts of target DNA for hybridization against a set of covalently bound oligonucleotide probes in a spatially accessible structure, amplification was conducted as duplex PCR. Using primer pairs VD1-f/VD2-r and 201CHOMP/ompA-rev, two biotinylated fragments containing VDs 1+2 and VDs 3+4, respectively, were produced. The resulting amplification products gave rise to hybridization signals at comparable intensity levels for VD2 and VD4 probes. In contrast, when PCR products comprising the entire ompA gene were hybridized, signals generated by binding to VD4 probes were significantly lower than those of VD2 probes (data not shown). This bias was compensated by doubling molar concentrations of the second primer pair in the above duplex PCR. Biotin labeling of target DNA via the use of 5'-biotinylated primers was preferred over internal labeling using biotin-dUTP because of higher sensitivity (data not shown) and lower cost.
To optimize the specificity of microarray hybridization, we systematically studied the influence of hybridization temperature (T_H) and washing conditions. While hybridization patterns of the various genotypes were satisfactorily discernible at T_H = 60°C, stringency was insufficient as indicated by poor resolution between signals of perfect-match probes and one-mismatch probes (data not shown). In contrast, the introduction of high-temperature (50°C) and low-salt buffer wash steps after both the hybridization reaction and incubation with the streptavidin-HRP conjugate led to high signal ratios of perfect-match vs. one-mismatch probes, so that single-nucleotide polymorphisms (SNPs) could be detected (see next paragraph). This high-stringency wash protocol even allowed T_H to be lowered to 50°C in order to gain sensitivity.

Microarray hybridization of type strains
Type strains of C. psittaci genotypes A, B, C, D, E, E/B, F, M56 and WC were examined using the AT ompA genotyping array. Barplots showing the distinct hybridization patterns are presented in Fig. 2. Comparison with theoretically expected, i.e. calculated patterns, revealed excellent agreement (data not shown) in all individual cases. Notably, regarding type strain patterns in the light of matching parameters given in Table 3, it could be confirmed that signal intensities of completely matching probes were significantly higher than those pertaining to probes having one or two mismatches to the target. For instance, Fig. 3 (upper part) illustrates that the signal of genotype B-specific probe VD2-03 with type B strain CP3 was more than 5 times stronger than with type E strain CPMN, which has a single mismatch in the target sequence. Conversely, the same applies to genotype E-specific probe VD2-04. Hybridization duplexes of targets and probes differing in two nucleotides were more than 10 times weaker than their perfect-match counterparts (see Fig. 3, lower part).

Microarray hybridization of field isolates
A group of 12 field strains of C. psittaci was examined using the present AT microarray (Table 4). Genotypes were identified from the pattern of hybridization signals according to the matching scheme in Table 3. In addition, all strains were ompA sequenced, and genotypes were determined according to individual positions in the above global alignment. PCR-RFLP results are also given for comparison in Table 4. The findings of AT ompA genotyping are in complete agreement with the data of the other two tests.
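The idea of reading a genotype off the hybridization pattern can be sketched in Python as follows. This is only one possible scoring rule, assumed here for illustration and not the procedure used in the study: for each genotype, the mean normalized signal of its perfect-match probes is compared with the mean signal of its mismatched probes, and the genotype with the largest difference is reported. Probe names, genotype names and signal values are hypothetical.

```python
def score_genotype(signals: dict[str, float], mismatches: dict[str, int]) -> float:
    """Mean signal of perfect-match probes minus mean signal of mismatched probes."""
    perfect = [signals[p] for p, m in mismatches.items() if m == 0]
    mismatched = [signals[p] for p, m in mismatches.items() if m > 0]
    if not perfect:
        return float("-inf")  # no perfect-match probe: genotype cannot be supported
    mean_perfect = sum(perfect) / len(perfect)
    mean_mismatched = sum(mismatched) / len(mismatched) if mismatched else 0.0
    return mean_perfect - mean_mismatched


def assign_genotype(signals: dict[str, float],
                    table: dict[str, dict[str, int]]) -> tuple[str, dict[str, float]]:
    """Return the best-scoring genotype and all scores."""
    scores = {g: score_genotype(signals, row) for g, row in table.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical mismatch table (genotype -> probe -> mismatch count) and signals.
table = {
    "type_X": {"p1": 0, "p2": 1, "p3": 0},
    "type_Y": {"p1": 1, "p2": 0, "p3": 0},
}
signals = {"p1": 0.80, "p2": 0.15, "p3": 0.75}
best, scores = assign_genotype(signals, table)
```

A real implementation would also need a rule for ambiguous or weak patterns, for example a minimum score below which a sample is reported as untypable.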

Discussion
The currently accepted system of genotyping for C. psittaci strains has evolved historically from its serological predecessor. The fact that serotypes have genotype equivalents [9], which has also been reported for, e.g., E. coli [14], proved to be helpful in the transition to the more accessible DNA-based typing methods, which are easier to standardize. However, given the serological history, it is not surprising that the ompA sequence analysis has revealed two notable limitations of the present genotype classification, i.e. the lack of complete coverage of naturally occurring strains and a general imbalance resulting from significant variations in genetic distances between individual genotypes. The latter reflects an inherent bias since the ideal panel would include genotypes genetically equidistant from each other.
Given the described deficiencies, the question about the usefulness of the present genotyping scheme inevitably arises. For instance, does the close genetic relatedness within the ABE cluster justify its subdivision into the four genotypes A, B, E and E/B?
The authors are of the opinion that there are important arguments in favor of maintaining the present classification and nomenclature, provided it is amended by a few adjustments and extensions.
i) Despite the amazing genetic heterogeneity displayed by ompA sequence database entries, it seems that the vast majority of field strains belong to the ABE cluster. This is indicated by several published studies [18,10,11,19], by the data presented in Table 4 and also the long-term experience of the authors' laboratories (data not shown). The small group of so far untyped isolates appears to represent only a small proportion of naturally occurring strains.
ii) Each genotype should be defined by a representative reference strain and its complete ompA sequence. While a genotyping system based on a single gene may not appear sufficiently comprehensive in the age of genomics, it should be noted that ompA encodes the major protein antigen of chlamydiae, and even the minor sequence variations including SNPs are mostly translated into different amino acids (and, potentially, different epitopes). The fact that genotype-specific antigens can be distinguished by specific MAbs indicates that these differences do matter in the context of immunogenicity, host preference, virulence and, thus, epidemiological importance. This is why we suggest maintaining the currently accepted genotypes, including the closely related ones of the ABE cluster.
iii) To account for the natural variability of ompA sequences in C. psittaci strains, emerging branches of untyped isolates should become provisional genotypes until their epidemiological relevance and representative status are proven. To account for intra-genotype variability, we suggest that heterogeneous genotypes, such as A, E/B and D, should be further divided into subgroups named after a typical representative, e.g. A-VS1, A-6BC, etc. (see Table 1). This approach will ensure openness and flexibility of the genotyping scheme.
iv) The amended genotyping scheme should be re-evaluated and overhauled when a sufficiently large number of complete genome sequences of C. psittaci strains becomes available.
If ompA genotyping of C. psittaci is to be conducted in the framework of surveillance and monitoring studies, the performance of PCR-RFLP will be insufficient because of its limited sensitivity and inability to characterize atypical strains. As mentioned above, serological typing is limited to only six serotypes, is not amenable to high-throughput examination, and lacks a strictly molecular basis. The real-time PCR procedure proposed by  can be very sensitive and specific for identification of seven of the established genotypes, but it is rather laborious and expensive because each sample has to be examined in seven individual runs. Sequencing of the complete ompA gene should serve as the gold standard for genotyping of C. psittaci since the respective DNA sequence contains all details necessary for genotype identification and the data can be stored in public databases to be easily accessible to researchers and diagnosticians.

Figure 3. Illustration of the specificity of the hybridization reaction on the AT microarray for genotyping of C. psittaci. Alignments of target (ompA gene segments) and probe sequences are shown on the left-hand side, and the respective hybridization signals (including internal staining marker) are given on the right-hand side. Upper part: The signal generated by duplex formation at genotype E-specific probe VD2-04 is reduced to approx. 20% when the target has a single mismatch, such as genotype B. This applies also to genotype B-specific probe VD2-03, when reacting with genotype E. Lower part: Signal is reduced to less than 10% in the case of two mismatches on the target sequence.

However, due to cost and labor intensiveness, it is not feasible at present to use DNA sequencing of this 1212-bp gene as a routine procedure in all diagnostic laboratories.
Being more rapid and economical, the AT microarray assay developed in the present study represents a powerful tool for sequence-based, sensitive and reproducible high-throughput genotyping. In principle, the procedure consists in parallel probing of 35 different targets and, in view of the recognition of minor nucleotide sequence variations including SNPs, amounts to re-sequencing the discriminatory regions in VDs 2 and 4 of the ompA gene. Future use of the AT ompA genotyping assay will enable diagnosticians in human and veterinary medicine to trace epidemiological chains, explore the dissemination of the various genotypes and other strains of C. psittaci, as well as identify new representatives of this amazing pathogen.

Conclusion
According to the data of ompA analysis, C. psittaci is genetically the most heterogeneous species of the genus Chlamydophila. The traditional genotyping system should be amended and extended because it does not adequately reflect the extent of intra-species heterogeneity. Serology, PCR-RFLP and real-time PCR are unable to discriminate among all currently accepted genotypes and identify strains of new types. While genotyping based on complete ompA sequences should serve as gold standard, many of the smaller laboratories may not be able to use it in routine diagnosis. The results of the present study have demonstrated that the newly developed DNA microarray-based ompA genotyping assay represents a promising diagnostic tool.

Chlamydial strains
Genomic DNA of the following reference strains, each representing a particular genotype, was used to generate master patterns for the ompA genotyping assay: In addition, the field isolates given in Table 4 were examined in this study.
Products were electrophoresed in 1% agarose gels, and the specific bands of approximately 1200 bp were excised with a scalpel and DNA extracted using the innuPREP Gel Extraction Kit (Analytik Jena, Jena, Germany). DNA sequencing was carried out by cycle sequencing using the BigDye™ Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Darmstadt, Germany) according to the instructions of the manufacturer. The following primers were used: CTU, VD1-f (5'-ACT ACG GAG ATT ATG TTT TCG ATC GTG T-3

In silico analysis of ompA sequences
All available sequences of the ompA gene of C. psittaci were downloaded for analysis from the database of the National Center for Biotechnology Information (NCBI). A total of 210 sequence entries were found (by the date of manuscript submission). Of these, 89 sequences shorter than 500 nt were excluded from further analysis for failing to cover all four variable domains. Of the remaining 121 entries, 68 were found to belong to C. psittaci and 53 to other Chlamydophila species. Only 25 sequences from C. psittaci and 28 from other Chlamydophila spp. were found to include the full coding sequence (CDS) of approximately 1212 nt. In contrast, 70 entries lacked terminal parts of the CDS, but were retained because for some genotypes not a single complete sequence was available. All items were included in a global ompA sequence alignment using the program E-INS-I of the MAFFT package [20].
Classification was done first by visual inspection of the alignment using Clustal X [21] and subsequently by calculating a sequence similarity matrix (see Additional file 1), from which a split network graph was constructed (Fig. 1) using the program SplitsTree4 [22].
Before starting the similarity matrix calculation, redundant items were removed, i.e. any sequence that could be retrieved under another sequence was deleted, and only unique sequences were kept. The sequences were brought to identical length by cropping highly conserved 3' ends in order to avoid distorted results due to alignment of differently sized sequences. Thus, the analyzed segment comprised 992 nucleotide positions (median sequence length 942 nt) including all four variable domains. The final set contained 63 unique sequences.
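The redundancy filter and the similarity calculation described above can be sketched in Python. The sketch makes two assumptions that the text does not spell out: a sequence counts as redundant when it is identical to, or a substring of, another entry, and similarity is taken as percent identity over aligned columns in which neither sequence has a gap. The toy sequences and function names are hypothetical.

```python
def remove_redundant(seqs: dict[str, str]) -> dict[str, str]:
    """Keep only sequences that are not contained in an already kept sequence."""
    kept: dict[str, str] = {}
    # Process longest sequences first so that shorter, contained ones are dropped.
    for name, seq in sorted(seqs.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        if not any(seq in other for other in kept.values()):
            kept[name] = seq
    return kept


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences, skipping columns with a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = same = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        same += x == y
    return 100.0 * same / compared if compared else 0.0


def similarity_matrix(aligned: dict[str, str]) -> dict[str, dict[str, float]]:
    """Pairwise percent identity for all aligned sequences."""
    names = sorted(aligned)
    return {a: {b: percent_identity(aligned[a], aligned[b]) for b in names}
            for a in names}


# Hypothetical toy sequences; "s2" is contained in "s1" and is therefore dropped.
raw = {"s1": "ACGTACGTAC", "s2": "CGTACG", "s3": "ACGTTCGTAC"}
unique = remove_redundant(raw)
matrix = similarity_matrix(unique)  # toy sequences are already of equal length
```

In practice the unique sequences would be aligned and cropped to the common window before the matrix is computed; here the toy sequences are simply chosen to have equal length.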

Microarray design
The present array carries 35 oligonucleotide probes recognizing targets from VD2 and VD4 of the ompA gene of C. psittaci. Nucleotide sequences of all probes are provided in Table 2. The oligonucleotides had the following features: average size 26 nt (22 - 30), melting temperature 60.3°C (59.7 - 61.2), G+C content 46.0 mol-% (37.0 - 59.0). Each probe sequence was subjected to local BLAST analysis against all known C. psittaci ompA sequences to verify the respective genomic target and rule out unwanted cross-reactions. Biotinylated oligonucleotide probes were added to monitor the staining reaction, mark the corners of the array and facilitate normalization of signal intensities. Each genotype probe was spotted four-fold, the controls 15-fold, thus bringing the total number of spots on the array to 155. Fabrication of the AT microarrays was described previously [12].
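Two of the figures quoted above are simple to check in Python: the G+C content of a probe and the total spot count. The sketch assumes that the 15 control spots are counted once in total (the reading under which 35 probes in four replicates give 155 spots); the example sequence is a hypothetical toy string, not a probe from Table 2.

```python
def gc_content(seq: str) -> float:
    """G+C content of a nucleotide sequence in percent."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def total_spots(n_probes: int, replicates: int, n_control_spots: int) -> int:
    """Total number of spots: replicated genotype probes plus control spots."""
    return n_probes * replicates + n_control_spots


toy_probe = "ACGTACGTGG"                 # hypothetical 10-mer for illustration
toy_gc = gc_content(toy_probe)           # 6 of 10 bases are G or C
array_spots = total_spots(35, 4, 15)     # layout figures as stated in the text
```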

DNA extraction
DNA was extracted from cell-cultured strains using the High Pure PCR Template Preparation Kit (Roche Diagnostics, Mannheim, Germany).

Biotinylation PCR and AT hybridization
Target DNA was amplified and biotinylated by duplex PCR using two pairs of ompA primers covering all variable domains. Primer pair VD1-f and VD2-r (sequence as above, but 5'-biotinylated) gives rise to a 418-bp product which includes VD1 and VD2, whereas primers 201CHOMP and ompA-rev (sequence as above, but 5'-biotinylated) define a product of 570 bp covering VD3 and VD4. The temperature-time profile was: initial denaturation at 96°C for 60 sec, 40 cycles of 94°C for 30 sec, 50°C for 60 sec, 72°C for 30 sec, and final elongation at 72°C for 240 sec.
Normalized intensities of the spots were calculated according to the following equation: NI = 1 - (M/BG), with NI being the normalized intensity, M the average intensity of the spot, and BG the intensity of the local background. Values range from 0 (no signal) to 1 (strong signal).
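The normalization NI = 1 - (M/BG) can be written as a short Python sketch. The raw intensity values below are hypothetical, and averaging the replicate spots of a probe is an assumption added for illustration, since the text does not state how replicates were combined.

```python
def normalized_intensity(spot_mean: float, local_background: float) -> float:
    """NI = 1 - (M / BG); 0 means no signal, values near 1 mean strong signal."""
    if local_background <= 0:
        raise ValueError("local background must be positive")
    return 1.0 - spot_mean / local_background


def probe_signal(replicates: list[tuple[float, float]]) -> float:
    """Average NI over replicate spots given as (M, BG) pairs (assumed rule)."""
    values = [normalized_intensity(m, bg) for m, bg in replicates]
    return sum(values) / len(values)


# Hypothetical raw readings: a spot darker than its background gives a high NI.
ni_strong = normalized_intensity(0.2, 0.8)
ni_none = normalized_intensity(0.8, 0.8)
four_fold = probe_signal([(0.2, 0.8), (0.2, 0.8), (0.4, 0.8), (0.4, 0.8)])
```

Because the spot intensity M falls below the background BG where precipitate has formed, a darker spot yields a value closer to 1.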

PCR-RFLP genotyping
Extracted DNA was amplified by PCR using primers CTU/CTL, and the product was subjected to restriction enzyme digestion using AluI and MboII as described by Sayada et al. [8]. Following agarose gel electrophoresis, cleavage patterns were compared with those of the reference strains for genotypes A, B, C, D, E and F.