The current study revealed three novel and significant characteristics of the evolution and genetic diversity of C. pecorum infections in the koala: (1) the ompA gene has a phylogenetic history that is congruent with other gene targets in the C. pecorum genome, yet is phylogenetically insufficient for use as a single gene marker; (2) the tarP and ORF663 genes are potentially useful in representing C. pecorum genomic diversity and evolution; and (3) koala C. pecorum infections appear to be monophyletic, possibly suggesting a limited number of cross-host transmission events between koalas and non-koala hosts.
The ompA gene is one of the most polymorphic genes across all Chlamydia species and, as a result, was previously selected as the molecular marker of choice in epidemiological and genotyping studies of C. pecorum infections of the koala. This increased nucleotide diversity is reported to be due to the antigenicity of MOMP and the selective pressure of the host's immune response. Early C. trachomatis studies and more recent C. pecorum studies suggested that the phylogenetic categorisation of the ompA gene is not concordant with pathobiotypes, tissue tropisms, or the evolution of the genome as a whole [7, 11, 18, 20, 21]. Based on these findings, the use of the ompA gene as a molecular marker of koala C. pecorum genetic diversity also required re-evaluation.
Assumptions on the validity of ompA as a genetic marker for koala C. pecorum strains must be preceded by an appreciation of the koala C. pecorum phylogeny. Without in-depth MLST studies to determine the true C. pecorum phylogeny, this study applied our four genes of interest (ompA, incA, ORF663 and tarP) to a multi-locus approach to phylogeny in an effort to recreate the most accurate phylogenetic signal (Figure 2) using single gene targets. Some level of phylogenetic discordance is expected between these genes given their diverse metabolic functions, chromosomal locations, the possibility of evolutionary rate heterogeneity and the susceptibility of all four genes to recombination events. However, this multi-locus method benefits from a "majority rule" approach by allowing the amplification of congruous phylogenetic information while reducing the effects of phylogenetic "noise". In addition, the equalisation of outer branch lengths serves to resolve minor phylogenetic inconsistencies. Together, this results in a more accurate phylogeny than that inferred from a single gene [55, 56]. There was no perturbation of the tree topology when each gene was sequentially omitted from analysis, alleviating concerns that individual genes may dominate and sweep the phylogenetic signal. It is expected that the systematic addition of further gene data will continue to produce a more refined and resolute phylogeny; however, we suggest that the phylogenetic tree using concatenated sequences of ompA, incA, ORF663, and tarP provides a preliminary and useful indication of the true phylogenetic relationship between these koala C. pecorum samples and a prelude to future MLST and phylogenetic studies.
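As an illustration only, the concatenation and sequential gene-omission steps described above can be sketched as follows. This is a minimal sketch, not the software used in this study: the function names, the dictionary layout (gene name mapped to isolate name mapped to aligned sequence) and the toy sequences are all assumptions made for the example, and tree inference itself is left to a dedicated phylogenetics package.

```python
from typing import Dict, Iterator, List, Tuple

Alignment = Dict[str, str]  # isolate name -> aligned sequence (hypothetical layout)


def concatenate(alignments: Dict[str, Alignment], genes: List[str]) -> Alignment:
    """Join per-gene aligned sequences into one concatenated row per isolate."""
    # Keep only isolates that have a sequence for every selected gene.
    shared = set.intersection(*(set(alignments[g]) for g in genes))
    return {iso: "".join(alignments[g][iso] for g in genes) for iso in sorted(shared)}


def leave_one_out(
    alignments: Dict[str, Alignment], genes: List[str]
) -> Iterator[Tuple[str, Alignment]]:
    """Yield (omitted gene, concatenation of the remaining genes)."""
    for omitted in genes:
        kept = [g for g in genes if g != omitted]
        yield omitted, concatenate(alignments, kept)


# Toy data: invented sequences for two hypothetical isolates.
toy = {
    "ompA": {"iso1": "ACGT", "iso2": "ACGA"},
    "incA": {"iso1": "GG", "iso2": "GC"},
    "ORF663": {"iso1": "TTT", "iso2": "TTA"},
    "tarP": {"iso1": "CA", "iso2": "CA"},
}
genes = ["ompA", "incA", "ORF663", "tarP"]

full = concatenate(toy, genes)
reduced = dict(leave_one_out(toy, genes))
```

Each entry of `reduced` would then be passed to the same tree-building procedure as `full`, and the resulting topologies compared to check that no single gene dominates the signal.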
The phylogenetic tree generated from concatenated data clearly defines two distinct lineages among the four populations investigated: (1) the Pine Creek and East Coomera populations (separated by ~500 km), and (2) the Narangba and Brendale populations (separated by ~5 km), while each lineage is further subdivided into two clades, each representing an individual population. From an evolutionary standpoint, this phylogenetic reconstruction appears valid. For example, it is clear that the Brendale and Narangba populations remain geographically (and genetically) similar, as do the East Coomera and Pine Creek populations, albeit to a lesser degree. The genetic diversity and uniqueness of geographically isolated C. pecorum strains is presumably the result of disturbances to koala population distribution and structure from land clearing and urban pressure over the last 200 years of European settlement, leading to the formation of isolated koala colonies in which C. pecorum strains continue to undergo local selection and adaptation. The question that remains is how effective the four shortlisted genes are in abbreviating this vast phylogenetic information for epidemiological study.
Beginning with ompA, previous C. pecorum studies suggest that this gene is reflective of the overall evolution of the C. pecorum genome [7, 23]; however, these studies are based on broad comparisons between chlamydial species and do not represent evolutionary lineages on an intra-species level. In contrast, intra-species C. trachomatis studies have indicated that the ompA locus differs from other regions of its genome. The results of the present study illustrate a tendency for the phylogenetic topology of the ompA gene to separate the Narangba/Brendale populations from the Pine Creek/East Coomera populations, while other, more divergent strains do not cluster according to their respective population. These data would appear to correlate with previous C. pecorum fine-detailed epidemiological studies where it was concluded, using the ompA gene, that an association between the site of koala capture and the genotype of its resident C. pecorum strain usually exists, while some genotypes were distributed widely into different geographic areas. The phylogenetic divisions offered by the tree using concatenated sequences, however, clearly show that regions of the ompA gene are actively contributing to a misinterpretation of the "true" phylogenetic signal. This observation supports previous conclusions that ompA is ineffective as a genome-representative marker. It is therefore suggested that while the ompA gene continues to be a useful fine-detailed comparative marker, it remains suboptimal for any phylogenetic, evolutionary and/or biogeographic analysis. Both the tarP and ORF663 genes, conversely, are appealing alternatives to ompA.
The tarP gene encodes the translocated actin-recruiting phosphoprotein, which has important virulence functions involved in the attachment of the chlamydial elementary body to the host cell. The tarP gene's tendency for negative selection and relatively low mean nucleotide diversity reinforce its important biological role in the chlamydial cell and typify a gene that changes slowly enough to make it useful as an evolutionary chronometer. Recent C. trachomatis studies have suggested that the full-length tarP gene, based on the inverse relationship between the number of tyrosine repeats and the number of actin-binding domains, can be correlated with clinical phenotype, highlighting its potential as a useful genetic marker.
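For readers unfamiliar with the statistic, mean nucleotide diversity can be illustrated as the average proportion of differing sites over all pairs of aligned sequences. The following is a minimal sketch under stated assumptions: the function name and toy sequences are hypothetical, gaps and ambiguous bases are treated as ordinary characters, and it is not the software or exact estimator used in this study.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Average per-site proportion of differences over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions in every pair, then normalise by
    # the number of pairs and the alignment length.
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)


# Toy alignment (invented): one variable site among four positions.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy_alignment)
```

A conserved locus such as tarP would be expected to give a lower value under this measure than a highly polymorphic locus, all else being equal.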
The koala C. pecorum tarP gene phylogenetic tree produced two distinct clades which, interestingly, revealed a clear separation between the Brendale and Narangba isolates and the Pine Creek and East Coomera isolates. Initially, there appeared to be no distinction between ocular and urogenital sites of infection; however, upon further inspection, it was clear that (with the exception of Nar/Dion (Left Eye)) all the ocular isolates remained confined to one phylogenetic clade (among seven urogenital isolates), which is distinct from the remaining urogenital isolates. Importantly, this ocular "outlier" (Nar/Dion (Left Eye)) retains 100% nucleotide similarity with the remaining isolates within the Narangba population, all of which were isolated from urogenital sites of infection. Coupled with the fact that the koala 'Ned' from the East Coomera population harbours genetically distinct ocular and urogenital isolates of C. pecorum, this suggests that high rates of transmission within these confined koala populations may contribute to the transfer of C. pecorum from one body site to another and that the site of detection may not be the original niche of the strain. It appears that the tarP gene has potential as a phenotype-dependent marker; importantly, however, further investigation using the full-length tarP gene (in conjunction with wider geographic sampling) is required to properly determine its true potential.
From a full genome evolutionary standpoint, the separation of the Brendale/Narangba populations from the Pine Creek/East Coomera populations is a distinction that is clearly mirrored in the overall phylogenetic analysis using concatenated data. This suggests that tarP, although having a relatively low rate of substitution, is capable of more accurately and specifically differentiating koala strains according to geography than ompA and ORF663, albeit with reduced resolution. For these reasons, tarP also appears promising as an evolutionary indicator and may be classified as a "neutral marker", characterised by its selective constraints yet ability to reflect sequence diversity between koala populations that are geographically separate. However, as a "neutral marker", the tarP gene remains less useful when estimating a population's adaptive potential or local population divergence.
ORF663 encodes a hypothetical protein and includes a 15-nucleotide variant coding tandem repeat (CTR) region that putatively associates it with a virulence-related role. Interestingly, this gene has not been identified in any other chlamydial species, and a BLAST search reveals no similarities to any other sequences in the database. The C. pecorum ORF663 gene was the most polymorphic gene among all investigated and represents a locus under considerable positive selection. Using this gene, we were able to observe the most distinctions between strains by identifying seven separate genotypes. These genotypes highlight the considerable discriminatory capacity of ORF663, which correlates with (while extending) the divisions made by ompA and tarP by isolating the Narangba and Brendale populations into their own genotypes while separating the more heterogeneous Pine Creek and East Coomera populations into multiple genotypes. Where the tarP gene represents a neutral marker that assumes isolates within a population are equally related to each other, ORF663 can be considered a "divergence-based" or "contingency" marker that is capable of characterising diversity both within and between populations for fine-detailed epidemiological study.
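The notion of a genotype used here can be illustrated by grouping isolates that share an identical sequence at a given locus. The sketch below is illustrative only: the function name, isolate labels and sequences are hypothetical, exact string identity is assumed as the grouping criterion, and it is not the genotyping procedure used in this study.

```python
from collections import defaultdict
from typing import Dict, List


def group_genotypes(sequences: Dict[str, str]) -> List[List[str]]:
    """Group isolate names by identical aligned sequence at one locus."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for isolate, seq in sequences.items():
        groups[seq.upper()].append(isolate)
    # Sort within and between groups so the output order is deterministic.
    return sorted(sorted(members) for members in groups.values())


# Toy locus (invented sequences and isolate labels).
toy_locus = {
    "popA_1": "ACGTAC",
    "popA_2": "ACGTAC",
    "popB_1": "ACGTTC",
    "popC_1": "ACCTTC",
    "popC_2": "acctTC",  # same sequence as popC_1, different letter case
}
genotype_groups = group_genotypes(toy_locus)
n_genotypes = len(genotype_groups)
```

Under this definition, a more polymorphic locus resolves the same set of isolates into a larger number of groups.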
The value of the marker genes identified in this study was extended to consider the genetic diversity between C. pecorum infections in koalas and non-koala hosts. Previous research, supported by ompA VD3/4 sequence data, has suggested that C. pecorum is a polyphyletic organism in Australian koala populations. This hypothesis originated from the similarity of one or two koala ompA genotypes to European bovine isolates of C. pecorum [7, 11] and, based on these data, a model was proposed whereby koalas obtained C. pecorum infections as a result of a series of cross-species transmission events from sheep and/or cattle [7, 8, 11, 60]. While similar results were obtained using ompA data in this study (Figure 3), the phylogenetic analysis has already suggested the inadequacy of the ompA gene alone in representing C. pecorum's true evolutionary course within koala populations. Indeed, both this and previous studies utilised a 465 bp fragment of the ompA locus (VD3/4) which, while containing the majority of ompA's nucleotide variation, would remain largely insufficient to describe the extensive genetic diversity that has accumulated in global isolates of C. pecorum.
Consequently, we prepared an unrooted phylogenetic tree from the concatenation of incA, ompA, and ORF663 sequences, revealing a surprising alternative picture that clearly distinguishes koala C. pecorum strains from those of non-koala hosts (Figure 4). This distinction is further supported by the noticeable difference in branch lengths between koala C. pecorum sequences and those of non-koala hosts, suggesting that, as a whole, koala strains are much more closely related to each other than to non-koala host strains. This result is significant as it may be an example of an alternate evolutionary model in which koalas obtained C. pecorum as a result of a limited number of cross-host transmission events in the past and have subsequently evolved along an evolutionary trajectory that is distinct from that seen in sheep and cattle isolates. This result also reinforces the benefit and efficacy of applying more phylogenetically robust data (the concatenation of three congruent genes) to the epidemiological study of C. pecorum infections, both in koala and non-koala hosts. It must be noted, however, that this remains a cautionary finding. Without ompA, incA, and ORF663 nucleotide sequences from Australian sheep and cattle isolates, it remains impossible to truly establish a compelling cross-host transmission hypothesis for koala isolates. Nevertheless, these data cannot be completely discounted and function as preliminary insight into the genetic diversity of koala isolates of C. pecorum.
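Because the tree is unrooted, the statement that koala strains group apart from non-koala strains amounts to saying that a single branch separates the two sets of leaves. A minimal sketch of that check is given below. It assumes a tree written as nested tuples with an arbitrary root; the function names and leaf labels are hypothetical, and the example tree is invented rather than taken from Figure 4.

```python
from typing import FrozenSet, Iterable, List, Tuple, Union

Tree = Union[str, tuple]  # a leaf name, or a tuple of subtrees


def leaf_sets(tree: Tree) -> Tuple[FrozenSet[str], List[FrozenSet[str]]]:
    """Return (all leaves under tree, leaf set of every subtree including tree)."""
    if isinstance(tree, str):
        single = frozenset([tree])
        return single, [single]
    all_leaves: FrozenSet[str] = frozenset()
    collected: List[FrozenSet[str]] = []
    for child in tree:
        leaves, sets = leaf_sets(child)
        all_leaves |= leaves
        collected.extend(sets)
    collected.append(all_leaves)
    return all_leaves, collected


def separated_by_one_branch(tree: Tree, group: Iterable[str]) -> bool:
    """True if some branch splits the leaves into `group` and everything else."""
    target = frozenset(group)
    all_leaves, sets = leaf_sets(tree)
    # With an arbitrary root, the group may appear as a subtree or as the
    # complement of a subtree, so both cases are checked.
    return target in sets or (all_leaves - target) in sets


# Invented example tree with hypothetical leaf labels.
toy_tree = ((("koala1", "koala2"), "koala3"), ("sheep1", "cattle1"))
koalas_grouped = separated_by_one_branch(toy_tree, {"koala1", "koala2", "koala3"})
mixed_grouped = separated_by_one_branch(toy_tree, {"koala1", "sheep1"})
```

Checking the complement as well as the group itself is what makes the test independent of where the nested-tuple representation happens to be rooted.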