Strains of non-typeable (NT) Haemophilus influenzae asymptomatically colonize the human pharynx, but are also opportunistic pathogens that cause localized respiratory tract infections such as otitis media, pneumonia, bronchitis, sinusitis, and COPD exacerbation [1, 2]. Bacterial factors that differentiate disease-causing from commensal strains are largely unknown because the population structure of NT H. influenzae is genetically heterogeneous. The association of bacterial factors with disease-causing strains can be inferred, however, by comparing the prevalence of genetic traits between epidemiologically defined collections of disease and commensal strains [4–7] or, alternatively, between the pathogenic species and a phylogenetically close but non-pathogenic relative [8–11].
Haemophilus haemolyticus is a phylogenetically close relative of NT H. influenzae, but has not been associated with disease [7, 12, 13]. The two species reside in the same host niche, overlap extensively by both taxonomic and phylogenetic analyses [10, 14, 15], and exchange DNA through natural transformation [10, 13, 16]. Given their close relationship, but difference in disease potential, NT H. influenzae and H. haemolyticus likely possess common genes or genetic traits for commensal growth but differ in genes or traits that facilitate disease.
Historically, H. haemolyticus has been considered a rarely encountered commensal that was easily differentiated from NT H. influenzae by its hemolytic phenotype [17–19]. Recent studies, however, have shown that 20-40% of isolates in various NT H. influenzae collections were misclassified and were in fact non-hemolytic H. haemolyticus [7, 13]. These observations suggest that H. haemolyticus is significantly more prevalent in the pharynx than previously thought, and that clinical differentiation of the species from throat and sputum samples is inadequate. Therefore, we recently sought to differentiate the species by their relative proportions of selected NT H. influenzae virulence genes and observed that a probe made to licA, an NT H. influenzae gene necessary for phosphorylcholine (ChoP) modification of LOS, hybridized to 96% of NT H. influenzae isolates and to 42% of H. haemolyticus isolates. How ChoP expression compares between NT H. influenzae and H. haemolyticus is unknown, but differences between the species may highlight roles for ChoP that are important in NT H. influenzae virulence.
In studies addressing NT H. influenzae virulence, ChoP-modified LOS has been shown to promote bacterial adherence and invasion of host cells through interaction with the platelet activating factor receptor, to increase bacterial resistance to host antimicrobial peptides such as cathelicidin (or LL-37/hCAP18), and to modulate the host inflammatory response directed toward bacteria present in biofilms [20–22]. Paradoxically, given its role in enhancing colonization and virulence, ChoP can also bind C-reactive protein (CRP), which initiates C1q binding and leads to activation of the classical complement pathway and bacterial killing. The concentration of CRP (in both serum and respiratory tract secretions) dramatically increases during inflammation, and has been proposed to facilitate clearance of ChoP-expressing bacteria in the respiratory tract [24, 25]. Human ChoP-specific antibodies capable of eliciting in vitro bactericidal activity against some H. influenzae strains have also been identified, suggesting a further liability of H. influenzae ChoP expression. H. influenzae may avoid CRP and anti-ChoP antibody binding, however, by phase varying ChoP expression and by strain-dependent localization of ChoP substitutions within LOS [27, 28].
In H. influenzae, ChoP expression is controlled by a contingency locus, lic1, that contains the licA, licB, licC, and licD genes (encoding a choline kinase, a choline permease, a pyrophosphorylase, and a diphosphonucleoside choline transferase, respectively). Contingency loci, such as lic1, contain simple sequence repeats (SSRs) that provide an organism with the ability to phase vary specific phenotypes in response to host challenges. In lic1, the SSRs are tetranucleotide repeats (5'-CAAT-3') present at the 5' end of licA, the first gene in the locus. During replication, intragenic SSRs undergo slipped-strand mispairing, which results in translational phase variation, and the rate of these mutations is proportional to the length of the repeat region. De Bolle et al. found that mutation rates of an H. influenzae type III restriction modification gene (mod) engineered to contain 17-38 tetranucleotide (AGTC) intragenic repeats increased linearly with the number of repeats. In contrast, the same gene containing 5-11 repeats demonstrated rare, if any, phase variation. Thus, higher numbers of repeats in a contingency locus may protect the bacteria by decreasing the response time to host challenges. Among H. influenzae strains, however, the number of licA gene 5'-CAAT-3' repeats ranges from 3 to 56, and patterns pertaining to virulence have not been identified [32, 33].
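The link between repeat number and translational phase variation can be illustrated with a short sketch. This is a minimal illustration and not part of the methods of any study cited here: the function names and the toy sequence are hypothetical, and the sketch makes no assumption about which frame corresponds to ChoP expression. It only shows that, because the repeat unit is four nucleotides long, gaining or losing one unit shifts the downstream reading frame by one nucleotide, so the same frame recurs only at every third repeat count.

```python
import re


def longest_tandem_run(seq: str, motif: str = "CAAT") -> int:
    """Return the largest number of back-to-back copies of `motif` in `seq`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


def frame_offset(n_repeats: int, motif_len: int = 4) -> int:
    """Offset (0, 1, or 2 nt) that the repeat tract contributes to the downstream frame."""
    return (n_repeats * motif_len) % 3


# Hypothetical toy sequence: a start codon, five CAAT units, then arbitrary bases.
toy = "ATG" + "CAAT" * 5 + "GGTACC"
n = longest_tandem_run(toy)

# Compare the frame contribution after losing or gaining one repeat unit.
for k in (n - 1, n, n + 1):
    print(f"{k} repeats -> frame offset {frame_offset(k)}")
```

In this toy case, each single-unit slippage event moves the downstream sequence into a different frame, which is why repeat-number changes at the 5' end of a gene can switch translation of the full-length product on or off.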
Depending on the H. influenzae strain, ChoP may be substituted at different positions within LOS. Substitutions may occur on oligosaccharides that extend from any one of the three conserved inner-core heptose residues (heptose I, II, and III) or, alternatively, directly on heptose IV, an outer-core heptose that extends from heptose I [34, 35]. These substitutions are dictated largely by the diphosphonucleoside choline transferase encoded by the licD gene. Three licD gene alleles mediate ChoP substitutions at different positions within LOS and, for simplification, we have named the alleles to reflect their association with a given heptose residue: licDI, licDIII, and licDIV. Although ChoP has been associated with heptose II residues in selected strains, a specific licD allele mediating these substitutions has not been experimentally documented. The deduced LicD proteins are 265-268 amino acids in length and range in sequence identity from 74-88%, with much of the variation occurring in the central part of the primary structure [28, 35]. Although most NT H. influenzae strains possess a single licD allele that facilitates one ChoP substitution, Fox et al. recently reported that 4/25 (16%) of NT H. influenzae middle ear strains possessed two different licD alleles, each present in a separate, phase-variable lic1 locus, that together could produce up to two ChoP substitutions in the strain's LOS.
Both the number and position of ChoP substitutions within LOS may affect binding of host clearance molecules such as CRP or natural ChoP antibodies [26, 28]. For instance, H. influenzae strains with dual ChoP substitutions bind more CRP, and H. influenzae strains with ChoP substitutions positioned off the distal heptose III residue are 10-fold more sensitive to CRP-initiated killing than when ChoP is associated with the proximal heptose I in the same strains [28, 35]. Consequently, strains with proximal ChoP substitutions (i.e. on heptose I) may be more protected from CRP-mediated clearance, and LOS structural studies on selected NT H. influenzae strains have found that ChoP predominates at this position. The overall prevalence of these substitutions in the NT H. influenzae population, however, is not known. Differences in the prevalence of single or combined licD gene alleles between NT H. influenzae and H. haemolyticus may reflect the importance of ChoP structures in NT H. influenzae virulence.
The presence of a licA gene in H. haemolyticus suggests that it may contain a lic1 locus and express ChoP in a manner similar to H. influenzae. Since ChoP expression among NT H. influenzae strains can vary greatly due to the genetic factors listed above, we speculated that differences in the prevalence of these factors between strain populations of H. influenzae and H. haemolyticus may highlight, in part, which ones provide an advantage to H. influenzae in transitioning from commensal to disease-related growth.