A single amino acid substitution in aromatic hydroxylase (HpaB) of Escherichia coli alters substrate specificity of the structural isomers of hydroxyphenylacetate

Background: A broad range of aromatic compounds can be degraded by enteric bacteria, and hydroxyphenylacetic acid (HPA) degrading bacteria are the most widespread. The majority of Escherichia coli strains can use both structural isomers of HPA, 3HPA and 4HPA, as the sole carbon source; both are catabolized by the same pathway, whose enzymes are encoded by the hpa gene cluster. Previously, we observed that E. coli B REL606 grew only on 4HPA, while E. coli B BL21(DE3) grew on 3HPA as well as 4HPA.
Results: In this study, we report that a single amino acid in 4-hydroxyphenylacetate 3-hydroxylase (HpaB) of E. coli determines the substrate specificity for HPA isomers. Alignment of protein sequences encoded in the hpa gene clusters of BL21(DE3) and REL606 showed that the two differed by only one amino acid (position 379 in HpaB), viz., Arg in BL21(DE3) and Cys in REL606. REL606 cells expressing HpaB with Arg379 could grow on 3HPA, whereas those expressing HpaB with Gly379 or Ser379 could not. Structural analysis suggested that the amino acid residue at position 379 of HpaB is located not in the active site, but in the vicinity of the 4HPA binding site, and that it plays an important role in mediating the entrance and stable binding of substrates to the active site.
Conclusions: The arginine residue at position 379 of HpaB is critical for 3HPA recognition. Information regarding the effect of amino acid residues on the substrate specificity for structural isomers can facilitate the design of hydroxylases with high catalytic efficiency and versatility.


Background
Hydroxyphenylacetic acid (HPA) is an aromatic compound that is abundantly present in nature [1,2]. Its structural isomers, 3HPA and 4HPA, are phenylacetic acids in which a hydroxy group substitutes the hydrogen atom at the meta and para positions on the benzene ring (Additional file 1: Figure S1). In the human intestine, 3HPA and 4HPA are the major metabolites produced during the degradation of flavonoids, proanthocyanidin [3] and kaempferol [4], respectively. Further, 4HPA is the main product of L-tyrosine fermentation in the intestine [2]. Additionally, 4HPA has been proposed as a candidate hepatoprotective drug [5] and a biological marker for depression and anxiety [6]. Furthermore, 4HPA can be derived from the biodegradation of lignin, which is an abundant component of the lignocellulosic biomass [7,8].
Aromatic compounds are predominantly degraded by bacteria and fungi [1]. Most enteric bacteria can use HPA as a carbon source, with E. coli being the most studied [2,8]. Among the laboratory strains of E. coli, E. coli B, C, and W can grow on 3HPA and 4HPA, whereas E. coli K-12 cannot [9]. In E. coli, both compounds are catabolized via the homoprotocatechuate (3,4-dihydroxyphenylacetate) (HPC) pathway and are subsequently converted into pyruvate and succinate. The hpa gene cluster contains 11 genes: eight catabolic genes organized into hpaBC (HPA hydroxylase operon) and hpaGEDFHI (HPC meta-cleavage operon), two regulatory genes (hpaR and hpaA), and hpaX, which encodes the HPA transporter [10] (Additional file 1: Figure S2). As the G + C content of the hpa cluster is similar to that of the E. coli genome, and as most enteric bacteria can utilize 3HPA and 4HPA, the lack of growth of E. coli K-12 on HPA might be due to the loss of an hpa cluster that was present in its ancestors (Additional file 1: Figure S2) [2].
Both 3HPA and 4HPA are hydroxylated to HPC by the 4-hydroxyphenylacetate 3-monooxygenase complex (HpaBC), which catalyzes the initial step in aerobic HPA catabolism [2]. The complex is a flavin adenine dinucleotide (FAD)-dependent hydroxylase consisting of a monooxygenase (HpaB) and a flavin reductase (HpaC) [11]. Although HpaB requires the reduced FAD supplied by HpaC, HpaB expressed in E. coli K-12 without concurrent expression of HpaC still showed hydroxylating activity [12]. E. coli HpaB can hydroxylate a broad range of phenolic compounds, from simple phenol to complex phenylpropanoids [2,13,14]. The crystal structure of E. coli HpaB was recently determined and suggests that a unique loop structure covering the active site is essential for this catalytic versatility [15].
The E. coli B lineages BL21(DE3) and REL606 have long been used for numerous biotechnological applications and for long-term experimental evolution, respectively [16]. In our previous studies, the two strains differed in their ability to use HPA as the sole carbon source: BL21(DE3) could grow on both isomers, 3HPA and 4HPA [17], whereas REL606 could grow only on 4HPA [18]. In this study, we report that a single amino acid residue in HpaB is responsible for the altered substrate specificity for the HPA isomers. The codon encoding this residue was subjected to site-directed mutagenesis to provide experimental evidence. Based on protein structure homology modeling and substrate docking simulation, the single amino acid residue was shown to have structural importance in recognizing 3HPA but not 4HPA.

Results
Identification of a single amino acid change in HpaB of E. coli REL606
Our previous phenotype microarray tests of the two E. coli B strains had revealed that REL606 utilized only 4HPA [18], whereas BL21(DE3) utilized both 3HPA and 4HPA [17]. We first checked whether the hpa gene clusters of these strains differ in sequence. Pairwise sequence alignment of the 11,127-bp genomic regions encompassing the 11 hpa genes revealed that only two nucleotides differed in protein-coding sequences between the two strains. Compared to BL21(DE3), REL606 exhibited one non-synonymous substitution in hpaB (C → T at the 1135th bp from the start codon, changing arginine to cysteine at amino acid residue 379) (Fig. 1a) and one synonymous substitution in hpaH (C → T at the 195th bp from the start codon).
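The correspondence between nucleotide 1135 and residue 379 can be checked with simple arithmetic. The following Python sketch maps a 1-based coding-sequence position to its codon and, assuming the standard genetic code, lists which arginine codons would become cysteine codons after a C → T change at that codon position. The actual codon used in hpaB is not stated in the text, so the result only narrows the possibilities; the function names are illustrative.

```python
# Partial standard genetic code (DNA codons); only the entries needed here.
CODON_TABLE = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "AGA": "Arg", "AGG": "Arg",
    "TGT": "Cys", "TGC": "Cys", "TGA": "Stop", "TGG": "Trp",
}


def codon_of(nt_pos):
    """Map a 1-based CDS nucleotide position to (codon number, position in codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def arg_codons_giving_cys(offset):
    """Arg codons that become Cys codons after C -> T at the 1-based codon offset."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != "Arg" or codon[offset - 1] != "C":
            continue
        mutated = codon[:offset - 1] + "T" + codon[offset:]
        if CODON_TABLE.get(mutated) == "Cys":
            hits.append(codon)
    return sorted(hits)


codon_number, offset = codon_of(1135)
print(codon_number, offset)            # 379 1
print(arg_codons_giving_cys(offset))   # ['CGC', 'CGT']
```

Under these assumptions, position 1135 is the first base of codon 379, and only CGT or CGC would yield cysteine (TGT or TGC) after the transition; CGA and CGG would instead give a stop codon and tryptophan, respectively.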
Intrigued by the amino acid difference in HpaB, we performed phylogenetic analysis of 20 HpaB homologs (Fig. 1b). The reconstructed phylogenetic tree was in agreement with a previous report [11]. In most cases (14 out of 20), arginine occupied position 379 of HpaB. Interestingly, this arginine residue was conserved in all enteric bacteria except REL606.
The historical origin of REL606 and BL21(DE3) has been well documented [21]. The two E. coli B strains shared a common ancestor sometime between 1942 and 1959 and subsequently went through different sets of genetic manipulations [22]. Detailed comparison of the genomic sequences of BL21(DE3) and REL606 provided a plausible explanation for every single base-pair difference [22]. Evidently, the C-to-T transition in hpaB was caused by 1-methyl-3-nitro-1-nitrosoguanidine (MNNG)-mediated mutagenesis of the REL606 progenitor, and the mutation subsequently became unintentionally fixed through single-colony isolation.
We investigated the effect of expressing BL21(DE3)-derived hpaB in REL606 on bacterial growth with 3HPA as the sole carbon source. Among REL606 cells harboring each of the three plasmids [pHCE-IIB, pHCE-IIB-HpaB(R379), and pHCE-IIB-HpaB(C379)], only those with pHCE-IIB-HpaB(R379) grew on defined medium supplemented with 3 g/L of 3HPA (Fig. 2). In the culture supernatant of this strain, the 3HPA concentration decreased as cell density increased, and extracellular hydroxylated 3HPA (HPC) accumulated up to a maximum concentration of 0.7 g/L after 30 h of incubation. This result demonstrates that the inability of REL606 to grow on 3HPA can be completely rescued by the expression of BL21(DE3)-derived hpaB.

Site-directed mutagenesis of HpaB
As substitution of arginine with cysteine in HpaB rendered E. coli unable to metabolize 3HPA, we performed site-directed mutagenesis to explore whether replacement with other amino acids has the same effect on substrate specificity. The arginine residue at position 379 of the BL21(DE3)-derived HpaB was replaced with glycine or serine. Glycine is the smallest residue, and the lack of a side chain makes it the most flexible amino acid; glycine residues are therefore often located in enzyme active site regions [23]. Serine differs from cysteine only in having an oxygen atom in place of the sulfur atom and, unlike cysteine, cannot form a disulfide bond.
We constructed pHCE-IIB-HpaB(G379) and pHCE-IIB-HpaB(S379), which constitutively express HpaB with a single amino acid substitution at position 379 (glycine and serine, respectively). Each of the constructed vectors was transformed into E. coli REL606. The expression of the HpaB proteins was confirmed by running the proteins on a sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gel (Additional file 1: Figure S3). The gel image showed that endogenous HpaB was hardly detectable in REL606 transformed with the empty plasmid, whereas the HpaB variants cloned in pHCE-IIB were expressed in large amounts that increased with culture time. The transformed strains were grown on defined medium supplemented with 0.76 g/L (5 mM) 3HPA or 4HPA as the sole carbon source (Fig. 3). For comparison, REL606 cells expressing BL21(DE3)-derived hpaB [pHCE-IIB-HpaB(R379)] and REL606-derived hpaB [pHCE-IIB-HpaB(C379)] were also tested. All the strains, whichever residue HpaB carried at position 379 (Arg, Cys, Gly, or Ser), grew on 4HPA (Fig. 3a). However, when 3HPA was used as the carbon source, only cells expressing HpaB with Arg379 exhibited growth (Fig. 3b). These results suggest that Arg379 in HpaB plays an important role in recognizing 3HPA but not 4HPA.
Fig. 1 Sequence alignment of HpaB homologs. a Pairwise sequence alignment between HpaBs from E. coli BL21(DE3) and REL606 at the nucleotide and protein sequence levels. The aligned nucleotides were identical except for those at position 1135 of hpaB. b Phylogenetic tree of HpaB homologs. Multiple alignments of 20 protein sequences were performed using MUSCLE [19]. The maximum likelihood tree was built using MEGA X with bootstraps of 1000 replicates [20]. Bootstrap values (as percentages) are denoted at internal nodes. For each HpaB homolog of E. coli BL21(DE3), the BLASTP result is given as the accession number, amino acid identity, and e-value. Scale bar indicates substitutions per amino acid sequence site. Amino acid residues at position 379 are shown in the box at the right.
The strains expressing HpaB with each of the four residues at position 379 (Arg, Cys, Gly, or Ser) were also tested for their ability to degrade L-tyrosine (Additional file 1: Figure S4). In the defined medium supplemented with 3 g/L glucose and 0.54 g/L (3 mM) L-tyrosine, brown coloration of the medium was observed only for REL606 expressing the BL21(DE3)-derived hpaB gene from pHCE-IIB-HpaB(R379). As catechol derivatives spontaneously form black or brown oxidation products [13], the brown color of the culture medium is a read-out of HpaB-mediated hydroxylation of L-tyrosine. This result demonstrated that the inability of REL606 to degrade L-tyrosine can be rescued by the expression of BL21(DE3)-derived hpaB. Collectively, these results suggest that the arginine residue at position 379 of HpaB is critical for recognition of 3HPA and L-tyrosine.
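The mass and molar concentrations quoted for the growth media can be cross-checked from the molecular formulas. The sketch below is a minimal Python example, assuming standard atomic masses rounded to three decimals and the formulas C8H8O3 for both HPA isomers and C9H11NO3 for L-tyrosine; the helper names are illustrative and not part of the study's methods.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# 3HPA and 4HPA are isomers and share one formula; L-tyrosine is C9H11NO3.
HPA = {"C": 8, "H": 8, "O": 3}
L_TYROSINE = {"C": 9, "H": 11, "N": 1, "O": 3}


def molar_mass(formula):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mm_to_g_per_l(millimolar, formula):
    """Convert a concentration in mM to g/L."""
    return millimolar * molar_mass(formula) / 1000.0


def g_per_l_to_mm(grams_per_liter, formula):
    """Convert a concentration in g/L to mM."""
    return grams_per_liter / molar_mass(formula) * 1000.0


print(round(mm_to_g_per_l(5, HPA), 2))         # 0.76
print(round(mm_to_g_per_l(3, L_TYROSINE), 2))  # 0.54
print(round(g_per_l_to_mm(3, HPA), 1))         # 19.7
```

The first two values agree with the 0.76 g/L (5 mM) HPA and 0.54 g/L (3 mM) L-tyrosine given in the text; the third shows that the 3 g/L of 3HPA used in the rescue experiment corresponds to roughly 20 mM.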

Structural importance of the 379th position in HpaB
So far, crystal structures of FAD-dependent HpaB have been determined for the apoenzyme from E. coli (PDB ID: 6EB0) [15] and for the enzyme complex with FAD and 4HPA from Thermus thermophilus HB8 (PDB ID: 2YYJ) [24]. To gain structural insight into the importance of the 379th amino acid residue for the recognition of HPAs by HpaB, its structural position was identified using the crystal structure of the E. coli HpaB apoenzyme [15], which has the same amino acid sequence as HpaB of BL21(DE3) (Fig. 4a). The geometry of Cys379 was predicted by homology modeling based on the crystal structure of E. coli HpaB [15] (Fig. 4b). The active site at which 4HPA binds was identified using the crystal structure of the HpaB-FAD-4HPA complex from T. thermophilus [15,24].
From the crystal structure of the HpaB-FAD-4HPA complex of T. thermophilus, 4HPA was predicted to bind to the active site and to an extra binding site in HpaB [24]. Their structural positions were identified using the crystal structure of the E. coli HpaB apoenzyme [15]. As shown in Fig. 4, the 379th position was located [...] [15] and a groove that serves as the binding site for FADH2 and the substrate [24]. The location of position 379 in the vicinity of the predicted extra binding sites, together with the reduced interaction with the C-terminal tail when Cys379 is present, suggests that Arg379 is optimized for the entrance and stable binding of substrates into the active site. Detailed structural analysis is required to understand how the identified region affects substrate specificity.
The phenol hydroxyl group of 4HPA forms a hydrogen bond with the binding site of HpaB, which is structurally conserved between HpaBs of E. coli and T. thermophilus [11]. Thus, we reasoned that the orientation of the hydroxyl group of HPA may be important in the recognition of the substrate. To predict the conformation and affinity of 3HPA and 4HPA binding to HpaB, the substrates were docked into the active site of the HpaB-FAD-4HPA complex from T. thermophilus [24] (Additional file 1: Figure S5). The binding conformation of 4HPA from the docking simulation matched well with that from the crystal structure [the root mean square deviation (RMSD) of the backbone atoms of 4HPA was 1.053 Å and the predicted binding affinity was −6.7 kcal/mol], which validated our docking protocol. However, the RMSD of docked 3HPA relative to 4HPA in the crystal structure increased to 1.921 Å, and the predicted binding affinity weakened to −6.1 kcal/mol. This suggests that the orientations of 3HPA and 4HPA differ, further implying that they may interact with different residues when they bind to the active site of HpaB.
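To put the two docking metrics in perspective, the following Python sketch shows how an RMSD between two poses is computed and what a 0.6 kcal/mol difference in predicted binding energy would imply if the scores were read as binding free energies. Several assumptions apply: the RMSD is taken over matched atoms without superposition, the temperature of 298.15 K is chosen for illustration, and docking scores are empirical estimates rather than measured free energies. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_ratio(dg_a, dg_b, temperature=298.15):
    """Ratio Kd(a)/Kd(b) implied by two binding free energies in kcal/mol."""
    return math.exp((dg_a - dg_b) / (R_KCAL * temperature))


def rmsd(coords_a, coords_b):
    """RMSD between matched atoms of two poses, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must contain the same non-zero number of atoms")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Predicted scores from the text: -6.1 kcal/mol (3HPA) and -6.7 kcal/mol (4HPA).
print(round(kd_ratio(-6.1, -6.7), 2))  # 2.75

# Toy coordinates (not from the study): every atom displaced by 1 Å along z.
pose_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(pose_a, pose_b))            # 1.0
```

Read this way, the score difference corresponds to less than a threefold difference in predicted dissociation constant, which is consistent with the text treating the docking result as a hint about binding orientation rather than as a full explanation of the growth phenotype.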

Discussion
In this study, we experimentally demonstrated that a single amino acid substitution in HpaB resulted in the inability to utilize 3HPA, but did not affect 4HPA utilization. Previously, we found that E. coli B REL606 could not grow with 3HPA as the sole carbon source [18], whereas its closely related B strain, BL21(DE3), could grow on 3HPA as well as on 4HPA [17]. Alignment of protein sequences encoded in the complete hpa gene clusters of the two E. coli B strains revealed that only one amino acid, at position 379 of HpaB, differed between them: arginine in BL21(DE3) and cysteine in REL606 (Fig. 1a). Constitutive expression of HpaB containing Arg379 enabled REL606 to grow on 3HPA (Fig. 2). In contrast, expression of HpaB containing either Gly379 or Ser379 did not restore growth of REL606 on 3HPA and did not affect its growth on 4HPA (Fig. 3). Structural modeling of E. coli HpaB showed that the 379th position is located not in the active site, but in the vicinity of the 4HPA binding sites (Fig. 4), suggesting an important role of Arg379 in mediating the entrance and stable binding of HPA by HpaB. Taken together, these results provide conclusive evidence that the amino acid at position 379 in HpaB of E. coli determines the substrate specificity for the 3HPA and 4HPA isomers. It is worth noting that this study showcases how genomic and phenomic comparison between closely related strains [17,18] can lead to unexpected biological discovery at the molecular level.
HpaB of E. coli is an aromatic hydroxylase with a broad substrate range and can hydroxylate 3HPA, 4HPA, chloro- and methyl-aromatics (e.g., 3-chloro-4HPA, 4-chloro-PA, 4-chlorophenol, 3-chlorophenol, and p-cresol), and dihydroxylated aromatic compounds [e.g., HPC, 2,5-dihydroxyphenylacetic acid, catechol, resorcinol, hydroquinone, and 3,4-dihydroxyphenylalanine (L-DOPA)] [2,13,14]. It is remarkable that a single amino acid substitution affected this promiscuous substrate range and led to the inability to degrade 3HPA and L-tyrosine. Although it is not uncommon for enzyme function and activity to be altered by a single amino acid substitution, to the best of our knowledge, complete loss of degradation activity toward native substrates has rarely been reported, particularly in a way that discriminates between structural isomers.
Fig. 4 [...] [15,24]. 4HPA (colored yellow) binding sites were identified using the crystal structure of the HpaB-FAD-4HPA complex from T. thermophilus HB8 (PDB ID: 2YYJ). The distances from Ser462 in the C-terminal helical arm to Arg379 and Cys379 were predicted to be 2.70 Å and 4.75 Å, respectively.
Due to its high catalytic efficiency and versatility, HpaB has great potential in biotechnological and pharmaceutical applications [11]. It has been used to produce trihydroxyphenolic acids, which are potential antioxidants [25]; hydroxylated phenylpropanoids, which have attractive pharmacological properties [14]; and L-DOPA, which is used for the treatment of Parkinson's disease [26,27]. Information on substrate specificity that is determined by a single amino acid substitution will contribute to a better understanding of substrate binding and may provide the opportunity to develop biocatalytic hydroxylation processes that require highly developed substrate specificity.

Construction of HpaB-expression vectors
Plasmids and primers used in this study are listed in Table 1 and Table S1 of Additional file 1, respectively. hpaB was PCR-amplified from the genomic DNA of E. coli BL21(DE3) or REL606 using the hpaB-F/hpaB-R primers. Plasmid pHCE-IIB was used for the constitutive expression of hpaB. pHCE-IIB has a strong constitutive promoter cloned from the thermostable D-amino acid aminotransferase (D-AAT) gene of Geobacillus toebii [31]. The ribosomal binding site of the D-AAT promoter was modified to match perfectly with the 3′ end of the E. coli 16S rRNA [31]. The purified PCR product and the pHCE-IIB vector were each digested with both BamHI and XbaI, and the digested PCR product was then ligated into the digested vector using T4 ligase. The constructed vectors were electroporated into E. coli REL606.
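Directional cloning with BamHI and XbaI works cleanly only if each enzyme cuts the amplicon once, at the primer-introduced site. The Python sketch below illustrates one way to check this in silico. It uses the known recognition sequences of the two enzymes (GGATCC and TCTAGA); the example sequences are arbitrary toy strings rather than the real hpaB amplicon, and the assumption that the sites sit at the amplicon ends is made only for illustration, since the primer design is given in Table S1 and not reproduced here.

```python
# Recognition sequences; both are palindromic, so one strand is enough to scan.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}


def find_sites(seq, site):
    """Return 0-based start indices of every (possibly overlapping) occurrence."""
    seq, site = seq.upper(), site.upper()
    hits, start = [], seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


def site_map(amplicon):
    """Map each enzyme name to the list of its site positions in the amplicon."""
    return {name: find_sites(amplicon, site) for name, site in SITES.items()}


def has_single_sites(amplicon):
    """True if each enzyme has exactly one recognition site in the amplicon."""
    return all(len(positions) == 1 for positions in site_map(amplicon).values())


# Toy amplicons (hypothetical): flanking sites plus a short arbitrary insert.
good = "GGATCC" + "ATGGCTAAACGTTTGCACTAA" + "TCTAGA"
bad = "GGATCC" + "ATGGGATCCTAA" + "TCTAGA"  # carries an internal BamHI site

print(site_map(good))          # {'BamHI': [0], 'XbaI': [27]}
print(has_single_sites(good))  # True
print(site_map(bad))           # {'BamHI': [0, 9], 'XbaI': [18]}
print(has_single_sites(bad))   # False
```

An insert with an internal site for either enzyme would be fragmented during the double digest, so a check of this kind is typically run on the intended amplicon before primers are ordered.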

Structural modeling and molecular docking
To locate position 379 in the HpaB structure, the three-dimensional structure of E. coli HpaB (PDB ID: 6EB0) [15] was visualized using UCSF Chimera [32]. The active site where 4HPA binds was identified using the crystal structure of the HpaB-FAD-4HPA complex (PDB ID: 2YYJ) [24]. The homology model of HpaB from E. coli REL606 was generated from the crystal structure of E. coli HpaB [15] using Modeller v9.21 [33]. The model with the lowest DOPE score was selected. Molecular docking of 3HPA and 4HPA into the HpaB structure was performed with AutoDock Vina [34] using the active site of the HpaB-FAD-4HPA complex from T. thermophilus HB8 [24]. The docking parameters exhaustiveness and number of modes were set to 1000 and 100, respectively.
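The docking settings described above could be captured in an AutoDock Vina configuration file along the following lines. This Python sketch only renders the text of such a file. The option names (receptor, ligand, center_x and so on, exhaustiveness, num_modes, out) are standard Vina options, and "number of modes" is assumed here to correspond to num_modes. The file names and the search-box center and size are hypothetical placeholders, because the text does not report them.

```python
def render_vina_config(receptor, ligand, center, size, out,
                       exhaustiveness=1000, num_modes=100):
    """Return the text of an AutoDock Vina config file as 'key = value' lines."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",  # value reported in the text
        f"num_modes = {num_modes}",            # value reported in the text
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical inputs: file names and box geometry are placeholders only.
config_text = render_vina_config(
    receptor="hpab_2yyj.pdbqt",
    ligand="3hpa.pdbqt",
    center=(0.0, 0.0, 0.0),
    size=(20.0, 20.0, 20.0),
    out="3hpa_docked.pdbqt",
)
print(config_text)
```

Saved to a file, the text could be passed to Vina with its `--config` option. In a real run, the box would be centered on the 4HPA position in the 2YYJ active site, and the same box would be reused for both ligands so that the 3HPA and 4HPA scores are comparable.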
Additional file 1: Table S1. Primers used for PCR amplification and site-directed mutagenesis of hpaB. Figure S1. Chemical structures of 3- and 4-hydroxyphenylacetate (HPA). Figure S2. Comparison of gene clusters for HPA catabolism in laboratory strains of E. coli BL21(DE3), REL606, and W. Figure S3. Confirmation of the expression of HpaB variant proteins cloned in pHCE-IIB. Figure S4. Growth curves of REL606 expressing hpaB variant proteins in the defined medium supplemented with L-tyrosine. Figure S5. Molecular docking of HPAs into the HpaB component from the crystal structure of the HpaB-FAD-4HPA complex from T. thermophilus HB8.