Identification of regulatory network topological units coordinating the genome-wide transcriptional response to glucose in Escherichia coli
© Gutierrez-Ríos et al; licensee BioMed Central Ltd. 2007
Received: 15 January 2007
Accepted: 08 June 2007
Published: 08 June 2007
Glucose is the preferred carbon and energy source for Escherichia coli. A complex regulatory network coordinates gene expression, transport and enzyme activities in response to the presence of this sugar. To determine the extent of the cellular response to glucose, we applied an approach combining global transcriptome and regulatory network analyses.
Transcriptome data from isogenic wild type and crp- strains grown in Luria-Bertani medium (LB) or LB + 4 g/L glucose (LB+G) were analyzed to identify differentially transcribed genes. We detected 180 and 200 genes displaying increased and reduced relative transcript levels in the presence of glucose, respectively. The observed expression pattern in LB was consistent with a gluconeogenic metabolic state including active transport and interconversion of small molecules and macromolecules, induction of protease-encoding genes and a partial heat shock response. In LB+G, catabolic repression was detected for transport and metabolic interconversion activities. We also detected an increased capacity for de novo synthesis of nucleotides, amino acids and proteins. Cluster analysis of a subset of genes revealed that CRP mediates catabolite repression for most of the genes displaying reduced transcript levels in LB+G, whereas Fis participates in the upregulation of genes under this condition. An analysis of the regulatory network, in terms of topological functional units, revealed 8 interconnected modules which again exposed the importance of Fis and CRP as directly responsible for the coordinated response of the cell. This effect was also seen with other not extensively connected transcription factors such as FruR and PdhR, which showed a consistent response considering media composition.
This work allowed the identification of eight interconnected regulatory network modules that include CRP, Fis and other transcription factors that respond directly or indirectly to the presence of glucose. In most cases, each of these modules includes genes encoding physiologically related functions, thus indicating a connection between regulatory network topology and related cellular functions involved in nutrient sensing and metabolism.
In their natural environments, bacteria must adapt to changing physicochemical conditions. Adaptation responses are controlled by a complex network of sensory and regulatory proteins that modulate cellular functions at the transcriptional and posttranscriptional levels. Nutrient availability, ranging from sufficiency to total deprivation, is one of the environmental variables the cell is constantly sensing. Among nutrients, carbohydrates are particularly important to the cell since they are utilized as both carbon and energy sources. Glucose is the most abundant aldose in nature, being present mostly in polymeric states as starch and cellulose. This sugar is the preferred carbon and energy source for the gram-negative bacterium Escherichia coli (E. coli). Specialized protein systems are present in E. coli to sense, select and transport glucose. This sugar is internalized and phosphorylated by the phosphoenolpyruvate:sugar phosphotransferase system (PTS). This system catalyzes group translocation, a process that couples transport of sugars to their phosphorylation. The PTS is widespread in bacteria but absent in Archaea and eukaryotic organisms [3, 4]. It is composed of soluble non sugar-specific protein components, Enzyme I (EI) and the phosphohistidine carrier protein (HPr), which relay a phosphoryl group from the glycolytic intermediate phosphoenolpyruvate (PEP) to any of the different sugar-specific enzyme II complexes. Glucose is imported by the IIGlc complex, composed of the soluble IIAGlc enzyme and the integral membrane permease IICBGlc.
The preferred nutritional status of glucose for E. coli is evidenced by the observed repression and inhibition exerted by this sugar on gene expression and the activities of enzymes and transporters related to the consumption of other carbon sources. This example of global regulation is called carbon catabolite repression (CCR) . As a sensor of the presence of glucose in the external medium, the PTS plays a central role in CCR. When glucose is present in the medium and it is being transported by the PTS, the IIAGlc protein is non-phosphorylated, and in this state, it binds to various non-PTS permeases inhibiting uptake of other carbon sources. This form of IIAGlc also binds to the enzyme glycerol kinase (GK), inhibiting its activity. When glucose is absent from the culture medium, IIAGlc is mainly in its phosphorylated state. In this condition, IIAGlc~P binds to the enzyme adenylate cyclase (AC), activating its cyclic AMP (cAMP) biosynthetic capacity. Therefore, cAMP concentrations increase in the cell. Then cAMP binds to the cAMP receptor protein (CRP) and promotes the induction of catabolite-repressed genes.
The global transcriptional response of E. coli to different nutrient/environmental conditions has been studied using microarray technology. These studies have revealed complex genome-wide expression patterns that reflect the roles of different cellular regulators on cell adaptability and survival. Some of these works have focused on analyzing the effects on global transcription patterns of growing E. coli in minimal or complex media with different glucose concentrations [6–9]. These studies have enabled the identification of genes whose transcript levels change in response to each specific condition. In order to characterize the cellular response to glucose, conditions must be chosen that represent sufficiency and the complete lack of this nutrient. A comparison of genome-wide transcriptome patterns between strains grown under these conditions should be adequate for identifying the group of genes displaying a transcriptional response to glucose, which we term the "glucose stimulon". In this work, we use transcriptome data collected under conditions of glucose absence or excess in a complex medium. Analyses of the data set were used to identify the genes encoding cellular functions that respond to this stimulus and enable the cell to adapt to nutrient availability. Topological analysis of the regulatory network involved in this response revealed a modular organization where global and local transcription factors integrate different signals to detect and respond to the presence of glucose.
Results and Discussion
Global transcriptome response to the presence of glucose in complex medium
Transcriptome data were obtained from previously reported experiments performed with E. coli strain BW25113 and an isogenic crp mutant (LJ3017). These strains were grown in LB medium with (LB+G) or without (LB) 0.4% glucose. Total RNA was extracted from each condition, processed and hybridized to the Affymetrix E. coli array, which includes 4327 genes. Three data sets were obtained for each of three experimental conditions: wild type grown in LB medium (WT), wild type grown in LB medium + glucose (WTg) and a crp- mutant grown in LB medium (CRP). Starting with these data, differentially transcribed genes were selected using an outlier iteration method [12–14]. Analysis of the WTg/WT log ratios allowed the identification of genes having a significant change in transcript level (Table 1). Of these, 180 genes showed increased and 200 showed reduced relative transcript levels. Of these 380 genes, 87 belong to the hypothetical, unknown class.
The presence of glucose had a significant effect on transcript levels of genes encoding enzymes of central metabolism. Upregulation with glucose was detected for the genes encoding the E1 and the lipoate acetyltransferase/dihydrolipoamide acetyltransferase subunits of the pyruvate dehydrogenase multienzyme complex (Pdh) as well as the genes encoding phosphotransacetylase and acetate kinase, that constitute an acetate synthesis pathway. On the other hand, downregulation was observed for genes encoding nearly all enzymes involved in gluconeogenesis, the TCA cycle and the glyoxylate bypass. The observed responses are consistent with the expected glycolytic metabolism induced by exogenous glucose.
Functions related to nucleotide biosynthesis and salvage pathways of purines and pyrimidines were found to change in response to glucose. Growth in LB+G medium reduced transcript levels of genes encoding proteins involved in (deoxy)ribose phosphate degradation, as well as the salvage pathways for both adenine, hypoxanthine, and their nucleosides and pyrimidine ribonucleotides and pyrimidine deoxyribonucleotides. By contrast, transcript levels for genes encoding enzymes that participate in the de novo biosynthesis of purine and pyrimidine ribonucleotides were increased. These results suggest that the cell exists in a metabolic state where it is importing and interconverting ribo and deoxyribonucleotides present in the LB medium, but addition of glucose induces another state where de novo synthesis capacity is increased.
For genes encoding enzymes of amino acid metabolism, different effects of glucose were observed. Downregulation in LB+G medium relative to LB medium was detected for genes involved in biosynthetic pathways for aromatic amino acids, aspartate, cysteine, isoleucine-valine, phenylalanine and threonine. Interestingly, downregulation was also observed for genes encoding activities involved in the degradation of aspartate, cysteine, glycine and threonine. In addition, as mentioned above, a decrease in transcript level was detected for genes encoding importers for alanine, glutamine, glycine, histidine, proline and serine. These results indicate a reduction in import and degradation capacity for several amino acids when growing in LB+G medium. This can be explained considering that in this condition amino acid utilization as carbon sources should be significantly reduced. The apparent reduction in the demand for external amino acids to be used as alternative carbon sources or building blocks could also be a consequence of increased capacity for the de novo synthesis of amino acids once glucose is available. However, as noted above, induction of genes of amino acid synthesis pathways was never observed, and, in fact, repression was observed for several pathways. Therefore, the effects of glucose on degradative and biosynthetic capacities do not seem to be global but amino acid-specific.
A general trend of upregulation in LB medium was detected for genes encoding proteases, indicating higher proteolytic activity under this condition when compared to growth in LB+G medium. This makes sense since peptide degradation and protein turnover can provide carbon and energy for biosynthetic purposes in the absence of glucose. A similar pattern was observed for heat shock proteins and chaperones. These results suggest a higher protein turnover rate in the absence of glucose. The possible presence of partially degraded or misfolded proteins when the cells are growing in LB medium could cause the induction of heat shock proteins and chaperones, as has been previously reported [18, 19]. It should be pointed out that several of the induced proteases are involved in regulatory processes (Fig. 1). The overall regulatory effects of such a response remain to be determined. A decrease in transcript level for heat shock proteins and chaperones upon growth in LB+G medium indicates that functions related to protein turnover are reduced by the presence of glucose, suggesting a lower capacity or need to use amino acids derived from proteins as sources of carbon or protein constituents, or that proteins are more stable in an energized cell.
Medium composition had an important effect on genes encoding proteins involved in translation. Increased transcript levels were observed for genes encoding 20 of the 30 ribosomal protein components of the 50S subunit and 16 of the 22 ribosomal proteins of the 30S subunit. Also increased were transcript levels for 46 tRNA genes, grouped in 14 transcriptional units (TUs). Two of these TUs include genes rrnA and rrnD, encoding two of the seven 16S ribosomal RNAs. These genes are known to be subject to growth rate-dependent regulation . In cultures used to obtain the RNA to generate the transcriptome data, a 5% higher growth rate was observed when comparing LB+G to LB conditions [10, 21]. Therefore, induction of genes encoding ribosomal proteins, tRNAs and rRNAs is an expected response to the higher growth rate in LB+G medium.
Cell division and replication functions were found to respond to medium composition. Glucose lowered transcript levels for genes encoding DNA replication inhibitor protein CspD and the cell division inhibitor and membrane ATPase MinCD of the MinC-MinD-MinE and DicB-MinC systems. The cspD gene is known to be induced upon glucose starvation [22, 23]. An increase was observed in transcript level for the gene encoding PriB protein that is a component of the multiprotein complex called primosome. This complex is believed to be involved in the restart of stalled DNA replication forks. The concerted down regulation of inhibiting and up regulation of activating chromosomal replication and cell division functions is consistent with a cellular response to favorable growth conditions afforded by the presence of glucose.
We found several transcription-related functions to be induced by glucose. These included the α and β subunits of the RNA polymerase core enzyme, as well as the elongation and antitermination factors GreA, NusA and NusG. Under the same growth condition, repression was observed for genes encoding the transcriptional termination factors NusB and Mfd. Thus, the observed responses for these functions are consistent with an expected increase in the transcriptional rate and efficiency caused by the increased biosynthetic demand of the higher growth rate in the presence of glucose. However, we also detected a reduction in transcript levels for genes encoding sigma 70, sigma E and sigma 38. It remains to be determined what net consequences the observed expression changes would have on transcriptional capacity and RNA polymerase promoter selectivity.
Increased transcript levels were detected for the gene encoding agmatine ureohydrolase (speB), an enzyme involved in the putrescine biosynthetic pathway. The gene encoding the integral membrane component of the flagellar export apparatus FliO (fliO) displayed a decrease in transcript levels. Putrescine synthesis in E. coli can proceed from the decarboxylation of arginine to agmatine and its subsequent hydrolysis to putrescine, reactions catalyzed by the products of genes speA and speB, respectively. The higher transcript levels when growing in LB+G medium for potABD and speB, encoding components of the spermidine/putrescine ATP-dependent importer and an enzyme of the putrescine biosynthetic pathway, respectively, are indicative of an increased demand for polyamines when conditions favor a higher growth rate for E. coli. Growth in medium containing glucose is known to repress flagellum synthesis. Gene fliO is a member of the flagellar class II operon fliLMNOPQR, encoding proteins of the export apparatus and the motor/switch complex for flagellar function. This operon can be transcribed by either sigma 70 or the flagellum-specific sigma 28.
This analysis enabled us to demonstrate that glucose causes a change in transcript levels of 380 genes, grouped in 142 TUs, corresponding to 9% of the E. coli genome. If it is assumed that complete operons are induced when at least one gene member is detected in the microarray, then, this number would increase to 492 genes, corresponding to 11% of the E. coli genome. The comparison of the observed transcriptome pattern under the two nutritional conditions studied revealed global responses that involve functions not limited to nutrition/metabolism. Although E. coli displays high and similar growth rates in LB and LB+G media, this analysis reveals different transcriptome patterns that are consistent with distinct physiological states under these two conditions.
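The operon-level extrapolation described above can be sketched as follows. This is only an illustration under stated assumptions: the operon map and gene names are hypothetical placeholders, and the denominator of 4327 (the number of genes reported for the array) is our assumption, since the text does not state which genome size was used for the percentages.

```python
def expand_to_operons(detected_genes, operons):
    """Add every member of an operon when at least one member was detected.

    detected_genes: iterable of gene names with a significant change.
    operons: dict mapping a transcription unit name to its member genes
             (hypothetical structure, for illustration only).
    """
    detected = set(detected_genes)
    expanded = set(detected)
    for members in operons.values():
        if detected.intersection(members):
            expanded.update(members)
    return expanded


def percent_of(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Hypothetical toy data: one detected gene in a three-gene operon,
# plus one detected gene that is not part of any listed operon.
toy_operons = {"tuX": ["x1", "x2", "x3"], "tuY": ["y1", "y2"]}
toy_detected = ["x2", "z1"]
toy_expanded = expand_to_operons(toy_detected, toy_operons)

# ASSUMPTION: 4327 genes (array content) used as the denominator.
ASSUMED_TOTAL_GENES = 4327
pct_detected = percent_of(380, ASSUMED_TOTAL_GENES)
pct_expanded = percent_of(492, ASSUMED_TOTAL_GENES)
```

Under that assumed denominator, the two gene counts reported in the text round to the stated 9% and 11%.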
Transcriptional regulatory elements and mechanisms involved in glucose responses in E. coli
In recent years, many groups have concentrated on the study of the transcriptional responses of genes that make up the regulatory network (RN) of model organisms such as S. cerevisiae and E. coli [27, 28]. Some of these studies have analyzed the connectivities between the genes and TFs to understand topological properties of the RN [28, 29] and infer modules that reflect a correlation between physiological and genetic responses. External stimuli provoke changes in the RN that help the cell to contend with a changing environment. The development of microarray technologies gives us the opportunity to study globally the expression of genes in response to a given stimulus and to try to detect the part of the RN (subnetwork) responsible for the adaptive response.
The second part of this study consisted of the identification of the transcriptional RN involved in the observed glucose responses. This analysis represents an approach to understanding the behavior of the RN at a systems level. The complete RN in the current version of the RegulonDB database represents 693 interactions involving 402 genes and 89 TFs. Of the 380 regulated genes identified in the WTg/WT experiment, 142 possess a known regulatory interaction in RegulonDB. For these genes, we extracted from RegulonDB the known information about TFs involved in their regulation. With this information, the RN was defined. We organized the regulatory interactions (RI) in strict simple and complex regulons (as previously described). This data organization enabled us to analyze the interplay of the TFs involved in the regulatory changes of expression shown in the microarray data. We observed that 114 of these genes are regulated or coregulated by a global TF (CRP, FNR, IHF, Fis, ArcA, NarL or Lrp), and only 28 of them do not interact with a global regulator (zntA, mtgA, mgrB, metK, sufB, lon, cysK, uspA, fliO, fruB, pps, pckA, entC, nrdF, nrdH, nrdI, gatY, gatZ, gatA, ilvC, rpoD, rpsU, ahpC, hisJ, sufB, glnB, speB, proX). The TFs involved in the regulation of these 28 genes are GadX, CysB, FadR, FhlD, FruR, Fur, GatR, LexA, OxyR, IlvY, MetJ, PhoB, PurR and NtrC.
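The organization of regulatory interactions into regulons, and the split between genes with and without a global regulator, can be sketched as below. This is a minimal illustration, not the procedure actually used: we assume that a "simple" regulon groups genes controlled by exactly one TF and a "complex" regulon groups genes sharing the same set of several TFs. The function names are hypothetical, and the toy interactions are taken from statements elsewhere in the text rather than from a RegulonDB extract.

```python
# Global TFs as listed in the text.
GLOBAL_TFS = {"CRP", "FNR", "IHF", "Fis", "ArcA", "NarL", "Lrp"}


def group_by_regulator_set(gene_to_tfs):
    """Group genes that share exactly the same set of regulators."""
    groups = {}
    for gene, tfs in gene_to_tfs.items():
        groups.setdefault(frozenset(tfs), []).append(gene)
    return groups


def regulon_kind(tf_set):
    """ASSUMED reading of the terms: one TF -> simple, several -> complex."""
    return "simple" if len(tf_set) == 1 else "complex"


def split_by_global_regulation(gene_to_tfs, global_tfs=GLOBAL_TFS):
    """Return (genes with a global TF, genes without any), both sorted."""
    with_global = sorted(g for g, tfs in gene_to_tfs.items()
                         if global_tfs & set(tfs))
    without_global = sorted(g for g, tfs in gene_to_tfs.items()
                            if not (global_tfs & set(tfs)))
    return with_global, without_global


# Toy subset of interactions mentioned in the text (illustrative only).
toy_network = {
    "cstA": {"CRP"},
    "putP": {"CRP"},
    "nusA": {"ArgR", "CRP", "Fis"},
    "infB": {"ArgR", "CRP", "Fis"},
    "aceE": {"CRP", "FNR", "PdhR"},
    "fruB": {"FruR"},
}

regulons = group_by_regulator_set(toy_network)
with_global, without_global = split_by_global_regulation(toy_network)
```

Keying the groups on a `frozenset` of regulators makes the grouping independent of the order in which TFs are listed for each gene.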
Our data revealed a very small number of genes encoding TFs (hupB, crp, fis, marA, cytR, yagA and hcaR) that responded to the conditions studied (presence of glucose or loss of crp function). Although this will be explained in detail below, it should be pointed out that several of these TFs are involved in the regulation of a large number of the genes displaying a significant response to glucose.
As previously reported, we found that glucose responses are highly dependent on the TF CRP, a global dual regulator that governs the expression of at least 140 genes and coregulates gene expression with 75 other TFs. In E. coli, CCR is mainly mediated by the PTS. When glucose is present in the culture medium, protein IIAGlc lacks the capacity to activate adenylate cyclase; therefore, cAMP is present at relatively low levels. Lacking cAMP, the CRP protein cannot bind DNA and activate catabolite-repressed genes. Therefore, in the presence of glucose, CRP is unable to exert its usually positive effect on its regulated genes. The microarray and RegulonDB data revealed that of the 142 genes with known regulatory interactions, 50 are CRP regulated. Seven of these genes (crp, cstA, ivbL, ilvB, putP, spf and trxA) are regulated only by CRP. The other 42 genes are coregulated by CRP and one or more of 26 other TFs. Of the 50 CRP-affected genes, RegulonDB data indicate that 34 are activated by CRP and other TFs, 7 are exclusively activated by CRP, 6 are dual regulated and 3 genes present two CRP sites with opposite functions (Table 1). Except for the gene putP, the seven genes that are solely regulated by a negative CRP binding site are induced in our experiment, as expected. In the cases of truB, infB, nusA and rpsO, the effect of Fis seems to enhance the expression of these genes, suggesting that the repression of putP could occur because of the presence of another TF, alternative regulatory mechanisms or additional CRP binding sites acting as positive regulators.
Transcriptome data showed that some of the genes positively regulated by CRP were down-regulated, in spite of the presence of other positive TFs like MalT, TorR and FNR. This effect had been previously described for the melAB and malM promoters [33, 34], where CRP acts as a coactivator with a second TF. In our data, we found this response for the malE and malM genes, in which CRP triggers the repositioning of MalT to an appropriate activating position, causing the genes to be expressed . The rest of the CRP regulated genes that do not appear repressed by glucose, are exclusively negatively regulated by CRP (trxA), or have one or more regulators that may counteract the effect of CRP (Table 1).
We found an important number of genes to be under the influence of Fis. RegulonDB reports 94 genes regulated by Fis. Our RN data showed 52 genes affected in the presence of glucose by Fis, grouped in 21 transcription units, out of which 48% belong to the Fis simple regulon, sharing some interesting characteristics: a) all are positively regulated by Fis, b) all are tRNA genes and c) when a binding site was reported, the central position varies between -66 and -75. Other members of the group, like tyrT, alaT and tyrV, share the same characteristics except that they have three or two Fis binding sites. In the case of the genes alaU, ileU and thrV, a site for the nucleoid-structuring protein (HNS) has been characterized. It has been reported that the Fis site located near the promoter (between -71 and -78) is essential for promoter activation. We observed another group of Fis-regulated genes that share their regulatory region with accessory TFs and additional Fis sites. The group of genes including truB, b3170, nusA, infB and rpsO belongs to the complex regulon ArgR(-), CRP(-) and Fis(+). According to our data, this group appeared coordinately induced. We assume that this induction is caused by Fis activation together with the absence of a repressing effect of CRP (inactive in the presence of glucose) or ArgR.
The nuo genes, encoding the proton-translocating NADH:quinone oxidoreductase, appeared coordinately expressed, and all of the nuo genes are organized in a 13-gene operon (one of the longest transcription units in the genome). It has been reported that expression of the nuo operon is subject to regulation by ArcA, which mediates anaerobic repression, and NarL, which mediates anaerobic activation in the presence of nitrate. FNR and IHF act as weak repressors under anaerobic conditions, and Fis has been reported to stimulate expression of the operon in early exponential phase and to a lesser extent in the late exponential and stationary growth phases. No significant difference in dissolved oxygen tension is expected when comparing cultures in LB or LB+G. Therefore, it can be speculated that transcriptional downregulation of the nuo operon is caused by medium composition or cell growth rate by an unknown mechanism. We detected an increase in the transcript level of marA, a gene that codes for the MarA TF, which is known to regulate its own expression. Previous reports demonstrated that Fis stimulates expression of marA when MarA acts as an activator.
CRP has been described as the master regulator largely responsible for the expression pattern when E. coli is grown in glucose as the carbon source. However, very little is known about the influence of Fis on the gene expression pattern under the same conditions. We found a previous report showing that Fis is the factor mostly responsible for catabolite repression at the nrf promoter . Experiments from other groups revealed that Fis assists both Mlc repression and CRP-cAMP activation of ptsG through the formation of Fis-CRP-Mlc or Fis-CRP nucleoprotein complexes at the ptsG promoter depending on the glucose availability in the growth medium [39, 40]. Considering the large fraction of genes regulated by Fis identified in our study, it is clear that this TF has an important role in the cellular response to glucose.
Cluster analysis of transcriptome data for selected genes of wild type and crp- strains
The TU including genes aceE and aceF is positively regulated by CRP, dually regulated by FNR and negatively regulated by PdhR. Considering that upregulation was also observed in the crp mutant, it can be inferred that CRP is not participating in this response. No changes in dissolved oxygen tension are expected when comparing cultures in LB or LB+G; therefore regulation by FNR can be ruled out. On the other hand, in LB+G, glucose catabolism should cause an increase in pyruvate concentration when compared to growth in LB medium. If this is the case, pyruvate can bind to and inactivate the repressor PdhR, thus causing the observed induction.
Another remarkable observation resulted from examination of the genes that appeared repressed but for which a binding site for CRP, or for other TFs regulated by CRP, has not been identified (considering the information available in RegulonDB or EcoCyc). This was the case for the pckA, lon, gatA, gatZ, gatY, gcvH, gcvT, osmE, dppA, pspE, ilvC, rpoD, lysU, and tdh genes. Some of them, as mentioned before, encode carrier proteins related to the import of alternative carbon and nitrogen sources (gatA, gatZ, gatY and dppA). The genes aceA and pckA deserve special attention because their regulator, the fructose repressor (FruR), is known to be partially inactivated in the presence of glucose. Fructose-1-phosphate and fructose-1,6-bisphosphate (direct products of glycolysis) bind to FruR and inactivate its DNA-binding capacity [41, 43]. As FruR positively regulates the expression of these two genes, the inactivation of the regulator causes them to be downregulated, a result that can be observed in our data. In addition, we found the gene fruB to be upregulated by the presence of glucose. This gene is repressed by FruR. In this case, we again find evidence of FruR inactivation by glycolytic intermediates. These are significant results, as they allowed us to infer that a higher internal level of the glycolytic intermediate fructose-1,6-bisphosphate is present in the cells growing in the LB+G medium, when compared to the LB-grown cells.
The genes osmE and ompF displayed a significant change in their levels of expression, being induced in the crp- mutant and repressed in the presence of glucose. It has not been reported that CRP directly regulates these well characterized genes. Instead, CRP directly controls the expression of the ompR gene, whose product controls the expression of ompF. Our result is consistent with a report showing an increment in the expression level of ompF under glucose limitation. The effect is caused by the absence of cAMP, which increases the levels of phosphorylated OmpR, which in turn represses expression of ompF.
We have presented some of the relevant observations that can be extracted from Table 1 and the cluster analysis comparing the wild type and the crp- mutant. This analysis has shown that, as has been pointed out before, catabolite repression is mainly controlled by CRP, but that a small set of genes respond as a consequence of the intervention of TFs that have no described relationship with CRP. On the other hand, the prevalent role of Fis in the activation of genes under the LB+G conditions becomes evident in this analysis. It is known that fis gene transcription levels respond to growth rate, as can be expected since cells in LB+G medium grow 5% faster than cells in LB. Interestingly, it was also found that in the crp- mutant, a strain that grows 5% slower than the wild type strain in the same LB medium, fis transcript levels are increased 3 fold (Table 1). Thus, these results show that CRP is playing an important role in fis regulation, resulting in its derepression when glucose is present.
Topological analysis of the regulatory network involved in the glucose response
The experimental results revealed that transcription factors CRP and Fis are major regulators causing an extended response to glucose. However, it is clear that other TFs are also involved in controlling the genes found to respond to glucose. To help in identifying the relative roles of these TFs, an analysis of the properties of the regulatory network and its subnetworks (modules) is required. Resendis et al. demonstrated that analysis of the regulatory network in terms of its topology reveals the relationship between modules and physiological functional classes. Starting from the identified RN, we then performed a topological analysis to identify modules involved in the observed transcriptional responses.
We found four cases (submodules 8.2, 8.6, 8.8 and 8.19) in which one gene presents the opposite expression pattern compared to the other members of the group. In all cases, the gene with the opposite expression pattern lacks one of the TF binding sites present in the other genes. An example is the subbranch containing the genes aceE, aceF and aspA, in which the first two genes are coregulated positively by CRP, positively or negatively by FNR, and negatively by PdhR. If we consider only the information found in RegulonDB and EcoCyc, the increased levels of expression of aceE and aceF should be a consequence of the inactivation of CRP and PdhR. Considering the low levels of cAMP, and the increase of pyruvate as a product of glycolysis, we can assume that FNR might activate or not repress the aceE and aceF genes. The aspA gene, which is positively regulated by CRP and FNR, appears downregulated in the presence of glucose. This result is consistent with the finding that aspA is under catabolite repression control [49, 50].
The analysis of transcriptome data collected under conditions of glucose deficiency and sufficiency in a complex medium enabled us to identify functions involved in the adaptation of E. coli to these two different growth conditions. The known repressive effects of glucose on gluconeogenesis and on alternative carbon source import and metabolism were clearly demonstrated. Furthermore, when glucose was present in the medium, an increase in overall protein synthesis capacity was observed. Also, responsive to the presence of glucose were genes encoding different cellular functions including cell division, replication, transcription, and the biosynthesis of cofactors, nucleic acids, amino acids and lipids. This analysis also revealed that functions related to proteolysis and protein folding are apparently more important when E. coli is growing in LB medium as compared with LB+G medium.
The topological analysis of the RN involved in the regulation of a subset of glucose-responsive genes revealed eight modules including 37 TFs. Most of the RN topological modules include genes encoding functions with similar physiological roles, and together they represent a significant part of the glucose stimulon. The modules we identified partially correspond to the regulatory subnetworks originating at sensor TFs (origons) that have been identified in the complete E. coli RN. The difference can be explained by the fact that we limited our analyses to specific growth conditions and a subset of the RN. This is presumably still a partial representation of the RN involved in this response, since the functions of a significant number of TFs in E. coli are still unknown [30, 51]. In spite of this shortcoming, our results and those previously reported by other groups indicate that CRP and Fis play a dominant role in the transcriptional responses detected in this study. This analysis places CRP and Fis as central TFs in the subset of the E. coli RN that senses and responds to glucose and other sugars. These two regulatory proteins integrate different types of signals that reflect the nutritional composition of the medium and the physiological state of the cell, causing a corresponding genome-wide transcriptional response.
Current limits in sensitivity and specificity for transcriptome analysis methodologies, together with our incomplete knowledge of the properties and interactions of TFs, still do not allow a thorough understanding of the cellular response to specific stimuli. However, integrative analysis of transcriptome and RN data, as performed here, should provide a framework for the future generation of models representing the cell's capacity to respond to a changing environment.
Source of experimental data
Transcriptome data were obtained from previously reported experiments performed with E. coli strain BW25113 and an isogenic crp mutant (LJ3017). Briefly, strains were grown at 37°C with agitation in Luria-Bertani (LB) broth containing 50 mM potassium phosphate, pH 7.4, and 0.2 mM L-cysteine with or without 0.4% glucose. Cells were grown in triplicate in 25 ml of medium in shake flasks starting at an OD600 of 0.05 and harvested in the exponential growth phase when cultures reached an OD600 of 0.5. In LB medium, generation times for strains BW25113 and LJ3017 were 37 and 43 min, respectively. In LB+G medium, generation times for strains BW25113 and LJ3017 were 35 and 41 min, respectively [10, 21]. Total RNA was extracted from each sample, processed and hybridized to the Affymetrix E. coli array, which includes 4327 genes and intergenic regions.
Array scanning, data collection and normalization were performed following the procedure described by Caldwell et al. 2001. Three data sets were obtained for each of three experimental conditions: wild type grown in LB medium (WT), wild type grown in LB medium + glucose (WTg) and a crp- mutant grown in LB medium (CRP). The data sets for each strain and condition were compared pair-wise to determine the Pearson correlation coefficient. For each triplicate data set, the two sets with the highest Pearson correlation coefficient were retained for further analysis.
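The replicate-selection step can be illustrated with a minimal sketch. It assumes each replicate is a list of signal values with genes in the same order; the replicate names and numbers are made up for illustration, and `best_replicate_pair` is a hypothetical helper, not part of the original analysis pipeline.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length signal vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_replicate_pair(replicates):
    """Return the names of the two replicates with the highest pairwise r.

    `replicates` maps a replicate name to its list of signal values.
    """
    return max(
        combinations(sorted(replicates), 2),
        key=lambda pair: pearson(replicates[pair[0]], replicates[pair[1]]),
    )


# Hypothetical signal values for three replicates of one condition.
wt_replicates = {
    "WT_1": [120.0, 340.0, 55.0, 980.0, 410.0],
    "WT_2": [118.0, 352.0, 60.0, 1010.0, 395.0],
    "WT_3": [200.0, 150.0, 300.0, 640.0, 90.0],
}
kept = best_replicate_pair(wt_replicates)
print(kept)
```

With three replicates there are only three pairs to compare, so an exhaustive search over pairs is sufficient.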
For each pair of data sets of all experimental conditions, the reliability of the data for each gene was calculated according to the Affymetrix statistical algorithms reference guide (Affymetrix, Inc., 2004). A "Present" absolute call is assigned to a gene when the signal/noise ratio is higher than an internally calculated threshold. When signal value data for a gene displayed a "Present" absolute call in both duplicate experiments, both values were considered to be reliable. The two signal values for that gene were averaged, and the resulting data were used in subsequent analyses. Using this approach, the number of genes considered for further analysis was 1908, 1910 and 3083 for WT, WTg and CRP conditions, respectively. Using the signal averages for each condition, we then calculated the WTg/WT and CRP/WT log ratios.
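A minimal sketch of the filtering and log-ratio step follows. It makes two assumptions that the text does not state: that a base-2 logarithm is used, and that a ratio is computed only for genes that are reliable in both conditions. The gene names, signal values and the "P"/"A" call codes are illustrative placeholders.

```python
from math import log2


def reliable_average(rep_a, rep_b):
    """Average the two replicate signals for genes called Present in both.

    Each replicate maps gene -> (signal, call); call "P" means Present.
    """
    averaged = {}
    for gene, (signal_a, call_a) in rep_a.items():
        if gene not in rep_b:
            continue
        signal_b, call_b = rep_b[gene]
        if call_a == "P" and call_b == "P":
            averaged[gene] = (signal_a + signal_b) / 2.0
    return averaged


def log_ratios(numerator, denominator):
    """log2(numerator / denominator) for genes reliable in both conditions.

    The log base and the both-conditions restriction are assumptions.
    """
    shared = numerator.keys() & denominator.keys()
    return {g: log2(numerator[g] / denominator[g]) for g in shared}


# Hypothetical duplicate data sets for two conditions.
wt_1 = {"geneA": (200.0, "P"), "geneB": (800.0, "P"), "geneC": (30.0, "A")}
wt_2 = {"geneA": (200.0, "P"), "geneB": (800.0, "P"), "geneC": (50.0, "P")}
wtg_1 = {"geneA": (400.0, "P"), "geneB": (100.0, "P"), "geneC": (20.0, "P")}
wtg_2 = {"geneA": (400.0, "P"), "geneB": (100.0, "P"), "geneC": (20.0, "P")}

wt = reliable_average(wt_1, wt_2)
wtg = reliable_average(wtg_1, wtg_2)
wtg_vs_wt = log_ratios(wtg, wt)
print(wtg_vs_wt)
```

Here geneC is dropped from the WT average because one of its two calls is not Present, so no WTg/WT ratio is reported for it.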
Identification of differentially transcribed genes
Differentially transcribed genes were selected using an outlier iteration method [12–14]. The method consists of calculating the average and the standard deviation (SD) of the log ratio for all sets of genes under the four conditions. To identify significant changes in gene expression, we set the significance threshold at two SD. Thus, any gene with a log ratio more than two SD away from the mean is considered an outlier. Outliers were removed from the population and gathered in a differentially expressed subset. For the remaining genes, we recalculated the average and SD of the log ratios and selected outliers as before. The process was repeated until no outliers were detected in each situation. Using this method, the number of genes selected was 380 for WTg/WT and 333 for CRP/WT. For CRP/WT, 196 genes were down regulated and 137 up regulated. Table S1 shows the genes identified in this study, where values for WTg/WT and CRP/WT log ratios are provided. In addition, when known, the regulatory phrase for each gene is indicated and, when a gene is part of an operon, the genes belonging to it are indicated. Information about gene functions and operon organization was obtained from RegulonDB and EcoCyc. It should be pointed out that the terms "induced" and "repressed" are used in this work to indicate increased or decreased transcript levels, respectively. These terms do not imply a particular mechanism of gene regulation.
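The iteration described above can be sketched as follows. This is an illustrative reading of the procedure, not the original implementation: it assumes the sample standard deviation is used and that deviations in both directions count as outliers. The gene names and log ratios are invented.

```python
from statistics import mean, stdev


def iterative_outliers(log_ratio, n_sd=2.0):
    """Repeatedly remove genes lying more than n_sd SDs from the mean.

    Returns (selected, remaining): the accumulated outliers and the
    genes left once a pass finds no further outlier.
    """
    remaining = dict(log_ratio)
    selected = {}
    while len(remaining) > 1:
        mu = mean(remaining.values())
        sd = stdev(remaining.values())  # sample SD (an assumption)
        outliers = {
            gene: value
            for gene, value in remaining.items()
            if abs(value - mu) > n_sd * sd
        }
        if not outliers:
            break
        selected.update(outliers)
        for gene in outliers:
            del remaining[gene]
    return selected, remaining


# Hypothetical log ratios: a tight background plus three shifted genes.
ratios = {f"bg{i}": (0.1 if i % 2 else -0.1) for i in range(20)}
ratios.update({"up": 3.0, "down": -2.5, "mild": 0.5})

selected, remaining = iterative_outliers(ratios)
print(sorted(selected))
```

In this toy data set the gene named "mild" is not an outlier in the first pass, because the two extreme genes inflate the SD. It is caught in the second pass, after they have been removed, which is the reason for iterating.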
Clustering of microarray data
We applied a hierarchical centroid linkage clustering algorithm [41, 55], with uncentered correlation as the similarity measure, to the WTg/WT and CRP/WT log ratios. The clustering results were visualized using the Treeview program.
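As an illustration of the similarity measure and linkage rule named above, the following sketch clusters two-component log-ratio profiles (WTg/WT, CRP/WT). It is a simplified pure-Python reading of centroid linkage, not the clustering software used in the study, and its details (tie handling, centroid definition) are assumptions. Gene names and values are hypothetical.

```python
from math import sqrt


def uncentered_correlation(x, y):
    """Correlation computed without subtracting the means of x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return dot / norm if norm else 0.0


def centroid_linkage(profiles):
    """Agglomerate genes by merging the two most similar centroids.

    Returns the merges in order as (members_a, members_b, similarity).
    """
    clusters = {(gene,): list(vec) for gene, vec in profiles.items()}
    merges = []
    while len(clusters) > 1:
        keys = sorted(clusters)
        best = None
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                sim = uncentered_correlation(clusters[a], clusters[b])
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        sim, a, b = best
        centroid_a, centroid_b = clusters.pop(a), clusters.pop(b)
        size_a, size_b = len(a), len(b)
        # New centroid: size-weighted mean, i.e. the mean of all members.
        clusters[a + b] = [
            (size_a * u + size_b * v) / (size_a + size_b)
            for u, v in zip(centroid_a, centroid_b)
        ]
        merges.append((a, b, sim))
    return merges


# Hypothetical (WTg/WT, CRP/WT) log-ratio profiles.
profiles = {
    "g1": [1.5, 1.2],
    "g2": [1.0, 0.9],
    "g3": [-2.0, -1.8],
    "g4": [-1.5, -0.2],
}
merges = centroid_linkage(profiles)
for members_a, members_b, sim in merges:
    print(members_a, members_b, round(sim, 3))
```

Because the uncentered correlation depends only on the direction of a profile and not on its magnitude, g1 and g2 (both increased under both comparisons) are merged first, and the two genes with reduced levels form the second group.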
Extraction of condition-specific subnetworks
For each microarray condition (WTg/WT or CRP/WT), we reconstructed a condition-specific subnetwork as follows. From the transcriptional regulatory network (RN) of E. coli, we extracted the genes identified for each microarray condition, the TFs regulating their expression, and the transcriptional interactions between TFs and regulated genes. In these subnetworks, nodes represent genes, and edges represent the transcriptional interactions. Known regulatory sites and transcriptional unit organization were obtained from RegulonDB and EcoCyc.
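The extraction can be sketched as a filter over the edge list of the full RN. The representation below (a list of TF-to-target pairs) and all node names are hypothetical. The sketch also assumes that responsive genes with no known regulator are left out of the subnetwork, which the text does not specify.

```python
def condition_subnetwork(interactions, responsive_genes):
    """Keep TF -> gene edges whose target is a responsive gene.

    `interactions` is an iterable of (tf, target) pairs from the full RN.
    Returns (nodes, edges) of the condition-specific subnetwork.
    """
    responsive = set(responsive_genes)
    edges = {(tf, target) for tf, target in interactions if target in responsive}
    nodes = {node for edge in edges for node in edge}
    return nodes, edges


# Hypothetical full network and list of differentially transcribed genes.
full_rn = [
    ("TF_A", "gene1"),
    ("TF_A", "gene2"),
    ("TF_B", "gene2"),
    ("TF_B", "gene3"),
    ("TF_C", "gene4"),
]
responsive = ["gene1", "gene2", "gene5"]

nodes, edges = condition_subnetwork(full_rn, responsive)
print(sorted(nodes))
print(sorted(edges))
```

TF_C and its target drop out because gene4 is not responsive, while gene5 is absent because the example network contains no regulator for it.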
Identification of condition-specific modules
We identified the WTg/WT condition-specific modules by applying the methodology described in Resendis-Antonio et al. to the condition-specific subnetwork. That is, we clustered the genes based on their shortest distance within the network. Afterwards, we annotated each gene with its corresponding microarray expression level.
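The distance used for this clustering can be illustrated with a breadth-first search over the subnetwork. The sketch below only computes the pairwise shortest-path matrix that a hierarchical clustering step would take as input. It assumes edge direction is ignored when measuring distance, which may differ from the cited methodology, and the node names are hypothetical.

```python
from collections import deque


def shortest_path_lengths(edges, source):
    """Breadth-first search distances from `source`, ignoring direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance


def distance_matrix(edges):
    """Shortest-path length between every pair of connected nodes."""
    nodes = sorted({node for edge in edges for node in edge})
    return {node: shortest_path_lengths(edges, node) for node in nodes}


# Hypothetical condition-specific subnetwork.
subnetwork = [
    ("TF_A", "gene1"),
    ("TF_A", "gene2"),
    ("TF_B", "gene2"),
    ("TF_B", "gene3"),
]
dist = distance_matrix(subnetwork)
print(dist["gene1"])
```

Genes that share a regulator sit two steps apart, so they tend to fall into the same module when the matrix is clustered. Nodes in different connected components have no entry in each other's rows.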
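Abbreviations: LB+G, Luria-Bertani medium plus glucose; WTg, wild type grown in LB medium + glucose; WT, wild type grown in LB medium; TF, transcription factor; RN, regulatory network; PTS, phosphoenolpyruvate:carbohydrate phosphotransferase system.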
We thank Heladia Salgado, Nancy Mena and Verónica Jiménez for technical assistance. We also thank the Computational Unit and the 'Macroproyecto de Tecnologías de la Información y la Computación de la Universidad Nacional Autónoma de México' for the use of their computer facilities. This work was supported by grants IN205005-2 and IN203705-3 from PAPIIT-UNAM, and CONACyT. J.A. Freyre-González is supported by Ph.D. fellowship number 176341 from CONACyT-México.
- Preston RD: The physical biology of plant cell walls. 1974, London, Chapman and Hall.
- Saier MH, Ramseier TM, Reizer J: Regulation of carbon utilization. Escherichia coli and Salmonella. Cellular and Molecular Biology. Edited by: Neidhardt FC. 1996, Washington, D.C., American Society for Microbiology, 1325-1343.
- Postma PW, Lengeler JW, Jacobson GR: Phosphoenolpyruvate: Carbohydrate phosphotransferase systems. Escherichia coli and Salmonella. Cellular and Molecular Biology. Edited by: Neidhardt FC. 1996, Washington, D.C., American Society for Microbiology, 1149-1174.
- Saier MH: Vectorial metabolism and the evolution of transport systems. J Bacteriol. 2000, 182: 5029-5035. 10.1128/JB.182.18.5029-5035.2000.
- Tchieu JH, Norris V, Edwards JS, Saier MH: The complete phosphotransferase system in Escherichia coli. J Mol Microbiol Biotechnol. 2001, 3: 329-346.
- Hua Q, Yang C, Oshima T, Mori H, Shimizu K: Analysis of gene expression in Escherichia coli in response to changes of growth-limiting nutrient in chemostat cultures. Appl Environ Microbiol. 2004, 70: 2354-2366. 10.1128/AEM.70.4.2354-2366.2004.
- Liu M, Durfee T, Cabrera JE, Zhao K, Jin DJ, Blattner FR: Global transcriptional programs reveal a carbon source foraging strategy by Escherichia coli. J Biol Chem. 2005, 280: 15921-15927. 10.1074/jbc.M414050200.
- Tao H, Bausch C, Richmond C, Blattner FR, Conway T: Functional genomics: expression analysis of Escherichia coli growing on minimal and rich media. J Bacteriol. 1999, 181: 6425-6440.
- Oh MK, Rohlin L, Kao KC, Liao JC: Global expression profiling of acetate-grown Escherichia coli. J Biol Chem. 2002, 277: 13175-13183. 10.1074/jbc.M110809200.
- Gosset G, Zhang Z, Nayyar S, Cuevas WA, Saier MH: Transcriptome analysis of Crp-dependent catabolite control of gene expression in Escherichia coli. J Bacteriol. 2004, 186: 3516-3524. 10.1128/JB.186.11.3516-3524.2004.
- Selinger DW, Cheung KJ, Mei R, Johansson EM, Richmond CS, Blattner FR, Lockhart DJ, Church GM: RNA expression analysis using a 30 base pair resolution Escherichia coli genome array. Nat Biotechnol. 2000, 18: 1262-1268. 10.1038/82367.
- Britton RA, Eichenberger P, Gonzalez-Pastor JE, Fawcett P, Monson R, Losick R, Grossman AD: Genome-wide analysis of the stationary-phase sigma factor (sigma-H) regulon of Bacillus subtilis. J Bacteriol. 2002, 184: 4881-4890. 10.1128/JB.184.17.4881-4890.2002.
- Loos A, Glanemann C, Willis LB, O'Brien XM, Lessard PA, Gerstmeir R, Guillouet S, Sinskey AJ: Development and validation of corynebacterium DNA microarrays. Appl Environ Microbiol. 2001, 67: 2310-2318. 10.1128/AEM.67.5.2310-2318.2001.
- Zheng D, Constantinidou C, Hobman JL, Minchin SD: Identification of the CRP regulon using in vitro and in vivo transcriptional profiling. Nucleic Acids Res. 2004, 32: 5874-5893. 10.1093/nar/gkh908.
- http://www.ibt.unam.mx/biocomputo/gutierrez_rios.htm. 2000, http://www.ibt.unam.mx/biocomputo/gutierrez_rios.htm
- Plumbridge J: Expression of ptsG, the gene for the major glucose PTS transporter in Escherichia coli, is repressed by Mlc and induced by growth on glucose. Mol Microbiol. 1998, 29: 1053-1063. 10.1046/j.1365-2958.1998.00991.x.
- Cronan JE, LaPorte D: Tricarboxylic acid cycle and glyoxylate bypass. Escherichia coli and Salmonella: cellular and molecular biology. Edited by: Neidhardt FC, Curtiss III R, Ingraham JL, Lin ECC, Low KB, Magasanik B, Reznikoff WS, Riley M, Schaechter M and Umbarger HE. 1996, Washington, D.C., ASM Press, 206-216. 2nd edition.
- Goff SA, Goldberg AL: Production of abnormal proteins in E. coli stimulates transcription of lon and other heat shock genes. Cell. 1985, 41: 587-595. 10.1016/S0092-8674(85)80031-3.
- Parsell DA, Sauer RT: Induction of a heat shock-like response by unfolded protein in Escherichia coli: dependence on protein level not protein degradation. Genes Dev. 1989, 3: 1226-1232. 10.1101/gad.3.8.1226.
- Gourse RL, Gaal T, Bartlett MS, Appleman JA, Ross W: rRNA transcription and growth rate-dependent regulation of ribosome synthesis in Escherichia coli. Annu Rev Microbiol. 1996, 50: 645-677. 10.1146/annurev.micro.50.1.645.
- Zhang Z, Gosset G, Barabote R, Gonzalez CS, Cuevas WA, Saier MH: Functional interactions between the carbon and iron utilization regulators, Crp and Fur, in Escherichia coli. J Bacteriol. 2005, 187: 980-990. 10.1128/JB.187.3.980-990.2005.
- Yamanaka K, Inouye M: Growth-phase-dependent expression of cspD, encoding a member of the CspA family in Escherichia coli. J Bacteriol. 1997, 179: 5126-5130.
- Yamanaka K, Zheng W, Crooke E, Wang YH, Inouye M: CspD, a novel DNA replication inhibitor induced during the stationary phase in Escherichia coli. Mol Microbiol. 2001, 39: 1572-1584. 10.1046/j.1365-2958.2001.02345.x.
- Szumanski MB, Boyle SM: Influence of cyclic AMP, agmatine, and a novel protein encoded by a flanking gene on speB (agmatine ureohydrolase) in Escherichia coli. J Bacteriol. 1992, 174: 758-764.
- Silverman M, Simon MI: Bacterial flagella. Annu Rev Microbiol. 1977, 31: 397-419. 10.1146/annurev.mi.31.100177.002145.
- Liu X, Matsumura P: Differential regulation of multiple overlapping promoters in flagellar class II operons in Escherichia coli. Mol Microbiol. 1996, 21: 613-620. 10.1111/j.1365-2958.1996.tb02569.x.
- Blais A, Dynlacht BD: Constructing transcriptional regulatory networks. Genes Dev. 2005, 19: 1499-1511. 10.1101/gad.1325605.
- Resendis-Antonio O, Freyre-Gonzalez JA, Menchaca-Mendez R, Gutierrez-Rios RM, Martinez-Antonio A, Avila-Sanchez C, Collado-Vides J: Modular analysis of the transcriptional regulatory network of E. coli. Trends Genet. 2005, 21: 16-20. 10.1016/j.tig.2004.11.010.
- Balazsi G, Barabasi AL, Oltvai ZN: Topological units of environmental signal processing in the transcriptional regulatory network of Escherichia coli. Proc Natl Acad Sci U S A. 2005, 102: 7841-7846. 10.1073/pnas.0500365102.
- Salgado H, Gama-Castro S, Peralta-Gil M, Diaz-Peredo E, Sanchez-Solano F, Santos-Zavaleta A, Martinez-Flores I, Jimenez-Jacinto V, Bonavides-Martinez C, Segura-Salazar J, Martinez-Antonio A, Collado-Vides J: RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions. Nucleic Acids Res. 2006, 34: D394-D397. 10.1093/nar/gkj156.
- Gutierrez-Rios RM, Rosenblueth DA, Loza JA, Huerta AM, Glasner JD, Blattner FR, Collado-Vides J: Regulatory network of Escherichia coli: consistency between literature knowledge and microarray profiles. Genome Res. 2003, 13: 2435-2443. 10.1101/gr.1387003.
- Martinez-Antonio A, Collado-Vides J: Identifying global regulators in transcriptional regulatory networks in bacteria. Curr Opin Microbiol. 2003, 6: 482-489. 10.1016/j.mib.2003.09.002.
- Barnard A, Wolfe A, Busby S: Regulation at complex bacterial promoters: how bacteria use different promoter organizations to produce different regulatory outcomes. Curr Opin Microbiol. 2004, 7: 102-108. 10.1016/j.mib.2004.02.011.
- Richet E, Vidal-Ingigliardi D, Raibaud O: A new mechanism for coactivation of transcription initiation: repositioning of an activator triggered by the binding of a second activator. Cell. 1991, 66: 1185-1195. 10.1016/0092-8674(91)90041-V.
- Travers A, Muskhelishvili G: DNA microloops and microdomains: a general mechanism for transcription activation by torsional transmission. J Mol Biol. 1998, 279: 1027-1043. 10.1006/jmbi.1998.1834.
- Bongaerts J, Zoske S, Weidner U, Unden G: Transcriptional regulation of the proton translocating NADH dehydrogenase genes (nuoA-N) of Escherichia coli by electron acceptors, electron donors and gene regulators. Mol Microbiol. 1995, 16: 521-534. 10.1111/j.1365-2958.1995.tb02416.x.
- Wackwitz B, Bongaerts J, Goodman SD, Unden G: Growth phase-dependent regulation of nuoA-N expression in Escherichia coli K-12 by the Fis protein: upstream binding sites and bioenergetic significance. Mol Gen Genet. 1999, 262: 876-883. 10.1007/s004380051153.
- Martin RG, Rosner JL: Fis, an accessorial factor for transcriptional activation of the mar (multiple antibiotic resistance) promoter of Escherichia coli in the presence of the activator MarA, SoxS, or Rob. J Bacteriol. 1997, 179: 7410-7419.
- Browning DF, Grainger DC, Beatty CM, Wolfe AJ, Cole JA, Busby SJ: Integration of three signals at the Escherichia coli nrf promoter: a role for Fis protein in catabolite repression. Mol Microbiol. 2005, 57: 496-510. 10.1111/j.1365-2958.2005.04701.x.
- Shin D, Cho N, Heu S, Ryu S: Selective regulation of ptsG expression by Fis. Formation of either activating or repressing nucleoprotein complex in response to glucose. J Biol Chem. 2003, 278: 14776-14781. 10.1074/jbc.M213248200.
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998, 95: 14863-14868. 10.1073/pnas.95.25.14863.
- Nasser W, Schneider R, Travers A, Muskhelishvili G: CRP modulates fis transcription by alternate formation of activating and repressing nucleoprotein complexes. J Biol Chem. 2001, 276: 17878-17886. 10.1074/jbc.M100632200.
- Bledig SA, Ramseier TM, Saier MH: FruR mediates catabolite activation of pyruvate kinase (pykF) gene expression in Escherichia coli. J Bacteriol. 1996, 178: 280-283.
- Ramseier TM, Negre D, Cortay JC, Scarabel M, Cozzone AJ, Saier MH: In vitro binding of the pleiotropic transcriptional regulatory protein, FruR, to the fru, pps, ace, pts and icd operons of Escherichia coli and Salmonella typhimurium. J Mol Biol. 1993, 234: 28-44. 10.1006/jmbi.1993.1561.
- Liu X, Ferenci T: An analysis of multifactorial influences on the transcriptional control of ompF and ompC porin expression under nutrient limitation. Microbiology. 2001, 147: 2981-2989.
- Jenkins DE, Auger EA, Matin A: Role of RpoH, a heat shock regulator protein, in Escherichia coli carbon starvation protein synthesis and survival. J Bacteriol. 1991, 173: 1992-1996.
- Wonderling LD, Stauffer GV: The cyclic AMP receptor protein is dependent on GcvA for regulation of the gcv operon. J Bacteriol. 1999, 181: 1912-1919.
- Quail MA, Haydon DJ, Guest JR: The pdhR-aceEF-lpd operon of Escherichia coli expresses the pyruvate dehydrogenase complex. Mol Microbiol. 1994, 12: 95-104. 10.1111/j.1365-2958.1994.tb00998.x.
- Bell PJ, Andrews SC, Sivak MN, Guest JR: Nucleotide sequence of the FNR-regulated fumarase gene (fumB) of Escherichia coli K-12. J Bacteriol. 1989, 171: 3494-3503.
- Woods SA, Miles JS, Roberts RE, Guest JR: Structural and functional relationships between fumarase and aspartase. Nucleotide sequences of the fumarase (fumC) and aspartase (aspA) genes of Escherichia coli K12. Biochem J. 1986, 237: 547-557.
- Perez-Rueda E, Collado-Vides J, Segovia L: Phylogenetic distribution of DNA-binding transcription factors in bacteria and archaea. Comput Biol Chem. 2004, 28: 341-350. 10.1016/j.compbiolchem.2004.09.004.
- Caldwell R, Sapolsky R, Weyler W, Maile RR, Causey SC, Ferrari E: Correlation between Bacillus subtilis scoC phenotype and gene expression determined using microarrays for transcriptome analysis. J Bacteriol. 2001, 183: 7329-7340. 10.1128/JB.183.24.7329-7340.2001.
- http://regulondb.ccg.unam.mx/. 2007, http://regulondb.ccg.unam.mx/
- http://www.ecocyc.org/. 2007, http://www.ecocyc.org/
- De Hoon MJ, Imoto S, Nolan J, Miyano S: Open source clustering software. Bioinformatics. 2004, 20: 1453-1454. 10.1093/bioinformatics/bth078.
- Saldanha AJ: Java Treeview--extensible visualization of microarray data. Bioinformatics. 2004, 20: 3246-3248. 10.1093/bioinformatics/bth349.
- Keseler IM, Collado-Vides J, Gama-Castro S, Ingraham J, Paley S, Paulsen IT, Peralta-Gil M, Karp PD: EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Res. 2005, 33: D334-D337. 10.1093/nar/gki108.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.