Airway epithelium is the primary target for pathogens. It functions not only as a mechanical barrier, but also as an important sentinel of the innate immune system. However, the interactions and processes between host airway epithelium and pathogens are not fully understood.
In this study, we identified responses of the human airway epithelium cells to respiratory pathogen infection. We retrieved three mRNA expression microarray datasets from the Gene Expression Omnibus database, and identified 116 differentially expressed genes common to all three datasets. Gene functional annotations were performed using Gene Ontology and pathway analyses. Using protein-protein interaction network analysis and text mining, we identified a subset of genes functioned as a group and associated with infection, inflammation, tissue adhesion, and receptor internalization in infected epithelial cells. These genes were further identified in BESE-2B cells in response to Talaromyces marneffei by Real-Time quantitative PCR (qRT-PCR). In addition, we performed an in silico prediction of microRNA-target interactions and examined our findings.
Using bioinformatics analysis, we identified several genes that may serve as biomarkers for the diagnosis or the surveillance of early respiratory tract infection, and identified additional genes and miRNAs that warrant further fundamental experimental research.
Respiratory tract infections are common diseases caused by a number of diverse pathogens. These diseases are serious public health concerns globally and pose significant challenges for the World Health Organization. Although treatment using potent anti-infection drugs and effective vaccines has greatly reduced mortality worldwide, respiratory tract infections remain a leading cause of death, in particular, for developing countries . In 2015, there were 17.2 billion cases of upper respiratory infections . A respiratory tract infection is a major risk to patients with chronic respiratory diseases, such as asthma, chronic obstructive pulmonary diseases, and bronchiectasis. Infections caused by viral and/or bacterial pathogens exacerbate chronic respiratory diseases .
The airway epithelium is an extremely important barrier against respiratory pathogens. It covers a large surface area and is the primary target for respiratory pathogens . In addition to acting as a natural barrier, recent studies found that airway epithelium functions as an important sentinel of the innate immune system against pathogens . Interactions between epithelial cells and pathogens are inherently complex and encompass many different factors, including adhesion, internalization, and induction of the immune response . In addition to being able to invade epithelial cells, some pathogens are capable of reproducing within infected cells, and in avoiding detection by the host immune system . Therefore, prevention of pathogens from invading airway epithelial cells is crucial for the prophylaxis and treatment of respiratory infections. In the present study, we investigated interactions between airway epithelium cells and respiratory pathogens to better understand the underlying pathogenesis. Our findings may provide improved prophylaxis, surveillance, earlier accurate diagnosis, and better treatments.
Currently, high-throughput technologies such as microarrays and next-generation sequencing combined with bioinformatics analysis enable the generation and analysis of very large datasets, including mRNA, miRNA, and long non-coding RNA expression profiles, and DNA methylation. Such datasets are available in public archives such as the Gene Expression Omnibus (GEO). In this study, we retrieved three datasets of mRNA expression microarrays from GEO, and using bioinformatics analysis, we identified a group of genes as biomarkers in airway epithelial cells response to infection by respiratory pathogens, and we also identified several candidate targets for further fundamental experimental research.
The mRNA expression profile datasets GSE3397 (Additional file 1), GSE6802 (Additional file 2), and GSE48466 (Additional file 3) were generated using the GPL570, GPL571 and GPL570 microarray platform at Duke University Medical Center, the Technical University Munich, and the University of Louisville, respectively. Thirty-five samples were used consisting of 10 samples of normal human bronchial epithelial cells and 25 samples of human bronchial epithelial cells exposed to the H1N1 influenza virus, the respiratory syncytial virus, Staphylococcus aureus, and Pseudomonas aeruginosa (Table 1).
Differentially expressed genes
Using GEO2R, we identified 2033, 1241, and 12,950 differentially expressed genes (DEGs) between normal and infected airway epithelial cells from the GSE3397, GSE6802, and GSE48466 datasets, respectively. We found that 116 genes were differentially expressed in all three datasets (Fig. 1). This set of 116 genes underwent further evaluation.
This subset of 116 genes was uploaded to DAVID database for biological and functional assessments, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Using Gene Ontology (GO) analysis, we found that these DEGs were usually involved in one of the following 10 biological processes: regulation of foam cell differentiation, regulation of tyrosine phosphorylation of the STAT protein, myeloid leukocyte differentiation, regulation of the JAK-STAT cascade, response to cytokine stimulus, myeloid cell differentiation, cAMP-mediated signaling, anti-apoptosis, ectoderm development and tissue morphogenesis (Table 2). We found that the most significantly enriched pathways were the ubiquitin mediated proteolytic pathway, NOD-like receptor (NLR) signaling pathway, and the apoptosis pathway (Table 3).
Protein and protein network
Next, the 116 DEGs were analyzed using the STRING database, and 71 protein-protein interactions (PPI) pairs were derived (Fig. 2a), which then underwent analysis using Cytoscape to construct PPI network. Using the plugin DCOME, we identified three significant modules consisting of the following 12 hub genes (Fig. 2b): CTSS, NOTCH4, IL8, CREB1, TCF3, SERPINA1, PTGER3, RGS4, OPRM1, MPP6, FGFR1, and NSUN3. We found using enrichment analysis that the main roles of these hub genes were in the inflammatory response.
Verification of hub genes
To verify the 12 hub genes, we used qRT-PCR to identify the expression level of these differentially expressed mRNAs in human bronchial epithelial cells infected with Talaromyces marneffei (T.marneffei, Fig. 3). The results of qRT-PCR showed that all hub genes, except for OPRM1 were differentially expressed (P < 0.05, Table 4), and were consistent with the three microarray datasets. These suggest that the 12 hub genes may function as a group, and have a role in the pathogens infection to human bronchial epithelial cells.
Text mining and prediction of miRNA
To demonstrate further the relationship between the hub genes and respiratory tract infection, text mining was performed using COREMINE. Co-occurrence analysis of the literature was conducted using “gene symbols”, “receptor internalization”, “tissue adhesion”, “inflammation”, and “infection” as search terms. We found that 10 genes (CTSS, NOTCH4, IL8, CREB1, TCF3, SERPINA1, PTGER3, RGS4, OPRM1, and FGFR1) were identified in the text-mining searches. Although all 10 of these genes were related to both inflammation and infection in the literature, we also found that two of these genes IL8 and SERPINA1 were associated with tissue adhesion, whereas six genes IL8, CREB1, RGS4, PTGER3, OPRM1, and FGFR1 were associated with and receptor internalization (Fig. 4).
Searching the miRWalk, miRanda, miRDB, RNA22, and Targetscan databases, we predicted miRNAs for the 12 hub genes previously identified. Although eight miRNAs (hsa-miR-762, hsa-miR-93-5p, hsa-miR-20a-5p, hsa-miR-3192-5p, hsa-miR-1294, hsa-miR-1972, hsa-miR-106b-5p and hsa-miR-526b-3p) were predicted by all 5 databases, we found that only one gene (TCF3) has a miRNA-target interaction. In addition, we found that hsa-miR-762 was associated with pathogen infection . There was no evidence in the literature regarding the remaining miRNAs and an association with infectious diseases.
The GEO database includes high-throughput gene expression datasets and other functional genomics data . The online tool, GEO2R is based on the R programming language and performs statistical analysis enabling users to access and analyze practically any GEO Series, regardless of data type . This tool was used to analyze published microarray data in the GEO database. However, because the scope of microarray analysis in independent studies may have been limited, and that different experiments may identify different gene or mRNA targets, the robustness and reliability of these previous findings may be limited. To address this potential limitation in the present study, we used three independent microarray datasets of gene expression from airway epithelial cells responding to infection by respiratory pathogens, and collectively analyzed and evaluated all three datasets in our study. Therefore, our identification of genes that are significantly dysregulated in airway epithelial cells exposed to pathogens was identified using the collective data from three different microarray expression studies.
In the present study, we found 116 DEGs in our analysis of three independent microarray datasets. We used GO enrichment to understand better the underlying biological processes that are associated with these genes. GO is a major bioinformatics initiative to unify the characterization of gene and gene product attributes in all organisms . For the present study, GO terms may identify the putative role that our set of genes may have in the process of the pathogen infection of airway epithelial cells. Our analysis of biological processes and signaling pathways demonstrated that these 116 genes were mainly involved in myeloid cell differentiation, cytokine stimulus, regulation of tyrosine phosphorylation of the STAT protein, regulation of the JAK-STAT cascade, and the NLR signaling pathway. These findings indicate that the functions of these genes are closely related to the immune response. For example, the JAK-STAT cascade is a major intracellular signaling response elicited by class I cytokine receptors, the activation of which results in direct and rapid changes in gene expression upon cytokine stimulation . In addition, The NLR family pyrin domain containing 3 (NLRP3) inflammasome is a multiprotein complex that orchestrates innate immune responses to infection and cell stress through activation of caspase-1 (CASP1) and maturation of the inflammatory cytokines pro-interleukin-1β (pro-IL-1β) and pro-IL-18 .
To examine further the interrelationships between the DEGs, we constructed a PPI network and identified the following 12 hub genes: CTSS, NOTCH4, IL8, CREB1, TCF3, SERPINA1, PTGER3, RGS4, OPRM1, MPP6, FGFR1, and NSUN3. We found that these hub genes may function as a group and have an important role in viruses or bacteria infection. In order to investigate whether the group of hub genes is altered by particular pathogen, we further identified these hub genes by qRT-PCR in BEAS-2B cells infected by T.marneffei, a thermal dimorphic pathogenic fungus, which is able to cause fatal systemic infections in human. Our results showed that these hub genes involved in human bronchial epithelial cell in response to T.marneffei. So we posit that these genes may have clinical applications serve as biomarkers for early respiratory tract infection. In addition, we found using enrichment analysis that these hub genes were also associated with inflammation.
Text mining indicated that the majority of these hub genes were related to tissue adhesion, receptor internalization, inflammation, and infection. For example, human epithelial cells infected with Chlamydia secrete IL8, which is known to be activated by p42/44 MAPK cascade . Aronoff et al. demonstrated that the prostaglandin E receptors 2 and 3 (PTGER2-PTGER3) axis has important roles in the prevention and treatment of infectious diseases . In addition to the host defense response, adhesion, receptor internalization, and inflammatory responses are intrinsic to pathogen invasion. Of the 12 hub genes identified, we found that two genes were associated with tissue adhesion, six genes were associated with receptor internalization, and 10 genes were associated with inflammation. This finding demonstrates that these genes may have key roles in the process of pathogen infection in airway epithelial cells. Adhesion, receptor internalization and inflammatory response were intimately related to the invasion of pathogens and host defense against pathogens . However, because there are no prior reports in the literature regarding the two hub genes MPP6 and NSUN3, and their potential role in airway epithelial cell infection, further studies to investigate their functions are warranted.
To provide additional targets for further research, we also preformed miRNA prediction of these hub genes. We found that a single gene (TCF3) has a miRNA-target interaction. Whereas among the eight candidate targets of microRNAs, hsa-miR-762 was previously reported to be associated with Pseudomonas aeruginosa infection , we did not find any prior investigations with the remaining seven in silico predicted miRNAs and infection. However, based on our findings, these miRNAs may warrant future study.
In summary, by integrating the data from 3 independent microarray studies, and though an extensive functional assessment including evaluation of signaling pathways, annotation of biological processes, determination of protein-protein interactions, text mining, and identification of miRNA-target interactions, we identified a robust set of DEGs associated with airway epithelial cell response to pathogen infection. This group of genes may be tested in the sputum or bronchoalveolar lavage fluid samples as biomarkers for the diagnosis or the surveillance of early respiratory tract infections. In addition, we identified several genes and miRNAs as intriguing targets for further research. Understanding the pathogenesis will not only lead to improving prognosis and diagnosis, but also better therapeutic strategies and treatments to prevent respiratory tract infections.
Data collection and identification of DEGs
Three mRNA microarray expression profile datasets GSE3397 , GSE6802  and GSE48466  were retrieved from the GEO database (http://www.ncbi.nlm.nih.gov/geo/, accession numbers: GSE3397, GSE6802, and GSE48466) . DEGs were screened using GEO2R, which is an interface web tool available in GEO. The identification of DEGs between human airway epithelial cells exposed to respiratory pathogens and normal cells was statistically filtered using a P-value < 0.05.
GO and KEGG Pathway enrichment analyses were performed using the online tool DAVID version 6.8 (https://david.ncifcrf.gov/) . A P < 0.05 was considered statistically significant. Protein and protein interaction networks were constructed using STRING version 10.0 (http://string-db.org/) . The STRING database enables a critical assessment and integration of PPI, including physical as well as functional associations . DEGs were mapped into the protein search, the organism was defined as Homo sapiens, and the genes were queried in the interaction networks. The Molecular Complex Detection (DCOME) in Cytoscape was used to analyze the modules of the PPI network. A score greater than three was identified as hub nodes and edges. Text mining for prediction of gene function was performed using COREMINE (http://www.coremine.com/medical/) . To predict miRNA-target interactions in the 3′-UTR region of genes, we used the miRWalk, miRanda, miRDB, RNA22 and Targetscan databases (http://zmf.umm.uni-heidelberg.de/apps/zmf/mirwalk2/) .
T.marneffei strain and conidia preparation
T.marneffei strain was isolated from the sputum of a patient suffering from disseminated T. marneffei infection at the first affiliated hospital of Guangxi Medical University. T.marneffei was isolated as part of standard care of the patient, and was cultured on potato dextrose agar (PDA) medium (Beijing Luqiao Technology) at 25 °C for 7–10 days. Colonies were washed with sterile phosphate buffed saline (PBS), and then conidia were collected by centrifugation.
Cell line culture and fungal infection
The human bronchial epithelial cell line BEAS-2B (ATCC® CRL-9609™) was stocked at the Experimental Center of Guangxi Medical University. Cells were cultured in RPMI1640 medium mixed with 10% fetal bovine serum (Invitrogen) at 37 °C. The test group was infected with conidia of T.marneffei for 4 hours.
RNA extraction and qRT-PCR
Cells of both control and test groups were used to extract total RNA with TRIzol reagent (Invitrogen), following the manufacturer’s instructions. RNA quality and quantity were measured by Nucleic Acid Protein Detector. Total RNA was used for synthesis of cDNA with the First Strand cDNA Synthesis Kit (TaKaRa). The qRT-PCR were run using the following program: 1 cycle at 95 °C for 30 s, 40 cycles at 95 °C for 3 s, and an extension step at 60 °C for 30 s. The relative expression level of mRNAs was performed by using 2−ΔΔCT analysis method .The primers used are as follows (Table 5). GAPDH expression served as internal control.
Results are shown as mean ± standard deviation (SD) for three repeated independent experiments for each group. Statistical comparisons were conducted by using SPSS20.0 and significance was assessed by two-tailed Student’s t-test. Results with p < 0.05 were considered statistically significant.
Differentially expressed genes
Gene Expression Omnibus
Kyoto Encyclopedia of Genes and Genomes
Pillay-van Wyk V, Msemburi W, Laubscher R, Dorrington RE, Groenewald P, Glass T, Nojilana B, Joubert JD, Matzopoulos R, Prinsloo M, et al. Mortality trends and differentials in South Africa from 1997 to 2012: second National Burden of disease study. Lancet Glob Health. 2016;4(9):e642–53.
Disease GBD, Injury I, Prevalence C. Global, regional, and national incidence, prevalence, and years lived with disability for 310 diseases and injuries, 1990-2015: a systematic analysis for the global burden of disease study 2015. Lancet (London, England). 2016;388(10053):1545–602.
Hippenstiel S, Opitz B, Schmeck B, Suttorp N. Lung epithelium as a sentinel and effector system in pneumonia--molecular mechanisms of pathogen recognition and signal transduction. Respir Res. 2006;7:97.
Ikeda M, Enomoto N, Hashimoto D, Fujisawa T, Inui N, Nakamura Y, Suda T, Nagata T. Nontypeable Haemophilus influenzae exploits the interaction between protein-E and vitronectin for the adherence and invasion to bronchial epithelial cells. BMC Microbiol. 2015;15:263.
Mun J, Tam C, Chan G, Kim JH, Evans D, Fleiszig S. MicroRNA-762 is upregulated in human corneal epithelial cells in response to tear fluid and Pseudomonas aeruginosa antigens and negatively regulates the expression of host defense genes encoding RNase7 and ST2. PLoS One. 2013;8(2):e57850.
Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, et al. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res. 2013;41(Database issue):D991–5.
Hill DP, D'Eustachio P, Berardini TZ, Mungall CJ, Renedo N, Blake JA. Modeling biochemical pathways in the gene ontology. Database: the journal of biological databases and curation. 2016; 2016: article ID baw126. https://doi.org/10.1093/database/baw126.
Li HS, Watowich SS. Innate immune regulation by STAT-mediated transcriptional mechanisms. Immunol Rev. 2014;261(1):84–101.
Aronoff DM, Lewis C, Serezani CH, Eaton KA, Goel D, Phipps JC, Peters-Golden M, Mancuso P. E-prostanoid 3 receptor deletion improves pulmonary host defense and protects mice from death in severe Streptococcus pneumoniae infection. Journal of immunology (Baltimore, Md : 1950). 2009;183(4):2642–9.
Huang YC, Li Z, Hyseni X, Schmitt M, Devlin RB, Karoly ED, Soukup JM. Identification of gene biomarkers for respiratory syncytial virus infection in a bronchial epithelial cell line. Genomic medicine. 2008;2(3–4):113–25.
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(D1):D447–52.
de Leeuw N, Dijkhuizen T, Hehir-Kwa JY, Carter NP, Feuk L, Firth HV, Kuhn RM, Ledbetter DH, Martin CL, van Ravenswaaij-Arts CM, et al. Diagnostic interpretation of array data using public databases and internet sources. Hum Mutat. 2012;33(6):930–40.
The raw data generated or analysed during this study are available online (http://www.ncbi.nlm.nih.gov/geo/, accession numbers: GSE3397, GSE6802, and GSE48466). The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Authors and Affiliations
Department of Respiratory Medicine, the First Affiliated Hospital of Guangxi Medical University, Nanning, 530021, Guangxi, China
Yinghua Li, Jianquan Zhang, Xiaoning Zhong & Zhiyi He
Department of Respiratory Medicine, the Second Affiliated Hospital of Guangxi Medical University, Nanning, 530007, Guangxi, China
ZH and YL conceived and designed the study. YL performed the experiments, analyzed the data and wrote the manuscript. GL, JZ and XZ provided suggestions during manuscript preparation, and critically read and revised the manuscript. All authors of the manuscript have contributed,read and agreed to its content and are accountable for all aspects of the accuracy and integrity of the manuscript.
T.marneffei was isolated as part of standard care of the patient and the subsequent isolation of the microorganism was undertaken according to standard laboratory processes. The patient in this study provided oral informed consent, and the rights and interests of the patient were fully considered and protected. Our study was approved by the First Affiliated Hospital of Guangxi Medical University Ethical Review Committee [Approval number: 2018(KY-E-013)]. The Medical Ethics Committee of First Affiliated Hospital of Guangxi Medical University also approved the method of obtaining consent.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The GSE48466 mRNA expression profile dataset. 12950 DEGs between normal and infected airway epithelial cells were identified. (XLSX 964 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Li, Y., Liu, G., Zhang, J. et al. Identification of key genes in human airway epithelial cells in response to respiratory pathogens using microarray analysis.
BMC Microbiol18, 58 (2018). https://doi.org/10.1186/s12866-018-1187-7