
The composition of lung microbiome in lung cancer: a systematic review and meta-analysis

Abstract

Background

Although recent studies have indicated that imbalance in the respiratory microbiome composition is linked to several chronic respiratory diseases, the association between the lung microbiome and lung cancer has not been extensively studied. Conflicting reports from individual studies on respiratory microbiome alterations in lung cancer make it difficult to specify how the lung microbiome is linked to the disease. Consequently, in this first meta-analysis on the topic, we integrate publicly available 16S rRNA gene sequence data from lung tissue samples of lung cancer patients to identify bacterial taxa that differ consistently between case and control groups.

Results

The findings of the current study suggest that the relative abundance of several bacterial taxa including Actinobacteria phylum, Corynebacteriaceae and Halomonadaceae families, and Corynebacterium, Lachnoanaerobaculum, and Halomonas genera is significantly decreased (p < 0.05) in lung tumor tissues of lung cancer patients in comparison with tumor-adjacent normal tissues.

Conclusions

Although the findings require further scrutiny, the present study lays the groundwork for future research and adds to our limited understanding of the key role of the lung microbiome and its complex interaction with lung cancer. More data on demographic factors and tumor tissue types would allow the lung microbial community to be characterized more accurately by subtype and stage of the disease, and would help capture the changes of the lung microbiome in lung cancer more fully.


Background

With the advent of culture-independent DNA sequencing technologies and the development of Next Generation Sequencing (NGS) techniques, the previously unknown world of the human microbiome (all the microorganisms that inhabit a specific environment such as the human body, including bacteria, archaea, eukaryotes, and viruses, along with their genomes and surrounding environmental conditions) [1] has been recognized and has received considerable attention. Numerous studies have investigated the interplay between the microbiome and the host immune system during health and disease, arriving at a consensus that a dysbiotic microbiome may be correlated with disease onset and progression [2, 3]. Because the lungs were initially assumed to be sterile sites, the dynamic changes that may occur in the lower respiratory tract microbiome were long neglected. However, new findings revealing the existence of a low-density yet diverse microbial ecosystem in healthy lungs have confirmed its critical role during respiratory diseases [4, 5]. It has been demonstrated that an impaired lung microbiome is associated with the development of chronic lung diseases such as chronic obstructive pulmonary disease (COPD) [6,7,8], cystic fibrosis (CF) [9,10,11], asthma [12,13,14], and idiopathic pulmonary fibrosis (IPF) [15,16,17]. In recent years, the specific impact of the microbiome on lung cancer has gained increasing interest. Lung cancer is one of the most serious lung diseases and one of the most common cancers in both men and women. With a high mortality rate (1.6 million deaths annually), lung cancer is the leading cause of cancer death worldwide [18]. The disease is initially asymptomatic and is usually diagnosed at advanced stages. The late diagnosis and high mortality rate of lung cancer emphasize the importance of identifying microbial composition and potential signatures according to the stages of the disease. 
Dozens of studies have relied mostly on the 16S rRNA amplicon sequencing approach, using different clinical samples including bronchoalveolar lavage (BAL) fluid, sputum, saliva, and lung biopsy, to evaluate the contribution of the lung microbiome to lung cancer [19,20,21,22,23,24], as the application of shotgun sequencing to the respiratory microbiome is still in its infancy [5]. Although most prior investigations have found significant differences in the taxonomic composition of the lung microbiome in the disease state, specifying the microbial profiles and patterns that may contribute to the pathogenesis of the disease remains a major challenge owing to inconsistencies among reported studies. Contradictory results may stem from the lack of a standard pipeline for preprocessing the metagenomic data; differences in study designs, clinical sample types, and computational methods; and other confounding factors and inter-study batch effects such as experimental procedures, the hypervariable regions of the 16S rRNA gene targeted for amplification, and sequencing platforms [25]. These limitations hinder the generalizability of the results and call for meta-analysis studies. Meta-analyses are conducted with the aim of reducing the bias of individual studies, specifically that arising from small sample sizes, and thus yielding more robust results; their power has been demonstrated by recent microbiome meta-analyses that have identified disease-associated microbial signatures [26,27,28,29]. In the present study, as the first meta-analysis of the lung microbiome in lung cancer (LC) patients, we reprocessed and integrated 16S rRNA gene sequence data on lung biopsy specimens with geographically different sample origins across five studies consisting of 356 tumor tissues and 493 tumor-adjacent normal tissues. 
We aimed to identify the differences in the microbiome between the two groups and to determine possible associations between the taxonomic composition of the lung microbiome and lung cancer.

Results

To investigate the possible changes of the lung microbiome in lung cancer, raw sequence data from a total of five studies were processed into relative abundance data. Samples used in these studies were obtained from patients in different geographic regions, yet subjects were age-homogeneous. To distinguish between cases and controls in terms of taxonomic changes of the lung microbiome, differences between tumor tissues and tumor-adjacent normal tissues were determined using Generalized Additive Models for Location, Scale and Shape (GAMLSS) [30, 31] with a zero-inflated beta distribution (BEZI) within each study. Regression coefficients from the GAMLSS-BEZI model were then retrieved as summary statistics and combined by employing a random-effects meta-analysis model to seek consistent associations between lung cancer and the lung microbiome at various taxonomic levels. At phylum level, there was a significant difference (p = 0.016) in Actinobacteria between cases and controls (decreased relative abundance in tumor tissues). As can be seen in Fig. 1, although one study with the smallest sample size showed non-significant enrichment in Actinobacteria in tumor tissues, four other studies showed decreased relative abundance of Actinobacteria in cases vs. controls, with a significant decrease (p = 0.039) in study 4. The results obtained from the individual studies and the meta-analysis at the phylum level are presented in Table 1. The changes of all phyla across all studies can be compared in Fig. 2.
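To illustrate the pooling step described above, the following minimal Python sketch combines per-study log(OR) estimates and their standard errors with inverse-variance weights under a random-effects model. It is not the code used in this study: the DerSimonian-Laird estimator of the between-study variance is an assumption (the text does not state which estimator was applied), the function name `pool_random_effects` is hypothetical, and the input numbers are invented for illustration and are not taken from Table 1.

```python
import math


def pool_random_effects(betas, ses):
    """Pool per-study effect sizes with a random-effects model.

    betas: per-study log(OR) estimates; ses: their standard errors.
    The between-study variance tau^2 uses the DerSimonian-Laird estimator,
    which is an assumption of this sketch, not a detail given in the text.
    """
    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]  # fixed-effect inverse-variance weights
    mu_fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    # Cochran's Q: weighted dispersion of estimates around the fixed-effect mean.
    q = sum(wi * (b - mu_fixed) ** 2 for wi, b in zip(w, betas))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    mu = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))
    p = math.erfc(abs(mu / se_mu) / math.sqrt(2.0))  # two-sided normal p-value
    return mu, se_mu, tau2, p


# Hypothetical inputs, invented for illustration (not values from this study).
betas = [-0.40, -0.15, -0.60, 0.10]
ses = [0.20, 0.25, 0.30, 0.45]
mu, se_mu, tau2, p = pool_random_effects(betas, ses)
print(f"pooled log(OR) = {mu:.3f}, OR = {math.exp(mu):.3f}, "
      f"SE = {se_mu:.3f}, tau^2 = {tau2:.3f}, p = {p:.4f}")
```

A pooled log(OR) below zero corresponds to an odds ratio below one, that is, a lower relative abundance in tumor tissues than in tumor-adjacent normal tissues.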

Table 1 Results of GAMLSS-BEZI and Random Effects Meta-analysis across included studies at the phylum level
Fig. 1

Heat map of changes in the relative abundance of Actinobacteria phylum across all studies. Regression coefficients from GAMLSS-BEZI are log (odds ratio) (log(OR)) of changes in the relative abundance of a specific bacterial taxon between the case and control groups and pooled log (OR) estimate is from a random-effects meta-analysis. The case group is considered as the reference group and shown in the heat map. Log(OR) > 0 indicates an increase and log(OR) < 0 indicates a decrease in the relative abundance of Actinobacteria in tumor tissues as compared to tumor-adjacent normal tissues

Fig. 2

Phylum level meta-analysis between tumor tissues and tumor-adjacent normal tissues; heat map for representing changes of all phyla. Statistically significant differences between the two groups with p-values < 0.05 are denoted with * and those with p-values < 0.0001 are denoted with **. The white parts in the heat map represent the bacterial taxa that are not available in a particular study. S (1-5): Study; MA: Meta-analysis

At family level, Corynebacteriaceae (p = 0.012 across four of the studies) and Halomonadaceae (p = 0.016 across two of the studies) were significantly decreased in tumor tissues. As shown in Fig. 3, there was a consistent decrease (significantly decreased in study 2) in the relative abundance of both Corynebacteriaceae and Halomonadaceae in tumor tissues. The results obtained from the individual studies and the meta-analysis at the family level are summarized in Table 2. The changes of all families across all studies can be compared in Fig. 4.

Table 2 Results of GAMLSS-BEZI and Random Effects Meta-analysis across included studies at the family level
Fig. 3

Heat map of changes in the relative abundance of Corynebacteriaceae and Halomonadaceae families across all studies. Regression coefficients from GAMLSS-BEZI are log (odds ratio) (log(OR)) of changes in the relative abundance of a specific bacterial taxon between the case and control groups and pooled log (OR) estimate is from a random-effects meta-analysis. The case group is considered as the reference group and shown in the heat map. Log(OR) < 0 indicates a decrease in the relative abundance of Corynebacteriaceae and Halomonadaceae in tumor tissues as compared to tumor-adjacent normal tissues. The white parts in the heat map represent the bacterial taxa that are not available in a particular study

Fig. 4

Family level meta-analysis between tumor tissues and tumor-adjacent normal tissues; heat map for representing changes of all families. Statistically significant differences between the two groups with p-values < 0.05 are denoted with * and those with p-values < 0.0001 are denoted with **. The white parts in the heat map represent the bacterial taxa that are not available in a particular study. S (1-5): Study; MA: Meta-analysis

At genus level, the relative abundance of three genera was found to be significantly decreased in tumor tissues as compared to controls; Corynebacterium (p = 0.012 across four of the studies), Lachnoanaerobaculum (p = 0.015 across two of the studies), and Halomonas (p = 0.018 across two of the studies). As illustrated in Fig. 5, the direction of changes was consistent across studies. The relative abundance of Corynebacterium was consistently decreased in cases relative to controls across the four studies (significantly decreased in study 2). Analogous to that of Corynebacterium, the relative abundance of Lachnoanaerobaculum and Halomonas was also consistently lower in cases across the two studies (studies 2 and 5); with a significant result in study 2. The results obtained from the individual studies and the meta-analysis at the genus level are shown in Table 3. The changes of all genera across all studies can be compared in Fig. 6.

Table 3 Results of GAMLSS-BEZI and Random Effects Meta-analysis across included studies at the genus level

Taken together, these results suggest that there is an association between lung microbiome dysbiosis and lung cancer.

Fig. 5

Heat map of changes in the relative abundance of Corynebacterium, Lachnoanaerobaculum, and Halomonas genera across all studies. Regression coefficients from GAMLSS-BEZI are log (odds ratio) (log(OR)) of changes in the relative abundance of a specific bacterial taxon between the case and control groups and pooled log (OR) estimate is from a random-effects meta-analysis. The case group is considered as the reference group and shown in the heat map. Log(OR) < 0 indicates a decrease in the relative abundance of Corynebacterium, Lachnoanaerobaculum, and Halomonas in tumor tissues as compared to tumor-adjacent normal tissues. The white parts in the heat map represent the bacterial taxa that are not available in a particular study

Fig. 6

Genus level meta-analysis between tumor tissues and tumor-adjacent normal tissues; heat map for representing changes of all genera. Statistically significant differences between the two groups with p-values < 0.05 are denoted with * and those with p-values < 0.0001 are denoted with **. The white parts in the heat map represent the bacterial taxa that are not available in a particular study. S (1-5): Study; MA: Meta-analysis

Discussion

Although prior studies have shown the major impact of microbiome dysbiosis on respiratory diseases such as COPD, cystic fibrosis, asthma, and idiopathic pulmonary fibrosis, very little was found in the literature on microbiome alterations during lung cancer. Available reports on the microbiome composition specific to lung cancer have been highly inconsistent and therefore the question remains unanswered. In reviewing the literature, no meta-analysis was found on the association between the microbiome and lung cancer, mainly due to the fact that the lung microbiome is considered an emerging research area. Sterility of the lungs is still a matter of controversy. Although microbial culture of healthy individuals’ lower respiratory tract specimens is negative, metagenomics studies have convinced scientists of the existence of genomes of various microorganisms residing in the lungs most of which are not culturable. In this regard, there are some issues that cause studies to obtain different results; including type of clinical sample, different sampling methods, challenges of the process of metagenomic analysis, and the personalized nature of the human microbiome, and therefore identifying and determining the composition and abundance of the lung microbiome in healthy individuals is still a major challenge. Accordingly, as the first meta-analysis aiming to represent taxonomic changes of the lung microbiome in lung tumor tissues, the primary aim of this meta-analysis is to integrate the results of such conflicting studies for better understanding of the alterations in the microbiome content.

Similar to other meta-analyses of microbiome-disease associations, we have used cross-sectional studies here. Although the microbiome may be dynamic and change over time as environmental conditions change, the most important part of each individual’s microbiome, namely the core microbiome, is stable (the core microbiome comprises microbial taxa or genes that are stable over time, shared by all or most of the population, and particularly important for the host’s biological function). In fact, the core microbiome is considered the fingerprint of each individual’s microbiome, and most microbiome studies, both individual studies and meta-analyses, try to identify a pattern of the core microbiome in different people under different conditions. We found significant decreases in the relative abundance of several bacterial taxa at the phylum, family, and genus levels. Although the phylum Proteobacteria and the genus Streptococcus have been suggested as key bacteria in lung cancer [25], the findings of this study do not support the domination of Proteobacteria in tumor tissues. The results of this study showed a significant decrease in Halomonadaceae, a family of Proteobacteria, and Halomonas, a genus of Proteobacteria, in tumor tissues of LC patients. Another finding was that the relative abundance of the phylum Actinobacteria was significantly decreased in tumor tissues. This finding is consistent with that of Zhuang et al. [32], who reported a decrease of Actinobacteria in fecal samples of LC patients. However, this result is contrary to that of Apopa et al. [23], who found an increased level of Actinobacteria in LC tissue samples. Apopa et al. also reported an increased abundance of Proteobacteria, Bacteroidetes, and Firmicutes, differing from the results of our study. The results of this study did not show the significant decrease in Firmicutes suggested in a study by Greathouse et al. [22]. 
However, one of its genera, Lachnoanaerobaculum, was found to be significantly lower in tumor tissues in this meta-analysis. It is interesting to note that in all except one study (Apopa et al.) included in this meta-analysis, a decrease in Actinobacteria was observed in tumor tissues of LC patients. This inconsistency may partly be explained by the small sample size of the Apopa et al. study. Decreases in the family Corynebacteriaceae and the genus Corynebacterium of the phylum Actinobacteria were also statistically significant in this study. Previous studies have demonstrated that the relative abundance of multiple genera differs significantly in LC patients relative to controls. Hosgood et al. found lower diversity in sputum samples of LC patients, accompanied by an increase in the relative abundance of Granulicatella, Abiotrophia, and Streptococcus in comparison with healthy controls [20]. Similarly, lower diversity and an increased abundance of Streptococcus were reported in a study of protected specimen brush (PSB) samples from malignant parts of the lungs compared to healthy controls [33]. Tsay et al. reported that Streptococcus and Veillonella were more abundant in lower airway samples of LC patients compared to patients with benign lung diseases and healthy controls [34]. Yan et al. also found an enriched abundance of the family Veillonellaceae and of the genera Veillonella, Capnocytophaga, and Selenomonas in the saliva of LC patients [35]. An enriched abundance of the genera Veillonella and Megasphaera was also reported in a study of BAL fluid samples of LC patients compared to patients with benign mass-like lesions [19]. Most of the available studies have used sputum, PSB, saliva, and BAL fluid specimens rather than lung biopsy to study the microbiome of LC patients, mainly due to the difficulty of the biopsy sampling procedure. 
However, the high risk of contamination by the normal flora of the upper respiratory tract associated with the aforementioned sample types should not be ignored, particularly in the case of sputum. In fact, the family Veillonellaceae and the genera Veillonella and Streptococcus are members of the microbial community of the oral cavity, and reports of these genera as differentially abundant taxa between LC patients and controls may be an indication of cross-contamination. In this regard, the clinical sample with the lowest risk of contamination by the upper respiratory tract flora is lung biopsy, in which samples of lung tissue are isolated from the respiratory tract, making it an ideal sample for the lower respiratory system. It is interesting to note that the relative abundance of Streptococcus and Veillonella was not significantly different between cases and controls in this meta-analysis, which may be explained by the fact that all included studies used lung biopsy specimens. In a study by Peters et al., the lung tumor tissue microbiome was reported to be less diverse than that of paired normal tissue [36]. The results of this study showed a significant decrease in the relative abundance of Halomonas in tumor tissues. This finding is consistent with that of D’Alessandro-Gabazza et al., who also detected the presence of this genus in the lung tissue of patients with lung cancer as well as idiopathic pulmonary fibrosis [37]. A difference in the genus Halomonas between lung cancer and control groups was also reported by Li et al. [38]. All these observations emphasize the need for further investigation of the role of this genus in lung cancer, which may serve as an indicator for prognosis of the disease. Investigating the respiratory microbiome for clinical diagnosis and treatment of respiratory diseases is still at an early stage. 
In this study, we sought to identify a pattern or biomarker based on the composition of the lung microbiome that helps differentiate between normal and cancerous conditions. Given the poor prognosis of lung cancer and the high morbidity and mortality of the disease, the development of diagnostic, therapeutic, and prophylactic approaches (especially probiotic and prebiotic administration) based on microbiome composition is of crucial importance in the field of lung cancer.

To investigate host-microbiome interactions during disease states, several meta-analyses to date have taken advantage of NGS technology to discover microbial patterns linked to specific diseases, including inflammatory bowel disease (IBD), obesity, COPD, and colorectal cancer. Walters et al. [39], focusing on 16S rRNA gene sequencing studies, found a consistent pattern of taxonomic alterations in the gut microbiome of IBD patients. Using supervised learning, they could differentiate IBD from non-IBD individuals. In general, their observations on taxonomic changes were akin to those of individual studies, including a decrease in Firmicutes and Bacteroides and an increase in Proteobacteria and Actinobacteria, though they differed in several respects. Unlike prior studies, which had reported decreased Bacteroidetes, their meta-analysis did not show any statistically significant difference in the phylum Bacteroidetes between IBD subjects and healthy controls. In addition, considering inconsistencies in the findings of previous microbiome studies regarding the interaction between the airway microbiome and the host in COPD, Wang et al. [28] analyzed the COPD sputum microbiome using a total of 15 metagenomic datasets, adopting a multi-omic meta-analysis approach. To identify taxonomic alterations in the airway microbiome in COPD versus controls, they limited their meta-analysis to combining the results of two 16S rRNA gene datasets, because two datasets with a case-control design were available. Using random-effects meta-analysis, a total of 12 genus-level taxa were identified as differing significantly between the two groups. Moreover, a random forest classifier trained on COPD datasets showed that these 12 genera have the potential to distinguish COPD patients from controls. Using an independent multi-omic cohort, they validated their findings, indicating that these genera could be considered the taxonomic signature of the airway microbiome in COPD. 
In fact, these results indicate that some significant associations reported in individual studies may result from insufficient sample sizes and therefore with more statistical power provided with meta-analyses, the consistency and significance of these associations can be assessed more accurately. There are some points concerning performing a meta-analysis on microbiome studies. High heterogeneity of the data generated by high-throughput sequencing of 16S rRNA gene amplicons poses a challenge for inter-study comparisons. To take this matter into account, microbiome data are often standardized to relative abundance data where all microbial taxa range from zero to one. We adopted this standardization approach while combining microbiome data across different studies in the present meta-analysis as it provides greater statistical power in order to identify a core set of microorganisms (core microbiome). A meta-analysis can be conducted adopting different approaches, including aggregate data meta-analyses (AD-MAs) and individual participant data meta-analyses (IPD-MAs). Of the two approaches, combining effect sizes, p-values, and ranks are examples of the former which aggregates summary statistics from individual studies. In the latter case, individual datasets of all included studies are merged into a single dataset. In some respects, a meta-analysis of microbiome data is considered more challenging due to a great deal of variation among microbiome studies. Inherently heterogeneous data place limitations on merging individual datasets into a single dataset. For this reason, although we reprocessed all raw sequence data through a similar pipeline, in this meta-analysis, in an effort to minimize the bias and heterogeneity of data from various sources, we chose to take the first approach rather than directly merging individual datasets as it is more robust to between-study heterogeneity [40]. 
Specifically, we preferred to combine effect sizes by applying a random-effects model, as it has been suggested to be a more statistically conservative approach than p-value combination [41, 42]. There are some methodological limitations in this study: (1) it was not possible to determine the relationship between demographic factors (such as age, gender, and smoking history) and the relative abundance of bacterial taxa due to insufficient metadata. Consequently, we were not able to include them as covariates in the statistical model. To determine how exactly these variables might affect changes in the relative abundance of bacterial taxa, further research with more focus on covariate adjustment is suggested; (2) due to the lack of sufficient data on tumor tissue types, it was not feasible to differentiate the microbiome composition specific to lung adenocarcinoma (AC) from that of lung squamous cell carcinoma (SCC) in our analysis; (3) the reported taxa were not significant after adjustment for multiple comparisons, mostly due to the considerable number of non-significant associations in this study, which can diminish the potential significance of the observed associations. On the one hand, as discussed in the literature [43,44,45], a strict adjustment for multiple comparisons is not always necessary, and it is less critical in the case of our study because its statistical significance cannot be compared with findings from other studies, as there is no large-scale study on the lung microbiome in lung cancer. On the other hand, we acknowledge the need for further investigation to confirm the observed associations in this study; (4) in general, studies identifying the microbial communities at different body sites and determining their relationship with various human diseases belong to an emerging area of research. Initially, the focus of such studies was mostly on taxonomic profiling of the microbial community. 
However, research has shown that due to the resilience property of the microbiome (members of the microbiome of a region have some function overlaps and thus different microbiome compositions can have similar overall functions), functional profiling should also be analyzed along with taxonomic profiling. Functional analysis is performed through meta-transcriptomics and metabolomics studies, which is still in its infancy in the case of the respiratory tract microbiome and its association with lung cancer. Therefore, considering the taxonomic composition through 16S rRNA approach is not enough and should be completed by functional analyses in future investigations. Despite the limitations, the present results could have important implications for identifying microbial biomarkers which are linked to the pathogenesis of the disease and facilitate future research. There are still many unanswered questions toward understanding the myriad roles of the lung microbiome and airway host-microbiome interactions with lung cancer. To develop a deeper understanding, further studies, which take the mentioned limitations into account, will be needed. In future investigations, drawing a distinction between different subtypes of Non-Small Cell Lung Cancer (NSCLC) as well as considering the stage of lung cancer could be more informative in characterizing the lung microbial community in LC patients.

Conclusions

As the first meta-analysis of the lung microbiome in relation to lung cancer, the present research was undertaken to assess how the composition of the lung microbiome differs between lung tumor tissues and normal tissues. The results of this investigation show that some bacteria differ significantly between the two groups. Despite the fact that some of the findings of this study have not previously been described, the results of this research support the idea that the microbiome may be a key factor in cancer development. An important question raised by this study is whether or not these bacterial taxa are specific to lung cancer. To be considered specific indications of lung cancer, the results of this study need to be validated by further research. The insights gained from this study may be of assistance to microbial biomarker discovery and consequently the early diagnosis of lung cancer.

Methods

Search Strategy, Inclusion Criteria, and Study Selection

A systematic literature review was conducted in PubMed, SRA (Sequence Read Archive), and EBI (European Bioinformatics Institute) and was last updated on January 8, 2021. The literature search targeted studies that evaluated the relationship between the microbiome and lung cancer. There was no restriction on the publication date. Published studies were identified using the following keywords: ("lung cancer"[Title/Abstract] OR "lung neoplasm"[Title/Abstract] OR "pulmonary neoplasm"[Title/Abstract] OR "pulmonary cancer"[Title/Abstract]) AND ("microbiome"[Title/Abstract] OR "Metagenom*"[Title/Abstract]). Eligibility criteria required studies to be case-control studies using 16S rRNA gene sequencing for taxonomic quantification of the lung microbial community in LC patients, with publicly available raw sequence data. Fifty-eight initially identified articles were screened on the basis of titles and abstracts, and review articles and meta-analyses were excluded. The full texts of the remaining articles were then assessed, and articles with no metadata or accession number, along with articles including patients who had received treatment, were also excluded. A total of 12 eligible articles meeting our inclusion criteria were selected for further assessment, from which 4 studies were chosen for meta-analysis. Eligible studies encompassed a variety of sample types including feces, saliva, BAL fluid, sputum, PSB, and lung biopsy; however, because the number of studies for the other sample types was insufficient for a meta-analysis, we restricted the sample type to lung biopsy, for which a total of 5 datasets were available. One of these (study 4) was unpublished and was obtained from the same population as study 3, as the two shared some identical samples. 
Given the small number of available studies on lung biopsy samples of LC patients, we retained study 4: we identified the identical samples by the subject ID provided in the metadata, excluded those samples from study 4, and then treated study 4 as a separate study with its own unique samples. Risk of bias in the included studies was assessed using the Newcastle-Ottawa Scale (NOS) [46], the results of which are summarized in Table 4. All identified articles were independently assessed by two reviewers. In the case of any discrepancy in study inclusion, a third reviewer was involved and the issue was resolved by discussion. Figure 7 presents an overview of the systematic literature review and Table 5 provides the characteristics of all studies included in this meta-analysis.

Table 4 Quality assessment of the included studies in meta-analysis
Table 5 Characteristics of the included datasets in meta-analysis
Fig. 7

The systematic literature review flow diagram

Data Acquisition

Raw sequence data were gathered from NCBI SRA database and corresponding metadata indicating case or control status for each sample were acquired either by search in SRA using the accession number or by personal communication with the authors.

Data Pre-processing and Taxonomic Profiling

All 16S rRNA marker gene sequencing data were processed through a standardized pipeline in QIIME 2 (version 2020.6) [49]. The first step was to assess the quality of the sequence reads using FastQC [50]. Paired-end demultiplexed sequences in FASTQ files were quality filtered to identify and remove low-quality reads. Non-biological sequences such as adapters and primers were also identified and trimmed. Paired-end reads were denoised and joined with the DADA2 [51] method. Once quality filtering and denoising were completed, the denoised data were summarized and an ASV (Amplicon Sequence Variant) table was generated. In the final stage, to assign taxonomy, a Naive Bayes classifier was trained on the Greengenes (GG v13.8) ribosomal reference database [52], and taxonomic classification was conducted. This process was carried out independently for all runs of each study. All the parameters used for quality filtering are available in Additional file 1.
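The count table produced by this pipeline is the input to the relative-abundance and filtering step described under Statistical Analyses. The following Python sketch shows one way that step could look. It is an illustration under stated assumptions rather than the code used in this study: the function names, the nested-dictionary layout of the count table, the treatment of a taxon as "present" when its count is above zero, and the handling of values exactly at the thresholds are all hypothetical choices; only the two thresholds (mean relative abundance of \(5\times {10}^{-5}\) and presence in 5% of samples) come from the text.

```python
def relative_abundance(counts):
    """Convert per-sample read counts into proportions that sum to one.

    counts: dict mapping sample ID -> dict mapping taxon -> read count.
    """
    rel = {}
    for sample, taxa in counts.items():
        total = sum(taxa.values())
        if total == 0:
            continue  # assumption: samples with no reads are dropped
        rel[sample] = {taxon: n / total for taxon, n in taxa.items()}
    return rel


def filter_taxa(rel, min_mean=5e-5, min_prevalence=0.05):
    """Keep taxa that pass both the mean-abundance and prevalence thresholds."""
    samples = list(rel)
    all_taxa = {taxon for s in samples for taxon in rel[s]}
    kept = set()
    for taxon in all_taxa:
        values = [rel[s].get(taxon, 0.0) for s in samples]
        mean_ra = sum(values) / len(values)
        prevalence = sum(v > 0 for v in values) / len(values)
        if mean_ra >= min_mean and prevalence >= min_prevalence:
            kept.add(taxon)
    return {s: {t: rel[s].get(t, 0.0) for t in sorted(kept)} for s in samples}


# Hypothetical toy table for one study (taxon names are placeholders).
toy_counts = {
    "sample_1": {"taxon_A": 900, "taxon_B": 100},
    "sample_2": {"taxon_A": 500, "taxon_B": 500},
}
print(filter_taxa(relative_abundance(toy_counts)))
```

Because filtering is applied within each study, the function would be called once per dataset, so a taxon can be retained in one study and absent from another.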

Statistical Analyses

The overall workflow is shown in Fig. 8. The analysis was based on the conceptual framework proposed by Ho et al. [53]. Prior to statistical analyses, relative abundance data were obtained by dividing the count of each taxon by the total count per sample. Taxa with an average relative abundance below \(5\times {10}^{-5}\), as well as taxa present in fewer than 5% of samples within each study, were then removed, and the remaining taxa were retained for statistical modeling. All statistical analyses were performed at each taxonomic level from phylum to genus, within each dataset individually. To compare the relative abundance of lung bacterial taxa between case and control groups, a GAMLSS-BEZI regression model was fitted in each study; this approach was adopted because it both captures the actual distribution of relative abundance data and specifically addresses zero inflation in microbiome data. The regression coefficient estimate for each taxon in each study (\({\beta }_{tk}\)) obtained from GAMLSS-BEZI was taken as the effect size and, together with its standard error, carried forward to the meta-analysis. If each study contains T taxa and there are K studies, \({\beta }_{tk}\) denotes the effect size for taxon t in study k (\(1\le t\le T\) and \(1\le k\le K\)). To account for the inherent heterogeneity among microbiome studies, a random-effects meta-analysis model with inverse-variance weighting was then applied to combine the effect sizes and their standard errors across all included studies. The random-effects model (REM) makes the following assumptions when combining effect sizes across K studies:

$$\beta_{tk}\mid\theta_{tk},\sigma_{tk}\sim N\left(\theta_{tk},\sigma_{tk}^{2}\right)$$
$$\theta_{tk}\mid\mu,\tau\sim N\left(\mu,\tau^{2}\right)$$
$$\beta_{tk}\mid\mu,\tau,\sigma_{tk}\sim N\left(\mu,\sigma_{tk}^{2}+\tau^{2}\right)$$

The pooled effect size for taxon t, combining \({\beta }_{tk}\) across the K studies, is then calculated as follows:

$$\widehat\mu=\frac{\sum_{k=1}^K\;w_{tk}\;\beta_{tk}}{\sum_{k=1}^K\;w_{tk}}$$

where

$${w}_{tk}=\frac{1}{{\sigma }_{tk}^{2}+{\widehat{\tau }}^{2}}$$

denotes the weight assigned to study k, and \({\widehat{\tau }}^{2}\) is the estimate of the between-study variance \({\tau }^{2}\), obtained with the DerSimonian and Laird (DL) method.
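The pooling step above can be illustrated with a short sketch. The authors report performing their analyses in R, so the Python code below is not their implementation; it is only a minimal illustration of a standard DerSimonian and Laird random-effects pooling for a single taxon, assuming the per-study estimates \({\beta }_{tk}\) and their standard errors are already available. The function name `dl_random_effects` and the keys of the returned dictionary are hypothetical.

```python
import math


def dl_random_effects(betas, ses):
    """Pool per-study effect sizes for one taxon (illustrative sketch).

    betas: regression coefficients beta_tk, one per study
    ses:   their standard errors sigma_tk (must be positive)
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need estimates and standard errors from >= 2 studies")
    k = len(betas)
    v = [s ** 2 for s in ses]                 # within-study variances sigma_tk^2
    w_fixed = [1.0 / vi for vi in v]          # fixed-effect inverse-variance weights
    sw = sum(w_fixed)
    mu_fixed = sum(w * b for w, b in zip(w_fixed, betas)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (b - mu_fixed) ** 2 for w, b in zip(w_fixed, betas))
    c = sw - sum(w ** 2 for w in w_fixed) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)        # DL estimate, truncated at zero
    w = [1.0 / (vi + tau2) for vi in v]       # random-effects weights w_tk
    mu = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    se = math.sqrt(1.0 / sum(w))              # standard error of the pooled estimate
    z = mu / se
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided p-value, normal approximation
    return {"mu": mu, "se": se, "tau2": tau2, "z": z, "p": p}


if __name__ == "__main__":
    # Hypothetical effect sizes for one taxon in three studies
    result = dl_random_effects([-0.5, -0.3, -0.8], [0.2, 0.25, 0.3])
    print(result)
```

When the studies agree closely, the estimated between-study variance is truncated to zero and the weights reduce to the usual fixed-effect inverse-variance weights; larger heterogeneity inflates \({\widehat{\tau }}^{2}\) and makes the study weights more equal.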

All taxa available in at least 2 datasets were retained for meta-analysis. The significance level was set at 5%, and all statistical analyses were carried out using R version 4.0.2.
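The filtering rules described in this section (conversion to relative abundance, the per-study abundance and prevalence filters, and retention of taxa found in at least 2 datasets) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code: the nested-dictionary input layout and the function names `relative_abundance`, `filter_taxa`, and `shared_taxa` are hypothetical, and skipping samples with zero total reads is an assumption not stated in the text.

```python
from collections import Counter


def relative_abundance(counts):
    """counts: {sample_id: {taxon: read_count}} -> {sample_id: {taxon: proportion}}."""
    rel = {}
    for sample, taxa in counts.items():
        total = sum(taxa.values())
        if total == 0:
            continue  # assumption: samples without reads are skipped
        rel[sample] = {taxon: c / total for taxon, c in taxa.items()}
    return rel


def filter_taxa(rel, min_mean=5e-5, min_prevalence=0.05):
    """Keep taxa with mean relative abundance >= min_mean that are
    present in at least min_prevalence of the samples of one study."""
    n = len(rel)
    if n == 0:
        return set()
    taxa = {taxon for sample in rel.values() for taxon in sample}
    kept = set()
    for taxon in taxa:
        values = [sample.get(taxon, 0.0) for sample in rel.values()]
        mean = sum(values) / n
        prevalence = sum(v > 0 for v in values) / n
        if mean >= min_mean and prevalence >= min_prevalence:
            kept.add(taxon)
    return kept


def shared_taxa(kept_by_study, min_studies=2):
    """Taxa retained in at least min_studies datasets."""
    seen = Counter(t for kept in kept_by_study.values() for t in kept)
    return {t for t, n in seen.items() if n >= min_studies}


if __name__ == "__main__":
    # Hypothetical counts for two samples of one study
    counts = {"s1": {"A": 90, "B": 10}, "s2": {"A": 100, "B": 0}}
    rel = relative_abundance(counts)
    print(filter_taxa(rel))
```

The filters are applied within each study first, and only then are the retained taxa compared across studies, which mirrors the order described in the text.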

Fig. 8 The method workflow for downstream analysis, including the data pre-processing step, feature table construction, and statistical analyses

Availability of data and materials

The raw sequence data analyzed during the current study are available in the NCBI SRA repository, https://www.ncbi.nlm.nih.gov/Traces/study/ with the following accession numbers: PRJNA472758, PRJNA624822, PRJNA303190, PRJNA327258, and PRJNA647170.

R code and other materials are available upon request.

References

  1. Marchesi JR, Ravel J. The vocabulary of microbiome research: a proposal. Microbiome. 2015;3:31. https://doi.org/10.1186/s40168-015-0094-5.

  2. Carding S, Verbeke K, Vipond DT, Corfe BM, Owen LJ. Dysbiosis of the gut microbiota in disease. Microbial Ecol Health Dis. 2015;26(1):26191.

  3. Round JL, Mazmanian SK. The gut microbiota shapes intestinal immune responses during health and disease. Nature Reviews Immunol. 2009;9(5):313–23.

  4. Dickson RP, Huffnagle GB. The lung microbiome: new principles for respiratory bacteriology in health and disease. PLoS Pathog. 2015;11(7):e1004923.

  5. Moffatt MF, Cookson WO. The lung microbiome in health and disease. Clin Med. 2017;17(6):525.

  6. Sze MA, Dimitriu PA, Hayashi S, Elliott WM, McDonough JE, Gosselink JV, et al. The lung tissue microbiome in chronic obstructive pulmonary disease. Am J Respiratory Critical Care Med. 2012;185(10):1073–80.

  7. Huang YJ, Kim E, Cox MJ, Brodie EL, Brown R, Wiener-Kronish JP, et al. A persistent and diverse airway microbiota present during chronic obstructive pulmonary disease exacerbations. OMICS J Integrative Biol. 2010;14(1):9–59.

  8. Pragman AA, Kim HB, Reilly CS, Wendt C, Isaacson RE. The lung microbiome in moderate and severe chronic obstructive pulmonary disease. 2012.

  9. Fodor AA, Klem ER, Gilpin DF, Elborn JS, Boucher RC, Tunney MM, et al. The adult cystic fibrosis airway microbiota is stable over time and infection type, and highly resilient to antibiotic treatment of exacerbations. 2012.

  10. Armougom F, Bittar F, Stremler N, Rolain J-M, Robert C, Dubus J-C, et al. Microbial diversity in the sputum of a cystic fibrosis patient studied with 16S rDNA pyrosequencing. Eur J Clin Microbiol Infect Dis. 2009;28(9):1151–4.

  11. Carmody LA, Zhao J, Schloss PD, Petrosino JF, Murray S, Young VB, et al. Changes in cystic fibrosis airway microbiota at pulmonary exacerbation. Ann Am Thoracic Soc. 2013;10(3):179–87.

  12. Hilty M, Burke C, Pedro H, Cardenas P, Bush A, Bossley C, et al. Disordered microbial communities in asthmatic airways. PloS one. 2010;5(1):e8578.

  13. Huang YJ, Nelson CE, Brodie EL, DeSantis TZ, Baek MS, Liu J, et al. Airway microbiota and bronchial hyperresponsiveness in patients with suboptimally controlled asthma. J Allergy Clin Immunol. 2011;127(2):372–81.

  14. Marri PR, Stern DA, Wright AL, Billheimer D, Martinez FD. Asthma-associated differences in microbial composition of induced sputum. J Allergy Clin Immunol. 2013;131(2):346–52.

  15. Spagnolo P, Molyneaux PL, Bernardinello N, Cocconcelli E, Biondini D, Fracasso F, et al. The role of the lung’s microbiome in the pathogenesis and progression of idiopathic pulmonary fibrosis. Int J Mol Sci. 2019;20(22):5618.

  16. Molyneaux PL, Cox MJ, Wells AU, Kim HC, Ji W, Cookson WO, et al. Changes in the respiratory microbiome during acute exacerbations of idiopathic pulmonary fibrosis. Respiratory Res. 2017;18(1):1–6.

  17. Molyneaux PL, Cox MJ, Willis-Owen SA, Mallia P, Russell KE, Russell A-M, et al. The role of bacteria in the pathogenesis and progression of idiopathic pulmonary fibrosis. Am J Respiratory Crit Care Med. 2014;190(8):906–13.

  18. Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer. 2015;136(5):E359–86.

  19. Lee SH, Sung JY, Yong D, Chun J, Kim SY, Song JH, et al. Characterization of microbiome in bronchoalveolar lavage fluid of patients with lung cancer comparing with benign mass like lesions. Lung cancer. 2016;102:89–95.

  20. Hosgood HD III, Sapkota AR, Rothman N, Rohan T, Hu W, Xu J, et al. The potential role of lung microbiota in lung cancer attributed to household coal burning exposures. Environ Mol Mutagenesis. 2014;55(8):643–51.

  21. Zhang W, Luo J, Dong X, Zhao S, Hao Y, Peng C, et al. Salivary microbial dysbiosis is associated with systemic inflammatory markers and predicted oral metabolites in non-small cell lung cancer patients. J Cancer. 2019;10(7):1651.

  22. Greathouse KL, White JR, Vargas AJ, Bliskovsky VV, Beck JA, von Muhlinen N, et al. Interaction between the microbiome and TP53 in human lung cancer. Genome Biol. 2018;19(1):1–16.

  23. Apopa PL, Alley L, Penney RB, Arnaoutakis K, Steliga MA, Jeffus S, et al. PARP1 is up-regulated in non-small cell lung cancer tissues in the presence of the cyanobacterial toxin microcystin. Front Microbiol. 2018;9:1757.

  24. Yu G, Gail MH, Consonni D, Carugno M, Humphrys M, Pesatori AC, et al. Characterizing human lung tissue microbiota and its relationship to epidemiological and clinical features. Genome biology. 2016;17(1):1–12.

  25. Xu N, Wang L, Li C, Ding C, Li C, Fan W, et al. Microbiota dysbiosis in lung cancer: evidence of association and potential mechanisms. Transl Lung Cancer Res. 2020;9(4):1554.

  26. Wirbel J, Pyl PT, Kartal E, Zych K, Kashani A, Milanese A, et al. Meta-analysis of fecal metagenomes reveals global microbial signatures that are specific for colorectal cancer. Nature Med. 2019;25(4):679–89.

  27. Thomas AM, Manghi P, Asnicar F, Pasolli E, Armanini F, Zolfo M, et al. Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation. Nature Med. 2019;25(4):667–78.

  28. Wang Z, Yang Y, Yan Z, Liu H, Chen B, Liang Z, et al. Multi-omic meta-analysis identifies functional signatures of airway microbiome in chronic obstructive pulmonary disease. ISME J. 2020;14(11):2748–65.

  29. Sze MA, Schloss PD. Leveraging existing 16S rRNA gene surveys to identify reproducible biomarkers in individuals with colorectal tumors. MBio. 2018;9(3):e00630-18.

  30. Rigby RA, Stasinopoulos DM. Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2005;54(3):507–54.

  31. Stasinopoulos DM, Rigby RA. Generalized additive models for location scale and shape (GAMLSS) in R. J Stat Software. 2007;23(7):1–46.

  32. Zhuang H, Cheng L, Wang Y, Zhang Y-K, Zhao M-F, Liang G-D, et al. Dysbiosis of the gut microbiome in lung cancer. Front Cell Infect Microbiol. 2019;9:112.

  33. Liu HX, Tao LL, Zhang J, Zhu YG, Zheng Y, Liu D, et al. Difference of lower airway microbiome in bilateral protected specimen brush between lung cancer patients with unilateral lobar masses and control subjects. Int J Cancer. 2018;142(4):769–78.

  34. Tsay J-CJ, Wu BG, Badri MH, Clemente JC, Shen N, Meyn P, et al. Airway microbiota is associated with upregulation of the PI3K pathway in lung cancer. Am J Respiratory Critical Care Med. 2018;198(9):1188–98.

  35. Yan X, Yang M, Liu J, Gao R, Hu J, Li J, et al. Discovery and validation of potential bacterial biomarkers for lung cancer. Am J Cancer Res. 2015;5(10):3111.

  36. Peters BA, Hayes RB, Goparaju C, Reid C, Pass HI, Ahn J. The microbiome in lung cancer tissue and recurrence-free survival. Cancer Epidemiol Prevent Biomarkers. 2019;28(4):731–40.

  37. D’Alessandro-Gabazza CN, Méndez-García C, Hataji O, Westergaard S, Watanabe F, Yasuma T, et al. Identification of halophilic microbes in lung fibrotic tissue by oligotyping. Front Microbiol. 2018;9:1892.

  38. Li K-j, Chen Z-l, Huang Y, Zhang R, Luan X-q, Lei T-t, et al. Dysbiosis of lower respiratory tract microbiome are associated with inflammation and microbial function variety. Respiratory Res. 2019;20(1):1–16.

  39. Walters WA, Xu Z, Knight R. Meta-analyses of human gut microbes associated with obesity and IBD. FEBS letters. 2014;588(22):4223–33.

  40. Ramasamy A, Mondry A, Holmes CC, Altman DG. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS Med. 2008;5(9):e184.

  41. Marot G, Foulley J-L, Mayer C-D, Jaffrézic F. Moderated effect size and P-value combinations for microarray meta-analyses. Bioinformatics. 2009;25(20):2692–9.

  42. Zhou G, Stevenson MM, Geary TG, Xia J. Comprehensive transcriptome meta-analysis to characterize host immune responses in helminth infections. PLoS Neglected Tropical Dis. 2016;10(4):e0004624.

  43. Rothman KJ. No adjustments are needed for multiple comparisons. Epidemiology. 1990:43–6.

  44. Feise RJ. Do multiple outcome measures require p-value adjustment? BMC Med Res Methodol. 2002;2(1):1–4.

  45. Althouse AD. Adjust for multiple comparisons? It’s not that simple. The Annals of thoracic surgery. 2016;101(5):1644–5.

  46. Wells GA, Shea B, O’Connell D, Peterson J, Welch V, Losos M, et al. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses. Oxford; 2000.

  47. Nejman D, Livyatan I, Fuks G, Gavert N, Zwang Y, Geller LT, et al. The human tumor microbiome is composed of tumor type–specific intracellular bacteria. Science. 2020;368(6494):973–80.

  48. Kovaleva O, Podlesnaya P, Rashidova M, Samoilova D, Petrenko A, Zborovskaya I, et al. Lung Microbiome Differentially Impacts Survival of Patients with Non-Small Cell Lung Cancer Depending on Tumor Stroma Phenotype. Biomedicines. 2020;8(9):349.

  49. Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology. 2019;37(8):852–7.

  50. Andrews S. FastQC: a quality control tool for high throughput sequence data. Babraham Bioinformatics, Babraham Institute, Cambridge, United Kingdom; 2010.

  51. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods. 2016;13(7):581–3.

  52. McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 2012;6(3):610–8.

  53. Ho NT, Li F, Wang S, Kuhn L. metamicrobiomeR: an R package for analysis of microbiome relative abundance data using zero-inflated beta GAMLSS and meta-analysis across studies using random effects models. BMC Bioinformatics. 2019;20(1):188.

Acknowledgements

Not applicable.

Funding

Not applicable.

Author information

Contributions

All authors were involved in designing and conducting the study as well as writing and proofing the manuscript. The author(s) read and approved the final manuscript.

Corresponding authors

Correspondence to Ali Ahmadi or Mohammad Gholami Fesharaki.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

QIIME Quality Filter Parameters. The parameters used in the DADA2 denoising step.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Najafi, S., Abedini, F., Azimzadeh Jamalkandi, S. et al. The composition of lung microbiome in lung cancer: a systematic review and meta-analysis. BMC Microbiol 21, 315 (2021). https://doi.org/10.1186/s12866-021-02375-z

Keywords

  • Lung microbiome
  • Lung cancer
  • Meta-analysis
  • 16S rRNA gene
  • Metagenomics