
Table 1 Types of contamination and strategies to exclude them

From: The electronic tree of life (eToL): a net of long probes to characterize the microbiome from RNA-seq data

A Types of contamination

| Class | Type | Comments |
| --- | --- | --- |
| 1A | Molecular biology reagents | If contaminated, the same signal will be present in all samples |
| 1B | Sample contamination during dissection | Expect environmental contaminants such as spores, pollens, and skin microbes (caution: skin microbes have been implicated in several human diseases) |
| 2A | Lifelong in vivo biocontamination from blood and the environment | Environmental contaminants such as microparticles of the same size as spores and pollens have been demonstrated to enter tissues; microbes of a similar size rapidly enter human tissues |
| 2B | In vivo biocontamination: perimortem | When analyzing a tissue in relation to human disease, contamination may arise from in vivo dissemination of microbes (e.g., respiratory disease) unrelated to the primary disorder; invasion of diseased tissue may be a consequence rather than a cause of tissue degeneration |

B Strategies to exclude contaminants

| Method | Strategy | Comments |
| --- | --- | --- |
| Negative controls | Exclude all signals present in blank workups | Negative controls alone are not sufficient to detect all contaminating species. In addition, RNA-seq data rarely have blank controls because these are rejected as errors by the sequencing instrument |
| Common contaminants | Consider excluding common contaminant species such as those listed in Salter et al. [80] and Sanabria et al. [83] | Caution is urged because common contaminant microbes may themselves be the cause of disease |
| Within-batch consistency | Exclude signals that are present at similar levels in all samples | Caution is urged because, if applied to gut or lung, this would exclude many of the major species that are known to be present |
| Between-batch consistency | Only include signals that are present (and at different levels) in independent datasets from the same tissue | A major caveat is that, in different individuals with the same pathology, diverse organisms can be the cause of that pathology (e.g., viruses, bacteria, and fungi can all cause inflammatory lung disease) |
| Differential signals | Only include differential signals between, for example, disease samples versus controls | Consistent differential signals point unambiguously to species that are not contaminants |
| Microheterogeneity | High-resolution strain/substrain mapping. Contaminants introduced during sample work-up are likely to be of the same specific genotypes in different samples, whereas true signals are most likely heterogeneous in their exact sequences | Download sequences from different samples of the target tissue and prepare phylogenetic trees |
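
Three of the strategies in part B (negative controls, within-batch consistency, and differential signals) can be read as simple filters over a per-sample table of species signals. The following is a minimal sketch of that reading, not the authors' pipeline. It assumes each sample is a dictionary mapping a species name to a normalized signal, and every function name, threshold (`max_cv`, `min_fold`, `pseudo`), species label, and data value is hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def exclude_blank_signals(samples, blanks):
    """Negative controls: drop every species detected in any blank workup."""
    in_blanks = {sp for blank in blanks for sp, level in blank.items() if level > 0}
    return [{sp: v for sp, v in s.items() if sp not in in_blanks} for s in samples]


def exclude_uniform_signals(samples, max_cv=0.1):
    """Within-batch consistency: drop species present in all samples at
    similar levels, judged by coefficient of variation (assumed threshold)."""
    species = set().union(*samples)
    uniform = set()
    for sp in species:
        levels = [s.get(sp, 0) for s in samples]
        # Must be present in every sample before "similar levels" applies.
        if min(levels) > 0 and pstdev(levels) / mean(levels) <= max_cv:
            uniform.add(sp)
    return [{sp: v for sp, v in s.items() if sp not in uniform} for s in samples]


def differential_species(disease, controls, min_fold=2.0, pseudo=1.0):
    """Differential signals: keep species whose mean level differs between
    the two groups by at least min_fold (pseudocount avoids division by zero)."""
    species = set().union(*disease, *controls)
    kept = set()
    for sp in species:
        d = mean(s.get(sp, 0) for s in disease) + pseudo
        c = mean(s.get(sp, 0) for s in controls) + pseudo
        if max(d / c, c / d) >= min_fold:
            kept.add(sp)
    return kept


# Hypothetical data: two disease samples followed by two control samples.
batch = [
    {"species_A": 100, "species_B": 50, "species_C": 40},
    {"species_A": 98, "species_B": 52, "species_C": 60},
    {"species_A": 101, "species_B": 49, "species_C": 2},
    {"species_A": 99, "species_B": 51, "species_C": 0},
]
blanks = [{"species_A": 5}]  # hypothetical blank workup

after_blanks = exclude_blank_signals(batch, blanks)   # removes species_A
filtered = exclude_uniform_signals(after_blanks)      # removes species_B
candidates = differential_species(filtered[:2], filtered[2:])
print(candidates)  # {'species_C'}
```

The caveats in the Comments column still apply to a sketch like this: the within-batch filter would also discard genuine resident species if applied to gut or lung, and the blank filter only helps when blank workups exist at all.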