DNA quantity and quality
Ninety DNA samples were extracted from the fecal samples of 10 individuals using three protocols, each applied to triplicate aliquots (three protocols × 10 individuals × triplicate) (Fig. 1). One sample was excluded from all downstream analyses because, based on the 16S rRNA sequence results, it appeared to be mislabeled or to have a technical error. To assess whether the quantity and quality of extracted DNA were appropriate for library preparation for 16S rRNA gene sequencing, we first measured the DNA concentration and DNA purity using a NanoDrop ND-1000 spectrophotometer. DNA concentration was highest for protocol P (P-value < 0.0001; mean ± SD: protocol P 93.97 ± 27.73 ng/μL, protocol SB 35.84 ± 27.46 ng/μL, protocol S 23.74 ± 18.33 ng/μL; Fig. 2a). Considering the elution volumes, DNA quantities of 9.397 ± 2.773 μg, 7.168 ± 5.491 μg, and 4.748 ± 3.666 μg were obtained per extraction with protocols P, SB, and S, respectively (Fig. 2b). Furthermore, samples obtained using protocol P had 260/280 absorbance ratios closest to 1.8, indicating that these samples were the least contaminated with proteins or RNA (P-value < 0.0001; mean ± SD: protocol P 1.884 ± 0.0138, protocol SB 1.962 ± 0.1693, protocol S 2.234 ± 0.1886; Fig. 2c). Moreover, protocol P displayed the lowest standard deviation for DNA quantity and quality (Fig. 2), indicating that protocol P yielded DNA of consistent quantity and quality from fecal samples. The 260/230 absorbance ratios for protocol P were lower than the desirable range (2.0–2.2) (Additional file 1: Fig. S1), possibly because of residual buffer components from the extraction steps. Nevertheless, the 16S libraries for all samples were well constructed, indicating that all tested protocols extracted DNA of sufficient quantity and quality to construct the 16S libraries.
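The conversion from concentration to per-extraction yield can be illustrated with a minimal sketch. The elution volumes are not stated in this section; the values used here (100 μL for protocol P, 200 μL for protocols SB and S) are an assumption inferred from the ratio of the reported yields to the reported concentrations, and the function name is hypothetical.

```python
# Mean and SD of DNA concentration (ng/uL) as reported in the text (Fig. 2a).
concentration_ng_per_ul = {
    "P": (93.97, 27.73),
    "SB": (35.84, 27.46),
    "S": (23.74, 18.33),
}

# ASSUMPTION: elution volumes (uL) are not given in this section; they are
# inferred from the reported yield divided by the reported concentration.
elution_volume_ul = {"P": 100.0, "SB": 200.0, "S": 200.0}


def total_yield_ug(conc_ng_per_ul, volume_ul):
    """Convert a concentration (ng/uL) and an elution volume (uL) to micrograms."""
    return conc_ng_per_ul * volume_ul / 1000.0


# Scale both the mean and the SD by the same elution volume.
yields_ug = {
    protocol: tuple(total_yield_ug(value, elution_volume_ul[protocol]) for value in stats)
    for protocol, stats in concentration_ng_per_ul.items()
}

for protocol, (mean_ug, sd_ug) in yields_ug.items():
    print(f"protocol {protocol}: {mean_ug:.3f} +/- {sd_ug:.3f} ug")
```

Under these assumed volumes the sketch reproduces the reported means; the SD for protocol SB differs from the reported value in the last digit, presumably because the published figures were rounded from unrounded measurements.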
Microbial diversity
Next, we analyzed the effect of the DNA extraction protocols on the gut microbiome data through 16S rRNA gene sequencing on the Illumina MiSeq platform. Costea et al. reported that an alpha diversity index (Shannon’s diversity) serves as an optimal criterion for DNA extraction performance because it correlates positively with the recovered relative abundance of gram-positive bacteria [9]. We therefore compared Shannon’s diversity index among the three protocols to predict the accuracy of the recovered abundance profile. A significant difference in Shannon’s diversity was observed only between protocols S and P (P-value < 0.05); no difference was observed between protocols SB and P (Fig. 3a). Therefore, based on Shannon’s diversity index, protocol P appeared to offer the best performance, albeit comparable to that of protocol SB.
We also compared microbial richness and evenness. The mean number of observed amplicon sequence variants (ASVs) was highest in the protocol P samples; however, the differences across protocols were not statistically significant. Meanwhile, Pielou’s evenness for protocol P was significantly higher than for protocols S and SB (Additional file 2: Fig. S2), indicating that although the extraction protocols did not differ in the number of distinct microbes detected, they did affect the relative abundances of the observed microbes. These differences in relative abundances may account for the differences in microbial evenness as well as in Shannon’s diversity.
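The relationship among the three alpha diversity measures discussed above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names and the two count vectors are hypothetical, and the natural logarithm is an assumption (some tools report Shannon’s index in base 2; Pielou’s evenness does not depend on the base).

```python
import math


def observed_features(counts):
    """Richness: number of features (e.g., ASVs) with a nonzero count."""
    return sum(1 for c in counts if c > 0)


def shannon_diversity(counts):
    """Shannon's index H = -sum(p_i * ln p_i) over features with p_i > 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S), where S is the observed richness."""
    richness = observed_features(counts)
    if richness < 2:
        return 0.0  # evenness is undefined for a single feature; 0.0 by convention here
    return shannon_diversity(counts) / math.log(richness)


# Hypothetical samples with identical richness but different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [85, 5, 5, 5]

for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
    print(
        f"{name}: observed={observed_features(sample)}, "
        f"shannon={shannon_diversity(sample):.3f}, "
        f"pielou={pielou_evenness(sample):.3f}"
    )
```

The two hypothetical samples have the same number of observed features, yet the skewed sample has lower evenness and lower Shannon’s diversity, which mirrors the pattern described in the text: richness can remain unchanged while shifts in relative abundance alter both indices.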
Variation in microbiome profiles
To evaluate the size of the effect of the DNA extraction protocols on gut microbiome profiles, beta diversity was assessed using the Bray-Curtis distance. The principal coordinates analysis (PCoA) plot revealed that samples from each individual clustered together irrespective of the extraction protocol (Fig. 3b).
Furthermore, we determined the Bray-Curtis distances 1) between samples from the same protocol and different individuals, 2) between samples from different protocols and the same individual (P vs SB, P vs S, and S vs SB), and 3) between samples from the same protocol and the same individual (replicate samples) (Fig. 3c). Consistent with the PCoA plot (Fig. 3b), samples from the same protocol and different individuals displayed significantly greater Bray-Curtis distances (mean ± SD: 0.8063 ± 0.07258) than those between P vs SB, P vs S, and S vs SB or those between replicate samples, indicating that inter-individual variation is greater than inter-protocol or intra-protocol variation. Furthermore, the Bray-Curtis distances of P vs SB and P vs S were slightly, but not significantly, higher than those of S vs SB (mean ± SD: P vs SB 0.2945 ± 0.04676, P vs S 0.3090 ± 0.04252, S vs SB 0.1138 ± 0.03161). Together, these results indicate that inter-individual differences influenced the gut microbiome profiles more strongly than inter-protocol differences, and that individual microbial profiles generated with the different protocols were generally comparable.
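For reference, the Bray-Curtis distance between two abundance profiles can be computed as in the minimal sketch below. The function name and the three count vectors are hypothetical and chosen only to illustrate the contrast between a within-individual and a between-individual comparison; they are not data from this study.

```python
def bray_curtis(u, v):
    """Bray-Curtis distance: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same number of taxa")
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("distance is undefined when both profiles are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


# Hypothetical genus counts (same taxa, same order in every profile).
individual_1_protocol_a = [40, 30, 20, 10]
individual_1_protocol_b = [35, 30, 25, 10]  # same individual, different protocol
individual_2_protocol_a = [0, 5, 15, 80]    # different individual, same protocol

inter_protocol = bray_curtis(individual_1_protocol_a, individual_1_protocol_b)
inter_individual = bray_curtis(individual_1_protocol_a, individual_2_protocol_a)

print(f"inter-protocol distance:   {inter_protocol:.2f}")
print(f"inter-individual distance: {inter_individual:.2f}")
```

The distance ranges from 0 (identical profiles) to 1 (no shared taxa), so the comparison categories in Fig. 3c can be read on a common scale.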
Differences in genus abundance
We next performed differential abundance analysis to identify and count the genera that were significantly differentially abundant between the samples obtained using each protocol. In DESeq2, the adjusted P-value (padj) for a genus is set to NA if all samples have a zero count for that genus or if the genus is removed by automatic independent filtering. Accordingly, among the 103 taxa classified at the genus level, adjusted P-values were not available for 22 (21.4%) genera in the comparison between protocols P and SB, and for 14 (13.6%) genera in each of the comparisons between protocols P and S and between protocols SB and S. Moreover, 72 (69.9%) and 71 (68.9%) genera were not differentially abundant in protocol P compared with protocols SB and S, respectively, while 88 (85.4%) genera were not differentially abundant in protocol SB compared with protocol S (padj > 0.05). In contrast, nine (8.7%) and 18 (17.5%) of the tested genera were differentially abundant in protocol P compared with protocols SB and S, respectively, whereas only one genus (1.0%) was differentially abundant in protocol SB compared with protocol S (padj < 0.05; Fig. 4a and Additional file 3: Table S1).
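The genus tallies reported above can be checked with a short sketch: for each pairwise comparison, the three categories (padj not available, not differentially abundant, differentially abundant) should add up to the 103 genera, and each percentage is the count divided by 103. The counts are taken from the text; the dictionary keys and the function name are illustrative.

```python
TOTAL_GENERA = 103

# Counts per comparison as reported in the text.
tallies = {
    "P vs SB": {"padj_na": 22, "not_significant": 72, "significant": 9},
    "P vs S": {"padj_na": 14, "not_significant": 71, "significant": 18},
    "SB vs S": {"padj_na": 14, "not_significant": 88, "significant": 1},
}


def as_percentages(counts, total=TOTAL_GENERA):
    """Express each category count as a percentage of all genera (one decimal)."""
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


percentages = {name: as_percentages(counts) for name, counts in tallies.items()}

for name, counts in tallies.items():
    # Every genus falls into exactly one category per comparison.
    assert sum(counts.values()) == TOTAL_GENERA
    print(name, percentages[name])
```

Because the categories partition the genera, the "not available" genera are part of the denominator; percentages computed over only the testable genera would be correspondingly higher.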
Samples from protocol P had significantly higher abundances of Blautia, Weissella, Dorea, Bifidobacterium, and Collinsella, and significantly lower abundances of Holdemania, Oscillospira, Lachnospira, and Sutterella, than those from both protocols SB and S (Figs. 4b and c). The magnitudes of the log2 fold changes for these genera were greater between protocols P and S than between protocols P and SB, indicating that the abundances of these genera were affected by differences between the two kits and that the effect increased when the additional bead-beating step was omitted. These patterns were also observed in the plots of genus relative abundance for the three protocols (Additional file 4: Fig. S3, Additional file 5: Fig. S4).
Compared with the samples obtained using protocol S, those obtained with protocol P additionally exhibited higher abundances of other genera, including Turicibacter, Adlercreutzia, Lactobacillus, Lactococcus, [Eubacterium], Clostridium, and [Ruminococcus], and lower abundances of Bacteroides and Parabacteroides (Figs. 4b and c). Interestingly, the genera that were more abundant with protocol P than with protocol S, but not more abundant than with protocol SB, were gram-positive bacteria, whereas those that were less abundant with protocol P than with protocol S, but not less abundant than with protocol SB, were gram-negative bacteria. These results are consistent with the reduction in alpha diversity in protocol S compared with protocol P (Fig. 3a), reconfirming that inclusion of a bead-beating step in the DNA extraction protocol is critical for disrupting the cell wall of gram-positive bacteria [5, 6].
Among the top 10 most abundant genera, differential abundance analysis of protocols P and SB revealed significant differences only in the 9th and 10th most abundant genera (Oscillospira and Blautia, respectively). In contrast, in the comparison of protocols P and S, four of the top 10 most abundant genera, including the most abundant (Bacteroides), were differentially abundant (Additional file 3: Table S1). Therefore, for the major genera, abundances obtained with protocol P differed less from those obtained with protocol SB than from those obtained with protocol S.