Fig. 1 | BMC Microbiology

Fig. 1

From: Evaluation of PacBio sequencing for full-length bacterial 16S rRNA gene classification

Change in sequence error rate (a) and the proportion of high quality reads that were retained when using different sequencing curation methods (b). a The overall sequence error rates are plotted on the y-axis. The x-axis shows the number of allowed mismatches in the pre-clustering step (5, 10, and 15 base pairs for the full-length bacterial 16S rRNA gene) across the different screened datasets: no post alignment screening, post alignment screening using a minimum sequence similarity (minsim) and minimum alignment score (minscore) of 80%, and post alignment screening using a minsim and minscore of 90%. The round, square, upward-facing triangle and downward-facing triangle symbols denote the 1 pass, 2 pass, 4 pass, and 8 pass datasets, respectively. b Top panel, bold symbols: number of filtered raw reads across the 1 pass, 2 pass, 4 pass, and 8 pass datasets. Top panel, open symbols: number of high quality reads (HQRs) across the different post alignment screened datasets, i.e. no post alignment screening, post alignment screening using a minimum sequence similarity and minimum alignment score of 80% (80/80), and post alignment screening using a minimum sequence similarity and minimum alignment score of 90% (90/90). Bottom panel: + symbol = number of HQRs in the pre-clustered dataset with 5 allowed mismatches, x symbol = number of HQRs in the pre-clustered dataset with 10 allowed mismatches, circle with dot = number of HQRs in the pre-clustered dataset with 15 allowed mismatches.
