Table 2 Supervised (machine) Learning Estimated Error Rates (randomForest simulation) for abundant OTU denoised pipelines

From: Influence of chronic azithromycin treatment on the composition of the oropharyngeal microbial community in patients with severe asthma

Estimated error rate (%) by minimal row contribution cut-off sum for each OTU

| Clinical characteristic | 5 | 25 | 100 | 500 |
| --- | --- | --- | --- | --- |
| Treatment (i.e. AZ-treated and non-treated) | 15.91 | 13.64* | 15.91 | 18.18 |
| Visit (i.e. V2, V3, V6 and V7) | 88.64 | 79.55 | 86.36 | 79.55 |
| Patient group (i.e. Placebo, AZ responders, AZ non-responders) | 22.73 | 25.0 | 20.45 | 20.45 |

  1. Legend: Estimated error rates of the randomForest simulation for each candidate clinical metadata category (Treatment, Visit and Patient group), following the abundant OTU pipeline for denoising the dataset, i.e. after removal of chimeric and low-quality sequences. Top row header: minimal row contribution cut-off sum for each OTU, varied to determine the best-performing dataset (i.e. the one containing the most discriminative features with the least noise). Comparing estimated error rates across the cut-offs, Treatment was retained as the only clinical metadata category in the model simulation, since it had the smallest estimated error (closest to the < 10% mark for significance).
  2. *: The cut-off value of 25 reads was chosen for further analysis because its error rate is closest to 10%.
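
The selection logic described in the legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which used the R randomForest package. The error rates are copied from the table above. The helper names `filter_otus` and `best_cutoff` are hypothetical, and treating the cut-off as "total reads per OTU greater than or equal to the threshold" is an assumption, since the source does not state whether the comparison is inclusive.

```python
# Estimated error rates (%) copied from Table 2, keyed by cut-off value.
ERROR_RATES = {
    "Treatment": {5: 15.91, 25: 13.64, 100: 15.91, 500: 18.18},
    "Visit": {5: 88.64, 25: 79.55, 100: 86.36, 500: 79.55},
    "Patient group": {5: 22.73, 25: 25.0, 100: 20.45, 500: 20.45},
}


def filter_otus(otu_table, min_row_sum):
    """Keep OTUs whose summed read count across samples meets the cut-off.

    Assumption: the comparison is inclusive (>=); the source does not say.
    `otu_table` maps an OTU identifier to its per-sample read counts.
    """
    return {
        otu: counts
        for otu, counts in otu_table.items()
        if sum(counts) >= min_row_sum
    }


def best_cutoff(rates):
    """Return the cut-off with the lowest error rate (first one wins on ties)."""
    return min(rates, key=rates.get)


# Lowest-error cut-off within each clinical metadata category.
best_per_category = {cat: best_cutoff(r) for cat, r in ERROR_RATES.items()}

# Lowest-error (category, cut-off) pair over the whole table.
overall_best = min(
    ((cat, cutoff) for cat, rates in ERROR_RATES.items() for cutoff in rates),
    key=lambda pair: ERROR_RATES[pair[0]][pair[1]],
)
```

With the table values, `overall_best` evaluates to `("Treatment", 25)`, which matches the asterisked cell. For Visit and Patient group the minimum is tied between two cut-offs, and this sketch simply returns the smaller of the two.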