Fig. 5 | Nature Communications

Fig. 5

From: Species abundance information improves sequence taxonomy classification accuracy

Fig. 5

Classification accuracy when using the appropriate bespoke weights is largely explained by how often sequences from different from species are confused (Pearson r2 = 0.72, P = 1.3 × 10−4). The confusion index is the log of the expected level of taxonomic difference between two similar reference sequences weighted by the likelihood of observing similar sequences. All points calculated using 5-fold cross validation. Error bars are standard errors across folds. Regression confidence intervals are 95%. Source data are provided as a Source Data file

Back to article page