Supplementary Figure 1: The effect of method and enzyme choice on the accuracy of 16S rRNA gene microbiome profiling.
From: Systematic improvement of amplicon marker gene methods for increased accuracy in microbiome studies

A-G) Bar plots showing mean observed abundances for the even mock community (HM-276D, unless otherwise stated) measured using the methods listed below. Expected abundances are indicated by the dashed line. Black asterisks indicate taxa whose observed abundance deviated by more than 5-fold from the expected value. Red asterisks indicate taxa that had no mapped reads (drop-outs). Error bars are +/- SEM.
A) Data reported by Kozich et al. (ref. 1), n = 12. § Mapped to the HM-278D reference file.
B) The EMP protocol, data reported by Nelson et al. (ref. 2), n = 2.
C) The EMP protocol (this study), n = 3.
D) The EMP protocol, substituting KAPA HiFi polymerase for the standard Taq polymerase, n = 3.
E) The dual-indexing (DI) protocol with Taq polymerase, n = 4.
F) The DI protocol with Q5 polymerase, n = 4.
G) The DI protocol with KAPA HiFi polymerase, n = 4.
H) Mean Absolute Percentage Error (MAPE) plot for the HM-276D even mock community data measured using the indicated methods. § HM-278D expected abundance values were used to calculate MAPE for this data set. Error bars are +/- SEM.
I) Scatter plot comparing HM-276D even mock community data reported by Nelson et al. (ref. 2) using the EMP protocol with data collected for this study using the EMP protocol. Error bars are +/- SEM.
J) Average number of L6 (genus-level) taxa observed with the indicated methods. Error bars are +/- SEM. *** p < 0.01, as determined by ANOVA with Tukey HSD post hoc test.
K-O) Bar plots showing mean observed abundances versus expected abundances for the staggered mock community (HM-277D) measured using the methods listed below. Expected abundances are indicated by the dashed line. Black asterisks indicate taxa whose observed abundance deviated by more than 5-fold from the expected value. Red asterisks indicate taxa that had no mapped reads (drop-outs). A star indicates an error bar whose lower bound is zero and therefore cannot be plotted on a log scale. Error bars are +/- SEM.
K) The EMP protocol, n = 3.
L) The EMP protocol, substituting KAPA HiFi polymerase for the standard Taq polymerase, n = 3.
M) The dual-indexing (DI) protocol with Taq polymerase, n = 3.
N) The DI protocol with Q5 polymerase, n = 3.
O) The DI protocol with KAPA HiFi polymerase, n = 3.
P) MAPE plot for the HM-277D staggered mock community data measured using the indicated methods. Error bars are +/- SEM.
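
The summary statistics named in this legend (MAPE, SEM, and the asterisk flags) can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes that MAPE is the mean over taxa of |observed - expected| / expected, expressed in percent, that SEM is the sample standard deviation divided by the square root of n, and that "more than 5-fold" applies in both directions (over- and under-representation). The function names and the example values are hypothetical.

```python
import math
import statistics


def mape(observed, expected):
    """Mean absolute percentage error across taxa, in percent.

    Assumption: each taxon contributes |obs - exp| / exp with equal weight.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    errors = [abs(o - e) / e for o, e in zip(observed, expected)]
    return 100.0 * statistics.fmean(errors)


def sem(values):
    """Standard error of the mean, using the sample (n - 1) standard deviation."""
    return statistics.stdev(values) / math.sqrt(len(values))


def flag_taxon(observed, expected, fold=5.0):
    """Classify one taxon the way the asterisks in the bar plots are described.

    Returns "drop-out" when no reads mapped, "deviant" when the observed
    abundance is more than `fold` times above or below the expected value,
    and "ok" otherwise. Treating the threshold as symmetric is an assumption.
    """
    if observed == 0:
        return "drop-out"
    ratio = observed / expected
    if ratio > fold or ratio < 1.0 / fold:
        return "deviant"
    return "ok"


# Hypothetical abundances (percent) for a four-taxon even mock community.
expected = [25.0, 25.0, 25.0, 25.0]
observed = [30.0, 20.0, 4.0, 0.0]

print(mape(observed, expected))
print([flag_taxon(o, e) for o, e in zip(observed, expected)])
print(sem([30.0, 28.0, 32.0]))  # SEM across three hypothetical replicates
```

Under these assumptions, a drop-out contributes an error of 100% to MAPE, so methods with many drop-outs score poorly even when the remaining taxa are measured accurately.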