Fig. 4: Distribution of false discovery rate simulation replicates for both unfiltered and filtered data. | Nature Communications

From: Microbiome differential abundance methods produce different results across 38 datasets

The percentage of amplicon sequence variants (ASVs) that are significant after Benjamini–Hochberg correction of the p-values (using a cut-off of 0.05) is shown for each dataset and tool. The boxes span the interquartile range (IQR), from the 25th to the 75th percentile, while the maxima and minima represent the maximum and minimum values outside 1.5 times the IQR. The notch in the middle of each boxplot represents the median. Note that the x-axis is on a pseudo-log10 scale. Panel a shows unfiltered datasets, while panel b shows datasets filtered using a 10% prevalence requirement for each ASV. Each dataset and tool combination was run 100 times, each time randomly assigning samples from the same environment and original grouping to one of two new random groupings. Differential abundance analysis was then performed on the two random groupings. Note that for the unfiltered datasets, 100 replicates were run for only 3 of the 8 datasets (Freshwater—Arctic, Soil—Blueberry, Human—OB (1)), with 100 ALDEx2 replicates also run for the Human - HIV (3) dataset. All other unfiltered datasets were run with 10 replicates due to computational limitations. Abbreviations: TMM, trimmed mean of M-values; TMMwsp, trimmed mean of M-values with singleton pairing; rare, rarefied; CLR, center-log-ratio. Source data are provided as a Source Data file.
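
The replicate procedure described in the caption can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the rank-sum test (normal approximation, no continuity correction) stands in for the differential abundance tools, the balanced random split and the synthetic count table are assumptions, and all function names are hypothetical. The `pseudo_log10` helper assumes an asinh-based pseudo-log transform with sigma = 1; the caption does not state which transform was used for the axis.

```python
import math
import random


def prevalence_filter(counts, min_prevalence=0.10):
    """Keep ASVs (rows) that are non-zero in at least min_prevalence of samples."""
    return [row for row in counts
            if sum(1 for x in row if x > 0) / len(row) >= min_prevalence]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_sum_pvalue(x, y):
    """Two-sided rank-sum p-value (normal approximation with tie correction)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1..j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:  # every value tied: no evidence of a difference
        return 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def percent_significant(counts, rng, alpha=0.05):
    """One replicate: random two-group split, test each ASV, apply BH, report %."""
    n_samples = len(counts[0])
    labels = [i % 2 for i in range(n_samples)]  # assumed balanced split
    rng.shuffle(labels)
    pvals = []
    for row in counts:
        g0 = [v for v, lab in zip(row, labels) if lab == 0]
        g1 = [v for v, lab in zip(row, labels) if lab == 1]
        pvals.append(rank_sum_pvalue(g0, g1))
    adj = benjamini_hochberg(pvals)
    return 100.0 * sum(p < alpha for p in adj) / len(adj)


def pseudo_log10(x, sigma=1.0):
    """Assumed pseudo-log10 axis transform: linear near 0, log-like for large x."""
    return math.asinh(x / (2 * sigma)) / math.log(10)


# Synthetic, sparse count table (50 ASVs x 20 samples); purely illustrative.
rng = random.Random(0)
table = [[rng.choice([0, 0, 0, rng.randint(1, 50)]) for _ in range(20)]
         for _ in range(50)]
filtered = prevalence_filter(table, min_prevalence=0.10)
replicates = [percent_significant(filtered, rng) for _ in range(10)]
print([round(pseudo_log10(r), 3) for r in replicates])
```

Because the two groups are drawn at random from the same samples, any ASV called significant in a replicate is a false positive, so the distribution of these percentages across replicates is what each boxplot summarizes.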
