Extended Data Fig. 1: Positive and negative controls.

a, Number of read pairs per sample, grouped by sample type (n = 3,577 samples prior to aggregation by stool ID). Boxes extend from the 25th to the 75th percentile, whiskers extend to 1.5 times the interquartile range, and the center line marks the median. b, Principal coordinates analysis (PCoA) of all samples prior to aggregation. Two positive controls (BZIZNTZA and JVOMNOOB, highlighted in red) clustered separately from the other positive controls. PCoA1 and PCoA2 explain 30% and 10% of the overall variance, respectively. c, Species-level relative abundances (MetaPhlAn4) for positive controls. The same two positive controls (BZIZNTZA and JVOMNOOB, highlighted in red) did not display the expected community composition. d, PCoA of non-control samples prior to aggregation. e, Same ordination as d, with lines connecting samples originating from the same DNA. f, Same ordination as d, with lines connecting samples in which a library was sequenced multiple times.