Extended Data Fig. 1: Sequencing depths and species richness across CRC datasets

a, Boxplots reporting the total number of reads in each dataset. P values between the carcinoma and control groups were calculated by two-tailed Wilcoxon rank-sum tests. b, Boxplots showing the total number of microbial species per dataset. P values were calculated by two-tailed Wilcoxon rank-sum tests. c, Boxplots showing the total number of microbial species per dataset after subsampling the metagenomes in each dataset to the tenth-percentile read count of that dataset. P values were calculated by two-tailed Wilcoxon rank-sum tests. d, Multivariate analysis of species richness using crude and age-, sex- and BMI-adjusted coefficients obtained from linear models. e, Meta-analysis of crude and adjusted multivariate richness coefficients using a random-effects model. Bold lines represent the 95% confidence interval for the random-effects model estimate.
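
For readers who want to see how per-dataset richness coefficients such as those in panel d could be pooled as in panel e, the following is a minimal sketch. It assumes the DerSimonian-Laird estimator of between-dataset variance, which the legend does not specify, and a normal-approximation 95% confidence interval. The function name `random_effects_meta` and the input values are hypothetical and chosen only for illustration; they are not results from the study.

```python
import math


def random_effects_meta(effects, std_errors):
    """Pool per-dataset coefficients with a random-effects model.

    Assumption: between-dataset variance (tau^2) is estimated with the
    DerSimonian-Laird method; the figure legend does not name the estimator.
    Returns (pooled_estimate, ci_lower, ci_upper, tau2).
    """
    k = len(effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two effects with matching standard errors")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # DerSimonian-Laird estimate of tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-dataset variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)

    z = 1.959963984540054  # 97.5th percentile of the standard normal
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2


if __name__ == "__main__":
    # Hypothetical coefficients and standard errors, not data from the study.
    coefs = [4.2, -1.5, 7.8, 2.1]
    ses = [2.0, 3.1, 2.6, 1.8]
    est, lo, hi, tau2 = random_effects_meta(coefs, ses)
    print(f"pooled={est:.2f}, 95% CI=({lo:.2f}, {hi:.2f}), tau2={tau2:.2f}")
```

The same function would be called once on the crude coefficients and once on the adjusted coefficients to obtain the two pooled estimates with their 95% confidence intervals.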