Extended Data Fig. 2: Meta-analysis of species diversity and oral species richness in CRC datasets.

a, Boxplots reporting the Shannon species diversity in each dataset. P values between the carcinoma and control groups were calculated by two-tailed Wilcoxon rank-sum tests. b, Boxplots reporting the Shannon species diversity calculated on metagenomes subsampled, within each dataset, to the read count at that dataset's tenth percentile. P values were calculated by two-tailed Wilcoxon rank-sum tests. c, Multivariate analysis of species diversity using crude and age-, sex- and BMI-adjusted coefficients obtained from linear models. d, Meta-analysis of crude and adjusted multivariate Shannon diversity coefficients using a random effects model. Bold lines represent the 95% confidence interval for the random effects model estimate. e, Boxplots reporting the total number of oral microbial species per dataset. P values were calculated by two-tailed Wilcoxon rank-sum tests comparing values between controls and carcinomas for each dataset. f, Multivariate analysis of putative oral species richness using crude and age-, sex- and BMI-adjusted coefficients obtained from linear models. g, Meta-analysis of crude and adjusted multivariate putative oral species richness coefficients using a random effects model. Bold lines represent the 95% confidence interval for the random effects model estimate.
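
For readers who want to see how two of the quantities in this legend are typically computed, the following is a minimal sketch, not the authors' pipeline. It shows the Shannon diversity index (panels a and b) and a random effects pooling of per-dataset coefficients (panels d and g). The legend does not state the logarithm base or the between-study variance estimator, so the natural logarithm and the DerSimonian-Laird estimator are assumptions here, and the function names `shannon_diversity` and `random_effects_meta` are hypothetical.

```python
import math


def shannon_diversity(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over species with non-zero abundance.

    Assumption: natural logarithm; the legend does not state the base.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


def random_effects_meta(effects, variances):
    """Pool per-dataset coefficients with a random effects model.

    Assumption: DerSimonian-Laird estimate of the between-study variance
    (tau^2); the legend only says "random effects model".
    Returns (pooled estimate, (95% CI lower, 95% CI upper), tau^2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    # Normal-approximation 95% confidence interval.
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2
```

In this sketch, `effects` would hold one linear-model coefficient per dataset (crude or adjusted) and `variances` the squared standard errors of those coefficients. The interval returned corresponds to what the bold lines in panels d and g represent.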