Fig. 1: Microbiome composition and diversity in the AWI-Gen 2 cohort.

a, Sample number and location of each study site. Countries containing study sites are shaded dark grey. b, Principal coordinate analysis of all samples based on Bray–Curtis distances between species-level prokaryotic profiles. Samples are coloured by study site, and the boxplots show the samples from each site projected onto the first and second principal coordinates. c, Prokaryotic diversity (inverse Simpson’s index after rarefaction) per site (Kruskal–Wallis test, P < 2 × 10⁻¹⁶, n = 1,796 after quality control and removing data from male individuals). d, Heatmap showing the number of prokaryotic species with high generalized fold change between sites; sites are clustered on the basis of this number of species. e, The log₁₀(relative abundance) of genera with the highest variance in fold change and median across sites. f, The log₁₀ of the mean relative abundance per site is shown for all species within the genera shown in e. For Prevotella, Oribacterium, Cryptobacteroides and Treponema, all species with scientific names are highlighted; for the other genera, only the most abundant species with scientific names are indicated. All panels represent data from n = 1,796 biologically independent samples. Boxplot boxes denote the interquartile range (IQR), thick black lines indicate the median, and whiskers extend to the most extreme points within 1.5 × IQR. The Supplementary Methods contain photographs and further information for each site.
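
For readers unfamiliar with the two measures named in panels b and c, the following is a minimal illustrative sketch of how the inverse Simpson's index and the Bray–Curtis distance are conventionally defined. It is not the authors' pipeline: the function names and toy inputs are assumptions made for illustration, and the rarefaction step mentioned in the caption is omitted.

```python
def inverse_simpson(counts):
    """Inverse Simpson's index, 1 / sum(p_i^2), for one sample.

    `counts` is a hypothetical list of per-species counts (or abundances).
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no counts")
    # p_i is the relative abundance of species i in the sample.
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis distance, sum|u_i - v_i| / sum(u_i + v_i), between two profiles."""
    if len(u) != len(v):
        raise ValueError("profiles must cover the same species")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return numerator / denominator


# Toy examples (made-up numbers, not data from the study).
even_sample = [10, 10, 10, 10]    # four equally abundant species
skewed_sample = [37, 1, 1, 1]     # one dominant species
print(inverse_simpson(even_sample))     # equals the species count when perfectly even
print(inverse_simpson(skewed_sample))   # close to 1 when one species dominates
print(bray_curtis([1.0, 0.0], [0.0, 1.0]))  # profiles sharing no species
```

A perfectly even community gives an index equal to its number of species, whereas dominance by a single species pushes the index towards 1. The Bray–Curtis distance ranges from 0 for identical profiles to 1 for profiles with no species in common.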