Fig. 2: Significantly Different Gut Bacterial Abundances Detected Across Cohorts.

A Taxonomic stacked bar plots depicting relative abundances of the 18 most abundant bacteria across all samples from shotgun sequencing data (N = 174; Indian (n = 61), Indo-Immigr (n = 32), Indo-Can (n = 23), Euro-Can (n = 41), Euro-Immigr (n = 23)). B Heatmap generated in MicrobiomeAnalyst 2.0, displaying taxa identified by Random Forest as key features that contribute to the predictive accuracy of classifying samples into their respective groups (Figure S5). For the heatmap, features were filtered to retain those with a minimum of 4 counts in at least 20% of samples, and a low-variance filter of 10% was applied. Bars at the top indicate the cohort to which each sample column belongs. Taxa are labelled in rows, with the taxonomic rank noted before the taxon name. Colours on the heatmap represent the relative abundance of each taxon in a given sample. C Heatmap generated in RStudio, displaying the average relative abundances of P. copri clades in each cohort. D LEfSe results showing the top differentially abundant bacteria. A Kruskal-Wallis test was performed at a significance level (α) of 0.05 for one-against-all comparisons. Differentially abundant bacteria were identified using a Linear Discriminant Analysis (LDA) score threshold of 3.5 or greater. E Abundance plot of notable differentially abundant species identified by LEfSe, displaying average relative abundance per cohort.
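
The feature filtering described for panel B can be illustrated with a minimal sketch. This is not MicrobiomeAnalyst's own code: the function name `filter_features` and its parameters are hypothetical, and ranking taxa by inter-quartile range (IQR) for the low-variance step is an assumption, since the caption does not state which variance measure was used.

```python
import numpy as np

def filter_features(counts, min_count=4, prevalence=0.20, low_var_frac=0.10):
    """Return row indices of taxa that survive both filters.

    counts: 2-D array-like, rows = taxa, columns = samples (raw counts).
    Hypothetical helper; the IQR-based variance ranking is an assumption.
    """
    counts = np.asarray(counts, dtype=float)
    n_samples = counts.shape[1]

    # Prevalence filter: keep taxa with >= min_count reads
    # in at least `prevalence` of the samples.
    frac_present = (counts >= min_count).sum(axis=1) / n_samples
    idx = np.flatnonzero(frac_present >= prevalence)

    # Low-variance filter: among the remaining taxa, drop the
    # lowest `low_var_frac` fraction when ranked by IQR.
    q75, q25 = np.percentile(counts[idx], [75, 25], axis=1)
    iqr = q75 - q25
    n_drop = int(np.floor(low_var_frac * idx.size))
    order = np.argsort(iqr, kind="stable")
    return np.sort(idx[order[n_drop:]])
```

Applied to a taxa-by-sample count matrix, the function returns the indices of the rows that would remain for plotting; the defaults mirror the thresholds stated in the caption (4 counts, 20% prevalence, 10% low-variance cut-off).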