Fig. 1: Integrating host gene expression and gut microbiome abundance in CRC, IBD and IBS.

a, Overview of study design: colonic biopsy samples were collected from individuals in each disease cohort, and paired host transcriptomic (RNA-seq) and gut microbiome abundance (16S rRNA) data were generated for each sample. The paired data were integrated using a machine learning-based framework to characterize associations between gut microbiota and host genes and pathways across the three diseases (left to right) (see Methods for details on the integration framework and mathematical notation). b, Procrustes analysis showing the overall association between variation in host gene expression and variation in gut microbiome composition in CRC, IBD and IBS (left to right). We used Aitchison’s distance for host gene expression data (circles) and Bray-Curtis distance for gut microbiome data (triangles). Panel a was created using BioRender.com.
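The Procrustes comparison in panel b can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`clr`, `pcoa`, `procrustes_disparity`), the pseudocount of 1, the use of two ordination axes, and the toy count matrices are all assumptions made for illustration. The sketch computes Aitchison distance as Euclidean distance on centered log-ratio (CLR) transformed host counts, computes Bray-Curtis distance on microbial counts, ordinates each distance matrix with classical principal coordinates analysis, and then superimposes the two ordinations with `scipy.spatial.procrustes`.

```python
import numpy as np
from scipy.spatial import procrustes
from scipy.spatial.distance import pdist, squareform


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform per sample (row).

    The pseudocount is an assumption here, used only to avoid log(0).
    """
    logged = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)


def pcoa(dist, n_axes=2):
    """Classical PCoA (metric MDS) of a square distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_axes]
    # Non-Euclidean distances such as Bray-Curtis can yield small negative
    # eigenvalues; clip them to zero before taking the square root.
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))


def procrustes_disparity(host_counts, microbe_counts, n_axes=2):
    """Procrustes disparity between host and microbiome ordinations.

    Rows of both matrices must be the same samples in the same order.
    """
    # Aitchison distance = Euclidean distance between CLR-transformed samples.
    host_dist = squareform(pdist(clr(host_counts), metric="euclidean"))
    microbe_dist = squareform(pdist(microbe_counts, metric="braycurtis"))
    _, _, disparity = procrustes(pcoa(host_dist, n_axes),
                                 pcoa(microbe_dist, n_axes))
    return disparity


# Toy data (hypothetical): 12 paired samples, 30 host genes, 8 taxa.
rng = np.random.default_rng(0)
toy_host = rng.integers(1, 200, size=(12, 30))
toy_microbes = rng.integers(1, 50, size=(12, 8))
toy_disparity = procrustes_disparity(toy_host, toy_microbes)
```

The returned disparity is the sum of squared differences between the two standardized, optimally rotated ordinations; values closer to 0 indicate closer agreement. Assessing significance would normally require a permutation test, which this sketch omits.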