Fig. 4: Novel diversity within Faecalibacterium and strain-dependent butyrate production.
From: HiBC: a publicly available collection of bacterial strains isolated from the human gut

a The two novel species of Faecalibacterium described in this paper, placed within the current landscape of Faecalibacterium spp. with a valid name, along with the type genomes for the proposed divisions of F. prausnitzii, as determined by GTDB. The phylogenomic tree was rooted on Ruminococcus bromii ATCC 27255T. Novel species are in bold, and the type strain of F. prausnitzii is underlined.
b Relative abundance and prevalence of the genus Faecalibacterium, and of each Faecalibacterium species represented within HiBC, across 4624 metagenomic samples. In the boxplots, the central line indicates the median, boxes represent the interquartile range, and whiskers represent the minimum and maximum values, excluding outliers.
c Butyrate production pathway in Faecalibacterium, with gene names and KEGG ortholog identifiers where available.
d Phylogenomic tree of the Faecalibacterium strains within HiBC, displaying the ability of each strain to produce butyrate over a 48-h period, along with the OD600 that the strain achieved during the testing period (n = 3 independent batch cultures for each strain; the replicates are shown as individual boxes). The phylogenomic tree was rooted on R. bromii ATCC 27255T.
e Sequence comparison of the butyrate production loci across the Faecalibacterium strains. Genes are coloured according to their assigned step in the butyrate production pathway in (c).
f AlphaFold3 model of the But complex in F. tardum CLA-AA-H175 against the full But protein in F. prausnitzii CLA-AA-H222. The first CLA-AA-H175 But gene is highlighted in yellow in the dashed box, while the second gene is shown in brown. The highlighted protein is indicated in the top right of the dashed box.
g As in (f), but with the second CLA-AA-H175 But gene highlighted in yellow in the dashed box and the first gene shown in brown. The highlighted protein is indicated in the top right of the dashed box.
h Correlation between the mean OD600 and the mean butyrate production of each strain, shown with a linear regression and its 95% confidence interval, and analysed using a two-sided Pearson correlation.
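For readers who want to reproduce the kind of analysis described in panel h, the following is a minimal sketch rather than the authors' code. It computes a Pearson correlation coefficient and an ordinary least-squares fit from paired per-strain means using only the Python standard library. The function name `pearson_and_fit` and the input values are hypothetical and purely illustrative; they are not data from the paper. The two-sided p-value and the 95% confidence band are not computed here, since they require a Student's t distribution (for example via `scipy.stats.pearsonr`); the sketch returns only the t statistic on n - 2 degrees of freedom.

```python
import math


def pearson_and_fit(x, y):
    """Return (r, slope, intercept, t_stat) for paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # Pearson r, clamped to guard against floating-point overshoot.
    r = max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))
    # Ordinary least-squares line y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # t statistic for H0: r = 0, with n - 2 degrees of freedom.
    if abs(r) >= 1.0:
        t_stat = math.copysign(math.inf, r)
    else:
        t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    return r, slope, intercept, t_stat


# Hypothetical per-strain means (illustrative values only, not from the paper).
mean_od600 = [0.2, 0.4, 0.6, 0.8]
mean_butyrate = [1.0, 3.0, 2.0, 4.0]

r, slope, intercept, t_stat = pearson_and_fit(mean_od600, mean_butyrate)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Comparing the t statistic against a t distribution with n - 2 degrees of freedom, in both tails, would give the two-sided p-value mentioned in the caption.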