Extended Data Fig. 1: Diversification and structure of cps gene clusters in human gut Bacteroidetes.

a, The genomes of 53 different human gut Bacteroidetes (predominantly named type strains) were searched for gene clusters that contain two or more different protein families indicative of cps loci (see Methods). The number of cps loci detected in each genome is shown in the context of a phylogenetic tree derived from the core genome of the 53 species used for this analysis; species for which cps loci were not detected using our search criteria are marked with a red “X”. Because gaps in several genomes often occur at cps loci, the numbers shown are likely to be underestimates. b, Schematics of the 8 annotated cps loci in B. thetaiotaomicron VPI-5482, each of which is present singly in one of the cps1-cps8 strains used in this study, and all of which are eliminated in the acapsular strain. Genes are color-coded according to the key at the bottom, and additional Pfam family designations are provided under most genes. The four main protein families used for informatics analysis are marked with asterisks and highlighted in bold in the key.