Fig. 1: The unified sequence catalog of the human gut microbiome. | Nature Biotechnology

From: A unified catalog of 204,938 reference genomes from the human gut microbiome

a, Number of gut genomes for each study set used to generate the sequence catalogs, colored according to whether they represent isolate genomes or MAGs. b, Geographic distribution of the number of genomes retrieved per country. c, Overview of the methods used to generate the genome (UHGG) and protein sequence (UHGP) catalogs. Genomes retrieved from public datasets first underwent quality control by CheckM. Filtered genomes were clustered at an estimated species level (95% ANI), and their intraspecies diversity was assessed (genes from conspecific genomes were clustered at 90% protein identity). In parallel, a nonredundant protein catalog was generated from all coding sequences of the 286,997 genomes at 100% (UHGP-100, n = 170,602,708), 95% (UHGP-95, n = 20,239,340), 90% (UHGP-90, n = 13,907,849) and 50% (UHGP-50, n = 4,735,546) protein identity.
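
To put the four protein catalog sizes in perspective, the short sketch below expresses each one as a fraction of UHGP-100. The counts are copied from the caption above; the names `UHGP_COUNTS` and `fraction_of_baseline` are hypothetical and chosen only for this illustration. It is plain arithmetic on the reported numbers, not a reproduction of the clustering itself.

```python
# Entry counts copied from the figure caption; keys are the catalog names.
UHGP_COUNTS = {
    "UHGP-100": 170_602_708,
    "UHGP-95": 20_239_340,
    "UHGP-90": 13_907_849,
    "UHGP-50": 4_735_546,
}


def fraction_of_baseline(counts, baseline="UHGP-100"):
    """Return each catalog's size as a fraction of the baseline catalog."""
    base = counts[baseline]
    return {name: n / base for name, n in counts.items()}


if __name__ == "__main__":
    # Print one line per catalog: absolute size and share of UHGP-100.
    for name, frac in fraction_of_baseline(UHGP_COUNTS).items():
        print(f"{name}: {UHGP_COUNTS[name]:>11,} entries ({frac:.1%} of UHGP-100)")
```

By this arithmetic, clustering at 95%, 90% and 50% protein identity leaves roughly 12%, 8% and 3% of the UHGP-100 entries, respectively.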
