Fig. 2: Distribution of rCGCs across bacterial taxonomy.

a rCGC prevalence in bacterial families. The size of a dot represents the percentage of genomes in a family in which an rCGC type was detected by rhizoSMASH (defined as prevalence in the main text). The phylogenetic tree on the left was derived from the lineage information available at NCBI. The number in parentheses indicates the number of completely assembled genomes in BARS for each family. b The richness (the average number of rCGC types present in the genomes of a family) and c the diversity (the average difference in rCGC presence/absence between each pair of genomes in a family) indices of rCGC types across bacterial families. d The rCGC presence/absence profiles in Nocardiaceae spp. genomes whose isolation origins were clear, where a solid dot represents the presence of an rCGC type in a genome and an open dot represents its absence. The labels on the right indicate the isolation origins, based on the records at JGI GOLD and NCBI. The phylogeny on the left was constructed from genomic 16S rDNA sequences and rooted with the Pseudomonas putida type strain NBRC 14164 as an outgroup (not shown on the tree).
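
The three per-family quantities defined in this caption (prevalence, richness, diversity) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each genome is represented as a set of detected rCGC type names, and it assumes the pairwise "difference" is the count of rCGC types present in one genome but not the other (a Hamming distance); the caption does not state whether this count is normalized. The function names and the example family are hypothetical.

```python
from itertools import combinations

def prevalence(profiles, rcgc_type):
    """Percentage of genomes in a family in which rcgc_type was detected."""
    return 100.0 * sum(rcgc_type in p for p in profiles) / len(profiles)

def richness(profiles):
    """Average number of rCGC types present per genome in a family."""
    return sum(len(p) for p in profiles) / len(profiles)

def diversity(profiles):
    """Average pairwise difference in rCGC presence/absence.

    Assumption: the difference between two genomes is the number of
    rCGC types present in exactly one of them (unnormalized).
    """
    pairs = list(combinations(profiles, 2))
    if not pairs:  # a family with a single genome has no pairs
        return 0.0
    return sum(len(a ^ b) for a, b in pairs) / len(pairs)

# Hypothetical family of three genomes with made-up rCGC type names.
family = [{"typeA", "typeB"}, {"typeA"}, {"typeA", "typeB", "typeC"}]

print(prevalence(family, "typeB"))
print(richness(family))
print(diversity(family))
```

Representing each genome as a set keeps the pairwise comparison to a single symmetric-difference operation; a binary presence/absence matrix would work equally well for larger families.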