Extended Data Fig. 5: Genomic characterization and comparison of M. amalyticus.

a, Plot of the complete genome sequence of M. amalyticus JR1. From outer to inner: the blue track represents sequencing coverage using a sliding window of 1,000 bp; the outer grey circle indicates maximum coverage (412-fold) and the inner grey circle indicates average coverage (315-fold). The orange and green tracks indicate genes encoded on the forward and reverse strands, respectively. The light blue track indicates tRNA genes, and the red track shows the location of the single rRNA operon. Finally, the purple track represents GC content using a sliding window of 1,000 bp, with the outer and inner grey circles indicating maximum (62%) and average (50.4%) GC content, respectively. The scale of genome length in Mb is shown inside. b, BLASTn analysis of the full-length M. amalyticus JR1 16S rRNA gene showing % identity of the top five closest hits. c, Phylogenetic reconstruction based on the concatenated amino acid sequences of a set of 120 standard bacterial proteins from M. amalyticus JR1 and 27 related Saccharibacteria (Supplementary Table 8), generated using GTDB-Tk. The phylogenetic tree was inferred using the neighbor-joining method with a bootstrap test (1,000 replicates). Ca. Parcubacteria RAAC4-OD1 was used as an outgroup. Bar represents the number of amino acid substitutions per site. d, Heatmap based on percentage amino acid identity analysis of the genome sequences of M. amalyticus JR1 and the related Saccharibacteria shown in c. Text highlighted in green indicates Saccharibacteria that were identified in external environments (that is, not from the human oral environment).