Fig. 5
From: Phylogenomics of 10,575 genomes reveals evolutionary proximity between domains Bacteria and Archaea

Relative Archaea–Bacteria distances indicated by individual gene trees and their concordance with the species tree.

a Distribution of relative A(rchaea)–B(acteria) distances of individual gene trees. A total of 161 gene trees are shown, selected such that both domains have at least 50% taxa represented in each tree. A histogram with Gaussian kernel density function and a rug plot representing individual data points are displayed. The blue and red vertical lines indicate the values of the ASTRAL species tree with branch lengths estimated using the global markers and the ribosomal proteins, respectively. Gene names are labeled at data points separated from the main cluster.

b Distribution of relative A–B distances by functional category (GO slim term under the “molecular function” master category) of the 161 gene trees. The top ten most frequently assigned categories in all gene trees are shown. Boxplot components: center line, median; box limits, upper and lower quartiles; whiskers, 1.5 × interquartile range; black diamonds, outliers.

c Distribution of quartet scores of all 381 gene trees vs. the species tree. Boxplot components are identical to b.

d mMDS plot of the quartet distance matrix of all 381 gene trees plus the species tree (semi-transparent big gray ball in the center). The color scheme for genes annotated by exactly one of the top ten functional categories (normal-sized balls) is consistent with b and c, except that the category “ion binding” is omitted due to its high frequency. Genes annotated by more than one of the top ten categories (light yellow), or by categories other than the top ten (semi-transparent light gray) are indicated by smaller balls.

e Linear regression of relative A–B distances vs. quartet scores of the 161 gene trees. The squared Pearson correlation coefficient (R²) and two-tailed p-value are displayed.

Source data are provided as a Source Data file.
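
The boxplot components listed for b and c can be illustrated with a minimal sketch. It assumes the common Tukey convention, in which each whisker ends at the most extreme data point lying within 1.5 × IQR of the box and anything beyond that is drawn as an outlier; the caption does not state which plotting tool or quartile method was used. The function name `boxplot_components` and the sample values are hypothetical and are not taken from the Source Data file.

```python
from statistics import quantiles  # available in Python 3.8+


def boxplot_components(values, whisker=1.5):
    """Return median, quartiles, whisker ends, and outliers for one box.

    Assumption: Tukey-style whiskers (most extreme points within
    whisker * IQR of the box) and linearly interpolated quartiles.
    """
    data = sorted(values)
    # method="inclusive" interpolates linearly between order statistics
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],    # smallest point within the fences
        "whisker_high": inside[-1],  # largest point within the fences
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical values for illustration only (not data from the paper)
example = boxplot_components([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(example)
```

With these made-up values, the single point far above the upper fence is reported as an outlier, and the upper whisker stops at the largest remaining value.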