Extended Data Fig. 9: Scaling property of gene flow is obscured by fragmented genomes of varying quality.

The plots show the number of Archaea-related genes relative to the total gene count in Asgard archaea genomes. a. Only the 8 genomes investigated in detail in this study. All have fewer than 20 contigs and verified coverage of all archaeal markers. b. The genomes in a plus an additional 12 genomes (in black), each containing no more than 100 contigs and meeting the relaxed completeness scores shown in Supplementary Table 9. Because marker redundancy differs among lineages, the contamination level is hard to assess. c. The genomes in b plus all other 262 published Asgard archaea genomes (in green). The full set deviates severely from the invariant relation shown in a and instead shows a near-linear relation. This can be understood as follows: in incomplete or contaminated genomes, all types of genes have an equal probability of being retained. For example, the 1.5 Mb Odinarchaeote genome contains a similar number of Archaea-related genes (~900) as a 4.4 Mb Lokiarchaeote genome. However, if a Lokiarchaeote genome is fragmented into 300 contigs and only 1.5 Mb of its total length is randomly binned into a metagenome-assembled genome (MAG), that MAG will contain roughly 300 Archaea-related genes. Hence, the type of relation shown in a can only be captured with high-confidence, complete genomes. The legend for all panels is shown in c.
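
The ~300 figure in the Lokiarchaeote example follows from simple proportional scaling. The sketch below is illustrative only: it assumes that Archaea-related genes are spread uniformly along the genome and that binning retains a random fraction of the sequence by length. The function name `expected_retained_genes` is hypothetical and not part of the study's analysis.

```python
def expected_retained_genes(genes_in_full_genome, full_genome_mb, binned_mb):
    """Expected gene count in a partial bin.

    Assumption (not from the source): genes are uniformly distributed along
    the genome, so a random bin retains them in proportion to its length.
    """
    if full_genome_mb <= 0:
        raise ValueError("full_genome_mb must be positive")
    retained_fraction = binned_mb / full_genome_mb
    return genes_in_full_genome * retained_fraction


# Values from the legend: a 4.4 Mb Lokiarchaeote genome with ~900
# Archaea-related genes, of which only 1.5 Mb is binned.
loki_partial = expected_retained_genes(900, 4.4, 1.5)
print(round(loki_partial))  # 307, i.e. roughly 300
```

Under the same assumption, a complete 1.5 Mb Odinarchaeote genome keeps all ~900 of its Archaea-related genes, so the two cases sit at the same total length but at very different Archaea-related gene counts.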