Fig. 2: Protein-coding sequences on plasmids follow an empirical scaling law.

Comparison of scaling between chromosomes and plasmids on a log-log scale. Plasmids are shown in blue, megaplasmids (plasmids > 500,000 bp in length) in red, and chromosomes in green. Source data are provided as Source Data files. A As plasmids increase in size, the fraction of their sequence dedicated to protein-coding sequences converges to the corresponding fraction on chromosomes. B As plasmids increase in size, the fraction of sequence dedicated to protein-coding sequences increases, so the noncoding fraction decreases. The noncoding fraction is defined as (length – coding sequence length) / length. The gray line indicates the mean noncoding fraction and mean length within each of 100 bins of plasmids, binned by length. C The same pattern holds for microbes sampled across diverse environments. The ecological provenance of each replicon was annotated using the method described in Maddamsetti et al.3 (Methods).
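
The noncoding fraction and the binned averages behind the gray line in panel B can be illustrated with a minimal sketch. The function names are hypothetical. The caption does not say how the 100 bins were built, so the use of equal-count bins over length-sorted plasmids is an assumption, not necessarily the authors' procedure.

```python
import numpy as np

def noncoding_fraction(length, cds_length):
    """Noncoding fraction as defined in the caption: (length - CDS length) / length."""
    length = np.asarray(length, dtype=float)
    cds_length = np.asarray(cds_length, dtype=float)
    return (length - cds_length) / length

def binned_means(length, fraction, n_bins=100):
    """Mean length and mean noncoding fraction per length-ordered bin.

    Assumption: bins hold (nearly) equal numbers of plasmids; the caption
    only states that plasmids were binned by length into 100 buckets.
    """
    length = np.asarray(length, dtype=float)
    fraction = np.asarray(fraction, dtype=float)
    order = np.argsort(length)                 # indices sorted by replicon length
    chunks = np.array_split(order, n_bins)     # split indices into n_bins groups
    chunks = [c for c in chunks if c.size > 0] # drop empty bins if data are sparse
    mean_len = np.array([length[c].mean() for c in chunks])
    mean_frac = np.array([fraction[c].mean() for c in chunks])
    return mean_len, mean_frac
```

For example, a 1,000 bp plasmid with 800 bp of coding sequence has a noncoding fraction of (1000 – 800) / 1000 = 0.2. Plotting `mean_frac` against `mean_len` on log-log axes would give a summary line of the kind described for panel B.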