Fig. 5: Coverage of a sparse set of explanatory genes related to cell structure, replication, energy production, amino acid, and inorganic ion transport predicts replication patterns of Prochlorococcus and SAR11 ecotypes in random-forest models.
From: Basin-scale biogeography of Prochlorococcus and SAR11 ecotype replication

A Variance explained (R²) by the random-forest models for HLII, clade Ia, and clade Ib. B Relative frequency of genes by Cluster of Orthologous Groups (COG) category, compared between all reference genomes within a specific ecotype and the 40 explanatory genes selected by the random-forest model.
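
The comparison in panel B amounts to computing, for each gene set, the fraction of genes that fall into each COG category. A minimal sketch of that calculation follows. The function name `cog_relative_frequency` and the toy label lists are hypothetical and are not data or code from the study; the one-letter codes are standard COG categories.

```python
from collections import Counter


def cog_relative_frequency(cog_labels):
    """Return {COG category: fraction of genes} for a list of one-letter COG labels."""
    counts = Counter(cog_labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Hypothetical toy labels for illustration only (not values from the figure).
# E: amino acid transport, P: inorganic ion transport, C: energy production,
# M: cell wall/membrane, L: replication and repair, S: function unknown.
reference_genes = ["E", "E", "P", "C", "M", "L", "L", "S"]
selected_genes = ["E", "P", "C", "M"]

ref_freq = cog_relative_frequency(reference_genes)
sel_freq = cog_relative_frequency(selected_genes)

# Print the two relative frequencies side by side for each category.
for category in sorted(set(ref_freq) | set(sel_freq)):
    print(category,
          round(ref_freq.get(category, 0.0), 3),
          round(sel_freq.get(category, 0.0), 3))
```

A category that is more frequent among the selected genes than among the reference genomes is over-represented in the model's explanatory set, which is the kind of contrast panel B displays.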