Fig. 4: Gene–chemical networks differentiated in glucose abundance.
From: Data-driven discovery of the interplay between genetic and environmental factors in bacterial growth

A Evaluation of the ML model of Ksd by SHAP analysis. The SHAP value represents the contribution of each chemical to the variation in Ksd. The color gradient indicates the relative concentration of each chemical, normalized across the 135 media. Chemicals with significant contributions are shown in order. B–D Clustering dendrograms of the strains at varied glucose levels. Glc_l, Glc_m, and Glc_h indicate low, medium, and high glucose levels, respectively, each comprising varied media and the resulting gene–chemical combinations. The heatmaps indicate the feature importance of each chemical in each strain. Chemicals are arrayed vertically. The clusters are shown in green and orange. E–G Gene–chemical networks at varied glucose levels. The networks of Glc_l, Glc_m, and Glc_h were built from 208, 216, and 218 gene–chemical combinations, respectively. The chemicals of high importance in determining the growth of the 115 strains are indicated. The large transparent yellow nodes represent the determinative chemicals; the size of each node is the sum of its feature importance across all linked strains (genes). The small green and orange nodes indicate the knockout genes, categorized into the four clusters. Edge thickness reflects the magnitude of the feature importance.
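
The node-size and edge-thickness rules described for panels E–G can be illustrated with a minimal sketch. It assumes the feature importances are available as a mapping from (knockout gene, chemical) pairs to values. The names `importance` and `build_network`, the gene and chemical labels, and all numbers are hypothetical placeholders and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical feature-importance values for (knockout gene, chemical) pairs.
# Labels and numbers are illustrative only, not data from the study.
importance = {
    ("gene_a", "chem_x"): 0.40,
    ("gene_b", "chem_x"): 0.25,
    ("gene_a", "chem_y"): 0.10,
    ("gene_c", "chem_y"): 0.30,
}


def build_network(importance):
    """Return (node_size, edges) for a bipartite gene-chemical network.

    node_size maps each chemical to the sum of its feature importance
    over all linked genes; each edge carries the importance as its weight.
    """
    node_size = defaultdict(float)
    edges = []
    for (gene, chemical), weight in importance.items():
        node_size[chemical] += weight           # chemical node size
        edges.append((gene, chemical, weight))  # edge thickness ~ weight
    return dict(node_size), edges


node_size, edges = build_network(importance)
```

Under these assumptions, a chemical linked to many strains with high importance gets a large node, while the weight on each edge can be passed to a plotting library as the line width.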