Fig. 1: Assessing the performance of different algorithms and testing the active learning workflow with minimal data points.
From: A versatile active learning workflow for optimization of genetic and metabolic networks

a An existing dataset of cell-free gene expression compositions comprising 1000 data points was used to build a gold standard regressor and assess the performance of different machine learning algorithms in 10 rounds of active learning. b Top panel: performance of four algorithms (multilayer perceptron (MLP), deep neural network (DNN), linear regressor, and XGBoost gradient boosting) in 10 rounds of active learning (100 data points per round). Bottom panel: performance of XGBoost gradient boosting, the selected algorithm, with different sample sizes. The boxplots, with a whisker length of 1.5, represent the minimum, 25th percentile (bottom bound of box), median (center of box), 75th percentile (upper bound of box), and maximum. c An in vitro, or cell-free, transcription-translation (TXTL) system based on E. coli lysate was used to test the workflow with 20 data points per round. A plasmid expressing sfGfp was added to the TXTL reaction mix along with 13 components of the reaction buffer and energy mix. d Overview of the active learning cycle. The 13 components are varied, starting with random compositions; over 10 rounds, results are imported to the model, which learns and suggests new compositions to improve the objective function. e Average of triplicates (n = 3 independent experiments) of the objective function (yield) for the compositions tested in 10 rounds (days) of active learning. The gray lines show the median. f Feature importance percentages show the effect of each factor on the model’s decision to calculate yields for the suggested compositions. g Distribution of different concentrations of each factor within the measured yields. The Google Colab Python notebook and all active learning data (combinations and yields) in this figure are available at https://github.com/amirpandi/METIS.
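
The cycle in panel d can be illustrated with a minimal, standard-library-only sketch. It is not the authors' notebook. The counts (13 factors, 10 rounds, 20 compositions per round, triplicate averaging) come from the caption; everything else is an assumption made for illustration: the normalized concentration levels, the toy `measure_yield` function that stands in for the wet-lab measurement, the k-nearest-neighbour surrogate that stands in for the XGBoost model, and the scoring rule (predicted yield plus an uncertainty bonus) used to pick the next batch.

```python
import random
import statistics

N_FACTORS = 13    # components varied per composition (from the caption)
N_ROUNDS = 10     # active learning rounds (from the caption)
BATCH_SIZE = 20   # compositions tested per round (from the caption)
LEVELS = (0.0, 0.25, 0.5, 0.75, 1.0)  # hypothetical normalized concentration levels
N_CANDIDATES = 500  # hypothetical size of the random candidate pool per round
EXPLORE = 1.0       # hypothetical weight on the uncertainty bonus


def measure_yield(composition, rng):
    """Hypothetical stand-in for the experiment: mean of three noisy replicates."""
    true_yield = sum(1.0 - (x - 0.75) ** 2 for x in composition) / len(composition)
    replicates = [true_yield + rng.gauss(0.0, 0.01) for _ in range(3)]
    return statistics.mean(replicates)


def predict(composition, data, k=5):
    """Toy k-nearest-neighbour surrogate (assumption, not XGBoost).

    Returns the mean and spread of the yields of the k closest measured compositions.
    """
    def sq_dist(record):
        return sum((a - b) ** 2 for a, b in zip(composition, record[0]))

    nearest = sorted(data, key=sq_dist)[:k]
    yields = [y for _, y in nearest]
    return statistics.mean(yields), statistics.pstdev(yields)


def run_active_learning(seed=0):
    rng = random.Random(seed)

    def random_composition():
        return tuple(rng.choice(LEVELS) for _ in range(N_FACTORS))

    data = []           # every (composition, yield) pair measured so far
    round_medians = []  # median yield of each round's batch
    batch = [random_composition() for _ in range(BATCH_SIZE)]  # first round is random

    for _ in range(N_ROUNDS):
        # "Experiment": measure the current batch and import the results.
        results = [(c, measure_yield(c, rng)) for c in batch]
        data.extend(results)
        round_medians.append(statistics.median([y for _, y in results]))

        # "Model": score unseen random candidates and suggest the next batch.
        seen = {c for c, _ in data}
        candidates = dict.fromkeys(random_composition() for _ in range(N_CANDIDATES))
        pool = [c for c in candidates if c not in seen]

        def score(c):
            mean, spread = predict(c, data)
            return mean + EXPLORE * spread

        pool.sort(key=score, reverse=True)
        batch = pool[:BATCH_SIZE]

    return data, round_medians
```

Calling `run_active_learning()` returns all measured (composition, yield) pairs together with one median per round, which is the kind of per-round summary that panel e plots. Swapping `predict` for a trained regressor would leave the loop structure unchanged.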