Fig. 5: Validation of the fitted large-scale cancer model on independent test data.
From: Mini-batch optimization enables training of ODE models on large-scale datasets

a Correlation of test data and model prediction. Color-coding indicates point density in the scatter plot. b ROC curves for classification of drug responses of cell lines on the test set. Classification thresholds were taken from the training data. c Area under the ROC curve and classification accuracy on test data for the ten best optimizations (gray), the ensemble model (black), and the ensemble model for each drug individually (colored). d Simulated drug response. Left: Ranking of fit quality for cell lines by average root-mean-square error. Right: Two out of 59 cell lines from the test data: one that the model described well (blue, 8505C) and one that the model captured less well (orange, JHH5). Error bars indicate the standard deviation across an ensemble of the n = 10 best optimization runs. e ROC curves for classification of gene essentiality. Measurement data for 18 cell lines were taken from Behan et al., 2019. In silico knockouts are shown for the untrained model (blue), the trained model (orange), and the refined and trained model (green). f Measurement, prediction, and confusion matrix for essential genes for the refined and trained model. Numbers indicate how often a gene was found to be essential in the experimental data and in the in silico knockout predictions across the 18 cell lines; sums show the number of true and false predictions over essential genes.
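
The quantities named in panels b to d (ensemble mean and standard deviation across optimization runs, area under the ROC curve, and accuracy at a fixed threshold) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy numbers, and the threshold value of 0.5 are all hypothetical, and the AUC is computed with the generic pairwise-ranking definition rather than any method stated in the article.

```python
from statistics import mean, stdev


def ensemble_summary(predictions):
    """Per-condition mean and sample standard deviation across runs.

    predictions: list of runs, each a list with one simulated value
    per condition (hypothetical layout, chosen for this sketch).
    """
    per_condition = list(zip(*predictions))
    means = [mean(values) for values in per_condition]
    stds = [stdev(values) for values in per_condition]
    return means, stds


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold):
    """Fraction of correct calls when score >= threshold means 'positive'."""
    predicted = [s >= threshold for s in scores]
    return sum(p == bool(l) for p, l in zip(predicted, labels)) / len(labels)


# Toy data (invented): three runs, four conditions.
runs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.3, 0.7, 0.1],
    [1.0, 0.1, 0.5, 0.4],
]
labels = [1, 0, 0, 1]  # invented binarized measurements

means, stds = ensemble_summary(runs)
auc = roc_auc(labels, means)
# The threshold is assumed to have been fixed beforehand on training data.
acc = accuracy(labels, means, threshold=0.5)
print(means, stds, auc, acc)
```

In this sketch the ensemble prediction is the per-condition mean over runs, and the per-condition standard deviation plays the role of the error bars in panel d.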