Fig. 4: CeSta leverages informative features and combines weaker classifiers.

a Feature importance as determined by CeSta, represented by the mean absolute SHAP value (x-axis) for the top significant features (y-axis). b Impact of the top significant features on CeSta output, shown as SHAP values (x-axis) across all 50 PDXs in the CR-PDX validation set (dots). The most important features in (a) have the greatest impact on model output, with a clear separation between positive and negative effects. c Performance of CeSta's top features on IRCC PDXs and the external cohort. The relationship between a feature's SHAP values and cetuximab sensitivity on the train set (full IRCC-PDX set, x-axis) and the test set (CR-PDX set, y-axis), after removing other features' effects (partial correlation, parSHAP). Dot size and colour indicate a feature's mean absolute SHAP value on the train set. Dots closer to the diagonal indicate consistent performance across train and test sets. Key features such as KRAS mutation and EREG expression lie close to the diagonal, indicating a good fit or slight underfitting. d Underperformance of CMP-trained features on the external cohort. The relationship between CatBoostCMP feature SHAP values and cetuximab sensitivity on the train (panCMP) and test (CR-PDX) sets, after removing other features' effects. Dot size and colour represent a feature's impact on model prediction. Many top features of this model fall in the lower-right quadrant, indicating overfitting. e 95% confidence intervals (CIs) of the AUROC for CeSta (blue), three level-1 classifiers (orange), the CatBoost model trained on the panCMP dataset (green), and the same CatBoost model retrained on the IRCC-PDX dataset. CeSta shows a slight performance improvement over the best level-1 classifier, with overlapping CIs. The cell-line-trained CatBoost classifier predicts cetuximab sensitivity in PDXs poorly, but retraining improves its performance. Source data are provided as a Source Data file.
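
The partial correlation used in panels (c) and (d) can be illustrated with a minimal sketch. The caption does not specify how parSHAP is computed, so everything here is an assumption: the function names `partial_corr` and `parshap_like` are hypothetical, Pearson correlation on linear-regression residuals is used as one common definition of partial correlation, and the data are synthetic. It is not the authors' implementation.

```python
import numpy as np


def partial_corr(x, y, covariates):
    """Pearson correlation of x and y after regressing the covariates out of both.

    Assumption: linear (least-squares) removal of covariate effects.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept column plus the covariates.
    Z = np.column_stack([np.ones(len(x)), np.asarray(covariates, dtype=float)])
    # Residuals of x and y after the least-squares fit on Z.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])


def parshap_like(shap_values, target):
    """For each feature, correlate its SHAP column with the target while
    controlling for the SHAP columns of all other features."""
    shap_values = np.asarray(shap_values, dtype=float)
    scores = []
    for j in range(shap_values.shape[1]):
        others = np.delete(shap_values, j, axis=1)
        scores.append(partial_corr(shap_values[:, j], target, others))
    return np.array(scores)


# Synthetic illustration only: feature 0 drives the target, features 1 and 2 do not.
rng = np.random.default_rng(0)
demo_shap = rng.normal(size=(200, 3))
demo_target = 2.0 * demo_shap[:, 0] + 0.1 * rng.normal(size=200)
demo_scores = parshap_like(demo_shap, demo_target)
```

Computing such a score once on the train set and once on the test set gives one (x, y) point per feature. A feature whose train score is well above its test score lands below the diagonal, which is the pattern the caption reads as overfitting.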