Extended Data Fig. 3: Robustness of OncoNPC performance with respect to input genomics features.

The figure shows the breakdown of OncoNPC performance in F1 score by 22 cancer types across increasing prediction confidence. The cancer types on the y-axis are sorted in a decreasing order of the number of tumor samples. In order to investigate the impact of input genomics features on OncoNPC’s robustness, we performed a feature ablation study, where we chose the most important genes based on their aggregated SHAP values and gradually reduced them from all 846 features associated with those genes, as well as age and sex, to only the top 10% (that is, top 29 features). In each feature configuration, we re-trained the model with the same set of hyperparameters and evaluated its performance on the held-out CKP tumor samples (n = 7,289), which were utilized throughout this work. Supplementary Data 4 provides a list of input features that correspond to the selected genes in each configuration.