Extended Data Fig. 3: Robustness of OncoNPC performance with respect to input genomics features. | Nature Medicine

From: Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary

The figure shows OncoNPC performance, measured by F1 score, broken down by 22 cancer types across increasing prediction confidence. The cancer types on the y-axis are sorted in decreasing order of the number of tumor samples. To investigate the impact of input genomics features on OncoNPC’s robustness, we performed a feature ablation study: we ranked genes by their aggregated SHAP values and gradually reduced the input from all 846 features associated with those genes, as well as age and sex, to only the top 10% (that is, top 29 features). In each feature configuration, we re-trained the model with the same set of hyperparameters and evaluated its performance on the held-out CKP tumor samples (n = 7,289), which were utilized throughout this work. Supplementary Data 4 lists the input features that correspond to the selected genes in each configuration.
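The gene-level selection step of such an ablation can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `top_gene_features`, the feature naming scheme, and the toy importance values are hypothetical, and summing per-feature importance within each gene is an assumed aggregation, since the legend does not specify how SHAP values were aggregated.

```python
import math
from collections import defaultdict


def top_gene_features(feature_importance, feature_to_gene, keep_fraction):
    """Return the features belonging to the top `keep_fraction` of genes.

    feature_importance: dict of feature name -> non-negative importance
        (for example, a mean absolute SHAP value; an assumption here).
    feature_to_gene: dict of feature name -> gene symbol.
    """
    # Aggregate per-feature importance to the gene level by summation
    # (assumed aggregation; the source does not state the exact rule).
    gene_score = defaultdict(float)
    for feature, gene in feature_to_gene.items():
        gene_score[gene] += feature_importance.get(feature, 0.0)

    # Rank genes from most to least important; ties broken by gene name.
    ranked = sorted(gene_score, key=lambda g: (-gene_score[g], g))

    # Keep the requested fraction of genes, and always at least one.
    n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
    kept_genes = set(ranked[:n_keep])

    # Every feature associated with a kept gene stays in the configuration.
    return sorted(f for f, g in feature_to_gene.items() if g in kept_genes)


# Toy example with made-up numbers (not data from the study).
importance = {
    "TP53_mut": 0.30, "TP53_cna": 0.10,
    "KRAS_mut": 0.25,
    "EGFR_mut": 0.05, "EGFR_cna": 0.02,
    "APC_mut": 0.15,
}
feature_to_gene = {name: name.split("_")[0] for name in importance}

for fraction in (1.0, 0.5, 0.25):
    selected = top_gene_features(importance, feature_to_gene, fraction)
    print(fraction, selected)
    # In an ablation study, a model would be re-trained on `selected` here
    # with fixed hyperparameters and scored on the same held-out set.
```

Selecting at the gene level rather than the feature level keeps all features derived from a retained gene together, which is consistent with the legend's description of features "associated with those genes". How age and sex are handled in the reduced configurations is not modeled in this sketch.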
