Extended Data Fig. 4: Explanation of OncoNPC prediction for a patient with CUP.

The patient is a 76-year-old male with a tumor biopsy from the liver. The pie chart on the left shows the 10 most important features across three feature categories (that is, CNA events, somatic mutations and mutation signatures), and the scatter plot on the right shows their SHAP values and feature values. The size of each dot is scaled by the corresponding absolute SHAP value. On chart review, we found that the patient reported a 60 pack-year smoking history and had lived near a tar and chemical factory as a child. Despite the CUP diagnosis, OncoNPC confidently classified the primary site as NSCLC, with a posterior probability of 0.98. SBS4, a tobacco smoking-associated mutation signature, was significantly enriched in the patient’s tumor sample and had by far the largest impact on the prediction, followed by SBS24, a mutation signature associated with known exposure to aflatoxin, and a KRAS mutation.
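
As a rough illustration of how the quantities behind such a figure could be derived (a sketch under stated assumptions, not the authors' plotting code), the snippet below ranks features by absolute SHAP value, keeps the top 10, counts them per feature category for a pie chart, and scales marker sizes by absolute SHAP value. The function names, the linear scaling rule, `max_size`, and the placeholder feature names and values are all hypothetical; none of the numbers are taken from the figure.

```python
from collections import Counter


def top_features(shap_values, k=10):
    """Return the k (feature, SHAP) pairs with the largest absolute SHAP value."""
    ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


def dot_sizes(ranked, max_size=200.0):
    """Scale marker sizes linearly by |SHAP| relative to the largest |SHAP|.

    Linear scaling is an assumption; the caption only says size is scaled
    by absolute SHAP value.
    """
    largest = max((abs(v) for _, v in ranked), default=0.0)
    if largest == 0.0:
        return [0.0 for _ in ranked]
    return [max_size * abs(v) / largest for _, v in ranked]


def category_counts(ranked, categories):
    """Count how many of the ranked features fall in each feature category."""
    return Counter(categories[name] for name, _ in ranked)


# Placeholder inputs (hypothetical names and values, not data from the figure).
example_shap = {"sig_a": 0.5, "gene_b": -0.2, "cna_c": 0.1, "gene_d": 0.05}
example_categories = {
    "sig_a": "mutation signature",
    "gene_b": "somatic mutation",
    "cna_c": "CNA event",
    "gene_d": "somatic mutation",
}

ranked = top_features(example_shap, k=3)
sizes = dot_sizes(ranked)
counts = category_counts(ranked, example_categories)
```

Here `ranked` feeds the scatter plot (one dot per feature, sized by `sizes`), while `counts` gives the per-category shares for the pie chart.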