Fig. 2: VEPs identify unannotated driver mutations.
From: AI cancer driver mutation predictions are valid in real-world data

A. Frequency and annotation of missense mutations at binding residues (ligand-binding or protein-protein interaction hotspots; see Supplemental Appendix) or non-binding residues of all genes with available binding-residue information in the GENIE v14-public pan-cancer cohort (N = 209,588). Asterisks denote statistical significance from two-sided Fisher’s exact tests with FDR correction (*: q-value ≤ 0.1). OncoKB groups include all missense mutations, whereas variant effect predictor (VEP) groups include only variants of uncertain significance (VUSs). B. Inverse probability of treatment weighted overall survival hazard ratios (from time of diagnosis, left-truncated at time of sequencing) of patients harboring reclassified oncogenic mutations compared with patients without a mutation in commonly mutated genes in non-small cell lung cancer (NSCLC). Patients are from the MSK-IMPACT NSCLC (N = 7965) and AACR GENIE Biopharma Collaborative NSCLC (N = 977) cohorts. Inset: Inverse probability of treatment weighted Kaplan-Meier curves comparing overall survival (from time of diagnosis, left-truncated at time of sequencing) of patients grouped by KEAP1 mutation annotation in the MSK-IMPACT NSCLC cohort. C. Alteration frequencies and overlap of AlphaMissense-reclassified pathogenic mutations with oncogenic alterations of genes in the same pathway in the MSK-IMPACT NSCLC cohort (N = 7965 patients). Inset: Oncoprint of genes in the NRF2 pathway for N = 1279 samples with an NRF2 pathway alteration. Asterisks denote statistical significance from two-sided Fisher’s exact tests with FDR correction (***: q-value ≤ 0.01). KEAP1 reclassified mutations, like KEAP1 oncogenic mutations, are mutually exclusive with other oncogenic mutations in NFE2L2 and CUL3. Source data are provided as a Source Data file.
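The asterisks in panels A and C come from two-sided Fisher’s exact tests followed by FDR correction. The caption does not say which FDR procedure was used, so the sketch below assumes Benjamini-Hochberg. It is a minimal illustration in plain Python, not the authors' analysis code. The function names and the 2x2 counts are hypothetical and are not taken from the Source Data file.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


if __name__ == "__main__":
    # Hypothetical counts per comparison:
    # (binding & annotated, binding & not annotated,
    #  non-binding & annotated, non-binding & not annotated)
    tables = {
        "group_1": (30, 70, 10, 90),
        "group_2": (12, 88, 11, 89),
    }
    pvals = [fisher_exact_two_sided(*t) for t in tables.values()]
    for name, p, q in zip(tables, pvals, benjamini_hochberg(pvals)):
        flag = "*" if q <= 0.1 else ""
        print(f"{name}: p = {p:.3g}, q = {q:.3g} {flag}")
```

The starring rule in the usage example mirrors the panel A threshold (q-value ≤ 0.1). Panel C uses the stricter cutoff of q-value ≤ 0.01 for three asterisks.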