Fig. 2: Interpretation of cancer gene variants.

a, The MTBP classifies a given cancer gene variant as (putative) functionally relevant or neutral according to three distinct sources of evidence (named A, B and C here), or as of unknown relevance if none of these criteria is fulfilled. Note that the knowledge bases listed here are those integrated at the time of writing (refs. 4–10), but their usage may change depending on evolving needs and preferences. FI, functional impact; OncoKB-mut and OncoKB-biom refer to the biological and predictive relevance annotation of variants in OncoKB, respectively. b, Criteria supporting the variant functional classification are considered to provide strong (>0.9 certainty) or very strong (>0.99 certainty) evidence, as extrapolated from the work on variant pathogenicity classification (ref. 3), following the rationale described in the table. c, Aggregated knowledge base assertions (excluding those from population genetics data) at the time of writing. As expected given the different scope of each knowledge base and the long tail of rarely recurrent mutations, only a minority of the variants are curated in more than one knowledge base, which stresses the importance of aggregating them to provide a comprehensive annotation. d, Graphical summary of some of the criteria used to assume that a variant with a null consequence type disrupts the function of a given tumor suppressor (part of evidence type B; see a). These are largely based on established rules to identify loss-of-function variants in Mendelian genes (Methods). e, The lowest level of evidence for estimating the effect of a given variant is based on bioinformatics metrics. For variants that are not located in mutation hotspots, we decided to use the combined annotation dependent depletion (CADD) score (ref. 16) to estimate the functional relevance of missense mutations in tumor suppressor genes (TSGs), as functional impact predictions perform worse in other scenarios (data not shown). The method and associated thresholds were selected according to our own benchmarking, based on the performance observed for mutations with curated effects (upper violin plot) and in silico simulations (lower violin plot) (Methods). FN, false negative (given these thresholds); FP, false positive; TN, true negative; TP, true positive.
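
The TP, FP, TN and FN labels in panel e can be illustrated with a minimal sketch of how a score cutoff partitions variants with known effects. This is not the benchmarking procedure described in Methods: the function name `confusion_counts`, the single cutoff of 20.0 and the toy scores and labels are all hypothetical and were chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def confusion_counts(scored: Iterable[Tuple[float, bool]], threshold: float) -> Counter:
    """Count TP/FP/TN/FN for (score, is_functionally_relevant) pairs.

    A variant is predicted functionally relevant when score >= threshold.
    """
    counts = Counter({"TP": 0, "FP": 0, "TN": 0, "FN": 0})
    for score, is_relevant in scored:
        predicted_relevant = score >= threshold
        if predicted_relevant and is_relevant:
            counts["TP"] += 1
        elif predicted_relevant:
            counts["FP"] += 1
        elif is_relevant:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Made-up scores and labels (assumption: NOT taken from the figure).
toy_variants = [(31.0, True), (12.5, True), (27.0, False), (8.0, False), (24.0, True)]
result = confusion_counts(toy_variants, threshold=20.0)  # hypothetical cutoff
print(dict(result))
```

Repeating the count over a range of candidate cutoffs is one simple way to compare thresholds on a curated set, although the criteria actually used to select them are those given in Methods.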