Fig. 2: Discovery and validation of rare putative loss-of-function (pLoF) variants associating with somatic mutation components via a gene-based combined burden and variance test. | Nature Communications

From: The impact of rare germline variants on human somatic mutation processes

a Associations were identified in the discovery cohort (TCGA WES) and replicated in the validation cohort (PCAWG + Hartwig WGS). b Associations were tested with 15 models in total, combining 3 models of inheritance with 5 differently prioritized rare pLoF variant sets (all with population allele frequency <0.1%; PTVs, protein-truncating variants) (top). CADD, MTR, and CCR are different in silico variant prioritization tools. The combined test SKAT-O was applied, which computes a weighted sum of a burden test statistic and the SKAT variance test statistic, with weight ρ on the burden component. When ρ = 1, the test reduces to a burden test, and when ρ = 0, the test reduces to the variance (SKAT) test. A schematic (not actual data) shows how the two tests can result in contrasting outcomes (bottom). c Number of replicated hits at a false discovery rate (FDR) of 1% and 2% across cancer types and d across somatic mutational components. e Overlap in the number of genes replicating at an FDR of 1% and 2% via the two different dimensionality reduction methods. f Number of replicated hits at 1% and 2% FDR across models of inheritance (left) and overlap of replicated hits between models at a 1% FDR (right). g Number of replicated hits at 1% and 2% FDR across rare pLoF variant sets (left) and overlap of replicated hits between rare pLoF variant sets at a 1% FDR (right). h Distribution of ρ values from the SKAT-O test (x-axis) for the 207 hits that replicated at 1% FDR, in the discovery (gray) and validation (red) cohorts. i Distribution of SKAT-O ρ values (y-axis) for the 207 hits, in the discovery (top row) and validation (bottom row) cohorts, across models of inheritance (columns) and rare pLoF variant sets (x-axis). Data underlying panels c–i are provided as a Source Data file.
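
As a rough illustration of the combined statistic described in panel b, the following sketch computes Q_ρ = ρ·Q_burden + (1 − ρ)·Q_SKAT for a single gene from toy data. It is not the authors' pipeline: the function names, the unit variant weights, and the four-individual toy genotypes are all assumptions made for illustration, covariate adjustment is omitted, and the p-value step (in which SKAT-O selects the ρ with the smallest p-value over a grid) is left out entirely.

```python
def score_statistics(genotypes, residuals):
    """Per-variant scores S_j = sum_i G_ij * r_i (hypothetical helper)."""
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(n_variants)
    ]


def skat_o_statistic(genotypes, residuals, weights, rho):
    """Weighted sum of the burden and SKAT statistics for one value of rho.

    rho = 1 gives the burden statistic, rho = 0 gives the SKAT statistic.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    scores = score_statistics(genotypes, residuals)
    weighted = [w * s for w, s in zip(weights, scores)]
    q_burden = sum(weighted) ** 2             # collapse variants, then square
    q_skat = sum(ws ** 2 for ws in weighted)  # square each variant, then sum
    return rho * q_burden + (1.0 - rho) * q_skat


# Toy data (assumed, not from the study): rows are individuals,
# columns are two rare variants in one gene.
genotypes = [
    [1, 0],  # carrier of variant 1
    [0, 1],  # carrier of variant 2
    [0, 0],  # non-carrier
    [0, 0],  # non-carrier
]
weights = [1.0, 1.0]  # unit weights, an assumption for simplicity

# Phenotype residuals after subtracting the mean.
same_direction = [1.0, 1.0, -1.0, -1.0]     # both variants raise the phenotype
opposite_direction = [1.0, -1.0, 0.0, 0.0]  # variants act in opposite directions

for label, residuals in [("same", same_direction), ("opposite", opposite_direction)]:
    burden = skat_o_statistic(genotypes, residuals, weights, rho=1.0)
    skat = skat_o_statistic(genotypes, residuals, weights, rho=0.0)
    print(label, "burden:", burden, "SKAT:", skat)
```

In this toy setting the burden statistic is 4.0 when both variants act in the same direction but collapses to 0.0 when they act in opposite directions, whereas the SKAT statistic is 2.0 in both cases. This mirrors the kind of contrast the schematic in panel b is meant to convey.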
