Fig. 1: Overview and evaluation of the algorithm for SSCV detection.

a Splicing consequences of SSCVs. b Examples of read alignments around SSCVs, displayed in the Integrative Genomics Viewer, for the SSCVs in SDHA (ENST00000264932.11: c.1751C>T) (left) and PRKAR1A (ENST00000589228.6: c.892-129C>G) (right). Mismatched bases corresponding to the SSCVs are visible. c Overview of the juncmut procedure. d Overview of the evaluation of juncmut using genome and transcriptome sequence data from the 1000 Genomes Project and GTEx. e Comparative evaluation of juncmut against SpliceAI. We analyzed variants from the 1000 Genomes Project whole-genome sequencing data with SpliceAI scores exceeding each threshold (0.2, 0.5, and 0.8), as well as SSCVs identified by juncmut. We selected variants observed in at least one sample within the GTEx cohorts and, for each variant, calculated a combined p-value measuring the difference in abnormal splicing ratios between samples with and without the variant across various tissues (see “Methods” section for details). The combined p-value of each selected variant is plotted. The numbers of remaining SNVs with SpliceAI scores exceeding 0.2, 0.5, and 0.8 were 7923, 1498, and 336, respectively; additionally, 22 SNVs identified by juncmut were plotted. The boxplot summarizes the combined p-values. The ends of the boxes represent the lower and upper quartiles, the center line indicates the median, and the whiskers show the minimum and maximum values within 1.5 × the interquartile range (IQR) from the edges of the box. f An example of the relative abnormal splicing ratios in the presence (orange) and absence (gray) of SSCVs across various tissues, as measured using GTEx transcriptome data. See also Supplementary Fig. 2. e, f Source data are provided as a Source Data file.
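
The boxplot conventions described for panel e can be illustrated with a minimal sketch. This is not the authors' plotting code: the function name `boxplot_summary` and the toy input values are hypothetical, and the quartile interpolation method is an assumption, since the caption does not specify one.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return the quantities drawn in a Tukey-style boxplot.

    Hypothetical helper for illustration only. Quartiles use linear
    interpolation (method="inclusive"), which is an assumption.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Fences lie 1.5 x IQR beyond the edges of the box.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    # Whiskers end at the most extreme observations inside the fences.
    whisker_low = min(v for v in data if v >= lower_fence)
    whisker_high = max(v for v in data if v <= upper_fence)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


if __name__ == "__main__":
    # Toy values, not data from the figure.
    print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

With these toy values the whiskers stop at 1 and 8, and the value 100 falls outside the upper fence, so it would be drawn as an individual point rather than extending the whisker.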