Fig. 3: BeatAML Dataset Analysis.
From: A Bayesian model for unsupervised detection of RNA splicing based subtypes in cancers

a Heatmap showing the tile discovered by CHESSBOARD on the BeatAML dataset (samples = 477, LSVs = 2299). The signal tile (samples = 217, LSVs = 1910) is outlined in red. Note that although CHESSBOARD was run on junction-spanning read rates as input, the heatmap shows Ψ values to facilitate visualization.
b Heatmap of Ψ values in the Penn HTSC dataset (samples = 77, LSVs = 2299), showing reproducibility of the tiles originally identified by CHESSBOARD in the BeatAML dataset. The signal tile (samples = 32, LSVs = 1899) is outlined in red.
c Correlation between median(ΔΨ) in the BeatAML and HTSC datasets for the representative junction in each LSV belonging to the tile. The median(ΔΨ) value was computed between the two groups discovered by CHESSBOARD in both datasets. Correlation is measured using Pearson’s correlation coefficient (r), and the two-sided p value is the probability of observing a coefficient whose absolute value exceeds ∣r∣ under the exact null distribution.
d ENCODE-based analysis of possible tile regulators. The top bar plot shows the percentage of splice junctions in the tile (y-axis) that overlap with splice junctions in one of three categories associated with each RBP/SF (x-axis). DS (blue) is the set of junctions that are differentially spliced between knockout and control samples in ENCODE K562 cell lines. CLIP (orange) is the set of junctions that are bound by the RBP in a 250 bp region flanking the junction. The “Both” bar (green) represents junctions in the intersection of the DS and CLIP sets. The bottom bar plot shows whether the overlap is significant (Bonferroni-corrected cutoff) based on a one-sided Fisher’s exact test for enrichment. The red circles indicate whether the matching RBP/SF is differentially spliced (in at least one junction) and/or differentially expressed between the tiles, and whether it is a component of the spliceosome or a cis/trans-acting splice factor.
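The enrichment test named in panel d can be illustrated with a minimal sketch, not the authors' code. It assumes each RBP/SF contributes one 2×2 overlap table (tile junctions vs. a background set of junctions, in or out of the RBP-associated set) and that the Bonferroni cutoff is a family-wise α divided by the number of RBPs tested. The function names, the α of 0.05, the background definition, and the toy counts are all hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n_tile, n_rbp, n_background):
    """One-sided (enrichment) Fisher's exact test p-value.

    k            -- tile junctions that are also in the RBP-associated set
    n_tile       -- junctions in the tile
    n_rbp        -- background junctions in the RBP-associated set
    n_background -- all junctions considered (assumed background)

    Returns P(X >= k) for X ~ Hypergeometric(n_background, n_rbp, n_tile).
    """
    upper = min(n_rbp, n_tile)
    total = comb(n_background, n_tile)
    tail = sum(
        comb(n_rbp, i) * comb(n_background - n_rbp, n_tile - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant at alpha / (number of tests)."""
    cutoff = alpha / len(p_values)
    return {name: p <= cutoff for name, p in p_values.items()}


# Toy example with made-up counts: 4 of 5 tile junctions overlap an
# RBP-associated set of 5 junctions, out of 20 background junctions.
p_a = fisher_enrichment_p(k=4, n_tile=5, n_rbp=5, n_background=20)
flags = bonferroni_significant({"RBP_A": p_a, "RBP_B": 0.04})
print(p_a, flags)
```

Exact integer arithmetic via `math.comb` keeps the hypergeometric tail free of rounding error until the final division; a library routine such as a one-sided exact test in a statistics package would serve equally well.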
e mTORC1 GSEA: enrichment of the HALLMARK_MTORC1_SIGNALING gene set among genes ranked by log(likelihood gain) of LSVs, as performed with GSEA v. 4.1.0 and visualized with the fgsea R package.
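The running-sum statistic behind an enrichment plot like panel e can be sketched as follows. This is a simplified illustration of the classic weighted enrichment score, not the GSEA v. 4.1.0 or fgsea implementation: it omits permutation-based p values and score normalization, it assumes a weighting exponent of 1, and it takes one score per gene as given (how LSV-level log(likelihood gain) values map to genes is not specified in the caption). The function name and the toy gene labels are hypothetical.

```python
def enrichment_score(gene_scores, gene_set, weight=1.0):
    """Weighted running-sum enrichment score for a ranked gene list.

    gene_scores -- dict mapping gene -> ranking score
    gene_set    -- set of genes whose enrichment is tested
    weight      -- exponent applied to |score| for genes in the set (assumed 1)

    Returns (es, curve): es is the running-sum value with the largest
    absolute deviation from zero; curve is the running sum at each rank.
    """
    ranked = sorted(gene_scores, key=gene_scores.get, reverse=True)
    in_set = [g in gene_set for g in ranked]
    n_miss = in_set.count(False)
    hit_mass = sum(abs(gene_scores[g]) ** weight
                   for g, hit in zip(ranked, in_set) if hit)
    if hit_mass == 0 or n_miss == 0:
        raise ValueError("need set genes with nonzero scores and at least one non-set gene")

    running, es, curve = 0.0, 0.0, []
    for g, hit in zip(ranked, in_set):
        if hit:
            # Step up in proportion to the gene's score.
            running += abs(gene_scores[g]) ** weight / hit_mass
        else:
            # Step down by a constant so the curve ends at zero.
            running -= 1.0 / n_miss
        curve.append(running)
        if abs(running) > abs(es):
            es = running
    return es, curve


# Toy example: the two top-ranked genes form the gene set.
toy_scores = {"G1": 5.0, "G2": 4.0, "G3": 3.0, "G4": 2.0, "G5": 1.0}
es_top, curve_top = enrichment_score(toy_scores, {"G1", "G2"})
es_bottom, _ = enrichment_score(toy_scores, {"G4", "G5"})
print(es_top, es_bottom)
```

A positive score indicates that set members concentrate at the top of the ranking, a negative score that they concentrate at the bottom.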