Fig. 2: AS is Controlled by Splicing Factors (SFs).

A SF motifs significantly enriched around AS events. The position (R) relative to each event type is shown on the x axis, 5’ to 3’. SF binding motifs are listed on the y axis and were clustered using a sequence similarity network (SSN). Clustered SFs are shown in different colors. Each panel shows a single event type (A3, A5, MXE, ES, RI, respectively). Circle color and size both indicate the significance of each SF; red and larger circles mark the most significant SFs. Multiple regions were evaluated for each event type (bottom panel). Green boxes mark the exons that are the region of interest for each event type; gray boxes mark the upstream and downstream exons. Lines separated by / represent exons. Regions are numbered from R1 to R(n), where n is the total number of regions for each event type. Intronic regions are 250-bp flanking sequences, and exon regions are 50-bp sequences from the start or end of the exon, where splicing factors often bind. B Boxplots of SRSF1 expression (y axis) by MM stage (x axis) in three datasets (3 panels). Panels correspond to the datasets: IFM 2009 (left), Mayo Clinic (GSE6477) (center), and Arkansas (GSE5900 and GSE2658) (right). Colors correspond to MM (red), MGUS (green), SMM (purple), and NPC (blue). p values for each dataset are given at the top with the corresponding test names. C Hazard ratios (blue boxes; y axis) and 95% CIs (lines) for overall survival (OS) by SRSF1 expression in three datasets. p values were calculated using the Cox proportional hazards model; the summary table is on the right. D Boxplots of the number of exon skipping (ES) and MXE events (y axis) by SRSF1 expression (x axis). Patients with low SRSF1 expression (green boxes) have fewer splicing events (ES, MXE) than patients with high SRSF1 expression (orange boxes), as separated by the upper median. E Positional distribution of the SRSF1 binding motif.
The middle section of the figure shows the skipped exon (green box), the 3’ end of the upstream exon (left gray box), and the 5’ end of the downstream exon (right gray box). Lines represent the 250-bp intronic sequences flanking each exon shown. The top panel shows the mean motif score, calculated as the density within a 50-bp sliding window, i.e., the overall percentage of nucleotides covered by the SRSF1 binding motif. The black line indicates the background signal in this region. Red and blue lines show enrichment of SRSF1 binding motifs around exons more utilized (red) or less utilized (blue) by MM cells. The SRSF1 binding motif is shown above the panel. The panel at the bottom shows the -log10(p value) of the motif enrichment in these windows. Higher peaks indicate more significant p values for that region. Significance was determined by comparison to a ‘background set’ of 16765 exons without splicing changes (rMATS FDR > 50%) in expressed genes (bottom section).
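The motif score in panel E (percentage of nucleotides covered by the motif within a 50-bp sliding window, averaged over exons) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the use of a regular expression to represent the motif, and the toy motif `GGA` are all assumptions made for illustration, and the sequences are assumed to be pre-aligned and of equal length.

```python
import re


def motif_coverage(seq, motif_regex):
    """Return a 0/1 list marking every nucleotide covered by a motif match.

    `motif_regex` is a hypothetical stand-in for a binding motif; a real
    analysis would more likely score a position weight matrix.
    """
    covered = [0] * len(seq)
    # A lookahead with a capture group finds overlapping matches as well.
    pattern = re.compile(f"(?=({motif_regex}))")
    for match in pattern.finditer(seq):
        for i in range(match.start(1), match.end(1)):
            covered[i] = 1
    return covered


def sliding_window_percent(covered, window=50):
    """Percentage of covered nucleotides in each full window (step of 1)."""
    if len(covered) < window:
        return []
    total = sum(covered[:window])
    profile = [100.0 * total / window]
    for i in range(window, len(covered)):
        # Slide by one position: add the entering base, drop the leaving one.
        total += covered[i] - covered[i - window]
        profile.append(100.0 * total / window)
    return profile


def mean_profile(seqs, motif_regex, window=50):
    """Average the per-sequence window profiles position by position.

    Assumes all sequences have the same length (e.g. fixed-size regions
    extracted around each exon).
    """
    profiles = [
        sliding_window_percent(motif_coverage(s, motif_regex), window)
        for s in seqs
    ]
    return [sum(column) / len(column) for column in zip(*profiles)]


# Toy usage with a 5-bp window so the numbers are easy to follow.
toy_seqs = ["AAGGAGGAAA", "AAAAAAAAAA"]
print(mean_profile(toy_seqs, "GGA", window=5))
```

Computing such a profile separately for the exons more utilized in MM, the exons less utilized in MM, and the background exons would give three curves analogous to the red, blue, and black lines; the significance test comparing them to the background is not covered by this sketch.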