Extended Data Fig. 6: The strength of the 5′ss contributes to splicing and sensitivity to SRRM4.

(a) Scheme of the mutations introduced to decrease complementarity to U1 snRNA or increase complementarity to U6 snRNA. The top square represents the exon and its last three nts; the line represents the intron with the splice site “gu” and the following six nts. The sequence of U1 snRNA is provided, as well as the mutations performed to generate the variants named GT > GC (whereby position +2 of the intron is changed from U to C) and U1weak (whereby 9 nts involved in base pairing to U1 snRNA have been replaced by the weak 5′ss associated with exon HsaEX6033277). In addition, positions +5 to +9 (U6cons_a) or +5 to +8 (U6cons_b) were mutated to enhance complementarity to U6 snRNA (whose sequence is provided above). (b) The lines within the violin represent the interquartile range of the maximal entropy score for 5′ss (MAXENT5) for the LS and HS events for the WT constructs and their corresponding U6cons_b variants. The median is represented by the dashed line; each dot represents an event. Statistics: two-sided Mann-Whitney-Wilcoxon test with Bonferroni correction. (c) The lines represent the PSI across four experimental conditions (GFP or LOW, MID, HIGH expression of SRRM4) for LS (left) or HS (right) events that are either WT or harbor mutations enhancing complementarity to U6 snRNA (U6cons_a, top panel, or U6cons_b, bottom panel). Statistics: two-sided Mann-Whitney-Wilcoxon test. (d, e) The lines depict the PSI across four experimental conditions (GFP or LOW, MID, HIGH expression of SRRM4) for PLS3-HsaEX0048492 (d) and NRBP1-HsaEX0043752 (e). The nature of the variant (WT or extended U1cons (U1cons_a)) is indicated by the color. The corresponding RT-PCR assays are provided in Fig. 6e. (f) RT-PCR assays showing the splicing patterns of different minigenes under control condition (-) or under expression of human SRRM4 (+) in HEK 293 cells. Each minigene was generated using the sequences from the endogenous events (upstream/downstream exons and introns).
The variant name and the corresponding sequence of its 5′ss are indicated on top of the gel (last 3 exonic nucleotides | first 6 intronic nucleotides). The mutations were engineered to either increase or decrease the complementarity to U1 snRNA, and the corresponding maximal entropy score for 5′ss is reported (MAXENT5). The results correspond to a single replicate of the experiment. Each condition of DAAM1-HsaEX0018410 and apbb1-DreEX00138456/HsaEX0005055 was conducted in three biological replicates, while asap1-DreEX0015055/HsaEX0006155 was performed in four biological replicates, for which the means and standard deviations (std) are provided below each lane. (g, h) On the left side, the lines within the violin represent the interquartile range of the maximal entropy score for 5′ss (MAXENT5) for the LS and HS events for the WT constructs and their corresponding GU > GC (g) or U1weak (h) variants. Each dot represents an event. Statistics: two-sided Mann-Whitney-Wilcoxon test with Bonferroni correction. On the right side, the lines represent the PSI across four experimental conditions (GFP or LOW, MID, HIGH expression of SRRM4) for LS (left) or HS (right) events for WT and GU > GC (g) or WT and U1weak (h) variants. Statistics: two-sided Mann-Whitney-Wilcoxon test. (i) Alignment of ITSN1-HsaEX0032608 in 20 species spanning from primates to sauropsids for the last 93 nucleotides of the upstream intron, the microexon and 25 nucleotides of the downstream intron (as cloned in the context of the library). The level of conservation is depicted at the bottom by the black histogram, and the nucleotides are numbered from 1 to 133 at the top. The PSI values quantified across the four experimental conditions (GFP and LOW, MID, HIGH expression of SRRM4) are provided in the heatmap on the right side. The mutation from A to G in sus_scrofa, bos_taurus and tursiops_truncatus is indicated by the arrow.
(j) The heatmap represents the effect of “likely pathogenic SNP” (ClinVar) mutations across four experimental conditions (GFP and LOW, MID, HIGH expression of SRRM4) in the 5′ss of the HS event CASK-HsaEX0012545 at position 1 (rs1556000257 G > A) and in the 5′ss of the CS event SNAP25-HsaEX3077184 at position 2 (rs2123019421 T > G). (k) Distribution of ΔPSI (VAR-WT) values between variants and their corresponding WT in GFP vs MID, HIGH expression of SRRM4 for LS, HS and CS events and their corresponding shortened versions. Variants include deep mutagenesis of sequences involved in base pairing to U1 snRNA, including positions -3, -2, -1 at the 3′ end of the exon and positions 3 to 6 of the intron.
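
The ΔPSI (VAR-WT) quantity in panel (k) can be illustrated with a minimal sketch. It assumes PSI is expressed in percent and stored per condition. The function name `delta_psi`, the dictionary layout and the numeric values are hypothetical and are not taken from the study.

```python
# Minimal sketch of dPSI (VAR - WT) per condition.
# All PSI values below are hypothetical placeholders, not data from the study.

# Conditions compared in panel (k): GFP control versus MID and HIGH SRRM4 expression.
CONDITIONS = ("GFP", "MID", "HIGH")


def delta_psi(variant_psi, wt_psi):
    """Return PSI(variant) - PSI(WT) for each condition present in both inputs."""
    return {
        cond: variant_psi[cond] - wt_psi[cond]
        for cond in CONDITIONS
        if cond in variant_psi and cond in wt_psi
    }


# Hypothetical PSI values (percent) for one event and one 5'ss variant.
wt = {"GFP": 10.0, "MID": 55.0, "HIGH": 80.0}
var = {"GFP": 5.0, "MID": 20.0, "HIGH": 42.5}

result = delta_psi(var, wt)
print(result)
```

Under this convention, a negative ΔPSI means the variant is included less than its WT counterpart in that condition.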