Fig. 5: Cryptic splicing is common in published MPRAs. | Nature Communications


From: U-rich elements drive pervasive cryptic splicing in 3’ UTR massively parallel reporter assays


A Total splicing probability predicted by SpliceAI (ref. 37) positively correlates with the splicing fraction observed in the PTRE-seq library. B Designs of the evaluated 3’ UTR-focused MPRAs and the predicted effect of cryptic splicing in these libraries. C Percentage of reporters predicted to be spliced in the MPRAs described in (B). D Location of predicted splice donors (splicing probability > 0.30). E Top enriched sequence motif in reporter transcripts with > 0.60 splicing probability for each MPRA. F Reporter expression as a function of predicted splicing probability. For clarity, data points outside the expression range (−6, 3) are not shown. G Opposing differences in RNA expression between paired ref and alt single-nucleotide variants in the Griesemer MPRA correlate with predicted splicing. alt > ref and ref > alt denote variants in which the predicted splicing probability differs by 0.35 or more between the ref and alt alleles. ref ≈ alt denotes variants in which the difference in splicing probability is less than 0.35. Expression measurements from HEK293FT cells are shown (ref. 7). Data points outside the whiskers of the ref ≈ alt group are not shown. H Summary of RT-PCR validation of reporters from the Griesemer MPRA that were predicted to be spliced. See Figure S8 for raw data. I Example functional variant from the Griesemer MPRA that is explained by cryptic splicing. The bar plot shows mean relative expression measured by Griesemer (ref. 7), with error bars denoting standard error. SpliceAI probability is shown below, along with the sequence of the predicted 3’ splice site (underlined). The variant is highlighted in red. At right is RT-PCR analysis of the reporter individually transfected into HEK293T cells, showing PCR products from plasmid DNA, RNA, and a no-RT control. For (F, G, I), RNA expression was obtained from the corresponding MPRAs (mean across 3 replicates for Siegel, Zhao, and Fu, and 6 replicates for Griesemer). Box plots span the 25th and 75th percentiles, and the centers indicate the median. Whiskers extend to the furthest datum within 1.5× the interquartile range. P-values shown above plots were computed using two-sided Mann-Whitney U tests. Uncropped gels are provided in the Source Data file.
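
The grouping used in panel G and the whisker convention of the box plots can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, "alt > ref" is assumed to mean that the alt allele has the higher predicted splicing probability, and the whiskers are assumed to follow the usual Tukey convention (furthest datum within 1.5× the interquartile range of the box edges). Only the 0.35 threshold and the 1.5× factor come from the legend.

```python
from statistics import quantiles

# Threshold on the ref/alt difference in predicted splicing probability,
# as stated in the legend for panel G.
DELTA_THRESHOLD = 0.35


def classify_variant(ref_prob, alt_prob, threshold=DELTA_THRESHOLD):
    """Assign a ref/alt pair to one of the three panel G groups.

    Assumption: 'alt > ref' means the alt allele has the higher
    predicted splicing probability (hypothetical helper).
    """
    delta = alt_prob - ref_prob
    if delta >= threshold:
        return "alt > ref"
    if delta <= -threshold:
        return "ref > alt"
    return "ref ≈ alt"


def tukey_whiskers(values, k=1.5):
    """Return (lower, upper) whisker ends under the Tukey convention.

    Assumption: whiskers reach the furthest datum that still lies
    within k * IQR of the 25th and 75th percentiles.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return min(inside), max(inside)


if __name__ == "__main__":
    # Made-up probabilities and expression values, for illustration only.
    print(classify_variant(0.05, 0.70))
    print(classify_variant(0.20, 0.30))
    print(tukey_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

Points beyond the whisker ends returned by `tukey_whiskers` correspond to the data that the legend says are omitted for the ref ≈ alt group.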
