Fig. 3: Distribution of DNA G4s in the vicinity of splicing sites.

A Length density distribution of introns upstream and downstream from exons that are or are not flanked by G4s (two-sided Kolmogorov–Smirnov test p value <0.001). B Abundance enrichment of intron sizes at upstream and downstream splice sites flanked by G4s. A bin size of 10 bp was used, with the blue line representing an eighth-degree polynomial model. Error bands represent 95% confidence intervals of the regression model. C Heatmap for the relationship between splicing strength score, intron length and G4 presence in a local window of 100 nt within the splice site for the upstream and downstream introns. Red colour represents a high proportion of splice site regions with G4s, whereas blue colour represents depletion of G4s. D Fraction of splice sites with a G4, controlling for GC content between long and short introns. A chi-squared test was used to evaluate significant differences between short and long introns (* denotes p values <0.05 after multiple testing correction). E G4 motif enrichment relative to the splice site across exons in the gene body at the 3′ss and at the 5′ss for template and non-template strands. G4 motifs are enriched at both the 3′ss and the 5′ss across splice sites throughout the gene body. Exons were separated into first to fourth exons, middle exons and last four exons, and the distribution of G4s was studied individually for each category. Error bands represent 95% confidence intervals based on the binomial distribution.
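
The caption does not specify how the binomial 95% confidence intervals in panel E were computed. As a minimal sketch, assuming a Wilson score interval (one common choice for proportions, not necessarily the one used here), the error band at a single position could be obtained as follows. The function name and the counts are hypothetical and serve only as illustration.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: z = 1.96 corresponds to an approximate 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    denom = 1 + z ** 2 / total
    centre = (p_hat + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: 120 of 1000 splice sites carry a G4 at a given position.
low, high = wilson_interval(successes=120, total=1000)
print(f"proportion = {120 / 1000:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Applying the function to the G4 count at each position relative to the splice site would give the lower and upper edges of an error band under this assumption.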