Extended Data Figure 9: Cryptic elements are frequent in long first introns.

a, UCSC-annotated isoforms of the OPCML gene together with spliced expressed sequence tags (ESTs) detected across the OPCML locus. The recursive exon is marked in blue, and the preceding exons produced by a minor promoter or by cryptic splicing of the long first intron are marked in red. b, Lengths of the nine introns containing the high-confidence RS-sites compared to other introns across vertebrates. This panel extends Fig. 4g. c, Boxplot showing the number of detected unannotated alternative start exons that splice to the dominant second exon of brain-expressed genes. Only novel junctions that do not match UCSC/GENCODE transcripts are considered. Genes are separated into bins based on the first intron length of the canonical isoform. The boxplot presents the median and the first and third quartile boundaries for each bin. Red diamonds indicate the mean value for each bin. *P < 10⁻¹⁰ (Mann–Whitney U test). Only tests between the 100 kb+ bin and the other bins are shown. The right panel shows a cartoon illustrating the implications of the boxplot results.
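The comparison in panel c can be sketched as follows. This is a minimal illustration, not the authors' pipeline. The bin edges below the 100 kb+ bin, the function names, and the toy counts are all assumptions made for the example. The test uses the normal approximation with tie correction and no continuity correction, which may differ from the implementation used for the figure.

```python
import math
from collections import defaultdict

# Hypothetical bin edges in base pairs. Only the "100 kb+" bin is named
# in the legend; the lower bins are assumed for illustration.
BIN_EDGES = [
    (0, 1_000, "<1 kb"),
    (1_000, 10_000, "1-10 kb"),
    (10_000, 100_000, "10-100 kb"),
    (100_000, math.inf, "100 kb+"),
]


def bin_label(first_intron_length):
    """Return the length-bin label for a first intron length in bp."""
    for lo, hi, label in BIN_EDGES:
        if lo <= first_intron_length < hi:
            return label
    raise ValueError("intron length must be non-negative")


def group_by_bin(genes):
    """Group novel start-exon counts by first-intron length bin.

    genes: iterable of (first_intron_length_bp, n_novel_start_exons).
    """
    bins = defaultdict(list)
    for length, count in genes:
        bins[bin_label(length)].append(count)
    return bins


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t**3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values tied: no evidence of a difference
    z = (u_x - n1 * n2 / 2) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))


# Toy data, invented purely to exercise the functions.
toy_genes = [(500, 0), (5_000, 0), (50_000, 1), (60_000, 0),
             (150_000, 3), (300_000, 2), (800_000, 4)]
bins = group_by_bin(toy_genes)
long_bin = bins["100 kb+"]
for label, counts in bins.items():
    if label != "100 kb+":
        u, p = mann_whitney_u(long_bin, counts)
        print(label, u, p)
```

As in the legend, only the 100 kb+ bin is tested against each of the other bins. With real data, each bin would hold counts for many genes, and the normal approximation would be more appropriate than it is for the toy input here.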