Fig. 7: De novo secondary structure feature prediction for S. cerevisiae and the Saccharomyces genus.
From: Comprehensive analysis of Saccharomyces cerevisiae intron structures in vivo

a, Workflow for comparisons between introns’ secondary structure ensembles and those of control sequences. b, Comparison of secondary structure feature enrichment between introns and control sequences for DMS-guided structure prediction (left; with folding engine RNAstructure, comparing introns with shifted genomic controls), de novo MFE structure prediction (middle; with folding engine RNAstructure, comparing introns with shifted genomic controls), and de novo ensemble structure prediction (right; with folding engine Vienna 2.0, comparing introns with shuffled sequence controls). *P < 0.01 by two-sided Wilcoxon rank-sum test. Exact P values from left to right are as follows for the left panel: P < 1 × 10⁻⁴, P = 0.0003, P = 0.0007, P < 1 × 10⁻⁴, P < 1 × 10⁻⁴; for the middle panel: P < 1 × 10⁻⁴, P = 0.0072, P = 0.0008, P < 1 × 10⁻⁴, P = 0.0015; and for the right panel: P < 1 × 10⁻⁴, P = 0.0003, P = 0.0011, P < 1 × 10⁻⁴, P < 1 × 10⁻⁴. In the left and middle panels, 140 introns are compared; 288 introns are compared in the right panel. MFE, minimum free energy; SS, splice site; BP, branch point; intron − control score, difference between intron and control score. c, Differences in zipper stem (top) and downstream stem (bottom) ΔG between introns and shuffled sequence controls for introns in the Saccharomyces genus, using Vienna 2.0 ensemble predictions. *P < 0.01 by two-sided Wilcoxon rank-sum test. All P values for the zipper stem comparisons are <1 × 10⁻⁴.
For each species, the number of introns compared for both stem types and the P value for the downstream stem comparison are as follows: smik (n = 279, P = 0.00150), skud (n = 279, P = 0.21), suva (n = 278, P = 0.07), cgla (n = 100, P < 1 × 10⁻⁴), kafr (n = 216, P = 0.83), knag (n = 175, P = 0.011), ncas (n = 250, P = 0.10), ndai (n = 218, P = 0.00056), tbla (n = 163, P = 0.0017), tpha (n = 143, P = 0.0005), kpol (n = 175, P < 1 × 10⁻⁴), zrou (n = 166, P = 0.28), tdel (n = 202, P = 0.58), klac (n = 151, P = 0.0026), agos (n = 185, P = 0.26), ecym (n = 19, P = 0.41), sklu (n = 229, P = 0.27), kthe (n = 215, P = 0.31) and kwal (n = 210, P = 0.011). d, Distribution of zipper stems across introns in the Saccharomyces genus. Green values on the heatmap indicate a predicted zipper stem; white indicates no predicted zipper stem; gray values indicate deleted introns. Ohnologous introns are combined into a single row, and a zipper stem is annotated if present in either homolog. The species represented in this figure are: Eremothecium gossypii (agos), Candida glabrata (cgla), Eremothecium cymbalariae (ecym), Kazachstania africana (kafr), Kluyveromyces lactis (klac), Kazachstania naganishii (knag), Vanderwaltozyma polyspora (kpol), Lachancea thermotolerans (kthe), Lachancea waltii (kwal), Naumovozyma castellii (ncas), Naumovozyma dairenensis (ndai), Saccharomyces kudriavzevii (skud), Saccharomyces mikatae (smik), Saccharomyces uvarum (suva), Tetrapisispora blattae (tbla), Torulaspora delbrueckii (tdel), Torulaspora phaffii (tpha) and Zygosaccharomyces rouxii (zrou). In box plots, the median is the center white point, box limits are the 25th (Q1) and 75th (Q3) percentiles, and whiskers extend to the smallest and largest values that fall within 1.5 times the interquartile range below Q1 and above Q3.
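
The box plot convention described above can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `box_plot_stats`, the sample values and the choice of linear ("inclusive") quartile interpolation are all assumptions made for illustration, since the legend does not state how the percentiles were interpolated.

```python
import statistics


def box_plot_stats(values):
    """Summarize values using the box plot convention in the legend.

    Assumption: quartiles use linear ("inclusive") interpolation; the
    legend does not specify the interpolation method.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observed values inside the fences.
    whisker_low = min(v for v in data if v >= lower_fence)
    whisker_high = max(v for v in data if v <= upper_fence)
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        "outliers": outliers,
    }


# Hypothetical values, not data from the figure.
example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_plot_stats(example)
print(stats)
```

In this made-up example the value 100 lies above Q3 + 1.5 × IQR, so the upper whisker ends at 9, the largest observation inside the fence.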