Fig. 2: PAL-AI learned sequence elements and contextual features important for tail-length control during oocyte maturation.

a PAL-AI-predicted consequences of motif loss. For each 6-mer, the mean difference in predicted tail-length change, comparing mutants and wild-type, is plotted as a function of the number of analyzed mutants. CPE- or PAS-related motifs are indicated (red and blue, respectively).

b Sequence motifs associated with PAL-AI-predicted changes in tail length. Sequence logos were generated from 8-mers most associated with the largest differences in PAL-AI-predicted tail-length change upon 8-mer loss. Pie charts indicate fractions of 8-mers aligned to logos. Bar plots show mean differences; points represent individual 8-mers.

c Top 6-mers associated with decreased predicted tail-length change from in silico mutagenesis, selected iteratively, with exclusion. Motif colors: CPE (red), PAS (blue), GUU/UGU (orange), others (black). Error bars, standard error.

d Positional effects of 3-mers flanking a CPE. Plotted for each 3-mer are differences in mean tail length for mRNAs with that 3-mer at indicated positions relative to a CPE in the N60-PASmos library (ref. 14). The gray box indicates positions where 3-mers impact the CPE context (ref. 14).

e PAL-AI-predicted positional effects of inserting CPE and PAS along the 3′ UTR. Plotted are mean differences in predicted tail-length change. The drop near position 17 reflects PAS disruption (Supplementary Fig. 3e). Shaded areas, standard error. Inset, last 100 nt of the 3′ UTR.

f Predicted effects of CPE-PAS spacing. Mean differences in PAL-AI-predicted tail-length change conferred upon inserting a CPE in silico are plotted as a function of the relative distance between CPE and PAS. Shaded areas, standard error. Gray box, CPE-PAS overlapping positions.

g PAL-AI-predicted effects of single-nucleotide substitutions in the mos.L 3′ UTR. The heatmap indicates the difference in predicted tail-length change (DPTLC) for each substitution (x, original; y, alternative). Line plots indicate maximum (red) and minimum (blue) mutational outcomes at each position. The logo plot indicates the importance of each nucleotide, with the height normalized to the negative value of the average outcome of three possible substitutions. Dashed rectangles, CPE and PAS. The arrow points to an instance of a new CPE, generated by a G-to-U substitution, and its associated increase in tail-length change.

h PAL-AI-predicted effects of single-nucleotide substitutions in the tpx2.L 3′ UTR, plotted as in (g). The solid arrow points to an example of a substitution to a more optimal PAS-flanking nucleotide; the dashed arrows point to optimal CPE-flanking nucleotides (ref. 14).
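The quantities plotted in panels g and h can be illustrated with a minimal sketch of a single-nucleotide substitution scan. Everything named here is an assumption made for illustration: `toy_predict` is a hypothetical stand-in that simply counts CPE-like (`UUUUA`) and PAS-like (`AAUAAA`) motifs and is not PAL-AI, the example sequence is invented, and the importance value is left unscaled because the caption does not specify the normalization used for the logo heights.

```python
from typing import Callable, Dict, List

NUCLEOTIDES = "ACGU"


def toy_predict(seq: str) -> float:
    """Hypothetical stand-in for a trained predictor (NOT PAL-AI).

    Returns one unit of predicted tail-length change per CPE-like UUUUA
    and per PAS-like AAUAAA (non-overlapping matches, via str.count).
    """
    return float(seq.count("UUUUA") + seq.count("AAUAAA"))


def substitution_scan(
    seq: str, predict: Callable[[str], float] = toy_predict
) -> List[Dict]:
    """Score every single-nucleotide substitution against the wild type."""
    wild_type = predict(seq)
    rows = []
    for pos, original in enumerate(seq):
        deltas = {}
        for alt in NUCLEOTIDES:
            if alt == original:
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            # Difference in predicted tail-length change (heatmap cell).
            deltas[alt] = predict(mutant) - wild_type
        values = list(deltas.values())
        rows.append({
            "pos": pos,
            "original": original,
            "deltas": deltas,
            "max": max(values),  # best-case substitution (red line)
            "min": min(values),  # worst-case substitution (blue line)
            # Negative mean of the three substitutions (logo height, unscaled).
            "importance": -sum(values) / len(values),
        })
    return rows


# Invented sequence: one CPE-like motif, one GUUUA, one PAS-like motif.
rows = substitution_scan("CUUUUACCGUUUACAAUAAAC")
for row in rows:
    print(row["pos"], row["original"], row["deltas"], round(row["importance"], 3))
```

In this toy sequence, the G-to-U substitution at index 8 turns `GUUUA` into `UUUUA` and so gains one unit, loosely mirroring the new-CPE example marked by the arrow in panel g, while positions inside the existing motifs receive positive importance because every substitution there removes a motif.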