Extended Data Fig. 6: Performance and interpretation of machine learning models trained exclusively on one MPIVA RNA sequence context. | Nature Structural & Molecular Biology

From: The anticancer compound JTE-607 reveals hidden sequence specificity of the mRNA 3′ processing machinery

(a) Scatter plots of L3-only model performance in predicting drug sensitivity at three Compound 2 doses on L3 test sequences (upper) and SVL test sequences (lower). The test set includes an equal number of sequences derived from the L3 and SVL RNA contexts. (b) Average, over all layer 1 filters of the L3-only model, of the absolute value of the Pearson correlation with 12.5 μM Compound 2 predictions, plotted across all positions. Correlations are shown separately for resistant, negative, and all 12.5 μM Compound 2 predictions. Dashed gray lines indicate positions at the edge of sequence padding. (c) Convolutional layer 1 filters of the L3-only model whose maximum activations have the highest Pearson correlation with 12.5 μM Compound 2 predictions. Sequence logos are plotted on top of the per-position absolute value of the Pearson correlation with 12.5 μM Compound 2 sensitivity predictions. Pearson correlations for filters that begin at the canonical cut site in the SVL context are marked; note that preceding filters may overlap with the designed canonical cut sites. (d) Same analyses as in panel a, but for the SVL-only model. (e) Same analyses as in panel b, but for the SVL-only model. (f) Same analyses as in panel c, but for the SVL-only model.
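The per-position quantity described for panels b and e can be illustrated with a minimal sketch. It assumes the layer 1 activations are available as an array of shape (sequences, filters, positions) and the model predictions as a vector of length (sequences); the function name, array layout, and the zero-variance handling are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mean_abs_pearson_by_position(activations, predictions):
    """Average |Pearson r| between filter activations and predictions.

    activations: array of shape (n_sequences, n_filters, n_positions)  (assumed layout)
    predictions: array of shape (n_sequences,)
    Returns an array of shape (n_positions,).
    """
    acts = np.asarray(activations, dtype=float)
    preds = np.asarray(predictions, dtype=float)

    # Center both variables across sequences.
    acts_c = acts - acts.mean(axis=0, keepdims=True)
    preds_c = preds - preds.mean()

    # Pearson r for every (filter, position) pair.
    cov = np.einsum("sfp,s->fp", acts_c, preds_c)
    denom = np.sqrt((acts_c ** 2).sum(axis=0) * (preds_c ** 2).sum())

    # Assumption: pairs with zero variance are assigned r = 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.where(denom > 0, cov / denom, 0.0)

    # Absolute value first, then average over filters.
    return np.abs(r).mean(axis=0)
```

Under the same assumptions, restricting the calculation to one class of predictions (for example, only the resistant ones) would amount to applying a boolean mask along the sequence axis of both inputs before calling the function.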
