Fig. 2
From: UTR-DynaPro: a CNN–transformer multimodal language model for decoding 5′UTR regulatory mechanisms

Conceptual framework of the UTR-DynaPro model for comprehensive prediction and rational design of 5′UTR functions. Endogenous data from multiple species, human cell line/tissue datasets, and independent test sets serve as input. The encoder integrates a mixture-of-experts gating module with parallel Transformer and CNN branches to capture global dependencies and local sequence features, respectively, followed by dynamic fusion and a nonlinear transformation. The model is pre-trained with minimum free energy (MFE) and masked nucleotide (MN) objectives, then fine-tuned on downstream tasks, including mean ribosome load (MRL), translation efficiency (TE), and expression level (EL). Model performance is evaluated with multiple statistical metrics (e.g., Pearson R, Spearman R, RMSE), and the checkpoint from the best epoch is saved.
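To make the fusion step concrete, below is a minimal PyTorch sketch of a dual-branch encoder in the spirit of the figure: a Transformer branch for global dependencies, a CNN branch for local motifs, and a learned per-position gate that dynamically fuses the two. This is an illustrative assumption, not the authors' implementation; all class names, layer sizes, and the simple two-way softmax gate (standing in for the paper's mixture-of-experts module) are hypothetical.

```python
# Minimal sketch (not the authors' code) of a dual-branch encoder with
# gated dynamic fusion, as described in the figure caption.
import torch
import torch.nn as nn

class DualBranchEncoder(nn.Module):
    def __init__(self, vocab_size=5, d_model=128, n_heads=4,
                 n_layers=2, kernel_size=7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Transformer branch: captures global (long-range) dependencies.
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        # CNN branch: captures local sequence features (short motifs).
        self.cnn = nn.Sequential(
            nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2),
            nn.GELU(),
            nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2),
        )
        # Per-position gate over the two branches; a simple stand-in for
        # the mixture-of-experts gating module in the figure.
        self.gate = nn.Linear(2 * d_model, 2)
        # Nonlinear transformation applied after dynamic fusion.
        self.out = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU())

    def forward(self, tokens):            # tokens: (batch, seq_len) int IDs
        x = self.embed(tokens)            # (batch, seq_len, d_model)
        glob_feat = self.transformer(x)   # global-dependency branch
        loc_feat = self.cnn(x.transpose(1, 2)).transpose(1, 2)  # local branch
        # Softmax gate weights each branch per position, then fuses them.
        w = torch.softmax(self.gate(torch.cat([glob_feat, loc_feat], dim=-1)), dim=-1)
        fused = w[..., :1] * glob_feat + w[..., 1:] * loc_feat
        return self.out(fused)

# Example: encode a batch of two tokenized 5'UTR sequences of length 50.
enc = DualBranchEncoder()
h = enc(torch.randint(0, 5, (2, 50)))
print(h.shape)  # torch.Size([2, 50, 128])
```

A softmax gate over branch outputs is about the simplest form of dynamic fusion; the actual model may route among more experts or weight them differently.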