Fig. 2: RNA generative modeling with RBM and experimental validation. | Nature Communications

From: Designing molecular RNA switches with Restricted Boltzmann machines

A A restricted Boltzmann machine (RBM), with a visible layer carrying nucleotides A, C, G, U, or "-" (the alignment gap symbol), and a hidden layer extracting features. The two layers are connected by weights.

B The RBM is trained by maximizing a regularized likelihood, see Eq. S4. One gradient term increases the probability of regions of sequence space populated by data, automatically discovering features desirable for functional sequences (blue), while an opposite gradient term lowers the probability of regions void of data (red). The RBM may also assign large probability to potentially interesting sequences not covered by data (teal).

C The model can be sampled to generate novel sequences that may differ significantly from the natural ones (teal).

D Hidden units extract latent features (nucleic-acid motifs) through the weights. Positive and negative weight values are shown above and below the zero-weight horizontal bar in the logo plots, respectively, see Methods. Combining these motifs allows the RBM to design functional RNA sequences.

E The RBM is able to model complex interactions along the RNA sequence. Here, a hidden unit interacting with three visible units is highlighted. After marginalizing over hidden-unit configurations, effective interactions arise between the visible sites, see Eq. (5). The schematic shows a three-body interaction arising from the three weights onto the marginalized hidden unit.

F Designed sequences are tested experimentally with chemical probing approaches. The reactivity of a site to the probes may differ depending on whether SAM is absent or present (top); the difference in reactivities between the two conditions is informative about structural changes (bottom).

G Distributions of reactivities obtained with SHAPE-MaP differ slightly between paired and unpaired nucleotides. Statistical resolution of the global structural changes triggered by SAM can then be enhanced by aggregating multiple sites. Inset: distributions over 24 sites, see Methods, section “Statistical analysis of reactivities” and Supplementary Figs. S25, S39.
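
The marginalization described in panel E can be illustrated with a small numerical sketch. It assumes binary (0/1) hidden units and randomly drawn toy parameters; the hidden-unit potential used in the article and the exact form of its Eq. (5) are not given in this caption, so the names `g`, `w`, `theta` and the functions below are hypothetical. Under that assumption, summing over hidden configurations yields a closed-form effective energy on the visible sequence, which the code checks against explicit enumeration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

ALPHABET = "ACGU-"          # four nucleotides plus the alignment gap symbol
N_SITES, N_HIDDEN = 6, 3    # toy sizes, chosen only for illustration
Q = len(ALPHABET)
SITES = np.arange(N_SITES)

# Hypothetical parameters (random, not fitted to any data):
# g[i, a]      visible field for symbol a at site i
# w[mu, i, a]  weight between hidden unit mu and symbol a at site i
# theta[mu]    bias of hidden unit mu
g = rng.normal(scale=0.5, size=(N_SITES, Q))
w = rng.normal(scale=0.5, size=(N_HIDDEN, N_SITES, Q))
theta = rng.normal(scale=0.5, size=N_HIDDEN)


def encode(seq):
    """Map a sequence string to an array of symbol indices."""
    return np.array([ALPHABET.index(c) for c in seq])


def hidden_inputs(v):
    """Total input to each hidden unit: bias plus summed weights."""
    return theta + w[:, SITES, v].sum(axis=1)


def joint_energy(v, h):
    """E(v, h) for binary hidden units (an assumption of this sketch)."""
    return -g[SITES, v].sum() - np.dot(h, hidden_inputs(v))


def effective_energy(v):
    """Closed form of -log sum_h exp(-E(v, h)) for binary hidden units."""
    return -g[SITES, v].sum() - np.logaddexp(0.0, hidden_inputs(v)).sum()


def brute_force_effective_energy(v):
    """Same quantity, by enumerating all 2**N_HIDDEN hidden configurations."""
    energies = np.array([
        joint_energy(v, np.array(h))
        for h in itertools.product([0, 1], repeat=N_HIDDEN)
    ])
    return -np.logaddexp.reduce(-energies)


v = encode("ACGU-A")
print(effective_energy(v), brute_force_effective_energy(v))
```

In this sketch each hidden unit contributes a term that is nonlinear in its summed input, so the sites it connects to are no longer independent once the unit is summed out. A hidden unit with nonzero weights onto three sites therefore couples all three at once, which is the kind of three-body interaction the panel depicts.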
