Figure 2
From: Discovery of novel chemical reactions by deep generative recurrent neural network

The modeling workflow for generating new reactions consists of five main steps: (1) training a sequence-to-sequence autoencoder on the USPTO database of chemical reactions; (2) building a Generative Topographic Map (GTM) from the autoencoder latent variables and preparing the GTM class landscape; (3) selecting a zone on the GTM populated by Suzuki coupling reactions and identifying the related autoencoder latent vectors; (4) sampling from the autoencoder latent space and generating new reactions; and (5) post-processing. On the Generative Topographic Map, higher transparency corresponds to lower data density. The color code renders the binary (Suzuki vs. other) reaction class distribution: zones in dark blue are populated exclusively by Suzuki reactions, zones in dark red are populated exclusively by other reaction types, and intermediate colors correspond to areas of reaction space hosting both categories in various ratios. The red circle indicates the zone from which virtual Suzuki reactions were sampled.
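
As a rough illustration of how a binary class landscape of this kind can be computed, the Python sketch below assumes that each reaction has already been projected onto the map as a vector of responsibilities over the map nodes (one row per reaction, each row summing to 1). The function name `class_landscape`, the toy data, and the linear scaling of density to opacity are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def class_landscape(resp, is_suzuki, eps=1e-12):
    """Per-node Suzuki fraction and opacity for a binary class landscape.

    resp      : (n_reactions, n_nodes) responsibilities, rows sum to 1
    is_suzuki : (n_reactions,) boolean labels (True = Suzuki, False = other)
    Returns (fraction, opacity), each of shape (n_nodes,).
    """
    resp = np.asarray(resp, dtype=float)
    labels = np.asarray(is_suzuki, dtype=float)

    # Cumulated responsibility per node acts as a data-density estimate.
    density = resp.sum(axis=0)

    # Responsibility mass contributed by Suzuki reactions only.
    suzuki_mass = labels @ resp

    # Fraction of Suzuki reactions per node; NaN where the node is empty.
    fraction = np.where(density > eps,
                        suzuki_mass / np.maximum(density, eps),
                        np.nan)

    # Assumed linear scaling: low density -> low opacity (high transparency).
    peak = density.max()
    opacity = density / peak if peak > 0 else np.zeros_like(density)
    return fraction, opacity

# Toy data: 4 reactions projected onto 3 map nodes (hypothetical values).
resp = np.array([[1.0, 0.0, 0.0],
                 [0.5, 0.5, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.5, 0.5]])
is_suzuki = np.array([True, True, False, False])

fraction, opacity = class_landscape(resp, is_suzuki)
print(fraction)  # Suzuki fraction per node
print(opacity)   # relative density per node
```

In this sketch a fraction near 1 would be drawn in dark blue (Suzuki only), a fraction near 0 in dark red (other reactions only), and intermediate values in intermediate colors, with `opacity` controlling how transparent each node appears.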