Fig. 2: Illustration of the generation of synthetic data using RDChiral and synthetic data. | Nature Communications

From: RSGPT: a generative transformer model for retrosynthesis planning pre-trained on ten billion datapoints

a Method for generating synthetic data using RDChiral. Molecules from PubChem (ref. 40), ChEMBL (ref. 41), and Enamine (ref. 42) were fragmented into submolecules. Each submolecule was then matched against the reaction centers of the templates, as shown in the grey-shaded region, and a complete reaction was generated from the corresponding template by concatenating the reactant SMILES and product SMILES into a single reaction text. b Examples of synthetic data. Case 1 is a coupling reaction, and Case 2 is a nucleophilic substitution reaction. The template, reaction SMILES, and visualized reaction scheme are displayed for each example.
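The final step in panel a, joining reactant and product SMILES into one reaction text, can be sketched as follows. This is a minimal illustration that assumes the standard reaction SMILES layout (`reactants>>products`, with molecules on the same side separated by `.`). The helper name `to_reaction_smiles` and the example molecules are hypothetical and are not taken from the figure or from the authors' code. Fragmentation and template matching with RDChiral are not shown.

```python
from typing import Sequence


def to_reaction_smiles(reactants: Sequence[str], products: Sequence[str]) -> str:
    """Join reactant and product SMILES into a single reaction SMILES string.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if not reactants or not products:
        raise ValueError("need at least one reactant and one product")
    # Molecules on the same side are separated by '.'; the empty field
    # between the two '>' characters means that no agents are listed.
    return ".".join(reactants) + ">>" + ".".join(products)


# Illustrative molecules only (assumed for this sketch, not from the figure):
# bromomethane and phenol on the reactant side, anisole as the product.
reactants = ["CBr", "Oc1ccccc1"]
products = ["COc1ccccc1"]

reaction = to_reaction_smiles(reactants, products)
print(reaction)  # CBr.Oc1ccccc1>>COc1ccccc1
```

In a full pipeline, the reactant and product lists would come from applying a matched template to a submolecule rather than being written by hand.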
