Fig. 2: Conditional generation with SmileyLlama for fragment growth and before and after DPO compared with ChEMBL. | Nature Computational Science

From: SmileyLlama: modifying large language models for directed chemical space exploration

a, Example molecules generated by growing one of the Enamine substructures while satisfying Lipinski’s rule-of-five, using the prompt ‘Output a SMILES string for a drug like molecule with the following properties: a substructure of O=C(O)c1ccc(C(F)(F)F)cc1, <= 500 MW, <= 5 logP, <= 5 H-bond donors, <= 10 H-bond acceptors’. b, Distributions of the four Lipinski rule-of-five properties for ChEMBL molecules (orange), molecules generated by SmileyLlama (blue) with the prompt ‘Output a SMILES string for a drug like molecule with the following properties: <= 5 H-bond donors, <= 10 H-bond acceptors, <= 500 MW, <= 5 logP’ and 1,000 molecules generated by SmileyLlama with the same prompt after DPO (gray). The MW and logP distributions were estimated using a Gaussian kernel density estimator (KDE). Each generation run produced 1,000 molecules at a temperature of T = 1.0 with a maximum of 128 new tokens.
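The two analysis steps named in the caption, checking the rule-of-five thresholds and smoothing a property distribution with a Gaussian KDE, can be illustrated with a minimal sketch. It is not the authors' code. It assumes the four descriptors have already been computed for each molecule, uses made-up descriptor values, and picks the bandwidth with the normal-reference rule of thumb because the caption does not state one. The names `LIPINSKI_LIMITS`, `satisfies_rule_of_five` and `gaussian_kde` are hypothetical.

```python
import math
import statistics

# Upper limits taken from the prompt quoted in the caption.
LIPINSKI_LIMITS = {"mw": 500.0, "logp": 5.0, "hbd": 5, "hba": 10}


def satisfies_rule_of_five(props):
    """Return True if every descriptor is at or below its limit."""
    return all(props[key] <= limit for key, limit in LIPINSKI_LIMITS.items())


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples`.

    The default bandwidth is the normal-reference rule of thumb,
    1.06 * sigma * n**(-1/5). This is an assumption: the caption
    does not say which bandwidth was used.
    """
    n = len(samples)
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centred on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up descriptor values for illustration only, not data from the paper.
molecules = [
    {"mw": 190.1, "logp": 2.4, "hbd": 1, "hba": 2},
    {"mw": 342.7, "logp": 3.9, "hbd": 2, "hba": 5},
    {"mw": 512.3, "logp": 5.6, "hbd": 3, "hba": 8},
]

# Keep only molecules within all four limits.
passing = [m for m in molecules if satisfies_rule_of_five(m)]

# Smooth density estimate over molecular weight.
mw_density = gaussian_kde([m["mw"] for m in molecules])
```

In practice the descriptors would come from a cheminformatics toolkit applied to each generated SMILES string, and the density function would be evaluated on a grid of MW or logP values to draw curves like those in panel b.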
