Image-based generation for molecule design with SketchMol

Wang, Zixu; Chen, Yangyang; Ma, Pengsen; Yu, Zhou; Wang, Jianmin; Liu, Yuansheng; Ye, Xiucai; Sakurai, Tetsuya; Zeng, Xiangxiang

doi:10.1038/s42256-025-00982-3

Article
Published: 13 February 2025


Nature Machine Intelligence volume 7, pages 244–255 (2025)


Abstract

Efficient molecular design methods are crucial for accelerating early-stage drug discovery, potentially saving years of development time and billions of dollars in costs. Current molecular design methods rely on sequence-based or graph-based representations, emphasizing local features such as bonds and atoms but lacking a comprehensive depiction of the overall molecular topology. Here we introduce SketchMol, an image-based molecular generation framework that combines visual understanding with molecular design. SketchMol leverages diffusion models and applies a refinement technique called reinforcement learning from molecular experts to improve the generation of viable molecules. It creates molecules through a painting-like approach that simultaneously depicts the local structures and the global layout of the molecule. By visualizing molecular structures, various design tasks are unified within a single image-based framework. De novo design becomes sketching new molecular images, whereas editing tasks become filling in partially drawn images. Through extensive experiments, we demonstrate that SketchMol effectively handles a variety of molecular design tasks.
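To make the two image operations named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a standard denoising diffusion forward process with a linear noise schedule (the generic formulation, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε) and a toy 4×4 grid standing in for a molecular image. The function names `linear_alpha_bar`, `noise_image` and `apply_mask`, the schedule values and the toy data are all hypothetical choices made for illustration.

```python
import math
import random


def linear_alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) over the first t steps of a linear schedule."""
    alpha_bar = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        alpha_bar *= 1.0 - beta
    return alpha_bar


def noise_image(image, t, rng):
    """Forward diffusion step: mix each pixel with Gaussian noise according to step t."""
    a = linear_alpha_bar(t)
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [[signal * px + noise * rng.gauss(0.0, 1.0) for px in row] for row in image]


def apply_mask(image, mask, fill=0.0):
    """Blank the pixels where mask is 1; a generative model would repaint these."""
    return [
        [fill if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


rng = random.Random(0)

# Toy "molecular image" (assumed data): 1.0 marks a drawn pixel.
image = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]

# Editing as inpainting: keep the left half, mark the right half for redrawing.
mask = [[0, 0, 1, 1] for _ in range(4)]
partial = apply_mask(image, mask)

# De novo generation starts from noise; here the image is noised halfway.
noisy = noise_image(image, t=500, rng=rng)
```

In this framing, de novo design corresponds to denoising from pure noise, whereas an editing task supplies `partial` together with `mask` so that only the masked region is generated. How SketchMol conditions on the unmasked region is described in the full article and is not reproduced here.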

**Fig. 1: Illustration of the SketchMol framework.**

**Fig. 2: Experiments on the quality and rationality of molecular image generation.**

**Fig. 3: Experiments on the generative capabilities of SketchMol under different physicochemical conditions.**

**Fig. 4: Examples of mask-based molecular optimization.**

**Fig. 5: Cases on optimizing lead activity from various structural perspectives.**

**Fig. 6: Cases on fragment growing for EP₄.**

Data availability

The data used in this study are available via Figshare (https://doi.org/10.6084/m9.figshare.27210429.v1)⁶⁴.

Code availability

The source code for SketchMol is available via GitHub (https://github.com/WangZiXubiubiu/SketchMol-v1) and Zenodo (https://doi.org/10.5281/zenodo.13937534)⁶⁵.

References

Scannell, J. W., Blanckley, A., Boldon, H. & Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discov. 11, 191–200 (2012).
von Delft, A. et al. Accelerating antiviral drug discovery: lessons from COVID-19. Nat. Rev. Drug Discov. 22, 585–603 (2023).
Pandey, M. et al. The transformational role of GPU computing and deep learning in drug discovery. Nat. Mach. Intell. 4, 211–221 (2022).
Sliwoski, G., Kothiwale, S., Meiler, J. & Lowe, E. W. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395 (2014).
Tong, X. et al. Generative models for de novo drug design. J. Med. Chem. 64, 14011–14027 (2021).
Zeng, X. et al. Deep generative molecular design reshapes drug discovery. Cell Rep. Med. 3, 100794–100802 (2022).
Bian, Y. & Xie, X.-Q. Generative chemistry: drug discovery with deep learning generative models. J. Mol. Model. 27, 71–88 (2021).
Bagal, V., Aggarwal, R., Vinod, P. & Priyakumar, U. D. MolGPT molecular generation using a transformer-decoder model. J. Chem. Inf. Model. 62, 2064–2076 (2021).
Wu, J.-N. et al. t-SMILES: a fragment-based molecular representation framework for de novo ligand design. Nat. Commun. 15, 4993 (2024).
Du, Y. et al. Machine learning-aided generative molecular design. Nat. Mach. Intell. 6, 589–604 (2024).
Li, Y. et al. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat. Commun. 13, 6891 (2022).
Liu, Q., Allamanis, M., Brockschmidt, M. & Gaunt, A. L. Constrained graph variational autoencoders for molecule design. Adv. Neural Inf. Process. Syst. 31, 7806–7815 (2018).
Jin, W., Barzilay, R. & Jaakkola, T. S. Junction tree variational autoencoder for molecular graph generation. In Proc. 35th International Conference on Machine Learning 2328–2337 (PMLR, 2018).
Jin, W., Barzilay, R. & Jaakkola, T. S. Hierarchical generation of molecular graphs using structural motifs. In Proc. 37th International Conference on Machine Learning 4839–4848 (PMLR, 2020).
Maziarz, K. et al. Learning to extend molecular scaffolds with structural motifs. In The Tenth International Conference on Learning Representations 1–22 (ICLR, 2022).
Vignac, C. et al. DiGress: discrete denoising diffusion for graph generation. In The Eleventh International Conference on Learning Representations 1–22 (ICLR, 2023).
Zhu, Y. et al. A survey on deep graph generation: methods and applications. In Proc. First Learning on Graphs Conference 47:1–47:21 (PMLR, 2022).
Vong, W. K., Wang, W., Orhan, A. E. & Lake, B. M. Grounded language acquisition through the eyes and ears of a single child. Science 383, 504–511 (2024).
Cheng, S. et al. EgoThink: evaluating first-person perspective thinking capability of vision-language models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition 14291–14302 (IEEE, 2024).
Song, W. et al. HOIAnimator: generating text-prompt human-object animations using novel perceptive diffusion models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition 811–820 (IEEE, 2024).
Fang, C., Hu, X., Luo, K. & Tan, P. Ctrl-Room: controllable text-to-3D room meshes generation with layout constraints. Preprint at https://arxiv.org/abs/2310.03602 (2023).
Chen, W., Gu, T., Xu, Y. & Chen, A. Magic clothing: controllable garment-driven image synthesis. In Proc. 32nd ACM International Conference on Multimedia 6939–6948 (Association for Computing Machinery, 2024).
Schneider, G. & Fechner, U. Computer-based de novo design of drug-like molecules. Nat. Rev. Drug Discov. 4, 649–663 (2005).
Nicolaou, C. A., Brown, N. & Pattichis, C. S. Molecular optimization using computational multi-objective methods. Curr. Opin. Drug Discov. Dev. 10, 316–324 (2007).
Alizadeh, S. R. & Ebrahimzadeh, M. A. Quercetin derivatives: drug design, development, and biological activities, a review. Eur. J. Med. Chem. 229, 114068 (2022).
Hao, Y. et al. A review of the design and modification of lactoferricins and their derivatives. Biometals 31, 331–341 (2018).
Imrie, F., Bradley, A. R., van der Schaar, M. & Deane, C. M. Deep generative models for 3D linker design. J. Chem. Inf. Model. 60, 1983–1995 (2020).
Huang, Y., Peng, X., Ma, J. & Zhang, M. 3DLinker: an E(3) equivariant variational autoencoder for molecular linker design. In International Conference on Machine Learning 9280–9294 (PMLR, 2022).
Elharrouss, O., Almaadeed, N., Al-Maadeed, S. & Akbari, Y. Image inpainting: a review. Neural Process. Lett. 51, 2007–2028 (2020).
Tschannen, M., Bachem, O. & Lucic, M. Recent advances in autoencoder-based representation learning. Preprint at https://arxiv.org/abs/1812.05069 (2018).
Huang, H. et al. UNet 3+: a full-scale connected UNet for medical image segmentation. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing 1055–1059 (IEEE, 2020).
Croitoru, F.-A., Hondru, V., Ionescu, R. T. & Shah, M. Diffusion models in vision: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 10850–10869 (2023).
Uesato, J. et al. Solving math word problems with process- and outcome-based feedback. Preprint at https://arxiv.org/abs/2211.14275 (2022).
Kim, S. et al. Pubchem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 49, D1388–D1395 (2021).
Karras, T., Aila, T., Laine, S. & Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. In 6th International Conference on Learning Representations 1–26 (ICLR, 2018).
van den Oord, A., Vinyals, O. & Kavukcuoglu, K. Neural discrete representation learning. Adv. Neural Inf. Process. Syst. 30, 6306–6315 (2017).
Esser, P., Rombach, R. & Ommer, B. Taming transformers for high-resolution image synthesis. In IEEE Conference on Computer Vision and Pattern Recognition 12873–12883 (IEEE, 2021).
Rombach, R., Blattmann, A., Lorenz, D., Esser, P. & Ommer, B. High-resolution image synthesis with latent diffusion models. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 10684–10695 (IEEE, 2022).
Yang, L. et al. Diffusion models: a comprehensive survey of methods and applications. ACM Comput. Surv. 56, 105 (2023).
Peng, X. & Zhu, F. Hitting stride by degrees: fine grained molecular generation via diffusion model. Expert Syst. Appl. 244, 122949 (2024).
Weininger, D. SMILES, a chemical language and information system. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).
Panaretos, V. M. & Zemel, Y. Statistical aspects of Wasserstein distances. Annu. Rev. Stat. Appl. 6, 405–431 (2019).
Preuer, K., Renz, P., Unterthiner, T., Hochreiter, S. & Klambauer, G. Fréchet ChemNet distance: a metric for generative models for molecules in drug discovery. J. Chem. Inf. Model. 58, 1736–1741 (2018).
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. & Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Adv. Neural Inf. Process. Syst. 30, 6626–6637 (2017).
Atz, K. et al. Prospective de novo drug design with deep interactome learning. Nat. Commun. 15, 3408 (2024).
Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).
Zeng, X. et al. Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework. Nat. Mach. Intell. 4, 1004–1016 (2022).
Pourpanah, F. et al. A review of generalized zero-shot learning methods. IEEE Trans. Pattern Anal. Mach. Intell. 45, 4051–4070 (2023).
Alhossary, A., Handoko, S. D., Mu, Y. & Kwoh, C.-K. Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics 31, 2214–2216 (2015).
Toyoda, Y. et al. Ligand binding to human prostaglandin E receptor EP₄ at the lipid-bilayer interface. Nat. Chem. Biol. 15, 18–26 (2019).
Wu, W.-I. et al. Crystal structure of human AKT1 with an allosteric inhibitor reveals a new mode of kinase inhibition. PLoS ONE 5, e12913 (2010).
Gao, H. et al. ROCK inhibitors 2. Improving potency, selectivity and solubility through the application of rationally designed solubilizing groups. Bioorg. Med. Chem. Lett. 28, 2616–2621 (2018).
Salentin, S., Schreiber, S., Haupt, V. J., Adasme, M. F. & Schroeder, M. PLIP: fully automated protein–ligand interaction profiler. Nucleic Acids Res. 43, W443–W447 (2015).
DeLano, W. L. et al. PyMOL: an open-source molecular graphics tool. CCP4 Newsl. Protein Crystallogr. 40, 82–92 (2002).
Shang, E. et al. De novo design of multitarget ligands with an iterative fragment-growing strategy. J. Chem. Inf. Model. 54, 1235–1241 (2014).
Old, D. W. Therapeutic agents. US patent no. WO2014/179263 (2015).
Qian, Y. et al. MolScribe: robust molecular structure recognition with image-to-graph generation. J. Chem. Inf. Model. 63, 1925–1934 (2023).
Ho, J., Jain, A. & Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 33, 6840–6851 (2020).
Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).
Song, J., Meng, C. & Ermon, S. Denoising diffusion implicit models. In 9th International Conference on Learning Representations 1–20 (ICLR, 2021).
Rawte, V., Sheth, A. P. & Das, A. A survey of hallucination in large foundation models. Preprint at https://arxiv.org/abs/2309.05922 (2023).
Ouyang, L. et al. Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 35, 27730–27744 (2022).
Landrum, G. et al. RDKit: a software suite for cheminformatics, computational chemistry, and predictive modeling. Greg Landrum 8, 5281 (2013).
Wang, Z. Data used for SketchMol training. Figshare https://doi.org/10.6084/m9.figshare.27210429.v1 (2024).
Wang, Z. WangZiXubiubiu/SketchMol-v1: SketchMol. Zenodo https://doi.org/10.5281/zenodo.13937534 (2024).


Acknowledgements

X.Z. acknowledges support from the National Natural Science Foundation of China (grant nos. 62425204, 62450002, 62122025, U22A2037 and 62432011) and the Beijing Natural Science Foundation (grant no. L248013). Y.L. acknowledges support from the National Natural Science Foundation of China (grant no. 62372159). T.S. acknowledges support from the Japan Society for the Promotion of Science (grant no. JP23H03411) and the Japan Science and Technology Agency (grant no. JPMJPF2017). X.Y. acknowledges support from the Japan Society for the Promotion of Science (grant no. JP22K12144). We extend our sincere gratitude to X.Y., Z.Y., T.S., Y.L. and Y.C. for their invaluable feedback on paper writing and figures.

Author information

These authors contributed equally: Zixu Wang, Yangyang Chen.

Authors and Affiliations

Department of Computer Science, University of Tsukuba, Tsukuba, Japan
Zixu Wang, Yangyang Chen, Xiucai Ye & Tetsuya Sakurai
College of Computer Science and Electronic Engineering, Hunan University, Changsha, People’s Republic of China
Pengsen Ma, Zhou Yu, Yuansheng Liu & Xiangxiang Zeng
Department of Integrative Biotechnology, Yonsei University, Incheon, Republic of Korea
Jianmin Wang


Contributions

Z.W. and X.Z. conceived the research project. Z.W., Y.C. and X.Z. designed and implemented the framework. Z.W., Y.L., X.Y., T.S. and X.Z. designed the experiments. Z.W., Y.C., P.M., Z.Y. and J.W. conducted the experiments and result analyses. Y.C. conducted the molecular dynamics simulation. All authors discussed the experimental results and commented on the manuscript.

Corresponding authors

Correspondence to Xiucai Ye or Xiangxiang Zeng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Kenneth Atz and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–6, Figs. 1–13 and Tables 1–5.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Wang, Z., Chen, Y., Ma, P. et al. Image-based generation for molecule design with SketchMol. Nat Mach Intell 7, 244–255 (2025). https://doi.org/10.1038/s42256-025-00982-3


Received: 19 July 2024
Accepted: 23 December 2024
Published: 13 February 2025
Issue date: February 2025
DOI: https://doi.org/10.1038/s42256-025-00982-3
