
Image-based generation for molecule design with SketchMol

Abstract

Efficient molecular design methods are crucial for accelerating early-stage drug discovery, potentially saving years of development time and billions of dollars in costs. Current molecular design methods rely on sequence-based or graph-based representations, emphasizing local features such as bonds and atoms but lacking a comprehensive depiction of the overall molecular topology. Here we introduce SketchMol, an image-based molecular generation framework that combines visual understanding with molecular design. SketchMol leverages diffusion models and applies a refinement technique called reinforcement learning from molecular experts to improve the generation of viable molecules. It creates molecules through a painting-like approach that simultaneously depicts the local structures and the global layout of the molecule. By visualizing molecular structures, various design tasks are unified within a single image-based framework. De novo design becomes sketching new molecular images, whereas editing tasks transform into filling partially drawn images. Through extensive experiments, we demonstrated that SketchMol effectively handles a variety of molecular design tasks.
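
As a rough illustration of the framing in the abstract, where editing becomes filling in part of an image and de novo design is the special case in which the whole canvas is redrawn, the following minimal Python sketch composes a masked edit. It is not SketchMol's implementation: the function names (`make_edit_mask`, `compose_inpainting`), the tiny 4 x 4 grids and the pixel-wise blending rule are all assumptions made for illustration, and a real system would use a diffusion model to produce the proposal rather than a fixed array.

```python
# Hypothetical illustration only: names and the blending rule are assumptions,
# not taken from the SketchMol source code.
from typing import List

Image = List[List[float]]  # grayscale pixels in [0, 1]
Mask = List[List[int]]     # 1 = region to redraw, 0 = region to keep


def make_edit_mask(height: int, width: int,
                   top: int, left: int, bottom: int, right: int) -> Mask:
    """Return a mask that is 1 inside the half-open box [top, bottom) x [left, right)."""
    return [
        [1 if top <= r < bottom and left <= c < right else 0 for c in range(width)]
        for r in range(height)
    ]


def compose_inpainting(original: Image, proposal: Image, mask: Mask) -> Image:
    """Keep original pixels where mask == 0 and take proposal pixels where mask == 1."""
    return [
        [p if m else o for o, p, m in zip(orow, prow, mrow)]
        for orow, prow, mrow in zip(original, proposal, mask)
    ]


# Stand-ins for a rendered molecule image and a generator's output.
original = [[0.0] * 4 for _ in range(4)]
proposal = [[1.0] * 4 for _ in range(4)]

# Editing: only the central 2 x 2 patch is redrawn.
edit_mask = make_edit_mask(4, 4, top=1, left=1, bottom=3, right=3)
edited = compose_inpainting(original, proposal, edit_mask)

# De novo design: the mask covers the whole canvas, so nothing is kept.
full_mask = make_edit_mask(4, 4, top=0, left=0, bottom=4, right=4)
de_novo = compose_inpainting(original, proposal, full_mask)
```

Under these assumptions, the same composition routine serves both tasks, and only the mask changes, which is the sense in which a single image-based framework can cover generation and editing.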


Fig. 1: Illustration of the SketchMol framework.
Fig. 2: Experiments on the quality and rationality of molecular image generation.
Fig. 3: Experiments on the generative capabilities of SketchMol under different physicochemical conditions.
Fig. 4: Examples of mask-based molecular optimization.
Fig. 5: Cases on optimizing lead activity from various structural perspectives.
Fig. 6: Cases on fragment growing for EP4.

Data availability

The data used in this study are available via Figshare (https://doi.org/10.6084/m9.figshare.27210429.v1; ref. 64).

Code availability

The source code for SketchMol is available via GitHub (https://github.com/WangZiXubiubiu/SketchMol-v1) and Zenodo (https://doi.org/10.5281/zenodo.13937534; ref. 65).

References

  1. Scannell, J. W., Blanckley, A., Boldon, H. & Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discov. 11, 191–200 (2012).

  2. von Delft, A. et al. Accelerating antiviral drug discovery: lessons from COVID-19. Nat. Rev. Drug Discov. 22, 585–603 (2023).

  3. Pandey, M. et al. The transformational role of GPU computing and deep learning in drug discovery. Nat. Mach. Intell. 4, 211–221 (2022).

  4. Sliwoski, G., Kothiwale, S., Meiler, J. & Lowe, E. W. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395 (2014).

  5. Tong, X. et al. Generative models for de novo drug design. J. Med. Chem. 64, 14011–14027 (2021).

  6. Zeng, X. et al. Deep generative molecular design reshapes drug discovery. Cell Rep. Med. 3, 100794–100802 (2022).

  7. Bian, Y. & Xie, X.-Q. Generative chemistry: drug discovery with deep learning generative models. J. Mol. Model. 27, 71–88 (2021).

  8. Bagal, V., Aggarwal, R., Vinod, P. & Priyakumar, U. D. MolGPT: molecular generation using a transformer-decoder model. J. Chem. Inf. Model. 62, 2064–2076 (2021).

  9. Wu, J.-N. et al. t-SMILES: a fragment-based molecular representation framework for de novo ligand design. Nat. Commun. 15, 4993 (2024).

  10. Du, Y. et al. Machine learning-aided generative molecular design. Nat. Mach. Intell. 6, 589–604 (2024).

  11. Li, Y. et al. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat. Commun. 13, 6891 (2022).

  12. Liu, Q., Allamanis, M., Brockschmidt, M. & Gaunt, A. L. Constrained graph variational autoencoders for molecule design. Adv. Neural Inf. Process. Syst. 31, 7806–7815 (2018).

  13. Jin, W., Barzilay, R. & Jaakkola, T. S. Junction tree variational autoencoder for molecular graph generation. In Proc. 35th International Conference on Machine Learning 2328–2337 (PMLR, 2018).

  14. Jin, W., Barzilay, R. & Jaakkola, T. S. Hierarchical generation of molecular graphs using structural motifs. In Proc. 37th International Conference on Machine Learning 4839–4848 (PMLR, 2020).

  15. Maziarz, K. et al. Learning to extend molecular scaffolds with structural motifs. In The Tenth International Conference on Learning Representations 1–22 (ICLR, 2022).

  16. Vignac, C. et al. DiGress: discrete denoising diffusion for graph generation. In The Eleventh International Conference on Learning Representations 1–22 (ICLR, 2023).

  17. Zhu, Y. et al. A survey on deep graph generation: methods and applications. In Proc. First Learning on Graphs Conference 47:1–47:21 (PMLR, 2022).

  18. Vong, W. K., Wang, W., Orhan, A. E. & Lake, B. M. Grounded language acquisition through the eyes and ears of a single child. Science 383, 504–511 (2024).

  19. Cheng, S. et al. EgoThink: evaluating first-person perspective thinking capability of vision-language models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition 14291–14302 (IEEE, 2024).

  20. Song, W. et al. HOIAnimator: generating text-prompt human-object animations using novel perceptive diffusion models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition 811–820 (IEEE, 2024).

  21. Fang, C., Hu, X., Luo, K. & Tan, P. Ctrl-Room: controllable text-to-3D room meshes generation with layout constraints. Preprint at https://arxiv.org/abs/2310.03602 (2023).

  22. Chen, W., Gu, T., Xu, Y. & Chen, A. Magic clothing: controllable garment-driven image synthesis. In Proc. 32nd ACM International Conference on Multimedia 6939–6948 (Association for Computing Machinery, 2024).

  23. Schneider, G. & Fechner, U. Computer-based de novo design of drug-like molecules. Nat. Rev. Drug Discov. 4, 649–663 (2005).

  24. Nicolaou, C. A., Brown, N. & Pattichis, C. S. Molecular optimization using computational multi-objective methods. Curr. Opin. Drug Discov. Dev. 10, 316–324 (2007).

  25. Alizadeh, S. R. & Ebrahimzadeh, M. A. Quercetin derivatives: drug design, development, and biological activities, a review. Eur. J. Med. Chem. 229, 114068 (2022).

  26. Hao, Y. et al. A review of the design and modification of lactoferricins and their derivatives. Biometals 31, 331–341 (2018).

  27. Imrie, F., Bradley, A. R., van der Schaar, M. & Deane, C. M. Deep generative models for 3D linker design. J. Chem. Inf. Model. 60, 1983–1995 (2020).

  28. Huang, Y., Peng, X., Ma, J. & Zhang, M. 3DLinker: an E(3) equivariant variational autoencoder for molecular linker design. In International Conference on Machine Learning 9280–9294 (PMLR, 2022).

  29. Elharrouss, O., Almaadeed, N., Al-Maadeed, S. & Akbari, Y. Image inpainting: a review. Neural Process. Lett. 51, 2007–2028 (2020).

  30. Tschannen, M., Bachem, O. & Lucic, M. Recent advances in autoencoder-based representation learning. Preprint at https://arxiv.org/abs/1812.05069 (2018).

  31. Huang, H. et al. UNet 3+: a full-scale connected UNet for medical image segmentation. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing 1055–1059 (IEEE, 2020).

  32. Croitoru, F.-A., Hondru, V., Ionescu, R. T. & Shah, M. Diffusion models in vision: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 10850–10869 (2023).

  33. Uesato, J. et al. Solving math word problems with process- and outcome-based feedback. Preprint at https://arxiv.org/abs/2211.14275 (2022).

  34. Kim, S. et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 49, D1388–D1395 (2021).

  35. Karras, T., Aila, T., Laine, S. & Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. In 6th International Conference on Learning Representations 1–26 (ICLR, 2018).

  36. van den Oord, A., Vinyals, O. & Kavukcuoglu, K. Neural discrete representation learning. Adv. Neural Inf. Process. Syst. 30, 6306–6315 (2017).

  37. Esser, P., Rombach, R. & Ommer, B. Taming transformers for high-resolution image synthesis. In IEEE Conference on Computer Vision and Pattern Recognition 12873–12883 (IEEE, 2021).

  38. Rombach, R., Blattmann, A., Lorenz, D., Esser, P. & Ommer, B. High-resolution image synthesis with latent diffusion models. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 10684–10695 (IEEE, 2022).

  39. Yang, L. et al. Diffusion models: a comprehensive survey of methods and applications. ACM Comput. Surv. 56, 105 (2023).

  40. Peng, X. & Zhu, F. Hitting stride by degrees: fine grained molecular generation via diffusion model. Expert Syst. Appl. 244, 122949 (2024).

  41. Weininger, D. SMILES, a chemical language and information system. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).

  42. Panaretos, V. M. & Zemel, Y. Statistical aspects of Wasserstein distances. Annu. Rev. Stat. Appl. 6, 405–431 (2019).

  43. Preuer, K., Renz, P., Unterthiner, T., Hochreiter, S. & Klambauer, G. Fréchet ChemNet distance: a metric for generative models for molecules in drug discovery. J. Chem. Inf. Model. 58, 1736–1741 (2018).

  44. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. & Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Adv. Neural Inf. Process. Syst. 30, 6626–6637 (2017).

  45. Atz, K. et al. Prospective de novo drug design with deep interactome learning. Nat. Commun. 15, 3408 (2024).

  46. Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).

  47. Zeng, X. et al. Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework. Nat. Mach. Intell. 4, 1004–1016 (2022).

  48. Pourpanah, F. et al. A review of generalized zero-shot learning methods. IEEE Trans. Pattern Anal. Mach. Intell. 45, 4051–4070 (2023).

  49. Alhossary, A., Handoko, S. D., Mu, Y. & Kwoh, C.-K. Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics 31, 2214–2216 (2015).

  50. Toyoda, Y. et al. Ligand binding to human prostaglandin E receptor EP4 at the lipid-bilayer interface. Nat. Chem. Biol. 15, 18–26 (2019).

  51. Wu, W.-I. et al. Crystal structure of human AKT1 with an allosteric inhibitor reveals a new mode of kinase inhibition. PLoS ONE 5, e12913 (2010).

  52. Gao, H. et al. ROCK inhibitors 2. Improving potency, selectivity and solubility through the application of rationally designed solubilizing groups. Bioorg. Med. Chem. Lett. 28, 2616–2621 (2018).

  53. Salentin, S., Schreiber, S., Haupt, V. J., Adasme, M. F. & Schroeder, M. PLIP: fully automated protein–ligand interaction profiler. Nucleic Acids Res. 43, W443–W447 (2015).

  54. DeLano, W. L. et al. PyMOL: an open-source molecular graphics tool. CCP4 Newsl. Protein Crystallogr. 40, 82–92 (2002).

  55. Shang, E. et al. De novo design of multitarget ligands with an iterative fragment-growing strategy. J. Chem. Inf. Model. 54, 1235–1241 (2014).

  56. Old, D. W. Therapeutic agents. US patent no. WO2014/179263 (2015).

  57. Qian, Y. et al. MolScribe: robust molecular structure recognition with image-to-graph generation. J. Chem. Inf. Model. 63, 1925–1934 (2023).

  58. Ho, J., Jain, A. & Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 33, 6840–6851 (2020).

  59. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).

  60. Song, J., Meng, C. & Ermon, S. Denoising diffusion implicit models. In 9th International Conference on Learning Representations 1–20 (ICLR, 2021).

  61. Rawte, V., Sheth, A. P. & Das, A. A survey of hallucination in large foundation models. Preprint at https://arxiv.org/abs/2309.05922 (2023).

  62. Ouyang, L. et al. Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 35, 27730–27744 (2022).

  63. Landrum, G. et al. RDKit: a software suite for cheminformatics, computational chemistry, and predictive modeling. Greg Landrum 8, 5281 (2013).

  64. Wang, Z. Data used for SketchMol training. Figshare https://doi.org/10.6084/m9.figshare.27210429.v1 (2024).

  65. Wang, Z. WangZiXubiubiu/SketchMol-v1: SketchMol. Zenodo https://doi.org/10.5281/zenodo.13937534 (2024).

Acknowledgements

X.Z. acknowledges support from the National Natural Science Foundation of China (grant nos. 62425204, 62450002, 62122025, U22A2037 and 62432011) and the Beijing Natural Science Foundation (grant no. L248013). Y.L. acknowledges support from the National Natural Science Foundation of China (grant no. 62372159). T.S. acknowledges support from the Japan Society for the Promotion of Science (grant no. JP23H03411) and the Japan Science and Technology Agency (grant no. JPMJPF2017). X.Y. acknowledges support from the Japan Society for the Promotion of Science (grant no. JP22K12144). We extend our sincere gratitude to X.Y., Z.Y., T.S., Y.L. and Y.C. for their invaluable feedback on paper writing and figures.

Author information

Contributions

Z.W. and X.Z. conceived the research project. Z.W., Y.C. and X.Z. designed and implemented the framework. Z.W., Y.L., X.Y., T.S. and X.Z. designed the experiments. Z.W., Y.C., P.M., Z.Y. and J.W. conducted the experiments and result analyses. Y.C. conducted the molecular dynamics simulation. All authors discussed the experimental results and commented on the manuscript.

Corresponding authors

Correspondence to Xiucai Ye or Xiangxiang Zeng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Kenneth Atz and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–6, Figs. 1–13 and Tables 1–5.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Z., Chen, Y., Ma, P. et al. Image-based generation for molecule design with SketchMol. Nat Mach Intell 7, 244–255 (2025). https://doi.org/10.1038/s42256-025-00982-3

  • DOI: https://doi.org/10.1038/s42256-025-00982-3
