  • Article

Target-specific de novo design of drug candidate molecules with graph-transformer-based generative adversarial networks

A preprint version of the article is available at arXiv.

Abstract

Discovering novel drug candidate molecules is a fundamental step in drug development. Generative deep learning models can sample new molecular structures from learned probability distributions; however, their practical use in drug discovery hinges on generating compounds tailored to a specific target molecule. Here we introduce DrugGEN, an end-to-end generative system for the de novo design of drug candidate molecules that interact with a selected protein. The proposed method represents molecules as graphs and processes them using a generative adversarial network that comprises graph transformer layers. Trained on large datasets of drug-like compounds and target-specific bioactive molecules, DrugGEN designed candidate inhibitors for AKT1, a kinase crucial in many cancers. Docking and molecular dynamics simulations suggest that the generated compounds effectively bind to AKT1, and attention maps provide insights into the model’s reasoning. Furthermore, selected de novo molecules were synthesized and shown to inhibit AKT1 at low micromolar concentrations in the context of in vitro enzymatic assays. These results demonstrate the potential of DrugGEN for designing target-specific molecules. Using the open-access DrugGEN codebase, researchers can retrain the model for other druggable proteins, provided a dataset of known bioactive molecules is available.
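As a rough illustration of the graph representation mentioned above, a molecule can be encoded as a node-feature matrix (one row per atom) plus an adjacency matrix holding bond types. The sketch below is a minimal example under stated assumptions: the atom and bond vocabularies and the function names are hypothetical and are not taken from the DrugGEN codebase, whose actual encodings may differ.

```python
# Hypothetical vocabularies for illustration only.
ATOM_TYPES = ["C", "N", "O", "F"]
BOND_TYPES = ["none", "single", "double", "triple", "aromatic"]


def one_hot(index, size):
    """Return a list of zeros with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def molecule_to_graph(atoms, bonds):
    """Encode a molecule as (node_features, adjacency).

    atoms: list of element symbols, one per heavy atom.
    bonds: list of (i, j, bond_type) tuples with 0-based atom indices.
    """
    node_features = [one_hot(ATOM_TYPES.index(a), len(ATOM_TYPES)) for a in atoms]
    n = len(atoms)
    # adjacency[i][j] stores the index of the bond type between atoms i and j;
    # 0 ("none") means the two atoms are not bonded.
    adjacency = [[0] * n for _ in range(n)]
    for i, j, bond_type in bonds:
        code = BOND_TYPES.index(bond_type)
        adjacency[i][j] = code
        adjacency[j][i] = code  # molecular graphs are undirected
    return node_features, adjacency


# Ethanol (SMILES: CCO), heavy atoms only.
features, adjacency = molecule_to_graph(
    atoms=["C", "C", "O"],
    bonds=[(0, 1, "single"), (1, 2, "single")],
)
```

In a graph-based generative model, matrices of this kind are the objects that the generator produces and the discriminator scores; here they are built by hand only to make the data layout concrete.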

Fig. 1: Workflow of the study.
Fig. 2: Architecture of the DrugGEN model with powerful graph transformer encoder modules in both generator and discriminator networks.
Fig. 3: Exploration of de novo molecules via downstream analysis.
Fig. 4: Structural analysis of capivasertib (bound ligand in 4GV1) and five de novo-generated molecules (molecules 1–5) that were selected for experimental validation.
Fig. 5: Visualization of DrugGEN attention maps on three de novo molecules.

Data availability

All resources required to reproduce this work are open source. The full training datasets (curated ChEMBL compound sets, AKT1 and CDK2 bioactivity records), the pretrained weight files for every DrugGEN model variant (targeted and non-targeted) and all result artefacts—including the complete sets of de novo-generated molecules, molecular docking output tables and MD trajectory analyses—are archived in the DrugGEN dataset repository and are available via figshare at https://doi.org/10.6084/m9.figshare.29119205.v3 (ref. 86). The data to reproduce the experiments and the output files are available via GitHub at https://github.com/HUBioDataLab/DrugGEN.

Code availability

The source code and ready-to-use trained models are available in the archived DrugGEN repository, which is available via GitHub at https://github.com/HUBioDataLab/DrugGEN and via Zenodo at https://doi.org/10.5281/zenodo.15014579 (ref. 87). DrugGEN is also available as an online tool with a graphical interface at https://huggingface.co/spaces/HUBioDataLab/DrugGEN, where users can generate de novo molecules by using the desired model.

References

  1. Rifaioglu, A. S. et al. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief. Bioinform. 20, 1878–1912 (2019).

  2. Paul, S. M. et al. How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nat. Rev. Drug Discov. 9, 203–214 (2010).

  3. Bhisetti, G. & Fang, C. Artificial intelligence–enabled de novo design of novel compounds that are synthesizable. Methods Mol. Biol. 2390, 409–419 (2022).

  4. Elton, D. C., Boukouvalas, Z., Fuge, M. D. & Chung, P. W. Deep learning for molecular design—a review of the state of the art. Mol. Syst. Des. Eng. 4, 828–849 (2019).

  5. Walters, W. P. Virtual chemical libraries: miniperspective. J. Med. Chem. 62, 1116–1124 (2018).

  6. Mouchlis, V. D. et al. Advances in de novo drug design: from conventional to machine learning methods. Int. J. Mol. Sci. 22, 1676 (2021).

  7. Kingma, D. P. & Welling, M. An introduction to variational autoencoders. Found. Trends Mach. Learn. 12, 307–392 (2019).

  8. Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).

  9. Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).

  10. De Cao, N. & Kipf, T. MolGAN: an implicit generative model for small molecular graphs. In ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models (2018).

  11. Zou, J., Yu, J., Hu, P., Zhao, L. & Shi, S. STAGAN: an approach for improve the stability of molecular graph generation based on generative adversarial networks. Comput. Biol. Med. 167, 107691 (2023).

  12. Mahmood, O., Mansimov, E., Bonneau, R. & Cho, K. Masked graph modeling for molecule generation. Nat. Commun. 12, 3156 (2021).

  13. Ho, J., Jain, A. & Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 33, 6840–6851 (2020).

  14. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N. & Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proc. 32nd International Conference on Machine Learning (eds Bach, F. & Blei, D.) 2256–2265 (PMLR, 2015).

  15. Peng, X. et al. Pocket2Mol: efficient molecular sampling based on 3D protein pockets. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 17644–17655 (PMLR, 2022).

  16. Schneuing, A. et al. Structure‑based drug design with equivariant diffusion models. Nat. Comput. Sci. 4, 899–909 (2024).

  17. Hoogeboom, E., Satorras, V. G., Vignac, C. & Welling, M. Equivariant diffusion for molecule generation in 3D. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 8867–8887 (PMLR, 2022).

  18. Mitton, J., Senn, H. M., Wynne, K. & Murray‑Smith, R. A graph VAE and graph transformer approach to generating molecular graphs. In ICML 2020 Workshop on Graph Representation Learning and Beyond (2020).

  19. Richards, R. J. & Groener, A. M. Conditional β-VAE for de novo molecular generation. Preprint at https://arxiv.org/abs/2205.01592 (2022).

  20. Nemoto, K. & Kaneko, H. De novo direct inverse QSPR/QSAR: chemical variational autoencoder and Gaussian mixture regression models. J. Chem. Inf. Model. 63, 794–805 (2023).

  21. Kadurin, A., Nikolenko, S., Khrabrov, K., Aliper, A. & Zhavoronkov, A. druGAN: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol. Pharm. 14, 3098–3104 (2017).

  22. Xie, X., Valiente, P. A. & Kim, P. M. HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures. Bioinformatics 39, btad036 (2023).

  23. Arús-Pous, J. et al. Randomized SMILES strings improve the quality of molecular generative models. J. Cheminform. 11, 71 (2019).

  24. Blaschke, T. et al. REINVENT 2.0: an AI tool for de novo drug design. J. Chem. Inf. Model. 60, 5918–5922 (2020).

  25. Wang, X. et al. PETrans: de novo drug design with protein-specific encoding based on transfer learning. Int. J. Mol. Sci. 24, 1146 (2023).

  26. Bagal, V., Aggarwal, R., Vinod, P. K. & Priyakumar, U. D. MolGPT: molecular generation using a transformer-decoder model. J. Chem. Inf. Model. 62, 2064–2076 (2022).

  27. Yang, M. et al. CMGN: a conditional molecular generation net to design target-specific molecules with desired properties. Brief. Bioinform. 24, bbad185 (2023).

  28. Zhang, O. et al. ResGen is a pocket-aware 3D molecular generation model based on parallel multiscale modelling. Nat. Mach. Intell. 5, 1020–1030 (2023).

  29. Guan, J. et al. 3D equivariant diffusion for target-aware molecule generation and affinity prediction. In Eleventh International Conference on Learning Representations (OpenReview.net, 2023); https://openreview.net/forum?id=kJqXEPXMsE0

  30. Perron, Q. et al. Deep generative models for ligand‐based de novo design applied to multi‐parametric optimization. J. Comput. Chem. 43, 692–703 (2022).

  31. Fang, Y., Pan, X. & Shen, H.-B. De novo drug design by iterative multiobjective deep reinforcement learning with graph-based molecular quality assessment. Bioinformatics 39, btad157 (2023).

  32. Zhou, Z., Kearnes, S., Li, L., Zare, R. N. & Riley, P. Optimization of molecules via deep reinforcement learning. Sci. Rep. 9, 10752 (2019).

  33. Liu, M., Luo, Y., Uchino, K., Maruhashi, K. & Ji, S. Generating 3D molecules for target protein binding. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 13912–13924 (PMLR, 2022).

  34. Wang, M. et al. RELATION: a deep generative model for structure-based de novo drug design. J. Med. Chem. 65, 9478–9492 (2022).

  35. Gebauer, N. W. A., Gastegger, M., Hessmann, S. S. P., Müller, K.-R. & Schütt, K. T. Inverse design of 3D molecular structures with conditional generative neural networks. Nat. Commun. 13, 973 (2022).

  36. Shi, W. et al. Pocket2Drug: an encoder-decoder deep neural network for the target-based drug design. Front. Pharmacol. 13, 837715 (2022).

  37. Uludoğan, G., Ozkirimli, E., Ulgen, K. O., Karalı, N. & Özgür, A. Exploiting pretrained biochemical language models for targeted drug design. Bioinformatics 38, ii155–ii161 (2022).

  38. Rozenberg, E. & Freedman, D. Semi-equivariant conditional normalizing flows, with applications to target-aware molecule generation. Mach. Learn. Sci. Technol. 4, 035037 (2023).

  39. Li, Y. et al. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat. Commun. 13, 6891 (2022).

  40. Zhang, Y. et al. Universal approach to de novo drug design for target proteins using deep reinforcement learning. ACS Omega 8, 5464–5474 (2023).

  41. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 6000–6010 (2017).

  42. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In Proc. 5th International Conference on Learning Representations (OpenReview.net, 2017).

  43. Li, P. et al. An effective self-supervised framework for learning expressive molecular global representations to drug discovery. Brief. Bioinform. 22, bbab109 (2021).

  44. Rong, Y. et al. Self-supervised graph transformer on large-scale molecular data. Adv. Neural Inf. Process. Syst. 33, 12559–12571 (2020).

  45. Li, H. et al. A knowledge-guided pre-training framework for improving molecular representation learning. Nat. Commun. 14, 7568 (2023).

  46. Zhu, J.-Y., Park, T., Isola, P. & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proc. IEEE International Conference on Computer Vision 2223–2232 (IEEE, 2017).

  47. Kim, T., Cha, M., Kim, H., Lee, J. K. & Kim, J. Learning to discover cross-domain relations with generative adversarial networks. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1857–1865 (PMLR, 2017).

  48. Addie, M. et al. Discovery of 4-amino-N-[(1S)-1-(4-chlorophenyl)-3-hydroxypropyl]-1-(7H-pyrrolo[2,3-d]pyrimidin-4-yl)piperidine-4-carboxamide (AZD5363), an orally bioavailable, potent inhibitor of Akt kinases. J. Med. Chem. 56, 2059–2073 (2013).

  49. Polykovskiy, D. et al. Molecular Sets (MOSES): a benchmarking platform for molecular generation models. Front. Pharmacol. 11, 565644 (2020).

  50. Tadesse, S. et al. Targeting CDK2 in cancer: challenges and opportunities for therapy. Drug Discov. Today 25, 406–413 (2020).

  51. McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).

  52. Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).

  53. Rifaioglu, A. S. et al. DEEPScreen: high performance drug–target interaction prediction with convolutional neural networks using 2-D structural compound representations. Chem. Sci. 11, 2531–2557 (2020).

  54. Ertl, P. & Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform. 1, 8 (2009).

  55. Bickerton, G. R., Paolini, G. V., Besnard, J., Muresan, S. & Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 4, 90–98 (2012).

  56. Abeer, A. N. M. N., Urban, N. M., Weil, M. R., Alexander, F. J. & Yoon, B.-J. Multi-objective latent space optimization of generative molecular design models. Patterns 5, 101042 (2024).

  57. Jain, M. et al. Multi‑objective GFlowNets. In Proc. 40th International Conference on Machine Learning (eds Krause, A. et al.) 14631–14653 (PMLR, 2023).

  58. Monteiro, N. R. C. et al. FSM-DDTR: end-to-end feedback strategy for multi-objective de novo drug design using transformers. Comput. Biol. Med. 164, 107285 (2023).

  59. Suzuki, T., Ma, D., Yasuo, N. & Sekijima, M. Mothra: multiobjective de novo molecular generation using Monte Carlo tree search. J. Chem. Inf. Model. 64, 7291–7302 (2024).

  60. Ghosh, B., Dutta, I. K., Totaro, M. & Bayoumi, M. A survey on the progression and performance of generative adversarial networks. In 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT) 1–8 (IEEE, 2020).

  61. Gui, J., Sun, Z., Wen, Y., Tao, D. & Ye, J. A review on generative adversarial networks: algorithms, theory, and applications. IEEE Trans. Knowl. Data Eng. 35, 3313–3332 (2023).

  62. Janson, G., Valdes-Garcia, G., Heo, L. & Feig, M. Direct generation of protein conformational ensembles via machine learning. Nat. Commun. 14, 774 (2023).

  63. Arjovsky, M., Chintala, S. & Bottou, L. Wasserstein generative adversarial networks. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 214–223 (PMLR, 2017).

  64. Krenn, M., Häse, F., Nigam, A., Friederich, P. & Aspuru-Guzik, A. Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach. Learn. Sci. Technol. 1, 045024 (2020).

  65. Yüksel, A., Ulusoy, E., Ünlü, A. & Doğan, T. Selformer: molecular representation learning via SELFIES language models. Mach. Learn. Sci. Technol. 4, 035014 (2023).

  66. Doğan, T. et al. CROssBAR: comprehensive resource of biomedical relations with knowledge graph representations. Nucleic Acids Res. 49, e96 (2021).

  67. Mendez, D. et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 47, D930–D940 (2019).

  68. Wishart, D. S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46, D1074–D1082 (2018).

  69. Landrum, G. et al. rdkit/rdkit: 2024_09_6 (Q3 2024) Release (Release_2024_09_6). Zenodo https://doi.org/10.5281/zenodo.14943932 (2025).

  70. Dwivedi, V. P. & Bresson, X. A generalization of transformer networks to graphs. In AAAI Workshop on Deep Learning on Graphs: Methods and Applications (2021).

  71. Vignac, C. et al. DiGress: discrete denoising diffusion for graph generation. In Eleventh International Conference on Learning Representations (OpenReview.net, 2023); https://openreview.net/forum?id=UaAD-Nu86WX

  72. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V. & Courville, A. C. Improved training of Wasserstein GANs. Adv. Neural Inf. Process. Syst. 30, 5769–5779 (2017).

  73. Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (OpenReview.net, 2019); https://openreview.net/forum?id=Bkg6RiCqY7

  74. Schoenmaker, L., Béquignon, O. J. M., Jespers, W. & Van Westen, G. J. P. UnCorrupt SMILES: a novel approach to de novo design. J. Cheminform. 15, 22 (2023).

  75. Rogers, D. & Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 50, 742–754 (2010).

  76. Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 23, 3–25 (1997).

  77. Veber, D. F. et al. Molecular properties that influence the oral bioavailability of drug candidates. J. Med. Chem. 45, 2615–2623 (2002).

  78. Baell, J. B. & Holloway, G. A. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 53, 2719–2740 (2010).

  79. Madhavi Sastry, G., Adzhigirey, M., Day, T., Annabhimoju, R. & Sherman, W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J. Comput. Aided Mol. Des. 27, 221–234 (2013).

  80. Schrödinger Release 2022-1: Maestro (Schrödinger, 2022).

  81. Friesner, R. A. et al. Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein−ligand complexes. J. Med. Chem. 49, 6177–6196 (2006).

  82. Martin, M. P., Olesen, S. H., Georg, G. I. & Schönbrunn, E. Cyclin-dependent kinase inhibitor dinaciclib interacts with the acetyl-lysine recognition site of bromodomains. ACS Chem. Biol. 8, 2360–2365 (2013).

  83. The PyMOL molecular graphics system (version 1.8) (Schrödinger, 2015).

  84. Durant, J. L., Leland, B. A., Henry, D. R. & Nourse, J. G. Reoptimization of MDL keys for use in drug discovery. J. Chem. Inf. Comput. Sci. 42, 1273–1280 (2002).

  85. Yang, K. et al. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59, 3370–3388 (2019).

  86. Ünlü, A., Çevrim, E., Yiğit, M. G., Olğaç, A. & Doğan, T. DrugGEN resource collection: training data, model weights, generated molecules, docking and MD analyses (version 3). figshare https://doi.org/10.6084/m9.figshare.29119205.v3 (2025).

  87. Ünlü, A., Çevrim, E., Yigit, M. G., Sarigun, A. & Dogan, T. HUBioDataLab/DrugGEN: DrugGEN v2.0 release (v2.0). Zenodo https://doi.org/10.5281/zenodo.15014579 (2025).

  88. Guimaraes, G. L., Sanchez-Lengeling, B., Outeiral, C., Farias, P. L. C. & Aspuru-Guzik, A. Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models. Preprint at https://arxiv.org/abs/1705.10843 (2018).

  89. Grisoni, F., Moret, M., Lingwood, R. & Schneider, G. Bidirectional molecule generation with recurrent neural networks. J. Chem. Inf. Model. 60, 1175–1183 (2020).

  90. Xie, Y. et al. MARS: Markov molecular sampling for multi-objective drug discovery. In International Conference on Learning Representations 1–19 (ICLR, 2021).

  91. Olivecrona, M., Blaschke, T., Engkvist, O. & Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminform. 9, 48 (2017).

  92. Matsukiyo, Y., Yamanaka, C. & Yamanishi, Y. De novo generation of chemical structures of inhibitor and activator candidates for therapeutic target proteins by a transformer-based variational autoencoder and Bayesian optimization. J. Chem. Inf. Model. 64, 2345–2355 (2024).

Acknowledgements

This project was supported by TUBITAK-BIDEB 2247-A National Outstanding Researchers Program under project no. 120C123. We thank H. A. Güvenilir for guidance during the preparation of datasets and A. Koyaş for aiding the target protein selection process.

Author information

Contributions

T.D. conceptualized the study and designed the general methodology. E.Ç. prepared the datasets. A.S., A.Ü., A.S.R. and T.D. determined the technical details of the fundamental model architecture. A.Ü. and A.S. prepared the original codebase and designed and implemented the initial models. A.Ü. and M.G.Y. designed, implemented, trained, tuned and evaluated numerous model variants and constructed the finalized DrugGEN models. A.O. and E.Ç. conducted the molecular filtering operations and physics-based (docking and MD) experiments. A.Ü. and H.Ç. analysed the de novo-generated molecules in the context of deep learning-based DTI prediction. M.G.Y., E.Ç. and T.D. conducted the attention map analysis. A.Ü., E.Ç., A.O., E.B. and T.D. evaluated and discussed the findings. E.Ç., A.Ü., A.S., A.O. and T.D. visualized the results and prepared the figures in the paper. A.Ü., E.Ç., M.G.Y., A.S., D.C.K., A.O. and T.D. wrote the paper. A.Ü., E.Ç., A.S., M.G.Y. and T.D. prepared the repository. O.B. and M.G.Y. constructed the online tool. T.D. supervised the overall study. All authors approved the paper.

Corresponding author

Correspondence to Tunca Doğan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Artur Kadurin, Pengyong Li and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1

Thirty promising de novo molecules generated by the DrugGEN model to target the AKT1 protein, selected via expert curation from the set of molecules that showed sufficiently low binding free energies (< −8 kcal/mol) in the molecular docking experiment and were predicted as ‘active’ by the deep learning-based DTI predictor DEEPScreen (ref. 53).
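The automated part of this selection (before expert curation) amounts to a two-condition filter. The following is a minimal sketch, assuming a hypothetical `Candidate` record; the field names, the example identifiers and the example scores are placeholders for illustration and are not data from the study.

```python
from dataclasses import dataclass

DOCKING_THRESHOLD = -8.0  # kcal/mol, the cutoff stated in the caption


@dataclass
class Candidate:
    name: str             # placeholder identifier, not a real molecule ID
    docking_score: float  # predicted binding free energy in kcal/mol
    dti_label: str        # "active" or "inactive", as given by a DTI predictor


def shortlist(candidates, threshold=DOCKING_THRESHOLD):
    """Keep candidates that dock below the threshold AND are predicted active."""
    return [
        c for c in candidates
        if c.docking_score < threshold and c.dti_label == "active"
    ]


# Made-up values, used only to exercise the filter.
pool = [
    Candidate("mol_a", -9.2, "active"),
    Candidate("mol_b", -8.4, "inactive"),
    Candidate("mol_c", -7.1, "active"),
    Candidate("mol_d", -8.0, "active"),  # not strictly below -8, so excluded
]
selected = shortlist(pool)
```

The strict inequality mirrors the "< −8 kcal/mol" wording; the subsequent expert curation step described in the caption is a manual process and is not modelled here.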

Extended Data Fig. 2 Structural analysis of capivasertib (bound ligand in 4GV1) and five de novo generated molecules (Mol. ID 1–5) that were selected for experimental validation.

(a) Initial binding orientations of capivasertib, Molecules 1 and 2 at the starting point of molecular dynamics (MD) simulations. (b) Key protein-ligand interactions observed during MD simulations, visualised with interacting residues and interaction types. The depicted poses represent the most populated conformations from each simulation. (c) Root-mean-square deviation (RMSD) values of capivasertib, Molecules 1 and 2 in complex with AKT1. (d) Root-mean-square fluctuation (RMSF) values of ligand atoms in the same complexes. Abbreviations: I-VII represent β-sheet numbers, g.l represents glycine-rich loop, c.l represents catalytic loop, GK represents gatekeeper residue, and xDFG represents highly conserved kinase residues; linker represents the loop that connects the hinge domain to the αC helix. Gray dashed lines represent van der Waals interactions. Blue lines represent hydrogen bonds and water bridges. Green lines indicate halogen bonds. Yellow dashed lines represent salt bridges. Directional interactions were noted only when the occupancy value exceeded 10%; however, for visual clarity, occupancy values of the water bridges were stated only when they exceeded 30%.

Extended Data Table 1 Drug-likeness related and target-centric performance of DrugGEN and other methods: RELATION (ref. 34), ResGen (ref. 28), TRIOMPHE-BOA (ref. 92), TargetDiff (ref. 29) and Pocket2Mol (ref. 15), measured in terms of QED, synthetic accessibility (SA), FCD, fragment similarity, scaffold similarity, and adherence to Lipinski, Veber and PAINS filters, together with docking scores (median kcal/mol values of the top 10% of molecules in terms of docking scores), for the AKT1 and CDK2 targeting tasks, separately
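The docking summary statistic named in this title (the median score of the best-docking 10% of molecules) can be computed as in the sketch below. It assumes that more negative scores indicate stronger predicted binding and that the size of the top subset is rounded up to at least one molecule; the rounding rule and the function name are assumptions, not details taken from the paper.

```python
from statistics import median


def top_percent_median(scores, percent=10):
    """Median of the best (most negative) `percent`% of docking scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    # Integer ceiling of len(scores) * percent / 100, with a floor of 1,
    # computed without floating-point arithmetic.
    k = max(1, (len(scores) * percent + 99) // 100)
    best = sorted(scores)[:k]  # ascending sort puts the most negative first
    return median(best)


# Made-up scores in kcal/mol: -1.0, -2.0, ..., -20.0
example_scores = [-float(i) for i in range(1, 21)]
summary = top_percent_median(example_scores)
```

With 20 scores the top 10% contains two molecules, so the statistic here is the mean of the two most negative values.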

Supplementary information

Supplementary Information

Supplementary Sections 1–13, Figs. 1–17 and Tables 1–3.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ünlü, A., Çevrim, E., Yiğit, M.G. et al. Target-specific de novo design of drug candidate molecules with graph-transformer-based generative adversarial networks. Nat Mach Intell 7, 1524–1540 (2025). https://doi.org/10.1038/s42256-025-01082-y

  • DOI: https://doi.org/10.1038/s42256-025-01082-y
