Target-specific de novo design of drug candidate molecules with graph-transformer-based generative adversarial networks

Ünlü, Atabey; Çevrim, Elif; Yiğit, Melih Gökay; Sarıgün, Ahmet; Çelikbilek, Hayriye; Bayram, Osman; Kahraman, Deniz Cansen; Olğaç, Abdurrahman; Rifaioglu, Ahmet Sureyya; Banoğlu, Erden; Doğan, Tunca

doi:10.1038/s42256-025-01082-y

Article
Published: 15 September 2025

Atabey Ünlü¹,² (equal contribution),
Elif Çevrim¹,² (equal contribution; ORCID: orcid.org/0000-0001-7797-8080),
Melih Gökay Yiğit¹,³ (equal contribution),
Ahmet Sarıgün¹,⁴,⁵,
Hayriye Çelikbilek¹,
Osman Bayram⁶,
Deniz Cansen Kahraman⁷,
Abdurrahman Olğaç⁸,⁹ (ORCID: orcid.org/0000-0001-8470-4942),
Ahmet Sureyya Rifaioglu¹⁰,
Erden Banoğlu⁸ &
Tunca Doğan¹,²,¹¹ (ORCID: orcid.org/0000-0002-1298-9763)

Nature Machine Intelligence volume 7, pages 1524–1540 (2025)

A preprint version of the article is available at arXiv.

Abstract

Discovering novel drug candidate molecules is a fundamental step in drug development. Generative deep learning models can sample new molecular structures from learned probability distributions; however, their practical use in drug discovery hinges on generating compounds tailored to a specific target molecule. Here we introduce DrugGEN, an end-to-end generative system for the de novo design of drug candidate molecules that interact with a selected protein. The proposed method represents molecules as graphs and processes them using a generative adversarial network that comprises graph transformer layers. Trained on large datasets of drug-like compounds and target-specific bioactive molecules, DrugGEN designed candidate inhibitors for AKT1, a kinase crucial in many cancers. Docking and molecular dynamics simulations suggest that the generated compounds effectively bind to AKT1, and attention maps provide insights into the model’s reasoning. Furthermore, selected de novo molecules were synthesized and shown to inhibit AKT1 at low micromolar concentrations in the context of in vitro enzymatic assays. These results demonstrate the potential of DrugGEN for designing target-specific molecules. Using the open-access DrugGEN codebase, researchers can retrain the model for other druggable proteins, provided a dataset of known bioactive molecules is available.
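
The graph representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy atom and bond vocabulary and a hypothetical helper `molecule_to_graph`; it is not DrugGEN's actual encoding, which this page does not specify. A molecule becomes a node feature matrix (one one-hot row per heavy atom) and a symmetric adjacency matrix of bond-type codes.

```python
# Illustrative vocabularies (assumptions, not taken from the DrugGEN codebase).
ATOM_TYPES = ["C", "N", "O"]
BOND_TYPES = ["none", "single", "double"]


def one_hot(index, size):
    """Return a list of zeros of length `size` with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def molecule_to_graph(atoms, bonds):
    """Encode a molecule as (node_features, adjacency).

    atoms: list of element symbols, one per heavy atom.
    bonds: list of (i, j, bond_type) tuples, where i and j are atom indices.
    """
    n = len(atoms)
    # Node feature matrix: one one-hot row per atom.
    node_features = [one_hot(ATOM_TYPES.index(a), len(ATOM_TYPES)) for a in atoms]
    # Adjacency matrix storing a bond-type code per atom pair (0 means no bond).
    adjacency = [[0] * n for _ in range(n)]
    for i, j, bond_type in bonds:
        code = BOND_TYPES.index(bond_type)
        adjacency[i][j] = code
        adjacency[j][i] = code  # molecular graphs are undirected
    return node_features, adjacency


# Ethanol, heavy atoms only: C0-C1 single bond, C1-O2 single bond.
features, adj = molecule_to_graph(
    ["C", "C", "O"], [(0, 1, "single"), (1, 2, "single")]
)
```

In a graph-based generative adversarial network, matrices of this kind are what the generator would output and the discriminator would score; how the graph transformer layers consume them is described in the full article rather than here.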

**Fig. 2: Architecture of the DrugGEN model with powerful graph transformer encoder modules in both generator and discriminator networks.**

**Fig. 3: Exploration of de novo molecules via downstream analysis.**

**Fig. 4: Structural analysis of capivasertib (bound ligand in 4GV1) and five de novo-generated molecules (molecules 1–5) that were selected for experimental validation.**

**Fig. 5: Visualization of DrugGEN attention maps on three de novo molecules.**

Data availability

All resources required to reproduce this work are open source. The full training datasets (curated ChEMBL compound sets, AKT1 and CDK2 bioactivity records), the pretrained weight files for every DrugGEN model variant (targeted and non-targeted) and all result artefacts—including the complete sets of de novo-generated molecules, molecular docking output tables and MD trajectory analyses—are archived in the DrugGEN dataset repository and are available via figshare at https://doi.org/10.6084/m9.figshare.29119205.v3 (ref. ⁸⁶). The data to reproduce the experiments and the output files are available via GitHub at https://github.com/HUBioDataLab/DrugGEN.

Code availability

The source code and ready-to-use trained models are available in the archived DrugGEN repository, which is available via GitHub at https://github.com/HUBioDataLab/DrugGEN and via Zenodo at https://doi.org/10.5281/zenodo.15014579 (ref. ⁸⁷). DrugGEN is also available as an online tool with a graphical interface at https://huggingface.co/spaces/HUBioDataLab/DrugGEN, where users can generate de novo molecules by using the desired model.

References

1. Rifaioglu, A. S. et al. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief. Bioinform. 20, 1878–1912 (2019).
2. Paul, S. M. et al. How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nat. Rev. Drug Discov. 9, 203–214 (2010).
3. Bhisetti, G. & Fang, C. Artificial intelligence–enabled de novo design of novel compounds that are synthesizable. Methods Mol. Biol. 2390, 409–419 (2022).
4. Elton, D. C., Boukouvalas, Z., Fuge, M. D. & Chung, P. W. Deep learning for molecular design—a review of the state of the art. Mol. Syst. Des. Eng. 4, 828–849 (2019).
5. Walters, W. P. Virtual chemical libraries: miniperspective. J. Med. Chem. 62, 1116–1124 (2018).
6. Mouchlis, V. D. et al. Advances in de novo drug design: from conventional to machine learning methods. Int. J. Mol. Sci. 22, 1676 (2021).
7. Kingma, D. P. & Welling, M. An introduction to variational autoencoders. Found. Trends Mach. Learn. 12, 307–392 (2019).
8. Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
9. Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).
10. De Cao, N. & Kipf, T. MolGAN: an implicit generative model for small molecular graphs. In ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models (2018).
11. Zou, J., Yu, J., Hu, P., Zhao, L. & Shi, S. STAGAN: an approach for improve the stability of molecular graph generation based on generative adversarial networks. Comput. Biol. Med. 167, 107691 (2023).
12. Mahmood, O., Mansimov, E., Bonneau, R. & Cho, K. Masked graph modeling for molecule generation. Nat. Commun. 12, 3156 (2021).
13. Ho, J., Jain, A. & Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 33, 6840–6851 (2020).
14. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N. & Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proc. 32nd International Conference on Machine Learning (eds Bach, F. & Blei, D.) 2256–2265 (PMLR, 2015).
15. Peng, X. et al. Pocket2Mol: efficient molecular sampling based on 3D protein pockets. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 17644–17655 (PMLR, 2022).
16. Schneuing, A. et al. Structure-based drug design with equivariant diffusion models. Nat. Comput. Sci. 4, 899–909 (2024).
17. Hoogeboom, E., Satorras, V. G., Vignac, C. & Welling, M. Equivariant diffusion for molecule generation in 3D. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 8867–8887 (PMLR, 2022).
18. Mitton, J., Senn, H. M., Wynne, K. & Murray-Smith, R. A graph VAE and graph transformer approach to generating molecular graphs. In ICML 2020 Workshop on Graph Representation Learning and Beyond (2020).
19. Richards, R. J. & Groener, A. M. Conditional β-VAE for de novo molecular generation. Preprint at https://arxiv.org/abs/2205.01592 (2022).
20. Nemoto, K. & Kaneko, H. De novo direct inverse QSPR/QSAR: chemical variational autoencoder and Gaussian mixture regression models. J. Chem. Inf. Model. 63, 794–805 (2023).
21. Kadurin, A., Nikolenko, S., Khrabrov, K., Aliper, A. & Zhavoronkov, A. druGAN: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol. Pharm. 14, 3098–3104 (2017).
22. Xie, X., Valiente, P. A. & Kim, P. M. HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures. Bioinformatics 39, btad036 (2023).
23. Arús-Pous, J. et al. Randomized SMILES strings improve the quality of molecular generative models. J. Cheminform. 11, 71 (2019).
24. Blaschke, T. et al. REINVENT 2.0: an AI tool for de novo drug design. J. Chem. Inf. Model. 60, 5918–5922 (2020).
25. Wang, X. et al. PETrans: de novo drug design with protein-specific encoding based on transfer learning. Int. J. Mol. Sci. 24, 1146 (2023).
26. Bagal, V., Aggarwal, R., Vinod, P. K. & Priyakumar, U. D. MolGPT: molecular generation using a transformer-decoder model. J. Chem. Inf. Model. 62, 2064–2076 (2022).
27. Yang, M. et al. CMGN: a conditional molecular generation net to design target-specific molecules with desired properties. Brief. Bioinform. 24, bbad185 (2023).
28. Zhang, O. et al. ResGen is a pocket-aware 3D molecular generation model based on parallel multiscale modelling. Nat. Mach. Intell. 5, 1020–1030 (2023).
29. Guan, J. et al. 3D equivariant diffusion for target-aware molecule generation and affinity prediction. In Eleventh International Conference on Learning Representations (OpenReview.net, 2023); https://openreview.net/forum?id=kJqXEPXMsE0
30. Perron, Q. et al. Deep generative models for ligand-based de novo design applied to multi-parametric optimization. J. Comput. Chem. 43, 692–703 (2022).
31. Fang, Y., Pan, X. & Shen, H.-B. De novo drug design by iterative multiobjective deep reinforcement learning with graph-based molecular quality assessment. Bioinformatics 39, btad157 (2023).
32. Zhou, Z., Kearnes, S., Li, L., Zare, R. N. & Riley, P. Optimization of molecules via deep reinforcement learning. Sci. Rep. 9, 10752 (2019).
33. Liu, M., Luo, Y., Uchino, K., Maruhashi, K. & Ji, S. Generating 3D molecules for target protein binding. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 13912–13924 (PMLR, 2022).
34. Wang, M. et al. RELATION: a deep generative model for structure-based de novo drug design. J. Med. Chem. 65, 9478–9492 (2022).
35. Gebauer, N. W. A., Gastegger, M., Hessmann, S. S. P., Müller, K.-R. & Schütt, K. T. Inverse design of 3D molecular structures with conditional generative neural networks. Nat. Commun. 13, 973 (2022).
36. Shi, W. et al. Pocket2Drug: an encoder-decoder deep neural network for the target-based drug design. Front. Pharmacol. 13, 837715 (2022).
37. Uludoğan, G., Ozkirimli, E., Ulgen, K. O., Karalı, N. & Özgür, A. Exploiting pretrained biochemical language models for targeted drug design. Bioinformatics 38, ii155–ii161 (2022).
38. Rozenberg, E. & Freedman, D. Semi-equivariant conditional normalizing flows, with applications to target-aware molecule generation. Mach. Learn. Sci. Technol. 4, 035037 (2023).
39. Li, Y. et al. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat. Commun. 13, 6891 (2022).
40. Zhang, Y. et al. Universal approach to de novo drug design for target proteins using deep reinforcement learning. ACS Omega 8, 5464–5474 (2023).
41. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 6000–6010 (2017).
42. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In Proc. 5th International Conference on Learning Representations (OpenReview.net, 2017).
43. Li, P. et al. An effective self-supervised framework for learning expressive molecular global representations to drug discovery. Brief. Bioinform. 22, bbab109 (2021).
44. Rong, Y. et al. Self-supervised graph transformer on large-scale molecular data. Adv. Neural Inf. Process. Syst. 33, 12559–12571 (2020).
45. Li, H. et al. A knowledge-guided pre-training framework for improving molecular representation learning. Nat. Commun. 14, 7568 (2023).
46. Zhu, J.-Y., Park, T., Isola, P. & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proc. IEEE International Conference on Computer Vision 2223–2232 (IEEE, 2017).
47. Kim, T., Cha, M., Kim, H., Lee, J. K. & Kim, J. Learning to discover cross-domain relations with generative adversarial networks. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1857–1865 (PMLR, 2017).
48. Addie, M. et al. Discovery of 4-amino-N-[(1S)-1-(4-chlorophenyl)-3-hydroxypropyl]-1-(7H-pyrrolo[2,3-d]pyrimidin-4-yl)piperidine-4-carboxamide (AZD5363), an orally bioavailable, potent inhibitor of Akt kinases. J. Med. Chem. 56, 2059–2073 (2013).
49. Polykovskiy, D. et al. Molecular Sets (MOSES): a benchmarking platform for molecular generation models. Front. Pharmacol. 11, 565644 (2020).
50. Tadesse, S. et al. Targeting CDK2 in cancer: challenges and opportunities for therapy. Drug Discov. Today 25, 406–413 (2020).
51. McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).
52. Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
53. Rifaioglu, A. S. et al. DEEPScreen: high performance drug–target interaction prediction with convolutional neural networks using 2-D structural compound representations. Chem. Sci. 11, 2531–2557 (2020).
54. Ertl, P. & Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform. 1, 8 (2009).
55. Bickerton, G. R., Paolini, G. V., Besnard, J., Muresan, S. & Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 4, 90–98 (2012).
56. Abeer, A. N. M. N., Urban, N. M., Weil, M. R., Alexander, F. J. & Yoon, B.-J. Multi-objective latent space optimization of generative molecular design models. Patterns 5, 101042 (2024).
57. Jain, M. et al. Multi-objective GFlowNets. In Proc. 40th International Conference on Machine Learning (eds Krause, A. et al.) 14631–14653 (PMLR, 2023).
58. Monteiro, N. R. C. et al. FSM-DDTR: end-to-end feedback strategy for multi-objective de novo drug design using transformers. Comput. Biol. Med. 164, 107285 (2023).
59. Suzuki, T., Ma, D., Yasuo, N. & Sekijima, M. Mothra: multiobjective de novo molecular generation using Monte Carlo tree search. J. Chem. Inf. Model. 64, 7291–7302 (2024).
60. Ghosh, B., Dutta, I. K., Totaro, M. & Bayoumi, M. A survey on the progression and performance of generative adversarial networks. In 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT) 1–8 (IEEE, 2020).
61. Gui, J., Sun, Z., Wen, Y., Tao, D. & Ye, J. A review on generative adversarial networks: algorithms, theory, and applications. IEEE Trans. Knowl. Data Eng. 35, 3313–3332 (2023).
62. Janson, G., Valdes-Garcia, G., Heo, L. & Feig, M. Direct generation of protein conformational ensembles via machine learning. Nat. Commun. 14, 774 (2023).
63. Arjovsky, M., Chintala, S. & Bottou, L. Wasserstein generative adversarial networks. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 214–223 (PMLR, 2017).
64. Krenn, M., Häse, F., Nigam, A., Friederich, P. & Aspuru-Guzik, A. Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach. Learn. Sci. Technol. 1, 045024 (2020).
65. Yüksel, A., Ulusoy, E., Ünlü, A. & Doğan, T. Selformer: molecular representation learning via SELFIES language models. Mach. Learn. Sci. Technol. 4, 035014 (2023).
66. Doğan, T. et al. CROssBAR: comprehensive resource of biomedical relations with knowledge graph representations. Nucleic Acids Res. 49, e96 (2021).
67. Mendez, D. et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 47, D930–D940 (2019).
68. Wishart, D. S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46, D1074–D1082 (2018).
69. Landrum, G. et al. rdkit/rdkit: 2024_09_6 (Q3 2024) Release (Release_2024_09_6). Zenodo https://doi.org/10.5281/zenodo.14943932 (2025).
70. Dwivedi, V. P. & Bresson, X. A generalization of transformer networks to graphs. In AAAI Workshop on Deep Learning on Graphs: Methods and Applications (2021).
71. Vignac, C. et al. DiGress: discrete denoising diffusion for graph generation. In Eleventh International Conference on Learning Representations (OpenReview.net, 2023); https://openreview.net/forum?id=UaAD-Nu86WX
72. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V. & Courville, A. C. Improved training of Wasserstein GANs. Adv. Neural Inf. Process. Syst. 30, 5769–5779 (2017).
73. Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (OpenReview.net, 2019); https://openreview.net/forum?id=Bkg6RiCqY7
74. Schoenmaker, L., Béquignon, O. J. M., Jespers, W. & Van Westen, G. J. P. UnCorrupt SMILES: a novel approach to de novo design. J. Cheminform. 15, 22 (2023).
75. Rogers, D. & Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 50, 742–754 (2010).
76. Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 23, 3–25 (1997).
77. Veber, D. F. et al. Molecular properties that influence the oral bioavailability of drug candidates. J. Med. Chem. 45, 2615–2623 (2002).
78. Baell, J. B. & Holloway, G. A. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 53, 2719–2740 (2010).
79. Madhavi Sastry, G., Adzhigirey, M., Day, T., Annabhimoju, R. & Sherman, W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J. Comput. Aided Mol. Des. 27, 221–234 (2013).
80. Schrödinger Release 2022-1: Maestro (Schrödinger, 2022).
81. Friesner, R. A. et al. Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein–ligand complexes. J. Med. Chem. 49, 6177–6196 (2006).
82. Martin, M. P., Olesen, S. H., Georg, G. I. & Schönbrunn, E. Cyclin-dependent kinase inhibitor dinaciclib interacts with the acetyl-lysine recognition site of bromodomains. ACS Chem. Biol. 8, 2360–2365 (2013).
83. The PyMOL molecular graphics system (version 1.8) (Schrödinger, 2015).
84. Durant, J. L., Leland, B. A., Henry, D. R. & Nourse, J. G. Reoptimization of MDL keys for use in drug discovery. J. Chem. Inf. Comput. Sci. 42, 1273–1280 (2002).
85. Yang, K. et al. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59, 3370–3388 (2019).
86. Ünlü, A., Çevrim, E., Yiğit, M. G., Olğaç, A. & Doğan, T. DrugGEN resource collection: training data, model weights, generated molecules, docking and MD analyses (version 3). figshare https://doi.org/10.6084/m9.figshare.29119205.v3 (2025).
87. Ünlü, A., Çevrim, E., Yigit, M. G., Sarigun, A. & Dogan, T. HUBioDataLab/DrugGEN: DrugGEN v2.0 release (v2.0). Zenodo https://doi.org/10.5281/zenodo.15014579 (2025).
88. Guimaraes, G. L., Sanchez-Lengeling, B., Outeiral, C., Farias, P. L. C. & Aspuru-Guzik, A. Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models. Preprint at https://arxiv.org/abs/1705.10843 (2018).
89. Grisoni, F., Moret, M., Lingwood, R. & Schneider, G. Bidirectional molecule generation with recurrent neural networks. J. Chem. Inf. Model. 60, 1175–1183 (2020).
90. Xie, Y. et al. MARS: Markov molecular sampling for multi-objective drug discovery. In International Conference on Learning Representations 1–19 (ICLR, 2021).
91. Olivecrona, M., Blaschke, T., Engkvist, O. & Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminform. 9, 48 (2017).
92. Matsukiyo, Y., Yamanaka, C. & Yamanishi, Y. De novo generation of chemical structures of inhibitor and activator candidates for therapeutic target proteins by a transformer-based variational autoencoder and Bayesian optimization. J. Chem. Inf. Model. 64, 2345–2355 (2024).

Acknowledgements

This project was supported by TUBITAK-BIDEB 2247-A National Outstanding Researchers Program under project no. 120C123. We thank H. A. Güvenilir for guidance during the preparation of datasets and A. Koyaş for aiding the target protein selection process.

Author information

These authors contributed equally: Atabey Ünlü, Elif Çevrim, Melih Gökay Yiğit.

Authors and Affiliations

1. Biological Data Science Lab, Department of Computer Engineering, Hacettepe University, Ankara, Turkey
Atabey Ünlü, Elif Çevrim, Melih Gökay Yiğit, Ahmet Sarıgün, Hayriye Çelikbilek & Tunca Doğan
2. Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
Atabey Ünlü, Elif Çevrim & Tunca Doğan
3. Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
Melih Gökay Yiğit
4. Department of Chemistry, Middle East Technical University, Ankara, Turkey
Ahmet Sarıgün
5. Department of Physics, Middle East Technical University, Ankara, Turkey
Ahmet Sarıgün
6. Department of Artificial Intelligence Engineering, Bahcesehir University, Istanbul, Turkey
Osman Bayram
7. Cancer Systems Biology Lab, Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
Deniz Cansen Kahraman
8. Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Gazi University, Ankara, Turkey
Abdurrahman Olğaç & Erden Banoğlu
9. Laboratory of Molecular Modeling, Evias Pharmaceutical R&D Ltd, Ankara, Turkey
Abdurrahman Olğaç
10. Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
Ahmet Sureyya Rifaioglu
11. Department of Health Informatics, Institute of Informatics, Hacettepe University, Ankara, Turkey
Tunca Doğan

Contributions

T.D. conceptualized the study and designed the general methodology. E.Ç. prepared the datasets. A.S., A.Ü., A.S.R. and T.D. determined the technical details of the fundamental model architecture. A.Ü. and A.S. prepared the original codebase and designed and implemented the initial models. A.Ü. and M.G.Y. designed, implemented, trained, tuned and evaluated numerous model variants and constructed the finalized DrugGEN models. A.O. and E.Ç. conducted the molecular filtering operations and physics-based (docking and MD) experiments. A.Ü. and H.Ç. analysed the de novo-generated molecules in the context of deep learning-based DTI prediction. M.G.Y., E.Ç. and T.D. conducted the attention map analysis. A.Ü., E.Ç., A.O., E.B. and T.D. evaluated and discussed the findings. E.Ç., A.Ü., A.S., A.O. and T.D. visualized the results and prepared the figures in the paper. A.Ü., E.Ç., M.G.Y., A.S., D.C.K., A.O. and T.D. wrote the paper. A.Ü., E.Ç., A.S., M.G.Y. and T.D. prepared the repository. O.B. and M.G.Y. constructed the online tool. T.D. supervised the overall study. All authors approved the paper.

Corresponding author

Correspondence to Tunca Doğan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Artur Kadurin, Pengyong Li and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1

Thirty promising de novo molecules generated by the DrugGEN model to target the AKT1 protein. The molecules were selected via expert curation from the set of generated molecules that had sufficiently low binding free energies (< −8 kcal/mol) in the molecular docking experiment and were predicted as ‘active’ by the deep learning-based drug–target interaction (DTI) predictor DEEPScreen⁵³.
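
The two computational criteria in this caption amount to a simple conjunctive filter applied before expert curation. The sketch below assumes a hypothetical record layout (`id`, `docking_score` in kcal/mol, `dti_label`) and a hypothetical function name; only the −8 kcal/mol cutoff and the ‘active’ label come from the caption, and the expert curation step itself is not modelled.

```python
DOCKING_CUTOFF_KCAL_MOL = -8.0  # threshold quoted in the caption


def shortlist_for_curation(candidates):
    """Return IDs of molecules that pass both computational filters.

    A molecule passes if its docking score is strictly below the cutoff
    and its DTI prediction label is 'active'.
    """
    return [
        c["id"]
        for c in candidates
        if c["docking_score"] < DOCKING_CUTOFF_KCAL_MOL and c["dti_label"] == "active"
    ]


# Hypothetical records; the values are invented for illustration only.
example = [
    {"id": "mol_a", "docking_score": -9.2, "dti_label": "active"},
    {"id": "mol_b", "docking_score": -8.0, "dti_label": "active"},     # not strictly below cutoff
    {"id": "mol_c", "docking_score": -10.1, "dti_label": "inactive"},  # fails the DTI criterion
]
selected = shortlist_for_curation(example)
```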

Extended Data Fig. 2 Structural analysis of capivasertib (bound ligand in 4GV1) and five de novo generated molecules (Mol. ID 1–5) that were selected for experimental validation.

(a) Initial binding orientations of capivasertib, Molecules 1 and 2 at the starting point of molecular dynamics (MD) simulations. (b) Key protein-ligand interactions observed during MD simulations, visualised with interacting residues and interaction types. The depicted poses represent the most populated conformations from each simulation. (c) Root-mean-square deviation (RMSD) values of capivasertib, Molecules 1 and 2 in complex with AKT1. (d) Root-mean-square fluctuation (RMSF) values of ligand atoms in the same complexes. Abbreviations: I-VII represent β-sheet numbers, g.l represents glycine-rich loop, c.l represents catalytic loop, GK represents gatekeeper residue, and xDFG represents highly conserved kinase residues; linker represents the loop that connects the hinge domain to the αC-helix. Gray dashed lines represent van der Waals interactions. Blue lines represent hydrogen bonds and water bridges. Green lines indicate halogen bonds. Yellow dashed lines represent salt bridges. Directional interactions were noted only when the occupancy value exceeded 10%; however, for visual clarity, occupancy values of the water bridges were stated only when they exceeded 30%.
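
For readers unfamiliar with the two stability measures named in panels (c) and (d), the following is a minimal sketch of their standard definitions. It assumes the trajectory frames are already superimposed on a common reference (no alignment is performed) and that coordinates are plain (x, y, z) tuples; the function names are illustrative and do not reflect the MD analysis software used in the study.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation between one frame and a reference frame.

    Both arguments are lists of (x, y, z) tuples in the same atom order.
    """
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around each atom's mean position.

    `trajectory` is a list of frames; each frame is a list of (x, y, z) tuples.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        squared = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        )
        fluctuations.append(math.sqrt(squared / n_frames))
    return fluctuations


# Toy two-atom, two-frame trajectory: atom 0 is fixed, atom 1 moves along x.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame1 = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
deviation = rmsd(frame1, frame0)
per_atom = rmsf([frame0, frame1])
```

RMSD summarises how far the whole ligand has drifted from its starting pose at a given time, whereas RMSF reports how much each individual atom fluctuates over the simulation.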

Extended Data Table 1 Drug-likeness related and target-centric performance of DrugGEN and other methods: RELATION³⁴, ResGen²⁸, TRIOMPHE-BOA⁹², TargetDiff²⁹, and Pocket2Mol¹⁵, measured in terms of QED, synthetic accessibility (SA), FCD, fragment similarity, scaffold similarity, and adherence to Lipinski, Veber, and PAINS filters, together with docking scores (median kcal/mol values of the top 10% of molecules in terms of docking scores), for the AKT1 and CDK2 targeting tasks, separately.
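
The docking summary statistic described in this caption (the median score of the top 10% of molecules) can be sketched as follows. The sketch assumes that lower (more negative) docking scores are better and that the size of the top set is rounded down with a minimum of one molecule; both the rounding rule and the function name are assumptions, as the caption does not state them.

```python
import statistics


def top_fraction_median(scores, fraction=0.10):
    """Median docking score of the best-scoring `fraction` of molecules.

    Scores are in kcal/mol; more negative values are treated as better.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    ranked = sorted(scores)  # most negative (best) first
    k = max(1, int(len(ranked) * fraction))  # assumed rounding rule
    return statistics.median(ranked[:k])


# Hypothetical docking scores for 20 molecules (invented values).
scores = [
    -10.0, -9.5, -9.0, -8.5, -8.0, -7.5, -7.0, -6.5, -6.0, -5.5,
    -5.0, -4.5, -4.0, -3.5, -3.0, -2.5, -2.0, -1.5, -1.0, -0.5,
]
summary = top_fraction_median(scores)
```

With 20 molecules the top 10% contains two scores, so the statistic here is the median of −10.0 and −9.5.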

Supplementary information

Supplementary Information

Supplementary Sections 1–13, Figs. 1–17 and Tables 1–3.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ünlü, A., Çevrim, E., Yiğit, M.G. et al. Target-specific de novo design of drug candidate molecules with graph-transformer-based generative adversarial networks. Nat Mach Intell 7, 1524–1540 (2025). https://doi.org/10.1038/s42256-025-01082-y

Received: 26 July 2024
Accepted: 23 June 2025
Published: 15 September 2025
Version of record: 15 September 2025
Issue date: September 2025
DOI: https://doi.org/10.1038/s42256-025-01082-y