Abstract
Molecular docking plays a crucial role in structure-based drug discovery, enabling the prediction of how small molecules interact with protein targets. Traditional docking methods rely on scoring functions and search heuristics, whereas recent generative approaches, such as DiffDock, leverage deep learning for pose prediction. However, blind diffusion-based docking often struggles with binding site localization and pose accuracy, particularly in complex protein–ligand systems. This work introduces GeoDirDock (GDD), a guided diffusion approach to molecular docking that enhances the accuracy and physical plausibility of ligand docking predictions. GDD guides the denoising process of a diffusion model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. Our method leverages expert knowledge to direct the generative modelling process, specifically targeting desired protein–ligand interaction regions. We demonstrate that GDD outperforms existing blind docking methods in terms of root mean squared distance accuracy and physicochemical pose realism. Our results indicate that incorporating domain expertise into the diffusion process leads to more biologically relevant docking predictions. Additionally, we explore the potential of GDD as a template-based modelling tool for lead optimization in drug discovery through angle transfer in maximum common substructure docking, showcasing its capability to accurately predict ligand orientations for chemically similar compounds. Future applications in real-world drug discovery campaigns should further refine and extend the utility of prior-informed diffusion docking methods.
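As a rough illustration of what following a geodesic means for the torsional degrees of freedom, the sketch below moves a set of torsion angles along the shortest arc on the circle toward target values. It is not taken from the GDD code base: the function names `wrap_angle` and `torus_geodesic_step`, the `fraction` parameter and the example angles are all hypothetical, and the guidance described in the abstract acts inside the diffusion model's denoising process rather than as a standalone interpolation.

```python
import math


def wrap_angle(theta):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def torus_geodesic_step(current, target, fraction):
    """Move each torsion angle a fraction of the way along the shortest arc.

    Hypothetical helper for illustration only. `current` and `target` are
    equal-length sequences of angles in radians; `fraction` is in [0, 1].
    """
    stepped = []
    for c, t in zip(current, target):
        # Signed shortest angular difference from c to t on the circle.
        delta = wrap_angle(t - c)
        stepped.append(wrap_angle(c + fraction * delta))
    return stepped


if __name__ == "__main__":
    # Illustrative values only: two torsions nudged halfway to their targets.
    print(torus_geodesic_step([3.0, 0.0], [-3.0, 1.0], 0.5))
```

With `fraction=0.5`, angles of 3.0 and -3.0 rad meet near ±π rather than at 0, because the shorter way round the circle crosses the ±π boundary. A naive linear interpolation of the raw angle values would take the long way round instead.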
Data availability
The datasets used for this work are available as follows: PDBBind (http://pdbbind.org.cn/), PoseBusters (https://github.com/maabuu/posebusters), DockGen (https://github.com/gcorso/DiffDock) and D3R Grand Challenge 4 (www.drugdesigndata.org).
Code availability
All source code as well as instructions on how to run the code are available via GitHub at https://github.com/NBDsoftware/GDD and via Zenodo at https://doi.org/10.5281/zenodo.15755564 (ref. 21).
References
Meng, X.-Y., Zhang, H.-X., Mezei, M. & Cui, M. Molecular docking: a powerful approach for structure-based drug discovery. Curr. Comput. Aided Drug Des. 7, 146–157 (2011).
Friesner, R. A. et al. Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J. Med. Chem. 47, 1739–1749 (2004).
Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461 (2010).
Ruiz-Carmona, S. et al. rDock: a fast, versatile and open source program for docking ligands to proteins and nucleic acids. PLoS Comput. Biol. 10, e1003571 (2014).
Corso, G. et al. DiffDock: diffusion steps, twists, and turns for molecular docking. In The Eleventh International Conference on Learning Representations 1–33 (2023).
Ghersi, D. & Sanchez, R. Improving accuracy and efficiency of blind protein-ligand docking by focusing on predicted binding sites. Proteins 74, 417–424 (2009).
Buttenschoen, M., Morris, G. M. & Deane, C. M. PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences. Chem. Sci. 15, 3130–3139 (2024).
Yu, Y. et al. Do deep learning models really outperform traditional approaches in molecular docking? In The Eleventh International Conference on Learning Representations Workshop on Machine Learning for Drug Discovery (MLDD) 1–7 (2023).
Koes, D. R., Baumgartner, M. P. & Camacho, C. J. Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. J. Chem. Inf. Model. 53, 1893–1904 (2013).
McNutt, A. T. et al. Gnina 1.0: molecular docking with deep learning. J. Cheminform. 13, 43 (2021).
Nakata, S., Mori, Y. & Tanaka, S. End-to-end protein–ligand complex structure generation with diffusion-based generative models. BMC Bioinform. 24, 233 (2023).
Qiao, Z., Nie, W., Vahdat, A., Miller, T. F. III & Anandkumar, A. State-specific protein–ligand complex structure prediction with a multiscale deep generative model. Nat. Mach. Intell. 6, 195–208 (2024).
Plainer, M. et al. DiffDock-Pocket: diffusion for pocket-level docking with sidechain flexibility. In NeurIPS 2023 Workshop on New Frontiers of AI for Drug Discovery and Development 1–26 (2023).
Stärk, H. et al. EquiBind: geometric deep learning for drug binding structure prediction. In International Conference on Machine Learning 20503–20521 (PMLR, 2022).
Krivák, R. & Hoksza, D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. J. Cheminform. 10, 39 (2018).
Le Guilloux, V., Schmidtke, P. & Tuffery, P. Fpocket: an open source platform for ligand pocket detection. BMC Bioinform. 10, 168 (2009).
Raman, E. P. Template-based method for conformation generation and scoring for congeneric series of ligands. J. Chem. Inf. Model. 59, 2690–2701 (2019).
Whitehouse, A. J. et al. Development of inhibitors against Mycobacterium abscessus tRNA (m1G37) methyltransferase (TrmD) using fragment-based approaches. J. Med. Chem. 62, 7210–7232 (2019).
Parks, C. D. et al. D3R Grand Challenge 4: blind prediction of protein–ligand poses, affinity rankings, and relative binding free energies. J. Comput. Aided Mol. Des. 34, 99–119 (2020).
Dhariwal, P. & Nichol, A. Diffusion models beat GANs on image synthesis. In 35th Conference on Neural Information Processing Systems (NeurIPS 2021) https://proceedings.nips.cc/paper/2021/file/49ad23d1ec9fa4bd8d77d02681df5cfa-Paper.pdf (NeurIPS, 2021).
Miñán, R. & rmincam. NBDsoftware/GDD: GeoDirDock 1.0. Zenodo https://doi.org/10.5281/zenodo.15755564 (2025).
Acknowledgements
We thank all members of Nostrum Biodiscovery for the help provided and insightful discussions.
Author information
Authors and Affiliations
Contributions
J.G. and R.M. developed the model architecture and conducted the experiments. R.M. contributed to the dataset curation and preprocessing. R.M. implemented the evaluation framework and conducted benchmarking. J.G., R.M., Á.C. and A.M. conceived the project. Á.C. and A.M. supervised the research. R.M. and Á.C. wrote the manuscript. A.M. revised the final draft and provided corrections. All authors reviewed and approved the final manuscript.
Corresponding authors
Ethics declarations
Competing interests
R.M., Á.C. and A.M. are employees at Nostrum Biodiscovery. J.G. performed this work during an internship at Nostrum Biodiscovery.
Peer review
Peer review information
Nature Machine Intelligence thanks Alex Morehead, Shigenori Tanaka and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Sections A–G, Figs. 1–13, Methods, Discussion and Results.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Miñán, R., Gallardo, J., Ciudad, Á. et al. Informed protein–ligand docking via geodesic guidance in translational, rotational and torsional spaces. Nat Mach Intell 7, 1555–1560 (2025). https://doi.org/10.1038/s42256-025-01091-x
DOI: https://doi.org/10.1038/s42256-025-01091-x