Pharmacophore-oriented 3D molecular generation toward efficient feature-customized drug discovery

Abstract

Molecular generation is a cutting-edge technology with the potential to revolutionize intelligent drug discovery. However, currently reported ligand-based or structure-based molecular generation methods remain impractical for real-world drug discovery. Here we propose an explicit pharmacophore-oriented 3D molecular generation method, termed PhoreGen. PhoreGen employs asynchronous perturbations and updates on both atomic and bond information, coupled with a message-passing mechanism that incorporates prior knowledge of ligand–pharmacophore mapping during the diffusion–denoising process. Evaluations revealed that PhoreGen efficiently generates 3D molecules that are well aligned with pharmacophores while maintaining good chemical reasonability, diversity, drug-likeness and binding affinity and, importantly, that it produces feature-customized molecules at high frequency. By using PhoreGen, we successfully identified new bicyclic boronate inhibitors of evolved metallo-β-lactamases and serine-β-lactamases, which potentiate meropenem against clinically isolated superbugs. Moreover, we identified inhibitors of metallo-nicotinamidases, emerging targets for insecticides. This work explores an explicitly constrained mode for molecular generation and demonstrates its potential in feature-customized drug discovery.

Fig. 1: The overall architecture of PhoreGen.
Fig. 2: PhoreGen generates 3D molecules well aligned with complex- and ligand-derived pharmacophore models.
Fig. 3: PhoreGen shows great potential in covalent and metalloenzyme drug design.
Fig. 4: Application of PhoreGen in discovering novel dual MBL/SBL inhibitors.
Fig. 5: Application of PhoreGen leads to identification of the covalent inhibitors of Naam.

Data availability

The LigPhore, CpxPhore and DockPhore datasets for training and evaluation are available via Zenodo at https://zenodo.org/records/15518867 (ref. 62). The PDBBind dataset is available at http://pdbbind.org.cn. The CrossDocked2020 dataset is available at https://bits.csb.pitt.edu/files/crossdock2020/. The pharmacophore models for molecular generation can be created via our web server at https://phoregen.ddtmlab.org. The complex crystal structures (PDB codes 9KSA and 9U8M) are available via PDB at https://www.rcsb.org. Source data are provided with this paper.

Code availability

The source code of PhoreGen is available on our web server at https://phoregen.ddtmlab.org, via GitHub at https://github.com/ppjian19/PhoreGen and via Zenodo at https://zenodo.org/records/15518867 (ref. 62) under an open-source license.

References

  1. Berdigaliyev, N. & Aljofan, M. An overview of drug discovery and development. Future Med. Chem. 12, 939–947 (2020).

  2. Goodnow, J. R. A. Hit and lead identification: integrated technology-based approaches. Drug Discov. Today 3, 367–375 (2006).

  3. Sadybekov, A. V. & Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673–685 (2023).

  4. Catacutan, D. B. et al. Machine learning in preclinical drug discovery. Nat. Chem. Biol. 20, 960–973 (2024).

  5. Du, Y. et al. Machine learning-aided generative molecular design. Nat. Mach. Intell. 6, 589–604 (2024).

  6. Cheng, Y. et al. Molecular design in drug discovery: a comprehensive review of deep generative models. Brief. Bioinformatics 22, bbab344 (2021).

  7. Bian, Y. & Xie, X. Q. Generative chemistry: drug discovery with deep learning generative models. J. Mol. Model. 27, 71 (2021).

  8. Hoogeboom, E. et al. Equivariant diffusion for molecule generation in 3D. In International Conference on Machine Learning (eds Chaudhuri, K. et al.) (PMLR, 2022).

  9. Xu, M. et al. Geometric latent diffusion models for 3D molecule generation. In International Conference on Machine Learning (eds Chaudhuri, K. et al.) (PMLR, 2022).

  10. Huang, L. et al. MDM: molecular diffusion model for 3D molecule generation. In Association for the Advancement of Artificial Intelligence (AAAI Press, 2022).

  11. Luo, S. et al. A 3D generative model for structure-based drug design. In Conference on Neural Information Processing Systems (eds Ranzato, M. et al.) (OpenReview.net, 2021).

  12. Peng, X. et al. Pocket2Mol: efficient molecular sampling based on 3D protein pockets. In International Conference on Machine Learning (eds Chaudhuri, K. et al.) (PMLR, 2022).

  13. Guan, J. et al. 3D equivariant diffusion for target-aware molecule generation and affinity prediction. In International Conference on Learning Representations (OpenReview.net, 2023).

  14. Huang, L. et al. A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets. Nat. Commun. 15, 2657 (2024).

  15. Zhung, W., Kim, H. & Kim, W. Y. 3D molecular generative framework for interaction-guided drug design. Nat. Commun. 15, 2688 (2024).

  16. Wu, P. et al. Guided diffusion for molecular generation with interaction prompt. Brief. Bioinformatics 25, bbae174 (2024).

  17. Lee, J., Zhung, W. & Kim, W. NCIDiff: non-covalent interaction-generative diffusion model for improving reliability of 3D molecule generation inside protein pocket. In International Conference on Machine Learning (OpenReview.net, 2024).

  18. Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37, 1038–1040 (2019).

  19. De Cao, N. & Kipf, T. MolGAN: an implicit generative model for small molecular graphs. In International Conference on Machine Learning (eds Dy, J. et al.) (PMLR, 2018).

  20. Zhang, O. et al. ResGen is a pocket-aware 3D molecular generation model based on parallel multiscale modelling. Nat. Mach. Intell. 5, 1020–1030 (2023).

  21. Liu, M. et al. Generating 3D molecules for target protein binding. In International Conference on Machine Learning (eds Chaudhuri, K. et al.) (PMLR, 2022).

  22. Luo, S. & Hu, W. Diffusion probabilistic models for 3D point cloud generation. In IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2021).

  23. Sun, J. et al. A critical revisit of adversarial robustness in 3D point cloud recognition with diffusion-driven purification. In International Conference on Machine Learning (eds Krause, A. et al.) (PMLR, 2023).

  24. Lyu, Z. et al. A conditional point diffusion-refinement paradigm for 3D point cloud completion. In International Conference on Learning Representations (OpenReview.net, 2022).

  25. Schneuing, A. et al. Structure-based drug design with equivariant diffusion models. Nat. Comput. Sci. 4, 899–909 (2024).

  26. Guan, J. et al. DecompDiff: diffusion models with decomposed priors for structure-based drug design. In International Conference on Machine Learning (eds Krause, A. et al.) (PMLR, 2023).

  27. Huang, Z. et al. Interaction-based retrieval-augmented diffusion models for protein-specific 3D molecule generation. In International Conference on Machine Learning (OpenReview.net, 2024).

  28. Alakhdar, A., Poczos, B. & Washburn, N. Diffusion models in de novo drug design. J. Chem. Inf. Model. 64, 7238–7256 (2024).

  29. Boike, L., Henning, N. J. & Nomura, D. K. Advances in covalent drug discovery. Nat. Rev. Drug Discov. 21, 881–898 (2022).

  30. Schaller, D. et al. Next generation 3D pharmacophore modeling. Wiley Interdiscip. Rev. Comput. Mol. Sci. 10, e1468 (2020).

  31. Imrie, F., Hadfield, T. E., Bradley, A. R. & Deane, C. M. Deep generative design with 3D pharmacophoric constraints. Chem. Sci. 12, 14577–14589 (2021).

  32. Zhu, H., Zhou, R., Cao, D., Tang, J. & Li, M. A pharmacophore-guided deep learning approach for bioactive molecular generation. Nat. Commun. 14, 6234 (2023).

  33. Bush, K. & Bradford, P. A. Interplay between β-lactamases and new β-lactamase inhibitors. Nat. Rev. Microbiol. 17, 295–306 (2019).

  34. Yang, Y. et al. Metallo-β-lactamase-mediated antimicrobial resistance and progress in inhibitor discovery. Trends Microbiol. 31, 735–748 (2023).

  35. Brem, J. et al. Imitation of β-lactam binding enables broad-spectrum metallo-β-lactamase inhibitors. Nat. Chem. 14, 15–24 (2022).

  36. Qiao, X. et al. An insecticide target in mechanoreceptor neurons. Sci. Adv. 8, eabq3132 (2022).

  37. Satorras, V. G., Hoogeboom, E. & Welling, M. E(n) equivariant graph neural networks. In International Conference on Machine Learning (eds Meila, M. et al.) (PMLR, 2021).

  38. Buttenschoen, M., Morris, G. M. & Deane, C. M. PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences. Chem. Sci. 15, 3130–3139 (2024).

  39. Schneuing, A. et al. Multi-domain distribution learning for de novo drug design. In International Conference on Learning Representations (OpenReview.net, 2025).

  40. Adams, K. et al. ShEPhERD: diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design. In International Conference on Learning Representations (OpenReview.net, 2025).

  41. Eberhardt, J., Santos-Martins, D., Tillack, A. F. & Forli, S. AutoDock Vina 1.2.0: new docking methods, expanded force field, and Python bindings. J. Chem. Inf. Model. 61, 3891–3898 (2021).

  42. Francoeur, P. G. et al. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. J. Chem. Inf. Model. 60, 4200–4215 (2020).

  43. Du, H. et al. CovalentInDB 2.0: an updated comprehensive database for structure-based and ligand-based covalent inhibitor design and screening. Nucleic Acids Res. 53, D1322–D1327 (2025).

  44. Hecker, S. J. et al. Discovery of cyclic boronic acid QPX7728, an ultrabroad-spectrum inhibitor of serine and metallo-β-lactamases. J. Med. Chem. 63, 7491–7507 (2020).

  45. Fyfe, P. K., Rao, V. A., Zemla, A., Cameron, S. & Hunter, W. N. Specificity and mechanism of Acinetobacter baumanii nicotinamidase: implications for activation of the front-line tuberculosis drug pyrazinamide. Angew. Chem. Int. Ed. 48, 9176–9179 (2009).

  46. Wang, C. & Rajapakse, J. C. Pharmacophore-guided de novo drug design with diffusion bridge. Preprint at https://doi.org/10.48550/arXiv.2412.19812 (2025).

  47. Heider, J. et al. Apo2ph4: a versatile workflow for the generation of receptor-based pharmacophore models for virtual screening. J. Chem. Inf. Model. 63, 101–110 (2023).

  48. Seo, S. & Kim, W. Y. PharmacoNet: deep learning-guided pharmacophore modeling for ultra-large-scale virtual screening. Chem. Sci. 15, 19473–19487 (2024).

  49. Ho, J., Jain, A. & Abbeel, P. Denoising diffusion probabilistic models. In Conference on Neural Information Processing Systems (eds Larochelle, H. et al.) (Curran Associates Inc, 2020).

  50. Peng, X., Guan, J., Liu, Q. & Ma, J. MolDiff: addressing the atom-bond inconsistency problem in 3D molecule diffusion generation. In International Conference on Machine Learning (eds Krause, A. et al.) (PMLR, 2023).

  51. Pearce, T., Brintrup, A., Zaki, M. & Neely, A. High-quality prediction intervals for deep learning: a distribution-free, ensembled approach. In International Conference on Machine Learning (eds Dy, J. et al.) (PMLR, 2018).

  52. Dhariwal, P. & Nichol, A. Diffusion models beat GANs on image synthesis. In Conference on Neural Information Processing Systems (eds Ranzato, M. et al.) (OpenReview.net, 2021).

  53. Yu, J. et al. Knowledge-guided diffusion model for 3D ligand–pharmacophore mapping. Nat. Commun. 16, 2269 (2025).

  54. Dai, Q. et al. AncPhore: a versatile tool for anchor pharmacophore steered drug discovery with applications in discovery of new inhibitors targeting metallo-β-lactamases and indoleamine/tryptophan 2,3-dioxygenases. Acta Pharm. Sin. B 11, 1931–1946 (2021).

  55. Liu, Z. et al. Forging the basis for developing protein–ligand interaction scoring functions. Acc. Chem. Res. 50, 302–309 (2017).

  56. Su, M. et al. Comparative assessment of scoring functions: the CASF-2016 update. J. Chem. Inf. Model. 59, 895–913 (2019).

  57. Yan, Y.-H. et al. Discovery of 2-aminothiazole-4-carboxylic acids as broad-spectrum metallo-β-lactamase inhibitors by mimicking carbapenem hydrolysate binding. J. Med. Chem. 66, 13746–13767 (2023).

  58. Xiao, Y.-C. et al. Design and enantioselective synthesis of 3-(α-acrylic acid) benzoxaboroles to combat carbapenemase resistance. Chem. Commun. 57, 7709–7712 (2021).

  59. Wang, Y.-L. et al. Structure-based development of (1-(3′-mercaptopropanamido) methyl)boronic acid derived broad-spectrum, dual-action inhibitors of metallo- and serine-β-lactamases. J. Med. Chem. 62, 7160–7184 (2019).

  60. Van Berkel, S. S. et al. Assay platform for clinically relevant metallo-β-lactamases. J. Med. Chem. 56, 6945–6953 (2013).

  61. Xiao, Q. et al. Upgrade of crystallography beamline BL19U1 at the Shanghai Synchrotron Radiation Facility. J. Appl. Crystallogr. 57, 630–637 (2024).

  62. Peng, J. PhoreGen source code and data (LigPhore, CpxPhore, and DockPhore dataset and trained weights). Zenodo https://doi.org/10.5281/zenodo.15518867 (2025).

Acknowledgements

This work is financially supported by the National Natural Science Foundation of China (grant nos. 82122065, 82473845 and 82073698), the National Key R&D Program of China (grant no. 2023YFF1204901), the Sichuan Science and Technology Program (grant no. 2025YFHZ0085), the Foundation for Innovative Research Groups of the Natural Science Foundation of Sichuan Province (grant no. 2024NSFTD0026) and the Basic Research Foundation of Sichuan University (grant no. 2023SCUH0073). We thank the staff at beamline BL19U1 of the Shanghai Synchrotron Radiation Facility, National Facility for Protein Science (Shanghai, China), for their great support. We also thank the members of the Mass Spectrometry Platform (R. Wang and X. Wu) for proteomic techniques and data interpretation.

Author information

Contributions

J.P., J.-L.Y., Z.-B.Y. and Y.-T.C. contributed equally to this work. G.-B.L. conceived, planned and supervised this study. J.P. and J.-L.Y. designed and trained the model supervised by G.-B.L.; J.P., J.-L.Y. and Y.-G.W. performed model validations. Z.-B.Y. and Y.-T.C. performed chemical synthesis. S.-Q.W., F.-B.M. and Y.-T.C. conducted protein production and bioactivity testing. J.P., J.-L.Y., Z.-B.Y., Y.-T.C. and G.-B.L. wrote the manuscript. G.-B.L. revised and polished the manuscript. All authors contributed to the final draft and approved the final version for submission.

Corresponding author

Correspondence to Guo-Bo Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 View of the top-10 ranked generated molecules with their corresponding pharmacophore models.

For pharmacophore models with common, balanced feature compositions, PhoreGen generates well-mapped, high-quality 3D molecules. Boron, nitrogen, oxygen, phosphorus, sulfur and chlorine atoms are shown in pink, blue, red, dark orange, goldenrod and lime green, respectively; carbon atoms are shown in the remaining colors.

Extended Data Fig. 2 Distributions of the pharmacophore feature and exclusion sphere counts versus the atom counts of generated molecules.

a,b For the complex-derived pharmacophore models (that is, the CpxPhore test set), the atom counts of the generated molecules show correlation with the pharmacophore feature and exclusion sphere counts. c,d For the ligand-derived pharmacophore models (that is, the LigPhore test set), the generated molecule’s atom counts also correlate with the feature and exclusion sphere counts. The sample size can be found in the Source Data file. Shadings indicate the feature or exclusion sphere counts, with darker shades representing higher counts. The box is the interquartile range (IQR), the line inside the box is the median, the whiskers represent data within ±1.5 × IQR, and the dots outside the whiskers are outliers. The r value refers to the Pearson correlation coefficient.
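
As a reading aid for the box plots described in this caption, the following minimal Python sketch computes the quantities the caption names: the median, the interquartile range (IQR), whiskers limited to 1.5 × IQR beyond the quartiles, and the outliers. The function name `box_summary`, the example atom counts and the choice of the "inclusive" quartile method are illustrative assumptions rather than details from the paper; plotting libraries differ slightly in how they define quartiles.

```python
from statistics import quantiles


def box_summary(values):
    """Summarize a sample the way a conventional box plot draws it.

    Quartiles use the 'inclusive' method (an assumption; other
    conventions give slightly different Q1/Q3 values).
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme points still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical atom counts for one bin of pharmacophore feature counts.
example_atom_counts = [20, 22, 23, 25, 26, 28, 30, 60]
summary = box_summary(example_atom_counts)
print(summary)
```

In this made-up sample, the value 60 lies beyond the upper fence and would be drawn as a dot outside the whisker.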

Extended Data Fig. 3 Correlation between the cavity sizes of pharmacophore models and the atom counts of generated molecules.

a,b For both the complex-derived (a) and ligand-derived (b) pharmacophore models, the atom counts of the generated molecules show a strong correlation with the corresponding cavity sizes, as indicated by the average distance between the pharmacophore features and exclusion spheres. The r value refers to the Pearson correlation coefficient.
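
The two quantities in this caption can be sketched in a few lines of Python. The sketch assumes that the cavity size is the mean of all pairwise distances between feature centers and exclusion-sphere centers; the caption says only "average distance", so the exact averaging scheme, the function names and the toy coordinates below are hypothetical. The Pearson coefficient follows its standard definition.

```python
import math


def mean_feature_exclusion_distance(features, exclusion_spheres):
    """Mean Euclidean distance over all (feature, exclusion sphere) pairs.

    Both arguments are lists of (x, y, z) centers. Averaging over all
    pairs is an assumption made for illustration.
    """
    distances = [math.dist(f, e) for f in features for e in exclusion_spheres]
    return sum(distances) / len(distances)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy pharmacophore model: two features and two exclusion spheres.
features = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
exclusion_spheres = [(0.0, 4.0, 0.0), (1.0, 4.0, 0.0)]
cavity_size = mean_feature_exclusion_distance(features, exclusion_spheres)

# Hypothetical cavity sizes paired with generated-molecule atom counts.
cavity_sizes = [6.1, 7.4, 8.0, 9.3, 10.2]
atom_counts = [18, 22, 25, 29, 33]
print(cavity_size, pearson_r(cavity_sizes, atom_counts))
```

A value of r close to 1 for such pairs would correspond to the strong positive correlation the caption reports.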

Extended Data Fig. 4 CN1 and CN2 form a covalent bond with the catalytic cysteine of Naam.

a, The LC–MS/MS spectrum of CN1-modified peptide Y266DVC269VGATAVDALSAGYR283 from Drosophila melanogaster Naam. b, The LC–MS/MS spectrum of CN2-modified peptide M253KGATDIYVC262(Carbamidomethyl)GLAYDVC269 from Drosophila melanogaster Naam. The results demonstrated that CN1 and CN2 form a covalent bond with the catalytic residue Cys269 of Drosophila melanogaster Naam. CN1/CN2 (120 μM) was incubated with Drosophila melanogaster Naam (25 μM) at 4 °C for 2 h and the mixture was digested by trypsin.

Extended Data Fig. 5 CN1 and CN2 show inhibitory activity against the pests’ Naam enzymes.

a,b, IC50 curves obtained by incubating compounds CN1 and CN2 with the Naam enzymes of the pests Myzus persicae (15 nM) and Bemisia tabaci (30 nM), respectively, for durations ranging from 0 to 120 min, revealing that both compounds are time-dependent inhibitors of these Naam enzymes; data are presented as mean values ± SEM from n = 3 biological replicates, with error bars representing the SEM. c,d, The melting curves (first derivative of dissociation) of Myzus persicae Naam (10 μM) and Bemisia tabaci Naam (10 μM) in the presence or absence of CN1 (50 μM) or CN2 (50 μM) reveal that these two compounds bind to and stabilize these enzymes; data are presented as mean values ± SEM from n = 3, 2, 3, 3 (c) and n = 3, 2, 3, 1 (d) biological replicates, with error bars representing the SEM.
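
For readers unfamiliar with the "mean ± SEM" convention used in this caption, a minimal Python sketch follows. It uses the standard definition, SEM = sample standard deviation / √n; the function name `mean_sem` and the replicate values are made up for illustration. Note that the SEM is undefined for a single replicate (n = 1), so the sketch returns `None` in that case.

```python
import math
from statistics import mean, stdev


def mean_sem(replicates):
    """Return (mean, SEM) for a list of replicate measurements.

    SEM is the sample standard deviation divided by sqrt(n). With a
    single replicate it cannot be estimated, so None is returned.
    """
    n = len(replicates)
    if n == 0:
        raise ValueError("at least one replicate is required")
    centre = mean(replicates)
    sem = stdev(replicates) / math.sqrt(n) if n > 1 else None
    return centre, sem


# Hypothetical readings from three biological replicates.
print(mean_sem([1.0, 2.0, 3.0]))
# A single replicate yields a mean but no error bar.
print(mean_sem([5.0]))
```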

Supplementary information

Supplementary Information

Supplementary Notes, Tables 1–29, and Figs. 1–29.

Reporting Summary

Peer Review File

Supplementary Data 1

Source data for Supplementary Figs. 1–5 and 7–10.

Supplementary Data 2

Source data for Supplementary Tables 5–21.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Peng, J., Yu, J.-L., Yang, Z.-B. et al. Pharmacophore-oriented 3D molecular generation toward efficient feature-customized drug discovery. Nat. Comput. Sci. 5, 898–914 (2025). https://doi.org/10.1038/s43588-025-00850-5

  • DOI: https://doi.org/10.1038/s43588-025-00850-5
