Enzyme specificity prediction using cross attention graph neural networks

Cui, Haiyang; Su, Yufeng; Dean, Tanner J.; Yu, Tianhao; Zhang, Zhengyi; Peng, Jian; Shukla, Diwakar; Zhao, Huimin

doi:10.1038/s41586-025-09697-2

Article
Published: 08 October 2025

Haiyang Cui (ORCID: orcid.org/0000-0001-8360-0447)^1,2,3,
Yufeng Su^3,4,5,
Tanner J. Dean^3,6,
Tianhao Yu^1,2,3,
Zhengyi Zhang (ORCID: orcid.org/0000-0002-4228-8773)^1,2,
Jian Peng^4,5,
Diwakar Shukla (ORCID: orcid.org/0000-0003-4079-5381)^1,3,6 &
Huimin Zhao (ORCID: orcid.org/0000-0002-9069-6739)^1,3,4,5,6

Nature (2025)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Enzymes are the molecular machines of life, and a key property that governs their function is substrate specificity: the ability of an enzyme to recognize and selectively act on particular substrates. This specificity originates from the three-dimensional (3D) structure of the enzyme active site and the complicated transition state of the reaction^1,2. Many enzymes can promiscuously catalyze reactions or act on substrates beyond those for which they originally evolved^1,3-5. However, millions of known enzymes still lack reliable substrate specificity information, impeding their practical application and a comprehensive understanding of the biocatalytic diversity in nature. Herein, we developed a cross-attention-empowered SE(3)-equivariant graph neural network architecture named EZSpecificity for predicting enzyme substrate specificity, which was trained on a comprehensive tailor-made database of enzyme-substrate interactions at sequence and structural levels. EZSpecificity outperformed existing machine learning models for enzyme substrate specificity prediction, as demonstrated on both an unknown substrate and enzyme database and seven proof-of-concept protein families. Experimental validation with eight halogenases and 78 substrates revealed that EZSpecificity achieved 91.7% accuracy in identifying the single potential reactive substrate, significantly higher than that of the state-of-the-art model ESP (58.3%). EZSpecificity represents a general machine learning model for accurate prediction of substrate specificity for enzymes related to fundamental and applied research in biology and medicine.
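
The abstract names cross-attention as the mechanism that couples enzyme and substrate representations, but this preview gives no architectural details. As a rough illustration of the general technique only, and not of the EZSpecificity implementation, the sketch below computes plain scaled dot-product cross-attention in which enzyme residue vectors act as queries and substrate atom vectors act as keys and values. The function names, the toy embeddings, and the absence of learned projections or SE(3)-equivariant layers are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (generic sketch, hypothetical names).

    queries: one vector per enzyme residue (toy embeddings, assumed)
    keys, values: one vector per substrate atom (toy embeddings, assumed)
    Returns (outputs, weights): one output vector and one weight row per query.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this residue to every substrate atom, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of the substrate value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy data: three "residues" attending over two "substrate atoms".
residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
atoms = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = cross_attention(residues, atoms, atoms)
for row in weights:
    print([round(x, 3) for x in row])
```

Each row of `weights` sums to 1 and indicates how strongly one residue attends to each substrate atom; a real model would apply learned projections before this step and operate on graph features rather than hand-written vectors.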

Author information

Haiyang Cui
Present address: State Key Laboratory of Microbial Technology, Ministry of Education Key Laboratory of NSLSCS, Jiangsu Basic Research Center for Synthetic Biology, College of Life Science, Nanjing Normal University, Nanjing, People’s Republic of China
These authors contributed equally: Haiyang Cui, Yufeng Su, Tanner J. Dean

Authors and Affiliations

Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL, USA
Haiyang Cui, Tianhao Yu, Zhengyi Zhang, Diwakar Shukla & Huimin Zhao
Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, USA
Haiyang Cui, Tianhao Yu & Zhengyi Zhang
NSF Molecule Maker Lab Institute, University of Illinois Urbana-Champaign, Urbana, IL, USA
Haiyang Cui, Yufeng Su, Tanner J. Dean, Tianhao Yu, Diwakar Shukla & Huimin Zhao
Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Yufeng Su, Jian Peng & Huimin Zhao
DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois Urbana-Champaign, Urbana, IL, USA
Yufeng Su, Jian Peng & Huimin Zhao
Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL, USA
Tanner J. Dean, Diwakar Shukla & Huimin Zhao

Corresponding authors

Correspondence to Diwakar Shukla or Huimin Zhao.

Supplementary information

Supplementary Information

Supplementary Text sections 1–7, Supplementary Figs 1–52, Supplementary Tables 1–13 and references.

Reporting Summary

About this article

Cite this article

Cui, H., Su, Y., Dean, T.J. et al. Enzyme specificity prediction using cross attention graph neural networks. Nature (2025). https://doi.org/10.1038/s41586-025-09697-2

Received: 02 November 2024
Accepted: 01 October 2025
Published: 08 October 2025
DOI: https://doi.org/10.1038/s41586-025-09697-2