Abstract
The efficiency and substrate specificity of proteases in the Potyviridae family have not been comprehensively profiled. Here we develop a model that learns co-evolutionary features to accurately predict protease performance at single amino-acid resolution, and we validate its predictions experimentally. We identify and engineer several proteases that outperform the commercially available tobacco etch virus protease. To demonstrate the resolving power of our methods, we engineer protease crosstalk to selectively trigger a synthetic cell-death program in human cells.
Data availability
The data generated in this study have been deposited in the Zenodo database under accession code https://doi.org/10.5281/zenodo.15039890. The overview of aligned Potyviridae sequences and the list of plasmids (Supplementary Data 2) used in this study are provided as Supplementary Data. The plasmid sequences and maps for all proteases, the 7-, GS-flanked 7-, and 20-amino-acid substrates of TEVp, and the H2B-sfGFP reporter used here are available on Addgene: https://www.addgene.org/Dave_Dingal/. Source data are provided with this paper. The protein structural data used in this study are available in the PDB database under accession code 1LVM. The protein family profile HMM used in this study is available in the InterPro database under accession code PF00863.
Code availability
To facilitate the testing of thousands of potyviral proteases against peptide targets by multiple laboratories, we created an interactive web application for ProSSpeC (https://coevolutionary.org/prosspec/). Code is available at https://github.com/morcoslab/ProSSpeC and archived on Zenodo at https://doi.org/10.5281/zenodo.18321025 (ref. 35).
References
Neurath, H. & Walsh, K. A. Role of proteolytic enzymes in biological regulation (a review). Proc. Natl. Acad. Sci. USA. 73, 3825–3832 (1976).
Kapust, R. B. & Waugh, D. S. Controlled intracellular processing of fusion proteins by TEV protease. Protein Expr. Purif. 19, 312–318 (2000).
Rawlings, N. D. & Salvesen, G. Handbook of Proteolytic Enzymes, 1–3 (Academic Press, 2013).
Xie, M. & Fussenegger, M. Designing cell function: assembly of synthetic gene circuits for cell biology applications. Nat. Rev. Mol. Cell Biol. 19, 507–525 (2018).
Fernandez-Rodriguez, J. & Voigt, C. A. Post-translational control of genetic circuits using potyvirus proteases. Nucleic Acids Res. 44, 6493–6502 (2016).
Fink, T. et al. Design of fast proteolysis-based signaling and logic circuits in mammalian cells. Nat. Chem. Biol. 15, 115–122 (2019).
Sanchez, M. I. & Ting, A. Y. Directed evolution improves the catalytic efficiency of TEV protease. Nat. Methods 17, 167–174 (2020).
Carrington, J. C. & Dougherty, W. G. Small nuclear inclusion protein encoded by a plant potyvirus genome is a protease. J. Virol. 61, 2540–2548 (1987).
Morcos, F. et al. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc. Natl. Acad. Sci. USA. 108, E1293–E1301 (2011).
Dos Santos, R. N., Morcos, F., Jana, B., Andricopulo, A. D. & Onuchic, J. N. Dimeric interactions and complex formation using direct coevolutionary couplings. Sci. Rep. 5, 1–10 (2015).
Morcos, F., Jana, B., Hwa, T. & Onuchic, J. N. Coevolutionary signals across protein lineages help capture multiple protein conformations. Proc. Natl. Acad. Sci. USA. 110, 20533–20538 (2013).
Cheng, R. R., Morcos, F., Levine, H. & Onuchic, J. N. Toward rationally redesigning bacterial two-component signaling systems using coevolutionary information. Proc. Natl. Acad. Sci. USA. 111, E563–E571 (2014).
Jiang, X. L., Dimas, R. P., Chan, C. T. Y. & Morcos, F. Coevolutionary methods enable robust design of modular repressors by reestablishing intra-protein interactions. Nat. Commun. 12, 5592 (2021).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Savinov, A., Swanson, S., Keating, A. E. & Li, G.-W. High-throughput discovery of inhibitory protein fragments with AlphaFold. Proc. Natl. Acad. Sci. USA. 122, e2322412122 (2025).
Zhou, Q. et al. Global pairwise RNA interaction landscapes reveal core features of protein recognition. Nat. Commun. 9, 2511 (2018).
Dimas, R. P., Jiang, X. L., De La Paz, J. A., Morcos, F. & Chan, C. T. Y. Engineering repressors with coevolutionary cues facilitates toggle switches with a master reset. Nucleic Acids Res. 47, 5449–5463 (2019).
Kipniss, N. H. et al. Engineering cell sensing and responses using a GPCR-coupled CRISPR-Cas system. Nat. Commun. 8, 1–10 (2017).
Kapust, R. B. et al. Tobacco etch virus protease: mechanism of autolysis and rational design of stable mutants with wild-type catalytic proficiency. Protein Eng. 14, 993–1000 (2001).
Xia, S. et al. Synthetic protein circuits for programmable control of mammalian cell death. Cell 187, 2785–2800.e16 (2024).
Gao, X. J., Chong, L. S., Kim, M. S. & Elowitz, M. B. Programmable protein circuits in living cells. Science 361, 1252–1258 (2018).
Goh, C. J. & Hahn, Y. Analysis of proteolytic processing sites in potyvirus polyproteins revealed differential amino acid preferences of NIa-pro protease in each of seven cleavage sites. PLoS ONE 16, e0245853 (2021).
Kapust, R. B., Tözsér, J., Copeland, T. D. & Waugh, D. S. The P1′ specificity of tobacco etch virus protease. Biochem. Biophys. Res. Commun. 294, 949–955 (2002).
Huber, L. et al. Data-driven protease engineering by DNA-recording and epistasis-aware machine learning. Nat. Commun. 16, 5466 (2025).
Beaumont, L. P., Mehalko, J., Johnson, A., Wall, V. E. & Esposito, D. Unexpected tobacco etch virus (TEV) protease cleavage of recombinant human proteins. Protein Expr. Purif. 220, 106488 (2024).
Song, J. et al. PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites. PLoS ONE 7, e50300 (2012).
Song, J. et al. IProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief. Bioinform. 20, 638–658 (2019).
Song, J. et al. PROSPERous: high-throughput prediction of substrate cleavage sites for 90 proteases with improved accuracy. Bioinformatics 34, 684–687 (2018).
Gasteiger, E. et al. ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 31, 3784–3788 (2003).
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
Koutra, D., Shah, N., Vogelstein, J. T., Gallagher, B. & Faloutsos, C. DELTACON: principled massive-graph similarity function with attribution. ACM Trans. Knowl. Discov. Data 10, 1–43 (2016).
Kumar, S. et al. MEGA12: molecular evolutionary genetic analysis version 12 for adaptive and green computing. Mol. Biol. Evol. 41, msae263 (2024).
Feng, S. et al. Bright split red fluorescent proteins for the visualization of endogenous proteins and synapses. Commun. Biol. 2, 1–12 (2019).
Lim Suan, M. B. et al. Identification and engineering of highly functional potyviral proteases in cells using co-evolutionary models. ProSSpeC. https://doi.org/10.5281/zenodo.18321025 (2025).
Acknowledgements
We thank members of the Dingal lab and Morcos lab for their advice, expertise, and discussions. We thank Elliott Joe, Ahmed Adookkattil, and Shashwat Singh for data analysis support. We also thank the UTD Flow Cytometry Core for infrastructure and support. We acknowledge the UTD Office of Information Technology Cyberinfrastructure Research Computing for providing high-performance computing and services. M.B.L. is supported by the UTD Eugene McDermott Graduate Fellowship. This research was supported by a UTD Startup Fund and National Institutes of Health-NIGMS awards to the labs of P.C.D.P.D. (R35GM150967) and of F.M. (R35GM133631). F.M. acknowledges support from the National Science Foundation (MCB-1943442).
Author information
Authors and Affiliations
Contributions
M.B.L., Z.S., A.S.Y., A.T., R.R., J.N., and J.K. performed experiments. M.B.L. and P.C.D.P.D. analyzed experimental results. C.Z. performed the computational modeling and the full-stack development of the ProSSpeC web app. P.C.D.P.D. and F.M. provided conceptual planning and resources. M.B.L., C.Z., F.M., and P.C.D.P.D. wrote and edited the manuscript, including illustrations. All authors approved the final version of the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The Board of Regents of The University of Texas System has filed a patent application, currently pending, on behalf of co-inventors P.C.D.P.D., F.M., C.Z., and M.B.L. covering the engineered proteases described here (US Provisional Application No. 63/885,099). The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Zhongyue Yang and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lim Suan, M.B., Ziegler, C., Syed, Z. et al. Identification and engineering of highly functional potyviral proteases in cells using co-evolutionary models. Nat Commun (2026). https://doi.org/10.1038/s41467-026-69961-5
DOI: https://doi.org/10.1038/s41467-026-69961-5