Abstract
A central tenet of biology is that protein structure mediates the sequence-function relationship. Recently, there has been excitement about the promise of advances in protein structure modeling to generate hypotheses about sequence-structure-function relationships. Here, we leverage structural similarity to identify rapidly evolving proteasome assembly chaperones and characterize their function in Candidozyma (Candida) auris. Despite extensive sequence divergence, we demonstrate conservation of function, corroborating that specific folds, and not sequences, are required for function. This theoretical premise suggests that protein structures with certain properties should be functionally interchangeable, even if they were not products of a common evolutionary history. To reduce this theory to practice, we performed structure-informed protein design, exploring sequence space that is not accessible via stepwise evolution, and mutated more than 40 residues in the Poc4 proteasome assembly chaperone to demonstrate that artificial proteins can rescue complex biological processes in the context of the whole cell. This sequence-structure-function relationship expands our ability to use structure to identify deep evolutionary relationships between proteins and generate hypotheses about gene function in non-model organisms. Overall, this helps to define and understand functional constraints on protein evolution, with important implications for both future protein design and retrospective function prediction.
Data availability
All data generated in this study are provided in the Supplementary Information and/or Source Data files. Strains used in this study will be sent following standard material transfer agreements (MTAs). Source data are provided with this paper.
Code availability
All original code has been deposited at Zenodo (DOI: 10.5281/zenodo.19052665; ref. 77) and is publicly available as of the date of publication. The code is also developed openly in the GitHub repository (https://github.com/maomlab/Poc4).
References
Rost, B. Twilight zone of protein sequence alignments. Protein Eng. 12, 85–94 (1999).
Greenbury, S. F., Louis, A. A. & Ahnert, S. E. The structure of genotype-phenotype maps makes fitness landscapes navigable. Nat. Ecol. Evol. 6, 1742–1752 (2022).
Koehler Leman, J. et al. Sequence-structure-function relationships in the microbial protein universe. Nat. Commun. 14, 2351 (2023).
Zhang, S., Zhang, T. & Fu, Y. Proteome-wide structural analysis quantifies structural conservation across distant species. Genome Res. 33, 1975–1993 (2023).
Kabir, A., Moldwin, A., Bromberg, Y. & Shehu, A. In the twilight zone of protein sequence homology: do protein language models learn protein structure? Bioinform. Adv. 4, vbae119 (2024).
Steenwyk, J. L. et al. An orthologous gene coevolution network provides insight into eukaryotic cellular and genomic structure and function. Sci. Adv. 8, eabn0105 (2022).
Monzon, V., Paysan-Lafosse, T., Wood, V. & Bateman, A. Reciprocal best structure hits: using AlphaFold models to discover distant homologues. Bioinform. Adv. 2, vbac072 (2022).
van Kempen, M. et al. Fast and accurate protein structure search with Foldseek. Nat. Biotechnol. 42, 243–246 (2024).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Krishna, R. et al. Generalized biomolecular modeling and design with RoseTTAFold All-atom. Science 384, eadl2528 (2024).
Nguyen, V. et al. Evolutionary drivers of thermoadaptation in enzyme catalysis. Science 355, 289–294 (2017).
Fowler, D. M. et al. High-resolution mapping of protein sequence-function relationships. Nat. Methods 7, 741–746 (2010).
Aharoni, A. et al. The “evolvability” of promiscuous protein functions. Nat. Genet. 37, 73–76 (2005).
Kortemme, T. De novo protein design-From new structures to programmable functions. Cell 187, 526–544 (2024).
Breen, M. S., Kemena, C., Vlasov, P. K., Notredame, C. & Kondrashov, F. A. Epistasis as the primary factor in molecular evolution. Nature 490, 535–538 (2012).
Starr, T. N. & Thornton, J. W. Epistasis in protein evolution. Protein Sci. 25, 1204–1218 (2016).
Hong, L. et al. Fast, sensitive detection of protein homologs using deep dense retrieval. Nat. Biotechnol. 1, 13 (2024).
Hamamsy, T. et al. Protein remote homology detection and structural alignment using deep learning. Nat. Biotechnol. 42, 975–985 (2024).
Choi, I.-G. & Kim, S.-H. Evolution of protein structural classes and protein sequence families. Proc. Natl. Acad. Sci. USA 103, 14056–14061 (2006).
Sousounis, K., Haney, C. E., Cao, J., Sunchu, B. & Tsonis, P. A. Conservation of the three-dimensional structure in non-homologous or unrelated proteins. Hum. Genom. 6, 10 (2012).
Betts, M. J., Guigó, R., Agarwal, P. & Russell, R. B. Exon structure conservation despite low sequence similarity: a relic of dramatic events in evolution? EMBO J. 20, 5354–5360 (2001).
Little, J., Chikina, M. & Clark, N. L. Evolutionary rate covariation is a reliable predictor of co-functional interactions but not necessarily physical interactions. Elife 12, RP93333 (2024).
Le Tallec, B. et al. 20S proteasome assembly is orchestrated by two distinct pairs of chaperones in yeast and in mammals. Mol. Cell 27, 660–674 (2007).
Kusmierczyk, A. R., Kunjappu, M. J., Funakoshi, M. & Hochstrasser, M. A multimeric assembly factor controls the formation of alternative 20S proteasomes. Nat. Struct. Mol. Biol. 15, 237–244 (2008).
Hirano, Y. et al. Cooperation of multiple chaperones required for the assembly of mammalian 20S proteasomes. Mol. Cell 24, 977–984 (2006).
Rousseau, A. & Bertolotti, A. Regulation of proteasome assembly and activity in health and disease. Nat. Rev. Mol. Cell Biol. 19, 697–712 (2018).
Zhang, J. & Yang, J.-R. Determinants of the rate of protein sequence evolution. Nat. Rev. Genet. 16, 409–420 (2015).
Santana, D. J. et al. A Candida auris–specific adhesin, Scf1, governs surface association, colonization, and virulence. Science 381, 1461–1467 (2023).
Riziotis, I. G., Ribeiro, A. J. M., Borkakoti, N. & Thornton, J. M. The 3D modules of enzyme catalysis: deconstructing active sites into distinct functional entities. J. Mol. Biol. 435, 168254 (2023).
Siddiq, M. A., Hochberg, G. K. A. & Thornton, J. W. Evolution of protein specificity: insights from ancestral protein reconstruction. Curr. Opin. Struct. Biol. 47, 113–122 (2017).
Groll, M. et al. Structure of 20S proteasome from yeast at 2.4 Å resolution. Nature 386, 463–471 (1997).
Beck, F. et al. Near-atomic resolution structural model of the yeast 26S proteasome. Proc. Natl. Acad. Sci. USA 109, 14870–14875 (2012).
Adolf, F. et al. Visualizing chaperone-mediated multistep assembly of the human 20S proteasome. Nat. Struct. Mol. Biol. 31, 1176–1188 (2024).
Basu, S. & Wallner, B. DockQ: a quality measure for protein-protein docking models. PLoS ONE 11, e0161879 (2016).
Lensink, M. F. & Wodak, S. J. Docking, scoring, and affinity prediction in CAPRI. Proteins 81, 2082–2095 (2013).
Akpinaroglu, D. et al. Structure-conditioned masked language models for protein sequence design generalize beyond the native sequence space. bioRxiv https://doi.org/10.1101/2023.12.15.571823 (2023).
Schake, P., Bolz, S. N., Linnemann, K. & Schroeder, M. PLIP 2025: introducing protein-protein interactions to the protein-ligand interaction profiler. Nucleic Acids Res. 53, W463–W465 (2025).
Vakirlis, N., Carvunis, A.-R. & McLysaght, A. Synteny-based analyses indicate that sequence divergence is not the main source of orphan genes. Elife 9, e53500 (2020).
Weisman, C. M., Murray, A. W. & Eddy, S. R. Many, but not all, lineage-specific genes can be explained by homology detection failure. PLoS Biol. 18, e3000862 (2020).
Siddiq, M. A., Loehlin, D. W., Montooth, K. L. & Thornton, J. W. Experimental test and refutation of a classic case of molecular adaptation in Drosophila melanogaster. Nat. Ecol. Evol. 1, 25 (2017).
Pillai, A. S., Hochberg, G. K. A. & Thornton, J. W. Simple mechanisms for the evolution of protein complexity. Protein Sci. 31, e4449 (2022).
Dulchavsky, M. et al. Directed evolution unlocks oxygen reactivity for a nicotine-degrading flavoenzyme. Nat. Chem. Biol. 19, 1406–1414 (2023).
Copley, S. D. An evolutionary perspective on protein moonlighting. Biochem. Soc. Trans. 42, 1684–1691 (2014).
Fort, P., Kajava, A. V., Delsuc, F. & Coux, O. Evolution of proteasome regulators in eukaryotes. Genome Biol. Evol. 7, 1363–1379 (2015).
Gille, C. et al. A comprehensive view on proteasomal sequences: implications for the evolution of the proteasome. J. Mol. Biol. 326, 1437–1448 (2003).
Starr, T. N., Picton, L. K. & Thornton, J. W. Alternative evolutionary histories in the sequence space of an ancient protein. Nature 549, 409–413 (2017).
Ropars, J. et al. Gene flow contributes to diversification of the major fungal pathogen Candida albicans. Nat. Commun. 9, 2253 (2018).
Shah, P., McCandlish, D. M. & Plotkin, J. B. Contingency and entrenchment in protein evolution under purifying selection. Proc. Natl. Acad. Sci. USA 112, E3226–E3235 (2015).
Bridgham, J. T., Ortlund, E. A. & Thornton, J. W. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature 461, 515–519 (2009).
Schulz, L. et al. Evolution of increased complexity and specificity at the dawn of form I Rubiscos. Science 378, 155–160 (2022).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Albanese, K. I., Barbe, S., Tagami, S., Woolfson, D. N. & Schiex, T. Computational protein design. Nat. Rev. Methods Prim. 5, 1–28 (2025).
Hettiaratchi, M. H. et al. Reengineering biocatalysts: computational redesign of chondroitinase ABC improves efficacy and stability. Sci. Adv. 6, eabc6378 (2020).
Vázquez Torres, S. et al. De novo designed proteins neutralize lethal snake venom toxins. Nature 639, 1–7 (2025).
Jiang, K. et al. Rapid in silico directed evolution by a protein language model with EVOLVEpro. Science 387, eadr6006 (2025).
Hayes, T. et al. Simulating 500 million years of evolution with a language model. Science https://doi.org/10.1126/science.ads0018 (2025).
Zhu, J. et al. De novo design of transmembrane fluorescence-activating proteins. Nature 640, 249–257 (2025).
Yeh, A. H.-W. et al. De novo design of luciferases using deep learning. Nature 614, 774–780 (2023).
Fowler, D. M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807 (2014).
Weinreich, D. M., Watson, R. A. & Chao, L. Perspective: sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59, 1165–1174 (2005).
Broerman, A. J. et al. Design of facilitated dissociation enables timing of cytokine signalling. Nature 647, 528–535 (2025).
Podgornaia, A. I. & Laub, M. T. Protein evolution. Pervasive degeneracy and epistasis in a protein-protein interface. Science 347, 673–677 (2015).
Santana, D. J. & O’Meara, T. R. Forward and reverse genetic dissection of morphogenesis identifies filament-competent Candida auris strains. Nat. Commun. 12, 7197 (2021).
Byrne, K. P. & Wolfe, K. H. The Yeast Gene Order Browser: combining curated homology and syntenic context reveals gene fate in polyploid species. Genome Res. 15, 1456–1461 (2005).
Maguire, S. L. et al. Comparative genome analysis and gene finding in Candida species using CGOB. Mol. Biol. Evol. 30, 1281–1291 (2013).
Edgar, R. C. Muscle5: high-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny. Nat. Commun. 13, 6968 (2022).
Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Minh, B. Q. et al. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
Pei, J. & Grishin, N. V. AL2CO: calculation of positional conservation in a protein sequence alignment. Bioinformatics 17, 700–712 (2001).
Elfmann, C. & Stülke, J. PAE viewer: a webserver for the interactive visualization of the predicted aligned error for multimer structure predictions and crosslinks. Nucleic Acids Res. 51, W404–W410 (2023).
Meng, E. C. et al. UCSF ChimeraX: tools for structure building and analysis. Protein Sci. 32, e4792 (2023).
Krissinel, E. & Henrick, K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372, 774–797 (2007).
Dunbrack, R. L. Jr. Rēs ipSAE loquunt: What’s wrong with AlphaFold’s ipTM score and how to fix it. bioRxiv https://doi.org/10.1101/2025.02.10.637595 (2025).
Stivala, A., Wybrow, M., Wirth, A., Whisstock, J. C. & Stuckey, P. J. Automatic generation of protein structure cartoons with Pro-origami. Bioinformatics 27, 3315–3316 (2011).
Zhong, B. et al. ParaFold: Paralleling AlphaFold for large-scale predictions. International Conference on High Performance Computing in Asia-Pacific Region Workshops (ACM, New York, NY, USA, 2022) https://doi.org/10.1145/3503470.3503471.
Rapala, J. et al. Deep homology and design of proteasome chaperone proteins in Candidozyma auris. Zenodo, https://doi.org/10.5281/zenodo.19052665 (2026).
Acknowledgements
We thank the O’Meara Labs for helpful discussions. This work was supported by National Institutes of Health grants R35GM147894 (TRO), U19AI181767 (TRO), T32GM149391 (JRR), R35GM151129 (MJO), 5F32CA261115 (MS), and R35GM118073 (PJW), and by a Michigan Postdoctoral Pioneer Fellowship (MS).
Author information
Authors and Affiliations
Contributions
Conceptualization: J.R.R., M.S., M.J.O. and T.R.O. Methodology: J.R.R., M.S., M.J.O. and T.R.O. Investigation: J.R.R., M.S., M.J.O. and T.R.O. Visualization: J.R.R., M.S., M.J.O. and T.R.O. Funding acquisition: J.R.R., M.S., P.J.W. and T.R.O. Project administration: T.R.O. and M.J.O. Supervision: P.J.W., T.R.O. and M.J.O. Writing—original draft: J.R.R., M.S., M.J.O. and T.R.O. Writing—review and editing: J.R.R., M.S., P.J.W., M.J.O. and T.R.O.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Source data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Rapala, J.R., Siddiq, M., Wittkopp, P.J. et al. Deep homology and design of proteasome chaperone proteins in Candidozyma auris. Nat Commun (2026). https://doi.org/10.1038/s41467-026-71206-4
DOI: https://doi.org/10.1038/s41467-026-71206-4


