Accelerating protein engineering with fitness landscape modelling and reinforcement learning

Abstract

Protein engineering holds substantial promise for designing proteins with customized functions, yet the vast landscape of potential mutations, set against limited laboratory capacity, constrains the discovery of optimal sequences. Here, to address this, we present the μProtein framework, which accelerates protein engineering by combining μFormer, a deep learning model for accurate mutational effect prediction, with μSearch, a reinforcement learning algorithm designed to efficiently navigate the protein fitness landscape using μFormer as an oracle. μProtein leverages single-mutation data to predict optimal sequences carrying complex, multi-amino-acid mutations through its modelling of epistatic interactions and a multi-step search strategy. In addition to strong performance on benchmark datasets, μProtein, although trained solely on single-mutation data, identified high-gain-of-function multi-point mutants of the enzyme β-lactamase in wet-laboratory experiments, surpassing one of the highest known activity levels. These results demonstrate μProtein’s capability to discover impactful mutations across the vast protein sequence space, offering a robust and efficient approach to protein optimization.
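To make the abstract's division of labour concrete, the sketch below pairs a surrogate fitness predictor with a search loop that proposes single substitutions and keeps those the predictor approves. This is a minimal illustration under stated assumptions, not the authors' method: toy_oracle is a hypothetical stand-in for a trained μFormer-style scorer, and plain greedy hill-climbing stands in for the μSearch reinforcement-learning policy.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_oracle(seq: str, wild_type: str) -> float:
    """Hypothetical stand-in for a learned fitness predictor (e.g. a
    muFormer-style scorer). Rewards arbitrary residue preferences and mildly
    penalizes distance from the wild type, mimicking a rugged landscape."""
    preference = sum(aa == AMINO_ACIDS[i % len(AMINO_ACIDS)]
                     for i, aa in enumerate(seq))
    distance = sum(a != b for a, b in zip(seq, wild_type))
    return preference - 0.1 * distance

def oracle_guided_search(wild_type: str, n_steps: int = 500,
                         seed: int = 0) -> str:
    """Greedy hill-climbing over single mutations, guided only by the oracle.

    A crude placeholder for the reinforcement-learning search: each step
    proposes one substitution and accepts it if the oracle score does not
    drop, so multi-point mutants accumulate from single-mutation moves."""
    rng = random.Random(seed)
    current, best = wild_type, wild_type
    for _ in range(n_steps):
        pos = rng.randrange(len(current))
        candidate = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        if toy_oracle(candidate, wild_type) >= toy_oracle(current, wild_type):
            current = candidate
        if toy_oracle(current, wild_type) > toy_oracle(best, wild_type):
            best = current
    return best

print(oracle_guided_search("MKTAYIAKQR"))
```

The design point carried over from the paper is the separation of concerns: the oracle is queried cheaply in silico, so the search can evaluate far more candidates than a wet laboratory could, and only the top-ranked multi-point mutants need experimental validation.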

Fig. 1: Overview of μProtein.
Fig. 2: Quantitative comparison of μFormer with alternative mutational effect prediction approaches.
Fig. 3: Effective modelling of epistasis effects in high-order mutants.
Fig. 4: μFormer effectively identifies high-functioning variants with multi-point mutations.
Fig. 5: Quantitative comparison of μSearch with prevalent fitness landscape exploration algorithms.
Fig. 6: Designing high-functioning sequences with μProtein.

Data availability

The FLIP benchmark is publicly available at http://data.bioembeddings.com/public/FLIP/. The ProteinGym v0.1 data collection can be accessed via Hugging Face at https://huggingface.co/datasets/OATML-Markslab/ProteinGym_v0.1. The dataset under our split scheme is available via figshare at https://doi.org/10.6084/m9.figshare.26892355 (ref. 69). The list of extended-spectrum β-lactamases (ESBLs) is available at http://bldb.eu/ (ref. 44). The fitness scores for β-lactamase variants against cefotaxime, validated by our wet-laboratory experiments, are provided in the Supplementary Information. Source data are provided with this paper.

Code availability

The code to reproduce the results of this paper is publicly available via GitHub at https://github.com/microsoft/Mu-Protein and via Zenodo at https://doi.org/10.5281/zenodo.15836168 (ref. 76). The data-split scheme is available via figshare at https://doi.org/10.6084/m9.figshare.26892355 (ref. 69).

References

  1. Miton, C. M. & Tokuriki, N. Insertions and deletions (indels): a missing piece of the protein engineering jigsaw. Biochemistry 62, 148–157 (2023).

  2. Gray, V. E., Hause, R. J., Luebeck, J., Shendure, J. & Fowler, D. M. Quantitative missense variant effect prediction using large-scale mutagenesis data. Cell Syst. 6, 116–124 (2018).

  3. Riesselman, A. J., Ingraham, J. B. & Marks, D. S. Deep generative models of genetic variation capture the effects of mutations. Nat. Methods 15, 816–822 (2018).

  4. Biswas, S., Khimulya, G., Alley, E. C., Esvelt, K. M. & Church, G. M. Low-N protein engineering with data-efficient deep learning. Nat. Methods 18, 389–396 (2021).

  5. Notin, P. et al. Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval. In Proc. 39th International Conference on Machine Learning 16990–17017 (PMLR, 2022).

  6. Fowler, D. M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807 (2014).

  7. Weile, J. & Roth, F. P. Multiplexed assays of variant effects contribute to a growing genotype–phenotype atlas. Hum. Genet. 137, 665–678 (2018).

  8. Wittmund, M., Cadet, F. & Davari, M. D. Learning epistasis and residue coevolution patterns: current trends and future perspectives for advancing enzyme engineering. ACS Catal. 12, 14243–14263 (2022).

  9. Judge, A. et al. Network of epistatic interactions in an enzyme active site revealed by large-scale deep mutational scanning. Proc. Natl Acad. Sci. USA 121, e2313513121 (2024).

  10. Gelman, S., Fahlberg, S. A., Heinzelman, P., Romero, P. A. & Gitter, A. Neural networks to learn protein sequence–function relationships from deep mutational scanning data. Proc. Natl Acad. Sci. USA 118, e2104878118 (2021).

  11. Kim, H. Y. & Kim, D. Prediction of mutation effects using a deep temporal convolutional network. Bioinformatics 36, 2047–2052 (2020).

  12. Shanehsazzadeh, A., Belanger, D. & Dohan, D. Is transfer learning necessary for protein landscape prediction? In Proc. Machine Learning for Structural Biology Workshop at the Thirty-Fourth Annual Conference on Neural Information Processing Systems (NeurIPS) https://www.mlsb.io/papers/MLSB2020_Is_Transfer_Learning_Necessary.pdf (2020).

  13. Yang, K. K., Lu, A. X. & Fusi, N. Convolutions are competitive with transformers for protein sequence pretraining. Cell Syst. 15, 286–294.e2 (2024).

  14. Luo, Y. et al. ECNet is an evolutionary context-integrated deep learning framework for protein engineering. Nat. Commun. 12, 5743 (2021).

  15. Hsu, C., Nisonoff, H., Fannjiang, C. & Listgarten, J. Learning protein fitness models from evolutionary and assay-labeled data. Nat. Biotechnol. 40, 1114–1122 (2022).

  16. Sim, N.-L. et al. SIFT web server: predicting effects of amino acid substitutions on proteins. Nucleic Acids Res. 40, W452–W457 (2012).

  17. Hopf, T. A. et al. Mutation effects predicted from sequence co-variation. Nat. Biotechnol. 35, 128–135 (2017).

  18. Frazer, J. et al. Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95 (2021).

  19. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 4171–4186 (Association for Computational Linguistics, 2019).

  20. Floridi, L. & Chiriatti, M. GPT-3: its nature, scope, limits, and consequences. Minds Mach. 30, 681–694 (2020).

  21. Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).

  22. Hie, B., Zhong, E. D., Berger, B. & Bryson, B. Learning the language of viral evolution and escape. Science 371, 284–288 (2021).

  23. Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  24. Meier, J. et al. Language models enable zero-shot prediction of the effects of mutations on protein function. Adv. Neural Inf. Process. Syst. 34, 29287–29303 (2021).

  25. Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932 (2015).

  26. He, L. et al. Pre-training co-evolutionary protein representation via a pairwise masked language model. Preprint at https://arxiv.org/abs/2110.15527 (2021).

  27. Dallago, C. et al. FLIP: benchmark tasks in fitness landscape inference for proteins. Preprint at bioRxiv https://doi.org/10.1101/2021.11.09.467890 (2021).

  28. Hie, B. L. et al. Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol. 41, 275–283 (2023).

  29. Lu, H. et al. Machine learning-aided engineering of hydrolases for PET depolymerization. Nature 604, 662–667 (2022).

  30. Hughes, D. & Andersson, D. I. Evolutionary trajectories to antibiotic resistance. Annu. Rev. Microbiol. 71, 579–596 (2017).

  31. Wang, X., Zhang, H. & Chen, X. Drug resistance and combating drug resistance in cancer. Cancer Drug Resist. 2, 141 (2019).

  32. Notin, P., Weitzman, R., Marks, D. & Gal, Y. ProteinNPT: improving protein property prediction and design with non-parametric transformers. Adv. Neural Inf. Process. Syst. 36, 33529–33563 (2023).

  33. Zhao, J., Zhang, C. & Luo, Y. Contrastive fitness learning: reprogramming protein language models for low-N learning of protein fitness landscape. In Proc. 32nd International Conference on Pattern Recognition 470–474 (Springer, 2024).

  34. Bryant, D. H. et al. Deep diversification of an AAV capsid protein by machine learning. Nat. Biotechnol. 39, 691–696 (2021).

  35. Jacquier, H. et al. Capturing the mutational landscape of the beta-lactamase TEM-1. Proc. Natl Acad. Sci. USA 110, 13067–13072 (2013).

  36. Sarkisyan, K. S. et al. Local fitness landscape of the green fluorescent protein. Nature 533, 397–401 (2016).

  37. Seuma, M., Faure, A. J., Badia, M., Lehner, B. & Bolognesi, B. The genetic landscape for amyloid beta fibril nucleation accurately discriminates familial Alzheimer’s disease mutations. eLife 10, e63364 (2021).

  38. Faure, A. J. et al. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 604, 175–183 (2022).

  39. Araya, C. L. et al. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc. Natl Acad. Sci. USA 109, 16858–16863 (2012).

  40. Melamed, D., Young, D. L., Gamble, C. E., Miller, C. R. & Fields, S. Deep mutational scanning of an RRM domain of the Saccharomyces cerevisiae poly(A)-binding protein. RNA 19, 1537–1551 (2013).

  41. Olson, C. A., Wu, N. C. & Sun, R. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr. Biol. 24, 2643–2651 (2014).

  42. Starr, T. N. & Thornton, J. W. Epistasis in protein evolution. Protein Sci. 25, 1204–1218 (2016).

  43. Sailer, Z. R. & Harms, M. J. High-order epistasis shapes evolutionary trajectories. PLoS Comput. Biol. 13, e1005541 (2017).

  44. Naas, T. et al. Beta-lactamase database (BLDB) – structure and function. J. Enzyme Inhib. Med. Chem. 32, 917–919 (2017).

  45. Stiffler, M. A., Hekstra, D. R. & Ranganathan, R. Evolvability as a function of purifying selection in TEM-1 β-lactamase. Cell 160, 882–892 (2015).

  46. Weinreich, D. M., Delaney, N. F., DePristo, M. A. & Hartl, D. L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312, 111–114 (2006).

  47. Sinai, S. et al. AdaLead: a simple and robust adaptive greedy search algorithm for sequence design. Preprint at https://arxiv.org/abs/2010.02141 (2020).

  48. Barrera, L. A. et al. Survey of variation in human transcription factors reveals prevalent DNA binding changes. Science 351, 1450–1454 (2016).

  49. Chaudhury, S., Lyskov, S. & Gray, J. J. PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta. Bioinformatics 26, 689–691 (2010).

  50. Ogden, P. J., Kelsic, E. D., Sinai, S. & Church, G. M. Comprehensive AAV capsid fitness landscape reveals a viral gene and enables machine-guided design. Science 366, 1139–1143 (2019).

  51. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).

  52. Angermueller, C. et al. Model-based reinforcement learning for biological sequence design. In Proc. International Conference on Learning Representations https://openreview.net/forum?id=HklxbgBKvr (ICLR, 2020).

  53. Brookes, D., Park, H. & Listgarten, J. Conditioning by adaptive sampling for robust design. In Proc. 36th International Conference on Machine Learning 773–782 (PMLR, 2019).

  54. Hansen, N. The CMA evolution strategy: a tutorial. Preprint at https://arxiv.org/abs/1604.00772 (2016).

  55. Kirjner, A. et al. Improving protein optimization with smoothed fitness landscapes. In Proc. International Conference on Learning Representations https://openreview.net/forum?id=rxlF2Zv8x0 (ICLR, 2024).

  56. Wang, Y. et al. Self-play reinforcement learning guides protein engineering. Nat. Mach. Intell. 5, 845–860 (2023).

  57. Wittmann, B. J., Yue, Y. & Arnold, F. H. Informed training set design enables efficient machine learning-assisted directed protein evolution. Cell Syst. 12, 1026–1045 (2021).

  58. Qiu, Y., Hu, J. & Wei, G.-W. Cluster learning-assisted directed evolution. Nat. Comput. Sci. 1, 809–818 (2021).

  59. De Visser, J. A. G. & Krug, J. Empirical fitness landscapes and the predictability of evolution. Nat. Rev. Genet. 15, 480–490 (2014).

  60. Tokuriki, N. & Tawfik, D. S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 19, 596–604 (2009).

  61. Lenski, R. E., Barrick, J. E. & Ofria, C. Balancing robustness and evolvability. PLoS Biol. 4, e428 (2006).

  62. Buel, G. R. & Walters, K. J. Can AlphaFold2 predict the impact of missense mutations on structure? Nat. Struct. Mol. Biol. 29, 1–2 (2022).

  63. Wu, L. et al. SPRoBERTa: protein embedding learning with local fragment modeling. Brief. Bioinform. 23, bbac365 (2022).

  64. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).

  65. Hie, B., Zhong, E., Bryson, B. & Berger, B. Learning mutational semantics. Adv. Neural Inf. Process. Syst. 33, 9109–9121 (2020).

  66. Bileschi, M. L. et al. Using deep learning to annotate the protein universe. Nat. Biotechnol. 40, 932–937 (2022).

  67. Kulmanov, M. & Hoehndorf, R. DeepGOPlus: improved protein function prediction from sequence. Bioinformatics 36, 422–429 (2020).

  68. Elnaggar, A. et al. ProtTrans: towards cracking the language of life’s code through self-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2021).

  69. He, L., Deng, P. & Liu, G. μFormer encoder. figshare https://doi.org/10.6084/m9.figshare.26892355 (2024).

  70. Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).

  71. OpenAI. Introducing ChatGPT. https://openai.com/blog/chatgpt (2022).

  72. Ouyang, L. et al. Training language models to follow instructions with human feedback. In 36th Conference on Neural Information Processing Systems (NeurIPS 2022) https://proceedings.neurips.cc/paper_files/paper/2022/file/b1efde53be364a73914f58805a001731-Paper-Conference.pdf (2022).

  73. Firnberg, E., Labonte, J. W., Gray, J. J. & Ostermeier, M. A comprehensive, high-resolution map of a gene’s fitness landscape. Mol. Biol. Evol. 31, 1581–1592 (2014).

  74. Gonzalez, C. E. & Ostermeier, M. Pervasive pairwise intragenic epistasis among sequential mutations in TEM-1 β-lactamase. J. Mol. Biol. 431, 1981–1992 (2019).

  75. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  76. He, L., Deng, P. & Liu, G. microsoft/Mu-Protein: MuProtein 0.1.1. Zenodo https://doi.org/10.5281/zenodo.15836168 (2025).

Acknowledgements

We gratefully acknowledge M. Ostermeier for generously providing detailed information on the plasmids pSkunk3-TEM-1 and pTS42, and for sharing the pTS42 plasmid itself. We also thank T. Peng for developing the demonstration webpage and J. Bai for creating the graphic illustrations (the copyright for which is held by Microsoft, as the work was completed during her employment). Finally, we thank B. Kruft and U. Munir for their invaluable support in program management and coordination.

Author information

Authors and Affiliations

Contributions

Conceptualization: L.H., H.S. and P.D. Methodology and modelling: L.H., H.S., G.L., Z.Z., Y.J., F.J. and L.W. Data curation: L.H., H.S. and P.D. Result interpretation: P.D., H.L., L.H. and C.C. Writing—original draft: L.H., P.D., H.S. and G.L. Writing—review: H.L., T.Q. and C.C. Supervision: T.-Y.L. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Liang He, Pan Deng, Haiguang Liu or Tao Qin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Yuchi Qiu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Ablation and analysis of μFormer components.

a) Ablation study evaluating the importance of each component in μFormer. The change in performance after removing each component, relative to the full model, is shown. Negative numbers (blue) indicate a loss of performance and positive numbers (red) indicate an improvement. The last row displays the average performance change over 9 proteins. The plus/minus signs at the bottom indicate the presence/removal of the corresponding component. b) Spearman ρ statistics for μFormer, ECNet and their variants on 3 FLIP GB1 datasets. ECNet w/ μFormer encoder replaces the language model in ECNet with μFormer’s language model. μFormer-S (Methods) is a variant with a model size similar to ECNet. 1-vs-rest: a train-test split in which single-point mutants are used for training and multi-point mutants are reserved for testing. 2-vs-rest: single- and double-point mutants are used for training, and all higher-order mutants are reserved for testing. 3-vs-rest: single-, double- and triple-point mutants are used for training, and all higher-order mutants are reserved for testing. See Supplementary Notes for details.
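The k-vs-rest splits described above are straightforward to reproduce. The sketch below assumes variants are encoded as colon-separated substitutions such as "A10G:K45R" (a hypothetical encoding chosen for illustration, not necessarily the FLIP format); it builds such a split and scores held-out predictions with Spearman ρ.

```python
from scipy.stats import spearmanr

def mutation_order(variant: str) -> int:
    # Number of substitutions, assuming an "A10G:K45R"-style encoding.
    return len(variant.split(":"))

def k_vs_rest_split(records, k):
    # Mutants with at most k substitutions train the model;
    # all higher-order mutants are reserved for testing.
    train = [r for r in records if mutation_order(r[0]) <= k]
    test = [r for r in records if mutation_order(r[0]) > k]
    return train, test

# Toy fitness records: (variant, measured fitness).
records = [("A10G", 0.8), ("K45R", 0.5), ("D60E", 0.3),
           ("A10G:K45R", 1.1), ("A10G:K45R:D60E", 1.4)]
train, test = k_vs_rest_split(records, k=1)  # the 1-vs-rest setting

# Hypothetical model predictions for the held-out multi-point mutants.
predictions = [1.0, 1.3]
measured = [fitness for _, fitness in test]
rho, _ = spearmanr(predictions, measured)
print(f"test size: {len(test)}, Spearman rho: {rho:.2f}")
```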

Extended Data Fig. 2 Analysis of μProtein.

a) Performance of μFormer and Ridge on GB1 double mutants with varying training data size. Here, μFormer is a variant with a smaller supervised scorer module (μFormer-SS). The training data ratio indicates the number of residues used for training relative to the total number of amino acids in GB1; the training data size equals 209, 418, 627, 836 and 1,045 for the 20%, 40%, 60%, 80% and 100% settings, respectively. All scores were evaluated on GB1 saturated double mutants (n = 535,917). Center: mean. Error bands: standard deviation. Five experiments were performed for each setting, with the training data selected at random. b) Illustration of the test data split, using a protein of 10 residues and the 40% setting as an example. 2/2 unseen: neither of the two mutated residues in a double mutant is seen by the model. 1/2 unseen: exactly one of the two mutated residues is seen by the model. c) Performance of μFormer and Ridge on different splits of GB1 double mutants. Training data split criteria are the same as in a). Center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers. Five experiments were performed for each setting, with the training data selected at random.
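A short sketch of the split taxonomy in panel b may help. The helper below is hypothetical (not taken from the released code); it simply counts how many of a double mutant's two positions fall inside the residue range used for training.

```python
def double_mutant_category(positions, trained_positions):
    # positions: the two mutated residue indices of a double mutant.
    seen = sum(p in trained_positions for p in positions)
    return {2: "2/2 seen", 1: "1/2 unseen", 0: "2/2 unseen"}[seen]

# Example mirroring the figure: a 10-residue protein with 40% of residues
# (positions 0-3) contributing single mutants to the training set.
trained = set(range(4))
print(double_mutant_category((1, 3), trained))  # 2/2 seen
print(double_mutant_category((2, 7), trained))  # 1/2 unseen
print(double_mutant_category((5, 8), trained))  # 2/2 unseen
```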

Extended Data Table 1 Size of Training (Single-point mutants) and Test (Multi-point mutants) Datasets in One-to-Multi Setting

Supplementary information

Supplementary Information

Supplementary Notes 1–3, Figs. 1–11 and Tables 1–3.

Reporting Summary

Supplementary Data 1

Source data for Supplementary Figs. 1, 3, 5, 8, 9 and 11.

Supplementary Table 4

Source data for validation experiment results.

Source data

Source Data Figs. 2–6 and Extended Data Figs. 1 and 2

Source data for Figs. 2–6 and Extended Data Figs. 1 and 2.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, H., He, L., Deng, P. et al. Accelerating protein engineering with fitness landscape modelling and reinforcement learning. Nat Mach Intell 7, 1446–1460 (2025). https://doi.org/10.1038/s42256-025-01103-w
