Generalized design of sequence–ensemble–function relationships for intrinsically disordered proteins

A preprint version of the article is available at bioRxiv.

Abstract

The design of folded proteins has advanced substantially in recent years. However, many proteins and protein regions are intrinsically disordered and lack a stable fold: the sequence of an intrinsically disordered protein (IDP) instead encodes a vast ensemble of spatial conformations that specify its biological function. This conformational plasticity and heterogeneity make IDP design challenging. Here we introduce a computational framework for de novo design of IDPs through rational and efficient inversion of molecular simulations that approximate the underlying sequence–ensemble relationship. We highlight the versatility of this approach by designing IDPs with diverse properties and arbitrary sequence constraints. These include IDPs with target ensemble dimensions, loops and linkers, highly sensitive sensors of physicochemical stimuli, and binders to target disordered substrates with distinct conformational biases. Overall, our method provides a general framework for designing sequence–ensemble–function relationships of biological macromolecules.
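
As a rough illustration of what inverting a sequence-to-property map can look like, the following Python sketch optimizes a relaxed (probabilistic) sequence by gradient descent until a differentiable property reaches a target value. It is not the authors' implementation: a simple closed-form surrogate stands in for the molecular simulation, and the four-letter alphabet, the per-residue values in `SCORES`, and all function names are hypothetical choices made only for this example.

```python
import math
import random

# Hypothetical per-residue scores for a toy 4-letter alphabet (illustrative only).
SCORES = {"G": 0.0, "S": 0.2, "K": 0.6, "F": 1.0}
ALPHABET = list(SCORES)


def softmax(logits):
    """Convert one position's logits into residue probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_property(logits):
    """Toy surrogate property: mean expected residue score over all positions."""
    per_position = [
        sum(p * SCORES[a] for p, a in zip(softmax(row), ALPHABET)) for row in logits
    ]
    return sum(per_position) / len(per_position)


def design(target, length=30, steps=1000, lr=5.0, seed=0):
    """Gradient descent on relaxed sequence logits toward a target property."""
    rng = random.Random(seed)
    logits = [[rng.gauss(0.0, 0.1) for _ in ALPHABET] for _ in range(length)]
    for _ in range(steps):
        err = expected_property(logits) - target
        for row in logits:
            probs = softmax(row)
            mean_score = sum(p * SCORES[a] for p, a in zip(probs, ALPHABET))
            for k, a in enumerate(ALPHABET):
                # d(err**2)/d(logit), using the softmax Jacobian analytically.
                grad = 2.0 * err * probs[k] * (SCORES[a] - mean_score) / length
                row[k] -= lr * grad
    return logits


def sample_sequence(logits, seed=0):
    """Draw one discrete sequence from the optimized per-position probabilities."""
    rng = random.Random(seed)
    return "".join(rng.choices(ALPHABET, weights=softmax(row))[0] for row in logits)


logits = design(target=0.7)
relaxed_value = expected_property(logits)
sequence = sample_sequence(logits)
```

Here the relaxed property converges to the target, while any single sampled sequence matches it only approximately. In a realistic setting the surrogate would be replaced by an ensemble average estimated from simulation, and the gradient would come from automatic differentiation rather than a hand-derived formula.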

Fig. 1: Method for inverse design of IDPs.
Fig. 2: Designing IDPs with varying ensemble dimensions.
Fig. 3: Shaping global conformational biases through loops and linker IDPs.
Fig. 4: Engineering IDPs with arbitrary sequence constraints.
Fig. 5: Programming stimuli-responsive IDP sensors.
Fig. 6: Designing IDP binders for disordered substrates.

Data availability

All optimized sequences are provided in Supplementary Data 1. Source data are provided with this paper.

Code availability

The complete codebase is available at the following GitHub repository: https://github.com/rkruegs123/idp-design. This repository includes a notebook containing a scaffold of a simple optimization for a custom state-level property. A snapshot of this repository, including the full source code and corresponding documentation, has also been archived on Zenodo at https://doi.org/10.5281/zenodo.15311353 (ref. 65).
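
For readers unfamiliar with the term, a state-level property can be read as a quantity computed from a single conformation and then averaged over the ensemble. The Python sketch below illustrates that reading with the radius of gyration. It is an assumption-laden example, not the repository's API: the function names, the bead coordinates, and the unweighted average are all hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Radius of gyration of one conformation given as (x, y, z) bead positions."""
    n = len(coords)
    center = [sum(c[d] for c in coords) / n for d in range(3)]
    mean_sq = sum(
        sum((c[d] - center[d]) ** 2 for d in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)


def ensemble_average(prop, states):
    """Unweighted average of a per-state property over a list of conformations."""
    return sum(prop(s) for s in states) / len(states)


# Two hypothetical three-bead conformations (arbitrary length units).
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

mean_rg = ensemble_average(radius_of_gyration, [extended, bent])
```

Any other per-conformation observable, such as an end-to-end distance, could be passed to `ensemble_average` in the same way.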

References

  1. Boehr, D. D., Nussinov, R. & Wright, P. E. The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789–796 (2009).

  2. Holehouse, A. S. & Kragelund, B. B. The molecular basis for cellular function of intrinsically disordered protein regions. Nat. Rev. Mol. Cell Biol. 25, 187–211 (2024).

  3. van der Lee, R. et al. Classification of intrinsically disordered regions and proteins. Chem. Rev. 114, 6589–6631 (2014).

  4. Uversky, V. N. Recent developments in the field of intrinsically disordered proteins: intrinsic disorder–based emergence in cellular biology in light of the physiological and pathological liquid–liquid phase transitions. Annu. Rev. Biophys. 50, 135–156 (2021).

  5. Tesei, G. et al. Conformational ensembles of the human intrinsically disordered proteome. Nature 626, 897–904 (2024).

  6. Thomasen, F. E. & Lindorff-Larsen, K. Conformational ensembles of intrinsically disordered proteins and flexible multidomain proteins. Biochem. Soc. Trans. 50, 541–554 (2022).

  7. Mittag, T. & Forman-Kay, J. D. Atomic-level characterization of disordered protein ensembles. Curr. Opin. Struct. Biol. 17, 3–14 (2007).

  8. Davey, N. E., Simonetti, L. & Ivarsson, Y. The next wave of interactomics: mapping the SLiM-based interactions of the intrinsically disordered proteome. Curr. Opin. Struct. Biol. 80, 102593 (2023).

  9. Huang, Q., Li, M., Lai, L. & Liu, Z. Allostery of multidomain proteins with disordered linkers. Curr. Opin. Struct. Biol. 62, 175–182 (2020).

  10. Moses, D., Ginell, G. M., Holehouse, A. S. & Sukenik, S. Intrinsically disordered regions are poised to act as sensors of cellular chemistry. Trends Biochem. Sci. 48, 1019–1034 (2023).

  11. Banani, S. F. et al. Genetic variation associated with condensate dysregulation in disease. Develop. Cell 57, 1776–1788.e8 (2022).

  12. Shrinivas, K. et al. Enhancer features that drive formation of transcriptional condensates. Mol. Cell 75, 549–561 (2019).

  13. Sabari, B. R. Biomolecular condensates and gene activation in development and disease. Develop. Cell 55, 84–96 (2020).

  14. Shi, M., Zhang, P., Vora, S. M. & Wu, H. Higher-order assemblies in innate immune and inflammatory signaling: a general principle in cell biology. Curr. Opin. Cell Biol. 63, 194–203 (2020).

  15. Tsang, B., Pritišanac, I., Scherer, S. W., Moses, A. M. & Forman-Kay, J. D. Phase separation as a missing mechanism for interpretation of disease mutations. Cell 183, 1742–1756 (2020).

  16. Dauparas, J. et al. Robust deep learning–based protein sequence design using ProteinMPNN. Science 378, 49–56 (2022).

  17. Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).

  18. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  19. Krishna, R. et al. Generalized biomolecular modeling and design with RoseTTAFold All-Atom. Science 384, eadl2528 (2024).

  20. Dignon, G. L., Zheng, W., Best, R. B., Kim, Y. C. & Mittal, J. Relation between single-molecule properties and phase behavior of intrinsically disordered proteins. Proc. Natl Acad. Sci. USA 115, 9929–9934 (2018).

  21. Joseph, J. A. et al. Physics-driven coarse-grained model for biomolecular phase separation with near-quantitative accuracy. Nat. Comput. Sci. 1, 732–743 (2021).

  22. Tesei, G., Schulze, T. K., Crehuet, R. & Lindorff-Larsen, K. Accurate model of liquid–liquid phase behavior of intrinsically disordered proteins from optimization of single-chain properties. Proc. Natl Acad. Sci. USA 118, e2111696118 (2021).

  23. Lotthammer, J. M., Ginell, G. M., Griffith, D., Emenecker, R. J. & Holehouse, A. S. Direct prediction of intrinsically disordered protein conformational properties from sequence. Nat. Methods 21, 465–476 (2024).

  24. Emenecker, R. J., Guadalupe, K., Shamoon, N. M., Sukenik, S. & Holehouse, A. S. Sequence–ensemble–function relationships for disordered proteins in live cells. Preprint at bioRxiv https://doi.org/10.1101/2023.10.29.564547 (2023).

  25. Regy, R. M., Thompson, J., Kim, Y. C. & Mittal, J. Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins. Protein Sci. 30, 1371–1379 (2021).

  26. Bradbury, J. et al. JAX: composable transformations of Python+NumPy programs. GitHub https://github.com/jax-ml/jax (2018).

  27. Schoenholz, S. & Cubuk, E. D. JAX MD: a framework for differentiable physics. In Proc. 34th International Conference on Neural Information Processing Systems Vol. 33, 11428–11441 (2020).

  28. Thaler, S. & Zavadlav, J. Learning neural network potentials from experimental data via differentiable trajectory reweighting. Nat. Commun. 12, 6884 (2021).

  29. Zhang, S.-X., Wan, Z.-Q. & Yao, H. Automatic differentiable Monte Carlo: theory and application. Phys. Rev. Res. 5, 033041 (2023).

  30. González-Foutel, N. S. et al. Conformational buffering underlies functional selection in intrinsically disordered protein regions. Nat. Struct. Mol. Biol. 29, 781–790 (2022).

  31. Lin, Y.-H. & Chan, H. S. Phase separation and single-chain compactness of charged disordered proteins are strongly correlated. Biophys. J. 112, 2043–2046 (2017).

  32. Greener, J. G. Differentiable simulation to develop molecular dynamics force fields for disordered proteins. Chem. Sci. 15, 4897–4909 (2024).

  33. Mugnai, M. L. et al. Sizes, conformational fluctuations, and SAXS profiles for intrinsically disordered proteins. Protein Sci. 34, e70067 (2025).

  34. Riback, J. A. et al. Commonly used FRET fluorophores promote collapse of an otherwise disordered protein. Proc. Natl Acad. Sci. USA 116, 8889–8894 (2019).

  35. Krueger, R. K. & Ward, M. JAX-RNAfold: scalable differentiable folding. Bioinformatics 41, btaf203 (2025).

  36. Emenecker, R. J., Griffith, D. & Holehouse, A. S. Metapredict V2: an update to metapredict, a fast, accurate, and easy-to-use predictor of consensus disorder and structure. Preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2022.06.06.494887v1 (2022).

  37. Das, R. K. & Pappu, R. V. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc. Natl Acad. Sci. USA 110, 13392–13397 (2013).

  38. Zhang, M. et al. The intrinsically disordered region from PP2C phosphatases functions as a conserved CO2 sensor. Nat. Cell Biol. 24, 1029–1037 (2022).

  39. Dignon, G. L., Zheng, W., Kim, Y. C. & Mittal, J. Temperature-controlled liquid–liquid phase separation of disordered proteins. ACS Central Sci. 5, 821–830 (2019).

  40. Borgia, A. et al. Extreme disorder in an ultrahigh-affinity protein complex. Nature 555, 61–66 (2018).

  41. Schuler, B. et al. Binding without folding—the biomolecular function of disordered polyelectrolyte complexes. Curr. Opin. Struct. Biol. 60, 66–76 (2020).

  42. Adachi, K. & Kawaguchi, K. Predicting heteropolymer interactions: demixing and hypermixing of disordered protein sequences. Phys. Rev. X 14, 031011 (2024).

  43. Ginell, G. M. et al. Sequence-based prediction of intermolecular interactions driven by disordered regions. Science 388, eadq8381 (2025).

  44. Portz, B., Lee, B. L. & Shorter, J. FUS and TDP-43 phases in health and disease. Trends Biochem. Sci. 46, 550–563 (2021).

  45. Wang, J. et al. A molecular grammar governing the driving forces for phase separation of prion-like RNA binding proteins. Cell 174, 688–699.e16 (2018).

  46. Roden, C. & Gladfelter, A. S. RNA contributions to the form and function of biomolecular condensates. Nat. Rev. Mol. Cell Biol. 22, 183–195 (2021).

  47. Rauh, A. S., Hedemark, G. S., Tesei, G. & Lindorff-Larsen, K. A coarse-grained model for simulations of phosphorylated disordered proteins. Biophys. J. https://doi.org/10.1016/j.bpj.2025.07.001 (2025).

  48. Kilgore, H. R. et al. Distinct chemical environments in biomolecular condensates. Nat. Chem. Biol. 20, 291–301 (2024).

  49. Choi, J.-M. & Pappu, R. V. Improvements to the ABSINTH force field for proteins based on experimentally derived amino acid specific backbone conformational statistics. J. Chem. Theory Comput. 15, 1367–1382 (2019).

  50. Wessén, J., Das, S., Pal, T. & Chan, H. S. Analytical formulation and field-theoretic simulation of sequence-specific phase separation of protein-like heteropolymers with short- and long-spatial-range interactions. J. Phys. Chem. B 126, 9222–9245 (2022).

  51. Shrinivas, K. & Brenner, M. P. Phase separation in fluids with many interacting components. Proc. Natl Acad. Sci. USA 118, e2108551118 (2021).

  52. Frank, C. et al. Scalable protein design using optimization in a relaxed sequence space. Science 386, 439–445 (2024).

  53. Matthies, M. C., Krueger, R., Torda, A. E. & Ward, M. Differentiable partition function calculation for RNA. Nucleic Acids Res. 52, e14 (2024).

  54. Krueger, R. K., Engel, M. C., Hausen, R. & Brenner, M. P. Fitting coarse-grained models to macroscopic experimental data via automatic differentiation. Preprint at https://arxiv.org/abs/2411.09216 (2025).

  55. Janson, G. & Feig, M. Transferable deep generative modeling of intrinsically disordered protein conformations. PLoS Comput. Biol. 20, e1012144 (2024).

  56. Liu, C. et al. Diffusing protein binders to intrinsically disordered proteins. Nature 644, 809–817 (2025).

  57. Pesce, F. et al. Design of intrinsically disordered protein variants with diverse structural properties. Sci. Adv. 10, eadm9926 (2024).

  58. Sanchez-Burgos, I., Espinosa, J. R., Joseph, J. A. & Collepardo-Guevara, R. RNA length has a non-trivial effect in the stability of biomolecular condensates formed by RNA-binding proteins. PLoS Comput. Biol. 18, e1009810 (2022).

  59. Zhu, H. et al. The chromatin regulator HMGA1a undergoes phase separation in the nucleus. ChemBioChem 24, e202200450 (2023).

  60. Alston, J. J., Soranno, A. & Holehouse, A. S. Conserved molecular recognition by an intrinsically disordered region in the absence of sequence conservation. Biophys. J. 123, 26a (2024).

  61. Wessén, J., Das, S., Pal, T. & Chan, H. S. Analytical formulation and field-theoretic simulation of sequence-specific phase separation of protein-like heteropolymers with short- and long-spatial-range interactions. J. Phys. Chem. B 126, 9222–9245 (2022).

  62. Garaizar, A. et al. Aging can transform single-component protein condensates into multiphase architectures. Proc. Natl Acad. Sci. USA 119, e2119800119 (2022).

  63. Taneja, I. & Lasker, K. Machine learning based methods to generate conformational ensembles of disordered proteins. Biophys. J. 123, 101–113 (2023).

  64. Wang, X., Ramírez-Hinestrosa, S., Dobnikar, J. & Frenkel, D. The Lennard–Jones potential: when (not) to use it. Phys. Chem. Chem. Phys. 22, 10624–10633 (2020).

  65. Krueger, R. K. & Shrinivas, K. rkruegs123/idp-design: file format change for figures. Zenodo https://doi.org/10.5281/zenodo.15311353 (2025).

Acknowledgements

We thank M. Ward for his collaboration on sequence design via overparameterization in the context of differentiable RNA folding, which inspired this work; J. Smith for helpful discussions relating to stochastic gradient estimators; W. Snead for helpful discussions on IDP design; and J. Boodry, N. Tyagi and members of the Shrinivas laboratory for discussions and experimentation with the codebase. We acknowledge support from the Simons Foundation through the Simons Foundation Investigator award (R.K.K., M.P.B. and K.S.). We acknowledge support from the NSF AI Institute of Dynamic Systems (grant no. 2112085), the Office of Naval Research (grant no. N00014-17-1-3029) and the Harvard Materials Research Science and Engineering Center (DMR no. 20-11754 to R.K.K. and M.P.B.). We acknowledge support from NSF-Simons Center for Mathematical and Statistical Analysis of Biology at Harvard (grant no. 1764269) and Northwestern University for startup funding (K.S.). The computations in this paper were in part run on the FASRC Cannon cluster supported by the FAS Division of Science Research Computing Group at Harvard University. This research was supported in part through the computational resources and staff contributions provided for the Quest high performance computing facility at Northwestern University.

Author information

Contributions

R.K.K., M.P.B. and K.S. designed the study. R.K.K. and K.S. performed the research. All authors contributed to the writing and revision of the manuscript.

Corresponding authors

Correspondence to Michael P. Brenner or Krishna Shrinivas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Mikael Lund and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–8, Tables 1–4 and Sections 1–9.

Reporting Summary

Peer Review file

Supplementary Data 1

All optimized sequences from Figs. 1–6.

Source data

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Fig. 6

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Krueger, R.K., Brenner, M.P. & Shrinivas, K. Generalized design of sequence–ensemble–function relationships for intrinsically disordered proteins. Nat Comput Sci (2025). https://doi.org/10.1038/s43588-025-00881-y

  • DOI: https://doi.org/10.1038/s43588-025-00881-y
