Abstract
Identifying governing equations from observational data is crucial for understanding nonlinear physical systems but remains challenging due to the risk of overfitting. Here we introduce the Bi-Level Identification of Equations (BILLIE) framework, which simultaneously discovers and validates equations using a hierarchical optimization strategy. The policy gradient algorithm of reinforcement learning is leveraged to achieve the bi-level optimization. We demonstrate BILLIE’s superior performance through comparisons with baseline methods in canonical nonlinear systems such as turbulent flows and three-body systems. Furthermore, we apply the BILLIE framework to discover RNA and protein velocity equations directly from single-cell sequencing data. The equations identified by BILLIE outperform empirical models in predicting cellular differentiation states, underscoring BILLIE’s potential to reveal fundamental physical laws across a wide range of scientific fields.
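The bi-level idea can be illustrated with a toy sketch. This is not the BILLIE implementation; it is a minimal example that assumes a simple setting in which a lower level fits coefficients by ridge regression on training data and an upper level uses a REINFORCE-style policy gradient to decide which library terms to keep, rewarded by validation error. The function names (fit_ridge, reward, select_terms), the independent Bernoulli policy, the sparsity penalty and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_ridge(Q, y, alpha=1e-3):
    """Lower level: ridge-regression coefficients for the selected terms."""
    A = Q.T @ Q + alpha * np.eye(Q.shape[1])
    return np.linalg.solve(A, Q.T @ y)

def reward(mask, Q_tr, y_tr, Q_va, y_va, penalty=0.2):
    """Upper-level reward: negative validation MSE minus a sparsity penalty."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:                      # empty equation predicts zero
        return -np.mean(y_va ** 2)
    theta = fit_ridge(Q_tr[:, idx], y_tr)  # coefficients fitted on training data only
    mse = np.mean((Q_va[:, idx] @ theta - y_va) ** 2)
    return -mse - penalty * idx.size

def select_terms(Q_tr, y_tr, Q_va, y_va, n_iter=300, batch=64, lr=0.5, seed=0):
    """Upper level: policy gradient over independent Bernoulli term masks."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(Q_tr.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-logits))
        masks = (rng.random((batch, p.size)) < p).astype(float)
        R = np.array([reward(m, Q_tr, y_tr, Q_va, y_va) for m in masks])
        adv = R - R.mean()                 # mean-reward baseline reduces variance
        # For a Bernoulli policy, d log pi(m) / d logits = m - p
        grad = (adv[:, None] * (masks - p)).mean(axis=0)
        logits += lr * grad                # gradient ascent on expected reward
    return 1.0 / (1.0 + np.exp(-logits))

# Synthetic data (assumption): only library columns 0 and 2 drive the target.
data_rng = np.random.default_rng(1)
Q = data_rng.normal(size=(400, 6))
y = 2.0 * Q[:, 0] - 1.5 * Q[:, 2] + 0.01 * data_rng.normal(size=400)

probs = select_terms(Q[:200], y[:200], Q[200:], y[200:])
print(np.round(probs, 2))  # selection probability of each candidate term
```

Scoring candidate equations on data that the coefficient fit never sees is what discourages overfitting in this sketch: a spurious term can lower the training error but does not improve the validation reward.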
Data availability
The datasets used in this study are available in the Zenodo repository at https://doi.org/10.5281/zenodo.15140828 (ref. 59). Peripheral blood mononuclear cell CITE-Seq dataset (related to Extended Data Fig. 1 and Supplementary Fig. 5): the protein and RNA expression profiles were downloaded from the Gene Expression Omnibus database with the accession numbers GSM2695381 (protein) and GSM2695382 (RNA). Mouse hippocampus RNA-Seq dataset (related to Supplementary Figs. 6 and 7): the RNA expression profiles were downloaded from http://pklab.med.harvard.edu/velocyto/DentateGyrus/DentateGyrus.loom. Source data are available with this manuscript.
Code availability
The source code to reproduce the results in this study is available on GitHub at https://github.com/HuiningYuan/BILLIE and on Code Ocean at https://doi.org/10.24433/CO.0462000.v1 (ref. 60).
References
La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
Gorin, G., Svensson, V. & Pachter, L. Protein velocity and acceleration from single-cell multiomics experiments. Genome Biol. 21, 39 (2020).
Carroll, B. W. & Ostlie, D. A. An Introduction to Modern Astrophysics (Cambridge Univ. Press, 2017).
Batchelor, G. K. An Introduction to Fluid Dynamics (Cambridge Univ. Press, 1967).
Karatzas, I. & Shreve, S. E. Methods of Mathematical Finance Vol. 39 (Springer, 1998).
Achdou, Y., Buera, F. J., Lasry, J.-M., Lions, P.-L. & Moll, B. Partial differential equation models in macroeconomics. Phil. Trans. R. Soc. A 372, 20130397 (2014).
Schuch, N. & Verstraete, F. Computational complexity of interacting electrons and fundamental limitations of density functional theory. Nat. Phys. 5, 732–735 (2009).
Schmidt, M. & Lipson, H. Distilling free-form natural laws from experimental data. Science 324, 81–85 (2009).
Udrescu, S.-M. & Tegmark, M. AI Feynman: a physics-inspired method for symbolic regression. Sci. Adv. 6, eaay2631 (2020).
Udrescu, S.-M. et al. AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity. Adv. Neural Inf. Process. Syst. 33, 4860–4871 (2020).
Vastl, M., Kulhánek, J., Kubalík, J., Derner, E. & Babuška, R. Symformer: end-to-end symbolic regression using transformer-based architecture. IEEE Access 12, 37840–37849 (2024).
Sun, F., Liu, Y., Wang, J.-X. & Sun, H. Symbolic physics learner: discovering governing equations via Monte Carlo tree search. In Proc. 11th International Conference on Learning Representations https://openreview.net/forum?id=ZTK3SefE8_Z (OpenReview.net, 2023).
Lemos, P., Jeffrey, N., Cranmer, M., Ho, S. & Battaglia, P. Rediscovering orbital mechanics with machine learning. Mach. Learn. Sci. Technol. 4, 045002 (2023).
Brunton, S. L., Proctor, J. L. & Kutz, J. N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl Acad. Sci. USA 113, 3932–3937 (2016).
Rudy, S. H., Brunton, S. L., Proctor, J. L. & Kutz, J. N. Data-driven discovery of partial differential equations. Sci. Adv. 3, e1602614 (2017).
Champion, K., Zheng, P., Aravkin, A. Y., Brunton, S. L. & Kutz, J. N. A unified sparse optimization framework to learn parsimonious physics-informed models from data. IEEE Access 8, 169259–169271 (2020).
Chen, Z., Liu, Y. & Sun, H. Physics-informed learning of governing equations from scarce data. Nat. Commun. 12, 6136 (2021).
Boninsegna, L., Nüske, F. & Clementi, C. Sparse learning of stochastic dynamical equations. J. Chem. Phys. 148, 241723 (2018).
Zheng, P., Askham, T., Brunton, S. L., Kutz, J. N. & Aravkin, A. Y. A unified framework for sparse relaxed regularized regression: SR3. IEEE Access 7, 1404–1423 (2018).
Champion, K., Lusch, B., Kutz, J. N. & Brunton, S. L. Data-driven discovery of coordinates and governing equations. Proc. Natl Acad. Sci. USA 116, 22445–22451 (2019).
Xu, H., Chang, H. & Zhang, D. DLGA-PDE: discovery of PDEs with incomplete candidate library via combination of deep learning and genetic algorithm. J. Comput. Phys. 418, 109584 (2020).
Xu, H., Zhang, D. & Zeng, J. Deep-learning of parametric partial differential equations from sparse and noisy data. Phys. Fluids 33, 037132 (2021).
Xu, H., Zhang, D. & Wang, N. Deep-learning based discovery of partial differential equations in integral form from sparse and noisy data. J. Comput. Phys. 445, 110592 (2021).
Reinbold, P. A. K., Gurevich, D. R. & Grigoriev, R. O. Using noisy or incomplete data to discover models of spatiotemporal dynamics. Phys. Rev. E 101, 010203 (2020).
Reinbold, P. A., Kageorge, L. M., Schatz, M. F. & Grigoriev, R. O. Robust learning from noisy, incomplete, high-dimensional experimental data via physically constrained symbolic regression. Nat. Commun. 12, 3219 (2021).
Fasel, U., Kutz, J. N., Brunton, B. W. & Brunton, S. L. Ensemble-SINDy: robust sparse model discovery in the low-data, high-noise limit, with active learning and control. Proc. R. Soc. A 478, 20210904 (2022).
Berg, J. & Nyström, K. Data-driven discovery of PDEs in complex datasets. J. Comput. Phys. 384, 239–252 (2019).
Xu, H., Chang, H. & Zhang, D. DL-PDE: deep-learning based data-driven discovery of partial differential equations from discrete and noisy data. Commun. Comput. Phys. 29, 698–728 (2021).
Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019).
Raissi, M. & Karniadakis, G. E. Hidden physics models: machine learning of nonlinear partial differential equations. J. Comput. Phys. 357, 125–141 (2018).
Long, Z., Lu, Y., Ma, X. & Dong, B. PDE-Net: learning PDEs from data. Proc. Mach. Learn. Res. 80, 3208–3216 (2018).
Long, Z., Lu, Y. & Dong, B. PDE-Net 2.0: learning PDEs from data with a numeric–symbolic hybrid deep network. J. Comput. Phys. 399, 108925 (2019).
Rao, C. et al. Encoding physics to learn reaction–diffusion processes. Nat. Mach. Intell. 5, 765–779 (2023).
Kabanikhin, S. I. Definitions and examples of inverse and ill-posed problems. J. Inverse Ill-Posed Probl. 16, 317–357 (2008).
Sutton, R. S., McAllester, D., Singh, S. & Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. Adv. Neural Inf. Process. Syst. 12, 1057–1063 (1999).
Silver, D. et al. Deterministic policy gradient algorithms. Proc. Mach. Learn. Res. 32, 387–395 (2014).
Bergen, V., Lange, M., Peidli, S., Wolf, F. A. & Theis, F. J. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 38, 1408–1414 (2020).
McDonald, P. W. The Computation of Transonic Flow Through Two-Dimensional Gas Turbine Cascades 79825 (American Society of Mechanical Engineers, 1971).
Ferziger, J. H., Perić, M. & Street, R. L. Computational Methods for Fluid Dynamics Vol. 3 (Springer, 2002).
Li, T., Shi, J., Wu, Y. & Zhou, P. On the mathematics of RNA velocity I: theoretical analysis. CSIAM Trans. Appl. Math. 2, 1–55 (2021).
Stoeckius, M. et al. Large-scale simultaneous measurement of epitopes and transcriptomes in single cells. Nat. Methods 14, 865–868 (2017).
Setty, M. et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat. Biotechnol. 37, 451–460 (2019).
Dhapola, P. et al. Scarf enables a highly memory-efficient analysis of large-scale single-cell genomics data. Nat. Commun. 13, 4616 (2022).
Hochgerner, H., Zeisel, A., Lönnerberg, P. & Linnarsson, S. Conserved properties of dentate gyrus neurogenesis across postnatal development revealed by single-cell RNA sequencing. Nat. Neurosci. 21, 290–299 (2018).
Cleveland, W. S. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74, 829–836 (1979).
Oduguwa, V. & Roy, R. Bi-level optimisation using genetic algorithm. In Proc. 2002 IEEE International Conference on Artificial Intelligence Systems (ICAIS 2002) 322–327 (IEEE, 2002).
Wang, X. et al. Optimizing data usage via differentiable rewards. Proc. Mach. Learn. Res. 119, 9983–9995 (2020).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (eds Bengio, Y. & LeCun, Y.) https://arxiv.org/abs/1412.6980 (2014).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Hoerl, A. E. & Kennard, R. W. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12, 55–67 (1970).
Hoerl, A. E. & Kennard, R. W. Ridge regression: applications to nonorthogonal problems. Technometrics 12, 69–82 (1970).
Raissi, M., Yazdani, A. & Karniadakis, G. E. Hidden fluid mechanics: learning velocity and pressure fields from flow visualizations. Science 367, 1026–1030 (2020).
Boffetta, G. et al. Two-dimensional turbulence. Annu. Rev. Fluid Mech. 44, 427–451 (2012).
Kochkov, D. et al. Machine learning-accelerated computational fluid dynamics. Proc. Natl Acad. Sci. USA 118, e2101784118 (2021).
Van Leer, B. Towards the ultimate conservative difference scheme. V. A second-order sequel to Godunov’s method. J. Comput. Phys. 32, 101–136 (1979).
Frisch, U. Turbulence: the Legacy of A. N. Kolmogorov (Cambridge Univ. Press, 1995).
de Silva, B. et al. PySINDy: a Python package for the sparse identification of nonlinear dynamical systems from data. J. Open Source Softw. 5, 2104 (2020).
Kaptanoglu, A. A. et al. PySINDy: a comprehensive Python package for robust sparse system identification. J. Open Source Softw. 7, 3994 (2022).
Li, Z. Bi-level identification of governing equations for nonlinear physical systems. Zenodo https://doi.org/10.5281/zenodo.15140828 (2025).
Li, Z. et al. Bi-level identification of governing equations for nonlinear physical systems. Code Ocean https://doi.org/10.24433/CO.0462000.v1 (2025).
Acknowledgements
This work is supported by the National Natural Science Foundation of China (no. 52376090) and the National Key Research and Development Program of China (no. 2022YFF0504500).
Author information
Authors and Affiliations
Contributions
L.Y. and W.H. supervised the project. L.Y., Z.L., H.Y. and W.H. conceived the idea. Z.L. carried out the numerical simulations. Z.L., H.Y., Y.H. and H.D. performed the research. All authors discussed the results and assisted during paper preparation.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Alan Ali Kaptanoglu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Jie Pan, in collaboration with the Nature Computational Science team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Identifying RNA velocity and protein velocity on multi-modal single-cell sequencing data.
a, The flow of a gene’s information from unspliced mRNA (denoted u) to spliced mRNA (s) through splicing, and from spliced mRNA (s) to protein (p) through translation. b, Cell types in the single-cell sequencing dataset used to identify RNA velocity and protein velocity; Mono cells were used as the training data, and CD4+ T and CD8+ T cells were used as the testing data. c, The process of performing RNA/protein velocity identification with BILLIE, in which the equations across different genes share the same form (that is, Γ) while having distinct libraries (that is, Ut and Q) and coefficients (that is, θ). d, The cell-level correlation between the original sequencing and the predictions made by the identified equation and the empirical equation for the abundance of spliced mRNA. e, The relationship between gene-level correlation (between the original sequencing and the predictions) and data sparsity for the abundance of spliced mRNA. Each point in the plot represents a single gene, and 69.5%, 30.5% and 0.5% denote the proportions of genes in the groups defined by data sparsity and prediction performance. Predictions with a Pearson correlation over 0.6 are considered ‘good’ predictions; genes with data sparsity over 0.5 are considered ‘very sparse’. f, Spliced mRNA abundance of representative marker genes, including the original sequencing and the predictions made by the different equations. g, The cell-level correlation between the original sequencing and the predictions for the abundance of protein. h, The gene-level correlation of all 7 genes for the abundance of protein. i, Protein abundance of 4 of the 7 marker genes.
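The grouping in panel e can be sketched in a few lines. This is a minimal illustration, not the authors’ analysis code: it assumes that expression is stored as a cells-by-genes array and that ‘data sparsity’ means the fraction of cells with a zero count for a gene, which the caption does not state. The function name gene_level_summary and the toy arrays are hypothetical; only the 0.6 and 0.5 cut-offs come from the caption.

```python
import numpy as np

def gene_level_summary(observed, predicted, corr_cut=0.6, sparsity_cut=0.5):
    """Per-gene Pearson correlation and sparsity for (n_cells, n_genes) arrays."""
    obs_c = observed - observed.mean(axis=0)
    pred_c = predicted - predicted.mean(axis=0)
    denom = np.sqrt((obs_c ** 2).sum(axis=0) * (pred_c ** 2).sum(axis=0))
    corr = (obs_c * pred_c).sum(axis=0) / denom   # undefined for constant genes
    sparsity = (observed == 0).mean(axis=0)       # assumption: fraction of zero entries
    return corr, sparsity, corr > corr_cut, sparsity > sparsity_cut

# Toy data (assumption): 4 cells, 2 genes.
observed = np.array([[0.0, 1.0], [0.0, 2.0], [3.0, 3.0], [0.0, 4.0]])
predicted = np.array([[0.1, 4.0], [0.0, 3.0], [2.5, 2.0], [0.2, 1.0]])

corr, sparsity, good, very_sparse = gene_level_summary(observed, predicted)
print(np.round(corr, 3), sparsity, good, very_sparse)
```

In this toy input the first gene is both well predicted and very sparse, whereas the second is dense but poorly predicted, so the two genes fall into different groups.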
Extended Data Fig. 2 The general workflow of identifying the governing equation of a 2D fluid dynamical system from data.
From the spatial–temporal measurements collected from a physical system (such as the fluid system shown in the first panel on the left), the spatial and temporal derivatives at each location are calculated using polynomial fitting (the second panel) and then used to build the overcomplete library Q (the third panel). Selecting the proper terms from the overcomplete library identifies the dynamics of the system (the last panel on the right).
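The three steps of this workflow can be sketched on a much simpler problem. The example below is an illustration under stated assumptions, not the pipeline used in the paper: it replaces the 2D fluid system with a one-dimensional logistic ODE, dx/dt = x − x², and it selects terms with sequentially thresholded least squares (the SINDy-style procedure of Brunton et al.) rather than with BILLIE. The function names, window size, polynomial degree and threshold are hypothetical choices.

```python
import numpy as np

def poly_derivative(t, x, half_width=3, degree=3):
    """Estimate dx/dt at interior points by fitting a local polynomial."""
    dx = np.full_like(x, np.nan)
    for i in range(half_width, len(t) - half_width):
        sl = slice(i - half_width, i + half_width + 1)
        coeffs = np.polyfit(t[sl] - t[i], x[sl], degree)
        dx[i] = coeffs[-2]   # linear coefficient equals the derivative at t[i]
    return dx

def build_library(x):
    """Overcomplete candidate library Q with columns [1, x, x^2, x^3]."""
    return np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])

def stlsq(Q, dx, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: zero small terms, then refit."""
    theta = np.linalg.lstsq(Q, dx, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(theta) < threshold
        theta[small] = 0.0
        keep = ~small
        if keep.any():
            theta[keep] = np.linalg.lstsq(Q[:, keep], dx, rcond=None)[0]
    return theta

# Measurements: exact logistic solution with x(0) = 0.1 (noise-free assumption).
t = np.linspace(0.0, 10.0, 201)
x = 1.0 / (1.0 + 9.0 * np.exp(-t))

dx = poly_derivative(t, x)
valid = ~np.isnan(dx)                      # drop the window edges
theta = stlsq(build_library(x[valid]), dx[valid])
print(np.round(theta, 3))                  # coefficients of [1, x, x^2, x^3]
```

With noise-free data the constant and cubic terms are pruned and the coefficients of x and x² approach 1 and −1. Noisy measurements would generally need a wider fitting window or a more robust selection step.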
Supplementary information
Supplementary Information
Supplementary Notes 1–7, Figs. 1–8 and Tables 1–4.
Source data
Source Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 1
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Z., Yuan, H., Han, W. et al. Bi-level identification of governing equations for nonlinear physical systems. Nat Comput Sci 5, 456–466 (2025). https://doi.org/10.1038/s43588-025-00804-x
DOI: https://doi.org/10.1038/s43588-025-00804-x


