Machine learning models are promising approaches to tackle partial differential equations, which are foundational descriptions of many scientific and engineering problems. However, conversations with several experts about progress in the area reveal emerging questions over what realistic advantages machine learning models offer and how their performance should be evaluated.
Partial differential equations (PDEs) model the spatiotemporal behaviour of physical systems, serving as descriptors of natural physical laws in a compact and symbolic form. For complex problems, these equations are usually not amenable to analytical solutions, and computationally intensive numerical approximations are often the method of choice. However, the rise of machine learning (ML) approaches in recent years has led to new, data-driven ways to solve PDEs, promising transformative engineering applications in domains such as fluid dynamics, heat transfer and wave propagation [1].
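To make the classical route concrete, the sketch below advances the one-dimensional heat equation with an explicit finite-difference scheme, the kind of time-stepping loop that ML-based solvers aim to replace or accelerate. It is a minimal illustration, not drawn from any of the works discussed here; the diffusivity, grid sizes and initial condition are arbitrary choices.

```python
import numpy as np

# Minimal explicit finite-difference solver for the 1D heat equation
#   du/dt = alpha * d2u/dx2  on [0, 1], with u(0) = u(1) = 0.
# Illustrative only: diffusivity, grid and initial condition are arbitrary.
alpha = 0.01                 # diffusivity
nx, nt = 101, 2000           # spatial points, time steps
dx = 1.0 / (nx - 1)
dt = 0.4 * dx**2 / alpha     # respects the stability limit dt <= dx^2 / (2 * alpha)

x = np.linspace(0.0, 1.0, nx)
u = np.sin(np.pi * x)        # initial condition

for _ in range(nt):
    # second-order central difference for the Laplacian on interior points
    u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])

# this particular problem has a closed-form solution, so the result is verifiable
u_exact = np.exp(-alpha * np.pi**2 * nt * dt) * np.sin(np.pi * x)
print("max error:", np.abs(u - u_exact).max())
```

For realistic three-dimensional problems the same loop runs over millions of grid points and far smaller time steps, which is where the promise of learned surrogates lies: trading the scheme's convergence guarantees for speed.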
Although ML for PDEs has rapidly become a highly active area, the successful deployment of such approaches outside academic research has so far been limited. One challenge for the community seems to be agreeing on appropriate benchmark datasets and evaluation metrics.
As a recent community effort, a large-scale dataset collection, known as ‘the Well’, was presented at NeurIPS 2024 [2], containing numerical simulations for a variety of spatiotemporal physical systems at different scales, from acoustic waves to active matter and astrophysical phenomena. According to Petros Koumoutsakos, professor at Harvard, the work is “a step in the right direction” and stands out as it provides a convenient repository of data from simulations. All in all, the Well seems to be a serious attempt to address the current need for more complex and challenging benchmark problems that can be used to test ML solvers of PDEs in science and engineering.
Previously, Nick McGreivy, at the time a researcher at Princeton University, and Ammar Hakim, a principal research physicist at the Princeton Plasma Physics Laboratory, reported in an Analysis in this journal [3] on the status of ML-based PDE solvers in fluid dynamics, by surveying 82 articles in the relevant literature. The authors identified two concerning issues. First, they found that comparisons against standard numerical methods have often been unfair, either because they involve non-state-of-the-art baselines for the specific problem, or because the runtime of a less accurate ML model is compared with the longer runtime of a more accurate standard numerical method. Second, they found evidence for systematic underreporting of negative results. Overall, they concluded that there is an overoptimistic view of the capabilities of ML models to solve fluid-related PDE problems.
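One hedged way to make the first issue concrete is a matched-accuracy comparison: rather than quoting the runtime of a less accurate ML surrogate against a more accurate numerical solve, estimate what the numerical baseline would cost at the surrogate's own error level. The sketch below illustrates the idea; every number in it (the resolutions, errors, runtimes and the log-log interpolation) is invented for illustration and is not taken from the Analysis.

```python
import numpy as np

# Hedged sketch of a matched-accuracy speed comparison. All values are made up:
# runtimes and errors of a numerical solver measured at four grid resolutions.
num_runtime = np.array([0.1, 0.8, 6.4, 51.2])        # seconds per solve
num_error = np.array([2e-1, 5e-2, 1.2e-2, 3e-3])     # error at each resolution

ml_runtime, ml_error = 0.05, 4e-2                    # hypothetical ML surrogate

# Runtime and error both behave roughly as power laws in resolution, so
# interpolate log-runtime against log-error to find the numerical cost
# at the surrogate's error level (np.interp needs increasing x, hence [::-1]).
log_rt = np.interp(np.log(ml_error),
                   np.log(num_error[::-1]),
                   np.log(num_runtime[::-1]))
matched_runtime = np.exp(log_rt)

print(f"numerical runtime at matched accuracy: {matched_runtime:.3f} s")
print(f"honest speedup: {matched_runtime / ml_runtime:.1f}x")
```

Comparing at matched accuracy can still show a genuine speedup, but it removes the flattering mismatch the Analysis warns about, in which a fast-but-inaccurate model is set against a slow-but-accurate solve.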
Commenting on the issues raised in the Analysis, Johannes Brandstetter, assistant professor at Johannes Kepler University Linz and chief researcher at NXAI, a newly founded AI research institute in Linz, identifies two fundamental questions in a News & Views article in this issue. First, why is it so challenging for the community to enable fair comparisons between ML models and numerical methods? Second, are there any genuine advantages provided by ML models? An underlying issue is that any specific PDE problem comes with many user demands, such as speed, accuracy, uncertainty quantification, generalization and ease of practical implementation. In Brandstetter’s opinion, benchmark challenges that combine such demands are needed to advance the field, similar to the CASP challenge for protein structure prediction.
We spoke to Brandstetter and McGreivy to further probe their opinions on progress in the field. McGreivy agrees that CASP-like benchmark problems are needed, stating that “until that happens, ML-based solvers for fluid-related PDEs will remain a solution looking for a problem”. He highlights three key characteristics of a useful benchmark problem: (1) it should be currently unsolved by existing methods; (2) it should be verifiable, to ensure that the answer is correct; and (3) it should be useful, contributing to scientific knowledge or real-world industrial applications. However, McGreivy noted that because ground truth can currently be provided only by high-fidelity modelling such as direct numerical simulation, benchmarks for PDE solvers cannot be both computationally unsolved and verifiable at the same time, making it difficult for researchers to devise impactful challenge problems.
Brandstetter is convinced that a change of perspective is needed: researchers should seek out problems that existing numerical methods cannot solve. Accordingly, he is pursuing complex PDE problems an order of magnitude larger than previously tackled systems, as he believes such complexity is necessary for deep learning to really shine. “We should not forget that the numerical methods are also just proxies and if we look at problems where these proxies are almost perfect and reasonably fast, there is hardly any reason to replace them”, he added.
Crafting useful and challenging benchmarks in which ML methods can excel and become the go-to solution is a priority, and such efforts will greatly benefit from close collaboration with domain experts. As Koumoutsakos told us, “Choosing informative benchmarks requires significant domain knowledge that exceeds solving an equation and collecting data”. We are excited to see the direction the field will take, and what problems the ML-based PDE approaches of tomorrow will solve.
References
1. Brunton, S. L. & Kutz, J. N. Nat. Comput. Sci. 4, 483–494 (2024).
2. Ohana, R. et al. In Proc. 38th Conf. on Neural Information Processing Systems (NeurIPS 2024), Datasets and Benchmarks Track (2024).
3. McGreivy, N. & Hakim, A. Nat. Mach. Intell. 6, 1256–1269 (2024).