Abstract
With the rapid development of deep learning weather prediction (DLWP) models such as GenCast, rigorous evaluation of their physical consistency is essential. This study investigates the dynamical fidelity of GenCast against ECMWF IFS-HRES and IFS-ENS using comprehensive kinetic energy (KE) and difference kinetic energy (DKE) spectra over 2021. Unlike the physically consistent error growth in IFS-ENS, GenCast exhibits weak planetary-scale growth and a persistent, flattened KE tail at high wavenumbers starting from the first forecast step. These mesoscale artifacts persist across multiple GenCast variants and AIFS-ENS, indicating a broader challenge for noise-conditioned generation. Helmholtz decomposition further reveals white-noise-like variance rather than balanced dynamics. Spatially, weak interactions between large-scale and mesoscale wind fields suggest a misrepresentation of topography-flow interactions. Furthermore, analyses of the KE gradient magnitude (|∇KE|) reveal that GenCast fails to reproduce sharp, filamentary structures, instead generating broad, isotropic, and noisy patterns. These findings suggest that current noise injection mechanisms in DLWPs produce noisy artifacts that mimic variance without reproducing realistic error growth physics. Improving these mechanisms is vital for developing physically consistent DLWPs.
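As a rough illustration of the two spectral diagnostics named in the abstract, the sketch below computes a KE spectrum and a DKE spectrum between two ensemble members. It is a simplified stand-in, not the study's method: it assumes a one-dimensional FFT over zonal wavenumber along a single latitude circle, whereas a global analysis would more likely use spherical harmonics. The function names `zonal_ke_spectrum` and `zonal_dke_spectrum` and the synthetic winds are hypothetical.

```python
import numpy as np


def zonal_ke_spectrum(u, v):
    """One-sided KE spectrum per zonal wavenumber along one latitude circle.

    u, v: 1-D wind components (m/s) on equally spaced longitudes.
    The returned values sum to the mean of 0.5 * (u**2 + v**2) (Parseval).
    """
    n = u.size
    u_hat = np.fft.rfft(u) / n
    v_hat = np.fft.rfft(v) / n
    e = 0.5 * (np.abs(u_hat) ** 2 + np.abs(v_hat) ** 2)
    # rfft stores only non-negative wavenumbers, so fold in the negative ones.
    e[1:] *= 2.0
    if n % 2 == 0:
        e[-1] /= 2.0  # the Nyquist wavenumber has no separate negative partner
    return e


def zonal_dke_spectrum(u_a, v_a, u_b, v_b):
    """Difference KE spectrum: the KE spectrum of the wind difference
    between two forecasts (for example, two ensemble members)."""
    return zonal_ke_spectrum(u_a - u_b, v_a - v_b)


# Synthetic example (assumed data): a wavenumber-4 wave plus a small perturbation.
rng = np.random.default_rng(0)
lon = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
u_a = 10.0 * np.cos(4 * lon)
v_a = 5.0 * np.sin(4 * lon)
u_b = u_a + 0.1 * rng.standard_normal(lon.size)
v_b = v_a + 0.1 * rng.standard_normal(lon.size)

ke = zonal_ke_spectrum(u_a, v_a)
dke = zonal_dke_spectrum(u_a, v_a, u_b, v_b)
```

In this toy setup the KE spectrum peaks at wavenumber 4, while the DKE spectrum is roughly flat across wavenumbers because the perturbation is white noise. A flat, white-noise-like difference spectrum is the kind of behaviour the abstract contrasts with physically consistent error growth.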
Data availability
ERA5 reanalysis and IFS-HRES data were obtained from the WeatherBench2 dataset at https://console.cloud.google.com/storage/browser/weatherbench2/datasets. ERA5 reanalysis is also available from the Climate Data Store at https://cds.climate.copernicus.eu/. IFS-ENS data were downloaded from the ECMWF TIGGE archive at https://apps.ecmwf.int/datasets/data/tigge/levtype=pv/type=pf/. ECMWF open data are available at https://www.ecmwf.int/en/forecasts/datasets/open-data.
Code availability
The GenCast model code is available at https://github.com/google-deepmind/graphcast. AIFS-ENS is accessible at https://huggingface.co/ecmwf/aifs-ens-1.0.
Acknowledgements
This research was supported by the National Research Foundation (NRF) of Korea under RS-2025-02363044, and by the High-Performance Computing Support Project, funded by the Government of the Republic of Korea (Ministry of Science and ICT). This work was also partially supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [No. RS-2021-II211343, Artificial Intelligence Graduate School Program (Seoul National University)].
Author information
Contributions
Hi. Kim designed and conducted the study, wrote the initial draft of the manuscript, and carried out the revisions. J. Ryu contributed to the computation of KE spectra, discussed the results, provided comments on the manuscript, and was involved in the revision process. S.-W. Son, J.-H. Jeong, and Hy. Kim contributed to writing and editing the manuscript. J.-H. Yoon supervised the overall research, secured the funding, and reviewed the manuscript. All authors contributed to a comprehensive review of the study to ensure its depth and rigor, and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kim, H., Ryu, J., Son, SW. et al. A spectral test of the butterfly effect and physical consistency in the diffusion-based GenCast’s ensembles. npj Clim Atmos Sci (2026). https://doi.org/10.1038/s41612-026-01380-1
DOI: https://doi.org/10.1038/s41612-026-01380-1


