Scientific Data
WxC-Bench: A Novel Dataset for Weather and Climate Downstream Tasks
  • Data Descriptor
  • Open access
  • Published: 05 March 2026

  • Rajat Shinde (ORCID: orcid.org/0000-0002-9505-6204) [1, na1],
  • Kumar Ankur [1, na1],
  • Christopher E. Phillips [1, na1],
  • Aman Gupta (ORCID: orcid.org/0000-0002-2215-7135) [2, na1],
  • Simon Pfreundschuh [3, na1],
  • Sujit Roy [1, na1],
  • Sheyenne Kirkland [1],
  • Vishal Gaur [1],
  • Venkatesh Kolluru [1],
  • Amy Lin [1],
  • Prajun Trital [4, 5],
  • Aditi Sheshadri [2],
  • Udaysankar Nair [1],
  • Manil Maskey [6] &
  • Rahul Ramachandran [6]

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Atmospheric dynamics
  • Environmental impact

Abstract

High-quality, openly accessible, machine learning (ML)-ready datasets play a foundational role in developing new artificial intelligence (AI) models and in fine-tuning existing models for scientific applications such as weather and climate analysis. However, despite the growing number of deep learning models for weather and climate, curated, pre-processed ML-ready datasets remain scarce. Curating such datasets is challenging, particularly because the modality of the input data varies significantly across downstream tasks that address different spatial and temporal atmospheric scales. Here we introduce WxC-Bench (Weather and Climate Bench), a multi-modal dataset designed to support the development of generalizable AI models for various downstream use cases in weather and climate research. WxC-Bench supports examining several atmospheric processes from the meso-β scale (20–200 km) to the synoptic scale (2500 km) through downstream tasks such as aviation turbulence prediction, hurricane intensity and track monitoring, weather analog search, gravity wave parameterization, and natural language report generation. We provide a comprehensive description of the dataset and present a technical validation for baseline analysis. The dataset and the code used to prepare the ML-ready data are publicly available on Hugging Face and can be accessed using the WxC-Bench Python package.

Data availability

WxC-Bench is publicly available at https://doi.org/10.57967/hf/7711 (ref. 21). Additional details on the file formats and folder structure are provided in the Data Records section.
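
A DOI of this kind is resolved through the doi.org proxy. The following minimal Python sketch is not part of WxC-Bench; it builds the resolver URL for the dataset DOI as listed in reference 21 (10.57967/hf/7711). The helper names `doi_to_url` and `resolve_doi` are hypothetical, and `resolve_doi` requires network access and assumes the landing page answers HEAD requests.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Dataset DOI as listed in reference 21 of the article.
WXC_BENCH_DOI = "10.57967/hf/7711"


def doi_to_url(doi: str) -> str:
    """Build the https://doi.org resolver URL for a DOI (hypothetical helper)."""
    doi = doi.strip()
    # Accept a bare DOI, a "doi:" prefix, or a full resolver URL.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
    return "https://doi.org/" + quote(doi, safe="/")


def resolve_doi(doi: str, timeout: float = 10.0) -> str:
    """Follow the resolver's redirects and return the landing-page URL.

    Needs network access; assumes the target server accepts HEAD requests.
    """
    request = Request(doi_to_url(doi), method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.geturl()
```

Calling `resolve_doi(WXC_BENCH_DOI)` on a networked machine should return the dataset landing page; the file layout found there is described in the Data Records section.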

Code availability

The full codebase for dataset creation, pre-processing, and task-specific pipelines is openly available at: https://github.com/NASA-IMPACT/WxC-Bench. Additionally, a Python package providing programmatic access to WxC-Bench, along with helper utilities for loading and supporting documentation, is available on PyPI: https://pypi.org/project/wxcbench/.
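
As a minimal sketch, the snippet below checks whether the package is present in the current Python environment before it is used. The distribution and import name `wxcbench` is taken from the PyPI URL above and is assumed to match the importable module name; the helper `describe_install` is hypothetical and deliberately calls nothing from the package's own API, which is documented in the repository.

```python
import importlib.util
from importlib import metadata

# Name taken from the PyPI URL; assumed to also be the import name.
PACKAGE = "wxcbench"


def describe_install(package: str = PACKAGE) -> str:
    """Report whether a top-level package is importable, with an install hint."""
    if importlib.util.find_spec(package) is None:
        return f"{package} is not installed; try: pip install {package}"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata (e.g. a stdlib module).
        version = "unknown"
    return f"{package} {version} is installed"


if __name__ == "__main__":
    print(describe_install())
```

Running the script prints either the installed version or the `pip install wxcbench` hint.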

References

  1. Kurth, T. et al. Fourcastnet: Accelerating global high-resolution weather forecasting using adaptive fourier neural operators. In Proceedings of the platform for advanced scientific computing conference, 1–11, https://doi.org/10.48550/arXiv.2208.05419 (2023).

  2. Lam, R. et al. Learning skillful medium-range global weather forecasting. Science 382, 1416–1421, https://doi.org/10.1126/science.adi2336 (2023).

  3. Chen, K. et al. Fengwu: Pushing the skillful global medium-range weather forecast beyond 10 days lead. arXiv preprint arXiv:2304.02948 https://doi.org/10.48550/arXiv.2304.02948 (2023).

  4. Bi, K. et al. Accurate medium-range global weather forecasting with 3D neural networks. Nature 619, 533–538, https://doi.org/10.1038/s41586-023-06185-3 (2023).

  5. Sønderby, C. K. et al. Metnet: A neural weather model for precipitation forecasting. arXiv preprint arXiv:2003.12140 https://doi.org/10.48550/arXiv.2003.12140 (2020).

  6. Weyn, J. A., Durran, D. R., Caruana, R. & Cresswell-Clay, N. Sub-Seasonal Forecasting With a Large Ensemble of Deep-Learning Weather Prediction Models. Journal of Advances in Modeling Earth Systems 13, e2021MS002502, https://doi.org/10.1029/2021MS002502 (2021).

  7. Mansfield, L. A. et al. Updates on Model Hierarchies for Understanding and Simulating the Climate System: A Focus on Data-Informed Methods and Climate Change Impacts. Journal of Advances in Modeling Earth Systems 15, e2023MS003715, https://doi.org/10.1029/2023MS003715 (2023).

  8. Rasp, S., Pritchard, M. S. & Gentine, P. Deep learning to represent subgrid processes in climate models. Proc. Natl Acad. Sci. USA 115, 9684–9689, https://doi.org/10.1073/pnas.1810286115 (2018).

  9. Zanna, L. & Bolton, T. Data-Driven Equation Discovery of Ocean Mesoscale Closures. Geophysical Research Letters 47, e2020GL088376, https://doi.org/10.1029/2020GL088376 (2020).

  10. Espinosa, Z. I., Sheshadri, A., Cain, G. R., Gerber, E. P. & DallaSanta, K. A Deep Learning Parameterization of Gravity Wave Drag Coupled to an Atmospheric Global Climate Model. Geophysical Research Letters (submitted, 2021).

  11. Wang, P., Yuval, J. & O’Gorman, P. A. Non-Local Parameterization of Atmospheric Subgrid Processes With Neural Networks. Journal of Advances in Modeling Earth Systems 14, e2022MS002984, https://doi.org/10.1029/2022MS002984 (2022).

  12. Bretherton, C. S. et al. Correcting Coarse-Grid Weather and Climate Models by Machine Learning From Global Storm-Resolving Simulations. Journal of Advances in Modeling Earth Systems 14, e2021MS002794, https://doi.org/10.1029/2021MS002794 (2022).

  13. Davenport, F. V. & Diffenbaugh, N. S. Using Machine Learning to Analyze Physical Causes of Climate Change: A Case Study of U.S. Midwest Extreme Precipitation. Geophysical Research Letters 48, e2021GL093787, https://doi.org/10.1029/2021GL093787 (2021).

  14. Diffenbaugh, N. S. & Barnes, E. A. Data-driven predictions of the time remaining until critical global warming thresholds are reached. Proceedings of the National Academy of Sciences 120, e2207183120, https://doi.org/10.1073/pnas.2207183120 (2023).

  15. Roy, S. et al. Clifford neural operators on atmospheric data influenced partial differential equations. In 12th International Conference on Learning Representations (2024).

  16. Schmude, J. et al. Prithvi wxc: Foundation model for weather and climate. arXiv preprint arXiv:2409.13598 (2024).

  17. Nguyen, T., Brandstetter, J., Kapoor, A., Gupta, J. K. & Grover, A. ClimaX: A foundation model for weather and climate. arXiv (2023).

  18. Bodnar, C. et al. A foundation model for the Earth system. Nature 641, 1180–1187, https://doi.org/10.1038/s41586-025-09005-y (2025).

  19. Szwarcman, D. et al. Prithvi-eo-2.0: A versatile multi-temporal foundation model for earth observation applications. IEEE Transactions on Geoscience and Remote Sensing https://doi.org/10.1109/TGRS.2025.3642610 (2025).

  20. Bommasani, R. et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2022).

  21. Shinde, R. et al. Wxc-bench, https://doi.org/10.57967/hf/7711 (2026).

  22. Rasp, S. et al. Weatherbench: a benchmark data set for data-driven weather forecasting. Journal of Advances in Modeling Earth Systems 12, https://doi.org/10.1029/2020MS002203 (2020).

  23. Rasp, S. et al. Weatherbench 2: A benchmark for the next generation of data-driven global weather models. arXiv preprint arXiv:2308.15560 (2024).

  24. Prabhat et al. Climatenet: An expert-labelled open dataset and deep learning architecture for enabling high-precision analyses of extreme weather. Geoscientific Model Development Discussions 2020, 1–28, https://doi.org/10.5194/gmd-14-107-2021 (2020).

  25. Watson-Parris, D. et al. Climatebench v1.0: A benchmark for data-driven climate projections. Journal of Advances in Modeling Earth Systems 14, e2021MS002954, https://doi.org/10.1029/2021MS002954 (2022).

  26. Yu, S. et al. Climsim: A large multi-scale dataset for hybrid physics-ml climate emulation. In Oh, A. et al. (eds.) Advances in Neural Information Processing Systems, vol. 36, 22070–22084 (2023).

  27. Nguyen, T., Jewik, J., Bansal, H., Sharma, P. & Grover, A. Climatelearn: Benchmarking machine learning for weather and climate modeling. In Oh, A. et al. (eds.) Advances in Neural Information Processing Systems, vol. 36, 75009–75025 (Curran Associates, Inc., 2023).

  28. Shinde, R. et al. Windset: Weather insights and novel data for systematic evaluation and testing. In The Twelfth International Conference on Learning Representations (ICLR) (2024).

  29. Ito, J., Niino, H. & Yoshino, K. Large Eddy Simulation on Horizontal Convective Rolls that Caused an Aircraft Accident during its Landing at Narita Airport. Geophysical Research Letters 47, https://doi.org/10.1029/2020gl086999 (2020).

  30. Golding, W. Turbulence and Its Impact on Commercial Aviation. Journal of Aviation/Aerospace Education & Research https://doi.org/10.15394/jaaer.2002.1301 (2000).

  31. Emara, M. et al. Machine Learning Enabled Turbulence Prediction using Flight Data for Safety Analysis. In 32nd Congress of the International Council of the Aeronautical Sciences (Shanghai, China, 2021).

  32. Williams, J. K. Using random forests to diagnose aviation turbulence. Machine Learning 95, 51–70, https://doi.org/10.1007/s10994-013-5346-7 (2014).

  33. Hon, K. K., Ng, C. W. & Chan, P. W. Machine learning based multi-index prediction of aviation turbulence over the Asia-Pacific. Machine Learning with Applications 2, 100008, https://doi.org/10.1016/j.mlwa.2020.100008 (2020).

  34. Gelaro, R. et al. The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Journal of Climate 30, 5419–5454, https://doi.org/10.1175/jcli-d-16-0758.1 (2017).

  35. Iowa Environmental Mesonet (IEM) Download PIREPs, https://mesonet.agron.iastate.edu/request/gis/pireps.php.

  36. Wolff, J. K. & Sharman, R. D. Climatology of Upper-Level Turbulence over the Contiguous United States. Journal of Applied Meteorology and Climatology 47, 2198–2214, https://doi.org/10.1175/2008jamc1799.1 (2008).

  37. Fritts, D. C. & Alexander, M. J. Gravity wave dynamics and effects in the middle atmosphere. Reviews of Geophysics 41, https://doi.org/10.1029/2001RG000106 (2003).

  38. Kim, Y.-J., Eckermann, S. D. & Chun, H.-Y. An overview of the past, present and future of gravity-wave drag parametrization for numerical climate and weather prediction models. Atmosphere-Ocean 41, 65–98, https://doi.org/10.3137/ao.410105 (2003).

  39. Achatz, U. et al. Atmospheric Gravity Waves: Processes and Parameterization. Journal of the Atmospheric Sciences https://doi.org/10.1175/JAS-D-23-0210.1 (2023).

  40. Sorbjan, Z. Improving Non-local Parameterization of the Convective Boundary Layer. Boundary-Layer Meteorol 130, 57–69, https://doi.org/10.1007/s10546-008-9331-9 (2009).

  41. Chen, T.-C., Yau, M. K. & Kirshbaum, D. J. Assessment of Conditional Symmetric Instability from Global Reanalysis Data. Journal of the Atmospheric Sciences 75, 2425–2443, https://doi.org/10.1175/JAS-D-17-0221.1 (2018).

  42. Plougonven, R., de la Cámara, A., Hertzog, A. & Lott, F. How does knowledge of atmospheric gravity waves guide their parameterizations? Quarterly Journal of the Royal Meteorological Society 146, 1529–1543, https://doi.org/10.1002/qj.3732 (2020).

  43. McLandress, C., Scinocca, J. F., Shepherd, T. G., Reader, M. C. & Manney, G. L. Dynamical Control of the Mesosphere by Orographic and Nonorographic Gravity Wave Drag during the Extended Northern Winters of 2006 and 2009. J. Atmos. Sci. 70, 2152–2169, https://doi.org/10.1175/JAS-D-12-0297.1 (2012).

  44. de la Cámara, A., Abalos, M. & Hitchcock, P. Changes in Stratospheric Transport and Mixing During Sudden Stratospheric Warmings. Journal of Geophysical Research: Atmospheres 123, 3356–3373, https://doi.org/10.1002/2017JD028007 (2018).

  45. Gupta, A. et al. Estimates of Southern Hemispheric Gravity Wave Momentum Fluxes Across Observations, Reanalyses, and Kilometer-scale Numerical Weather Prediction Model. Journal of the Atmospheric Sciences https://doi.org/10.1175/JAS-D-23-0095.1 (2024).

  46. Kruse, C. G. et al. Observed and Modeled Mountain Waves from the Surface to the Mesosphere near the Drake Passage. Journal of the Atmospheric Sciences 79, 909–932, https://doi.org/10.1175/JAS-D-21-0252.1 (2022).

  47. Kim, Y.-H., Voelker, G. S., Bölöni, G., Zängl, G. & Achatz, U. Crucial role of obliquely propagating gravity waves in the quasi-biennial oscillation dynamics. Atmospheric Chemistry and Physics 24, 3297–3308, https://doi.org/10.5194/acp-24-3297-2024 (2024).

  48. Gupta, A., Sheshadri, A., Alexander, M. J. & Birner, T. Insights on Lateral Gravity Wave Propagation in the Extratropical Stratosphere from 44 Years of ERA5 Data. Geophysical Research Letters https://doi.org/10.1029/2024GL108541 (2024).

  49. Lindborg, E. A Helmholtz decomposition of structure functions and spectra calculated from aircraft data. Journal of Fluid Mechanics 762, R4, https://doi.org/10.1017/jfm.2014.685 (2015).

  50. Hersbach, H. et al. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146, 1999–2049, https://doi.org/10.1002/qj.3803 (2020).

  51. Skamarock, W. C. Evaluating Mesoscale NWP Models Using Kinetic Energy Spectra. Monthly Weather Review 132, 3019–3032, https://doi.org/10.1175/MWR2830.1 (2004).

  52. Gupta, A., Birner, T., Dörnbrack, A. & Polichtchouk, I. Importance of Gravity Wave Forcing for Springtime Southern Polar Vortex Breakdown as Revealed by ERA5. Geophysical Research Letters 48, e2021GL092762, https://doi.org/10.1029/2021GL092762 (2021).

  53. Pahlavan, H. A., Fu, Q., Wallace, J. M. & Kiladis, G. N. Revisiting the Quasi-Biennial Oscillation as Seen in ERA5. Part I: Description and Momentum Budget. Journal of the Atmospheric Sciences 78, 673–691, https://doi.org/10.1175/JAS-D-20-0248.1 (2021).

  54. Chattopadhyay, A., Nabizadeh, E. & Hassanzadeh, P. Analog forecasting of extreme-causing weather patterns using deep learning. Journal of Advances in Modeling Earth Systems 12, https://doi.org/10.1029/2019ms001958 (2020).

  55. van den Dool, H. M. A New Look at Weather Forecasting through Analogues. Monthly Weather Review 117, 2230–2247 (1989).

  56. Yang, D. & Alessandrini, S. An ultra-fast way of searching weather analogs for renewable energy forecasting. Solar Energy 185, 255–261, https://doi.org/10.1016/j.solener.2019.03.068 (2019).

  57. Franch, G., Jurman, G., Coviello, L., Pendesini, M. & Furlanello, C. Mass-umap: Fast and accurate analog ensemble search in weather radar archives. Remote Sensing 11, https://doi.org/10.3390/rs11242922 (2019).

  58. Ahn, H. et al. Searching similar weather maps using convolutional autoencoder and satellite images. ICT Express 9, 69–75, https://doi.org/10.1016/j.icte.2022.03.013 (2023).

  59. Raoult, B., Di Fatta, G., Pappenberger, F. & Lawrence, B. Fast retrieval of weather analogues in a multi-petabytes archive using wavelet-based fingerprints. In Shi, Y. et al. (eds.) Computational Science – ICCS 2018, 697–710 (Springer International Publishing, Cham, 2018).

  60. Demir, B. & Bruzzone, L. Hashing-based scalable remote sensing image search and retrieval in large archives. IEEE Transactions on Geoscience and Remote Sensing 54, 892–904, https://doi.org/10.1109/TGRS.2015.2469138 (2016).

  61. Hu, W., Cervone, G., Young, G. & Delle Monache, L. Machine learning weather analogs for near-surface variables. Boundary-Layer Meteorology 186, 711–735, https://doi.org/10.1007/s10546-022-00779-6 (2023).

  62. Vitart, F. et al. Outcomes of the wmo prize challenge to improve subseasonal to seasonal predictions using artificial intelligence. Bulletin of the American Meteorological Society 103, E2878–E2886, https://doi.org/10.1175/BAMS-D-22-0046.1 (2022).

  63. Robertson, A. W., Vitart, F. & Camargo, S. J. Subseasonal to seasonal prediction of weather to climate with application to tropical cyclones. Journal of Geophysical Research: Atmospheres 125, e2018JD029375, https://doi.org/10.1029/2018JD029375 (2020).

  64. Vitart, F., Robertson, A. W. & Anderson, D. Subseasonal to seasonal prediction project: bridging the gap between weather and climate (2012).

  65. de Witt, C. S. et al. Rainbench: Towards global precipitation forecasting from satellite imagery. arXiv preprint arXiv:2012.09670 (2020).

  66. Knapp, K. R. et al. Globally gridded satellite observations for climate studies. Bulletin of the American Meteorological Society 92, 893–907, https://doi.org/10.1175/2011BAMS3039.1 (2011).

  67. Foster, M. J., Philips, C., Heidinger, A. K. & Program, N. C. Noaa climate data record (cdr) of advanced very high resolution radiometer (avhrr) and high-resolution infra-red sounder (hirs) reflectance, brightness temperature, and cloud products from pathfinder atmospheres - extended (patmos-x), version 6.0. https://doi.org/10.7289/V5X9287 (2021).

  68. Kummerow, C. D., Berg, W. K., Kuo, C.-P., & Program, N. C. Noaa climate data record (cdr) of advanced very high resolution radiometer (avhrr) and high-resolution infra-red sounder (hirs) reflectance, brightness temperature, and cloud products from pathfinder atmospheres - extended (patmos-x), version 6.0. https://doi.org/10.25921/E3K5-BW77 (2022).

  69. Ashouri, H. et al. Persiann-cdr: Daily precipitation climate data record from multisatellite observations for hydrological and climate studies. Bulletin of the American Meteorological Society 96, 69–83 (2015).

  70. Huffman, G., Stocker, E., Bolvin, D., Nelkin, E. & Jackson, T. Gpm imerg final precipitation l3 1 day 0.1 degree x 0.1 degree v07, greenbelt, md, goddard earth sciences data and information services center (ges disc).[accessed 2023 sep 14]. https://doi.org/10.5067/GPM/IMERGDF/DAY/07 (2023).

  71. Emanuel, K. Assessing the present and future probability of hurricane harvey’s rainfall. Proceedings of the National Academy of Sciences 114, 12681–12684, https://doi.org/10.1073/pnas.1716222114 (2017).

  72. Bromirski, P. D. & Kossin, J. P. Increasing hurricane wave power along the us atlantic and gulf coasts. Journal of Geophysical Research: Oceans 113, https://doi.org/10.1029/2007JC004706 (2008).

  73. Emanuel, K. Increasing destructiveness of tropical cyclones over the past 30 years. Nature 436, 686–688, https://doi.org/10.1038/nature03906 (2005).

  74. Burg, T. & Lillo, S. P. Tropycal: A python package for analyzing tropical cyclones and more. In 34th Conference on Hurricanes and Tropical Meteorology (AMS, 2021).

  75. Goldberg, E., Driedger, N. & Kittredge, R. Using natural-language processing to produce weather forecasts. IEEE Expert 9, 45–53, https://doi.org/10.1109/64.294135 (1994).

  76. Ramos-Soto, A., Bugarín, A. J., Barro, S. & Taboada, J. Linguistic descriptions for automatic generation of textual short-term weather forecasts on real prediction data. IEEE Transactions on Fuzzy Systems 23, 44–57, https://doi.org/10.1109/TFUZZ.2014.2328011 (2015).

  77. Reiter, E., Sripada, S., Hunter, J., Yu, J. & Davy, I. Choosing words in computer-generated weather forecasts. Artificial Intelligence 167, 137–169, https://doi.org/10.1016/j.artint.2005.06.006 (2005).

  78. Zhang, H.-P., Wu, H.-P., Gao, J., Zhao, Y.-P. & Lv, Z.-L. Meteorological bulletin automatic generation based on spatio-temporal reasoning. In 2011 International Conference on Machine Learning and Cybernetics, vol. 4, 1927–1931, https://doi.org/10.1109/ICMLC.2011.6016952 (IEEE, 2011).

  79. Oktay, O. et al. Attention U-Net: Learning Where to Look for the Pancreas. arXiv preprint arXiv:1804.03999 https://doi.org/10.48550/arXiv.1804.03999 (2018).

  80. Gupta, A. et al. Machine Learning Global Simulation of Nonlocal Gravity Wave Propagation. arXiv preprint arXiv:2406.14775 (2024).

  81. Geller, M. A. et al. A Comparison between Gravity Wave Momentum Fluxes in Observations and Climate Models. Journal of Climate 26, 6383–6405, https://doi.org/10.1175/JCLI-D-12-00545.1 (2013).

  82. Bouzerdoum, A., Havstad, A. & Beghdadi, A. Image quality assessment using a neural network approach. In Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004., 330–333, https://doi.org/10.1109/ISSPIT.2004.1433751 (2004).

  83. Xie, S., Girshick, R., Dollar, P., Tu, Z. & He, K. Aggregated Residual Transformations for Deep Neural Networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5987–5995, https://doi.org/10.1109/CVPR.2017.634 (IEEE, Honolulu, HI, 2017).

  84. Pfreundschuh, S. et al. A neural network approach to estimating a posteriori distributions of bayesian retrieval problems. Atmospheric Measurement Techniques 11, 4627–4643, https://doi.org/10.5194/amt-11-4627-2018 (2018).

  85. Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2019).

  86. Vitart, F. et al. The subseasonal to seasonal (s2s) prediction project database. Bulletin of the American Meteorological Society 98, 163–173, https://doi.org/10.1175/BAMS-D-16-0017.1 (2017).

  87. de Andrade, F. M., Coelho, C. A. & Cavalcanti, I. F. Global precipitation hindcast quality assessment of the subseasonal to seasonal (s2s) prediction project models. Climate Dynamics 52, 5451–5475, https://doi.org/10.1007/s00382-018-4457-z (2019).

  88. Dosovitskiy, A. et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2021).

  89. Radford, A. et al. Language models are unsupervised multitask learners (2019).

  90. Lin, C.-Y. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, 74–81 (Association for Computational Linguistics, Barcelona, Spain, 2004).

Acknowledgements

This work was supported by NASA’s Office of Chief Science Data Officer and Earth Science Division’s Earth Science Scientific Computing, Earth Science Data Systems Program, and the Earth Science Modeling and Analysis Program.

The long-term precipitation forecasting task uses S2S. S2S is a joint initiative of the World Weather Research Programme (WWRP) and the World Climate Research Programme (WCRP). The original S2S database is hosted at ECMWF as an extension of the TIGGE database. AG and AS were supported by Schmidt Sciences, LLC, a philanthropic initiative founded by Eric and Wendy Schmidt, as part of the Virtual Earth System Research Institute (VESRI). AS acknowledges support from the National Science Foundation through grant OAC-2004492. RS thanks Prajun Trital for working on the WxC-Bench Python Package during their internship. The authors thank the anonymous reviewers for their constructive feedback and Dr. Alireza Foroozani for editorial oversight and guidance throughout the review process.

Author information

Author notes
  1. These authors contributed equally: Rajat Shinde, Kumar Ankur, Christopher E. Phillips, Aman Gupta, Simon Pfreundschuh, Sujit Roy.

Authors and Affiliations

  1. Earth System Science Center, The University of Alabama in Huntsville, Huntsville, AL, USA

    Rajat Shinde, Kumar Ankur, Christopher E. Phillips, Sujit Roy, Sheyenne Kirkland, Vishal Gaur, Venkatesh Kolluru, Amy Lin & Udaysankar Nair

  2. Stanford University, Stanford, CA, USA

    Aman Gupta & Aditi Sheshadri

  3. Department of Atmospheric Science, Colorado State University, Fort Collins, CO, USA

    Simon Pfreundschuh

  4. Department of Computer Science, The University of Alabama in Huntsville, Huntsville, AL, USA

    Prajun Trital

  5. Department of Space Science, The University of Alabama in Huntsville, Huntsville, AL, USA

    Prajun Trital

  6. NASA Marshall Space Flight Center, Huntsville, AL, USA

    Manil Maskey & Rahul Ramachandran

Contributions

R.S., C.E.P., K.A., S.R., A.G., and S.P. compiled the manuscript. A.G., S.R., V.G., and A.S. conceptualized the technical validation for the Nonlocal Parameterization of Gravity Wave Momentum Flux task. S.P. performed the technical validation for the Long-Term Precipitation Forecasting task. K.A. performed the technical validation for the Hurricane Track and Intensity Prediction task. R.S., S.K., and C.E.P. performed the technical validation for the Aviation Turbulence Prediction and Weather Analog Search tasks. R.S. and K.A. curated the data, and R.S. conceptualized the technical validation for the Generation of Natural Language Weather Forecasts task. V.K. and P.T. supported the development of the supplementary materials and the Python package for the final submission. A.L. validated the data files, and U.N. supported the selection of downstream tasks. M.M. and R.R. supervised the work and reviewed the manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Rajat Shinde or Sujit Roy.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Shinde, R., Ankur, K., Phillips, C.E. et al. WxC-Bench: A Novel Dataset for Weather and Climate Downstream Tasks. Sci Data (2026). https://doi.org/10.1038/s41597-026-06839-7

  • Received: 24 December 2024

  • Accepted: 05 February 2026

  • Published: 05 March 2026

  • DOI: https://doi.org/10.1038/s41597-026-06839-7

Scientific Data (Sci Data)

ISSN 2052-4463 (online)

© 2026 Springer Nature Limited
