Abstract
Oceanic data assimilation (DA) systems have been developed to optimally combine numerical-model predictions with actual measurements from the ocean to create the best estimates of current ocean conditions and their uncertainties, improving our ability to forecast and understand global climate variations. We developed DeepDA, a global oceanic DA system using deep learning, by integrating a partial convolutional neural network and a generative adversarial network. Partial convolution serves as an observation operator, mapping irregular observational data onto gridded fields, while the generative adversarial network incorporates observational information from previous time frames. Our observing system simulation experiments, using simulated observations for the DA, revealed that DeepDA markedly reduces the analysis error of oceanic temperature, outperforming both background and observed values. DeepDA’s real-case global temperature reanalysis spanning from 1981 to 2020 accurately reconstructs observed global climatological temperature fields, along with their seasonal cycles, major modes of oceanic temperature variability and the global warming trend. Developed solely with a long-term control simulation, DeepDA lowers technical hurdles in creating global ocean reanalysis datasets using multiple numerical models’ physical constraints, thereby diminishing systematic uncertainties in estimating global oceanic states over decades with these models.
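To illustrate how a partial convolution can map sparse observations onto a grid, the following is a minimal single-channel NumPy sketch of the operation as described by Liu et al. (2018), not the DeepDA implementation itself. The function name `partial_conv2d`, the 'valid' padding and the averaging kernel are assumptions made for illustration only.

```python
import numpy as np

def partial_conv2d(x, mask, weight, bias=0.0):
    """Single-channel partial convolution with 'valid' padding.

    x      : 2-D field; values where mask == 0 are ignored
    mask   : 2-D binary array, 1 where an observation exists
    weight : 2-D convolution kernel
    Returns the convolved field and the updated mask.
    """
    kh, kw = weight.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    new_mask = np.zeros((out_h, out_w))
    window_size = kh * kw
    for i in range(out_h):
        for j in range(out_w):
            m = mask[i:i + kh, j:j + kw]
            n_valid = m.sum()
            if n_valid > 0:
                # Use observed values only, then rescale by the fraction
                # of the window that was actually observed.
                patch = np.where(m > 0, x[i:i + kh, j:j + kw], 0.0)
                out[i, j] = (weight * patch).sum() * (window_size / n_valid) + bias
                # The output cell counts as observed from now on.
                new_mask[i, j] = 1.0
    return out, new_mask

# Hypothetical example: a 5 x 5 field with a single observation of 2.0.
field = np.full((5, 5), np.nan)
obs_mask = np.zeros((5, 5))
field[0, 0] = 2.0
obs_mask[0, 0] = 1.0
kernel = np.full((3, 3), 1.0 / 9.0)
gridded, updated_mask = partial_conv2d(field, obs_mask, kernel)
```

In this example only the output cell whose window contains the observation receives a value; all other cells stay at zero with a zero mask, so stacking such layers gradually spreads observational information across the grid.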
Data availability
The data used in this study can be downloaded from the following sources: HadIOD v.1.2.0.0, https://www.metoffice.gov.uk/hadobs/hadiod/download-hadiod1-2-0-0.html; OISST v.2, https://downloads.psl.noaa.gov/Datasets/noaa.oisst.v2.highres; ERSST v.5, https://psl.noaa.gov/data/gridded/data.noaa.ersst.v5.html; EN4.2.2, https://www.metoffice.gov.uk/hadobs/en4/download-en4-2-2.html; ORAS5, https://doi.org/10.24381/cds.67e8eeb7; COBE, https://downloads.psl.noaa.gov/Datasets/COBE; ECCO v.4r4, https://podaac.jpl.nasa.gov/announcements/2021-04-27-ECCO-Version-4-Datasets-Release; ECDA, https://data1.gfdl.noaa.gov/dods-data/gfdl_cm2_1/Fv_NetCDF_test/pp/ocean_interp/ts/monthly (alternative https://apdrc.soest.hawaii.edu/dods/public_data/GFDL/ecda_v2.0); GECCO3, https://icdc.cen.uni-hamburg.de/thredds/catalog/ftpthredds/EASYInit/GECCO3/regular_1x1_grid/catalog.html; GODAS, https://psl.noaa.gov/data/gridded/data.godas.html; MERRA2, https://doi.org/10.5067/4IASLIDL8EEC; SODA v.3.4.2, http://www.atmos.umd.edu/~ocean/index_files/soda3_readme.htm; ARMOR3D L4, https://doi.org/10.48670/moi-00052; Roemmich–Gilson Argo climatology, https://sio-argo.ucsd.edu/RG_Climatology.html; and GHRSST MW-OI, https://doi.org/10.5067/GHMWO-4FR51. Source data are provided with this paper.
Code availability
TensorFlow (https://www.tensorflow.org) libraries were used to formulate the deep-learning model for the global oceanic DA. The developed model and the sample dataset are available via Code Ocean at https://doi.org/10.24433/CO.7269173.v2 and via Zenodo at https://doi.org/10.5281/zenodo.11255094 (ref. 61).
References
Ghil, M. & Malanotte-Rizzoli, P. in Advances in Geophysics (eds Dmowska, R. & Saltzman, B.) Vol. 33, 141–266 (Elsevier, 1991).
Derber, J. & Rosati, A. A global oceanic data assimilation system. J. Phys. Oceanogr. 19, 1333–1347 (1989).
Keppenne, C. L. & Rienecker, M. M. Assimilation of temperature into an isopycnal ocean general circulation model using a parallel ensemble Kalman filter. J. Mar. Syst. 40, 363–380 (2003).
Evensen, G. Using the extended Kalman filter with a multilayer quasi‐geostrophic ocean model. J. Geophys. Res. Oceans 97, 17905–17924 (1992).
Zhang, S., Harrison, M. J., Rosati, A. & Wittenberg, A. System design and evaluation of coupled ensemble data assimilation for global oceanic climate studies. Mon. Weather Rev. 135, 3541–3564 (2007).
Penny, S. G. et al. The local ensemble transform Kalman filter and the running-in-place algorithm applied to a global ocean general circulation model. Nonlinear Process. Geophys. 20, 1031–1046 (2013).
Sugiura, N. et al. Development of a four-dimensional variational coupled data assimilation system for enhanced analysis and prediction of seasonal to interannual climate variations. J. Geophys. Res. Oceans 113, C10017 (2008).
Kalnay, E. et al. The NCEP/NCAR reanalysis project. Bull. Am. Meteorol. Soc. 77, 437–471 (1996).
Carton, J. A., Chepurin, G. A. & Chen, L. SODA3: a new ocean climate reanalysis. J. Clim. 31, 6967–6983 (2018).
Zuo, H., Balmaseda, M. A., Tietsche, S., Mogensen, K. & Mayer, M. The ECMWF operational ensemble reanalysis-analysis system for ocean and sea ice: a description of the system and assessment. Ocean Sci. 15, 779–808 (2019).
Saha, S. et al. The NCEP climate forecast system version 2. J. Clim. 27, 2185–2208 (2014).
Takaya, Y. et al. Japan Meteorological Agency/Meteorological Research Institute-Coupled Prediction System version 2 (JMA/MRI-CPS2): atmosphere–land–ocean–sea ice coupled prediction system for operational seasonal forecasting. Clim. Dyn. 50, 751–765 (2018).
Waters, J. et al. Implementing a variational data assimilation system in an operational 1/4 degree global ocean model. Q. J. R. Meteorol. Soc. 141, 333–349 (2015).
Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146, 1999–2049 (2020).
Casas, C. Q., Arcucci, R., Wu, P., Pain, C. & Guo, Y. K. A reduced order deep data assimilation model. Phys. D Nonlinear Phenom. 412, 132615 (2020).
Arcucci, R., Zhu, J., Hu, S. & Guo, Y. K. Deep data assimilation: integrating deep learning with data assimilation. Appl. Sci. 11, 1114 (2021).
Tang, M., Liu, Y. & Durlofsky, L. J. A deep-learning-based surrogate model for data assimilation in dynamic subsurface flow problems. J. Comput. Phys. 413, 109456 (2020).
Buizza, C. et al. Data learning: integrating data assimilation and machine learning. J. Comput. Sci. 58, 101525 (2022).
Kim, H., Ham, Y. G., Joo, Y. S. & Son, S. W. Deep learning for bias correction of MJO prediction. Nat. Commun. 12, 3087 (2021).
Grönquist, P. et al. Deep learning for post-processing ensemble weather forecasts. Philos. Trans. A Math. Phys. Eng. Sci. 379, 20200092 (2021).
Liu, G. et al. Image inpainting for irregular holes using partial convolutions. In Proc. 15th European Conference on Computer Vision 89–105 (Springer, 2018).
Kadow, C., Hall, D. M. & Ulbrich, U. Artificial intelligence reconstructs missing climate information. Nat. Geosci. 13, 408–413 (2020).
Goodfellow, I. J. et al. Generative adversarial nets. Proc. Adv. Neural Inf. Process. Syst. 3, 2672–2680 (2014).
Pathak, D., Krähenbühl, P., Donahue, J., Darrell, T. & Efros, A. A. Context encoders: feature learning by inpainting. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2536–2544 (IEEE, 2016).
Lam, R. et al. Learning skillful medium-range global weather forecasting. Science 382, 1416–1421 (2023).
Iten, R., Metger, T., Wilming, H., Del Rio, L. & Renner, R. Discovering physical concepts with neural networks. Phys. Rev. Lett. 124, 010508 (2020).
Bertino, L., Evensen, G. & Wackernagel, H. Sequential data assimilation techniques in oceanography. Int. Stat. Rev. 71, 223–241 (2003).
Rodgers, K. B. et al. Ubiquity of human-induced changes in climate variability. Earth Syst. Dyn. 12, 1393–1411 (2021).
Arnold, C. P. & Dey, C. H. Observing-systems simulation experiments: past, present, and future. Bull. Am. Meteorol. Soc. 67, 687–695 (1986).
Atkinson, C. P., Rayner, N. A., Kennedy, J. J. & Good, S. A. An integrated database of ocean temperature and salinity observations. J. Geophys. Res. Oceans 119, 7139–7163 (2014).
Kalnay, E. Atmospheric Modeling, Data Assimilation and Predictability (Cambridge Univ. Press, 2003).
Huang, B. et al. Improvements of the daily optimum interpolation sea surface temperature (DOISST) version 2.1. J. Clim. 34, 2923–2939 (2021).
Kim, Y. H., Hwang, C. & Choi, B. J. An assessment of ocean climate reanalysis by the data assimilation system of KIOST from 1947 to 2012. Ocean Model. 91, 1–22 (2015).
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
Huang, B. et al. Extended Reconstructed Sea Surface Temperature, version 5 (ERSST v5): upgrades, validations, and intercomparisons. J. Clim. 30, 8179–8205 (2017).
Good, S. A., Martin, M. J. & Rayner, N. A. EN4: quality controlled ocean temperature and salinity profiles and monthly objective analyses with uncertainty estimates. J. Geophys. Res. Oceans 118, 6704–6716 (2013).
Dash, P. et al. Group for High Resolution Sea Surface Temperature (GHRSST) analysis fields inter-comparisons—part 2: near real time web-based level 4 SST Quality Monitor (L4-SQUAM). Deep Sea Res. II: Top. Stud. Oceanogr. 77, 31–43 (2012).
Roemmich, D. & Gilson, J. The 2004–2008 mean and annual cycle of temperature, salinity, and steric height in the global ocean from the Argo Program. Prog. Oceanogr. 82, 81–100 (2009).
Castruccio, F. et al. An EnOI‐based data assimilation system with DART for a high‐resolution version of the CESM2 ocean component. J. Adv. Model. Earth Syst. 12, e2020MS002176 (2020).
Freeman, E. et al. ICOADS release 3.0: a major update to the historical marine climate record. Int. J. Climatol. 37, 2211–2232 (2017).
Eyring, V. et al. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev. 9, 1937–1958 (2016).
DelSole, T., Nattala, J. & Tippett, M. K. Skill improvement from increased ensemble size and model diversity. Geophys. Res. Lett. 41, 7331–7342 (2014).
Ronneberger, O., Fischer, P. & Brox, T. U-net: convolutional networks for biomedical image segmentation. In Proc. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 Vol. 9351 (eds Navab, N. et al.) 234–241 (Springer, 2015).
Isola, P., Zhu, J.-Y., Zhou, T. & Efros, A. A. Image-to-image translation with conditional adversarial networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 1125–1134 (2017).
Laves, M. H., Ihler, S., Kortmann, K. P. & Ortmaier, T. Calibration of model uncertainty for dropout variational inference. Preprint at https://arxiv.org/abs/2006.11584 (2020).
Leutbecher, M. et al. Stochastic representations of model uncertainties at ECMWF: state of the art and future vision. Q. J. R. Meteorol. Soc. 143, 2315–2339 (2017).
He, K., Zhang, X., Ren, S. & Sun, J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proc. IEEE International Conference on Computer Vision 1026–1034 (2015).
Glorot, X. & Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proc. Thirteenth International Conference on Artificial Intelligence and Statistics 249–256 (2010).
Ishii, M., Shouji, A., Sugimoto, S. & Matsumoto, T. Objective analyses of sea‐surface temperature and marine meteorological variables for the 20th century using ICOADS and the Kobe collection. Int. J. Climatol.: J. R. Meteorol. Soc. 25, 865–879 (2005).
Gouretski, V. & Reseghetti, F. On depth and temperature biases in bathythermograph data: development of a new correction scheme based on analysis of a global ocean database. Deep Sea Res. I: Oceanogr. Res. Papers 57, 812–833 (2010).
Gouretski, V. & Cheng, L. Correction for systematic errors in the global dataset of temperature profiles from mechanical bathythermographs. J. Atmos. Oceanic Technol. 37, 841–855 (2020).
Guinehut, S., Dhomps, A. L., Larnicol, G. & Le Traon, P. Y. High resolution 3-D temperature and salinity fields derived from in situ and satellite observations. Ocean Sci. 8, 845–857 (2012).
Fukumori, I. et al. Synopsis of the ECCO Central Production Global Ocean and Sea-Ice State Estimate, version 4 release 4. Zenodo https://doi.org/10.5281/zenodo.3765929 (2020).
Chang, Y. S., Zhang, S., Rosati, A., Delworth, T. L. & Stern, W. F. An assessment of oceanic variability for 1960–2010 from the GFDL ensemble coupled data assimilation. Clim. Dyn. 40, 775–803 (2013).
Köhl, A. Evaluating the GECCO3 1948–2018 ocean synthesis–a configuration for initializing the MPI-ESM climate model. Q. J. R. Meteorol. Soc. 146, 2250–2273 (2020).
Behringer, D. & Xue, Y. Evaluation of the global ocean data assimilation system at NCEP: The Pacific Ocean. In Proc. Eighth Symp. on Integrated Observing and Assimilation Systems for Atmosphere, Oceans, and Land Surface. 11–15 (AMS, 2004).
Gelaro, R. et al. The modern-era retrospective analysis for research and applications, version 2 (MERRA-2). J. Clim. 30, 5419–5454 (2017).
Woodruff, S. D. et al. ICOADS Release 2.5: extensions and enhancements to the surface marine meteorological archive. Int. J. Climatol. 31, 951–967 (2011).
McPhaden, M. J., Busalacchi, A. J. & Anderson, D. L. A TOGA retrospective. Oceanography 23, 86–103 (2010).
Argo. Argo float data and metadata from Global Data Assembly Centre (Argo GDAC). SEANOE https://doi.org/10.17882/42182 (2022).
Kim, J.-H. The developed DeepDA model and the sample dataset. Zenodo https://doi.org/10.5281/zenodo.11255094 (2020).
Acknowledgements
This study is supported by Korea Environment Industry and Technology Institute through ‘Climate Change R&D Project for New Climate Regime’, funded by Korea Ministry of Environment (grant no. 2022003560006). Y.-G.H. was supported by the Ministry of Science and ICT through the National Research Foundation of Korea (NRF-2022M3K3A1094114).
Author information
Contributions
Y.-G.H. designed the study. J.-H.K., Y.-S.J. and Y.-G.H. formulated the deep-learning model and performed the experiments. J.-G.L. provided the observational data. Y.-S.J. analysed the results. Y.-G.H. wrote the paper. All the authors discussed the study results.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks James Carton, Peter van Leeuwen, Christopher Kadow and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Initial adjustment of DeepDA-produced subsurface temperature anomalies in the OSSEs.
The temperature anomalies of the (a) ground truth, (b) observations, (c) analysis, and (d) background states at 5 m at the first data assimilation cycle in the Observing System Simulation Experiments (OSSEs). (e)-(h) same as (a)-(d), but for 105 m. (i)-(l) same as (a)-(d), but for 198 m. (m)-(p) same as (a)-(d), but for 707 m.
Extended Data Fig. 2 Horizontal distribution of the analysis increment in the OSSEs using the DeepDA.
The analysis increment (that is, analysis states minus background states) with a single observation at (a) 0.5°S, 101°W, (b) 0.5°S, 159°E, (c) 35.5°N, 51°W, (d) 19.5°S, 99°E, (e) 19.5°S, 59°E, and (f) 0.5°S, 6°W on 1 January 1997 in the OSSEs.
Extended Data Fig. 3 Reduction of the loss values in time and DeepDA-produced subsurface temperature anomalies in the OSSEs after 3-months spin-up.
(a) Time series of the globally and vertically (from 0 to 700 m) averaged root-mean-squared errors (RMSEs) of the temperature anomalies in the analysis fields relative to the ground truth from January to December 1974 in the OSSEs. Black dots mark individual assimilation cycles. The temperature anomalies of the (b) ground truth, (c) observations, (d) analysis states, and (e) background states at the surface layer at the 15th data assimilation cycle in the Observing System Simulation Experiments (OSSEs). (f)-(i) same as (b)-(e), but for 105 m. (j)-(m) same as (b)-(e), but for 198 m. (n)-(q) same as (b)-(e), but for 707 m.
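A globally and vertically averaged RMSE of this kind could be computed along the following lines. This is a hedged sketch rather than the authors' procedure: the cosine-of-latitude area weighting, the equal weighting of vertical layers, the handling of land points as NaN and the function name `weighted_rmse` are all assumptions.

```python
import numpy as np

def weighted_rmse(analysis, truth, lat_deg):
    """RMSE over (depth, lat, lon) arrays with cos(latitude) area weights.

    Land or missing points are expected to be NaN and are excluded.
    Vertical layers are weighted equally (an assumption).
    """
    weights = np.cos(np.deg2rad(lat_deg))[None, :, None]
    weights = np.broadcast_to(weights, analysis.shape)
    valid = np.isfinite(analysis) & np.isfinite(truth)
    sq_err = np.where(valid, (analysis - truth) ** 2, 0.0)
    return float(np.sqrt((weights * sq_err).sum() / (weights * valid).sum()))

# Hypothetical example: 2 levels, 3 latitudes, 4 longitudes, one land point.
lats = np.array([-30.0, 0.0, 30.0])
truth_field = np.zeros((2, 3, 4))
analysis_field = truth_field + 0.5
analysis_field[0, 1, 2] = np.nan  # masked (land) point
rmse_value = weighted_rmse(analysis_field, truth_field, lats)
```

Evaluating the function once per assimilation cycle would give a time series comparable in form to panel (a).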
Extended Data Fig. 4 Ensemble spread in DeepDA reanalysis.
Ensemble spread of the monthly temperature from 1981 to 2020 at each vertical level using 100 ensemble members in real-case DeepDA reanalysis (panels in left columns), spread between 6 different oceanic reanalysis products (that is, ORAS5, ECCO v4r4, ECDA, GECCO2, GODAS, SODA v3.4.2) (panels in mid-columns) and the ensemble spread in 5 ensemble members in ORAS5 (panels in right columns). Oceanic surface (a–c), 100 m (d–f), 200 m (g–i), 300 m (j–l) and 500 m depth (m,o).
Extended Data Fig. 5 RMSE of the DeepDA reanalysis in the observed locations.
(a) Tropical Pacific (TP, 14.5°S-14.5°N, 119.5°E-239.5°E), (b) North Pacific (NP, 20–55°N, 119.5°E-239.5°E), (c) Northern Atlantic (NA, 1.5°N-64.5°N, 279.5°E-359.5°E), and (d) Indian Ocean (IO, 19.5°S-19°N, 39.5°E-119°E) averages of the monthly averaged temperature RMSE from the surface to 750 m in DeepDA and other oceanic reanalysis products. The RMSE is calculated using the in situ Argo and TAO observations.
Extended Data Fig. 6 Annual-mean SST and its biases in various oceanic reanalysis products.
Annual-mean SST (contour), and the difference from the reference data (that is, ERSST V5) (shading) during 1981–2020 in (a) ORAS5, (b) COBE, (c) ECCO v4r4, (d) ECDA v2, (e) GECCO2, (f) OSTIA, (g) GODAS, (h) MERRA2, and (i) SODA v3.4.2.
Extended Data Fig. 7 Annual-mean T300 and its biases in various oceanic reanalysis products.
Annual-mean temperature averaged from the surface to 300 m (T300) in DeepDA (contour), and the difference from (a) EN4.2.2, and (b) ARMOR3D L4 reference dataset. Annual-mean T300 (contour), and the difference from the Roemmich-Gilson Argo climatology (shading) during 1994–2018 in (c) ORAS5, (d) ECCO v4r4, (e) ECDA v2, (f) GECCO2, (g) GODAS, and (h) SODA v3.4.2.
Extended Data Fig. 8 Correlation skill of SST and T300 anomalies in various oceanic reanalysis products.
Temporal anomaly correlation coefficients of the monthly SST anomalies between the ERSST V5 and (a) ORAS5, (b) COBE, (c) ECCO v4r4, (d) ECDA v2, (e) GECCO2, (f) OSTIA, (g) GODAS, (h) MERRA2, or (i) SODA v3.4.2 during 1981–2020. Temporal anomaly correlation coefficients of the monthly T300 anomalies between the EN4.2.2 and (j) ORAS5, (k) ECCO v4r4, (l) ECDA v2, (m) GECCO2, (n) GODAS, or (o) SODA v3.4.2 during 1981–2020.
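For readers unfamiliar with the metric, a temporal anomaly correlation coefficient at each grid point can be sketched as follows. The monthly-climatology definition of the anomalies, the requirement that the series starts in January and spans whole years, and the function names are assumptions for illustration; the paper's exact preprocessing may differ.

```python
import numpy as np

def monthly_anomalies(field):
    """Remove the mean seasonal cycle from a (time, lat, lon) monthly series.

    Assumes the series starts in January and covers whole years.
    """
    n_years = field.shape[0] // 12
    by_month = field.reshape(n_years, 12, *field.shape[1:])
    climatology = by_month.mean(axis=0, keepdims=True)
    return (by_month - climatology).reshape(field.shape)

def temporal_acc(product, reference):
    """Pearson correlation along time between two anomaly series, per grid point."""
    a = monthly_anomalies(product)
    b = monthly_anomalies(reference)
    # Anomalies have zero time mean by construction, so no extra centring is needed.
    numerator = (a * b).sum(axis=0)
    denominator = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return numerator / denominator

# Hypothetical example: two years of monthly data on a 2 x 3 grid.
rng = np.random.default_rng(0)
reference_sst = rng.normal(size=(24, 2, 3))
scaled_product = 2.0 * reference_sst + 5.0
acc_map = temporal_acc(scaled_product, reference_sst)
```

Because a constant offset and a positive scaling do not change the correlation, the synthetic product above correlates perfectly with the reference at every grid point.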
Extended Data Fig. 9 Equatorial subsurface temperature anomalies during El Niño peak season.
Equatorial temperature anomalies from the surface to 300 m in January 1998 in (a) EN4.2.2, (b) DeepDA, (c) ORAS5, (d) ECCO v4r4, (e) ECDA v2, (f) GECCO2, (g) GODAS, and (h) SODA v3.4.2.
Source data
Source Data Fig. 1
r.m.s.e. of the temperature for the testing period in the OSSEs.
Source Data Fig. 2
Taylor diagram of the global-mean monthly OHC500.
Source Data Fig. 3
Taylor diagram of the seasonal climatology difference T300.
Source Data Extended Data Fig. 3
r.m.s.e. between the analysis fields and the true states in the OSSEs.
Source Data Extended Data Fig. 5
Vertical profiles of monthly averaged temperature r.m.s.e.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ham, YG., Joo, YS., Kim, JH. et al. Partial-convolution-implemented generative adversarial network for global oceanic data assimilation. Nat Mach Intell 6, 834–843 (2024). https://doi.org/10.1038/s42256-024-00867-x
DOI: https://doi.org/10.1038/s42256-024-00867-x