Detecting marine heatwaves below the sea surface globally using dynamics-guided statistical learning

Zhang, Xiang; Li, Furong; Jing, Zhao; Zhang, Bohai; Ma, Xiaohui; Du, Tianshi

doi:10.1038/s43247-024-01769-x

Article
Open access
Published: 23 October 2024

Communications Earth & Environment volume 5, Article number: 616 (2024)

Subjects

Physical oceanography

Abstract

Extreme warm water events, known as marine heatwaves, cause a variety of adverse impacts on the marine ecosystem. They are occurring more and more frequently across the global ocean. Yet monitoring marine heatwaves below the sea surface is still challenging due to the sparsity of in situ temperature observations. Here, we propose a statistical learning method guided by ocean dynamics and optimal prediction theory to detect subsurface marine heatwaves based on the observable sea surface temperature and sea surface height. This dynamics-guided statistical learning method shows good skill in detecting subsurface marine heatwaves in the oceanic epipelagic zone over many parts of the global ocean. It outperforms both the classical ordinary least squares regression and popular deep learning methods that do not effectively exploit ocean dynamics, and its outperformance has a clear dynamical interpretation. Our study provides a useful statistical learning method for near real-time monitoring of subsurface marine heatwaves at a global scale and highlights the importance of exploiting ocean dynamics for enhancing the efficiency and interpretability of statistical learning.

Introduction

Marine heatwaves (MHWs), defined as discrete, prolonged, extreme warm water events^1,2, exert catastrophic impacts on marine ecosystems, adversely affecting the fishing industry and the economy^3,4,5,6,7. The intensity, frequency and duration of MHWs have increased significantly over most parts of the global ocean during the past several decades in response to global warming^8,9. These trends are likely to persist or even become more evident in the future^9,10,11. Due to the rising threat of MHWs, there are growing efforts dedicated to describing their features^12,13,14, understanding their dynamical drivers^15,16,17, monitoring their occurrence¹⁸, and evaluating their predictability and forecasting them^19,20.

Current knowledge of MHWs globally is primarily confined to the sea surface, relying on up-to-date, satellite-measured sea surface temperature (SST)^2,8,15. However, there is extensive evidence of MHWs below the sea surface (hereinafter referred to as subsurface MHWs to distinguish them from surface MHWs)^{21,22,23,24,25,26,27,28,29}. For instance, MHWs at 150 m have been detected in the Northeast Pacific based on a combination of Argo floats and shipboard measurements, and are thought to impact fish such as salmon²¹. Long-term mooring observations in the coastal waters off southeastern Australia²² reveal the subsurface intensification of MHWs that are driven by local downwelling-favorable winds and damage benthic species such as coral, kelp and seagrass. Based on the OOI Pioneer Array and fishing fleets, a subsurface MHW was detected in early 2017 on the Northeast U.S. continental shelf, significantly affecting the ecosystem and the commercial fishing industry^23,24. Recently, a study²⁵ based on the Tropical Atmosphere Ocean/Triangle Trans-Ocean Buoy Network buoys in the West Pacific reported subsurface MHWs in the thermocline, caused by downwelling, that are much stronger than the surface MHWs. Subsurface MHWs more intense than surface MHWs have also been discovered along the continental shelves of North America²⁶. These subsurface MHWs are likely to threaten the marine ecosystem no less severely than the surface MHWs, since a massive number of organisms live in the oceanic epipelagic zone extending down to 200 m³⁰.

To facilitate timely management for alleviating the marine ecosystem stress and associated socioeconomic ramifications, monitoring the occurrence of subsurface MHWs becomes imperative. However, current capacity for direct detection of subsurface MHWs is strongly hindered by the sparse in situ temperature observations below the sea surface². Satellite remote sensing provides high spatio-temporal resolution, extensive coverage, and nearly real-time ocean observations at the sea surface. Here we switch the lens and attempt to indirectly detect the subsurface MHWs by retrieving the subsurface temperature from the surface observations.

The relationship between the subsurface temperature and surface variables has recently been estimated using dynamical methods^31,32,33,34 and statistical learning methods^35,36,37,38. The dynamical methods are primarily based on the quasi-geostrophic (QG) dynamics³⁹ to retrieve the subsurface temperature anomaly ${T}^{{\prime} }$ (relative to a climatological mean seasonal cycle) using the observable variables at the sea surface, i.e., SST anomaly (SSTA) and sea surface height anomaly (SSHA). However, some dynamical assumptions have to be made as the potential vorticity in the ocean interior is not observable. These assumptions, essentially empirical, differ substantially among different studies^31,32,33,34, which makes the dynamical methods applicable only in specific regions under certain conditions.

Statistical learning, also known as machine learning, is free from dynamical assumptions and instead employs a data-driven way to predict ${T}^{{\prime} }$ from SSTA and SSHA. A variety of statistical learning methods with different levels of complexity^35,36,37,38 have been applied, ranging from the classical ordinary least squares (OLS) regression to the popular deep neural networks. However, statistical learning methods suffer from the bias-variance trade-off⁴⁰. On the one hand, an overly simple model such as the OLS regression may not accurately capture the complicated relationship between ${T}^{{\prime} }$ and its predictors (i.e., SSTA and SSHA), leading to large estimation bias. On the other hand, highly flexible models such as the deep neural networks are subject to large estimation variance and thus may overfit, in addition to lacking interpretability. Constructing an appropriate statistical model that provides accurate prediction by exploiting dynamical information is thus crucial for the detection of subsurface MHWs.

In this study, we first demonstrate a geographically and temporally varying linear relationship between ${T}^{{\prime} }$ and its predictors (i.e., SSTA and SSHA) by combining the ocean dynamics and optimal prediction theory in statistics. We then propose a statistical model that captures this relationship to retrieve ${T}^{{\prime} }$ from SSTA and SSHA, and finally detect subsurface MHWs based on the retrieved ${T}^{{\prime} }$. Application of our dynamics-guided statistical model to three ocean reanalysis datasets (see “Ocean reanalysis datasets” in Methods) confirms that it consistently outperforms both the OLS regression and deep neural networks in detecting the subsurface MHWs. We remark that our model, trained on one ocean reanalysis dataset, performs well when tested against another, despite the different numerical model configurations and data assimilation methods among the ocean reanalysis datasets. This suggests that the relationships between ${T}^{{\prime} }$ and its predictors in the different ocean reanalysis datasets are generally consistent with each other and should capture the relationship in reality. It in turn supports the view that our model, learned from the ocean reanalysis datasets, can also be used to detect the subsurface MHWs in the observations, helping in the near real-time monitoring of subsurface MHWs globally.

Results

Guidance of ocean dynamics on statistical learning

A statistical model consists of three components, i.e., the response, the predictors, and the relationship between the response and predictors⁴⁰. Here we treat ${T}^{{\prime} }$ at some depth below the sea surface as the response. Once ${T}^{{\prime} }$ is obtained, whether an MHW occurs or not can be determined straightforwardly based on some prescribed threshold for ${T}^{{\prime} }$ (See “Definition of MHWs” in Methods). The predictors are chosen as SSTA and SSHA. Such choices are not only due to the availability of these variables in the observations but also their tight relationship with ${T}^{{\prime} }$. In particular, we will show that there is geographically and temporally varying linear dependence of ${T}^{{\prime} }$ on SSTA and SSHA according to the ocean dynamics. The mathematical derivations of the relationships are detailed in the Methods (See “Guidance of ocean dynamics and optimal prediction theory in statistics” in Methods).

Variations of satellite-measured SSTA and SSHA are dominated by oceanic mesoscale eddies^41,42. The behavior of these eddies is largely governed by the QG dynamics, which can be further divided into the surface and interior QG dynamics. We assume that the density anomaly is dominated by the temperature anomaly, which holds over most parts of the global ocean (Supplementary Fig. 1). On the one hand, the surface QG dynamics suggests a linear relationship between ${T}^{{\prime} }$ and SSTA. On the other hand, the interior QG dynamics combined with the optimal prediction theory in statistics suggests linear dependence of ${T}^{{\prime} }$ on SSHA. However, the dependences of ${T}^{{\prime} }$ on SSTA and SSHA are constant in neither space nor time but are affected by the Coriolis frequency and background stratification (See “Guidance of ocean dynamics and optimal prediction theory in statistics” in Methods). In particular, the strong seasonal variation of background stratification in the upper ocean should cause evident seasonality in the relationship between ${T}^{{\prime} }$ and SSTA according to the surface QG dynamics. This dynamical information guides us to propose a geographically and seasonally varying coefficient (GSVC) linear regression model to capture such relationships (See “GSVC model” in Methods). In the GSVC model, the regression coefficients of ${T}^{{\prime} }$ onto SSTA and SSHA can vary with location and calendar day. The regression coefficient at each location and calendar day is estimated by borrowing samples from its neighboring locations and calendar days.

It should be noted that we do not consider variations in the relationship between ${T}^{{\prime} }$ and its predictors at interannual and longer time scales. The main reason is that the variation of upper-ocean background stratification is dominated by its seasonal cycle⁴³. By neglecting the variations of regression coefficients at interannual and longer time scales, the GSVC model treats samples on the same calendar day in different years as replicates and is thus capable of reducing estimation variance at the expense of a minor increase in the estimation bias. Additionally, by assuming the regression coefficient to be the same every year, we can train the GSVC model using the historical data (e.g., the ocean reanalysis datasets) and apply it to the near real-time monitoring of subsurface MHWs.
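The idea of borrowing samples from neighboring calendar days can be illustrated with a minimal sketch at a single location. This is not the GSVC estimator itself, which is specified in the Methods; the Gaussian kernel, the `bandwidth` parameter and the function name `seasonally_varying_coefs` are assumptions introduced here purely for illustration, and the borrowing across neighboring locations is omitted.

```python
import numpy as np

def seasonally_varying_coefs(ssta, ssha, t_sub, doy, target_day, bandwidth=15.0):
    """Kernel-weighted least squares of t_sub on (ssta, ssha) centred on target_day.

    All inputs are 1-D arrays of anomalies at one location; doy holds the
    calendar day (1..365) of each sample. Samples from all years are pooled.
    """
    doy = np.asarray(doy, dtype=float)
    # Circular distance in days, so late December neighbours early January.
    dist = np.abs(doy - target_day)
    dist = np.minimum(dist, 365.0 - dist)
    # Hypothetical Gaussian kernel: nearby calendar days get larger weights.
    w = np.exp(-0.5 * (dist / bandwidth) ** 2)
    # Predictors are anomalies, so no intercept is included in this sketch.
    X = np.column_stack([ssta, ssha])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(sw * X, sw[:, 0] * np.asarray(t_sub), rcond=None)
    return coef  # [coefficient on SSTA, coefficient on SSHA]
```

In this sketch, letting `bandwidth` grow very large makes all weights approach one, so the estimate tends to a single temporally constant fit at that location.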

Performance of the GSVC model in detecting subsurface MHWs

Ideally, the GSVC model should be trained and tested based on observational data. However, the sparsity of the observed ${T}^{{\prime} }$ prohibits such analyses. Instead, we use three different ocean reanalysis datasets as surrogates to evaluate the performance of the GSVC model (See “Ocean reanalysis datasets” in Methods). In this case, there are concerns about the extent to which the ocean reanalysis datasets faithfully capture the relationship between ${T}^{{\prime} }$ and its predictors in reality. Such uncertainty can be inferred from the inter-dataset differences of relationships among the three ocean reanalysis datasets, based on the premise that the differences of relationships between the ocean reanalysis datasets and observations should be comparable in magnitude to those among the different ocean reanalysis datasets. For this reason, we train the GSVC model on one ocean reanalysis dataset and test it against another. Accordingly, the resulting model test errors contain three components, related to the variation of ${T}^{{\prime} }$ independent of the predictors, the inaccuracy of the estimated relationship between ${T}^{{\prime} }$ and its predictors in some ocean reanalysis datasets, and the differences of relationships among the different reanalysis datasets.

To quantify the performance of the GSVC model in detecting the subsurface MHWs based on SSHA and SSTA, we use a scalar metric, the Matthews correlation coefficient (MCC)⁴⁴, ranging from −1 to 1. The MCC has been demonstrated to be a reliable measure of quality of binary classification (MHW or not in our case) especially for imbalanced datasets⁴⁵. It produces a high score only if the classifier leads to high values for all the four basic rates of the confusion matrix, i.e., the sensitivity, specificity, precision, and negative predictive value. (See “Computation of the Matthews correlation coefficient” in Methods).
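As a minimal sketch of how such a score can be computed from daily MHW flags, the standard MCC formula in terms of confusion-matrix counts is shown below. The function name and the convention of returning 0 when the denominator vanishes are assumptions made here; the exact computation used in the study is described in the Methods.

```python
import math

def matthews_corrcoef(truth, pred):
    """MCC for two equal-length sequences of booleans (True = MHW day)."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)          # hits
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)  # correct rejections
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)      # false alarms
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)      # misses
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: score 0 when a row or column of the confusion matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

A perfect detector scores 1, a detector that always gets the label wrong scores −1, and a detector that never predicts an MHW scores 0 under the convention assumed above.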

We first examine the performance of the GSVC model in MHW detection at different depths. Figure 1 displays the distribution of the test MCC for MHW detection at 25 m, 50 m, 100 m and 200 m averaged over the 6 combinations of training and test datasets. At 25 m, the GSVC model has MCC values close to 1 over most parts of the global ocean. This is expected from a dynamical point of view, as ${T}^{{\prime} }$ at this shallow depth should vary coherently with SSTA via the surface QG dynamics^41,46 and/or mixed-layer dynamics⁴³. Indeed, using SSTA as a single predictor leads to almost identical MCC values at 25 m (Fig. 2a). As the depth increases, the values of MCC decrease but there is still a considerable fraction of the ocean showing high positive values of MCC. The fraction of ocean area with MCC > 0.3 is 94.5% at 50 m, 62.9% at 100 m and 40.2% at 200 m, suggesting that the GSVC model has skill in detecting subsurface MHWs across the oceanic epipelagic zone. The good performance of the GSVC model is also indicated by the high hit rate and low false alarm rate over the global ocean (Supplementary Figs. 2 and 3).

**Fig. 1: Performance of the GSVC model in detecting the subsurface MHWs globally.**

**Fig. 2: Importance of SSTA and SSHA for subsurface MHW detection at different depths.**

Furthermore, the relative importance of the two predictors, SSHA and SSTA, is investigated. The role of SSHA in predicting ${T}^{{\prime} }$ becomes more important as the depth increases, whereas the role of SSTA fades with depth (Fig. 2). The MCC value derived from using SSHA as a single predictor approaches that from using both SSHA and SSTA as the depth increases to 200 m (Fig. 2d). In contrast, using SSTA as a single predictor leads to severe model degradation at 200 m (Fig. 2d). Such patterns have a clear physical interpretation. On the one hand, once the depth goes below the surface mixed layer, the strong background stratification in the pycnocline causes rapid decay of the surface QG modes with increasing depth, so that SSTA has a weaker imprint on ${T}^{{\prime} }$. On the other hand, the variation of ${T}^{{\prime} }$ generated by interior QG modes becomes dominant in the pycnocline³². Such variation is associated with the variation of SSHA through the hydrostatic balance.

Finally, it should be noted that the skill of the GSVC model for subsurface MHW detection varies geographically, especially at greater depths, as evidenced by the geographical heterogeneity of the MCC (Fig. 1). At 100 m and 200 m, the low detection skill is primarily located in the eastern Atlantic and eastern Pacific, suggesting that SSHA is an insufficient predictor for ${T}^{{\prime} }$ in these regions. This might be partially due to the relatively stronger effects of subsurface salinity anomaly on density anomaly there than elsewhere, which makes the density anomaly less dominated by ${T}^{{\prime} }$ (Supplementary Fig. 1) and degrades the correlation between ${T}^{{\prime} }$ and SSHA.

Advantage of exploiting ocean dynamics in statistical learning

The proposal of the GSVC model is guided by ocean dynamics. To examine how such guidance benefits the statistical learning outcomes, we compare the performance of the GSVC model for subsurface MHW detection with two other models that do not effectively exploit ocean dynamics. The first is the OLS regression model, one of the simplest but most widely used statistical learning methods. Here the OLS model is trained at each grid point with samples at different times as replicates. This leads to a geographically varying but temporally constant relationship between ${T}^{{\prime} }$ and its predictors. The OLS model is actually a reduced case of the GSVC model (See “GSVC model” in Methods). The second is the convolutional neural network (CNN)⁴⁷, one of the popular deep learning methods (See “Convolutional neural network” in Methods). It is a highly flexible model that can capture complicated nonlinear relationships between response and predictors. In this subsection, our principal interest is to compare the efficiency of the different statistical models in estimating the relationship between ${T}^{{\prime} }$ and its predictors. Therefore, rather than training and testing the models on different reanalysis datasets, we divide each reanalysis dataset into training and test sets. This excludes influences of the discrepancy of relationships among the different reanalysis datasets. Nevertheless, we find that whether or not such influences are excluded does not change the performance ranking of the statistical models (Fig. 3 and Supplementary Fig. 4).

**Fig. 3: Comparison of performance of the different statistical learning methods in detecting the subsurface MHWs globally.**

The geographical and vertical distributions of MCC values for the OLS and CNN models are qualitatively consistent with those for the GSVC model (Supplementary Figs. 5–7). In particular, all the models have some skill in detecting MHWs except in the eastern Atlantic and eastern Pacific at 100 and 200 m. This lends further support to the view that the poor skill in these regions mainly results from limitations of the predictors rather than the relationships adopted by the statistical models. Despite the similar geographical and vertical distributions of MCC values among the models, the globally averaged MCC values of the GSVC and OLS models are significantly larger than that of the CNN model at all the depths (Fig. 3). Given the simplicity of the OLS model, its outperformance over the CNN model is remarkable but understandable. The QG dynamics in combination with the optimal prediction theory suggests a linear relationship between ${T}^{{\prime} }$ and its predictors. It is natural that a linear model should be more efficient than a nonlinear model like the CNN model. Given that the advantage of the CNN model lies in its capability for capturing nonlinear relationships, we also use the CNN model to detect subsurface MHWs directly by using a binary variable (MHW or not) as its response (note that the dependence of MHWs on ${T}^{{\prime} }$ is nonlinear). However, this model does not outperform the GSVC and OLS models either (Fig. 3).

As for the comparison between the GSVC and OLS models, they have similar performance at 200 m, whereas the GSVC model outperforms the OLS model at 25 m, 50 m and 100 m. This depth range generally coincides with the seasonal pycnocline where the seasonal variation of background stratification is most prominent⁴³. A strong seasonal cycle in the relationship between ${T}^{{\prime} }$ and SSTA is expected there according to the surface QG dynamics and mixed-layer dynamics. Specifically, SSTA should play a more important role in determining ${T}^{{\prime} }$ in winter, when these depths are generally within the deep mixed layer, than in summer, when the seasonal pycnocline is developed. Such a pattern is captured by the GSVC model (Fig. 4) but not by the OLS model, leading to higher MCC values for the GSVC model than the OLS model.

**Fig. 4: Seasonal cycle of relationship between subsurface temperature anomaly (${T}^{{\prime} }$) and sea surface temperature anomaly (SSTA) at 50 m.**

Discussion

With the increase of computational resources, there is a tendency to use more and more complicated statistical learning methods in oceanic studies and other scientific fields. However, more complicated methods, albeit more flexible, do not necessarily outperform the simpler ones due to the bias-variance trade-off⁴⁰. In particular, complicated methods require large volumes of training data, which are usually not available in oceanic studies. Developing methods with the optimal degree of complexity is thus crucial to enhance the efficiency of statistical learning. The optimal method depends on the true relationship between the response and predictors and is thus problem-specific.

This study proposes a dynamics-guided statistical learning method for detecting subsurface MHWs based on SSTA and SSHA and highlights how the usage of dynamical information can enhance the efficiency and interpretability of statistical learning. Combination of the ocean dynamics and optimal prediction theory in statistics demonstrates a geographically and seasonally varying linear relationship between ${T}^{{\prime} }$ and its predictors (SSTA and SSHA). This insight guides us to propose the GSVC model to capture such a relationship. Benefiting from exploiting the dynamical information, the GSVC model has good skill in detecting MHWs in the oceanic epipelagic zone over many parts of the global ocean and outperforms both the classical OLS model and the popular CNN model. More importantly, the superiority of the GSVC model has a clear dynamical interpretation. On the one hand, the GSVC model outperforms the OLS model because the OLS model is too simple to capture the geographically and seasonally varying relationship between ${T}^{{\prime} }$ and its predictors (SSTA and SSHA). On the other hand, it is superior to the CNN model as the latter is overly complicated. In particular, the CNN model has a highly complicated architecture designed to handle nonlinear relationships, whereas the ocean dynamics suggests a linear relationship.

Despite its superiority, the GSVC model has its own limitations. The GSVC model is developed based on the QG dynamics. The QG dynamics provides an appropriate dynamical framework for ocean mesoscale eddies^41,42 but not so for submesoscale⁴⁸ and coastal processes⁴⁹. In regions where these processes dominate the variations of ${T}^{{\prime} }$, the GSVC model may not be effective in detecting subsurface MHWs. In particular, we caution the readers that the high MCC in the coastal regions shown in Fig. 1 does not necessarily indicate good performance of the GSVC model for detecting coastal subsurface MHWs in reality. Instead, it is likely an artifact of the ocean reanalysis datasets, as the coastal processes are poorly represented in these datasets due to their insufficient resolution.

Even in regions where the QG dynamics governs the variations of ${T}^{{\prime} }$, there is still considerable room to improve the capability of statistical learning for detecting subsurface MHWs, especially in the regions where SSTA and SSHA are poor predictors of ${T}^{{\prime} }$. The improvement requires utilizing in situ subsurface measurements such as those from Argo floats^50,51. These measurements can be merged with the predictions from SSTA and SSHA using spatio-temporal statistical learning methods such as the kriging methods⁵². More subsurface in situ observations and advanced statistical learning methods exploiting dynamical information are crucial for enhancing the capacity of subsurface MHW monitoring and alleviating the ecosystem stress and socioeconomic ramifications that subsurface MHWs induce.

Methods

Ocean reanalysis datasets

In this study, we use three ocean reanalysis datasets to identify subsurface MHWs. The first is the Estimating the Circulation and Climate of the Ocean, phase II (ECCO2) project⁵³, version “cube 92.” It is obtained using a Green’s function approach and provides a three-day mean estimate of the physical ocean state on 0.25° × 0.25° regular grids from 1993 to 2020. The second is the Global Ocean Physics Reanalysis (GLORYS)⁵⁴ product delivered by the Copernicus Marine Environment Monitoring Service. It is assimilated by means of a reduced-order Kalman filter and corrected based on a three-dimensional variational (3D-VAR) scheme, providing a daily mean 1/12° global ocean reanalysis from 1993 to 2020. The third is the Hybrid Coordinate Ocean Model (HYCOM)⁵⁵ product covering the period from 1993 to 2012, which uses the Navy Coupled Ocean Data Assimilation (NCODA) system for 3D-VAR data assimilation. It has a 3-h temporal resolution and a 0.08° horizontal resolution.

We bin-average the HYCOM data within each day and interpolate the ECCO2 data onto a daily basis. The spatial grids of the three reanalysis datasets are unified by interpolating each dataset onto regular grids with a horizontal resolution of 1° × 1° and vertical levels of 5 m, 25 m, 50 m, 100 m, and 200 m. The SST is approximated as the temperature at 5 m. The anomalies of SST, sea surface height and subsurface temperature $T$ (SSTA, SSHA and ${T}^{{\prime} }$) are computed by subtracting their respective climatological mean seasonal cycles over the entire period (1993–2020 for ECCO2 and GLORYS but 1993–2012 for HYCOM).
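The anomaly computation can be sketched for a single daily time series as follows. This is a minimal illustration, assuming the climatological seasonal cycle is the plain multi-year mean for each calendar day with no smoothing; the function name `daily_anomaly` and that choice of climatology are assumptions made here rather than details given above.

```python
import numpy as np

def daily_anomaly(values, day_of_year):
    """Subtract the multi-year mean of each calendar day from a daily series."""
    values = np.asarray(values, dtype=float)
    day_of_year = np.asarray(day_of_year)
    anomaly = np.empty_like(values)
    for d in np.unique(day_of_year):
        mask = day_of_year == d
        # Climatology for calendar day d: mean over all years in the record.
        anomaly[mask] = values[mask] - values[mask].mean()
    return anomaly
```

Applied to the 5 m temperature, sea surface height and subsurface temperature series at a grid point, this would yield series playing the roles of SSTA, SSHA and ${T}^{{\prime} }$, respectively.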

Definition of MHWs

MHWs are defined, following Hobday et al.¹, as an abnormal warming event with $T$ consecutively exceeding its seasonally varying 90th percentile $\theta$ for at least 5 days. The baseline period for the computation of $\theta$ is the entire period of the ocean reanalysis datasets (1993–2020 for ECCO2 and GLORYS but 1993–2012 for HYCOM). Note that $T={T}^{{\prime} }+{T}_{c}$ and $\theta =\theta ^{\prime} +{T}_{c}$ where ${T}_{c}$ is the climatological mean seasonal cycle of $T$ and $\theta ^{\prime}$ is the seasonally varying 90th percentile of ${T}^{{\prime} }$. As MHWs are defined based on $T-\theta$ (Hobday et al.¹), ${T}_{c}$ is irrelevant to MHWs and we can detect subsurface MHWs based on ${T}^{{\prime} }$ alone⁵⁶. The value of $\theta ^{{\prime} }$ is always computed from the “true” ${T}^{{\prime} }$ in the reanalysis dataset, no matter whether MHWs are identified based on the “true” ${T}^{{\prime} }$ or predicted ${T}^{{\prime} }$ from SSTA and SSHA.
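The duration criterion can be sketched as follows for one daily series of ${T}^{{\prime} }$ and its threshold $\theta ^{\prime}$. This is a minimal illustration of the "at least 5 consecutive days above the threshold" rule stated above; the function name is hypothetical, and refinements in the full Hobday et al. definition, such as merging events separated by short gaps, are deliberately not included.

```python
import numpy as np

def detect_mhw_days(t_anom, theta_anom, min_duration=5):
    """Flag days that belong to runs of >= min_duration consecutive exceedances."""
    exceed = np.asarray(t_anom) > np.asarray(theta_anom)
    flags = np.zeros(exceed.shape, dtype=bool)
    run_start = None
    # A trailing False acts as a sentinel that closes a run ending on the last day.
    for i, e in enumerate(np.append(exceed, False)):
        if e and run_start is None:
            run_start = i                      # a new run of exceedances begins
        elif not e and run_start is not None:
            if i - run_start >= min_duration:  # keep only sufficiently long runs
                flags[run_start:i] = True
            run_start = None
    return flags
```

The same function can be applied to the "true" and the predicted ${T}^{{\prime} }$ with a common $\theta ^{\prime}$, giving the two binary series that a detection score compares.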

Guidance of ocean dynamics and optimal prediction theory in statistics

In this subsection, we combine the ocean dynamics and optimal prediction theory to demonstrate a geographically and seasonally varying linear relationship between ${T}^{{\prime} }$ and its predictors (i.e., SSTA and SSHA). Under the QG approximation, the geostrophic streamfunction anomaly ($\psi ^{{\prime} }$) can be derived by inverting the potential vorticity (PV) anomaly under the surface and bottom boundary conditions³⁹

$$\left(\frac{{\partial }^{2}}{\partial {x}^{2}}+\frac{{\partial }^{2}}{\partial {y}^{2}}+\frac{\partial }{\partial z}\frac{{f}^{2}}{{N}^{2}}\frac{\partial }{\partial z}\right)\psi ^{\prime} =q^{{\prime} } ,-H \, < \,z \, < \,0,$$

(1a)

$$f\frac{\partial \psi ^{{\prime} } }{\partial z}={b}_{0}^{{\prime} } (x,y,t),z=0,$$

(1b)

$$f\frac{\partial \psi ^{{\prime} } }{\partial z}=0,z=-H,$$

(1c)

where $f$ is the Coriolis frequency, $N$ is the background buoyancy frequency, $q^{{\prime} }$ is the QG PV anomaly, ${b}^{{\prime} }=f\frac{\partial \psi ^{{\prime} } }{\partial z}=-(\frac{g}{{\rho }_{r}})\rho ^{{\prime} }$ is the buoyancy anomaly with $g$ the gravitational acceleration, $\rho$ the density and ${\rho }_{r}$ the reference density value, and ${b}_{0}^{{\prime} }$ is the value of ${b}^{{\prime} }$ at the sea surface. In the following analysis, we assume that ${b}^{{\prime} }$ is dominated by the temperature anomaly ${T}^{{\prime} }$ so that ${b}^{{\prime} }\approx g\alpha T^{{\prime} }$ with $\alpha$ the thermal expansion coefficient, which holds over most parts of the global ocean (Supplementary Fig. 1).

Eq. (1) is linear with respect to $\psi ^{{\prime} }$, allowing $\psi ^{{\prime} }$ to be decomposed into a homogeneous solution known as the surface QG (SQG) solution ${\psi }_{s}^{{\prime} }$ and a particular solution as the interior QG (IQG) solution ${\psi }_{i}^{{\prime} }$^41,42. Accordingly, $b^{{\prime} }$ can be decomposed into ${b}^{{\prime} }={b}_{s}^{{\prime} } +{b}_{i}^{{\prime} }$ with ${b}_{s}^{{\prime} } =f\partial {\psi }_{s}^{{\prime} } /\partial z$ and ${b}_{i}^{{\prime} } =f\partial {\psi }_{i}^{{\prime} } /\partial z$ representing the buoyancy anomalies of SQG and IQG solutions, respectively. We next demonstrate ${b}_{s}^{{\prime} }$ depends linearly on SSTA and ${b}_{i}^{{\prime} }$ depends linearly on SSTA and SSHA.

The SQG solution is defined as

$$\left(\frac{{\partial }^{2}}{\partial {x}^{2}}+\frac{{\partial }^{2}}{\partial {y}^{2}}+\frac{\partial }{\partial z}\frac{{f}^{2}}{{N}^{2}}\frac{\partial }{\partial z}\right){\psi }_{s}^{{\prime} } =0,-H \, < \,z \, < \, 0,$$

(2a)

$$f\frac{\partial {\psi }_{s}^{{\prime} } }{\partial z}={b}_{0}^{{\prime} } (x,y,t),z=0,$$

(2b)

$$f\frac{\partial {\psi }_{s}^{{\prime} } }{\partial z}=0,z=-H.$$

(2c)

As evidenced from Eq. (2), ${\psi }_{s}^{{\prime} }$ depends linearly on ${b}_{0}^{{\prime} }$ (SSTA) and so does ${b}_{s}^{{\prime} }$. Moreover, it can be demonstrated that ${b}_{s}^{{\prime} }$, equal to ${b}_{0}^{{\prime} }$ at the sea surface, decays as the depth increases with decaying rate proportional to $N/f$⁴¹. These thus suggest a geographically and temporally varying linear relationship between ${b}_{s}^{{\prime} }$ and ${b}_{0}^{{\prime} }$ (SSTA).
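As an illustration of this decay, consider an idealized case that is assumed here only for tractability: a depth-independent $N$ and a single horizontal Fourier component with total wavenumber $k$, with amplitudes denoted by hats and $\mu =Nk/f$ introduced as shorthand. Eq. (2a) then reduces to ${\partial }^{2}{\hat{\psi }}_{s}^{{\prime} }/\partial {z}^{2}={\mu }^{2}{\hat{\psi }}_{s}^{{\prime} }$, and imposing Eqs. (2b) and (2c) gives

$${\hat{\psi }}_{s}^{{\prime} }(z)=\frac{{\hat{b}}_{0}^{{\prime} }}{f\mu }\frac{\cosh \left[\mu (z+H)\right]}{\sinh (\mu H)},\qquad {\hat{b}}_{s}^{{\prime} }(z)=f\frac{\partial {\hat{\psi }}_{s}^{{\prime} }}{\partial z}={\hat{b}}_{0}^{{\prime} }\frac{\sinh \left[\mu (z+H)\right]}{\sinh (\mu H)}.$$

For $\mu H\gg 1$ and away from the sea floor, this is approximately ${\hat{b}}_{0}^{{\prime} }{e}^{\mu z}$, so in this idealized case the surface signal decays downward at a rate $\mu =Nk/f$: the coefficient linking ${b}_{s}^{{\prime} }$ to ${b}_{0}^{{\prime} }$ at a given depth is smaller where or when the stratification is stronger, and it varies with latitude through $f$.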

The IQG solution is defined as:

$$\left(\frac{{\partial }^{2}}{\partial {x}^{2}}+\frac{{\partial }^{2}}{\partial {y}^{2}}+\frac{\partial }{\partial z}\frac{{f}^{2}}{{N}^{2}}\frac{\partial }{\partial z}\right){\psi }_{i}^{{\prime} } =q^{{\prime} } ,-H \, < \,z \, < \,0,$$

(3a)

$$f\frac{\partial {\psi }_{i}^{{\prime} } }{\partial z}=0,z=0,$$

(3b)

$$f\frac{\partial {\psi }_{i}^{{\prime} } }{\partial z}=0,z=-H.$$

(3c)

As $q^{{\prime} }$ is unknown in reality, ${\psi }_{i}^{{\prime} }$ and hence ${b}_{i}^{{\prime} }$ cannot be derived deterministically. However, they can be predicted based on SSHA and SSTA if we switch our perspective from dynamical to statistical, as demonstrated below.

For now, we consider ${\psi }_{i}^{{\prime} }$ at some geographical location and time $\left(x,y,t\right)$ as a vertical random process. In order to predict ${\psi }_{i}^{{\prime} }$, we decompose it as a linear combination of countable basis functions, that is,

$${\psi }_{i}^{{\prime} }={\sum }_{n=0}^{\infty }{\theta }_{n}^{{\prime} }\left(x,y,t\right){F}_{n}\left(x,y,z,t\right),$$

(4)

where ${F}_{n}$ are specified basis functions in the vertical and the coefficients ${\theta }_{n}^{{\prime} }$ are random effects. Essentially, Eq. (4) is a statistical model for the vertical variability of ${\psi }_{i}^{{\prime} }$.

There are many ways to specify the basis functions, but it is natural and useful to choose basis functions with clear physical meanings⁵⁷. For this reason, the basis functions are chosen as the vertical modes in ocean dynamics³⁹, i.e., the eigenfunctions of the following Sturm-Liouville problem

$$\left(\frac{\partial }{\partial z}\frac{{f}^{2}}{{N}^{2}}\frac{\partial }{\partial z}\right){F}_{n}+{\lambda }_{n}^{2}{F}_{n}=0,-H \, < \, z \, < \,0,$$

(5a)

$$\frac{\partial }{\partial z}{F}_{n}=0,z=0,$$

(5b)

$$\frac{\partial }{\partial z}{F}_{n}=0,z=-H,$$

(5c)

where ${F}_{n}$ is the eigenfunction (vertical mode) associated with the eigenvalue ${\lambda }_{n}$. The ${F}_{n}$ is a known function as both ${f}^{2}$ and ${N}^{2}$ are known in reality. In particular, ${F}_{0}\equiv 1$ with the eigenvalue ${\lambda }_{0}=0$ is the barotropic mode.

The advantage of using the vertical modes as the basis functions is that ${\psi }_{i}^{{\prime} }$ in reality can be approximated with high accuracy by the first several vertical modes⁵⁸
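As an illustration of how the vertical modes of Eq. (5) could be obtained numerically, the following is a minimal sketch, not the authors' code. It assumes a uniform vertical grid and a finite-volume discretization with zero flux at the surface and the bottom; the function name `vertical_modes` and its arguments are hypothetical.

```python
import numpy as np

def vertical_modes(n2, f, dz, n_modes=4):
    """Approximate the leading eigenpairs of Eq. (5) on a uniform grid.

    n2 : N^2 at the interior cell interfaces (length M-1), top to bottom
    f  : Coriolis frequency (1/s)
    dz : uniform layer thickness (m)
    Returns (lam, modes): lam[n] approximates lambda_n, modes[:, n] is F_n
    at the M cell centres.
    """
    a = f**2 / np.asarray(n2, dtype=float)   # f^2 / N^2 at the interfaces
    m = a.size + 1                           # number of cell centres
    # Symmetric tridiagonal matrix for -(d/dz)(a dF/dz).
    # The missing flux terms in the first and last rows impose dF/dz = 0.
    diag = np.zeros(m)
    diag[:-1] += a
    diag[1:] += a
    A = (np.diag(diag) - np.diag(a, 1) - np.diag(a, -1)) / dz**2
    evals, evecs = np.linalg.eigh(A)         # ascending eigenvalues = lambda^2
    lam = np.sqrt(np.clip(evals[:n_modes], 0.0, None))
    modes = evecs[:, :n_modes]
    # Normalise each mode by its value in the top cell, so that F_0 == 1.
    modes = modes / modes[0, :]
    return lam, modes
```

For constant $N$, Eq. (5) has the analytical solution ${F}_{n}=\cos (n\pi z/H)$ with ${\lambda }_{n}=n\pi f/(NH)$, which gives a convenient check on the discretization.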

$${\psi }_{i}^{{\prime} }\approx {\sum }_{n=0}^{N}{\theta }_{n}^{{\prime} }{\left(x,y,t\right)F}_{n}\left(x,y,z,t\right).$$

(6)

We next utilize the dynamical constraints to construct an optimal predictor of ${\psi }_{i}^{{\prime} }$. Assuming no motions at the sea floor yields the first constraint:

$${\sum }_{n=0}^{N}{\theta }_{n}^{{\prime}} {\left(x,y,t\right)F}_{n}(x,y,-H,t)=0,$$

or equivalently

$${\theta }_{0}^{{\prime} }\left(x,y,t\right)=-{\sum }_{n=1}^{N}{\theta }_{n}^{{\prime} }{\left(x,y,t\right)F}_{n}\left(x,y,-H,t\right).$$

(7)

Substituting Eq. (7) into Eq. (6) gives rise to

$${\psi }_{i}^{{\prime} }={\sum }_{n=1}^{N}{\theta }_{n}^{{\prime} }{\left(x,y,t\right)G}_{n}\left(x,y,z,t\right),$$

(8)

where ${G}_{n}{\left(x,y,z,t\right)=F}_{n}(x,y,z,t)-{F}_{n}(x,y,-H,t)$.

The other constraint is related to SSHA and SSTA due to the hydrostatic balance, i.e.,

$${\psi }_{i}^{{\prime} }\left(x,y,0,t\right)={\sum }_{n=1}^{N}{\theta }_{n}^{{\prime} }{\left(x,y,t\right)G}_{n}\left(x,y,0,t\right)=\frac{g}{f}{SSHA}-{\psi }_{s}^{{\prime} }\left(x,y,0,t\right),$$

(9)

where ${\psi }_{s}^{{\prime} }\left(x,y,0,t\right)$ is a linear function of SSTA according to the SQG solution.

Typically, ${\theta }_{n}^{{\prime} },n=1\ldots N$ can be uniquely determined based on Eq. (9) only for $N=1$³². In reality where $N \, > \,1$, ${\theta }_{n}^{{\prime} }$ is underdetermined and so is ${\psi }_{i}^{{\prime} }$. Nevertheless, from a statistical perspective, ${\psi }_{i}^{{\prime} }$ and further ${b}_{i}^{{\prime} }$ can be predicted based on the predictive distribution of ${\psi }_{i}^{{\prime} }$ given its physical constraint, i.e., Eq. (9).

We assume that ${\boldsymbol{\theta }}^{\prime }\equiv {({\theta }_{1}^{\prime },{\theta }_{2}^{\prime },\ldots ,{\theta }_{N}^{\prime })}^{T}$ is an N-dimensional Gaussian random vector with zero mean and covariance matrix ${\Sigma }_{N\times N}\equiv {\rm{Cov}}\left({\boldsymbol{\theta }}^{\prime }\right)$, denoted as ${\boldsymbol{\theta }}^{\prime } \sim {Gau}\left({\boldsymbol{0}},{\boldsymbol{\Sigma }}\right)$, where the superscript T denotes the transpose. Let $\pi ^{\prime } \equiv \frac{g}{f}{SSHA}-{\psi }_{s}^{\prime }(x,y,0,t)$, ${\bf{G}}\equiv {({G}_{1},{G}_{2},\ldots ,{G}_{N})}^{T}$ and ${{\bf{G}}}_{0}\equiv {\bf{G}}{|}_{z=0}$. Then we have ${\boldsymbol{\tau }}\equiv {({\psi }_{i}^{\prime },\pi ^{\prime })}^{T}={\left[{\bf{G}},{{\bf{G}}}_{0}\right]}^{T}{\boldsymbol{\theta }}^{\prime }$, a 2-dimensional Gaussian random vector with covariance matrix ${\Omega }_{2\times 2}\equiv {\rm{Cov}}\left({\boldsymbol{\tau }}\right)={\left[{\bf{G}},{{\bf{G}}}_{0}\right]}^{T}\Sigma \left[{\bf{G}},{{\bf{G}}}_{0}\right]$. According to the properties of the multivariate Gaussian distribution, the conditional distribution of ${\psi }_{i}^{\prime }$ at some depth given $\pi ^{\prime }$ is

$${\psi }_{i}^{{\prime} }{{|}}{\pi }^{{\prime} } \sim {Gau}\left(\mu ,{\sigma }^{2}\right),$$

(10)

with conditional mean and variance respectively being

$$\mu ={\Omega }_{12}{\Omega }_{22}^{-1}{\pi }^{{\prime} },$$

(11)

$${\sigma }^{2}={\Omega }_{11}-{\Omega }_{12}{\Omega }_{22}^{-1}{\Omega }_{21},$$

(12)

where ${\Omega }_{{ij}}$ denotes the element of $\Omega$ in the i-th row and j-th column.

It can be proved that the optimal predictor of ${\psi }_{i}^{{\prime} }$⁵⁹, under the mean squared error loss, is the conditional mean given by Eq. (11). Accordingly, $f\partial \mu /\partial z$ is the optimal predictor of ${b}_{i}^{{\prime} }$. Therefore, Eqs. (10) and (11) suggest that the optimal predictors of ${\psi }_{i}^{{\prime} }$ and ${b}_{i}^{{\prime} }$ depend linearly on SSHA and SSTA (note that ${\psi }_{s}^{{\prime} }(x,y,0,t)$ is a linear function of SSTA) but the relationship varies with location and time due to the geographical and temporal variations of ${F}_{n}$.
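The conditional mean and variance in Eqs. (11) and (12) reduce to a few inner products once ${\bf{G}}$, ${{\bf{G}}}_{0}$ and $\Sigma$ are given. The sketch below is illustrative only: the function name `conditional_predictor` is hypothetical, and the covariance matrix is treated as a known input, which is an assumption made here for the sake of the example.

```python
import numpy as np

def conditional_predictor(G, G0, Sigma, pi_prime):
    """Conditional mean and variance of psi_i' at one depth given pi'.

    G        : vector (G_1, ..., G_N) evaluated at the target depth
    G0       : the same vector evaluated at z = 0
    Sigma    : N x N covariance matrix of the random coefficients theta'
    pi_prime : observed surface value (g/f) * SSHA - psi_s'(z = 0)
    """
    G = np.asarray(G, dtype=float)
    G0 = np.asarray(G0, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    omega11 = G @ Sigma @ G      # Var(psi_i')
    omega12 = G @ Sigma @ G0     # Cov(psi_i', pi')
    omega22 = G0 @ Sigma @ G0    # Var(pi')
    mu = omega12 / omega22 * pi_prime              # Eq. (11)
    sigma2 = omega11 - omega12**2 / omega22        # Eq. (12)
    return mu, sigma2
```

Two properties follow directly from the formulas: at the surface (${\bf{G}}={{\bf{G}}}_{0}$) the predictor returns $\pi ^{\prime }$ with zero variance, and for $N=1$ the prediction is deterministic, $\mu ={G}_{1}(z)/{G}_{1}(0)\,\pi ^{\prime }$, consistent with the statement that Eq. (9) determines ${\theta }_{1}^{\prime }$ uniquely in that case.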

GSVC model

The ocean dynamics in combination with the optimal prediction theories suggest linear but geographically and seasonally varying relationships between $T^{{\prime}}$ and its predictors (SSTA and SSHA). To capture such relationships, we propose a geographically and seasonally varying coefficient (GSVC) linear regression model,

$${T}_{k}^{{\prime} }\left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right)={\beta }_{1}\left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right){{SSTA}}_{k}\left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right)+{\beta }_{2}\left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right){{SSHA}}_{k}\left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right)+{\beta }_{0}\left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right)+{\varepsilon }_{k}\left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right),$$

(13)

where ${{{\boldsymbol{s}}}}_{i}=({x}_{i},{y}_{i})$ is the geographical coordinate, ${t}_{j}$ is the calendar day, the subscript $k$ represents the k-th year, ${\beta }_{1}({{{\boldsymbol{s}}}}_{i},{t}_{j})$ and ${\beta }_{2}({{{\boldsymbol{s}}}}_{i},{t}_{j})$ are the geographically and seasonally varying regression coefficients of ${T}_{k}^{{\prime}} ({{{\boldsymbol{s}}}}_{i},{t}_{j})$ onto ${{SSTA}}_{k}({{{\boldsymbol{s}}}}_{i},{t}_{j})$ and ${{SSHA}}_{k}({{{\boldsymbol{s}}}}_{i},{t}_{j})$ respectively, ${\beta }_{0}({{{\boldsymbol{s}}}}_{i},{t}_{j})$ is the geographically and seasonally varying intercept, and ${\varepsilon }_{k}({{{\boldsymbol{s}}}}_{i},{t}_{j})$ is a random noise term with zero mean that is independent of SSTA and SSHA. Here ${\varepsilon }_{k}({{{\boldsymbol{s}}}}_{i},{t}_{j})$ is a catch-all for what is missed in the GSVC model. Samples in different years are treated as replicates in the GSVC model, since ${\beta }_{1}$, ${\beta }_{2}$ and ${\beta }_{0}$ are identical for every year.

A simple way to estimate the regression coefficients including the intercept in Eq. (13) at $({{{\boldsymbol{s}}}}_{i},{t}_{j})$ is minimizing the residual sum of squares (RSS) over the training dataset⁴⁰:

$$\min {\sum}_{k}{\left({T}_{k}^{{\prime}} \left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right)-{\hat{T}}_{k}^{{\prime}} \left({{{\boldsymbol{s}}}}_{i},{t}_{j}\right)\right)}^{2},$$

where ${\hat{T}}_{k}^{{\prime} }$ is the prediction of ${T}_{k}^{{\prime} }$ based on Eq. (13). However, the above minimization leads to a poor estimator of the regression coefficients, as there are only ~20 replicates available for estimation. A more efficient estimation method can be motivated by physical information. Note that the background stratification and Coriolis frequency vary smoothly in the spatio-temporal domain, suggesting that the regression coefficients should be similar at nearby locations in the spatio-temporal domain. Accordingly, the regression coefficients at $({{{\boldsymbol{s}}}}_{i},{t}_{j})$ can be estimated by borrowing information from its neighborhood. In other words, the GSVC model can be fitted by locally weighted least squares:

$$\min {\sum}_{p}{\sum}_{q}{\sum}_{k}w\left({{{\boldsymbol{s}}}}_{i}-{{{\boldsymbol{s}}}}_{p},{t}_{j}-{t}_{q}\right){\left({T}_{k}^{{\prime} } \left({{{\boldsymbol{s}}}}_{p},{t}_{q}\right)-{\hat{T}}_{k}^{{\prime} } \left({{{\boldsymbol{s}}}}_{p},{t}_{q}\right)\right)}^{2},$$

(14)

where localization is specified by a spatio-temporal kernel function $w({{{\boldsymbol{s}}}}_{i}-{{{\boldsymbol{s}}}}_{p},{t}_{j}-{t}_{q})$⁵². By assigning a weight to samples in the neighborhood of the location $({{{\boldsymbol{s}}}}_{i},{t}_{j})$, the kernel function reflects the different importance of these samples when estimating the regression coefficients at $({{{\boldsymbol{s}}}}_{i},{t}_{j})$. In this study, we adopt a Gaussian kernel, i.e.,

$$w\left({{{\boldsymbol{s}}}}_{i}-{{{\boldsymbol{s}}}}_{p},{t}_{j}-{t}_{q}\right)=\exp \left(-\left[{\left(\frac{\varDelta x}{{\theta }_{x}}\right)}^{2}+{\left(\frac{\varDelta y}{{\theta }_{y}}\right)}^{2}+{\left(\frac{\varDelta t}{{\theta }_{t}}\right)}^{2}\right]\right),$$

(15)

where $\left(\Delta x,\Delta y\right)={{{\boldsymbol{s}}}}_{i}-{{{\boldsymbol{s}}}}_{p}$, $\Delta t={t}_{j}-{t}_{q}$, and ${\theta }_{x}$, ${\theta }_{y}$ and ${\theta }_{t}$ are the bandwidths determining the decay rate of $w({{{\boldsymbol{s}}}}_{i}-{{{\boldsymbol{s}}}}_{p},{t}_{j}-{t}_{q})$ along the zonal, meridional and temporal axes, respectively. We remark that the GSVC model can be treated as a generalization of the geographically and temporally weighted regression model⁶⁰ to deal with any number of replicates. Furthermore, the OLS model is a reduced case of the GSVC model with zero ${\theta }_{x}$ and ${\theta }_{y}$ but infinite ${\theta }_{t}$.

We set $({\theta }_{x},{\theta }_{y})$ as $(2^{\circ },1^{\circ })$ and ${\theta }_{t}$ as 60 days, large enough to include sufficient samples for estimation but small enough to capture the geographical and seasonal variations of regression coefficients caused by the varying background stratification and Coriolis frequency. Sensitivity tests suggest that changing the bandwidths within a reasonable range does not affect the performance of the GSVC model significantly. Finally, values of $w$ below 0.01 are truncated to zero. This truncation has nearly no influence on the estimation but reduces the computational burden substantially.
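The locally weighted fit of Eqs. (14) and (15) at a single target point can be sketched as follows. This is an illustrative example rather than the published implementation: the function names and the flat array layout are hypothetical, and any wrap-around of the calendar-day offset across the year boundary is assumed to be handled by the caller.

```python
import numpy as np

def gaussian_kernel(dx, dy, dt, theta_x=2.0, theta_y=1.0, theta_t=60.0, cutoff=0.01):
    """Weights of Eq. (15); weights below `cutoff` are truncated to zero."""
    w = np.exp(-((dx / theta_x) ** 2 + (dy / theta_y) ** 2 + (dt / theta_t) ** 2))
    return np.where(w < cutoff, 0.0, w)

def fit_local_coefficients(ssta, ssha, temp, dx, dy, dt):
    """Weighted least squares estimate of (beta0, beta1, beta2) at one point.

    All inputs are 1-D arrays over the neighbouring samples (location, day, year).
    dx, dy are offsets from the target point in degrees; dt is the offset in days.
    """
    ssta, ssha, temp = (np.asarray(v, dtype=float) for v in (ssta, ssha, temp))
    w = gaussian_kernel(np.asarray(dx, dtype=float),
                        np.asarray(dy, dtype=float),
                        np.asarray(dt, dtype=float))
    keep = w > 0                                  # drop truncated samples
    X = np.column_stack([np.ones(keep.sum()), ssta[keep], ssha[keep]])
    sw = np.sqrt(w[keep])                         # sqrt-weights turn WLS into OLS
    beta, *_ = np.linalg.lstsq(X * sw[:, None], temp[keep] * sw, rcond=None)
    return beta                                   # [beta0, beta1, beta2]
```

Scaling each row of the design matrix and the response by the square root of its weight makes the ordinary least squares solution identical to the minimizer of Eq. (14), so a standard solver can be reused.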

Convolutional neural network

The Convolutional Neural Network (CNN) is a deep learning network that has been widely used in marine research⁶¹,⁶²,⁶³. It can capture nonlinear relationships within data. In this study, we use an end-to-end CNN model to detect the occurrence of subsurface MHWs, with SSTA and SSHA as inputs. The CNN model is implemented in two different approaches, depending on its output. In the first approach, the output is set as a binary variable to directly detect MHW occurrence, where one indicates the occurrence of an MHW and zero indicates otherwise. It corresponds to a supervised classification learning model, referred to as CNN_cla. The second approach uses the CNN model to first learn $T^{{\prime} }$ and then detect MHWs based on the learned $T^{{\prime} }$. It corresponds to a supervised regression learning model, referred to as CNN_reg.

The CNN_cla and the CNN_reg share the same architecture, depicted in Supplementary Fig. 8. Based on the volume of available data, we first experiment with 2-layer, 3-layer and 4-layer CNNs and ultimately choose a 3-layer CNN architecture to prevent overfitting. In the proposed architecture, data normalization is employed at the start to effectively accelerate convergence and alleviate the vanishing gradient issue. The first two layers are convolutional layers with a filter size of 3° × 3° followed by a ReLU activation function for CNN_reg and a Sigmoid activation function for CNN_cla. Filters of different sizes are also tried but not adopted due to their poorer prediction performance (Supplementary Table 1). The third layer is a fully connected layer essentially implemented as a convolutional layer, with an additional sigmoid function appended for CNN_cla to produce an output between 0 and 1 representing the probability of MHW occurrence. The output is classified as one (an MHW) if the probability is higher than 0.5 and zero (not an MHW) otherwise.

We use Python with the PyTorch API to implement the above learning architectures. For optimization, Adadelta⁶⁴ is used, which adjusts the learning rate for each parameter individually and dynamically. As the relationship between the input and output is assumed to be invariant at the interannual and longer time scales, we experimented with different batch sizes and ultimately set a batch size of 365 samples for CNN_reg and 256 samples for CNN_cla (Supplementary Table 2). The number of training epochs is set empirically as 100. The loss function is the mean squared error for CNN_reg and binary cross entropy for CNN_cla. Note that CNN_cla involves imbalanced binary classification as MHWs are rare events. For this reason, we give different weights to different classes in the loss function, i.e., 0.80 for the “one” class (an MHW) and 0.20 for the “zero” class (not an MHW), to make the model pay more attention to the occurrence of MHW events. These weights lead to a better prediction performance than the weights (0.3, 0.7) and (0.1, 0.9) (Supplementary Table 3).
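To make the effect of the class weights concrete, the following sketch writes out one common form of a class-weighted binary cross entropy in NumPy, together with the 0.5 decision threshold. The exact weighting and normalization used in the published PyTorch code are not specified in the text, so the formula and the function names `weighted_bce` and `classify` are assumptions for illustration.

```python
import numpy as np

def weighted_bce(prob, label, w_pos=0.80, w_neg=0.20, eps=1e-12):
    """Mean class-weighted binary cross entropy (one common form, assumed here).

    prob  : predicted MHW probabilities in (0, 1)
    label : 1 for an MHW sample, 0 otherwise
    """
    prob = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    label = np.asarray(label, dtype=float)
    loss = -(w_pos * label * np.log(prob)
             + w_neg * (1.0 - label) * np.log(1.0 - prob))
    return loss.mean()

def classify(prob, threshold=0.5):
    """Label a sample as an MHW (1) when its probability exceeds the threshold."""
    return (np.asarray(prob) > threshold).astype(int)
```

With weights of 0.80 and 0.20, a missed MHW is penalized four times as heavily as an equally confident false alarm, which is the sense in which the loss pays more attention to the rare class.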

Computation of the Matthews correlation coefficient

The performance of a statistical model for binary classification (MHW or not in our case) can be evaluated based on the 2 × 2 confusion matrix⁴⁰:

$${{\bf{M}}}=\left(\begin{array}{cc}{TP} & {FN}\\ {FP} & {TN}\end{array}\right),$$

where TP represents the true positives (i.e., the prediction from a statistical model correctly indicates the presence of an MHW), TN the true negatives (i.e., the prediction correctly indicates the absence of an MHW), FP the false positives (i.e., the prediction wrongly indicates the presence of an MHW), and FN the false negatives (i.e., the prediction wrongly indicates the absence of an MHW).

Although the confusion matrix provides a complete description of the performance of a statistical model, analyzing all four components of the confusion matrix separately would be cumbersome. It is thus desirable to define a scalar metric that summarizes the confusion matrix. The Matthews correlation coefficient (MCC)⁴⁴ has been shown to be an effective metric, superior to commonly used metrics such as accuracy and the F1 score, particularly for imbalanced classification problems like MHW detection⁴⁵. The MCC is calculated as

$${MCC}=\frac{{TP}\cdot {TN}-{FP}\cdot {FN}}{\sqrt{\left({TP}+{FP}\right)\left({TP}+{FN}\right)\left({FP}+{TN}\right)\left({FN}+{TN}\right)}}.$$

(16)

The MCC has a range of $[-1,1]$. A score higher (lower) than zero indicates better (worse) performance than a random classification. It is invariant to exchanging the two classes and yields a high score only when predictions perform well in all four components of the confusion matrix.
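Eq. (16) translates directly into a short function. The sketch below uses only the standard library; returning zero when the denominator vanishes (for example, when a model never predicts an MHW) is a convention assumed here, since the text does not address that degenerate case.

```python
import math

def mcc(tp, fn, fp, tn):
    """Matthews correlation coefficient from the confusion matrix entries, Eq. (16)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (fp + tn) * (fn + tn))
    if denom == 0:
        # Assumed convention: a degenerate confusion matrix scores as random.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Swapping the roles of the two classes exchanges TP with TN and FP with FN, which leaves both the numerator and the denominator unchanged; this is the class-exchange invariance noted above.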

We employ a bootstrap method to test whether the value of MCC at some spatial location (Fig. 1) is significantly positive. First, we compute the probability $p$ of MHWs, estimated as the ratio of the number of MHW days to the total number of days ${N}_{t}$. Then, we randomly pick $p{N}_{t}$ days from the entire period, assign them as MHW days and compute the MCC. This random prediction is repeated 5000 times and an empirical 95% confidence interval is computed. The MCC of a statistical model is considered significantly positive at the 95% confidence level if its value is higher than the upper bound of the 95% confidence interval of the MCC for random prediction.
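The significance test described above can be sketched as follows. The function names are hypothetical, and the upper bound is taken as the 97.5th percentile of the random-prediction scores, which assumes the 95% interval is two-sided; the text does not state this explicitly.

```python
import numpy as np

def mcc_from_labels(truth, pred):
    """MCC computed from two boolean time series (True = MHW day)."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = int(np.sum(truth & pred))
    tn = int(np.sum(~truth & ~pred))
    fp = int(np.sum(~truth & pred))
    fn = int(np.sum(truth & ~pred))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (fp + tn) * (fn + tn)))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def random_mcc_upper_bound(truth, n_boot=5000, level=0.95, seed=0):
    """Upper bound of the empirical interval of MCC for random predictions."""
    truth = np.asarray(truth, dtype=bool)
    n_t = truth.size
    n_mhw = int(truth.sum())                 # p * N_t days labelled as MHW
    rng = np.random.default_rng(seed)
    scores = np.empty(n_boot)
    for b in range(n_boot):
        pred = np.zeros(n_t, dtype=bool)
        pred[rng.choice(n_t, size=n_mhw, replace=False)] = True
        scores[b] = mcc_from_labels(truth, pred)
    return float(np.quantile(scores, 0.5 + level / 2))
```

A model's MCC at a given location would then be flagged as significantly positive when it exceeds the value returned by `random_mcc_upper_bound` for the observed MHW series at that location.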

Data availability

Source data are provided with this paper. The ECCO2 data are provided by NASA/JPL from https://ecco.jpl.nasa.gov/. The GLORYS12V1 data are obtained from Copernicus Marine Service product GLOBAL_MULTIYEAR_PHY_001_030 (https://data.marine.copernicus.eu/). The HYCOM data are available from http://ncss.hycom.org/.

Code availability

Code for MHW identification is available in MATLAB (https://github.com/ZijieZhaoMMHW/m_mhw1.0). The GSVC, OLS, CNN_cla and CNN_reg models used in this study are written in Python and available at https://github.com/Zhang-Koda/Detecting_subsurface_MHWs.

References

Hobday, A. J. et al. A hierarchical approach to defining marine heatwaves. Prog. Oceanogr. 141, 227–238 (2016).
Oliver, E. C. J. et al. Marine heatwaves. Annu. Rev. Mar. Sci. 13, 313–342 (2021).
Hughes, T. P. et al. Global warming and recurrent mass bleaching of corals. Nature 543, 373–377 (2017).
Wernberg, T. et al. An extreme climatic event alters marine ecosystem structure in a global biodiversity hotspot. Nat. Clim. Chang. 3, 78–82 (2013).
Smale, D. A. et al. Marine heatwaves threaten global biodiversity and the provision of ecosystem services. Nat. Clim. Chang. 9, 306–312 (2019).
Jones, T. et al. Massive mortality of a planktivorous seabird in response to a marine heatwave. Geophys. Res. Lett. 45, 3193–3202 (2018).
Mills, K. E. et al. Fisheries management in a changing climate: lessons from the 2012 ocean heat wave in the Northwest Atlantic. Oceanog. 26, 191–195 (2013).
Oliver, E. C. J. et al. Longer and more frequent marine heatwaves over the past century. Nat. Commun. 9, 1324 (2018).
Frölicher, T. L., Fischer, E. M. & Gruber, N. Marine heatwaves under global warming. Nature 560, 360–364 (2018).
Oliver, E. C. J. et al. Projected marine heatwaves in the 21st century and the potential for ecological impact. Front. Mar. Sci. 6, 734 (2019).
Laufkötter, C., Zscheischler, J. & Frölicher, T. L. High-impact marine heatwaves attributable to human-induced global warming. Science 369, 1621–1625 (2020).
Hobday, A. et al. Categorizing and naming marine heatwaves. Oceanog. 31, 162–173 (2018).
Sun, D., Jing, Z., Li, F. & Wu, L. Characterizing global marine heatwaves under a spatio-temporal framework. Prog. Oceanogr. 211, 102947 (2023).
Sun, D., Li, F., Jing, Z., Hu, S. & Zhang, B. Frequent marine heatwaves hidden below the surface of the global ocean. Nat. Geosci. 16, 1099–1104 (2023).
Holbrook, N. J. et al. A global assessment of marine heatwaves and their drivers. Nat. Commun. 10, 2624 (2019).
Sen Gupta, A. et al. Drivers and impacts of the most extreme marine heatwave events. Sci. Rep. 10, 19359 (2020).
Bian, C. et al. Oceanic mesoscale eddies as crucial drivers of global marine heatwaves. Nat. Commun. 14, 2970 (2023).
Borgman, E., Pedersen, M., Staehr, P. & Fischer-Bogason, R. Marine Heatwaves in Northen Sea Areas: Occurrence, Effects, and Expected Frequencies (PlanMiljø, 2022).
Jacox, M. G. et al. Global seasonal forecasts of marine heatwaves. Nature 604, 486–490 (2022).
McAdam, R., Masina, S. & Gualdi, S. Seasonal forecasting of subsurface marine heatwaves. Commun. Earth Environ. 4, 225 (2023).
Jackson, J. M., Johnson, G. C., Dosser, H. V. & Ross, T. Warming from recent marine heatwave lingers in deep British Columbia fjord. Geophys. Res. Lett. 45, 9757–9764 (2018).
Schaeffer, A. & Roughan, M. Subsurface intensification of marine heatwaves off southeastern Australia: The role of stratification and local winds. Geophys. Res. Lett. 44, 5025–5033 (2017).
Gawarkiewicz, G. et al. Characteristics of an advective marine heatwave in the middle atlantic bight in early 2017. Front. Mar. Sci. 6, 712 (2019).
Chen, K., Gawarkiewicz, G. & Yang, J. Mesoscale and submesoscale shelf‐ocean exchanges initialize an advective marine heatwave. JGR Oceans 127, e2021JC017927 (2022).
Hu, S. et al. Observed strong subsurface marine heatwaves in the tropical western Pacific Ocean. Environ. Res. Lett. 16, 104024 (2021).
Amaya, D. J. et al. Bottom marine heatwaves along the continental shelves of North America. Nat Commun 14, 1038 (2023).
Elzahaby, Y. & Schaeffer, A. Observational insight into the subsurface anomalies of marine heatwaves. Front. Mar. Sci. 6, 745 (2019).
Scannell, H. A., Johnson, G. C., Thompson, L., Lyman, J. M. & Riser, S. C. Subsurface evolution and persistence of marine heatwaves in the Northeast Pacific. Geophys. Res. Lett. 47, e2020GL090548 (2020).
Ryan, S. et al. Depth structure of Ningaloo Niño/Niña events and associated drivers. J. Clim. 34, 1767–1788 (2021).
Fragkopoulou, E. et al. Marine biodiversity exposed to prolonged and intense subsurface heatwaves. Nat. Clim. Chang. 13, 1114–1121 (2023).
Lapeyre, G. & Klein, P. Dynamics of the upper oceanic layers in terms of surface quasigeostrophy theory. J. Phys. Oceanogr. 36, 165–176 (2006).
Wang, J., Flierl, G. R., LaCasce, J. H., McClean, J. L. & Mahadevan, A. Reconstructing the ocean’s interior from surface data. J. Phys. Oceanogr. 43, 1611–1626 (2013).
Klein, P. et al. Diagnosis of vertical velocities in the upper ocean from high resolution sea surface height. Geophys. Res. Lett. 36, 2009GL038359 (2009).
Qiu, B. et al. Reconstructability of three-dimensional upper-ocean circulation from SWOT sea surface height measurements. J. Phys. Oceanogr. 46, 947–963 (2016).
Guinehut, S., Dhomps, A.-L., Larnicol, G. & Le Traon, P.-Y. High resolution 3-D temperature and salinity fields derived from in situ and satellite observations. Ocean Sci. 8, 845–857 (2012).
Su, H., Huang, L., Li, W., Yang, X. & Yan, X. Retrieving ocean subsurface temperature using a satellite‐based geographically weighted regression model. J. Geophys. Res. Oceans 123, 5180–5193 (2018).
Meng, L. et al. Reconstructing high-resolution ocean subsurface and interior temperature and salinity anomalies from satellite observations. IEEE Trans. Geosci. Remote Sensing 60, 1–14 (2022).
Xie, H., Xu, Q., Cheng, Y., Yin, X. & Jia, Y. Reconstruction of subsurface temperature field in the South China Sea from satellite observations based on an attention U-net model. IEEE Trans. Geosci. Remote Sensing 60, 1–19 (2022).
Pedlosky, J. Geophysical Fluid Dynamics (Springer, 1987). https://doi.org/10.1007/978-1-4612-4650-3.
James, G., Witten, D., Hastie, T. & Tibshirani, R. An Introduction to Statistical Learning with Applications in R. (Springer, 2013).
Charney, J. G. Geostrophic turbulence. J. Atmos. Sci. 28, 1087–1095 (1971).
Ferrari, R. & Wunsch, C. The distribution of eddy kinetic and potential energies in the global ocean. Tellus A Dynamic Meteorology and Oceanography 62, 92–108 (2010).
Talley, L. D., Pickard, G. L., Emery, W. J. & Swift, J. H. Descriptive Physical Oceanography: An Introduction (Elsevier, 2011).
Matthews, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta - Protein Structure 405, 442–451 (1975).
Chicco, D. & Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics 21, 6 (2020).
Isern-Fontanet, J., Lapeyre, G., Klein, P., Chapron, B. & Hecht, M. W. Three-dimensional reconstruction of oceanic mesoscale currents from surface information. J. Geophys. Res. Oceans 113, (2008).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84–90 (2017).
McWilliams, J. C. Submesoscale currents in the ocean. Proc. R. Soc. A. 472, 20160117 (2016).
Johns, B. Physical Oceanography of Coastal and Shelf Seas (Elsevier, 1983).
Roemmich, D. et al. The Argo Program: observing the global ocean with profiling floats. Oceanog. 22, 34–43 (2009).
Riser, S. C. et al. Fifteen years of ocean observations with the global Argo array. Nat. Clim. Chang. 6, 145–153 (2016).
Cressie, N. Statistics for Spatial Data (John Wiley & Sons, 2015).
Menemenlis, D. et al. ECCO2: high resolution global ocean and sea ice data synthesis. AGU Fall Meeting Abstracts 31, (2008).
Jean-Michel, L. et al. The Copernicus Global 1/12° oceanic and sea ice GLORYS12 reanalysis. Front. Earth Sci. 9, 698876 (2021).
Chassignet, E. P. et al. The HYCOM (HYbrid Coordinate Ocean Model) data assimilative system. J. Marine Syst. 65, 60–83 (2007).
Wang, S., Jing, Z., Sun, D., Shi, J. & Wu, L. A new model for isolating the marine heatwave changes under warming scenarios. J. Atmos. Ocean Tech. 39, 1353–1366 (2022).
Wikle, C. K., Zammit Mangion, A. & Cressie, N. A. C. Spatio-Temporal Statistics with R (CRC Press Taylor & Francis Group, 2019).
Wunsch, C. The vertical partition of oceanic horizontal kinetic energy. J. Phys. Oceanogr. 27, 1770–1794 (1997).
Cox, D. R. & Donnelly, C. A. Principles of Applied Statistics (Cambridge University Press, 2011). https://doi.org/10.1017/CBO9781139005036.
Fotheringham, A. S., Crespo, R. & Yao, J. Geographical and temporal weighted regression (GTWR). Geogr. Anal. 47, 431–452 (2015).
Ham, Y.-G., Kim, J.-H. & Luo, J.-J. Deep learning for multi-year ENSO forecasts. Nature 573, 568–572 (2019).
Zanna, L. & Bolton, T. Data‐driven equation discovery of ocean mesoscale closures. Geophys. Res. Lett. 47, e2020GL088376 (2020).
Han, M. et al. A convolutional neural network using surface data to predict subsurface temperatures in the Pacific Ocean. IEEE Access 7, 172816–172829 (2019).
Zeiler, M. D. Adadelta: an adaptive learning rate method. Preprint at http://arxiv.org/abs/1212.5701 (2012).

Acknowledgements

This work was supported by Laoshan Laboratory Science and Technology Innovation Projects (LSKJ202400203 to F.L.), Taishan Scholar Funds (tsqn201909052 to Z.J.) and Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science (2022B1212010006 to B.Z.). Computational resources were supported by Laoshan Laboratory (LSKJ202300302 to Z.J.).

Author information

Authors and Affiliations

Frontiers Science Center for Deep Ocean Multispheres and Earth System and Key Laboratory of Physical Oceanography/Academy of Future Ocean, Ocean University of China, Qingdao, China
Xiang Zhang, Zhao Jing & Xiaohui Ma
Laoshan Laboratory, Qingdao, China
Xiang Zhang, Zhao Jing, Xiaohui Ma & Tianshi Du
School of Mathematical Sciences, Ocean University of China, Qingdao, China
Furong Li
Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Zhuhai, China
Bohai Zhang

Authors

Xiang Zhang
Furong Li
Zhao Jing
Bohai Zhang
Xiaohui Ma
Tianshi Du

Contributions

X.Z. conducted the analysis under F.L. and Z.J.’s instruction. F.L. and Z.J. conceived the project. F.L. proposed the statistical learning method for detecting subsurface MHW events. Z.J. provided the dynamical analysis to guide the development of the statistical learning method. X.Z., F.L. and Z.J. wrote the manuscript. B.Z. and X.M. contributed to the writing and interpretation of the results. T.D. assisted in the dynamical analysis.

Corresponding authors

Correspondence to Furong Li or Zhao Jing.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Earth & Environment thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Alireza Bahadori. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhang, X., Li, F., Jing, Z. et al. Detecting marine heatwaves below the sea surface globally using dynamics-guided statistical learning. Commun Earth Environ 5, 616 (2024). https://doi.org/10.1038/s43247-024-01769-x

Received: 04 May 2024
Accepted: 04 October 2024
Published: 23 October 2024
DOI: https://doi.org/10.1038/s43247-024-01769-x