Introduction

The scientific community widely recognizes that global warming has accelerated alarmingly since industrialization1,2,3. Compared to the pre-industrial era, the global mean surface temperature (GMST) in the last decade (2013–2022) has risen by more than 1.1 °C, as reported by the WMO and the latest IPCC assessments4,5,6 (Figure S1). The rapid rate and magnitude of this surface temperature increase have made glacier retreat, sea-level rise, and extreme weather events more common5,7. The energy budget is a key factor in determining the direction and extent of climate change8,9. Generally, Earth’s climate is relatively stable, with the top of the atmosphere (TOA) maintaining a state of radiative equilibrium, fluctuating slightly around zero due to internal variability10,11. However, changes in forcing factors such as aerosols and greenhouse gases emitted by human activities can lead to an imbalance in the Earth’s energy budget12. This imbalance heats different components of the climate system (such as the atmosphere and ocean), alters temperature distributions, affects the water cycle, and changes ocean and atmospheric circulation13,14. Extensive evidence indicates that the primary driver of global warming is the increase in radiative forcing due to human activities since the industrial era, particularly the large-scale burning of fossil fuels, which has significantly increased the concentrations of carbon dioxide and other greenhouse gases in the atmosphere5. The warming effects of these greenhouse gases are most pronounced over time scales of 50–100 years15,16.

For more than a century, scientists have attempted to quantify the various components of the global energy balance. Early research relied on limited data obtained from surface observation stations and observation balloons, coupled with numerous assumptions, resulting in high uncertainty17,18. The advent of satellite measurements has revolutionized this field, allowing for accurate observations of the shortwave and longwave exchanges between the Earth and space at the TOA. The Earth Radiation Budget Experiment (ERBE)19, initiated in 1980, and the Clouds and the Earth’s Radiant Energy System (CERES)20, initiated in the early 2000s, have provided high-resolution data on the radiative fluxes at the TOA. These datasets have been extensively used in radiation budget assessments and for the tuning and evaluation of global climate models (GCMs)21,22,23,24. However, satellite observations, even when constrained by ocean heat content (OHC) changes, provide reliable data only at the TOA. The complex internal energy transformations within Earth’s climate system and the energy budget at the surface still lack high-quality, high-resolution observational data11,25.

Climate models are extensively used in the study of Earth’s energy budget26,27,28,29,30. They show good consistency in radiative fluxes at the TOA, but they tend to overestimate downward shortwave radiation at the surface and underestimate downward longwave radiation31,32,33, leading to significant discrepancies in the simulated energy budget at the Earth’s surface24. A “model democracy” multi-model ensemble approach is often adopted, which gives equal weight to each model and therefore does not distinguish between the performances of different models34,35,36,37,38. Some studies use simple linear regression against observational results to obtain optimal estimates; this is intuitive, but makes it hard to assess the uncertainty of these estimates39,40. Although it is straightforward to define a “model performance index”, demonstrating its relevance and usefulness for predictive accuracy presents a more complex challenge41. The Reliability Ensemble Averaging (REA) method has been employed to enhance the credibility of subcontinental-scale climate change projections. It evaluates models based on their ability to reproduce current climate conditions and on the convergence of simulated changes across models42. By minimizing the influence of outlier or poorly performing models, REA significantly reduces uncertainty ranges and provides a quantitative measure of reliability. The Bayesian Model Averaging (BMA) method was introduced to integrate model outputs and observational data, providing a framework for assessing probability distributions of regional temperature changes43,44. BMA refines the REA method by weighting models according to bias and convergence criteria, thereby improving prediction accuracy and consistency. Increasing evidence suggests that observationally constrained weighting improves the accuracy and reliability of predictions34,45.

CMIP6 Model Performance in Simulating Surface Radiation Budget Components and BMA Weight Allocation

The fundamental concept of the BMA (see Data and Methods) is to consider all plausible CMIP6 models (see Data and Methods) and assign each a weight proportional to how well the model matches observation-constrained data. In this study, our observational constraints are derived from surface observation data. Specifically, for Surface Solar Radiation (SSR), we used data reconstructed by Jiao et al. through an improved partial convolutional neural network46 (see Data and Methods). We selected 34 CMIP6 models (Table S2) that include the SSP245 scenario for analysis and used single run results. Figure 1a-d present the comparison results using Taylor diagrams47, quantifying each model’s predictions for the various radiative components based on correlation coefficients (CC), standard deviation (STD), and root mean square error (RMSE), and comparing the simple models’ ensembles (SME) and BMA results (based on transient phenomena). They also reveal that, except for upward shortwave radiation, the model outputs for other radiation components generally show good correlation with observations (CC above 0.9) and smaller RMSE, with distributions that are relatively close. In contrast, the results for upward shortwave radiation exhibit significant discrepancies, with the SME and BMA averages showing larger deviations from observations. More specifically, in the BMA method, the weight of each model is its posterior probability, which combines its prior probability (reflecting initial belief about the model) and its likelihood (how well the model matches the observed data). This ensures higher weights for models that better reproduce observed climate patterns, improving overall predictive accuracy. The BMA method, due to its approach of creating a weighted average over all included models, consistently provides one of the best-performing estimates in the ensemble. 
Further comparison of the probability density functions (PDFs) of the four components, for the BMA-constrained results, the SME, and the observations, shows that the BMA constraints increase the consistency of the PDFs of the individual components with those from the observations (Figure S4). Figure 1e displays the model weights for each radiative component as determined by the BMA method. The weights assigned to individual models under the BMA approach differ significantly.
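The weighting idea described above (posterior weight proportional to prior times likelihood) can be illustrated with a minimal sketch. It is not the implementation used in this study: the Gaussian likelihood, the common error standard deviation `sigma`, the equal default priors, and the function names `bma_weights` and `bma_mean` are all assumptions made for illustration.

```python
import numpy as np

def bma_weights(models, obs, sigma=1.0, priors=None):
    """Return posterior weights (prior x likelihood, normalized to sum to 1).

    models: array (n_models, n_times); obs: array (n_times,).
    sigma:  ASSUMED common observation-error standard deviation (hypothetical).
    """
    models = np.asarray(models, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n_models = models.shape[0]
    if priors is None:
        priors = np.full(n_models, 1.0 / n_models)  # equal prior belief
    priors = np.asarray(priors, dtype=float)
    # Gaussian log-likelihood, up to a constant shared by all models
    log_like = -0.5 * np.sum((models - obs) ** 2, axis=1) / sigma**2
    log_post = np.log(priors) + log_like
    log_post -= log_post.max()  # avoid underflow before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

def bma_mean(models, weights):
    """Weighted ensemble-mean series."""
    return np.asarray(weights, dtype=float) @ np.asarray(models, dtype=float)
```

With equal priors, a model whose series lies closer to the observations receives a larger weight, and identical models receive identical weights. The simple model ensemble (SME) corresponds to the special case in which all weights are fixed at 1/n.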

Fig. 1: Model simulation and Assessment of Radiation Budget Components.

Comparison of Taylor diagrams for (a) Downward Shortwave Component, (b) Downward Longwave Component, (c) Upward Shortwave Component, (d) Upward Longwave Component in various models, BMA fitting, and simple average fitting, and (e) Distribution of BMA weights for each radiation component across different models. The radius in the diagram represents the standard deviation size, and the angle indicates the correlation coefficient with observational data. The comparison period is from 1960 to 2022, with the SSP245 scenario extended into the future. Correlations were obtained by calculating the monthly data correlation between each station and the model data, then averaging across all stations.
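As a hedged illustration of the statistics summarized in a Taylor diagram, the sketch below computes the correlation coefficient, the standard deviations, and the centered RMSE for one model series against one observed series. The function name `taylor_stats` is hypothetical, and the use of the centered (mean-removed) RMSE is an assumption based on the standard Taylor-diagram construction, not a statement of how the figure was produced.

```python
import numpy as np

def taylor_stats(model, obs):
    """Return (cc, std_model, std_obs, centered_rmse) for two 1-D series."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    cc = np.corrcoef(model, obs)[0, 1]          # correlation coefficient
    std_model = model.std()                     # population standard deviation
    std_obs = obs.std()
    # Centered RMSE: remove each series' mean before differencing
    diff = (model - model.mean()) - (obs - obs.mean())
    centered_rmse = np.sqrt(np.mean(diff ** 2))
    return cc, std_model, std_obs, centered_rmse
```

These quantities satisfy the law-of-cosines relation E'^2 = std_model^2 + std_obs^2 - 2 std_model std_obs cc, which is what allows all three to be read from a single diagram.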

Recent Global Radiation Imbalance from a Surface Perspective Estimated by CMIP6 Models under BMA Constraint

The surface energy budget components primarily consist of incoming solar radiation (rsds), outgoing solar radiation (rsus), downward longwave radiation (rlds), upward longwave radiation (rlus), surface latent heat (hfls), and surface sensible heat (hfss). The surface energy budget can be expressed as:

$$\mathrm{EEI}=\mathrm{rsds}-\mathrm{rsus}+\mathrm{rlds}-\mathrm{rlus}-\mathrm{hfls}-\mathrm{hfss}\tag{1}$$

However, there remain some differences in the surface radiation budget components reported by different teams24,48,49,50,51 (see Table S1 and Figure S2). Table S1 and Figure S2 also provide the uncertainty for each component and compare it with the most recent estimates from other teams (all are averages for 2000–2014, except Stephens et al.51, which covers 2000–2010). From Table S1, the BMA surface Earth Energy Imbalance (EEI) averages 0.7 Wm−2 during 2000–2014, while the SME result is 1.7 Wm−2 and the CERES average is 0.63 Wm−2 for the same period. It is important to note that CMIP6 models inherently possess certain systematic errors in parsing fluxes. For instance, these models typically only account for shallow soil layers without considering deep ground heat absorption52. Additionally, in CMIP6, either ice sheets are assumed to be in a stable state or emissions are assumed to be constant53. These systematic errors partially explain the excessively high SME EEI results. By incorporating observational constraints, the BMA method mitigates the impact of these systematic errors, resulting in lower EEI values. In addition, research by Wild et al. indicates that CMIP5 models significantly overestimate downward shortwave radiation and underestimate upward longwave radiation compared to surface observations (Figure S3), and our study shows that CMIP6 models still exhibit the same issues39. After being constrained by observational data, the downward shortwave radiation decreased to 186 Wm−2 and the upward longwave radiation increased to 401 Wm−2, values similar to those of the latest IPCC AR6 report54. The BMA results show improvements in upward longwave radiation, with closer alignment to observational values (Fig. 1d). This may be linked to significantly higher global surface temperature anomalies in the early 21st century, which account for changes in Arctic ice surface temperatures6,55.
According to the Stefan-Boltzmann law, this results in an approximate 1.5 Wm−2 increase in thermal radiation, corresponding to a temperature anomaly increase of about 0.25 °C. Surface sensible and latent heat fluxes, which lack observational data for a BMA constraint, are calculated directly using the SME method in this study. Latent heat is slightly higher than previous estimates but closer to those based on global precipitation observations51 (Table S1), and sensible heat is generally consistent with other estimates except Kato (2018)50 (Table S1). Figure 2 presents the updated surface energy budget diagram for 2000–2022.
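The Stefan-Boltzmann estimate quoted above can be checked to first order with a short sketch. It assumes the surface radiates as a blackbody (emissivity of 1) and linearizes the law as ΔF ≈ 4σT³ΔT = 4FΔT/T; the function names are hypothetical, and the input flux of 401.4 Wm−2 is the upward longwave estimate reported in this study.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m-2 K-4

def temperature_from_flux(flux):
    """Effective blackbody temperature (K) for an emitted flux (W m-2)."""
    return (flux / SIGMA) ** 0.25

def flux_change(flux, delta_t):
    """Linearized change in emitted flux for a small temperature change.

    Uses dF = 4 * sigma * T**3 * dT, written as 4 * F / T * dT.
    ASSUMPTION: unit emissivity and a single global effective temperature.
    """
    temperature = temperature_from_flux(flux)
    return 4.0 * flux / temperature * delta_t

# Upward longwave flux from this study and a 0.25 K anomaly increase
delta_flux = flux_change(401.4, 0.25)
```

Under these simplifying assumptions the effective temperature is close to 290 K and the flux change comes out at roughly 1.4 Wm−2, the same order as the approximate value given in the text.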

Fig. 2: The global Radiation Budget from a surface perspective in this study.

The numbers indicate the best estimates and their uncertainties (at 95% confidence level) for the magnitudes of the globally averaged energy balance components, obtained using the BMA method constrained by observational data from 34 CMIP6 models during 2000–2022, representing present-day climate conditions since the start of the 21st century. Downward shortwave radiation is constrained by reconstructed SSR data from Jiao et al. 2023, while upward shortwave and longwave radiation are constrained using GEBA observational station data. Due to the lack of observational data for latent and sensible heat fluxes, SME results are used directly. Uncertainties are calculated from the weighted ensemble standard deviation of the models.

On the Uncertainties in EEI Estimation

It is important to note that the method of calculating EEI using various radiative fluxes typically involves considerable uncertainties. The uncertainties in EEI estimates derived from observed surface energy imbalance or net ocean heat flux from space observations can be as high as ±15–17 Wm−2 (refs. 11,50,51,56). This large uncertainty primarily originates from summing the components: the magnitudes of the individual radiative fluxes are far larger than EEI, so their uncertainties are also much larger than that of EEI and are propagated into the EEI estimates. An approach was proposed to obtain the best estimates of radiation components from the Earth system model, constrained by surface observations and CERES satellite data at the top of the atmosphere (TOA)57. However, the uncertainty assessment in this approach still relies on empirical estimates.

Observing the energy change trends in OHC seems to offer a way to calculate EEI changes with lower levels of uncertainties58,59. These energy changes are closely related to anthropogenic or natural forcing, mainly manifesting on decadal or longer scales60. Although the majority of the EEI is stored in OHC, the interannual variability in OHC does not show strong responses in certain volcanic years (Fig. 3, Figure S5). These years exhibit noticeable troughs in the EEI series. This can be explained by the margin of error and the changing proportion of OHC in EEI (Figure S7). Some studies derive EEI directly from the time derivative of OHC divided by a fixed ocean absorption rate λ (e.g., λ = 0.90), which may introduce significant errors in interannual variations58.

Fig. 3: Time Series Analysis of EEI from 1961 to 2022.

The red line represents the CMIP6 ensemble results obtained through BMA constraint, the gray line indicates the results of a simple models’ ensemble (SME), the blue line shows the EEI results from the SME at TOA, the purple line shows the EEI results from the BMA at TOA, the black line displays the estimates from CERES satellite observations, the orange line represents the OHC changes calculated from multiple OHC data sets, the green line represents the Earth’s energy change calculated by Schuckmann et al. 2023 based on multiple surface energy observations, and the brown line shows the EEI results calculated by Marti et al. 2022 based on space altimetry and space gravimetry. The data from Schuckmann et al. and the OHC data have been smoothed using a 10-year moving average after first-order differencing to convert energy data into energy change data for comparison with EEI data. A more detailed comparison with the OHC data is shown in Fig. 4.

In this study, the uncertainty of each radiative component is calculated based on the ensemble standard deviation (\(\sigma\)). For the weighted average results, the weighted \(\sigma\) is used. The uncertainty range is determined by the 95% confidence interval, calculated as \(1.96\times\sigma\). Because the components used to calculate EEI are not independent but highly interdependent, the correlations between them should be taken into account when calculating uncertainties, instead of taking the root sum square of the uncertainties of individual components as previously done11,50,51,56. The covariance matrix between components is therefore calculated for each year and used to determine the uncertainty range of EEI (Figure S10 shows the covariance matrix for 2000). Specific methods can be found in the Data and Methods section. As a result, this study arrives at a narrower uncertainty range for EEI (±1.0 Wm−2).
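The difference between covariance-aware propagation and the root-sum-square approach can be sketched as follows. EEI is a signed sum of the six components in Eq. (1), so its variance is cᵀΣc, where c is the sign vector and Σ the covariance matrix across models. This is an illustrative sketch only: the function name, the unweighted default, and the use of a single optional weight vector are assumptions, not the procedure documented in Data and Methods.

```python
import numpy as np

# Signs of rsds, rsus, rlds, rlus, hfls, hfss in Eq. (1)
SIGNS = np.array([1.0, -1.0, 1.0, -1.0, -1.0, -1.0])

def eei_uncertainty(samples, weights=None):
    """Return 95% half-widths (with covariance, root-sum-square) for EEI.

    samples: array (n_models, 6), one row per model for a single year.
    weights: optional per-model weights (HYPOTHETICAL single weight set).
    """
    samples = np.asarray(samples, dtype=float)
    cov = np.cov(samples, rowvar=False, aweights=weights)
    var_full = max(float(SIGNS @ cov @ SIGNS), 0.0)  # c^T Sigma c
    var_rss = float(np.sum(np.diag(cov)))            # ignores off-diagonals
    return 1.96 * np.sqrt(var_full), 1.96 * np.sqrt(var_rss)
```

When components that enter with opposite signs are positively correlated across models (for example downward and upward fluxes of the same band), the off-diagonal terms cancel part of the variance and the first value is smaller than the second.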

Table S1 reveals that the two largest numerical values in the calculation of EEI are derived from longwave radiation components, indicating the impact of the warming effect of greenhouse gases. This aligns with the IPCC conclusion that the positive EEI primarily results from the increase in atmospheric greenhouse gases61. The optimal estimate for downward longwave radiation at the surface for the years 2000–2014 averages 345.5 Wm-², and for upward longwave radiation, it is 401.4 Wm-², both slightly higher than the IPCC AR6 estimates (342 Wm-² and 398 Wm-², respectively). In comparison, the contribution of the shortwave component to the energy imbalance seems slightly lesser in magnitude than that of the longwave component, with the best estimate for downward shortwave radiation for 2000–2014 averaging 185.8 ± 6.2 Wm-². Additionally, among the various radiative components, the shortwave radiation has the largest uncertainty range (Table S1). This study, utilizing the most precise observational baseline data available46, constrained the downward shortwave radiation, thereby further enhancing its accuracy. By applying the BMA method, this study obtained lower EEI levels (from 1.7 Wm-² to 0.7 Wm-²), which are comparable to that derived from the OHC observations (0.6 Wm-², Fig. 4).

Fig. 4: Time series for changes in EEI and OHC of the upper 2000 m.

The thick blue curve represents the BMA-fitted EEI results; the thick orange curve shows the mean of five different OHC datasets (Levitus et al. 2012, Cheng et al. 2017, Ishii et al. 2017, Good et al. 2013, Roemmich et al. 2009); the thick green curve denotes EEI results obtained by summing the increases in OHC, land heat content, heat used for melting fixed and floating ice, and heat for atmospheric warming (Schuckmann et al. 2023). The thick brown curve represents the EEI series obtained from the rate of change of OHC and the ocean absorption rate (Marti et al. 2022). Changes are obtained by calculating the first-order difference, and the change series have been smoothed with a 10-year moving average.

Rationality of the EEI variations

Figure 3 presents the optimal estimate time series of global EEI from 1961 to 2022 derived from CMIP6 models. The data after 2014 is spliced using the SSP245 scenario, as described in other studies62. Besides, Fig. 3 also presents three types of EEI estimates at TOA: 1) the simple ensemble average of the CMIP6 models’ output (TOA-SME); 2) the TOA results constrained by BMA; and 3) the CERES estimate. The BMA estimates for TOA are constrained using CERES EBAF data as observational data. From a long-term trend perspective, the TOA EEI results of the CMIP6 models (both SME and BMA results) exhibit a similar trend to the surface EEI. In terms of absolute EEI values, the SME results show significantly higher levels. This also corroborates findings that CMIP6 models tend to generally overestimate EEI24. By using CERES EBAF data for BMA constraint, a lower TOA EEI (blue solid line in Fig. 3) was also obtained. Since both the SME and BMA results are derived through multi-model ensembles, they are not able to reproduce inter-annual variations as the CERES results do. However, whether at the TOA or at the surface of Earth, the changes in EEI seem to be consistent with each other from the BMA-constrained results.

It is worth noting that the TOA EEI and surface EEI results from SME are very consistent, indicating good internal consistency within the CMIP6 models, given that these two quantities should be the same on a global scale. Although the BMA results for TOA EEI and surface EEI are similar in numerical level and closer to the observations, their interannual variations show some differences. The primary reason for this discrepancy is the varying quality of the observational data used for constraints. There is no good uniformity between the CERES EBAF data used for TOA constraints and the GEBA station data and improved partial convolutional neural network SSR data used for surface constraints. Therefore, it is difficult to obtain consistent EEI sequences for both the surface and TOA.

Whether at the TOA or the surface, compared to the SME results, the BMA EEI results show a higher rising trend during these decades (from 0.13 ± 0.05 Wm−2 decade−1 to 0.17 ± 0.09 Wm−2 decade−1 at the TOA, and from 0.13 ± 0.04 Wm−2 decade−1 to 0.26 ± 0.05 Wm−2 decade−1 at the surface). This partially confirms findings that coupled climate models tend to underestimate the increase in global radiation trends63. However, in this paper, the BMA results show a surface EEI trend of 0.26 ± 0.11 Wm−2 decade−1 (0.17 ± 0.09 Wm−2 decade−1 for the EEI trend at TOA) from 2000 to 2020, both still lower than the CERES result (0.38 ± 0.02 Wm−2 decade−1). This is consistent with findings from independent studies, such as the OHC acceleration, which has been estimated at 0.30 ± 0.28 Wm−2 decade−1 from 2002 to 2020 (refs. 64,65).
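For readers who want to reproduce a trend of this kind from an annual EEI series, the sketch below fits an ordinary least-squares line and reports the slope per decade with its standard error. How the trends and their ± ranges were computed in this study is not stated here, so the estimator, the use of a standard error, and the function name `decadal_trend` are assumptions.

```python
import numpy as np

def decadal_trend(years, values):
    """Return (trend, standard_error), both in units per decade.

    Ordinary least squares on annual values; ASSUMES independent residuals
    (no correction for autocorrelation).
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    x = years - years.mean()
    y = values - values.mean()
    sxx = np.sum(x ** 2)
    slope = np.sum(x * y) / sxx                       # units per year
    resid = y - slope * x
    se = np.sqrt(np.sum(resid ** 2) / (len(x) - 2) / sxx)
    return 10.0 * slope, 10.0 * se
```

If a 95% range is wanted rather than a standard error, the second value can be scaled accordingly; autocorrelated series would need a wider interval than this simple formula gives.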

Additionally, significant extreme values in the EEI series can be observed in certain years (Fig. 3). For example, extremely low values occurred in 1963, 1983, and 1991–1992, typically associated with extreme natural events such as volcanic eruptions. These volcanic activities released vast amounts of volcanic ash and sulfur dioxide into the atmosphere, leading to substantial absorption and reflection of solar radiation, ultimately causing a sharp decrease in EEI66.

The ocean, being the largest heat reservoir on Earth, plays a crucial role in absorbing and storing the excess energy from the Sun. Numerous studies have shown that most of the heat in the Earth’s energy budget due to the greenhouse effect is absorbed by the oceans59,67,68. Therefore, EEI should directly correspond with changes in OHC. Figure 4 displays the relationship between changes in OHC and EEI. Changes in OHC are obtained by first-order differencing of OHC data, followed by filtering with a 10-year moving average (essentially a 10-year differencing). Studies have shown that during the period from 1972 to 2005, CMIP5 simulations of ocean heat content (OHC) and Earth’s energy imbalance (EEI) tended to overestimate values compared to observational data. Prior to 2000, CMIP5 ensemble EEI estimates aligned well with observed OHC changes, but discrepancies appeared in the following years69,70. This paper shows that the EEI, constrained by BMA, has remained consistent with OHC changes since the 1960s (Fig. 4). The anomalous variations in the OHC time series before 1970 are likely due to measurement errors from early observation instruments, resulting in higher uncertainty. The moving average plot shows that both EEI and OHC values began to show a continuous increasing trend around the 1980s, reflecting their decadal-scale consistency. The average oceanic heat absorption estimated for OHC changes from 2000–2014 is 0.6 ± 0.13Wm-² (based on OHC ensemble mean series in Fig. 4), slightly lower than the average EEI value for 2000–2014 (0.70 ± 1.0Wm-²). OHC datasets (Figure S5) show a rapid rising trend since the 1990s, although some studies attribute this to the uncertainty in OHC measurement methods71,72. 
However, this paper shows that the decadal changes reflected in ocean heat absorption characteristics are largely consistent with EEI changes, with a correlation of 0.732 that is significant at the 99% confidence level. This result is consistent with findings from several other studies that have demonstrated the strong linkage between OHC and EEI11,64. Similarly, space geodetic observations can be used in sea-level budget assessments to determine the thermal expansion of the ocean, which yields estimates of OHC changes and, in turn, of EEI. Figure 3 presents the global EEI estimates obtained through space altimetry and space gravimetry58, showing good consistency. This indicates that the EEI and its component estimates calculated by CMIP6 under BMA observational constraints are quite satisfactory.

The possible link between the intensification of EEI and the recent rapid warming

From Fig. 3, it can be observed that EEI levels significantly increased after the 1990s. The average value from 1961 to 1994 was 0.22 Wm−2, which increases to 0.32 Wm−2 if the effects of major volcanic activities are excluded. From 1995 to 2022, the average value rises to 0.80 Wm−2 (see Figure S6). T-tests indicate a statistically significant change around 1995 at the 95% confidence level. This suggests a significant enhancement in EEI in the late 1990s. Furthermore, a similar t-test conducted on the GMST series also revealed a statistically significant change in surface temperatures around 1997–1998, indicating that the enhancement of EEI aligns with the rise in surface temperatures at the end of the 20th century. This phenomenon is consistent with the physical relationship described by \(\mathrm{EEI}=F-\alpha\,\Delta T\), and is supported by recent findings73. In addition, the upward trend in EEI also significantly accelerated after 1995, from 0.06 ± 0.12 Wm−2 decade−1 before 1995 (0.13 ± 0.18 Wm−2 decade−1 excluding major volcanic activities) to 0.21 ± 0.09 Wm−2 decade−1 after 1995. This also corroborates the accelerated rise in OHC after 1995 (ref. 6).

Further, based on the Earth system energy component dataset59, we observed a possible change in the proportion of the energy change caused by Ocean Heat Content (OHC) to the total energy of EEI around 1995. Before 1995, this proportion exhibited considerable fluctuations, which might suggest variability in OHC during this period (with substantial interannual variability), especially between 1980 and 1995, when the proportion of OHC to EEI was likely at a lower level. However, after 1995, there appears to be a shift in variability, with the OHC/EEI proportion stabilizing around 90% (Figure S7). While this change may correspond with variations in EEI estimates based on CMIP6 and global surface temperature records from the same period, the significant uncertainties surrounding these estimates make it difficult to draw definitive conclusions. The apparent stabilization of the OHC/EEI proportion could result from uncertainties in early ocean heat content observations, as well as external forcing factors (such as greenhouse gas emissions) potentially diluting changes caused by internal ocean variability. These factors contribute to more stable interannual variations in the OHC/EEI proportion, but the overall uncertainty remains too large to reach a conclusive determination. To verify the changes in the impact of internal ocean variability on climate, we examined the influence of El Niño-Southern Oscillation events on global mean surface temperature (GMST). Before the super El Niño event of 1997–1998, El Niño years generally corresponded to high-temperature years, and La Niña years corresponded to low-temperature years. However, after 1998, this pattern changed significantly: El Niño years still tended to be high-temperature years, but low-temperature extremes were not always observed during La Niña events, and some instances even showed high-temperature extremes (Figure S8). 
This transition is evident in the temperature changes between El Niño and La Niña years, indicating that the impact of internal ocean changes is being masked by stronger external forcing, reducing the correlation between oscillation and temperature changes (although further removal of the influence of external forcing changes may be needed here to see more clearly). These observations may reveal deeper connections between OHC and the global climate system, providing a crucial perspective for understanding the mechanisms of global climate change and predicting future trends.

Summary and discussion

The main findings of this study indicate that the CMIP6 models demonstrated certain capabilities in simulating radiation components. Although the BMA constraint is not always optimal for each component, it generally outperforms most individual datasets and significantly exceeds the results of simple averaging. The average EEI estimate for the period from 2000 to 2014, calculated after applying the BMA constraint to each radiation budget component, performs better than simple ensemble averaging and aligns closely with estimates from several international teams. This study also updates the uncertainty estimate for EEI, taking into account the interdependence of various radiation components. From 1961 to 2022, the primary contribution to the increasing surface EEI trend comes from longwave radiation, reflecting the impact of greenhouse gas warming. The energy balance analysis shows that shortwave components contribute more to the uncertainty, highlighting the importance of more precise shortwave observational data. Since the mid-1990s, both simple ensemble and BMA averages have shown a statistically significant increase in surface EEI. After 1995, both approaches indicate a higher EEI, consistent with more accurate TOA EEI estimates. In terms of decadal variations, the BMA-constrained EEI changes show high synchrony with OHC changes.

The increase in EEI is also reflected in similar estimates obtained through other methods. For instance, the proportion of OHC in EEI, calculated based on ocean, atmosphere, cryosphere, and terrestrial energy59,74, shows a noticeable increase and exhibits lower interannual variability. Additionally, global surface temperature anomalies tend to be higher at elevated EEI levels. Further analysis suggests that this significant increase in EEI may be linked to some recent abnormal climate warming phenomena, seemingly reducing the fluctuations in GMST caused by internal variability of the climate system (e.g., ENSO). Our research offers a perspective for understanding current and future climate change patterns: from the viewpoint of ultimate energy flows, recognizing the contribution of human activities to Earth system climate changes. This is a useful supplement to the prevailing research paradigm in climate numerical simulation (physical, chemical, and biological).

When calculating energy imbalance directly using observational datasets of surface radiation components, it is often difficult to achieve closure between the datasets, resulting in EEI absolute values and uncertainty ranges that are often too large51,56. In contrast, the CMIP6 model calculations of surface energy imbalance provide relatively reasonable absolute value levels and uncertainty ranges (for 2000–2014, the SME result is 1.7 ± 1.0 Wm−2, and the BMA result is 0.7 ± 1.0 Wm−2). Additionally, both SME EEI and BMA EEI show an increasing trend after 2000, which is consistent with the warming trend of the ocean64,75. The higher increasing trend of BMA EEI compared to SME EEI also indicates that the CMIP6 models tend to underestimate the increasing trend of EEI63.

It is worth noting that we also conducted a systematic review of the uncertainty in each component of the surface energy budget and in EEI estimates, which appears to have reduced the numerical estimate of uncertainty. However, models still face many challenges in simulating most of the components of the surface energy budget46. Apart from the surface shortwave solar radiation, other components still rely heavily on very sparse terrestrial in situ observational networks for constraint24,39,49. Although our EEI estimate and those based on the total energy estimate of the ocean, land, atmosphere, and surface are highly consistent in terms of trend changes, there are still certain differences in absolute values (Figs. 3 and 4). Therefore, the uncertainties in EEI and its components in our study should not be overlooked.

Data and Methods

Global Temperature Observational Data

China-MST2.0, also known as China-Merged Surface Temperature (China-MST or CMST), is a new global surface temperature dataset developed by a team at Sun Yat-sen University. It is created by merging China-LSAT (China Land Surface Air Temperature or C-LSAT)76 as the land component and ERSSTv5 (Extended Reconstructed Sea Surface Temperature, version 5)77 as the ocean component78. CMST-interim79 has been partially used by IPCC AR6. CMST2.0 includes three variants: CMST2.0-Nrec (no reconstruction), CMST2.0-Imax, and CMST2.0-Imin, which differ in how surface air temperature over Arctic sea ice areas is reconstructed. The most recent reconstruction, CMST2.0-Imax, achieves over 95% coverage in the Northern Hemisphere and serves as the primary baseline data for monitoring global temperature changes and trend estimation. The CMST2.0 dataset is currently available at http://www.gwpu.net.

Radiation Observational Data

Downward shortwave solar radiation is one of the most crucial elements and is the most abundantly observed at the Earth’s surface12,51,80. A recent high-quality observational dataset of global land surface solar radiation (SSR), excluding Antarctica, was developed by integrating all available surface solar radiation observations, including existing homogenized SSR results46. Its authors reconstructed a long-term (1955–2018) global land (excluding Antarctica) SSR anomaly dataset using an improved Partial Convolutional Neural Network deep learning method based on the 20th Century Reanalysis version 3 (20CRv3).

GEBA is an archive providing global, regional, and local energy balance data. It collects, verifies, and publishes a wide range of surface and atmospheric energy balance measurement data, crucial for understanding fundamental processes of the Earth’s climate system, validating climate models, and monitoring climate change. All energy fluxes stored in the GEBA database have undergone “physical reasonableness” checks, with random measurement errors of about 5% for monthly averages and about 2% for annual averages81. This study used data from 98 GEBA sites, with the site distribution shown in Figure S9.

For surface solar radiation (SSR) data, this study used a 5° × 5° resolution gridded dataset46 that integrates all available SSR observations, including existing homogenized SSR results. This dataset was derived using an improved Partial Convolutional Neural Network deep learning method, offering high reliability in filling and reconstructing missing values.

Ocean Heat Content Data

Ocean Heat Content data were derived from multiple datasets, including global average values of 0–2000 m ocean heat content from 1960 to 202367,68,82,83,84. Changes in ocean heat content were calculated using first-order differencing and smoothed by a ten-year moving average85.
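The differencing and smoothing steps described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the function names, the un-padded window handling, and the synthetic annual series are all assumptions made for the example.

```python
def first_difference(series):
    """Year-to-year change: d[i] = series[i + 1] - series[i]."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def moving_average(series, window=10):
    """Running mean over `window` consecutive values, with no padding at the ends."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


# Hypothetical annual ocean heat content values (arbitrary units),
# rising steadily by 10 units per year.
ohc = [10.0 * year for year in range(15)]

rate = first_difference(ohc)              # change per year, 14 values
smoothed = moving_average(rate, window=10)  # ten-year running mean, 5 values
```

Because no padding is applied, the smoothed series is shorter than the input by `window - 1` values; how the ends of the record are treated is a design choice that the text does not specify.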

This study assumes that the errors of the individual ocean heat content datasets are independent of one another. Under this assumption, the uncertainty range of the ocean heat content results follows from standard error propagation: the uncertainty of each dataset is divided by the number of datasets, the resulting values are squared and summed, and the square root of the sum is taken.
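The propagation rule just described amounts to the uncertainty of an unweighted mean of N independent estimates, sqrt(sum((sigma_i / N)^2)). The short sketch below shows the arithmetic; the function name and the sample uncertainties are hypothetical and are not values from the datasets used here.

```python
import math


def mean_uncertainty(sigmas):
    """Uncertainty of the mean of N independent estimates.

    Each dataset uncertainty is divided by N, squared, summed,
    and the square root of the sum is returned.
    """
    n = len(sigmas)
    if n == 0:
        raise ValueError("at least one uncertainty is required")
    return math.sqrt(sum((sigma / n) ** 2 for sigma in sigmas))


# Hypothetical per-dataset uncertainties (same units as the OHC change).
sigmas = [3.0, 4.0]
combined = mean_uncertainty(sigmas)
```

When all datasets carry the same uncertainty, the expression reduces to sigma / sqrt(N), the familiar standard error of the mean.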

Satellite Data

Satellite observation data were sourced from the CERES (Clouds and the Earth’s Radiant Energy System) Energy Balanced And Filled (EBAF) dataset, providing monthly and daily values of global top-of-atmosphere and surface radiation budgets. These data, derived from multiple CERES sources, have undergone several iterations and calibrations to ensure the highest data quality and accuracy50.

CMIP6 Model Data

The Coupled Model Intercomparison Project (CMIP) has evolved into a standard framework for assessing and comparing results from multiple climate models. The latest phase, the Sixth Phase (CMIP6), represents the most advanced model outputs, providing scientists with extensive data widely used for studying various issues from climate change to Earth system processes28. This paper selects monthly radiative component output from 34 models in CMIP6, covering the years 1865–2022 (Table S2). The data from 1865 to 2014 come from CMIP6’s “historical all-forcing simulation” experiments. For 2015 onwards, the SSP245 data are used to extend the series62. The variables include surface downward shortwave radiation, surface downward longwave radiation, surface upward shortwave radiation, surface upward longwave radiation, latent heat flux, and sensible heat flux. Because the models have different resolutions, bilinear interpolation is first used to interpolate the data from each model onto a common grid. Models often have multiple ensemble members with slightly different initial conditions. The research by Wild et al. has shown that selecting specific ensemble members is not crucial, as it has only a minor impact on the calculations for multi-model fitting24. Therefore, this study selected only one member from each model and then constrained it with observational data.
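The regridding step mentioned above can be illustrated with a small pure-Python bilinear interpolator. This is a sketch under simplifying assumptions, not the study's regridding code: the function name is hypothetical, the source grid is assumed rectilinear with ascending coordinates, and longitude wrap-around and polar points are ignored.

```python
import bisect


def bilinear(lats, lons, field, lat, lon):
    """Bilinearly interpolate field[i][j], defined on ascending lats[i] and
    lons[j], to the target point (lat, lon)."""
    if not (lats[0] <= lat <= lats[-1] and lons[0] <= lon <= lons[-1]):
        raise ValueError("target point lies outside the source grid")
    # Index of the lower-left corner of the enclosing grid cell.
    i = min(bisect.bisect_right(lats, lat) - 1, len(lats) - 2)
    j = min(bisect.bisect_right(lons, lon) - 1, len(lons) - 2)
    # Fractional position of the target inside that cell.
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both latitude rows, then along latitude.
    south = (1 - tx) * field[i][j] + tx * field[i][j + 1]
    north = (1 - tx) * field[i + 1][j] + tx * field[i + 1][j + 1]
    return (1 - ty) * south + ty * north


# Hypothetical 2 x 2 source grid and a target point at the cell centre.
lats = [0.0, 10.0]
lons = [0.0, 10.0]
field = [[0.0, 10.0],
         [20.0, 30.0]]
centre_value = bilinear(lats, lons, field, 5.0, 5.0)
```

Applying the function to every node of the target grid regrids one model field; a production workflow would normally use a dedicated regridding library that also handles the periodic longitude boundary.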

BMA Method

Differentiating among models is a key element of this study. Our goal is to reduce the impact of poorly performing models on the results through observational constraints, while assigning higher weights to models with superior predictions. A highly effective method for this purpose is Bayesian Model Averaging (BMA), a technique based on Bayesian statistics. Unlike traditional approaches, which either choose a single best model or take a simple average of multiple models, BMA balances and combines information from multiple models or hypotheses while retaining the uncertainty inherent in the multi-model approach.

The fundamental concept of BMA is to consider all possible models and assign each a weight proportional to how well the model matches the data. More specifically, this weight is the model’s posterior probability, which combines its prior probability with its likelihood given the data.
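The weighting idea can be made concrete with Bayes' rule over a discrete set of models, where each weight is proportional to prior times likelihood. The sketch below is only an illustration of that rule: the function name and the numerical priors and likelihoods are hypothetical, and it is not the estimation procedure used to obtain the weights in this study.

```python
def posterior_weights(priors, likelihoods):
    """Posterior model probabilities: w_k = prior_k * likelihood_k / sum over all models."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalized)
    if total <= 0:
        raise ValueError("at least one model needs a positive prior and likelihood")
    return [u / total for u in unnormalized]


# Three hypothetical models with equal priors and different likelihoods.
priors = [1 / 3, 1 / 3, 1 / 3]
likelihoods = [0.2, 0.5, 0.3]
weights = posterior_weights(priors, likelihoods)
```

With equal priors the weights are simply the normalized likelihoods, so the model that matches the data best receives the largest weight and the weights sum to one.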

For the target radiation component y, with observational data \({{{\rm{y}}}}^{{{\rm{t}}}}\) and model data \({M}_{1},{M}_{2},\ldots ,{M}_{K}\), the probability density function of BMA can be expressed as follows:

$${{\rm{p}}}\left({{\rm{y}}}|{M}_{1},{M}_{2},\cdots ,{M}_{K}\right)={\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{{{\rm{p}}}}_{{{\rm{k}}}}\left({{{\rm{M}}}}_{{{\rm{k}}}}|{{{\rm{y}}}}^{{{\rm{t}}}}\right){{{\rm{p}}}}_{{{\rm{k}}}}\left({{\rm{y}}}|{{{\rm{M}}}}_{{{\rm{k}}}}\right)$$
(2)

\({{{\rm{p}}}}_{{{\rm{k}}}}({{\rm{y}}}|{{{\rm{M}}}}_{{{\rm{k}}}})\) is the predictive probability distribution of y from each individual model, and \({{{\rm{p}}}}_{{{\rm{k}}}}({{{\rm{M}}}}_{{{\rm{k}}}}|{{{\rm{y}}}}^{{{\rm{t}}}})\) is the probability that this model is the optimal one given the observations, i.e., its posterior probability. We can express the posterior probability as a weight \({{{\rm{\omega }}}}_{{{\rm{k}}}}\), so that the density can be written as:

$${{\rm{p}}}\left({{\rm{y}}}|{M}_{1},{M}_{2},\cdots ,{M}_{K}\right)={\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{{{\rm{\omega }}}}_{{{\rm{k}}}}{{{\rm{p}}}}_{{{\rm{k}}}}\left({{\rm{y}}}|{{{\rm{M}}}}_{{{\rm{k}}}}\right)$$
(3)
$${\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{{{\rm{\omega }}}}_{{{\rm{k}}}}=1$$
(4)

For the target radiative component, it is assumed that its probability density function approximates a Gaussian distribution, which can be simply expressed as follows:

$${{{\rm{p}}}}_{{{\rm{k}}}}\left({{\rm{y}}}|{{{\rm{M}}}}_{{{\rm{k}}}},{{{\rm{y}}}}^{{{\rm{t}}}}\right) \sim {{\rm{N}}}({{{\rm{\mu }}}}_{{{\rm{k}}}},{{{\rm{\sigma }}}}_{{{\rm{k}}}}^{2})$$
(5)

Here, \({{{\rm{\mu }}}}_{{{\rm{k}}}}\) represents the mean, and \({{\rm{\sigma }}}\) is the standard deviation of the data. \({{{\rm{\mu }}}}_{{{\rm{k}}}}\) can be obtained through linear regression, while \({{{\rm{\sigma }}}}^{2}\) and \({{{\rm{\omega }}}}_{{{\rm{k}}}}\) are determined using the maximum likelihood method. Let s and t represent the spatial and temporal coordinates, respectively, and \({{{\rm{M}}}}_{{{\rm{kst}}}}\) be the prediction result of model k at \(({{\rm{s}}},{{\rm{t}}})\). Assuming that the forecast error is independent in both time and space, the log-likelihood function of the BMA model under given observational constraints can be written as follows:

$${{\rm{l}}}\left({{\rm{\theta }}}\right)={\sum}_{{{\rm{s}}},{{\rm{t}}}}\log {{\rm{p}}}\left({{\rm{y}}}|{M}_{1{{\rm{st}}}},{M}_{2{{\rm{st}}}},\cdots ,{M}_{K{{\rm{st}}}}\right)$$
(6)

The maximization of Equation (6) does not have an analytical solution, and common methods for solving it include the Expectation-Maximization (EM) method and the Markov Chain Monte Carlo method. In this study, the EM method is employed. EM is an iterative algorithm that alternates between an Expectation (E) step and a Maximization (M) step. The parameters are initialized as follows:

$${{\rm{j}}}=0,\,{{{\rm{\omega }}}}_{{{\rm{k}}}}^{({{\rm{j}}})}=\frac{1}{{{\rm{K}}}},\,{{{\rm{\sigma }}}}_{{{\rm{st}}}}^{2}=\frac{1}{{{\rm{K}}}}{\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{\left({\bar{{{\rm{\mu }}}}}_{{{\rm{st}}}}-{{{\rm{\mu }}}}_{{{\rm{k}}}{{\rm{st}}}}\right)}^{2}$$
(7)

E-step: Set \({{\rm{j}}}={{\rm{j}}}+1\), and then compute the log-likelihood function:

$${{{\rm{l}}}({{\rm{\theta }}})}^{({{\rm{j}}})}={\sum}_{{{\rm{s}}},{{\rm{t}}}}\log \left({\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{{{\rm{\omega }}}}_{{{\rm{k}}}}^{({{\rm{j}}})}{{{\rm{p}}}}_{{{\rm{k}}}}({{{\rm{\mu }}}}_{{{\rm{k}}}{{\rm{st}}}},{{{\rm{\sigma }}}}^{2({{\rm{j}}})})\right)$$
(8)

M-step: Update the weights and variance:

$${{{\rm{\omega }}}}_{{{\rm{k}}}}^{({{\rm{j}}})}=\frac{1}{{{\rm{N}}}}{\sum}_{{{\rm{s}}},{{\rm{t}}}}\frac{{{{\rm{\omega }}}}_{{{\rm{k}}}}^{({{\rm{j}}}-1)}{{{\rm{p}}}}_{{{\rm{k}}}}({{{\rm{\mu }}}}_{{{\rm{k}}}{{\rm{st}}}},{{{\rm{\sigma }}}}^{2({{\rm{j}}}-1)})}{{\sum }_{{{\rm{l}}}=1}^{{{\rm{K}}}}{{{\rm{\omega }}}}_{{{\rm{l}}}}^{({{\rm{j}}}-1)}{{{\rm{p}}}}_{{{\rm{l}}}}({{{\rm{\mu }}}}_{{{\rm{lst}}}},{{{\rm{\sigma }}}}^{2({{\rm{j}}}-1)})}\,$$
(9)
$${\bar{{{\rm{\mu }}}}}_{{{\rm{st}}}}=\,{\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{{{\rm{\omega }}}}_{{{\rm{k}}}}^{({{\rm{j}}})}{{{\rm{\mu }}}}_{{{\rm{k}}}{{\rm{st}}}}$$
(10)
$${{{\rm{\sigma }}}}_{{{\rm{st}}}}^{2({{\rm{j}}})}={\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{{{\rm{\omega }}}}_{{{\rm{k}}}}^{({{\rm{j}}})}{\left({\bar{{{\rm{\mu }}}}}_{{{\rm{st}}}}-{{{\rm{\mu }}}}_{{{\rm{k}}}{{\rm{st}}}}\right)}^{2}$$
(11)

Examine the change between \({{{\rm{l}}}({{\rm{\theta }}})}^{({{\rm{j}}})}\) and \({{{\rm{l}}}({{\rm{\theta }}})}^{({{\rm{j}}}-1)}\). If it is less than a predefined error limit \({{\rm{\varepsilon }}}\), stop the iteration. Otherwise, return to the E-step. Continue this process until convergence is achieved, after which the posterior probability \({{{\rm{\omega }}}}_{{{\rm{k}}}}\) and variance \({{{\rm{\sigma }}}}^{2}\) are obtained.
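The alternation between the two steps can be sketched in a few lines of Python. The sketch follows a commonly used EM formulation for BMA with a single shared variance, in which each Gaussian is evaluated at the observation and the variance update uses observation-minus-model residuals; this is an assumption and may differ in detail from Equations (7)–(11). The function names, tolerance, variance floor, and synthetic data are all illustrative.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def bma_em(obs, forecasts, tol=1e-8, max_iter=1000):
    """Estimate BMA weights and one shared variance by EM.

    obs: N observations; forecasts: K lists of N (bias-corrected) model values.
    Returns (weights, variance, log_likelihood).
    """
    K, N = len(forecasts), len(obs)
    weights = [1.0 / K] * K
    # Initial variance: mean squared residual over all models and points.
    var = sum((y - f) ** 2 for fk in forecasts for y, f in zip(obs, fk)) / (K * N)
    var = max(var, 1e-12)
    prev_ll, ll = -math.inf, -math.inf
    for _ in range(max_iter):
        # E-step: responsibility z[k][n] of model k for observation n,
        # plus the log-likelihood under the current parameters.
        z = [[0.0] * N for _ in range(K)]
        ll = 0.0
        for n in range(N):
            dens = [weights[k] * gaussian_pdf(obs[n], forecasts[k][n], var)
                    for k in range(K)]
            total = max(sum(dens), 1e-300)  # guard against numerical underflow
            ll += math.log(total)
            for k in range(K):
                z[k][n] = dens[k] / total
        if abs(ll - prev_ll) < tol:  # converged: log-likelihood stopped changing
            break
        prev_ll = ll
        # M-step: update the weights and the shared variance.
        weights = [sum(z[k]) / N for k in range(K)]
        var = sum(z[k][n] * (obs[n] - forecasts[k][n]) ** 2
                  for k in range(K) for n in range(N)) / N
        var = max(var, 1e-12)
    return weights, var, ll


# Synthetic example: three hypothetical models with increasing error amplitude.
obs = [0.1 * n for n in range(50)]
forecasts = [[y + amp * (-1) ** n for n, y in enumerate(obs)]
             for amp in (0.05, 0.3, 1.0)]
weights, variance, log_lik = bma_em(obs, forecasts)
```

In this synthetic case the model with the smallest residuals receives almost all of the weight. With real fields, the sum over `n` would run over all space-time points (s, t) that have observations.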

Uncertainty estimation for addition/subtraction of highly correlated variables

When calculating the uncertainty range of EEI, it is important to note that the radiative components are not independent of each other. If calculated as if they are independent distributions, the resulting uncertainty range could be excessively large51. For radiative components X and Y, their uncertainties are each calculated by 1.96*\({{\rm{\sigma }}}\). If they are independent of each other, then the standard deviation of their sum satisfies:

$${{\rm{\sigma }}}({{\rm{X}}}+{{\rm{Y}}})=\sqrt{{{{\rm{\sigma }}}({{\rm{X}}})}^{2}{+{{\rm{\sigma }}}({{\rm{Y}}})}^{2}}$$
(12)

Therefore, the total uncertainty is the root sum of squares (the sum in quadrature) of the uncertainties of the individual components. However, the situation becomes more complex when there is a correlation between X and Y. In this case, the variance of the sum of the two variables includes not only their individual variances but also the covariance between them, and the standard deviation of the sum satisfies:

$${{\rm{\sigma }}}({{\rm{X}}}+{{\rm{Y}}})=\sqrt{{{{\rm{\sigma }}}({{\rm{X}}})}^{2}{+{{\rm{\sigma }}}({{\rm{Y}}})}^{2}+2{{\rm{Cov}}}({{\rm{X}}},{{\rm{Y}}})}$$
(13)

Covariance measures the degree of similarity in the variation trends of two variables. It is positive when the variables are positively correlated and negative when they are negatively correlated. It is evident that when the covariance is zero, there is no correlation between the variables, and in this case, the formula for the standard deviation of their sum is the same as when they are independent. The relationship between the Pearson correlation coefficient \({{\rm{\rho }}}\) and covariance is as follows:

$${{\rm{Cov}}}({{\rm{X}}},{{\rm{Y}}})={{\rm{\rho }}}({{\rm{X}}},{{\rm{Y}}}) \, {{\rm{\cdot }}}{{\rm{\sigma }}}({{\rm{X}}}) \, {{\rm{\cdot }}}{{\rm{\sigma }}}({{\rm{Y}}})$$
(14)

When X and Y are completely positively correlated, the standard deviation of their sum reaches its maximum value \({{\rm{\sigma }}}({{\rm{X}}})+{{\rm{\sigma }}}({{\rm{Y}}})\), and when they are completely negatively correlated, it reaches its minimum value \(\left|{{\rm{\sigma }}}({{\rm{X}}})-{{\rm{\sigma }}}({{\rm{Y}}})\right|\). Considering that the sum of the various radiative components tends to zero, the correlation between these components should not be overlooked when calculating EEI. By accounting for these correlations when combining the uncertainty ranges, we are able to obtain a narrower range of uncertainty than previous estimates.
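Equations (12)–(14) can be checked numerically with a short sketch. The function name and the sample standard deviations below are hypothetical; the code only evaluates the formulas for a sum of two variables with a given Pearson correlation.

```python
import math


def sigma_of_sum(sigma_x, sigma_y, rho=0.0):
    """Standard deviation of X + Y for a given Pearson correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    cov = rho * sigma_x * sigma_y                   # Equation (14)
    var = sigma_x ** 2 + sigma_y ** 2 + 2.0 * cov   # Equation (13)
    return math.sqrt(max(var, 0.0))                 # clip tiny negative rounding errors


# Hypothetical component uncertainties.
independent = sigma_of_sum(3.0, 4.0, rho=0.0)     # Equation (12): sqrt(9 + 16)
fully_positive = sigma_of_sum(3.0, 4.0, rho=1.0)  # upper bound: 3 + 4
fully_negative = sigma_of_sum(3.0, 4.0, rho=-1.0) # lower bound: |3 - 4|
```

For a difference X − Y the covariance term enters with the opposite sign, so the same function can be reused by passing `-rho`.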