Abstract
The global fraction of anthropogenically emitted carbon dioxide (CO2) that stays in the atmosphere, the CO2 airborne fraction, has been fluctuating around a constant value over the period 1959 to 2022. The consensus estimate of the airborne fraction is around 44%. In this study, we show that the conventional estimator of the airborne fraction, based on a ratio of changes in atmospheric CO2 concentrations and CO2 emissions, suffers from a number of statistical deficiencies. We propose an alternative regression-based estimator of the airborne fraction that does not suffer from these deficiencies. Our empirical analysis leads to an estimate of the airborne fraction over 1959–2022 of 47.0% (± 1.1%; 1σ), implying a higher, and better constrained, estimate than the current consensus. Using climate model output, we show that a regression-based approach provides sensible estimates of the airborne fraction, also in future scenarios where emissions are at or near zero.
Introduction
The amount of anthropogenically emitted carbon dioxide (CO2) that stays in the atmosphere, the so-called airborne fraction (AF), is an important quantity for the study of CO2 absorption in the carbon cycle of the Earth system1,2,3. In the literature, it has been investigated and debated whether the AF has increased, decreased, or remained constant over the period from 1959 to today, during which atmospheric measurements of CO2 concentrations have been available. Earlier studies found evidence of an increasing AF4,5,6, even though measurement and estimation uncertainty make these findings statistically dubious7,8. Later studies suggest that the AF has remained constant at around 44%, and this has become the consensus view9,10,11,12. Raupach13 shows that the AF is given by a constant in a system where emissions follow an exponential trajectory, and the sink uptake is linear in atmospheric CO2 concentrations. Bennedsen et al.14 formalize such a system statistically, also allowing for linear growth of emissions on the more recent sample, and report a point estimate of the AF of 0.44.
Previous studies have analyzed the AF as the ratio of yearly changes in atmospheric CO2 concentrations (Gt, numerator) and anthropogenic CO2 emissions (Et, denominator)4,5,6,7,8,9,10,15,16,17,18. An alternative to this approach is to consider the cumulative airborne fraction (CAF)19,20,21, but since this approach is less commonly used in the literature and is less amenable to statistical analysis, we only briefly address it here. Instead, we follow the main body of the literature and adopt the conventional estimator of the AF, which is defined as the sample mean of the yearly ratio Gt/Et. This ratio-based estimator suffers from a number of statistical deficiencies due to its definition as the ratio of two stochastic processes. A particular concern is the presence of trends in the time series of yearly changes of atmospheric concentrations Gt and of yearly emissions, Et, which prohibits the application of a central limit theorem for the ratio-based estimator. Therefore, its limiting distribution may be non-Gaussian, unless a separate assumption of Gaussianity is imposed. Hence, confidence intervals and p-values for test statistics based on the Gaussian distribution may not be valid for the conventional ratio-based estimator. Another concern is the denominator in the ratio, Et, if it has a positive probability density at zero. In this case, the ratio-based estimator does not possess any moments, such that, for instance, the mean and variance of the estimator do not exist. Although the issue of Et being at or close to zero is of no concern during the historical period 1959–2022, it becomes important in future scenarios where CO2 emissions decrease, including scenarios consistent with “net-zero” CO2 emissions, which is a committed goal of the international community22,23. 
Future scenarios will likely result in a non-constant AF, either because of emissions trajectories departing from exponential growth or because of changing dynamics in the carbon sinks due to, e.g., saturation24,25 or climate feedback effects26. These departures can lead to a non-linear relationship between sink activity and atmospheric concentrations. Climate models have shown that the AF tends to increase in future high-emission scenarios and to decrease in low-emission scenarios19. Hence, analyzing the future AF as implied by output from climate models necessitates an approach that can accommodate a time-varying AF, also in cases when emissions are at or near zero. Such challenging issues with the conventional ratio-based analysis of the AF, when emissions decrease towards zero, have recently been encountered in a study of the future AF implied by output from a climate model18.
In this work, we propose a regression-based approach to estimating the CO2 airborne fraction and show that it is statistically superior to the conventional ratio-based approach. We first show that the time series of yearly changes in atmospheric CO2 concentrations, Gt, cointegrates with anthropogenic CO2 emissions, Et, over the period 1959–2022. This implies a statistically constant AF for the historical sample. On the basis of cointegration between Gt and Et, we prove a number of theoretical results concerning the ratio-based and regression-based estimators of the AF. These results formally establish the statistical deficiencies of the ratio-based estimator mentioned above and show that the regression-based estimator does not suffer from these defects. The theoretical analysis also shows that, under mild assumptions, the regression-based estimator converges at the fast rate of \({T}^{3/2}\), where T is the sample size, compared to the slower rate of T for the ratio-based estimator.
We apply the ratio-based and regression-based estimators of the AF to yearly data from the Global Carbon Project27 over the period 1959–2022, and we find that the regression-based estimator improves precision compared with the ratio-based estimator, in line with the theoretical results. Our best estimate of the AF over the period 1959–2022 is 47.0% with an associated standard error of 1.1%, which leads to a 95% confidence interval of [44.9%, 49.0%] for the AF. Using output from the reduced-complexity climate model MAGICC28, we illustrate the challenges in applying the ratio-based estimator of the AF to future low-emission scenarios. We further show that a regression-based approach can address these difficulties. When the underlying regression model can accommodate a time-varying AF, we can adopt the Kalman filter to analyze the dynamics of the AF in future scenarios output from climate models, including those compatible with “net-zero” or “net-negative” emissions goals.
Results
Atmospheric changes, emissions, and cointegration
Figure 1a, b show yearly changes in atmospheric CO2 (Gt) and yearly CO2 emissions from anthropogenic sources (Et), respectively. The black line in Fig. 1c shows the ratio of these two variables, Gt/Et. Data are obtained from the Global Carbon Project and cover the period 1959–2022 (Methods). The most conspicuous feature of the two time series, Gt and Et, is that they exhibit upward trends. Trending behavior is indicative of time series being non-stationary. A simple least-squares statistical analysis of the bivariate system (Gt, Et), where the non-stationarity of a time series is not accounted for, can yield invalid inference and should be avoided29. However, the notion of cointegration (see, for example, Chapter 19 in Hamilton30 for a textbook treatment) allows us to keep working with the trending time series Gt and Et while still obtaining valid statistical inference. Cointegration methods have been applied in earlier climate studies31,32. Informally, two time series are cointegrated if they share a common trend. Formally, the time series Gt and Et are said to be cointegrated when both Gt and Et are non-stationary, and the error term ut in the regression equation Gt = αEt + ut is stationary. We adopt the Dickey-Fuller test33 to determine whether a time series is non-stationary. The test statistic is for the null hypothesis of a unit root, that is, of having a unit value for the autoregressive dependence of Gt on its lagged value Gt−1 (and Et on Et−1). The results from this test strongly suggest that both time series are non-stationary (Supplementary Table 1). The null hypothesis of no cointegration (that is, ut is non-stationary) can be tested formally using the Engle-Granger test34, which is a Dickey-Fuller test on the residuals in the regression Gt = αEt + ut, adjusted for the fact that these residuals are not observed but must be estimated.
The null hypothesis of a unit root in ut is firmly rejected (Supplementary Table 1), and we can, therefore, conclude that the two yearly time series Gt and Et are cointegrated.
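The two-step logic of the residual-based test can be sketched on synthetic data. The series below are simulated, not the Global Carbon Project data, and every parameter value (drift, noise levels, the AF of 0.45) is an assumption chosen for illustration. The sketch computes only the Dickey-Fuller t-statistic without a constant or lagged differences; a formal Engle-Granger test would compare the residual statistic against its own non-standard critical values, which are not reproduced here.

```python
import numpy as np

def df_tstat(u):
    """Dickey-Fuller t-statistic for du_t = rho * u_{t-1} + e_t (no constant, no lags)."""
    du, lag = np.diff(u), u[:-1]
    rho = (lag @ du) / (lag @ lag)
    resid = du - rho * lag
    s2 = (resid @ resid) / (len(du) - 1)      # residual variance, one estimated parameter
    return rho / np.sqrt(s2 / (lag @ lag))

rng = np.random.default_rng(0)
T = 64
# Hypothetical emissions: random walk with drift (all values are illustrative assumptions)
E = 4.0 + 0.1 * np.arange(1, T + 1) + np.cumsum(rng.normal(0, 0.1, T))
# Hypothetical atmospheric changes, cointegrated with E by construction
G = 0.45 * E + rng.normal(0, 0.9, T)

alpha_hat = (E @ G) / (E @ E)   # step 1: least squares for G_t = alpha * E_t + u_t
u_hat = G - alpha_hat * E       # estimated residuals
stat_resid = df_tstat(u_hat)    # step 2: unit-root statistic on the residuals
stat_E = df_tstat(E)            # for comparison: the trending series itself
print(alpha_hat, stat_resid, stat_E)
```

A strongly negative statistic for the residuals, next to a statistic near or above zero for the trending series, is the pattern expected under cointegration.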
Fig. 1. a Atmospheric concentration changes (Gt). b Emissions (Et) data used in the study over the period 1959–2022. c Data (Gt/Et) in black, fit of (1) in blue (solid), fit of (3) in red (dashed), 95% confidence bands (shaded), “intercept” denotes the ratio-based estimate of the AF α. d Ratio-based estimated residuals (\({\hat{u}}_{t}\)) from (1) in blue (solid) and from (3) in red (dashed). e Data (Gt) in circles, the fit of (2) in blue (solid), the fit of (4) in red (dashed), 95% confidence bands (shaded), “slope” denotes the regression-based estimate of the AF α. f Regression-based estimated residuals (\({\hat{u}}_{t}\)) from (2) in blue (solid) and from (4) in red (dashed). Source data are provided as a Source Data file.
The cointegration analysis supports the hypothesis that the AF parameter α is constant during the period studied here (1959–2022). If the parameter α was changing in a specific direction, this would introduce a trend in the estimated residuals ut. The result from the Engle-Granger test shows that a trend is not present. This is confirmed graphically by the blue line in Fig. 1f and is in line with recent studies9,10,11,12. A Jarque-Bera test35 for normality of the estimated residuals for ut results in a p-value of 24%, implying that we cannot reject the null of ut having a Gaussian distribution.
Statistical properties of the ratio-based and regression-based AF estimators
We consider two approaches to estimating the AF α. These are the conventional ratio-based estimator, using equation \({G}_{t}/{E}_{t}=\alpha+{u}_{t}^{(1)}\), and the regression-based estimator, using equation \({G}_{t}=\alpha \,{E}_{t}+{u}_{t}^{(2)}\). Both estimators can be implemented using least-squares regression (Methods). In the case of independent data, it is known that the regression-based estimator is efficient, i.e., it has lower estimation uncertainty compared to, for example, the ratio-based estimator36,37. We study this property for the case of cointegrated non-stationary time series data, which the cointegration analysis summarized above shows is the relevant case for the AF.
We derive the asymptotic properties, that is, consistency and asymptotic normality, for both the ratio-based and the regression-based estimators (Supplementary Methods). These properties depend on the dynamics of CO2 emissions, which are well-described by a random walk process with drift over the sample period 1959–2022, i.e., Et = E0 + bt + xt, with initial value E0, drift coefficient b > 0, and random walk process xt, for t = 1, …, T, where T is the sample size (Supplementary Methods). We show that the regression-based estimator converges to the data-generating AF α at rate \({T}^{3/2}\) (Supplementary Prop. 1). The ratio-based estimator, on the other hand, also converges to the data-generating AF α but at the slower rate T (Supplementary Prop. 2). This implies that the estimation uncertainty in the regression-based estimator will decrease faster with increasing sample size than the ratio-based estimator, as is the case for independent data.
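The difference in convergence rates can be illustrated with a small Monte Carlo experiment. This is a sketch under assumed parameter values (E0, b, the noise levels, and the AF of 0.45 are all hypothetical), not a reproduction of the simulation study in the Supplementary Methods. It simulates emissions as a random walk with drift, generates atmospheric changes from the regression model with independent Gaussian errors, and compares the sampling standard deviation of the two estimators at two sample sizes.

```python
import numpy as np

def simulate_sd(T, reps=2000, alpha=0.45, E0=4.0, b=0.1, seed=0):
    """Monte Carlo standard deviations of the ratio-based and regression-based estimators."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1)
    x = np.cumsum(rng.normal(0, 0.05, (reps, T)), axis=1)   # random walk component x_t
    E = E0 + b * t + x                                      # emissions: drift plus random walk
    G = alpha * E + rng.normal(0, 0.9, (reps, T))           # regression model with iid errors
    ratio_est = (G / E).mean(axis=1)                        # sample mean of G_t / E_t
    reg_est = (E * G).sum(axis=1) / (E * E).sum(axis=1)     # least squares without intercept
    return ratio_est.std(), reg_est.std()

results = {T: simulate_sd(T) for T in (64, 256)}
for T, (sd_ratio, sd_reg) in results.items():
    print(f"T={T}: SD ratio-based={sd_ratio:.4f}, SD regression-based={sd_reg:.4f}")

# Factor by which each SD shrinks when the sample size is quadrupled
shrink_ratio = results[64][0] / results[256][0]
shrink_reg = results[64][1] / results[256][1]
print(shrink_ratio, shrink_reg)
```

In finite samples the shrink factors need not match the asymptotic values of 4 (rate T) and 8 (rate \({T}^{3/2}\)), because the initial level E0 still matters at these sample sizes; the qualitative ordering is what the sketch is meant to show.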
If the process Et has positive probability density at zero, then the ratio-based estimator does not have a finite mean or variance (Supplementary Prop. 3). This follows directly from the definition of the ratio-based estimator as the sample mean of Gt/Et: if values Et = 0 have positive probability in the sample space, then the ratio Gt/Et is not integrable on that space.
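The non-integrability argument can be written out in one line. The following is a sketch under simplifying assumptions that are ours, not necessarily those of Supplementary Prop. 3: the error \(u_t\) is independent of \(E_t\) with \(\mathbb{E}|u_t| > 0\), and \(E_t\) has a density \(f\) satisfying \(f(e)\ge c > 0\) on some interval \((-\varepsilon ,\varepsilon )\). Since \(G_t/E_t=\alpha+u_t/E_t\), it suffices to consider the second term:

```latex
\[
\mathbb{E}\left|\frac{u_t}{E_t}\right|
  = \mathbb{E}|u_t| \int_{-\infty}^{\infty} \frac{f(e)}{|e|}\,\mathrm{d}e
  \;\ge\; \mathbb{E}|u_t|\; c \int_{-\varepsilon}^{\varepsilon} \frac{\mathrm{d}e}{|e|}
  = \infty .
\]
```

The integral of \(1/|e|\) diverges at zero, so each term of the sample mean, and hence the sample mean itself, has no finite first moment under these assumptions.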
The model assumption Et = E0 + bt + xt for CO2 emissions, where xt is a random walk process of, for example, Gaussian increments, implies a positive probability density for Et = 0. However, for the sample period 1959–2022, the trend terms E0 + bt of Et are much larger in magnitude than the random walk term xt, and hence it is not unreasonable to assume that xt = 0 for theoretical purposes. In this case, the ratio-based estimator has standard statistical properties. In particular, it is an unbiased estimator of α, that is, the mean of the estimator equals α, and the variance of the estimator has a simple expression that can easily be estimated. However, we show that even in this case, a central limit theorem does not hold in general (Supplementary Prop. 4(i)). The ratio-based estimator has a limiting Gaussian distribution only if we additionally assume that ut is Gaussian (Supplementary Prop. 4(ii)). In contrast, the regression-based estimator follows a central limit theorem with a limiting Gaussian distribution, and the derivation does not require this additional assumption (Supplementary Prop. 1).
Although the theoretical results show that the regression-based estimator is asymptotically, i.e., for sufficiently large sample sizes, more precise than the ratio-based estimator, it is an empirical question of which estimator is more precise in finite samples. Next, we estimate the variances of the ratio-based and the regression-based estimators on the historical sample and compare their magnitudes.
Estimating the airborne fraction over 1959–2022
We use time series data on yearly changes in atmospheric CO2 (Gt), yearly CO2 emissions from fossil fuels (\({E}_{t}^{FF}\)), and yearly CO2 emissions from land-use and land cover change (\({E}_{t}^{LULCC}\)), for the sample 1959–2022. Total anthropogenic CO2 emissions are then \({E}_{t}= {E}_{t}^{FF}+{E}_{t}^{LULCC}\). The data series are measured in gigatonnes of carbon per year (GtC/yr), obtained from the Global Carbon Project (Methods) and presented in Fig. 1a, b.
The ratio-based estimate (\({\hat{\alpha }}_{1}\)) and the regression-based estimate (\({\hat{\alpha }}_{2}\)) are obtained from least-squares regressions applied to the equations \({G}_{t}/{E}_{t}=\alpha+{u}_{t}^{(1)}\) and \({G}_{t}=\alpha \,{E}_{t}+{u}_{t}^{(2)}\), respectively (Methods). The fits are shown in Fig. 1c, e, and the associated estimated residuals \({\hat{u}}_{t}\) are shown in Fig. 1d, f, all as blue lines. To account for possible serial correlation and heteroskedasticity in the model errors ut, we calculate standard errors using a heteroskedasticity and autocorrelation consistent (HAC) estimator38. The results are displayed in the first two columns of Table 1. The estimates largely agree on the magnitude of the AF, \({\hat{\alpha }}_{1}=43.86\%\) and \({\hat{\alpha }}_{2}=44.78\%\). However, the standard error of \({\hat{\alpha }}_{2}\) is 11% lower than the standard error of \({\hat{\alpha }}_{1}\), showing that the faster convergence rate of this estimator (\({T}^{3/2}\) versus T) outweighs the fact that the error process \({u}_{t}^{(2)}\) of the regression-based model has a larger variance than the error process \({u}_{t}^{(1)}\) of the ratio-based model. In particular, the estimated standard deviations (SDs) of these model errors are \(\widehat{SD}({u}_{t}^{(1)})=0.13\) and \(\widehat{SD}({u}_{t}^{(2)})=0.91\). The discrepancy is due to the different nature of the two models where \({u}_{t}^{(1)}={u}_{t}^{(2)}/{E}_{t}\), with Et ≫ 1 in the sample 1959–2022.
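For readers unfamiliar with HAC standard errors, a minimal sketch follows. The text only states that a HAC estimator is used; the Bartlett-kernel (Newey-West type) weights, the lag choice, the absence of a small-sample correction, and the synthetic AR(1) data below are all assumptions made for illustration, and the function name `hac_se` is hypothetical.

```python
import numpy as np

def hac_se(X, resid, lags):
    """Standard errors of least-squares coefficients from a Bartlett-kernel HAC covariance."""
    resid = np.asarray(resid, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(resid), -1)
    Z = X * resid[:, None]                   # score contributions x_t * u_t
    S = Z.T @ Z                              # lag-0 term (heteroskedasticity-robust part)
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)           # Bartlett weight, declining in the lag
        Gamma = Z[l:].T @ Z[:-l]             # lag-l cross products
        S += w * (Gamma + Gamma.T)
    bread = np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(bread @ S @ bread))

rng = np.random.default_rng(1)
T = 64
E = 4.0 + 0.1 * np.arange(1, T + 1)          # hypothetical emissions path
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.normal(0, 0.8)   # serially correlated errors (assumed AR(1))
G = 0.45 * E + u

alpha_hat = (E @ G) / (E @ E)                # regression-based estimate
resid = G - alpha_hat * E
se_white = hac_se(E, resid, lags=0)[0]       # no autocorrelation correction
se_hac = hac_se(E, resid, lags=4)[0]         # lag truncation of 4 is an arbitrary choice
print(alpha_hat, se_white, se_hac)
```

With `lags=0` the function reduces to the heteroskedasticity-robust sandwich formula; larger lag values add weighted autocovariance terms.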
By introducing covariates in the least-squares regressions, we can reduce the variance of the error processes ut and thus achieve more precise estimates of the AF α. For example, it is common practice in the literature to control for the effects of the El Niño-Southern Oscillation (ENSO) and volcanic activity (VAI)5,17. We follow this approach here (see Methods). The estimation results for the ratio-based estimator (\({\hat{\alpha }}_{3}\)) and the regression-based estimator (\({\hat{\alpha }}_{4}\)) of the AF when ENSO and VAI are included as covariates are presented in the third and fourth columns of Table 1. The corresponding regression fits are presented in Fig. 1c, e, and their associated estimated residuals \({\hat{u}}_{t}\) in Fig. 1d, f, all as red dashed lines. The results show that controlling for the effects of ENSO and volcanic activity increases the estimate of the AF considerably, resulting in \({\hat{\alpha }}_{3}=47.16\%\) and \({\hat{\alpha }}_{4}=46.97\%\). The estimates of the standard deviations of the error terms (\(\widehat{SD}({u}_{t}^{(3)})=0.09\) and \(\widehat{SD}({u}_{t}^{(4)})=0.63\)) decrease substantially compared to the models without covariates, indicating that the covariates ENSO and VAI explain much variation in the data. This is corroborated by the coefficient of determination (\({R}^{2}\)) values reported in Table 1. We note that the \({R}^{2}\) for the model in the first column equals zero by construction since this model only features an intercept. The decreased variances of the residuals from the models including covariates imply that their estimates of the constant AF α are more precise. The standard error of the regression-based estimate including covariates (\({\hat{\alpha }}_{4}\)) is approximately 16% lower than that of the ratio-based estimate including covariates (\({\hat{\alpha }}_{3}\)) and approximately 34% lower than that of the conventional ratio-based estimate excluding covariates (\({\hat{\alpha }}_{1}\)).
Our preferred estimate, obtained from the regression-based estimator including covariates (\({\hat{\alpha }}_{4}\)), results in an AF of 47.0% (± 1.1%; 1σ) with an associated 95% confidence interval of [44.9%, 49.0%]. The slightly increased AF estimates for the models with ENSO and VAI, compared to the models without covariates, confirm a similar finding in Betts et al.39.
The variability of the differences in CO2 emissions increased in the early 1990s (Supplementary Fig. 1). This is most likely due to increased variability of emission estimates from land-use and land-cover change starting in the early 1990s (Supplementary Methods and Supplementary Fig. 1). As a robustness check, the right panel of Table 1 presents the results for the more recent subsample 1992–2022; Supplementary Table 1 contains the corresponding coefficient estimates for ENSO and VAI. Our conclusions for the full sample are corroborated by the results for the recent sample. All estimates of the AF α from the subsample are within the respective confidence bands of the estimates from the full sample, while the reductions in uncertainty from including the covariates and from using the regression-based estimator are similar.
Estimating the airborne fraction over 2023–2100
The approximate constancy of the AF over the historical period 1959–2022, as documented in the literature and confirmed by the cointegration analysis in this study, can be understood as the result of a near-exponential growth in emissions and an approximately linear response of the carbon sinks to atmospheric concentrations9. In scenarios describing the future, for example, when emissions are declining, the AF is expected to depart from constancy and may vary over time18,19. This motivates the specification of a time-varying AF α = αt, with αt denoting the AF in year t, i.e., the fraction of emissions (Et) added to the atmosphere (Gt) in year t. This time-varying AF may also be estimated using a ratio-based approach and a regression-based approach (Methods).
To study the performance of the ratio-based (\({\hat{\alpha }}_{1,t}\)) and the regression-based (\({\hat{\alpha }}_{2,t}\)) estimators in situations where the AF is changing over time, we apply the two estimators to output from the MAGICC reduced-complexity climate model28. We let MAGICC produce future trajectories of Gt and Et for t = 2023, 2024, …, 2100 according to the Shared Socioeconomic Pathways (SSPs)40. Here we present the results from the so-called SSP1-2.6 scenario, which is a high mitigation scenario consistent with a forcing level of 2.6 Wm−2 in the year 210040. Results obtained from other SSP scenarios are similar to those reported below (Supplementary Figs. 5–9). Since MAGICC is a deterministic model without a stochastic representation of the climate variables, the trajectories of Gt and Et generated by MAGICC are very smooth. To obtain output that resembles climate data, we perturb the trajectories of Gt and Et by zero-mean Gaussian noise, where we set the variances equal to estimates obtained on the historical data. These simulated trajectories, together with the original output from MAGICC and the historical Global Carbon Project data 1959–2022, are shown in panels a and b of Fig. 2. Panel c presents the historical ratio Gt/Et over 1959–2022 and the ratio-based and regression-based estimates \({\hat{\alpha }}_{t}\) of the time-varying airborne fraction over 2023–2100.
Fig. 2. a Atmospheric concentration changes (Gt) for the historical period 1959–2022 (black) and the SSP period 2023–2100 (blue, magenta). b Emissions (Et) data. Magenta lines show the output from MAGICC; blue lines show the perturbed data. c Ratio of atmospheric changes to emissions (Gt/Et). The red line in (c) is the estimated fraction of emissions (Et) added to the atmosphere (Gt) in each year t, and, more specifically, it is the regression-based estimator \({\hat{\alpha }}_{2,t}\) of the time-varying airborne fraction αt, obtained from the Kalman smoother. The shaded area is a 95% confidence band around \({\hat{\alpha }}_{2,t}\). Source data are provided as a Source Data file.
The ratio-based estimate \({\hat{\alpha }}_{1,t}\) (blue) is a very noisy series, especially when Et ≈ 0. In contrast, the regression-based estimate \({\hat{\alpha }}_{2,t}\) (red), obtained from the Kalman filter (Methods), evolves over time in a stable fashion and shows sensible AF estimates, also when Et ≈ 0. A further benefit of the regression-based method is the availability of confidence intervals for \({\hat{\alpha }}_{2,t}\) (shaded red area), which are not immediately available for the ratio-based estimator \({\hat{\alpha }}_{1,t}\). Finally, the covariates for El Niño and volcanic activity can readily be incorporated into the regression-based framework with a time-varying AF αt.
In the SSP1-2.6 scenario studied here, the regression-based AF estimate \({\hat{\alpha }}_{2,t}\) remains roughly constant until 2050, after which it gradually declines toward zero. In 2060, the atmospheric changes turn negative, resulting in a negative estimate of the AF, meaning that the sink uptake exceeds the emissions. In 2077, the emissions turn negative as well, causing a switch to a positive estimate of the AF. The estimates of the AF exceed one from 2077 onwards, indicating that the sinks continue to absorb CO2 even in this regime with highly negative emissions. These findings can be contrasted with the analysis of the SSP1-1.9 scenario (Supplementary Fig. 5), which has a similar trajectory for the regression-based AF estimates \({\hat{\alpha }}_{2,t}\) as the SSP1-2.6 scenario. However, from 2080 onwards, the SSP1-1.9 scenario has \({\hat{\alpha }}_{2,t} \, < \, 1\), implying that the sinks turn into carbon sources (releasing more carbon dioxide than they absorb).
Discussion
Our empirical findings present a slightly higher AF than the consensus estimate of 44%11 and the cumulative airborne fraction CAFt = 44.4% obtained from the Global Carbon Project data (Supplementary Methods). The regression-based estimate of the AF, using the 1959–2022 sample of the Global Carbon Project data and controlling for El Niño and volcanic activity, is 47.0% (± 1.1%; 1σ), with a 95% confidence interval of [44.9%, 49.0%].
When El Niño and volcanic activity are excluded from the analysis, the estimate is 44.8% (± 1.4%; 1σ), which is more in line with the commonly reported results. When we apply the same analysis to two alternative data sets, we obtain slightly higher estimates of the AF than those from the Global Carbon Project (Supplementary Table 2). The more recent 1992–2022 subsample yields an AF estimate of approximately 46% (± 1.0%; 1σ) for the Global Carbon Project data (Table 1) and slightly higher estimates for the two alternative data sets (Supplementary Table 3). To account for possible measurement errors in Gt and Et, we report Deming regressions41, which are in line with the results reported so far (Supplementary Table 4). Therefore, we may conclude that measurement error is not driving our results.
To summarize the theoretical findings in our study, we conclude that the ratio-based estimator of the AF suffers from three main shortcomings. First, due to its definition as the ratio of changes in atmospheric concentrations to emissions, means and variances do not exist if zero emissions are possible. While this is of no concern on the historical sample, it is important when analyzing the AF in net-zero emissions scenarios. Studies of the past AF4,5,6,7,8,9,10,15,16,17 are most likely not influenced to any substantial degree by this issue, but studies of future low-emission scenarios are affected18. Second, stronger assumptions of Gaussianity on the distribution of the error process are necessary, compared to the case of the regression-based estimator, if a central limit theorem is to be invoked to compute confidence intervals and p-values based on the Gaussian distribution. Alternative methods to compute confidence intervals and p-values, such as the bootstrap42, can also be used for this purpose. Again, studies on historical data are most likely not strongly affected by our findings, as Fig. 1d suggests Gaussian residuals. Third, the ratio-based estimator converges to the data-generating AF at a slower rate than the regression-based estimator, even if zero emissions are ruled out, and errors are assumed to be normal. Both estimators converge faster than the common \(\sqrt{T}\) rate due to the non-stationarity of the two yearly time series variables, emissions (Et) and changes in atmospheric concentrations (Gt), and to their cointegration. The ratio-based estimator converges at rate T and the regression-based estimator at rate \({T}^{3/2}\).
The preferred regression-based estimator has standard statistical properties: its first and second moments exist, it is defined for zero emissions, and it converges to the data-generating AF at a fast rate. A central limit theorem applies without assuming the Gaussianity of the regression error, and confidence intervals and p-values can be computed in the usual way. Based on theoretical arguments, on a simulation study (Supplementary Methods and Supplementary Fig. 4), and on a historical sample of yearly data, we have shown that the regression-based estimator exhibits lower estimation uncertainty compared to the ratio-based estimator. Finally, we have argued that the regression-based estimator can readily be generalized to a time-varying AF specification with its estimation done by the Kalman filter and smoother.
Table 2 summarizes the statistical tests performed in this study. The main empirical findings from these tests are: (1) emissions and changes in atmospheric concentration are trending over the historical sample 1959–2022, (2) emissions and changes in atmospheric concentrations cointegrate on the historical sample 1959–2022 with a constant regression coefficient, motivating the model choice for our theoretical studies, (3) the regression errors appear Gaussian, (4) the findings of this paper are qualitatively the same on a subsample of the last 31 years, (5) measurement error in the two time series is not driving the results.
The main advantage of the regression-based approach for a historical sample analysis is increased precision. In future projections with emissions approaching zero, it remains valid, in contrast to a ratio-based approach. To illustrate this feature, we have simulated trajectories for emissions and changes in atmospheric concentrations over the period 2023–2100, which are consistent with SSP scenarios using the MAGICC reduced-complexity climate model. We regard these analyses as a first step and consider the use of climate projections from the Coupled Model Intercomparison Project (CMIP) as the next step in our research agenda.
Methods
Data used in the study
The time series data from the Global Carbon Project are available at https://www.icos-cp.eu/science-and-impact/global-carbon-budget/2023 (last accessed June 17, 2024). The variable \({E}_{t}^{FF}\) includes the cement carbonation sink, as described in Friedlingstein et al.27. VAI data for volcanic activity are obtained from Ammann et al.43. ENSO data are constructed from the Niño 3 SST Index of the National Oceanic and Atmospheric Administration (NOAA), available at https://psl.noaa.gov/gcos_wgsp/Timeseries/Data/nino3.long.anom.data (last accessed June 17, 2024). Specifically, we have converted monthly ENSO data into a yearly time series of September-August ENSO means44. This 4-month lag provides the best fit between Gt and ENSOt. The slight trend in yearly ENSO data is removed by taking deviations from a fitted linear trend so that it has no impact on the AF estimates. The data are shown in Supplementary Fig. 2. The data for the SSP scenarios can be run in MAGICC in a web browser via https://live.magicc.org/scenarios/bced417f-0f7f-4bb7-8359-792a0a8b0368/overview (last accessed June 17, 2024).
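The ENSO preprocessing described above (September-August yearly means followed by linear detrending) can be sketched as follows. The function names, the assumed input layout (one row per calendar year, twelve monthly columns), and the convention of labeling each window by the year in which it ends are illustrative assumptions; the monthly values below are random placeholders, not the NOAA index.

```python
import numpy as np

def sep_aug_means(monthly, first_year):
    """Means over September of year y-1 through August of year y, labeled by year y.

    monthly: array of shape (n_years, 12), rows are calendar years (Jan..Dec)
    starting at first_year. The labeling convention is an assumption.
    """
    monthly = np.asarray(monthly, dtype=float)
    n_years = monthly.shape[0]
    flat = monthly.ravel()
    # September of the first calendar year sits at flat index 8; each window spans 12 months
    windows = np.array([flat[8 + 12 * k: 20 + 12 * k] for k in range(n_years - 1)])
    years = first_year + 1 + np.arange(n_years - 1)
    return years, windows.mean(axis=1)

def detrend(years, y):
    """Deviations from a fitted linear trend (years are centered for numerical stability)."""
    x = years - years.mean()
    slope, intercept = np.polyfit(x, y, 1)
    return y - (intercept + slope * x)

rng = np.random.default_rng(3)
# Placeholder monthly anomalies for calendar years 1958..2022 with a slight upward trend
monthly = rng.normal(0, 0.8, (65, 12)) + 0.005 * np.arange(65)[:, None]
years, enso = sep_aug_means(monthly, first_year=1958)
enso_detrended = detrend(years, enso)
print(years[0], years[-1], enso_detrended.mean())
```

Starting the monthly input one calendar year before the sample (1958 here) is what allows a yearly value to be formed for 1959.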
Ratio-based and regression-based estimators of a constant airborne fraction
The ratio-based approach to estimating the AF takes its departure from the statistical model given by
\[{G}_{t}/{E}_{t}=\alpha+{u}_{t}^{(1)},\qquad (1)\]
where Gt are the yearly changes in atmospheric concentrations of CO2, Et are yearly CO2 emissions, the constant parameter α is the AF, and \({u}_{t}^{(1)}\) is the disturbance, modeled as a zero-mean error process, for t = 1, 2, …, T, with T denoting the number of yearly observations in the sample. The disturbance \({u}_{t}^{(1)}\) captures deviations of the data Gt/Et from the constant value α due to measurement errors and internal variability of the climate system. For the statistical model (1), it is straightforward to estimate the AF parameter α using the sample mean of the data Gt/Et, yielding the ratio-based estimator given by
\[{\hat{\alpha }}_{1}=\frac{1}{T}\sum_{t=1}^{T}\frac{{G}_{t}}{{E}_{t}}.\]
The model in equation (1) expresses that, on average, the fraction α of emissions Et remains in the atmosphere, resulting in atmospheric concentrations increasing by the amount Gt. An alternative way to express this association between Gt and Et is through the model formulation
\[{G}_{t}=\alpha \,{E}_{t}+{u}_{t}^{(2)},\qquad (2)\]
for t = 1, 2, …, T, where the disturbance \({u}_{t}^{(2)}\) is a zero-mean error process. A model closely related to (2) has previously been used to reconstruct and predict CO2 growth rates39,45. The relationship between the disturbances in equations (1) and (2) is given by \({u}_{t}^{(1)}={u}_{t}^{(2)}\,/\,{E}_{t}\). Cointegration of Gt and Et implies that \({u}_{t}^{(2)}\) is a stationary process. Then, the parameter α can be estimated directly using a simple least-squares calculation, yielding the regression-based estimator as given by
\[{\hat{\alpha }}_{2}=\frac{\sum_{t=1}^{T}{G}_{t}\,{E}_{t}}{\sum_{t=1}^{T}{E}_{t}^{2}}.\]
Including covariates
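We may control for the effects of El Niño (ENSOt) and volcanic activity (VAIt) by introducing pertinent data into the models (1) and (2). We thus consider the models
\[{G}_{t}/{E}_{t}=\alpha+{\tilde{\gamma }}_{1}\,{{\rm{ENSO}}}_{t}+{\tilde{\gamma }}_{2}\,{{\rm{VAI}}}_{t}+{u}_{t}^{(3)},\qquad (3)\]
\[{G}_{t}=\alpha \,{E}_{t}+{\gamma }_{1}\,{{\rm{ENSO}}}_{t}+{\gamma }_{2}\,{{\rm{VAI}}}_{t}+{u}_{t}^{(4)},\qquad (4)\]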
for t = 1, 2, …, T, where \({\tilde{\gamma }}_{i}\) and γi, for i = 1, 2, are regression coefficients, and the model errors \({u}_{t}^{(j)}\) follow a zero-mean error process, for j = 3, 4. For both models, the coefficients can be estimated using least-squares regression. We let \({\hat{\alpha }}_{3}\) and \({\hat{\alpha }}_{4}\) denote the least-squares estimators of α from models (3) and (4), respectively. The estimation results for the ENSO and VAI coefficients γi and \({\tilde{\gamma }}_{i}\), for i = 1, 2, are reported in Supplementary Table 2.
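The four constant-AF estimators can be computed with a few lines of least squares. The sketch below uses synthetic data; the emissions path, covariate series, coefficient values, and noise level are all hypothetical, and the data are generated from the regression-based specification with covariates, so the ratio-based fits are only approximations on this sample.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 64
E = 4.0 + 0.1 * np.arange(1, T + 1)                       # hypothetical emissions
enso = rng.normal(0, 1, T)                                # placeholder ENSO index
vai = np.abs(rng.normal(0, 1, T)) * (rng.random(T) < 0.1) # placeholder sparse volcanic index
G = 0.47 * E + 0.5 * enso - 1.0 * vai + rng.normal(0, 0.6, T)

# Ratio-based estimator: sample mean of G_t / E_t
alpha1 = np.mean(G / E)
# Regression-based estimator: least squares of G_t on E_t without intercept
alpha2 = (E @ G) / (E @ E)

# Ratio-based with covariates: regress G_t / E_t on an intercept, ENSO, and VAI
X3 = np.column_stack([np.ones(T), enso, vai])
coef3, *_ = np.linalg.lstsq(X3, G / E, rcond=None)
# Regression-based with covariates: regress G_t on E_t, ENSO, and VAI (no intercept)
X4 = np.column_stack([E, enso, vai])
coef4, *_ = np.linalg.lstsq(X4, G, rcond=None)

alpha3, alpha4 = coef3[0], coef4[0]   # AF is the intercept in the ratio model, the slope on E_t otherwise
print(alpha1, alpha2, alpha3, alpha4)
```

In the ratio-based fits the AF is read off the intercept, whereas in the regression-based fits it is the coefficient on emissions, matching the "intercept" and "slope" labels used in Fig. 1.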
Ratio-based and regression-based estimators of a time-varying airborne fraction
In the case of a time-varying AF αt, the ratio-based model can be written as
\[\frac{{G}_{t}}{{E}_{t}}={\alpha }_{t}+{u}_{t}^{(1)},\]
where αt is a yearly time-varying coefficient, typically specified as a random walk process. The ratio Gt/Et may be used to track the amount of emitted CO2 that remains airborne, and hence, the estimate \({\hat{\alpha }}_{1,t}={G}_{t}/{E}_{t}\) can be regarded as an appropriate but very noisy time-varying AF estimate. A possible way to reduce the noise in this AF estimate is to apply a local smoothing operation, e.g., a two-sided moving average filter. Filtered or not, the variability of the ratio Gt/Et will be amplified when future emissions Et start to approach zero. Another possible solution for noise reduction previously suggested in the literature is to use the cumulative AF (CAF) in place of the yearly AF19,20,21. However, due to its cumulative nature, the CAF can be slow to detect changes in the behavior of the carbon sinks, making it less useful for the purpose of analyzing a time-varying AF (Supplementary Fig. 3).
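The two noise-reduction devices mentioned above can be sketched as follows. This is an illustrative example under assumed inputs: the window length, the declining emissions path, and the function names `centered_moving_average` and `cumulative_airborne_fraction` are choices made here, and the CAF is taken to be the ratio of cumulative concentration changes to cumulative emissions.

```python
import numpy as np


def centered_moving_average(x, window):
    """Two-sided moving average with an odd window; NaN where the window is incomplete."""
    x = np.asarray(x, dtype=float)
    if window % 2 != 1 or window > len(x):
        raise ValueError("window must be odd and no longer than the series")
    half = window // 2
    out = np.full(len(x), np.nan)
    out[half:len(x) - half] = np.convolve(x, np.ones(window) / window, mode="valid")
    return out


def cumulative_airborne_fraction(G, E):
    """Cumulative AF: cumulative concentration changes over cumulative emissions."""
    return np.cumsum(G) / np.cumsum(E)


# Hypothetical scenario in which emissions decline towards zero.
rng = np.random.default_rng(2)
T = 40
E = np.linspace(10.0, 0.5, T)
G = 0.47 * E + rng.normal(0.0, 0.5, size=T)

yearly_af = G / E                                   # noisy yearly ratio
smoothed_af = centered_moving_average(yearly_af, 5)
caf = cumulative_airborne_fraction(G, E)

print("std of yearly ratio, first 10 years:", np.std(yearly_af[:10]))
print("std of yearly ratio, last 10 years: ", np.std(yearly_af[-10:]))
```

Because the disturbance in the ratio is divided by Et, its spread grows as Et shrinks, and smoothing only partly offsets this. The CAF is much smoother, but each new year carries little weight against the accumulated totals.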
When considering the regression-based model with a time-varying AF αt, we obtain
\[{G}_{t}={\alpha }_{t}{E}_{t}+{u}_{t}^{(2)}.\tag{5}\]
A versatile way of treating such a time-varying regression model is to assume random walk dynamics for αt and estimate it by means of a recursive regression filter, such as the Kalman filter and smoother46. This approach yields the minimum mean-squared error estimator \({\hat{\alpha }}_{2,t}\), and it does not suffer from the deficiencies of the time-varying ratio-based estimator \({\hat{\alpha }}_{1,t}\). In the regression-based model (5), αt is multiplied by Et. When emissions turn negative for the first time in some year t, i.e., Et < 0, we let αt be reflected around one. For this purpose, we adjust the random walk specification for αt at year t with a one-time instantaneous shift in αt when Et < 0 for the first time. Further motivation and technical detail on this procedure can be found in the Supplementary Methods.
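A minimal sketch of a scalar Kalman filter and fixed-interval smoother for a regression with a random-walk coefficient is given below. It is not the authors' implementation. It assumes Gaussian disturbances with known variances `sigma2_u` and `sigma2_eta` (in practice these would typically be estimated, for example by maximum likelihood), uses a large prior variance `p0` as a stand-in for a diffuse initialization, and omits the reflection adjustment for Et < 0. The function name `kalman_tv_regression` and all numerical values are hypothetical.

```python
import numpy as np


def kalman_tv_regression(G, E, sigma2_u, sigma2_eta, a0=0.0, p0=1e6):
    """Filter and smooth alpha_t in G_t = alpha_t * E_t + u_t,
    with alpha_{t+1} = alpha_t + eta_t (random walk)."""
    T = len(G)
    a_pred, p_pred = np.empty(T), np.empty(T)
    a_filt, p_filt = np.empty(T), np.empty(T)
    a, p = a0, p0
    for t in range(T):
        a_pred[t], p_pred[t] = a, p
        v = G[t] - E[t] * a               # one-step-ahead prediction error
        f = E[t] ** 2 * p + sigma2_u      # variance of the prediction error
        k = p * E[t] / f                  # Kalman gain
        a_filt[t] = a + k * v
        p_filt[t] = p * sigma2_u / f      # equals p * (1 - k * E[t]), numerically stabler
        a, p = a_filt[t], p_filt[t] + sigma2_eta   # random-walk prediction step
    # Backward (Rauch-Tung-Striebel) smoothing recursion.
    a_smooth, p_smooth = a_filt.copy(), p_filt.copy()
    for t in range(T - 2, -1, -1):
        c = p_filt[t] / p_pred[t + 1]
        a_smooth[t] = a_filt[t] + c * (a_smooth[t + 1] - a_pred[t + 1])
        p_smooth[t] = p_filt[t] + c ** 2 * (p_smooth[t + 1] - p_pred[t + 1])
    return a_filt, a_smooth, p_smooth


# Hypothetical data with a slowly drifting airborne fraction.
rng = np.random.default_rng(3)
T = 64
E = np.linspace(4.0, 11.0, T)
alpha_path = 0.45 + np.cumsum(rng.normal(0.0, 0.005, size=T))
G = alpha_path * E + rng.normal(0.0, 0.5, size=T)

a_filt, a_smooth, p_smooth = kalman_tv_regression(
    G, E, sigma2_u=0.5 ** 2, sigma2_eta=0.005 ** 2
)
print("smoothed alpha, first and last year:", a_smooth[0], a_smooth[-1])
```

Setting `sigma2_eta` to zero collapses the model to the constant-α regression, in which case the final filtered value is numerically very close to the least-squares estimate from the static model.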
Data availability
Data used in this study are publicly available and can be found at https://zenodo.org/records/13767769 (ref. 47). These data are also available in a Source Data file accompanying the paper. Source data are provided with this paper.
Code availability
MATLAB code for replication of all results in the main paper and Supplementary Information can be found at https://zenodo.org/records/13767769 (ref. 47).
References
Bacastow, R. & Keeling, C. D. Atmospheric carbon dioxide and radiocarbon in the natural cycle: II. Changes from A.D. 1700 to 2070 as deduced from a geochemical model. Brookhaven Symp. Biol. 24, 86–135 (1973).
Siegenthaler, U. & Oeschger, H. Predicting future atmospheric carbon dioxide levels. Science 199, 388–395 (1978).
Gloor, M., Sarmiento, J. L. & Gruber, N. What can be learned about carbon cycle climate feedbacks from the CO2 airborne fraction? Atmos. Chem. Phys. 10, 7739–7751 (2010).
Canadell, J. G. et al. Contributions to accelerating atmospheric CO2 growth from economic activity, carbon intensity, and efficiency of natural sinks. Proc. Natl. Acad. Sci. USA 104, 18866–18870 (2007).
Raupach, M. R., Canadell, J. G. & Le Quéré, C. Anthropogenic and biophysical contributions to increasing atmospheric CO2 growth rate and airborne fraction. Biogeosciences 5, 1601–1613 (2008).
Le Quéré, C. et al. Trends in the sources and sinks of carbon dioxide. Nat. Geosci. 2, 831–836 (2009).
Knorr, W. Is the airborne fraction of anthropogenic CO2 emissions increasing? Geophys. Res. Lett. 36, https://doi.org/10.1029/2009GL040613 (2009).
Ballantyne, A. P. et al. Audit of the global carbon budget: estimate errors and their impact on uptake uncertainty. Biogeosciences 12, 2565–2584 (2015).
Raupach, M. R. et al. The declining uptake rate of atmospheric CO2 by land and ocean sinks. Biogeosciences 11, 3453–3475 (2014).
Bennedsen, M., Hillebrand, E. & Koopman, S. J. Trend analysis of the airborne fraction and sink rate of anthropogenically released CO2. Biogeosciences 16, 3651–3663 (2019).
Canadell, J. G. et al. Global carbon and other biogeochemical cycles and feedbacks. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (Cambridge Univ. Press, 2021).
Bennedsen, M., Hillebrand, E. & Koopman, S. J. On the evidence of a trend in the CO2 airborne fraction. Nature 616, E1–E3 (2023).
Raupach, M. R. The exponential eigenmodes of the carbon-climate system, and their implications for ratios of responses to forcings. Earth Syst. Dynam. 4, 31–49 (2013).
Bennedsen, M., Hillebrand, E. & Koopman, S. J. A multivariate dynamic statistical model of the global carbon budget 1959–2020. J. R. Stat. Soc. A 186, 20–42 (2023).
Ballantyne, A. P. et al. Increase in observed net carbon dioxide uptake by land and oceans during the past 50 years. Nature 488, 70–72 (2012).
Keenan, T. F. et al. Recent pause in the growth rate of atmospheric CO2 due to enhanced terrestrial carbon uptake. Nat. Commun. 7, 1–10 (2016).
van Marle, M. J. E. et al. New land-use-change emissions suggest a declining CO2 airborne fraction. Nature 603, 450–454 (2022).
Pressburger, L. et al. Quantifying airborne fraction trends and the destination of anthropogenic CO2 by tracking carbon flows in a simple climate model. Environ. Res. Lett. 18, 5 (2023).
Jones, C. et al. Twenty-first-century compatible CO2 emissions and airborne fraction simulated by CMIP5 Earth system models under four representative concentration pathways. J. Clim. 26, 4398–4413 (2013).
Jones, C. D. et al. Simulating the Earth system response to negative emissions. Environ. Res. Lett. 11, 1–11 (2016).
Liddicoat, S. K. et al. Compatible fossil fuel CO2 emissions in the CMIP6 Earth system models’ historical and shared socioeconomic pathway experiments of the twenty-first century. J. Clim. 34, 2853–2875 (2021).
Rockström, J. et al. A roadmap for rapid decarbonization. Science 355, 1269–1271 (2017).
Riahi, K. et al. Mitigation pathways compatible with long-term goals. In IPCC, 2022: Climate Change 2022: Mitigation of Climate Change. Contribution of Working Group III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (Cambridge Univ. Press, 2022).
Le Quéré, C. et al. Saturation of the southern ocean CO2 sink due to recent climate change. Science 316, 1735–1738 (2007).
Canadell, J. G. et al. Saturation of the terrestrial carbon sink. In Terrestrial Ecosystems in a Changing World, 59–78 (Springer, 2007).
Friedlingstein, P. Carbon cycle feedbacks and future climate change. Philos. Trans. R. Soc. A 373, 20140421 (2015).
Friedlingstein, P. et al. Global carbon budget 2023. Earth Syst. Sci. Data 15, 5301–5369 (2023).
Meinshausen, M., Raper, S. C. B. & Wigley, T. M. L. Emulating coupled atmosphere-ocean and carbon cycle models with a simpler model, MAGICC6 – Part 1: Model description and calibration. Atmos. Chem. Phys. 11, 1417–1456 (2011).
Granger, C. W. J. & Newbold, P. Spurious regression in econometrics. J. Econom. 2, 111–120 (1974).
Hamilton, J. D. Time Series Analysis. (Princeton University Press, 1994).
Kaufmann, R. K. & Stern, D. I. Cointegration analysis of hemispheric temperature relations. J. Geophys. Res. Atmos. 107, D2 (2002).
Schmith, T., Johansen, S. & Thejll, P. Statistical analysis of global surface temperature and sea level using cointegration methods. J. Clim. 25, 7822–7833 (2012).
Dickey, D. A. & Fuller, W. A. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 74, 427–431 (1979).
Engle, R. F. & Granger, C. W. J. Co-integration and error correction: Representation, estimation, and testing. Econometrica 55, 251–276 (1987).
Jarque, C. M. & Bera, A. K. A test for normality of observations and regression residuals. Int. Stat. Rev. 55, 163–172 (1987).
Cochran, W. G. Sampling Techniques. (Wiley, 3rd edn, 1977).
Deng, L.-Y. & Wu, C. F. J. Estimation of variance of the regression estimator. J. Am. Stat. Assoc. 82, 568–576 (1987).
Newey, W. K. & West, K. D. A simple, positive semi-definite, heteroskedasticity and autocorrelation-consistent covariance matrix. Econometrica 55, 703–708 (1987).
Betts, R. A. et al. El Niño and a record CO2 rise. Nat. Clim. Change 6, 806–810 (2016).
Riahi, K. et al. The Shared Socioeconomic Pathways and their energy, land use, and greenhouse gas emissions implications: An overview. Glob. Environ. Change 42, 153–168 (2017).
Deming, W. E. Statistical Adjustments of Data. (Wiley, 1943).
Efron, B. & Tibshirani, R. An Introduction to the Bootstrap. (Chapman & Hall/CRC, 1993).
Ammann, C. M., Meehl, G. A., Washington, W. M. & Zender, C. S. A monthly and latitudinally varying volcanic forcing dataset in simulations of 20th century climate. Geophys. Res. Lett. 30, https://doi.org/10.1029/2003GL016875 (2003).
Jones, C. D. et al. The carbon cycle response to ENSO: A coupled climate–carbon cycle model study. J. Clim. 14, 4113–4129 (2001).
Jones, C. D. & Cox, P. M. On the significance of atmospheric CO2 growth rate anomalies in 2002–2003. Geophys. Res. Lett. 32, 14 (2005).
Durbin, J. & Koopman, S. J. Time Series Analysis by State Space Methods. (Oxford Univ. Press, 2nd edn, 2012).
Bennedsen, M., Hillebrand, E. & Koopman, S. J. A regression-based approach to the CO2 airborne fraction. Zenodo, https://doi.org/10.5281/zenodo.13767769 (2024).
Acknowledgements
We thank Morten Ø. Nielsen for helpful discussions regarding the convergence of stochastic processes. This work was supported by the Independent Research Fund Denmark (grant 0219-00001B to M.B.).
Author information
Contributions
M.B., E.H., and S.J.K. contributed equally to the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Chris Jones and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Bennedsen, M., Hillebrand, E. & Koopman, S.J. A regression-based approach to the CO2 airborne fraction. Nat Commun 15, 8507 (2024). https://doi.org/10.1038/s41467-024-52728-1
DOI: https://doi.org/10.1038/s41467-024-52728-1