Introduction

Over the past few decades, China has implemented nationwide ecological restoration (ER) programs1. Empirical evidence demonstrates their efficacy in mitigating land desertification and contributing to climate change adaptation through carbon sequestration2,3,4. However, these interventions have simultaneously induced measurable alterations in hydrological dynamics4,5 and the coupled carbon-water cycle6. While afforestation initiatives are recognized for enhancing terrestrial carbon sinks7, they concurrently reduce soil moisture (SM)8, surface runoff (R)9 and groundwater storage (GWS)10 through enhanced evapotranspiration (ET). Emerging evidence, however, suggests that ER may amplify precipitation (P) via vegetation-climate feedbacks, thereby partially offsetting water resource depletion11. The debate over ER’s net hydrological impacts therefore remains unresolved, raising critical concerns about water sustainability for both humans and ecosystems12, especially in water-limited dryland regions13,14.

To address this knowledge gap, there is an imperative need for comprehensive quantification and predictive modeling of terrestrial water storage (TWS) dynamics across pre- and post-ER implementation periods. TWS, defined as the integrated metric combining but not limited to SM and GWS15, serves as a critical indicator for evaluating regional water balance under anthropogenic-climatic interactions16. However, current research limitations related to TWS-ER intersections stem from three key factors: (1) fragmented analysis of isolated TWS components (e.g., SM17 or GWS18), (2) overreliance on isolated approaches (e.g., ground-based monitoring19, remote sensing-derived shallow soil water diagnostics20) with constrained spatiotemporal scales, (3) insufficient decoupling of climatic (e.g., precipitation variability21) and anthropogenic (e.g., ER-induced land cover changes22) drivers in earth system modeling frameworks23.

The evolution of satellite gravimetry, particularly through the Gravity Recovery and Climate Experiment (GRACE) and its successor mission GRACE Follow-On, has demonstrated great potential for quantifying TWS at large scales by precisely measuring monthly gravity field variations24,25. By integrating GRACE-derived TWS with physically constrained hydrological models, such as the Global Land Data Assimilation System (GLDAS), we decouple SM and GWS, thereby enabling component-level analysis of TWS dynamics26. Given that GRACE measurements are unavailable prior to April 200227,28, we develop a reconstruction methodology combining GRACE measurements, observation-verified hydroclimatic proxies, Global Climate Model outputs, and a machine learning algorithm. This approach facilitates the generation of a continuous monthly TWS dataset spanning 1987–2020, with extended projections through 2100 under standardized climate scenarios. To quantitatively apportion TWS changes between anthropogenic and climatic drivers, we implement a diagnostic framework incorporating the double-mass cumulative curve method, systematically isolating climate-forced TWS from direct human impacts.

The Mu Us Sandyland (MUS) and its surrounding regions, situated in a transitional zone between the Loess Plateau and the Ordos Plateau29, serve as a globally significant case study for revegetation (Fig. 1a). This designation stems from three distinguishing features: (1) the fragile ecosystem, historically degraded by climate variability and overgrazing prior to the 1999 Grain for Green Project5,29, the world’s largest active ER program30; (2) minor hydrological engineering interventions5, with ER, agricultural expansion and coal mining becoming dominant anthropogenic drivers since the late 20th century (Supplementary Fig. 1a, Supplementary Fig. 2b)5,29,31; and (3) satellite-derived normalized difference vegetation index (NDVI) data that demonstrate sustained vegetation greening following ER (Fig. 1e), concomitant with directly or indirectly altered TWS dynamics5,11,32. Zhao et al.5 reported ER-induced TWS depletion of 16.6 mm yr-1 in MUS, but subsequent studies suggest a potential overestimate due to unaccounted coal transportation effects29. Because the region is China’s premier coal production base, yielding 900 million tons (21% of national output) in 2020 and concentrated particularly in eastern MUS (Supplementary Fig. 1a), coal extraction induces mass loss that directly affects GRACE-based TWS estimations29,33,34. This necessitates rigorous separation of the coal mining impact on TWS through in-situ data and the water budget (WB) approach. In addition, climate change induces a notable warming and wetting trend in global arid and semi-arid regions35, especially in our study area. This trend is intensifying and is expected to continue in the future36. Specifically, fluctuations in precipitation influence the recharge and extraction of TWS, thus affecting the dynamics of water resources over both short-term and long-term periods26. Recent studies also suggest that revegetation efforts do not diminish water yield11,37, or even groundwater38, because of the increased precipitation in the revegetation area. Given the combined effects of anthropogenic activities and climate change, the future trends of TWS in MUS remain highly debatable.

Fig. 1: Dynamics of vegetation indices and TWS across different stages.

a Location of the study area (red polygon) within the Yellow River Basin (black polygon). b Long-term consistency between WB-derived TWS and GAM-predicted TWS over 1987–2020. c Temporal changes in major land-cover conversions (1992–2020) derived from ESA CCI land-cover datasets; the shaded region indicates the period of rapid change. d Detection of breakpoints in the NDVI series using Pettitt’s abrupt change test. Spatiotemporal distributions of NDVI (e) and TWS (f) linear trends during 2003–2020, with black dots indicating statistically significant trends (p < 0.05). g Relative contribution (%) of different hydrological components of TWS, including SM1, SM2 and GWS. Stage-specific trends of vegetation indices (h) and TWS (i) during Stage 1 (1987–1998), Stage 2 (2003–2010), and Stage 3 (2011–2020), with all displayed trends being statistically significant (p < 0.05). The shaded region in b indicates the 95% confidence interval. The error bars in h–i indicate 95% confidence intervals for the trends.

In this study, we propose an analytical framework to systematically investigate TWS dynamics and its key hydrological components in MUS across historical, contemporary, and projected timeframes. This study integrates multi-source datasets (satellite observations, hydrological modeling, and in-situ data) through sequential analyses: (1) characterizing the changes of long-term TWS and its key hydrological components; (2) analyzing the drivers of TWS resilience at post-ER period; and (3) evaluating the long-term trends of TWS and vegetation index under different Shared Socioeconomic Pathways (SSPs) scenarios.

Results

Long-term TWS reveal accelerated depletion to partial resilience dynamics

Our analysis demonstrates that coal mass loss has a significant effect (−4.1 mm yr-1) on GRACE-derived TWS in MUS during 2003–2020 (Supplementary Fig. 1b). When including coal mass loss correction, the GRACE-derived TWS shows a decline rate of −3.3 mm yr-1, closely matching the −3.5 mm yr-1 estimate from WB method. Without the correction, the rate nearly doubles to −7.4 mm yr-1. To validate these results, we compare GRACE-derived GWS with 126 in-situ groundwater observations - the only available ground truth. During 2019-2020, the corrected GWS trend (−23.0 mm yr-1) aligns better with observed rate (−17.8 mm yr-1) than the uncorrected −29.8 mm yr-1 trend (Supplementary Fig. 3). Based on these consistencies, we incorporate coal mass loss correction into GRACE data processing for subsequent MUS analyses.
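The effect of the correction on the 2003–2020 trend can be checked with simple arithmetic. The sketch below assumes that the correction amounts to removing the coal-induced apparent trend from the uncorrected GRACE trend; the variable names are illustrative and are not taken from the study's processing code.

```python
# Trends reported in the text for 2003-2020, in mm per year.
uncorrected_trend = -7.4   # GRACE-derived TWS without coal correction
coal_mass_effect = -4.1    # apparent trend caused by coal mass loss
wb_trend = -3.5            # independent water budget (WB) estimate

# Assumption: the correction subtracts the coal-induced apparent trend.
corrected_trend = round(uncorrected_trend - coal_mass_effect, 1)

# Absolute gap between each GRACE trend and the WB estimate.
gap_corrected = round(abs(corrected_trend - wb_trend), 1)
gap_uncorrected = round(abs(uncorrected_trend - wb_trend), 1)

print(corrected_trend, gap_corrected, gap_uncorrected)
```

Under this assumption, the corrected trend reproduces the reported −3.3 mm yr-1 and lies much closer to the WB estimate than the uncorrected trend does.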

The WB approach effectively captures the interannual variability of GRACE TWS, but persistent intra-annual fluctuations hinder long-term interrelationship analysis. To mitigate this limitation, we develop a Generalized Additive Model (GAM) trained and validated on monthly data from 2003 to 2020 (Methods). This model reduces intra-annual noise and reconstructs historical TWS from 1987 to 2002. Integrating WB-derived TWS into the reconstruction framework significantly enhances the performance compared to models relying solely on P, SM and temperature (T) - key variables influencing TWS (Supplementary Fig. 1c). Notably, the model achieves a test set R² of 0.74, demonstrating robustness against overfitting (Supplementary Fig. 1d). The proposed method minimizes discrepancies in trend and seasonal fluctuations between WB-derived and GRACE TWS. Furthermore, in contrast to the divergent patterns in the TWS derived by Li et al.39 and WaterGap Hydrology Model (WGHM) outputs during 1987–2002, our reconstructed TWS (18.8 mm yr-1) aligns closely with WB-derived data (19.9 mm yr-1), validating the efficacy of incorporating WB constraints into the reconstruction algorithm (Supplementary Fig. 4). From 1987-2002, TWS increases at a rate of 18.8 mm yr-1, primarily driven by favorable climatic conditions and limited human activities5 (Fig. 1b).

To characterize the multiphase dynamics of TWS, we segment the reconstructed long-term TWS time series into distinct phases. Based on the ESA CCI global annual land-cover maps, barren areas are rapidly restored to grassland from 1999 to 2002 (Fig. 1c). Pettitt’s abrupt change test on the NDVI time series identifies 2011 as a significant change point (Fig. 1d), and Leaf Area Index (LAI) analysis yields a similar result (Supplementary Fig. 5). Thus, the study period is divided into three stages: Stage 1 (1987–1998), the pre-ER period; Stage 2 (2003–2010), the post-ER period with slowly increasing NDVI; and Stage 3 (2011–2020), the post-ER period with increasing NDVI after the 2011 change point.
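For readers unfamiliar with Pettitt's test, a minimal sketch of the standard rank-based statistic is given below. It uses the common approximation p ≈ 2 exp(−6K²/(n³ + n²)); the function name and the synthetic series are illustrative assumptions, not the study's code or data.

```python
import math


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def pettitt_test(series):
    """Pettitt's change-point test.

    Returns (index, k_stat, p_value), where index is the 0-based position
    of the last observation before the most likely change point.
    """
    n = len(series)
    u = 0
    k_stat, k_index = 0, 0
    for t in range(n - 1):
        # Recursive update: U_t = U_{t-1} + sum_j sign(x_t - x_j).
        u += sum(sign(series[t] - xj) for xj in series)
        if abs(u) > k_stat:
            k_stat, k_index = abs(u), t
    # Approximate two-sided significance probability.
    p_value = min(1.0, 2.0 * math.exp(-6.0 * k_stat ** 2 / (n ** 3 + n ** 2)))
    return k_index, k_stat, p_value


# Synthetic example: a step increase after the tenth value.
before = [1.0, 1.1, 0.9, 1.2, 0.8, 1.05, 0.95, 1.15, 0.85, 1.0]
after = [3.0, 3.1, 2.9, 3.2, 2.8, 3.05, 2.95, 3.15, 2.85, 3.0]
index, k_stat, p_value = pettitt_test(before + after)
print(index, k_stat, p_value < 0.05)
```

Applied to an annual NDVI series, the returned index would be mapped back to the corresponding calendar year.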

During the post-ER period (Stage 2 and Stage 3), we compare the spatial distribution of vegetation indices (NDVI and LAI) with that of TWS. The results show that previously barren areas exhibit increased vegetation greenness, whereas TWS displays a significant declining trend across the entire MUS (p < 0.05; Fig. 1e, f, Supplementary Fig. 5b). Notably, regions with the most substantial vegetation growth in eastern MUS spatially coincide with the most pronounced TWS reductions, and this association is stronger than that with variations in topography or precipitation (Supplementary Fig. 6). Given that TWS encompasses both soil water (critical for shallow-rooted vegetation) and groundwater (vital for deep-rooted ecosystems), we quantitatively decompose TWS into three components40,41: surface soil moisture (SM1: 0–10 cm), subsurface soil moisture (SM2: 10–200 cm), and GWS (>200 cm) (Methods). Our component analysis reveals that GWS constitutes the predominant portion of TWS, accounting for 59%, as verified by Yang et al.42 (Fig. 1g).
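The decomposition can be illustrated as a simple residual calculation. The sketch below assumes that GWS is obtained by subtracting the two soil moisture layers from TWS, that all series are anomalies relative to the same baseline, and that other stores (snow, canopy and surface water) are negligible in this dryland setting; these are illustrative assumptions, and the function name is hypothetical.

```python
def groundwater_anomaly(tws, sm1, sm2):
    """Residual groundwater storage anomaly (mm) for each time step.

    Assumes tws, sm1 and sm2 are anomalies relative to a common baseline
    and that stores other than soil moisture and groundwater are negligible.
    """
    if not (len(tws) == len(sm1) == len(sm2)):
        raise ValueError("all input series must have the same length")
    return [t - a - b for t, a, b in zip(tws, sm1, sm2)]


# Illustrative monthly anomalies in mm (not study data).
tws = [10.0, -4.0, -12.0]
sm1 = [2.0, 1.0, -1.0]
sm2 = [3.0, -1.0, -3.0]
print(groundwater_anomaly(tws, sm1, sm2))
```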

To characterize the dynamic evolution of different water components across distinct temporal phases, we compare the temporal variations of the vegetation indices and TWS during the three stages. NDVI exhibits a statistically significant upward trend over time, with annual increases of 1.5% in Stage 3 and 0.3% in Stage 1 relative to the 1987–2020 mean (Fig. 1h). Satellite-derived LAI data further support this vegetation growth trend. In sharp contrast, TWS over the same 34 years does not follow a simple unidirectional trend like NDVI. Instead, TWS dynamics exhibit distinct phase-dependent characteristics. During Stage 1 (TWS growth phase), both SM2 and GWS exhibit significant increases (0.7 mm yr-1 and 22.7 mm yr-1, respectively; Fig. 1i). The situation reverses in Stage 2 (TWS decline phase) with accelerated depletion of SM2 (−2.7 mm yr-1) and GWS (−8.5 mm yr-1), indicating intensified terrestrial water loss. Stage 3 (reversal phase) marks a hydrological recovery: SM1 and SM2 transition to positive trajectories (0.2 mm yr-1 and 2.9 mm yr-1), while the GWS depletion rate moderates to −2.9 mm yr-1, suggesting groundwater system resilience. Overall, TWS decreased by 9.84 billion m³ in Stage 2 and then increased by 0.20 billion m³ in Stage 3 (Supplementary Table 1). This phased evolution underscores the sensitivity of TWS to environmental drivers, particularly the notable Stage 3 recovery demonstrating ecosystem restoration capacity.

Drivers of TWS resilience at post-ER period

To elucidate the mechanism underlying TWS variations in MUS, we analyze three-stage vegetation-water interactions. In Stage 2, GWS exhibits a significant negative correlation with NDVI (R = −0.37, Fig. 2a), suggesting vegetation growth accelerates groundwater depletion. Conversely, positive SM-NDVI correlations emerge in Stage 3 (SM1: 0.37; SM2: 0.58), indicating enhanced soil moisture supports both ecosystem recovery and TWS replenishment. Crucially, these patterns persist when controlling for precipitation and temperature effects (Fig. 2b, c), demonstrating that vegetation dynamics dominantly regulate the observed TWS diphasic trend: initial drawdown via groundwater uptake (Stage 2), followed by recovery through soil moisture-vegetation feedbacks (Stage 3).
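The partial correlations referred to above can be illustrated with the standard first-order formula, which removes the linear influence of one control variable (for example P or T) from the correlation between two series. This is a generic sketch with made-up numbers; it is not the study's code, and the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Illustrative values only: the control is uncorrelated with both series,
# so the partial correlation equals the ordinary correlation.
ndvi = [1.0, 2.0, 3.0, 4.0]
storage = [1.0, 3.0, 2.0, 4.0]
control = [1.0, -1.0, -1.0, 1.0]
print(pearson(ndvi, storage), partial_corr(ndvi, storage, control))
```

When the control variable explains part of the shared variance, the partial coefficient departs from the ordinary one, which is the comparison made between panels a and b, c of Fig. 2.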

Fig. 2: Interrelationship between NDVI and TWS across three distinct stages.

a Correlation coefficients between NDVI and TWS at different stages. b, c Partial correlation coefficients between NDVI and TWS when controlling for P or T, respectively. d–l NDVI anomalies under different hydrothermal conditions. The second to fourth rows correspond to Stage 1 (1987–1998), Stage 2 (2003–2010), and Stage 3 (2011–2020), respectively.

The nonlinear Granger causality analysis43 demonstrates bidirectional interactions between NDVI and TWS (Supplementary Fig. 7), prompting quantitative evaluation of their water-energy constraints under different hydrothermal conditions44 (Fig. 2d–l). Three distinct regimes emerge: (1) in temperature-limited conditions, NDVI increases with warming through enhanced photochemical processes; (2) under water stress, soil moisture deficit imposes dual limitations through suppressed stomatal regulation and attenuated enzymatic respiration45; (3) notably, in Stage 3, the observed NDVI increase under declining GWS and rising T reveals a paradoxical relationship: enhanced vegetation activity accelerates groundwater consumption, suggesting a self-limiting growth mechanism under water-energy co-constraints.

To determine the causal mechanisms of the initial decline and subsequent recovery in TWS, we quantify the relative contributions of anthropogenic versus climatic drivers using the double-mass cumulative curve method. Our analysis demonstrates that anthropogenic factors predominantly drive TWS depletion during Stage 2, accounting for 95.8%, 94.6% and 99.2% of the observed declines in SM1, SM2 and GWS, respectively (Fig. 3a–c). Among the anthropogenic factors, coal mining activities exhibit limited hydrological impact in MUS according to China’s “Cleaner Production Standard for Coal Mining and Processing” (water consumption ≤0.1 m3 per ton of raw coal). The resultant water loss, including coal washing, reaches merely 0.4 mm yr-1 at maximum production levels in MUS, constituting <13% of the total TWS decline. Furthermore, the majority of our study region resides within an endorheic basin5,38 (a closed hydrological system lacking ocean-bound runoff), resulting in limited water loss. Meanwhile, precipitation variation contributes minimally (<5% of the TWS decline), as evidenced by statistically insignificant differences between the Stage 2 and Stage 1 precipitation means (Fig. 3d). Based on the analysis above, ER and agricultural expansion constitute the predominant anthropogenic activities in this region (Fig. 1, Supplementary Fig. 2). Consequently, plant transpiration (Et) enhancement driven by these activities emerges as the dominant anthropogenic driver of TWS depletion40,46, with NDVI-Et correlation analysis revealing a strong correlation (R = 0.82, p < 0.01; Fig. 3e).
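One common formulation of double-mass curve attribution is sketched below. It assumes that a linear relation between cumulative P and the cumulative water variable is fitted over a baseline period, that this relation is extrapolated to the impact period as a climate-only expectation, and that the residual is assigned to human activity, with shares expressed through absolute values. The exact formulation used in this study may differ; all names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cumulative(values):
    """Running total of a series."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def double_mass_attribution(p_base, y_base, p_impact, y_impact):
    """Return (climate_share, human_share) in percent for the impact period."""
    # Baseline double-mass curve: cumulative variable vs cumulative P.
    slope, _ = fit_line(cumulative(p_base), cumulative(y_base))
    y_base_mean = sum(y_base) / len(y_base)
    y_obs_mean = sum(y_impact) / len(y_impact)
    # Climate-only expectation: baseline slope applied to impact-period P.
    y_sim_mean = slope * sum(p_impact) / len(p_impact)
    d_climate = y_sim_mean - y_base_mean
    d_human = y_obs_mean - y_sim_mean
    total = abs(d_climate) + abs(d_human)
    if total == 0:
        raise ValueError("no change between periods to attribute")
    return 100 * abs(d_climate) / total, 100 * abs(d_human) / total


# Hypothetical annual values (mm): a stable baseline relation, then a drop
# in the water variable without a matching drop in precipitation.
climate_share, human_share = double_mass_attribution(
    p_base=[400.0, 420.0, 380.0, 410.0],
    y_base=[40.0, 42.0, 38.0, 41.0],
    p_impact=[400.0, 400.0],
    y_impact=[30.0, 30.0],
)
print(round(climate_share, 1), round(human_share, 1))
```

In this synthetic case the human share dominates because precipitation barely changes between periods while the observed variable falls well below the baseline relation.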

Fig. 3: Quantitative attribution of climate variability and anthropogenic activities on TWS.

a–c Climate-anthropogenic attribution based on different cumulative TWS and cumulative P, where the blue line indicates climate impacts. d Climate forcing in MUS characterized by 34-year trends (1987–2020) in P and T, with stage-averaged values annotated (T1–T3, P1–P3). e Correlation between NDVI and plant transpiration (Et), demonstrating vegetation-climate feedback. f Variations of the transpiration-evapotranspiration ratio (Et/ET) and soil evaporation-evapotranspiration ratio (Eb/ET), with stage-averaged values annotated (Eb1–Eb3, Et1–Et3). Insets in a–c: Contribution percentage of climate-dominated (blue bar) versus anthropogenic-dominated (pink bar) factors on TWS during Stage 2 (2003–2010) and Stage 3 (2011–2020).

However, in Stage 3, climate change emerges as the dominant factor driving the increases in SM1 and SM2, contributing 53.6% and 67.0%, respectively (Fig. 3a, b). Notably, the increase in P offsets the decline in SM, as the interannual variability of P is a crucial determinant of soil moisture trends on a decadal scale47. Although human activities remain the primary driver of GWS decline in Stage 3, the influence of climate has gradually mitigated this decline, with a contribution of 30.9%. The primary cause of the observed increase in precipitation is related to vegetation-climate feedbacks11,37. The water consumption attributable to agricultural activities was quantified as −15.7 mm yr-1 and −25.9 mm yr-1 in Stage 2 and Stage 3, respectively (Supplementary Table 2), a pattern strongly associated with the ongoing expansion of agricultural planting areas across the region29 (covering approximately 8% of MUS by 2020, Supplementary Fig. 2b). However, the rate of TWS depletion in Stage 3 remained lower than that in Stage 2, suggesting that ER not only avoided exacerbating water storage loss but also helped slow the decline in water storage over the long term. This potential positive hydrologic effect is further supported by findings from regional climate simulations. Zhang et al. designed a pair of regional climate model scenarios (one representing the actual conditions with vegetation restoration and another assuming no such restoration) to isolate the net impact of ER on local precipitation processes over the Loess Plateau, including MUS48. Their simulations revealed that ER accounted for approximately 37.4% of the observed increase in precipitation rate, while changes in external moisture circulation contributed the remaining 62.6%. Moreover, the positive feedback between vegetation restoration and increased precipitation in this region has been corroborated by numerous studies11,37,49. Additionally, soil evaporation (Eb) is declining, and the intersection of the transpiration-evapotranspiration ratio (Et/ET) and the soil evaporation-evapotranspiration ratio (Eb/ET) after 2011 indicates a shift towards greater dominance of vegetation water consumption (Fig. 3f). Since 1999, the increase in ET rates has significantly raised water consumption, as evidenced by the rise in vegetation indices. However, the restored grassland may have higher water-use efficiency as vegetation grows50, which could reduce water consumption compared to the previous phases.

Future trends in TWS under SSP scenarios

Global change may lead to water limitations in ecosystems41. Nevertheless, in our studied region, changes in water resources are more closely associated with the interactions between ER and local hydrometeorological conditions11,37. Using four future Shared Socioeconomic Pathways from the Coupled Model Intercomparison Project phase 6 (CMIP6), our analysis shows that LAI is expected to increase by 30% during 2015–2100 compared to that during 1987–2014 (Fig. 4a). Meanwhile, precipitation will continue to increase and become the dominant factor, even with higher evaporation rates (Supplementary Fig. 8). This suggests that the negative impact of vegetation greening on TWS may be reversed, which differs from the results of related studies41. We use the GAM and the multi-model ensemble mean to predict the variations of GWS during 2015–2100. Despite the inherent uncertainties in Earth system models, the four SSP scenarios reach a robust consensus in hydrological response: substantial TWS increases across all scenarios (Fig. 4b–d). Even under the climate-resilient SSP1-2.6 scenario, SM1, SM2 and GWS are projected to increase by 51%, 325%, and 351%, respectively, within a credible range during 2015–2100 relative to the 1987–2014 baseline (Fig. 4f).
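The two summary operations used in these projections, a multi-model ensemble mean and a relative change against a baseline period, can be sketched as follows. The sketch assumes an unweighted ensemble mean and a relative change computed from period means; the values are invented for illustration and the function names are hypothetical.

```python
def ensemble_mean(model_series):
    """Unweighted mean across models for each time step.

    model_series is a list of equal-length lists, one per model.
    """
    return [sum(values) / len(values) for values in zip(*model_series)]


def relative_change_pct(future, baseline):
    """Percent change of the future-period mean relative to the baseline mean."""
    baseline_mean = sum(baseline) / len(baseline)
    future_mean = sum(future) / len(future)
    return 100.0 * (future_mean - baseline_mean) / abs(baseline_mean)


# Invented LAI-like values from two hypothetical models.
baseline = ensemble_mean([[0.9, 1.1], [1.1, 0.9]])
future = ensemble_mean([[1.2, 1.4], [1.4, 1.2]])
print(baseline, future, relative_change_pct(future, baseline))
```

Because the change is expressed relative to the baseline mean, variables with a small baseline magnitude can show very large percentage changes.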

Fig. 4: Projections of vegetation index and TWS in MUS.

Observations and multi-model ensemble means of (a) LAI, (b) SM1 and (c) SM2. d Observations and GAM model predicted GWS based on multi-model ensemble mean of GWS. e Relative changes (%) of P, ET, R, LAI, SM1, SM2, GWS during 2003-2020 compared to the 1987−1998 baseline period. f Projected relative changes (%) of P, ET, R, LAI, SM1, SM2, GWS for 2015-2100 compared to 1987-2014 baseline, based on CMIP6 multi-model simulations. The shaded areas in a-d represent the standard error of the multi-model ensemble data. The icons in e-f are from IAN Image Library (https://ian.umces.edu/imagelibrary/), which are available under a CC-BY 4.0 license (https://creativecommons.org/licenses/by/4.0).

These increasing trends appear plausible through vegetation-climate feedbacks: although elevated atmospheric CO2 concentrations enhance vegetation productivity and subsequently increase ET (Supplementary Fig. 8), precipitation exhibits more pronounced intensification under projected climate regimes11. This synergistic interaction ultimately drives significant enhancement of TWS and its constituent hydrological components. Notably, these upward trends may be further amplified by accelerating precipitation intensification, as indicated by the nonlinear amplification of SM1, SM2, and GWS trajectories during 2015–2100.

The uncertainties inherent in model projections could affect the magnitude of TWS trends, but do not fundamentally compromise their directionality. Observational records consistently demonstrate co-occurring upward trajectories in vegetation indices and TWS across divergent emission scenarios (Supplementary Fig. 9). As previously discussed, climate change impacts on TWS are intensifying. Specifically, projections indicate that increased precipitation induced by ER projects will drive more pronounced runoff growth11, while the corresponding ET increase is expected to remain more moderate51 (Supplementary Fig. 8). This study not only captures the projected TWS variability induced by ER projects, but also confirms that enhanced vegetation-mediated hydrological processes driven by ER contribute positively to TWS increases.

Discussion

As a cornerstone of environmental conservation, ecological restoration is crucial for enhancing ecosystem resilience and securing sustainable development. Focusing on the MUS - a hotspot of ER practices, we develop an integrated framework to explore long-term TWS dynamics and its drivers through satellite remote sensing, hydrological modeling, and field measurement. The findings provide potential insights for optimizing current revegetation strategies and guiding future restoration efforts.

Firstly, we calculate long-term TWS, revealing an increasing trend during 1987–2002 followed by a decline during 2003–2020. These results are consistent with groundwater records from the Yellow River Resource Bulletins and Wang et al.22. The declining trend in GWS (−4.2 mm yr-1) aligns closely with the rate derived from ground-based measurements of shallow aquifers (−2.9 mm yr-1), as reported in YRCC (2003–2020). Notably, the pre-ER trend in our study (22.7 mm yr-1) substantially exceeds the LPJ-GUESS model’s simulated rate (3.4 mm yr-1), which quantifies ecohydrological responses to climate and atmospheric drivers5. This discrepancy can be attributed to the LPJ-GUESS model’s limited soil water storage capacity (1.5 m depth) compared to the region’s actual soil layer thickness (30–80 m), which stores most infiltrated water52. Furthermore, TWS exhibits rapid depletion during 2003–2010, contrasting sharply with the increasing period of 1987–1998 and the reversal period of 2011–2020. This pattern aligns with field observations in Shenmu County (MUS region), where total soil moisture in the upper 4 m decreases at −5.1 mm yr-1 during 2004–201253.

Secondly, we find that during the reversal phase of the post-ER period (Stage 3: 2011–2020), soil moisture exhibits a positive trend, and groundwater depletion slows compared to the decline phase (Stage 2: 2003–2010). This reversal in the declining hydrological components ultimately halts the TWS decline. These findings contrast with previous studies5. We then analyze the drivers of TWS resilience in the post-ER period via the double-mass cumulative curve method. Results reveal that large-scale artificial forests increase ET and aggravate water shortage during Stage 2 (2003–2010), while the increasing precipitation due to vegetation-climate feedbacks is pivotal in driving TWS recovery during Stage 3 (2011–2020). Compared to naturally restored vegetation, which can adjust water-use strategies, plantation forests (higher NDVI trend) typically consist of fast-growing tree species with short rotation cycles54, resulting in significantly higher water consumption during the initial phase of ER (Supplementary Fig. 10). Furthermore, agricultural irrigation is a key contributor to the decline in TWS. However, a recent study based on 4926 observations in the Loess Plateau from 2011–2023 demonstrates that SM at 0–10 m depth increases primarily due to rising P and a decreasing PET-P trend22, further validating our findings. In addition, we assessed ecosystem resilience in response to drought. For instance, the sharp decline in NDVI during the drought of 2005 highlights the ecosystem’s initial vulnerability to climatic extremes. Nevertheless, NDVI rapidly rebounded to pre-drought levels by 2007, indicating a considerable capacity for resilience and recovery following drought events in these restored ecosystems (Supplementary Fig. 11).

Lastly, we project the future trends in vegetation index and TWS across different SSP scenarios. While climate projections contain inherent uncertainties, all four emission pathways suggest consistent upward trajectories for both TWS and vegetation index, indicating improved prospects for regional water sustainability. These projected TWS increases contrast with previous studies reporting persistent declines in TWS due to ER55. These discrepancies may stem from insufficient consideration of increasing P driven by vegetation-climate feedbacks, which emerges as the primary mechanism mitigating unfavorable ER impacts11. Given the complex mechanisms governing vegetation-hydrology interactions, future studies should prioritize process-based modeling frameworks integrating multiple datasets to establish robust causal relationships. These findings highlight the dual role of ER and climate change in shaping regional water resource dynamics. Therefore, we recommend that policymakers implement integrated and adaptive management strategies to simultaneously achieve carbon sequestration, biodiversity conservation, and water security, thereby future-proofing vital resources against climate uncertainty.

The analytical framework proposed here can be extended to regions undergoing large-scale revegetation efforts, offering a more precise assessment of long-term TWS dynamics globally. Despite these critical insights, several caveats should be acknowledged when interpreting our findings. First, the accuracy of TWS reconstruction before 2002 is constrained by input data quality. Uncertainties in P, ET and R may propagate into the water budget analysis. Second, the spatial resolution of GRACE data limits fine-scale analyses of water dynamics. Third, while critical to water dynamics, the effects of atmospheric CO2, aerosols, and greenhouse gas emissions - alongside other climatic and anthropogenic drivers - are not explicitly addressed in our analysis. Future studies should integrate coupled hydrological-vegetation models to isolate the effects of climatic variability and anthropogenic pressures on TWS dynamics under diverse climate scenarios, for instance, by setting precipitation thresholds (e.g., 300 mm) to stratify analyses across moisture gradients11,54. In summary, our findings demonstrate that ER can support sustainable water resource management through adaptive strategies to mitigate climate variability. This challenges the prevailing assumption that ER inevitably depletes TWS. As global ecological restoration efforts intensify, these insights offer actionable pathways to reconcile vegetation growth with water sustainability in water-scarce dryland ecosystems.

Methods

The methodological flowchart of this study is presented in Supplementary Fig. 12, illustrating the integration of data processing, model calibration, and attribution analysis phases.

TWS data

The GRACE analysis centers provide multiple products, including spherical harmonic and mascon solutions56. Notably, mascon solutions exhibit fewer leakage errors and better distinguish between land and ocean signals compared to spherical harmonic solutions15. This study utilizes GRACE and GRACE-FO TWS derived from the Jet Propulsion Laboratory (JPL) R06 mascon solution spanning 2003–2020. While the nominal spatial resolution is 0.5° × 0.5°, the effective resolution approximates 3° × 3° (~90,000 km2). Our study region covers approximately 110,000 km2, exceeding GRACE’s effective resolution (300 km × 300 km) and thus preserving its native resolution. Missing monthly data are reconstructed using the method developed by Yi et al.57.

The WGHM is a comprehensive hydrological model that simulates terrestrial water storage dynamics across all continents except Antarctica58. Its standard output provides monthly TWS estimates at 0.5° resolution, developed independently from GRACE observations. Meanwhile, for extended temporal coverage (1979–2020), we incorporate the TWS reconstruction dataset developed by Li et al.39 using statistical methods. Furthermore, hydrological data from GLDAS version 2.0 (1987–2014) and version 2.1 (2000–present) are employed, providing 0.25°-resolution SM estimates across four soil depth layers (0–200 cm). Notably, both GLDAS versions exclude GRACE data assimilation. To ensure temporal consistency, we harmonize GLDAS 2.0 data with GLDAS 2.1 references using a GAM fitted over their overlapping period (2000–2014). All TWS datasets are spatially resampled to a 0.25° grid.

Vegetation indices data

The NDVI, a key indicator of photosynthesis and evapotranspiration (range: −1 to 1), quantifies vegetation density. We utilize two NDVI sources: the Moderate Resolution Imaging Spectroradiometer (MODIS) at 0.01° resolution (2001–present), and the latest Global Inventory Modeling and Mapping Studies 3rd generation (GIMMS-3g) at 0.083° resolution (1987–2015). To ensure temporal consistency of the NDVI records, we use a GAM to harmonize GIMMS-3g NDVI values with MODIS reference values during their overlapping period (2002–2015). To enhance reliability, LAI data from the Global Land Surface Satellites (GLASS) initiative are incorporated, combining MODIS (2001–2020) and Advanced Very High Resolution Radiometer (AVHRR, 1987–2015) records. All remote sensing indices are temporally aggregated to monthly resolution and spatially resampled to a 0.25° grid.

Hydrological fluxes data

We employ the WB approach to estimate pre-ER TWS using the P, ET and R components. Monthly precipitation data (0.25° resolution) are obtained from the CN05 dataset, which integrates observations from 2400+ Chinese meteorological stations29. The Global Land Evaporation Amsterdam Model (GLEAM) provides 0.25° ET estimates59, while the Global Runoff Reconstruction (GRUN) supplies 0.5° runoff data60. Ground validation uses: 1) six rain gauges and the Baijiachuan runoff station (Zhao et al.5), and 2) ChinaFLUX ET measurements (Supplementary Table 3). Gridded P and R show strong correlations with in-situ measurements (r = 0.95 and 0.62, respectively; Supplementary Fig. 13). GLEAM ET demonstrates high accuracy against 12 flux towers (Supplementary Fig. 14). Moreover, we compare the GLEAM dataset with the PML-V2 dataset61,62 (2000-present), a product validated using ChinaFLUX tower observations, and identify a strong correlation between them (Supplementary Fig. 15), supporting the reliability of the GLEAM data in this study.

CMIP6 Earth system model outputs

We utilize six CMIP6 earth system models (Supplementary Table 4) providing monthly variables critical for vegetation and TWS predictions: LAI, P, ET, R, SM1, SM2, and T. Model outputs span historical (1850-2014) and future (2015-2100) periods, with 1987-2014 serving as the reference baseline due to observational availability. Four SSPs are examined: SSP1-2.6, SSP2-4.5, SSP3-7.0 and SSP5-8.5, corresponding to 2100 radiative forcing levels of 2.6, 4.5, 7.0, and 8.5 W m−2, respectively.

To mitigate inter-model discrepancies in CMIP6 simulations, we use the multi-model ensemble mean. A quantile mapping (QM) approach is implemented to correct the systematic bias between the CMIP6 outputs and observational references63, aligning the statistical characteristics of the model output with those of the observational data. The QM bias correction involves two steps: first, cumulative distribution functions (CDFs) are calculated for both observed and simulated data during the baseline period (1987-2014); second, a transfer function (Eq. (1)), implemented as in Eq. (2), maps modelled quantiles (\({X}_{m}\)) to their observational counterparts (\({X}_{o}\)).

$${X}_{o}=h({X}_{m})$$
(1)
$${X}_{c}={F}_{o}^{-1}({F}_{m}({X}_{m}))$$
(2)

where \({X}_{m}\), \({X}_{o}\) and \({X}_{c}\) refer to the modelled, observed and corrected variables, respectively. The QM function \(h\) transforms modelled data \({X}_{m}\) to match the distribution of observed data \({X}_{o}\). \({F}_{m}\) and \({F}_{o}\) are the CDFs of the modelled and observed data, respectively, and \({F}_{o}^{-1}\) is the inverse of \({F}_{o}\). To account for pronounced seasonal variability, we apply the QM bias correction on a monthly basis.
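The mapping in Eq. (2) can be illustrated with a minimal empirical-CDF sketch. The function names, the linear interpolation between sorted baseline values, the clamping of out-of-range values to the observed extremes, and the reading of "monthly basis" as one mapping per calendar month are all assumptions made for illustration; they are not taken from the study's implementation.

```python
from bisect import bisect_right


def cdf_position(sorted_sample, x):
    """Empirical CDF value F(x) in [0, 1], linearly interpolated between sorted points."""
    n = len(sorted_sample)
    if x <= sorted_sample[0]:
        return 0.0  # assumption: clamp below the baseline range
    if x >= sorted_sample[-1]:
        return 1.0  # assumption: clamp above the baseline range
    hi = bisect_right(sorted_sample, x)  # first index whose value exceeds x
    lo = hi - 1
    frac = (x - sorted_sample[lo]) / (sorted_sample[hi] - sorted_sample[lo])
    return (lo + frac) / (n - 1)


def inverse_cdf(sorted_sample, p):
    """Quantile of the sample at probability p, using the same interpolation rule."""
    last = len(sorted_sample) - 1
    pos = p * last
    lo = min(int(pos), last)
    hi = min(lo + 1, last)
    return sorted_sample[lo] + (pos - lo) * (sorted_sample[hi] - sorted_sample[lo])


def quantile_map_monthly(values, months, base_model, base_obs, base_months):
    """Apply X_c = F_o^{-1}(F_m(X_m)) separately for each calendar month."""
    corrected = []
    for x, month in zip(values, months):
        # Baseline samples restricted to the same calendar month as x.
        f_m = sorted(v for v, m in zip(base_model, base_months) if m == month)
        f_o = sorted(v for v, m in zip(base_obs, base_months) if m == month)
        corrected.append(inverse_cdf(f_o, cdf_position(f_m, x)))
    return corrected
```

Because the forward and inverse steps share one interpolation rule, a model baseline identical to the observed baseline leaves in-range values unchanged, which is a quick sanity check for any QM implementation.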

Auxiliary datasets

Land cover change data are obtained from ESA CCI global annual maps (1992-2020; 300 m resolution). Coal production records are sourced from Gao et al.29 and the Ordos Bureau of Statistics. The T data (0.25° resolution) originate from the CN05 dataset. Groundwater monitoring data from 126 wells in the MUS basin (2019-2020) are taken from Luan et al.38. The relatively uniform distribution of the monitoring sites helps to avoid survivor bias64. The planting areas of the five primary crops (maize, tubers, wheat, oil-bearing crops, and soybean) are obtained from the Statistical Yearbook. Groundwater level data for shallow aquifers are sourced from the Yellow River Resource Bulletins. Complete dataset metadata are provided in Supplementary Table 5.

Estimation of TWS components

TWS in MUS primarily comprises five components: GWS, SM, snow water equivalent, vegetation canopy water, and coal mass changes29,65. Given their negligible magnitudes, snow water equivalent and vegetation canopy water are excluded from our analysis. We derive the SM anomaly by differencing GLDAS soil moisture data against its 2004-2009 baseline. GWS is then isolated through the following equations:

$${SM}={SM}1+{SM}2$$
(3)
$${GWS}={TWS}-{SM}-{{TWS}}_{{coal}}$$
(4)
$${{TWS}}_{{coal}}={M}_{{coal}}/(\rho * S)$$
(5)

where \({GWS}\) is the groundwater storage anomaly, \({{TWS}}_{{coal}}\) represents water loss from coal mining, \({M}_{{coal}}\) denotes coal production, \(\rho\) is water density, and \(S\) is the area of our study region.
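As a rough illustration of Eqs. (3)-(5), the sketch below converts a coal mass into an equivalent water height and removes soil moisture and coal terms from TWS. The function names, the choice of millimetres as the working unit, the example values, and the use of a slice to mark the baseline window are assumptions for illustration only.

```python
RHO_WATER = 1000.0  # water density in kg m^-3


def coal_equivalent_height_mm(coal_mass_kg, area_m2, rho=RHO_WATER):
    """Eq. (5): coal mass as an equivalent water height, converted from m to mm."""
    return coal_mass_kg / (rho * area_m2) * 1000.0


def anomaly(series, baseline_slice):
    """Subtract the mean over the baseline window from every value."""
    base = series[baseline_slice]
    mean = sum(base) / len(base)
    return [v - mean for v in series]


def groundwater_anomaly(tws, sm1, sm2, tws_coal, baseline_slice):
    """Eqs. (3)-(4): GWS = TWS - (SM1 + SM2 anomaly) - TWS_coal, all in mm."""
    sm_total = [a + b for a, b in zip(sm1, sm2)]  # Eq. (3)
    sm = anomaly(sm_total, baseline_slice)
    return [t - s - c for t, s, c in zip(tws, sm, tws_coal)]  # Eq. (4)


# Hypothetical numbers: 1.1e11 kg of coal over a 1.1e11 m^2 region is about 1 mm.
example_height = coal_equivalent_height_mm(1.1e11, 1.1e11)
```

With a region of roughly 110,000 km2, a coal mass of 110 Mt corresponds to about 1 mm of equivalent water height, which gives a feel for the magnitude of the coal term relative to the other components.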

Reconstructing and predicting TWS

To address the temporal limitation of GRACE data (~20 years), machine learning and statistical methods have been used to reconstruct extended TWS records39,66. However, current methods oversimplify anthropogenic impacts by assuming fixed linear trends in TWS (e.g., Li et al.’s model in Supplementary Fig. 4), which can be unreliable in regions with dynamic human-nature interactions such as ER zones. The WB approach addresses this challenge by integrating both natural and anthropogenic factors67, thereby significantly enhancing estimation accuracy in areas with limited historical anthropogenic activities. This approach enables temporally adaptive quantification of water storage dynamics, superseding conventional static trend assumptions.

We calculate pre-ER TWS through the water balance equation (Eq. (7)) using cumulative P, ET and R. These hydrological fluxes are smoothed using Eq. (6)67 to suppress spurious high-frequency signals arising from finite-difference operations.

$$\widetilde{X(t)}=\frac{1}{4}X\left(t-1\right)+\frac{1}{2}X\left(t\right)+\frac{1}{4}X\left(t+1\right)$$
(6)
$${{TWS}}_{n}={\sum}_{i=1}^{n}({P}_{i}-{{ET}}_{i}-{R}_{i})$$
(7)

where \(X\) denotes a hydrological flux (i.e., P, ET, or R), \(t\) is the monthly time step, and \(n\) indexes monthly intervals.
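Eqs. (6) and (7) can be sketched in a few lines. The function names are hypothetical, and two choices below are assumptions rather than details given in the text: the first and last months are left unsmoothed because they lack a neighbour, and the fluxes are smoothed before being accumulated.

```python
from itertools import accumulate


def smooth_121(x):
    """Eq. (6): three-point (1/4, 1/2, 1/4) filter; endpoints are kept as they are."""
    out = list(x)
    for t in range(1, len(x) - 1):
        out[t] = 0.25 * x[t - 1] + 0.5 * x[t] + 0.25 * x[t + 1]
    return out


def water_balance_tws(p, et, r):
    """Eq. (7): TWS_n is the running sum of P_i - ET_i - R_i up to month n."""
    net = [pi - eti - ri for pi, eti, ri in zip(p, et, r)]
    return list(accumulate(net))


def smoothed_water_balance_tws(p, et, r):
    """Smooth each flux with Eq. (6), then accumulate with Eq. (7)."""
    return water_balance_tws(smooth_121(p), smooth_121(et), smooth_121(r))
```

The filter weights sum to one, so smoothing redistributes flux between adjacent months without changing interior totals; only the treatment of the two endpoints affects the cumulative sum.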

We project TWS over MUS during 1987–2100 using statistical relationships derived from observational data and CMIP6 simulations under four greenhouse gas emission scenarios. However, TWS variations are influenced not only by external climatic drivers but also by anthropogenic forcing. Here, we assume that the anthropogenic effects during the projection period, which arise from interventions such as government policy, are limited to ER. A GAM, which combines generalized linear models with additive models, is employed to reconstruct TWS. This model can capture the response of TWS to vegetation growth through nonparametric smooth functions of the predictors. Prior to reconstruction, all datasets are decomposed with seasonal-trend decomposition using LOESS (STL):

$${X}_{t}={T}_{t}+{S}_{t}+{R}_{t}$$
(8)

where \({T}_{t}\), \({S}_{t}\) and \({R}_{t}\) represent the trend, seasonal and remainder components, respectively. Based on these decomposed components, we develop separate GAMs for the trend and remainder components. The final result is obtained by summing the predictions from the individual component models.

$${TWS}=f(T,{SM}1,{SM}2,{TWS}{{\_}}{WB})$$
(9)

where \(f\) denotes the GAM and \({TWS\_WB}\) is the TWS derived from the WB approach.

In practice, we first establish a statistical relationship between the STL-based components from CMIP6 simulations and the observational dataset for the period 1987-2020. This relationship is then extended through 2100 under the CMIP6 emission scenarios, with the projected components recombined to generate TWS predictions. Following related studies63,68, the dataset is partitioned into 70% training and 30% testing subsets through randomized sampling. Model performance is quantified using Pearson’s correlation coefficient (R) and root mean squared error (RMSE). The comparable correlation coefficients and RMSEs demonstrate the predictive reliability of our projection approach across emission scenarios (Supplementary Fig. 16). Error distribution analysis shows normally distributed residuals, indicating no evident overfitting (Supplementary Fig. 17).
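The evaluation protocol described above (a randomized 70/30 split scored with Pearson's R and RMSE) can be sketched as follows. The function names and the fixed random seed are assumptions for illustration; the sketch covers only the split and the two metrics, not the GAM itself.

```python
import math
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Randomly partition range(n) into training and testing index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded for reproducibility (an assumption)
    cut = round(train_fraction * n)
    return sorted(idx[:cut]), sorted(idx[cut:])


def pearson_r(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def rmse(obs, pred):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
```

Comparing R and RMSE on the training indices with those on the testing indices is the usual way to read such a split: similar scores on both subsets suggest the fitted relationship generalizes.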

Trend and correlation calculation

To isolate long-term trends, we first remove the seasonal signal from monthly TWS components and vegetation indices using STL decomposition69. Linear regression analysis then quantifies trends in these deseasonalized series. Statistical linkages are evaluated through Pearson’s correlation and partial correlation coefficients. Nonlinear Granger causality, an extension of the traditional Granger causality test, is employed to detect causal relationships between variables that may exhibit nonlinear patterns42. Building on the correlation analysis, we further investigate the interrelationship between TWS components and vegetation indices using this method. Additionally, we use the component contribution ratio to quantify how individual water storage components contribute to the total TWS70.
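The trend and partial correlation steps can be sketched as below for series that have already been deseasonalized. The function names are hypothetical, the trend is expressed per time step, and the partial correlation uses the standard first-order formula with a single control variable, which is an assumption about how the coefficient is computed here.

```python
import math


def linear_trend(y):
    """Least-squares slope of y against time steps 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    s_ty = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    return s_ty / s_tt


def pearson_r(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For example, with x = [2, 1, 2, 5], y = [0, 5, 0, 5] and z = [1, 2, 3, 4], the raw correlation between x and y is positive only because both follow z, and the partial correlation removes that shared dependence.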

Contribution assessment of anthropogenic activities and climate change

We use the double-mass curve method to quantify the relative contributions of anthropogenic activities and climate change to TWS dynamics across MUS. This technique plots the cumulative values of two variables against each other to assess their relationship, with abrupt changes in slope indicating systematic deviations from baseline conditions. Slope changes that occur in baseline periods presumed free from human interference are attributed to natural drivers. Once such an inflection point is identified, the method quantitatively evaluates the effects of both natural and anthropogenic causes during the post-change period.

In this study, the double-mass curve is constructed from cumulative TWS and cumulative P, where P serves as a proxy for climatic influences. The study area has a persistently arid climate, with P identified as the primary contributor to water storage. We therefore restrict the climatic driver to P in this analysis. Based on land use classification and NDVI change-point detection, the study period is divided into three distinct stages: 1987–1998 (Stage 1), 2003–2010 (Stage 2) and 2011–2020 (Stage 3). As anthropogenic activities began to significantly impact the region in 1999, changes observed in Stage 1 (prior to human disturbances) are attributed solely to climate change.

$${\Delta {TWS}}_{n}=\overline{{{TWS}}_{n}^{{obs}}}-\overline{{{TWS}}_{1}^{{obs}}}$$
(10)

where \(\overline{{{TWS}}_{n}^{{obs}}}\) and \(\overline{{{TWS}}_{1}^{{obs}}}\) denote the observed mean TWS in the n-th Stage and Stage 1, respectively. To quantify anthropogenic impacts (\({\Delta {TWS}}_{{human}}\)), we first establish a linear regression between cumulative P and cumulative TWS during Stage 1. This relationship is then used to reconstruct the natural TWS variation (\(\overline{{{TWS}}_{n}^{{rec}}}\)) for subsequent stages by removing human interventions. The model, calibrated with Stage 1 data (prior to the enforcement of ER), generates counterfactual predictions (2003–2020) simulating TWS dynamics under a changing climate without ER. In contrast, the factual observations from 2003–2020 reflect both natural variations and the compounded effects of ER implementation.

$${\Delta {TWS}}_{{human}}=\overline{{{TWS}}_{n}^{{obs}}}-\overline{{{TWS}}_{n}^{{rec}}}$$
(11)
$${\Delta {TWS}}_{{climate}}={\Delta {TWS}}_{n}-{\Delta {TWS}}_{{human}}$$
(12)

where \({\Delta {TWS}}_{{human}}\) and \({\Delta {TWS}}_{{climate}}\) denote TWS changes caused by anthropogenic activities and climate change, respectively. Accordingly, the percentage contributions of anthropogenic activities and climate change are calculated as:

$${{Contribution}}_{{human}}={\Delta {TWS}}_{{human}}/{\Delta {TWS}}_{n}\times 100 \%$$
(13)
$${{Contribution}}_{{climate}}={\Delta {TWS}}_{{climate}}/{\Delta {TWS}}_{n}\times 100 \%$$
(14)

where \({{Contribution}}_{{human}}\), \({{Contribution}}_{{climate}}\) denote the percentage contribution of anthropogenic activities and climate change, respectively.
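The attribution in Eqs. (10)-(14) can be sketched as follows. The function names and the example numbers are hypothetical, and the least-squares fit of cumulative TWS against cumulative P is only one straightforward way to realize the Stage 1 regression; how the reconstructed stage mean is derived from the fitted curve is left to the caller because the text does not specify it.

```python
from itertools import accumulate


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def double_mass_fit(p_stage1, tws_stage1):
    """Regress cumulative TWS on cumulative P over Stage 1 (baseline period)."""
    return fit_line(list(accumulate(p_stage1)), list(accumulate(tws_stage1)))


def attribute(mean_obs_stage1, mean_obs_stage_n, mean_rec_stage_n):
    """Eqs. (10)-(14): split the TWS change into human and climate parts."""
    d_total = mean_obs_stage_n - mean_obs_stage1    # Eq. (10)
    d_human = mean_obs_stage_n - mean_rec_stage_n   # Eq. (11)
    d_climate = d_total - d_human                   # Eq. (12)
    if d_total == 0:
        raise ValueError("total TWS change is zero; contributions are undefined")
    return {
        "human_pct": d_human / d_total * 100.0,      # Eq. (13)
        "climate_pct": d_climate / d_total * 100.0,  # Eq. (14)
    }


# Hypothetical stage means (mm): Stage 1 observed 0, Stage n observed -40, reconstructed -10.
example = attribute(0.0, -40.0, -10.0)
```

In this made-up case the observed decline is 40 mm, of which 30 mm is not explained by the precipitation-driven reconstruction, so 75% is assigned to anthropogenic activities and 25% to climate. By construction the two percentages always sum to 100, and one of them can be negative when the two drivers act in opposite directions.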