Introduction

Agricultural expansion leading to forest loss has significantly impacted climate, food, energy, and ecological systems1,2,3. The continuous expansion and intensified utilization of cropland have been driven by global population growth and regional economic development, and the expansion in irrigation has modified the surface and subsurface water balance4,5. In addition, the decreased forest cover and its effects on water sustainability6 have been an important factor influencing freshwater security risks7. However, the hydrological influences and the underlying mechanisms of irrigation increases and the subsequent deforestation are unclear, and clarifying them is essential not only for research and resource management but also for national and global progress toward the United Nations’ 2030 Sustainable Development Goals on food, water, and forest8.

Systematically exploring and disentangling the influences of irrigation and deforestation on hydrological processes remains challenging because of the complex linkages and feedbacks between the water variables. Consequently, recent studies mostly focused on the influence of a single water variable or a specific aspect4,6,9. Regarding irrigation, soil moisture (SM) and evapotranspiration (ET) enhancements have been identified in regions with irrigation, especially during the growing seasons10,11. Meanwhile, the groundwater storage (GW) and terrestrial water storage (TWS) depletions were driven by water overuse for heavy irrigation5,12,13, which was likely to cause the spatiotemporal redistribution of streamflow and GW compared to natural conditions14. Accordingly, neglecting the linkages among different water variables limits our current understanding of irrigation’s impact on the regional water cycle4,14,15. Moreover, a series of studies assessed the isolated effects of forest loss on runoff increments by using paired-watershed experiments6, the empirical Budyko frameworks9,16, and gauge observations combined with statistical models17,18,19, which suggested that deforestation could additionally influence the water cycle when it was accompanied by irrigation.

To elucidate the impacts of irrigation and deforestation on the long-term changes in water cycle variables and clarify the underlying mechanisms, we set up a high-resolution (5 arcmin) and fully coupled model based on a set of climate, hydrological, and geological datasets for the Lancang-Mekong River Basin (LMRB) (Fig. S1). The LMRB is one of the largest river basins and is important for the livelihoods of over 70 million people20. In the LMRB, a fast growth phase of the irrigation expansion originated in the 1980s–1990s due to the rapid growth of population and rice export21, leading to an accelerating deforestation rate22,23. Considering the large uncertainties derived from the lack of reservoir observations and actual operation rules24, we selected the historical period of 1980–2010 for the focus of our investigation. To obtain spatiotemporally explicit information about the water cycle, we used the community water model (CWatM) coupled with the modular three-dimensional finite-difference groundwater flow (MODFLOW), hereafter referred to as CWatM-MODFLOW25. We calibrated it by simultaneously using site-based streamflow observations and the gridded satellite water storage product from Gravity Recovery and Climate Experiment (GRACE)26,27 (Methods). The model was run for two land cover scenarios28. In the first scenario, the annual land cover was fixed as it was in 1980, labeled as the LAND_1980 experiment (exp). Comparably, the other scenario was designed as a control (CTRL) exp, in which the dynamic annual land cover change was simulated. As a result, the differences (denoted by ∆) between CTRL exp and LAND_1980 exp were caused by anthropogenic-induced vegetation changes and associated water use changes (Methods). 
Since the anthropogenic vegetation changes primarily manifested as different combinations of irrigation and deforestation, the grid cells were grouped into three types: irrigation increases with no deforestation (Irrigation(non-DF)), irrigation increases with deforestation (Irrigation(DF)), and no irrigation change (non-Irrigation) (Fig. 1). Our findings suggest that the contrasting mechanisms and effects of irrigation and deforestation existed in different hydrological processes and differed in seasons.

Fig. 1: Weakened coupling due to irrigation increases and deforestation.

a Anthropogenic-induced vegetation changes during 1980–2010 in the Lancang-Mekong River Basin (LMRB) were grouped into three types: no irrigation change (non-Irrigation) and irrigation increases with or without deforestation (Irrigation(DF) or Irrigation(non-DF), respectively). b–e Consequently, the relationships between the aridity index and soil moisture (SM) and between the SM and groundwater storage (GW) were changed. The x symbol in (a) is shown at a 10 arcmin resolution for clear visualization, within which deforestation occurred in at least one out of four 5 arcmin cells. b Kendall’s rank correlation (r) in the LAND_1980 experiment (no land cover change since 1980). c Differences (∆) in r between the control (CTRL) experiment and LAND_1980 experiment. In (b, c), the color of each grid indicates the co-occurrence of two (∆)r values. r was computed from the annual values of aridity index, SM, and GW. The colors in (d) are tied to the levels of irrigation increases in (a), and the bars in (e) denote the mean and 95% confidence interval of the ∆r among the non-Irrigation, Irrigation(DF), and Irrigation(non-DF) grids.

Results

Widespread decreased correlation between SM and GW

The spatial distributions of the widespread deforestation and irrigation during 1980–2010 in the LMRB were inconsistent; that is, 25% of the grid cells exhibited a combination of deforestation and irrigation (i.e., Irrigation(DF)), and 13% of the grid cells exhibited an increase in irrigation without deforestation (i.e., Irrigation(non-DF)) (Fig. 1a). Within LAND_1980 exp, the correlation between the SM and the aridity index (r(Aridity index, SM)) was generally positive, and so was the correlation between the SM and GW (r(GW, SM)), with r(Aridity index, SM) > 0.5 and r(GW, SM) > 0 in 83% and 90% of the grid cells, respectively (Fig. 1b). The irrigation expansion caused widespread decreases in these correlations, shown by negative ∆r(Aridity index, SM) and ∆r(GW, SM) between CTRL exp and LAND_1980 exp, especially in southwestern parts where the irrigation was enhanced substantially (Fig. 1b–d). The ∆r values were more negative in the Irrigation(non-DF) cells than in the Irrigation(DF) cells and non-Irrigation cells. The negative ∆r(GW, SM) in the non-Irrigation cells suggests that the groundwater abstraction had an impact on the lateral groundwater flow (Fig. 1e).

Impacts of irrigation and deforestation on the increase in SM and decrease in GW

Within the cells in which both correlations decreased, irrigation increased the SM and decreased the GW through abstraction (shown as positive ∆SM and negative ∆GW, respectively, in Fig. 2a, d). However, comparable variations in ∆(Irrigation withdrawal) could lead to more infiltrated water and greater ∆SM for the Irrigation(non-DF) cells than for the Irrigation(DF) cells because irrigation and deforestation had contrasting impacts (Figs. S2 and 2b, c). The reverse deforestation impact was evidenced by the lower infiltration and SM levels in the non-Irrigation grids with a lower forest fraction (Fig. S3). In addition, similar variations in the ∆(Groundwater abstraction) resulted in more negative ∆GW values in the Irrigation(non-DF) cells compared to the Irrigation(DF) cells (Fig. 2e, f). The mechanisms controlling this phenomenon include the following: 1) the steeper terrain in the locations with deforestation contributed to higher lateral groundwater flow, compensating for GW decreases (Fig. S4); and 2) the deforestation itself reduced the recharge, enhancing the decreases in the GW (Fig. S3). It should be noted that irrigation could also have led to a slight increase in the baseflow (compared to the withdrawal rate) due to the increased recharge in the sub-grid groundwater cells unaffected by withdrawals (Fig. S5).

Fig. 2: Long-term changes of SM and GW due to contrasting impacts of irrigation and deforestation.

a, d The 31-year changes (the differences between CTRL exp and LAND_1980 exp) in the SM and GW for the Irrigation(non-DF), Irrigation(DF), and non-Irrigation grid cells in the LMRB. b, c, e, f Contrast in the impacts of irrigation and deforestation. The SM, GW, irrigation withdrawal, groundwater abstraction, and recharge are the yearly values used to calculate the changes based on the Sen’s slope in each grid with a significant trend (p < 0.05) (Methods). The bars in the lower left corners of (a, d) denote the mean and 95% confidence interval among the Irrigation(non-DF), Irrigation(DF), and non-Irrigation grids. The colors in (b, c) are tied to (a), and those in (e, f) are tied to (d).

Impacts of irrigation and deforestation on changes in runoff

In contrast to SM, the ∆Runoff did not exhibit a consistent spatial pattern in the LMRB during 1980–2010 (Fig. 3a). By partitioning runoff into surface runoff and baseflow, results showed that ∆(Surface runoff) and ∆Baseflow contributed contrastingly, with respective positive and negative proportions, and the larger proportion of ∆(Surface runoff) than ∆Baseflow implies that the former dominated the ∆Runoff (Fig. 3b). The partial information decomposition (PID) was then applied, since the SM and ET have interactions and dependencies in controlling the surface runoff (Methods). The redundant information about ΔSM and ΔET, as well as the unique information about ΔET, largely contributed to the negative Δ(Surface runoff) changes (Fig. 3c), especially for the Irrigation(non-DF) cells (Fig. S5a–c). That is, the increases in ΔSM induced ΔET increases by supporting more crop transpiration and more ponded water evaporation in the expanded paddy areas, leading to the decreases in Δ(Surface runoff) (Figs. S6 and 3d). Therefore, with the converse increases in ∆Baseflow, the ∆Runoff in the Irrigation(non-DF) grids exhibited a mixed pattern of 57% grid cells in decreases and 43% grid cells in increases (Figs. S5d–f and 3a).

Fig. 3: Long-term changes of runoff due to the contrasting impacts of irrigation and deforestation.

a Changes in the runoff during the 31-year study period in the LMRB. b Proportions of ∆(Surface runoff) and ∆Baseflow, together with c contributions of the ∆SM and ∆ET (namely, evapotranspiration) to the ∆(Surface runoff), reveal that d, e the inverse impacts of irrigation and deforestation on the different hydrological processes consequently enhanced the ∆Runoff. The calculations of the changes are the same as in Fig. 2. The percentages in the lower left corner of (a) are based on the number of positive/negative cells, and the dashed lines indicate the mean values. The proportions in (b) were calculated by dividing either the ∆(Surface runoff) or ∆Baseflow changes by the ΔRunoff changes. The contributions in (c) are based on the partial information decomposition, which quantifies the interactions and dependencies in a multivariate system (Methods). d, e show the average levels of contributions under the impacts of irrigation and deforestation.

For the Irrigation(DF) grids with a combination of irrigation and deforestation, the Δ(Surface runoff) was dominated by unique or redundant information about ΔET and tended to be more positive than the situation for Irrigation(non-DF) (Figs. 3c and S5a–c). This is attributed to the above finding that the increases in ΔSM were lower in the Irrigation(DF) grids than in the Irrigation(non-DF) grids, causing less ΔET increases from both perspectives of transpiration and evaporation (Figs. 2a and S6). Consequently, although deforestation simultaneously alleviated the decreases in Δ(Surface runoff) and the increases in ΔBaseflow caused by irrigation (Fig. 3e), the ∆Runoff was primarily influenced by the fact that the surface runoff predominantly increased (Fig. 3a).

Exceptional impacts of irrigation and deforestation in summer

From the perspective of seasonality, we found that the ∆Runoff influenced by irrigation changed from positive to negative during summer months, after which it returned to positive (Fig. 4a). The ∆Runoff was mainly due to the seasonal ∆(Surface runoff), except for the elevated contribution of the ∆Baseflow, which alleviated the decrease in the ∆Runoff during summer (Figs. 4b and S7d). In addition, the Δ(Surface runoff) was mainly controlled by the ΔET that was increased by the ΔSM (Fig. S7a–b), since the redundant information about ΔSM and ΔET made the largest contribution (Fig. 4c).

Fig. 4: Seasonal variability of the runoff changes under the impacts of irrigation and deforestation.

a Seasonal variability of ∆Runoff. b Contributions of ∆(Surface runoff) and ∆Baseflow to ∆Runoff. c, d Contributions of ∆SM and ∆ET to ∆(Surface runoff) in the LMRB during 1980–2010. Unlike Fig. 3, the ∆Runoff in (a) was calculated using the daily ∆Runoff during the 31-year study period on the same day, and each ∆Runoff of a day is the average among the Irrigation(non-DF)/Irrigation(DF)/non-Irrigation grid cells.

Compared to the Irrigation(non-DF) grids, the decreases in ∆Runoff in summer were intensified in the Irrigation(DF) grids (Fig. 4a) because of the lower increases in ΔBaseflow in the Irrigation(DF) grids (Fig. S7d); however, the increases in ∆Runoff were greater in the other months, which eventually caused more grids with positive ∆Runoff changes (Figs. 4a and 3a). These greater increases in ∆Runoff, which were dominated by the Δ(Surface runoff), were mainly caused by the lower increases in ΔET and its driver, ∆SM (Figs. 4b–d and S7a–b). Exceptionally, the ∆SM among Irrigation(DF) grids show opposite changes to that of Irrigation(non-DF) grids in summer because the irrigation during crop growing seasons increased the ∆SM but deforestation countered the impact of the irrigation (Fig. S7a).

Discussion

The widespread high values of r(Aridity index, SM) and r(GW, SM) in the LAND_1980 exp during 1980–2010 in the LMRB (Fig. 1) indicate that precipitation was the major source of the SM and that the correlation between the GW and SM was stable when the regional water cycles were not affected by irrigation and deforestation. Such high correlations between the aridity index and SM (>0.6) have also been identified in different parts of the world, particularly in western Europe and the southern United States, which have experienced a wetting trend of precipitation in recent decades (1991–2019)29. In addition, site-based observations of water table depths30 and observation-validated groundwater simulations31 have revealed that there were highly positive correlations between the groundwater level and SM in regions where groundwater is an important source for buffering the reliance of SM on precipitation. Therefore, the widespread negative ∆r(Aridity index, SM) and ∆r(GW, SM) values suggest that the increase in irrigation could decrease the reliance of SM on the aridity index and could disturb the relationship between the GW and SM (Fig. 1).

Notably, we found that the impacts of deforestation and irrigation on the SM and GW were different. (1) The irrigation expansion increased the SM, while the deforestation decreased the SM. (2) The increase in the irrigation-induced groundwater abstraction reduced the GW, and the reduction in recharge caused by the lower forest area reinforced the decrease in the GW (Fig. 2). Irrigation acted through the exploitation of GW, which provided additional water to increase the SM and recharge in the irrigated cells4,14,25,32. In contrast, deforestation with lower tree density, resulting in less dense roots and organic matter (from litter), would reduce the water infiltration and water-holding capacity of the soil and consequently reduce the recharge and baseflow33,34. Furthermore, our findings demonstrate the positive dependency of ET (specifically, transpiration) on SM in water-limited regions35,36 (Fig. S6), unlike energy-limited watersheds in Brazil, where SM increased in response to the decreased ET rates caused by deforestation37. In this context, the hydrological mechanisms of irrigation expansion via deforestation were elucidated by isolating and comparing the impacts of irrigation and deforestation.

The occurrence of the spatially inconsistent ∆Runoff in the Irrigation(non-DF) grid cells highlights not only the contrasting impacts of the increases in the ∆SM and ∆ET on the Δ(Surface runoff) but also the contrasting impacts of the decreases in Δ(Surface runoff) and increases in ∆Baseflow (Fig. 3). Building upon the general notion that increasing SM benefits runoff generation while increasing ET decreases the amount of water available for the formation of runoff4,38, we determined that the ∆ET via positive SM-transpiration coupling and the ∆ET enhanced by water evaporation on the saturated SM of the expanded paddy areas were the main factors controlling the Δ(Surface runoff) in the context of irrigation expansion9,39. Moreover, widespread increases in recharge and baseflow during irrigation expansion, which have also been observed in other basins in which irrigation is utilized32,40, could buffer runoff deficits predominantly caused by decreases in surface runoff in most grids or increased runoff in other grids due to increased surface runoff. This resulted in a nearly even mix of decreases and increases.

The impact of the decreases in forest cover on the increases in runoff (Fig. 3) is supported by the sensitivity calculations of the Budyko assumption for global runoff dataset16, statistical analyses of streamflow records18,19, and watershed modeling37. From this perspective, in the study area, the deforestation conducted to achieve irrigation expansion led to additional surface runoff beyond that caused by irrigation (Figs. 3 and S5), except for the summer when irrigation caused runoff deficits (Fig. S7). This is consistent with observations of other large river basins in Brazil17. Moreover, deforestation led to decreased recharge and baseflow due to alteration of the soil structure, which further exacerbated the runoff deficit (Figs. 4 and S7). In the other months, the ∆Runoff, which was predominantly caused by the Δ(Surface runoff), was increased by irrigation, and deforestation decreased the ΔET, thus leading to a greater increase in the ∆Runoff. Overall, the impact of deforestation on ΔRunoff was in contrast with that of irrigation via different mechanisms for different seasons, shifting to the runoff pattern with more long-term increases.

By identifying the potential mechanisms by which irrigation and deforestation have contrasting impacts on surface and subsurface hydrological processes, this study enhanced our understanding of how and why the pivotal water variables (such as SM, GW, ET, baseflow, and runoff) could change in the long term and between seasons, and how these variables are connected. The findings could benefit broader investigations in other water-limited regions35, especially given that widespread shifts from energy-limited to water-limited conditions were projected due to climate change36. The changing climate also affected baseflow through the changes in precipitation, evaporative demand and snow fraction41. Further assessments of human-natural water systems should take into account more reliable reservoir operation information24 and dynamic farmer decisions interacting with the hydrological environment42. Additionally, based on multi-source observation adjustments and numerical experiments, our approaches hold practical applicability on a global scale28, and the multivariate analysis framework could be adapted to other complex systems43.

Methods

Model setup and forcing data

The CWatM is a physics-based large-scale hydrological model that includes human impacts such as irrigation, water withdrawal for other purposes, surface reservoirs, and land cover44. CWatM-MODFLOW further allows the modeling of groundwater lateral flow, groundwater exchanges with surface soil and water, and groundwater pumping25. Surface water-groundwater exchanges are characterized by (1) groundwater recharge from the CWatM to MODFLOW and (2) groundwater capillary rise and baseflow simulated by MODFLOW to the lower soil layer and the river network system in CWatM. The land cover changes are characterized using the six classes44, and the groundwater abstraction was calculated based on the water demands for livestock, industry, domestic use44, and irrigation25.

We applied CWatM-MODFLOW to the LMRB using high resolutions of 5 arcmin (~9 km at the Equator) for CWatM and 1.5 km for MODFLOW. Within the coupled model, the simulations were applied at time steps of 1 day and 1 week for CWatM and MODFLOW, respectively. Regarding driving the model at a high resolution, we used a regional climate product for 1980‒201045. This recently developed product was derived using the Weather Research and Forecasting (WRF) model to dynamically downscale the 0.25° fifth generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalysis of the global climate (ERA5) global data to 9 km in East Asia. It has a reliable representation compared with other products and has been validated using gauge and satellite observations45. The forcing datasets were resampled to a resolution of 5 arcmin.

Regional parameterization

The regional parameterizations were improved in both the CWatM and MODFLOW parts. For CWatM, the unsaturated hydraulic conductivity in the soil layers (K; m d-1) was determined using the van Genuchten equation46 (Eq. (1)) based on the soil saturated conductivity (Ks; m d-1), the actual, maximum, and residual amounts of SM (θ, θs, and θr, respectively, in cm3 cm-3), a pore-size related parameter (m), and an empirical shape factor (l)44,47, as follows:

$$K={K}_{s}{\left(\frac{\theta -{\theta }_{r}}{{\theta }_{s}-{\theta }_{r}}\right)}^{l}{\left\{1-{\left[1-{\left(\frac{\theta -{\theta }_{r}}{{\theta }_{s}-{\theta }_{r}}\right)}^{1/m}\right]}^{m}\right\}}^{2}$$
(1)

where Ks, θ, θs, θr, and m, provided in the CWatM default maps, were calculated based on the soil properties from the Harmonized World Soil Database 1.248 and the pedotransfer functions from the Hydraulic Properties of European Soils (HYPRES)49. However, we found that the range of the provided m for our study area exceeds the lower limit of 0.550, causing K to remain at low levels and the fluxes of the percolation in the groundwater recharge to be much lower than the preferential flow (Fig. S8). The study in the North China Plain51 adjusted the magnitude of the percolation within the groundwater recharge using an exponential equation with a calibrated site-specific parameter α instead of Eq. (1) (see examples of K-θ curves in Fig. S8) based on an earlier field investigation in their study region52. In contrast to their attempts, the parameters of Eq. (1) for the LMRB refer to HiHydroSoil v2.053, which applies pedotransfer functions based on thousands of soil samples and performs significantly better than HYPRES54. Moreover, l, which was originally interpreted as a physical parameter related to the tortuosity structure of the connected pores55, was found to always be lower than its predetermined value of 0.5 in further studies47,56, providing flexibility to compensate for a conceptual deficiency of the capillary bundle models57. Therefore, l was adjusted to an appropriate value during the calibration to accurately predict K in groundwater-soil exchanges58.
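To make the role of the shape factor l in Eq. (1) concrete, the following minimal sketch evaluates K for a single soil layer. The function name, the clipping of the effective saturation to [0, 1], and the numerical inputs are illustrative assumptions and are not taken from CWatM or from the parameter maps used in this study.

```python
def unsaturated_conductivity(theta, theta_r, theta_s, k_s, m, l=0.5):
    """Eq. (1): K = Ks * Se**l * (1 - (1 - Se**(1/m))**m)**2, in the units of k_s."""
    if not theta_r < theta_s:
        raise ValueError("theta_r must be smaller than theta_s")
    # Effective saturation Se; clipping to [0, 1] is an assumption of this sketch.
    se = min(max((theta - theta_r) / (theta_s - theta_r), 0.0), 1.0)
    if se == 0.0:
        return 0.0  # avoids 0 ** negative when l < 0
    return k_s * se**l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2


# Hypothetical inputs chosen only to show the sensitivity to l.
for shape in (0.5, -1.0):
    k = unsaturated_conductivity(
        theta=0.30, theta_r=0.05, theta_s=0.45, k_s=0.5, m=0.3, l=shape
    )
    print(f"l = {shape:+.1f}: K = {k:.3e} m/d")
```

Because the effective saturation is below 1 in unsaturated soil, lowering l raises K at a given θ, which is the degree of freedom the calibration exploits.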

For MODFLOW, the upward flow from the groundwater system to the soil layer and channels of CWatM was calculated by using the DRN package, which was determined by simulated water table depth and permeability in each cell. Then, the partitioning of upward groundwater flow into capillary rise feeding the soils and baseflow feeding the rivers was conducted by computing 500 m resolution river networks with a flow accumulation area threshold of 25 km259 based on the 3 arcsec (~90 m at the Equator) hydrologically conditioned digital elevation model (C-DEM) of HydroSHEDS60. This partitioning was further calibrated for each catchment by adding an adjusted factor25. The aquifer permeability and porosity were obtained from global hydrogeology maps 2.0 (GLHYMPS 2.0)61. In addition, we improved the identification of permafrost distribution using the new dataset based on comprehensive field data62 (Fig. S1), and we conceptualized the permafrost as an aquitard layer with a low hydraulic conductivity of 5.2 × 10-11 m/s63,64. The specific yield is close to the porosity for aquifers with a large grain size, but it is much smaller than the porosity for aquifers with a small grain size65. Therefore, the specific yield was approximated to be 80% of the porosity for sand and larger grain sizes, and this reduction scale was calibrated for silt and clay66.

Based on a limited geological survey67 and our study’s purpose of surface water-groundwater interactions68,69, we considered one unconfined aquifer layer with varying thicknesses from the mountains to sediments for sub-basins 1–10 (Fig. S1)70. In contrast, sub-basin 11, which is mainly located in the Cambodia Mekong River Delta aquifer, was considered to be a confined aquifer71, and we applied a global field data-based relationship between the specific storage and porosity to characterize the specific storage72. The method of determining the aquifer thickness distribution70 was embedded in the model. Accordingly, the elevation differences (E) in the C-DEM along the drainage direction network from the hydrological data and maps based on shuttle elevation derivatives at multiple scales (HydroSHEDS) database were used to classify the cells into alluvial aquifer and mountain range aquifer cells. The relative differences (E’) normalized from E (Eq. (2)) were used to rate the z-scores (Z in Eq. (3)), which indicate the likelihood that the alluvial aquifer cells form a thick layer close to the river or a thinner layer farther away from the stream70.

$${E}^{\prime}=1-\frac{E-{E}_{\min }}{{E}_{\max }-{E}_{\min }}$$
(2)
$${{\rm{Z}}}={G}^{-1}\left({E}^{\prime}\right)$$
(3)

where G−1 is the inverse standard normal distribution, and the mapping of Z was combined with a log-normal distribution of thickness values with a randomly sampled average (A) and a fixed coefficient of variation (Cv) (Eq. (4))69,70.

$${{\rm{thickness}}}={e}^{\ln \left(A\right)\left(1+{C}_{v}Z\right)}$$
(4)

where A was adjusted during the calibration, and the mountain range cells, which mainly consisted of hard rock with secondary permeability, were assumed to have only a thin aquifer below the soil layers with the fifth percentile of the thickness distribution.
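The chain from elevation differences to aquifer thickness in Eqs. (2)–(4) can be sketched as follows. This is an illustrative reading of the equations, not the code embedded in the model: the function name and input values are hypothetical, and the small offset eps that keeps the inverse normal distribution finite at E’ = 0 and E’ = 1 is an assumption of the sketch.

```python
import math
from statistics import NormalDist


def alluvial_thickness(elev_diffs, mean_thickness, cv, eps=1e-6):
    """Apply Eqs. (2)-(4) to the elevation differences E of alluvial cells."""
    e_min, e_max = min(elev_diffs), max(elev_diffs)
    if e_max == e_min:
        raise ValueError("elevation differences must not all be equal")
    inv_cdf = NormalDist().inv_cdf  # G^-1 in Eq. (3)
    thickness = []
    for e in elev_diffs:
        e_rel = 1.0 - (e - e_min) / (e_max - e_min)  # Eq. (2)
        e_rel = min(max(e_rel, eps), 1.0 - eps)  # assumption: avoid infinite Z
        z = inv_cdf(e_rel)  # Eq. (3)
        thickness.append(math.exp(math.log(mean_thickness) * (1.0 + cv * z)))  # Eq. (4)
    return thickness


# Hypothetical cells: E in metres above the nearest stream, A = 50 m, Cv = 0.2.
for e, t in zip([0.0, 50.0, 100.0], alluvial_thickness([0.0, 50.0, 100.0], 50.0, 0.2)):
    print(f"E = {e:5.1f} m -> thickness = {t:7.1f} m")
```

Cells close to the stream (small E) receive large positive Z and therefore thick aquifers, whereas a cell at the median E’ of 0.5 has Z = 0 and a thickness equal to A.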

Calibration scheme and model performance

The LMRB was divided into 11 independent sub-basins based on the locations of the hydrological stations and the drainage direction (Fig. S1), and each sub-basin had different parameters for characterizing the spatial heterogeneity within this large river basin73. For each sub-basin, we calibrated and validated the model using the daily discharge at the gauge station locations and the monthly TWS anomalies among the sub-basin cells. The calibration and validation periods were 2002‒2006 and 2007‒2010, respectively (Table S1), as the GRACE TWS datasets are available from 2002 and the operations of hydropower dams after 2010 would obscure the focus of this research. The calibration was performed using non-dominated sorting genetic algorithm II (NSGA-II) to evolve the 14 parameters associated with snowmelt, crop evapotranspiration, open water evaporation, exchanges between soil and groundwater, soil depth, interflow, infiltration, preferential flow, groundwater recharge, runoff concentration and routing, reservoir and lake storage, and aquifer thickness distribution. Three objective functions were utilized to obtain the Pareto-optimal solutions: (1) the modified Kling-Gupta efficiency (KGEdis) between the simulated and observed discharge26,74, (2) the correlation coefficient (r’TWS), and (3) the root mean square error (RMSETWS) between the simulated and observed TWS anomalies51,75, where the r’TWS and RMSETWS scores were integrated from the grid level to the sub-basin level to achieve consistency with KGEdis (see the following methods).
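As an illustration of the first objective, the sketch below computes a modified Kling-Gupta efficiency for paired discharge series. It assumes the commonly used modified form that combines the linear correlation, the ratio of means, and the ratio of coefficients of variation; the exact formulation applied here is the one given in the cited references, and the function name and discharge values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def modified_kge(sim, obs):
    """1 - sqrt((r - 1)^2 + (beta - 1)^2 + (gamma - 1)^2) for paired series."""
    if len(sim) != len(obs) or len(sim) < 2:
        raise ValueError("sim and obs must be paired series of length >= 2")
    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = fmean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)  # linear correlation
    beta = mu_s / mu_o  # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)  # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)


# Hypothetical daily discharge (m3/s) at one gauge.
observed = [820.0, 910.0, 1500.0, 2300.0, 1900.0, 1200.0]
simulated = [790.0, 950.0, 1400.0, 2450.0, 1800.0, 1150.0]
print(f"KGE_dis = {modified_kge(simulated, observed):.3f}")
```

A perfect simulation scores 1, and a simulation that doubles every observed value keeps the correlation and variability ratio at 1 but scores 0 because the bias ratio is 2.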

The scarce valuable gauge discharge data have long been a primary concern for LMRB research and resource management20,76. In contrast, the satellite-observed TWS data, as a reliable constraint that improves watershed simulations, were integrated considering variables such as the GW and SM26,77. Accordingly, the KGEdis had a higher weight of 0.7 for the fitness comparisons in NSGA-II, compared to the other two objectives that need to be maximized (i.e., r’TWS and 1-RMSETWS). Although the r’TWS and RMSETWS are comparable in terms of importance, we found that the r’TWS, which exhibited larger variations, had higher sensitivity for the Pareto multi-optimization, so their weights were set to 0.2 and 0.1, respectively. These weights were also used to determine the metric (M) for selecting the optimal parameters among the Pareto frontiers (Eq. (5)), as follows:

$$M=0.7\times {KG}{E}_{{dis}}+0.2\times {r^{\prime} }_{{TWS}}+0.1\times \left(1-{RMS}{E}_{{TWS}}\right)$$
(5)

where r’TWS and RMSETWS were calculated in each 0.5° cell and then averaged for each sub-basin, consistent with KGEdis. The observed TWS anomalies at a 0.5° resolution were the ensemble mean of the Jet Propulsion Laboratory mass concentration (mascon) product (0.5° resolution)78 and the Center for Space Research mascon product (0.25° resolution)79 following the recent study80. The simulated TWS anomalies were implemented with an area-based upscaling from 5 arcmin to 0.5° during the calibrations. Such a multi-objective calibration strategy has been proven to have the ability to decrease the equifinality of the parameterization and reduce the uncertainties of hydrological predictions26,27.
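Selecting the final parameter set from the Pareto front with Eq. (5) amounts to a weighted ranking, as in the minimal sketch below. The candidate scores and dictionary keys are hypothetical, and the sketch assumes that RMSETWS is expressed on a scale for which 1 − RMSETWS is a meaningful score, which is not specified above.

```python
def selection_metric(kge_dis, r_tws, rmse_tws):
    """Eq. (5): M = 0.7 * KGE_dis + 0.2 * r'_TWS + 0.1 * (1 - RMSE_TWS)."""
    return 0.7 * kge_dis + 0.2 * r_tws + 0.1 * (1.0 - rmse_tws)


def pick_optimal(pareto_front):
    """Return the Pareto-optimal candidate with the largest M."""
    return max(
        pareto_front,
        key=lambda c: selection_metric(c["kge_dis"], c["r_tws"], c["rmse_tws"]),
    )


# Hypothetical Pareto-optimal candidates for one sub-basin.
front = [
    {"name": "a", "kge_dis": 0.80, "r_tws": 0.60, "rmse_tws": 0.30},
    {"name": "b", "kge_dis": 0.75, "r_tws": 0.90, "rmse_tws": 0.20},
    {"name": "c", "kge_dis": 0.85, "r_tws": 0.40, "rmse_tws": 0.50},
]
best = pick_optimal(front)
print(best["name"])
```

In this example the candidate with the highest discharge score is not selected, because its weaker TWS scores outweigh the small gain in KGEdis.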

The NSGA-II was set up with a population size of 256, a recombination pool size of 80, and 60 generations to ensure convergence. For each set of parameters, the CWatM-MODFLOW was used to conduct a warmup run for 20 years using the daily average meteorological forcing for 1980‒2010 20 times, which proved to be sufficient to initialize the water table25,51,81. Since the hydrologic predictions for each sub-basin were largely impacted by the discharge from the upstream area of each sub-basin, the calibration and validation were employed consecutively from the upstream sub-basins to the downstream sub-basins. The overall performance is shown in Fig. S1b, which shows that our multi-objective parameter optimization method can provide a robust representation of both the surface and subsurface hydrological processes. We found that the performance of the TWS simulations was poor for the first sub-basin (Fig. S9), which was probably because the parameterization of the groundwater system in this region was data-limited and had large uncertainties. Fortunately, this region was not the primary focus of the analysis in this research.

Experimental design and data analysis

The hydrological model was run for two land cover scenarios28, namely LAND_1980 exp and CTRL exp. Accordingly, ∆ values in each of the variables were the differences between CTRL exp and LAND_1980 exp, reflecting the impacts of deforestation and irrigation on the water cycle, and these impacts occurred in different combinations among the grid cells: Irrigation(non-DF), Irrigation(DF), and non-Irrigation.

Kendall’s rank correlation, which avoids linearity assumptions, was applied to identify the relationships between the aridity index and SM (denoted as r(Aridity index, SM)) and between the GW and SM (denoted as r(GW, SM))19,36. The aridity index was calculated as the ratio of precipitation to potential evapotranspiration, representing the hydro-climatic conditions that could control the SM82, and the variabilities of the GW and SM were found to be consistent in different hydro-climatic conditions83,84. In Figs. 13, the aridity index is based on the annual amounts of precipitation and potential evapotranspiration, which were summed using the daily simulated values, and the SM and GW are the annual averages of the daily simulated values. After calculating these values in each grid of the LAND_1980 exp and CTRL exp, the change in r (i.e., ∆r(Aridity index, SM) or ∆r(GW, SM)) was obtained by subtracting the values of LAND_1980 exp from those of CTRL exp. Therefore, the ∆r was positive when r was higher in CTRL than in LAND_1980.
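The calculation of r and ∆r can be sketched for a single grid cell as follows. This is a minimal illustration using only the Python standard library. The tau-b variant (which corrects for ties) is an assumption, because the variant used here is not stated, and all function names are hypothetical.

```python
def kendall_tau_b(x, y):
    """Kendall's rank correlation (tau-b) between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both terms
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = ((untied + tied_x_only) * (untied + tied_y_only)) ** 0.5
    return (concordant - discordant) / denom if denom > 0 else float("nan")


def aridity_index(annual_precip, annual_pet):
    """Ratio of annual precipitation to annual potential evapotranspiration."""
    return [p / e for p, e in zip(annual_precip, annual_pet)]


def delta_r(ctrl_x, ctrl_y, land1980_x, land1980_y):
    """Change in r: value for CTRL exp minus value for LAND_1980 exp."""
    return kendall_tau_b(ctrl_x, ctrl_y) - kendall_tau_b(land1980_x, land1980_y)
```

For one grid cell, `delta_r` would take the annual aridity index (or annual mean GW) and the annual mean SM from each experiment, so a positive result means that the rank correlation is stronger in CTRL exp.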

The long-term changes in the variables (e.g., ∆SM and ∆GW) were calculated based on a non-parametric estimation method (Eq. (6))76 for each grid cell and each day. In Figs. 2 and 3, the changes were calculated in each grid cell for the annual averages of the SM and GW and the annual amounts of the irrigation withdrawal, groundwater abstraction, recharge, baseflow, surface runoff, evapotranspiration, and runoff. The recharge is defined as the net groundwater recharge from the soil (i.e., recharge minus capillary flow). In addition to these overall changes during the study period (1980–2010), the seasonal variability shown in Fig. 4 was calculated from the daily ∆ values (e.g., ∆Runoff) on the same calendar day across the 31-year period, where each daily ∆ value was averaged over each type of grid cell (Irrigation(non-DF)/Irrigation(DF)/non-Irrigation). In both cases, the change is given by

$$\Delta \,{{\rm{change}}}=k\times n$$
(6)

where n is the number of years in the study period (i.e., 31), and k is the Sen’s slope estimated for each grid cell or each day (mm/yr or mm/d) based on the 31 ∆ values for each variable. Sen’s slope estimates the trend of a time series as the slope of the Kendall-Theil robust line (i.e., the median of all pairwise slopes), and it has been widely used to describe the magnitudes of the trends of climate and hydrological variables. The Mann-Kendall test was conducted at a significance level of 0.05 (i.e., a 95% confidence level) for the overall changes.
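Eq. (6) and the accompanying trend test can be sketched for a single series of annual ∆ values as follows. This is a minimal illustration, assuming evenly spaced annual values; the Mann-Kendall p value uses the normal approximation without a tie correction, which is a simplification, and the function names are hypothetical.

```python
import math
from statistics import median


def sens_slope(values):
    """Sen's slope: median of all pairwise slopes (units per time step)."""
    n = len(values)
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


def total_change(values):
    """Eq. (6): Sen's slope k multiplied by the number of years n."""
    return sens_slope(values) * len(values)


def mann_kendall_p(values):
    """Two-sided Mann-Kendall p value (normal approximation, no tie correction)."""
    n = len(values)
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p value of a standard normal variate: erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

With 31 annual ∆ values that increase by 0.5 mm each year, `total_change` returns 15.5 mm, and the trend would be treated as significant because `mann_kendall_p` is below 0.05.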

As runoff consists of surface runoff and baseflow, the ratios of either |∆(Surface runoff)| or |∆Baseflow| to |∆(Surface runoff)|+|∆Baseflow| were used to quantify their respective contributions to the ∆Runoff changes85. In contrast, the ∆SM and ∆ET have interactions and dependencies in controlling the ∆(Surface runoff)38, so their contributions in such a multivariate system were quantified by employing partial information decomposition (PID)43. PID measures the amount of information that the ∆SM or ∆ET uniquely contributes to the ∆(Surface runoff), the redundant information between ∆SM and ∆ET, and the synergistic information (Eq. (7)).

$$\begin{array}{c}I\left(\Delta {{\rm{SM}}},\Delta {{\rm{ET}}};\Delta \left({{\rm{Surface\; runoff}}}\right)\right)=U\left(\Delta {{\rm{SM}}};\Delta \left({{\rm{Surface\; runoff}}}\right)\right)\\ +U\left(\Delta {{\rm{ET}}};\Delta \left({{\rm{Surface\; runoff}}}\right)\right)+R\left(\Delta {{\rm{SM}}},\Delta {{\rm{ET}}};\Delta \left({{\rm{Surface\; runoff}}}\right)\right)\\ +S\left(\Delta {{\rm{SM}}},\Delta {{\rm{ET}}};\Delta \left({{\rm{Surface\; runoff}}}\right)\right)\end{array}$$
(7)

where I is the total mutual information, and U, R, and S are the unique, redundant, and synergistic information, respectively. The computation was conducted using the method of refs. 43,86, and the unique, redundant, and synergistic contributions of the ∆SM and ∆ET to the ∆(Surface runoff) variability were U/I, R/I, and S/I, respectively. The unique ∆SM and ∆ET information reflects the direct contributions of the irrigation-induced and/or deforestation-induced ∆SM and ∆ET variations to the ∆(Surface runoff), respectively, while the redundant information reflects the contributions of the coupled variations in ∆SM and ∆ET to the ∆(Surface runoff)43. When the ∆(Surface runoff) increased and the redundant information made the largest contribution, we assumed that the increase in the ∆SM drove the increase in the ∆(Surface runoff), despite the negative feedback between the ∆SM and ∆ET. In contrast, when the ∆(Surface runoff) decreased and the redundant information made the largest contribution, the increase in the ∆ET induced by the increase in the ∆SM contributed to the decrease in the ∆(Surface runoff).
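Both attribution steps can be sketched in a few lines. The decomposition below is only an illustration of the structure of Eq. (7): it discretizes the series with equal-width bins and uses the Williams-Beer minimum-specific-information redundancy, which is a swapped-in measure and may differ from the estimator of refs. 43,86. The bin count and all function names are assumptions.

```python
import math
from collections import Counter


def contribution_ratios(d_surface_runoff, d_baseflow):
    """Shares of |d(Surface runoff)| and |dBaseflow| in their absolute sum."""
    total = abs(d_surface_runoff) + abs(d_baseflow)
    if total == 0:
        return float("nan"), float("nan")
    return abs(d_surface_runoff) / total, abs(d_baseflow) / total


def discretize(values, n_bins=5):
    """Equal-width binning (assumed choice; the bin count is arbitrary)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """I(X;Y) in bits for two equally long sequences of discrete symbols."""
    n = len(ys)
    n_xy, n_x, n_y = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        c / n * math.log2(c * n / (n_x[x] * n_y[y])) for (x, y), c in n_xy.items()
    )


def specific_information(xs, ys, y_value):
    """Information (bits) that X carries about the single outcome Y = y_value."""
    n = len(ys)
    n_xy, n_x, n_y = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        # p(x|y) * log2(p(y|x) / p(y))
        c / n_y[y_value] * math.log2((c / n_x[x]) / (n_y[y_value] / n))
        for (x, y), c in n_xy.items()
        if y == y_value
    )


def pid(x1, x2, y):
    """Split I(X1,X2;Y) into unique, redundant, and synergistic parts (bits)."""
    n = len(y)
    i1 = mutual_information(x1, y)
    i2 = mutual_information(x2, y)
    i12 = mutual_information(list(zip(x1, x2)), y)
    # Williams-Beer redundancy: expected minimum specific information.
    r = sum(
        c / n * min(specific_information(x1, y, v), specific_information(x2, y, v))
        for v, c in Counter(y).items()
    )
    u1, u2 = i1 - r, i2 - r
    s = i12 - u1 - u2 - r
    return {"I": i12, "U1": u1, "U2": u2, "R": r, "S": s}
```

Here `x1`, `x2`, and `y` would be the discretized ∆SM, ∆ET, and ∆(Surface runoff) series, and the fractional contributions follow as `U1/I`, `U2/I`, `R/I`, and `S/I` whenever `I` is positive.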