Main

Mountains act as a disproportionately large source of freshwater streamflow relative to their global land area (half of freshwater use from approximately 32% mean area1). Globally, an estimated 1.6 billion people living directly downstream of mountain areas dominated by snow and ice could be negatively affected by climatic and socio-economic changes2. Multidecadal decreases in snowpack have been observed worldwide3 and are notable in regions such as the western USA where gages and long-term observations are extensive4. High interannual variability in snowmelt-generated streamflow is widely noted and poorly predictable5,6,7. In addition, there is uncertainty in the physical mechanisms by which snowpack changes affect the water budget and streamflow yield, due to nonlinear atmosphere-through-bedrock connections8,9,10. These uncertainties are critical to resolve because water resource infrastructure is built and managed under the assumption of snowpack stationarity.

In the American West, declines in runoff ratio (the ratio of streamflow to precipitation) have been linked to changes in precipitation magnitude, with temperature being a secondary influence11. Other studies have found invariant runoff efficiency trends, despite seasonal shifts in snowmelt timing12. However, groundwater has traditionally been an overlooked component of mountain water budgets or has been assumed to be negligible13,14,15. But a growing body of work has demonstrated the dynamic nature of mountain groundwater15,16,17,18,19,20,21 and its capacity to support streamflow. The role of groundwater in mountain streamflow generation is not easily understood given a dearth of groundwater observations in mountain catchments14,22,23,24. In lieu of these direct observations, long-term reductions in winter baseflow have recently been used as a proxy for groundwater storage, showing a relationship between groundwater and reduced runoff ratio over time, especially in warmer and drier western US catchments6,25, and the potential for high-elevation baseflow to decline with climate change10.

To address this data gap and the connection between snowpack, headwater groundwater and streamflow, more than a decade of scientific instrumentation and analysis has been conducted in the mountainous East River watershed of the Upper Colorado River Basin26 (Colorado, USA; Supplementary Fig. 1). These efforts include a novel dataset of continuous water table depth (WTD) measurements along a well transect in the lower montane region beginning in 201627,28. Previous work revealed that the higher-elevation well showed >1 m WTD decline during baseflow over 6 years despite high interannual snow variability (approximately ±30% peak snow water equivalent (SWE)) at nearby Snow Telemetry (SNOTEL) stations. These declines in groundwater storage are contrary to the long-held assumption of static groundwater storage in mountain systems, which has only recently been challenged14,29,30. Further, analysis of the well groundwater age (interpreted through a novel suite of environmental tracer observations including dissolved noble gases, chlorofluorocarbons, SF₆ and tritium) suggests the presence of very old groundwater, spanning decades to millennia31. Thus, not only is groundwater declining, but the groundwater being lost could derive from recharge that occurred long ago, indicative of unsustainable conditions. Two-dimensional simulations of this well transect bolster the evidence suggesting that old groundwater contributes to hillslope and floodplain dynamics32,33. These results are in line with global stream water analysis of stable isotopes showing lower fractions of young stream water in steep terrain, albeit at lower elevations than those considered here34 (<2,500 m; Supplementary Fig. 2). Collectively, these findings prompt us to hypothesize that groundwater, and specifically old-age groundwater, may be buffering streamflow response, especially in low-snow years. If groundwater currently buffers streamflow, it is further unknown whether groundwater storage will be able to support future streamflow with projected snow loss9. Motivated by these observation-driven questions, we use high-performance computing, high-resolution integrated hydrologic simulations and advanced particle tracking to simulate water age in the East River watershed (Methods).

Interannual water budget is variable and altered by warming

To compare interannual watershed fluxes, we simulated water input (precipitation (P)), output (total evapotranspiration (ET) and streamflow (Q)) and change in groundwater storage (ΔS) for a 7-year baseline and two atmospheric warming scenarios (Methods). Precipitation is kept constant from baseline in the warming scenarios (Methods). Despite high interannual variability of P (varying by more than 300 mm per year, mean of 649 mm per year, s.d. of 112 mm per year; Fig. 1a), normalized annual ET fluxes vary little between years (mean of 133 mm per year, s.d. of 10 mm per year; Fig. 1a), consistent with previous studies that suggest an energy-limited environment30,35. However, annual Q fluxes range between 390 and 601 mm per year (mean of 502 mm per year, s.d. of 84 mm per year) over the 7-year period and increase partly in response to P, although nonlinearly. The net impact on ΔS is bidirectional: 4 years show groundwater storage loss (WY2015, WY2016, WY2018 and WY2020, where ‘WY’ is water year), 2 years show minimal or near-neutral change (WY2017 and WY2021) and 1 year shows a notable groundwater storage increase (WY2019). We note that WY2019 is the wettest year and one of the years with the highest snow fraction (77.3% snow; Supplementary Table 1), but it follows the driest WY in the simulation period, which leaves greater capacity for groundwater storage to increase. The greatest decline in ΔS occurs during the lowest P year (WY2018), which yields a moderate snow fraction compared with other years (71.0% snow; Supplementary Table 1). This WY is also the one in which precipitation phase is most sensitive to air-temperature increases (Fig. 2a).

Fig. 1: Annual water budgets show groundwater is non-stationary and mostly partitioning to streamflow that is projected to decline with warming due to more evaporation.

a, Simulated annual water budget for WYs 2015–2021 (WY15–21), including precipitation (P), evapotranspiration (ET), discharge (Q) and change in groundwater storage (ΔS), expressed as heights of water equivalents for baseline atmospheric temperature. b,c, The change in water budget terms relative to baseline given +2.5 °C (b) and +4.0 °C (c) of warming, revealing a trade-off in which increased evaporative demand under warming reduces streamflow and subsequently exacerbates declines in groundwater storage, especially in low-snow years.

Fig. 2: Multi-year declines in groundwater storage occur despite interannual variability in precipitation and rain–snow partitioning; the rate of decline is exacerbated with warming.

a, Simulated cumulative annual precipitation showing rain–snow partitioning in the baseline and warming experiments. Note that the orange and red areas are additions of rain in the respective warming scenarios and were originally snow in the baseline scenario. Increases in rain fraction from baseline range from 4.2% to 26.9% or 1.7% to 6.7% per °C of warming (Supplementary Table 1). b, Simulated hourly groundwater storage dynamics and associated decreasing trend lines through time for each scenario.

On average, annual changes to the water–energy budget from atmospheric warming are nearly proportional per °C across the two warming scenarios. ET increases by an average of ~16–17 mm per year per °C of warming, Q decreases by ~15–16 mm per year per °C of warming and ΔS decreases by ~1 mm per year per °C of warming. WY2019, which experienced the largest increase in groundwater storage in the baseline simulation owing to increased groundwater storage capacity following the previous dry year, shows very little change in storage from baseline with warming (Fig. 1b,c). This suggests a limited ability of a wet year to recover groundwater declines from a previous dry year in the presence of additional warming.

The increase in air temperature produces a different increase in rain fraction each year (Fig. 2a and Supplementary Table 1), ranging from 4% to 27%. On average, the warmer numerical experiment results in slightly higher increases in rain fraction per °C of warming than the lower warming experiment (3.8% per °C and 3.2% per °C, respectively), suggesting less vulnerability to precipitation-phase change at lower temperatures for this watershed. The driest year (WY2018) shows the greatest annual increase in rain fraction in the two warming scenarios (increases of 14% with +2.5 °C and 27% with +4.0 °C). Notably, increased rainfall occurs during the WY2018 winter months, producing rain-on-snow events that facilitate snowmelt.

Increases in mid-winter groundwater storage volumes, atypical for seasonally snow-covered watersheds at this elevation, do not occur in any year of the baseline simulation but do occur in the winter of WY2018 for both the 2.5 and 4.0 °C warming scenarios (Fig. 2b). By contrast, years with small increases in rain fractions with warming (for example, WY2017, WY2020 and WY2021) are characterized by only winter shoulder season increases in rain and do not appear to impact the temporal dynamics of groundwater recharge, mostly mirroring the time series of the baseline dynamics, albeit at a lower groundwater storage level after consecutive years of groundwater decline (Fig. 2b). Despite the bidirectional change in ΔS at the annual scale, over the 7-year time series, groundwater storage shows a steady decline in all scenarios, indicative of a losing groundwater system despite high seasonal and interannual variability. Baseline groundwater storage loss is 9.39 × 10⁶ m³ per decade, whereas a more rapid decline occurs in the warming scenarios (1.14 × 10⁷ m³ per decade and 1.25 × 10⁷ m³ per decade for 2.5 and 4.0 °C, respectively). Over a decade, the difference in groundwater storage loss between the baseline and 4.0 °C case is approximately 3.08 × 10⁶ m³ (or approximately 2,500 acre-feet), which is roughly equivalent to the annual water consumption of several thousand family households in US water management terms, depending on usage36.
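The volume comparison above can be checked with a short calculation. The following Python sketch is illustrative only: it uses the decadal loss rates and the reported difference quoted in the text, and the function and variable names are hypothetical. Note that differencing the rounded rates gives roughly 3.11 × 10⁶ m³ rather than the reported 3.08 × 10⁶ m³, which is presumably a rounding effect.

```python
# Decadal groundwater storage loss rates quoted in the text (m^3 per decade).
LOSS_M3_PER_DECADE = {
    "baseline": 9.39e6,
    "+2.5C": 1.14e7,
    "+4.0C": 1.25e7,
}

# 1 acre-foot = 43,560 ft^3 and 1 ft = 0.3048 m exactly.
M3_PER_ACRE_FOOT = 43560 * 0.3048 ** 3


def m3_to_acre_feet(volume_m3: float) -> float:
    """Convert a volume in cubic metres to acre-feet."""
    return volume_m3 / M3_PER_ACRE_FOOT


def extra_loss_vs_baseline(scenario: str) -> float:
    """Additional storage loss (m^3 per decade) relative to baseline,
    computed from the rounded rates above."""
    return LOSS_M3_PER_DECADE[scenario] - LOSS_M3_PER_DECADE["baseline"]


if __name__ == "__main__":
    reported_difference_m3 = 3.08e6  # difference stated in the text
    print(f"{m3_to_acre_feet(reported_difference_m3):.0f} acre-feet")
    print(f"{extra_loss_vs_baseline('+4.0C'):.3g} m^3 per decade")
```

The conversion factor is derived from exact unit definitions, so the only approximations are the rounded rates themselves.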

Warming disproportionally drains high-elevation groundwater

Warming-induced increases in rain during winter shoulder seasons occur in all years of both warming scenarios (Fig. 2) and result in more groundwater recharge and smaller depths to water table relative to baseline (d(WTD); Fig. 3a,b, green regions). This is most evident at the beginning of both warming scenario simulations and becomes more pronounced in May and June as seasonal snow melts earlier in the year, despite less snow having accumulated during the winter (Fig. 3c). While warming-induced increases in rain and early snowmelt are observed for all WYs, smaller depths to water table are temporary and muted by the simultaneous accelerated loss of groundwater that is also driven by warming (Fig. 2b), in part owing to increased evaporative demand (Fig. 1b,c). Thus, reprieves in groundwater loss via increased rain and earlier snowmelt are short-lived (d(WTD) >0.5 m generally occurs for only 1–3 weeks per year). By the WY end, deeper water tables are the net result of atmospheric warming (Fig. 3a,b, pink regions). Furthermore, because WY simulations are run continuously, the end-of-year loss of groundwater grows over time and, notably, is most severe at the highest elevations (>3,700 m). Generally, higher-elevation water tables are more topography driven than those in the valley and demonstrate greater seasonal variability (Supplementary Figs. 19 and 20) and thus are more impacted. A snapshot of the cell-based WTD declines during the period of greatest decline shows that local changes in WTD from baseline are as large as −7 m (Supplementary Fig. 5). This suggests that (1) local regions in high-elevation watersheds may be more susceptible to groundwater storage loss owing to atmospheric warming than lower elevations and (2) although short periods of groundwater increase may occur seasonally with atmospheric warming, the overall impact on groundwater storage is a net loss by WY end.

Fig. 3: Warmer temperatures result in earlier seasonal snowmelt and higher-than-historical WTDs for short (multiweek) periods before dropping to lower-than-historical levels, especially at the highest elevations (>3,700 m).

a,b, Values represent daily change in simulated WTD relative to baseline as a function of elevation (in km above sea level, kmasl) for the full time series (WY2015–WY2021) given +2.5 °C (a) and +4.0 °C (b) atmospheric warming. Elevation–time patterns are similar for both warming scenarios. c, Watershed-average SWE for baseline and two warming scenarios (see legend).

Groundwater buffers streamflow response to low-snow years

To assess runoff ‘efficiency’ in historical and future warming scenarios, we compute simulated annual runoff ratios which vary between 70.2% and 82.1% for the baseline simulation (mean of 77.6%, s.d. of 4.9%; Fig. 4a and Supplementary Table 2). Comparatively, these are lower than at a century-long US Geological Survey (USGS) gauge (Supplementary Fig. 6c), which is expected given the higher-elevation boundary of the modelled flow area compared with that of the USGS gauge and the broader Almont Catchment (Supplementary Fig. 1). All approaches and model scenarios reveal a declining trend in annual runoff ratio. This decline is exacerbated with warming and declines nearly proportionally per degree of warming for the two scenarios (−2.5% per °C and −2.3% per °C for the +2.5 °C and +4.0 °C cases, respectively; Supplementary Table 2a).

Fig. 4: Declines in groundwater during low-snow years buffer streamflow efficiency and follow a similar compensatory trend as ET increases with warming.

a, Simulated annual runoff ratio (Q/P) for the baseline (black) and two warming experiments (+2.5 °C and +4.0 °C in orange and red, respectively). b, The relationship between annual runoff ratio and annual GSR (ΔS/P) for each scenario, with symbols denoting WY (see a). Point sizes are scaled by the magnitude of annual precipitation in a and b. c, Ternary diagram of runoff ratio, GSR and evaporative index (ET/P).

To explore how these annual changes in streamflow behaviour relate to groundwater dynamics, we calculate the annual increase or decrease in groundwater storage and relate it to annual precipitation through the new metric ‘groundwater storage ratio’ (GSR; Fig. 4b). High annual runoff ratio corresponds to years with the greatest relative declines in groundwater storage and vice versa (Fig. 4b), indicating a mechanism wherein groundwater is buffering streamflow. For example, in the lowest-snow year (WY2018), annual runoff ratio peaks in the 7-year period (82.8%; Supplementary Table 2), corresponding to when annual GSR reaches a minimum (−17.1%; Supplementary Table 2). This result supports a ‘groundwater bypassing’ and/or ‘groundwater draining’ conceptual model where (1) new water input bypasses groundwater, failing to replenish storage and instead increasing streamflow and/or (2) existing groundwater reserves are drained to the stream, effectively decreasing storage to subsidize streamflow. Notably, this is contrary to traditional frameworks that assume zero long-term change in subsurface storage. The same dynamics are noted for the warming scenario results where, similar to runoff ratio, warming results in proportional changes to GSR (−0.2% per °C; Supplementary Table 2b). A ternary diagram (Fig. 4c) further compares changes in runoff ratio and GSR with evaporative index, defined as the annual ET total normalized by annual precipitation (Methods). Here, we see how annual changes in groundwater storage co-vary with changes in ET and Q. For example, the years with negative ΔS, which plot off the ternary grid, also have the highest runoff ratio and highest evaporative index. Atmospheric warming consistently results in decreases in runoff ratio, increases in evaporative index and decreases in groundwater storage (Fig. 4c, right, arrows).

Groundwater age dynamics decipher streamflow source

Groundwater age can be used to test the ‘groundwater bypassing’ and/or ‘groundwater draining’ conceptual models, as water age helps to identify source and travel mechanisms before arrival in the stream. Here, we use a Lagrangian particle tracking eco-hydrologic model (Methods) with >48 million particles to represent discrete parcels of water in the watershed. This approach allows us to characterize the age of groundwater flux to streams (Fig. 5a), as well as the transient age of groundwater storage (Fig. 5b). We find that during baseflow, the median age of groundwater flux to streams increases from approximately 4 to 6 years in the baseline, transient 7-year simulation and up to 8 years with 4.0 °C of warming (Fig. 5a). These results are consistent with a recent analysis of 42 headwater catchments in the interior western USA that used stream-sampled tritium to determine water age and found that, in low-permeability basins, baseflow runoff age was on average 6.5 years (s.d. of 1.5 years)29. While we caution against extrapolating these trends beyond the 7-year simulation duration without longer simulations, these results suggest an ageing of baseflow contributions to streams that is consistent with observations.

Fig. 5: The age of baseflow grows with time, in part because groundwater storage is generally ageing.

a, Median age of groundwater flux to streamflow through the 7-water-year period and for the different warming scenarios (see legend). During baseflow, the age of groundwater flux is nearly 2 years older at the end of the 7-year baseline period and up to 3 years older with 4.0 °C of warming. b, Corresponding annual trends of subsurface storage mean ages, with and without atmospheric warming.

To determine the mechanism driving these changes, we delineate the multi-year hydrograph by discrete age bins. In the baseline simulation, streamflow derived from groundwater, defined as any water that enters the subsurface before entering the stream, contributes between 70% and 75% of streamflow per year (Fig. 6a,b), the remainder of which originates from overland flow. Groundwater of >3 years contributes 24–33% of total streamflow per year. While most overland flow originates from the seasonal snowmelt pulse, predominately excess flow, and to a lesser degree summer monsoons, much of seasonal snowmelt infiltrates the subsurface and quickly exits to streams subseasonally (Supplementary Fig. 7, blue hydrograph portions). By contrast, old-groundwater (>3 years) discharges in a semi-constant flux through the 7-year period that is mostly invariant to seasonal snowmelt (coefficient of variation (CV) of 0.05; Fig. 6a). Younger (<1 year) groundwater streamflow contributions are on the same order as old-groundwater contributions but are more interannually variable (CV of 0.24; Fig. 6a) and impacted by individual WY meteorology.
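The coefficient of variation used above to contrast old and young groundwater contributions is the standard deviation divided by the mean. The sketch below is a minimal illustration: the annual values are hypothetical placeholders rather than model output, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return the population standard deviation divided by the mean."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / avg


# Hypothetical annual streamflow contributions (mm per year) for seven
# water years; these are NOT values from the simulations.
old_groundwater = [150.0, 148.0, 155.0, 160.0, 152.0, 158.0, 162.0]
young_groundwater = [170.0, 140.0, 200.0, 95.0, 230.0, 150.0, 165.0]

if __name__ == "__main__":
    print(f"old (>3 years):  CV = {coefficient_of_variation(old_groundwater):.2f}")
    print(f"young (<1 year): CV = {coefficient_of_variation(young_groundwater):.2f}")
```

A low CV indicates a contribution that is nearly constant from year to year, which is the sense in which the old-groundwater flux is described as semi-constant.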

Fig. 6: Streamflow is subsidized by a semi-constant flux of old (>3 years) groundwater, partly compensating for declines in intermediate (1–3 years) groundwater that is lost faster with warming.

a, Annual simulated streamflow partitioned into surface water (overland) and groundwater contributions on the basis of age (0–1 years, 1–3 years and >3 years). The 7-year CVs are denoted in each subpanel, with text colour indicating simulation scenario (Fig. 5, legend). b, Corresponding annual discharge fractions; under each subpanel, the 7-year trend line slopes and confidence intervals are listed. c, Summary schematic demonstrating the notable differences between a contemporary and warmer climate given the hydrologic and age distribution simulations presented here.

To determine the annual age distribution of streamflow, we divide the annual total streamflow of each age bin by the annual total streamflow of all bins (Fig. 6b). Notably, the lowest-snow year (WY2018) results in the lowest fractions of overland flow and 0–1 year-old water and the greatest fractions of 1–3 year- and >3 year-old waters. This result supports the hypothesis of groundwater ‘draining’ as opposed to ‘bypassing’, given the apparent stability of the older water during drought. It also helps to confirm that the driving mechanism supporting streamflow response in low-snow years is groundwater, not earlier snowmelt. The rate of change through time is also calculated for each age bin (Fig. 6b, slopes and confidence intervals). The streamflow contribution of intermediate-age (1–3 years) groundwater declines quickly with time, while the streamflow contribution of old (>3 years) groundwater increases with time, compensating for some of that loss. These changes explain in part how the median age of groundwater flux gets older (Fig. 5a).
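The age-bin normalization described above amounts to dividing each bin's annual total by the sum over all bins. A minimal Python sketch follows; the bin labels and the example totals are hypothetical and are not taken from the simulations.

```python
# Hypothetical labels for the four streamflow source bins used in Fig. 6.
AGE_BINS = ("overland", "gw_0_1yr", "gw_1_3yr", "gw_gt_3yr")


def age_fractions(annual_totals):
    """Divide each bin's annual streamflow by the total across all bins."""
    total = sum(annual_totals[name] for name in AGE_BINS)
    if total <= 0:
        raise ValueError("total annual streamflow must be positive")
    return {name: annual_totals[name] / total for name in AGE_BINS}


# Hypothetical annual streamflow totals (mm per year) for one water year.
example_year = {
    "overland": 120.0,
    "gw_0_1yr": 130.0,
    "gw_1_3yr": 60.0,
    "gw_gt_3yr": 90.0,
}

if __name__ == "__main__":
    for name, fraction in age_fractions(example_year).items():
        print(f"{name}: {fraction:.1%}")
```

Because the fractions sum to one by construction, a drop in one bin's share must be offset by gains in the others, which is why a low-snow year can show larger old-water fractions even when the old-water flux itself is nearly constant.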

The decline in intermediate-age groundwater streamflow contribution is especially exacerbated with warming: a notable −12% groundwater discharge fraction per decade with 4.0 °C of warming, equivalent to nearly a twofold decrease from baseline. A possible explanation for why this decline occurs more quickly with warming is that groundwater storage is getting older (Fig. 5b) and intermediate-age water is ‘next available’ to discharge, although further studies are required to confirm this. Although data are too limited to infer the transferability of these results, ageing mountain groundwater storage is consistent with spring tracer observations at the Sagehen basin (Sierra Nevada Mountains, California, USA), which could be explained by decreasing recharge rates due to less snowmelt19.

Implications for mountain watersheds and end-users

Despite headwater, snow-dominant systems being the source of downgradient water resources, little is known about the fundamental processes driving streamflow generation and responses to warming and declines in snow. While limited in representing all biohydrologic processes, insights from a process-based hydrologic model and particle tracking approach allow us to differentiate streamflow sourced from overland flow versus age-differentiated groundwater as a function of high- and low-snow years and in the absence or presence of atmospheric warming.

By relating runoff ratios and GSRs, we show a mechanism in which groundwater buffers stream response by providing a semi-constant source of old (>3 years) water to the stream. For the lowest-snow year, the omission of the oldest water from the hydrograph would reduce the annual streamflow by approximately 10%, although we caution that this estimate is based on simulations that were generated on the basis of a full distribution of ages, so simple subtraction is probably inexact. With atmospheric warming, total streamflow yield declines and becomes older, and warming results in groundwater declines that occur disproportionately at the highest watershed elevations. Declining water storage in the alpine and subalpine could have implications for transfer of water out of the basin through interflow, which previous studies have identified as a key driver of subsurface flow in mountain catchments30. Furthermore, geochemical dependencies have strong potential to be impacted by these declines37,38,39.

These findings are most relevant for systems where groundwater contributions to streamflow are substantial, a process that is potentially heightened during low-snow years. If ‘old’ water is supporting streamflow, that in turn affects how stable or unstable the stream’s response is. Further, these snow–groundwater–streamflow interactions point to potentially unsustainable conditions in the future and depletion of those groundwater reserves (Fig. 6c). This is important as headwater catchments undergo more groundwater characterization, because our results suggest that such information may reveal potential tipping points in future streamflow behaviour. While we did not explore hydrologic impacts of multiple low-snow years in sequence9, our findings show that groundwater buffering is an important component of mountain water budgets and highlight the potential for deviations from historical conditions with atmospheric warming.

Methods

Site description

The East River watershed is a community testbed led by Lawrence Berkeley National Laboratory with support from the US-DOE Biological and Environmental Research (BER) and Environmental System Science (ESS) programs and other partner programs, which has been developed over the last decade to study (via instrumentation and simulation) a 300-km² mountainous headwater system in detail. The East River is one of two major tributaries of the Gunnison River, which supplies roughly half of the Colorado River’s streamflow at the Colorado–Utah border (Supplementary Fig. 1a). The watershed has a mean annual temperature of ~0 °C (with average minimum and maximum temperatures of −9.2 °C and 9.8 °C, respectively) and receives on average 1,200 mm per year of precipitation, primarily in the form of snow. Annual precipitation at a nearby SNOTEL station (Schofield site 737, data since 1986) shows no statistically significant trend, although an increase in September to November precipitation of nearly +10% per decade (P < 0.05) has been observed30. An increase in average annual temperature of 0.44 °C per decade has been noted since 198630.

Elevation spans ~1,400 m with a mean of 3,266 m (Supplementary Fig. 3a), and land cover consists predominantly of evergreen and aspen forests, lower-elevation regions of riparian grass and shrub floodplains and bare soils at high elevations above treeline (Supplementary Fig. 3b, mapped from the 2018 National Ecological Observatory Network flyover40). Subsurface soils (Supplementary Fig. 3c from the Soil Survey Geographic Database (SSURGO)41) and lithology (Supplementary Fig. 3d from Gaskill et al.42) are heterogeneous, indicative of Paleozoic and Mesozoic sedimentary rocks including the Mancos Shale.

A well network along an elevation transect in the lower montane of the Mancos Shale was installed in 2016 and has been reporting WTDs via automated pressure transducers since (see Tokunaga et al.27,28 for further details). The locations of the ‘pumphouse lower montane’ (PLM) wells and of an ISCO flow meter in the East River are shown in Supplementary Fig. 3a. Quality-assured/controlled and gap-filled time series of both datasets are available via ESS-DIVE (see ‘Data availability’ section). WTDs are characterized by annual snowmelt trends, with groundwater recharging in April and May, followed by a long recession period typically beginning in late May or early June and extending through the winter season. The higher-elevation well of the PLM transect shows >1 m decline in baseflow WTD over six WYs (1 October through 30 September of the following year, designated by the calendar year in which it ends), WY2017–2022. Long-term trends in the lower-elevation well are less apparent, even though it is located only 137 m downgradient of the higher-elevation well. This is consistent with environmental age dating analysis at both PLM wells, indicating a greater concentration of premodern water and an increase in mean residence times moving from PLM1 to PLM632. These field-based observations and calculations are used as the basis of a more comprehensive spatio-temporal analysis performed with an integrated hydrologic model of the region and a numerical warming experiment to understand how water partitioning in mountain environments may be impacted by a changing climate and the role of mountain groundwater in these dynamics.

Few studies have attempted integrated hydrologic simulations with validation against in situ groundwater observations at high elevations32,43,44, given a lack of observational data to do so. Here, we compare simulated results with observations of every major component of the water cycle (see Supplementary Information sections III and IV and Supplementary Figs. 11–23) and further validate water partitioning behaviour against 7 years of approximately biweekly stable isotope data (Supplementary Fig. 10). Transient simulations of recent historical conditions are compared with two scenarios of elevated atmospheric warming (see below). Results are used to explore underlying physical processes in hydro-meteorologic variability and snow–groundwater–streamflow interactions that, because of common processes across high-elevation complex terrain, may also be applicable in other headwater systems.

Runoff ratio with century-long streamflow gauge

To contextualize these recent observations with a longer-term record of the watershed’s hydrodynamics, we calculate the runoff ratio at a nearby USGS gauge that has measured daily to subdaily streamflow for over 112 years (Supplementary Fig. 6). Daily streamflow at the outlet of the East River USGS Hydrologic Unit Code 10 (HUC10) is available at the Almont Gauge of the East River (USGS gauge no. 09112500), approximately 30 km downstream from the PLM wells, starting in January 1911. Data beginning at the start of the following WY (1 October 1911) through the end of WY2022 (30 September 2022) are considered. Daily streamflow is aggregated by WY to compare annual totals. Reconstructed estimates of historical daily precipitation and temperature from the Parameter-Elevation Relationships on Independent Slopes Model interpolation method (PRISM)45 at 4-km resolution are used to generate WY annual average values of precipitation and temperature over the same HUC10 catchment area that drains to the Almont streamflow gauge. Annual runoff ratio is calculated as the fraction of annual observed streamflow relative to annual reconstructed precipitation. A 13-year data gap between WY1922 and WY1934 is present in the observed streamflow data. A gap-filled time series of annual streamflow is calculated using a machine-learning approach and the randomForest library in R. Observed flow in the adjoining HUC10 Taylor River at Almont (USGS gauge no. 09110000), where observations began in July 1910 and cover the 13-year data gap at the East River gauge, is used as the training dataset. The locations of the model domain, HUC10 and USGS gauges are shown in Supplementary Fig. 1. Linear regression models are calculated using the stats package in R. As shown in Supplementary Fig. 6, we find a decline of 16% per century (P = 2.7 × 10⁻⁷) when considering the gap-filled time series, which is larger than when the data gap is omitted (10% per century, P = 4 × 10⁻⁴) or when only considering the data after the gap (2% per century, P = 0.55). There is some uncertainty in this machine-learning approach given that the East and Taylor watersheds have unique traits including geology and land cover. However, the percentage of variance explained by the approach is 85.6% with a mean absolute prediction error of 14%, and it therefore provides a meaningful estimate for our purposes.
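The trend calculation described above can be sketched as an ordinary least-squares fit of annual runoff ratio against water year. The example below is written in Python with only the standard library rather than the R stats package used in the study, and it omits the random forest gap-filling and the significance test. The input series are hypothetical placeholders, and expressing the slope in ratio units per century is an assumption about how the reported trend is scaled.

```python
def runoff_ratio(streamflow_mm, precipitation_mm):
    """Annual runoff ratio: streamflow divided by precipitation."""
    if precipitation_mm <= 0:
        raise ValueError("precipitation must be positive")
    return streamflow_mm / precipitation_mm


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical water years with annual streamflow and precipitation (mm);
# these are NOT the gauge or PRISM values.
water_years = [1912, 1913, 1914, 1915, 1916]
streamflow = [420.0, 400.0, 410.0, 380.0, 390.0]
precipitation = [700.0, 690.0, 720.0, 680.0, 710.0]

ratios = [runoff_ratio(q, p) for q, p in zip(streamflow, precipitation)]
slope_per_century = ols_slope(water_years, ratios) * 100.0

if __name__ == "__main__":
    print(f"trend: {slope_per_century:+.3f} runoff-ratio units per century")
```

With a real record, the same slope would be computed separately for the gap-filled series, the series with the gap omitted and the post-gap series to reproduce the three comparisons in the text.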

Water budget partitioning and introduction of GSR

We introduce a new metric analogous to runoff ratio but for groundwater, termed ‘groundwater storage ratio’ (GSR), which we use to determine the annual rate of change of subsurface water storage as a fraction of annual precipitation. This term aids in the comparison of groundwater storage change between years, and for different scenarios, relative to other terms in the water budget (namely, changes in streamflow and evaporation). We define GSR as the following:

$$\mathrm{GSR}=\Delta S/P,$$
(1)

where a closed water budget requires

$$\Delta S=P-\mathrm{ET}-R{,}$$
(2)

and where runoff ratio (RR) and evaporative index (EI) are, respectively,

$$\mathrm{RR}=R/P,$$
(3)
$$\mathrm{EI}=\mathrm{ET}/P{.}$$
(4)

We substitute equation (1), rearranged as ΔS = GSR × P, into equation (2) and divide through by P as follows:

$$\mathrm{GSR}\times P=P-\mathrm{ET}-R,$$
(5)
$$\mathrm{GSR}=1-(\mathrm{ET}/P)-(R/P),$$
(5b)

and substitute equation (3) and equation (4) into equation (5b) as follows:

$$\mathrm{GSR}=1-\mathrm{EI}-\mathrm{RR},$$
(6)
$$\mathrm{GSR}+\mathrm{EI}+\mathrm{RR}=1.$$
(6b)

A three-axis ternary diagram compares each portion of the water budget (streamflow Q, which corresponds to R in equations (2)–(6b); ET; and ΔS), each normalized by P and therefore ranging from 0% to 100%. Negative ΔS values plot off the ternary grid.
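As a worked illustration of equations (1)–(6b), the partitioning for a single water year can be sketched in Python. The function name and the numerical values are hypothetical and chosen only to show the arithmetic; inputs are annual depths in the same units.

```python
def partition_water_budget(precip, et, runoff):
    """Return (GSR, EI, RR) for one water year from annual P, ET and R."""
    if precip <= 0:
        raise ValueError("annual precipitation must be positive")
    delta_s = precip - et - runoff  # closed water budget, equation (2)
    gsr = delta_s / precip          # equation (1)
    ei = et / precip                # equation (4)
    rr = runoff / precip            # equation (3)
    return gsr, ei, rr


# Hypothetical wet year: storage gain, so GSR is positive.
wet = partition_water_budget(precip=1000.0, et=450.0, runoff=500.0)

# Hypothetical dry year: ET + R exceed P, so storage is drawn down and GSR is negative.
dry = partition_water_budget(precip=600.0, et=400.0, runoff=260.0)
```

In both cases the three fractions sum to one, as required by equation (6b); the dry-year example has a negative GSR and would therefore plot off the ternary grid.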

Integrated hydrologic modelling with ParFlow–CLM

The integrated hydrologic model ParFlow, coupled with the Common Land Model (CLM) land surface model, is used here to simulate the water–energy balance of the East River watershed. ParFlow simulates variably saturated and fully saturated flow in three dimensions via Richards’ equation and overland flow in two dimensions via the kinematic wave equation46,47,48,49,50. Surface slopes derived from a 2015 LiDAR Digital Elevation Model51 were upscaled to 100-m horizontal resolution. The subsurface comprises five vertical layers, following Foster et al.52, which incorporate spatial heterogeneity of soils and bedrock (Supplementary Fig. 3c,d). CLM simulates a coupled water–energy balance at the surface layer of the domain, incorporating spatially distributed vegetative processes through specified land use types53. Land cover is based on the 2018 NEON flyover54 and is parameterized by the International Geosphere-Biosphere Program (IGBP) database (Supplementary Fig. 3b).

A multistep spin-up process was used to obtain an initial state for the ParFlow–CLM model. First, a constant precipitation value representative of yearly average recharge conditions (P − ET equal to 0.055 mm h⁻¹, characteristic of the region’s annual climate) was applied to the model initialized with a water table at 3 m below the land surface and simulated until the system reached convergence (after approximately 25,000 h). Then, transient hourly forcing from an average-precipitation WY was used to recursively simulate transients typical of an average year. Meteorological variables used to force ParFlow–CLM are hourly and include: longwave and shortwave radiation, precipitation, 2-m air temperature, east–west and south–north wind speeds, atmospheric pressure and specific humidity. Following Foster et al.52, WY2006 was found to most closely represent historical means for the region (1985–2014). This determination included analysis of meteorological forcing from NLDAS-2 and in situ observations including the two SNOTEL stations, a CASTNET station and weather stations within 50 km of the domain. WY2006 was recursively simulated for 8 years until the change in year-end subsurface storage was less than 1% of precipitation entering the domain, indicative of model convergence. The same spin-up conditions were used for the baseline simulation and the atmospheric warming simulations, as a direct comparison between all scenarios was desired on the basis of the aridity of the historical climate. This enabled an experimental design where we could determine how the hydrologic system would respond to a warmer forcing. Details for the EcoSLIM spin-up are described below.
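The recursive spin-up criterion (year-end storage change below 1% of annual precipitation) can be sketched as a simple loop. This is an illustrative Python sketch only: `toy_year` is a hypothetical linear-reservoir stand-in for one year of ParFlow–CLM simulation, and its parameters are arbitrary rather than calibrated to the East River.

```python
def spin_up(run_one_year, storage0, annual_precip, tol=0.01, max_years=50):
    """Repeat the same forcing year until |year-end storage change| / P < tol.

    Returns the converged storage and the number of years simulated.
    """
    storage = storage0
    for year in range(1, max_years + 1):
        new_storage = run_one_year(storage)
        relative_change = abs(new_storage - storage) / annual_precip
        storage = new_storage
        if relative_change < tol:
            return storage, year
    raise RuntimeError("spin-up did not converge within max_years")


def toy_year(storage, recharge=480.0, k=0.4):
    """Hypothetical stand-in for one simulated year: recharge in, linear drainage out."""
    return storage + recharge - k * storage


final_storage, n_years = spin_up(toy_year, storage0=0.0, annual_precip=1000.0)
```

The number of years required depends entirely on the model being spun up; the toy reservoir is not tuned to reproduce the 8 years reported above.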

Given the need for hourly meteorological forcing, we use data from the North American Land Data Assimilation System (NLDAS-2) as the primary forcing dataset55. NLDAS-2 has a grid spacing of 1/8 degree and is therefore downscaled to the 100-m horizontal grid used by ParFlow using a bilinear interpolation scheme. Given the uncertainty in meteorological forcing in mountainous regimes56,57,58,59, a rigorous sensitivity analysis on key atmospheric input variables was performed, with ParFlow–CLM output compared against in situ, airborne and remotely sensed data. Sensitivity tests of bias correction using PRISM 2-m air temperature and daily precipitation were performed independently and in conjunction with each other to assess model skill for the ParFlow–CLM outputs of interest (Supplementary Table 3, scenarios). A full description of the forcing sensitivity analysis and statistical model performance is provided in Supplementary Information section III.
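Bilinear interpolation of a coarse forcing field onto a finer grid can be sketched as below. This is a generic Python illustration working in grid-index space; it ignores map projection and the actual NLDAS-2 and ParFlow grid geometries, and the function names and refinement factor are hypothetical.

```python
def bilinear(grid, x, y):
    """Interpolate grid[j][i] (at least 2 x 2) at fractional column x and row y."""
    ny, nx = len(grid), len(grid[0])
    # Clamp the query point to the grid extent.
    x = min(max(x, 0.0), nx - 1.0)
    y = min(max(y, 0.0), ny - 1.0)
    # Lower-left corner of the enclosing cell, kept inside the grid.
    i0 = min(int(x), nx - 2)
    j0 = min(int(y), ny - 2)
    tx, ty = x - i0, y - j0
    row0 = grid[j0][i0] * (1 - tx) + grid[j0][i0 + 1] * tx
    row1 = grid[j0 + 1][i0] * (1 - tx) + grid[j0 + 1][i0 + 1] * tx
    return row0 * (1 - ty) + row1 * ty


def downscale(grid, factor):
    """Resample a coarse grid onto nodes `factor` times finer in each direction."""
    ny, nx = len(grid), len(grid[0])
    out_ny = (ny - 1) * factor + 1
    out_nx = (nx - 1) * factor + 1
    return [
        [bilinear(grid, i / factor, j / factor) for i in range(out_nx)]
        for j in range(out_ny)
    ]


coarse = [[0.0, 10.0], [20.0, 30.0]]
fine = downscale(coarse, 2)
```

Bilinear interpolation preserves the coarse-grid values at their original nodes and varies smoothly between them, which is why it does not by itself add elevation-dependent structure to the forcing.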

Namely, we statistically quantify model performance for (1) streamflow at the watershed outlet60 (at the WF-SFA Pumphouse ISCO; Supplementary Fig. 1b) and (2) maps and distributions of SWE obtained from the Airborne Snow Observatory (ASO)61 with five flights, representing key snapshots of time in the 7-year time series. We find ‘satisfactory’ to ‘good’ Nash–Sutcliffe model efficiency coefficients (NSE) for most years when using NLDAS-2 forcing (NSE 0.41–0.67; Supplementary Table 4) on the basis of common metrics of model performance, and find that only a nominal increase in model skill could be obtained for some years when altering the 2-m air temperature of the NLDAS-2 baseline simulation. Given (1) the hydrograph performance at the watershed outlet under NLDAS-2 forcing, (2) the consistency in runoff ratio and groundwater storage ratio trends across forcing scenarios (Supplementary Fig. 12) and (3) the low percentage bias in all ASO flights (Supplementary Fig. 16), we chose to use a self-consistent NLDAS-2 dataset for the forcing variables used in the full water budget and water age analysis presented here.
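For reference, the two skill metrics named above can be computed as in the following Python sketch. The NSE formula is the standard definition; the sign convention for percentage bias varies between studies, and the one used here (positive when the model overestimates) is an assumption rather than the convention confirmed for the Supplementary tables.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def percent_bias(observed, simulated):
    """Percentage bias; positive values indicate overestimation (assumed convention)."""
    return 100.0 * sum(s - o for o, s in zip(observed, simulated)) / sum(observed)
```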

Further evaluation of model performance using NLDAS-2 forcing is provided in Supplementary Information section IV, specifically evaluating (1) SWE time series at point scale SNOTEL (Supplementary Fig. 17), (2) X-band radar disdrometer snowfall from the SAIL campaign (Supplementary Fig. 18), (3) well WTD (Supplementary Figs. 19 and 20) and (4) remotely sensed evapotranspiration (Supplementary Fig. 21) and Eddy Covariance tower evapotranspiration and transpiration (Supplementary Figs. 22 and 23).

Numerical warming experiments

A delta approach is used as a surrogate climate-change approach to perturb the baseline temperature conditions from NLDAS-2 in two numerical warming experiments. This is done by increasing the 2-m air temperature uniformly in space and time by 2.5 °C and 4.0 °C. No other changes to the forcing are made. This approach allows us to isolate the impact of warming from other expected changes in climate and has been used in several early hydroclimatic change studies62,63,64, as well as in more recent studies63,64. Nonetheless, we acknowledge the limitations on the use of the approach given expected impacts on longwave radiation and humidity that co-vary with increased air temperature, which could be assessed in further studies.
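The delta perturbation amounts to adding a constant offset to one forcing variable while leaving the others untouched. A minimal Python sketch follows; the dictionary layout and the key names (`temp_2m`, `precip`) are hypothetical and do not reflect the actual NLDAS-2 or ParFlow–CLM file structure.

```python
def warm_forcing(forcing, delta_c, temp_key="temp_2m"):
    """Return a copy of the forcing with a uniform offset added to 2-m air temperature.

    A constant offset in degrees Celsius is the same in kelvin, so the unit of the
    temperature series does not matter. All other variables are passed through unchanged.
    """
    warmed = dict(forcing)  # shallow copy; untouched series are shared, not modified
    warmed[temp_key] = [t + delta_c for t in forcing[temp_key]]
    return warmed


baseline = {"temp_2m": [268.0, 271.5, 275.0], "precip": [0.0, 1.2, 0.4]}
scenarios = {delta: warm_forcing(baseline, delta) for delta in (2.5, 4.0)}
```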

From this spatio-temporal dataset of three time series (baseline and two warming experiments), annual water budgets and associated watershed runoff ratio at the site of the ISCO gauge are calculated. To determine the spatio-temporal distribution of groundwater change, we aggregate hourly, cell-based subsurface pressure head values to daily intervals grouped by 14 discrete 100-m elevation bins across the watershed (Supplementary Fig. 4). We then calculate the changes in WTD from baseline for each warming scenario (Fig. 3).
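The elevation-band comparison of water table depth can be sketched as follows for a single time step. This Python illustration assumes per-cell lists of elevation and WTD for the baseline and one warming scenario; the function names are hypothetical and the temporal aggregation from hourly to daily values is omitted.

```python
import math
from collections import defaultdict


def elevation_bin(elev_m, width=100.0):
    """Lower edge (m) of the elevation bin containing elev_m."""
    return math.floor(elev_m / width) * width


def mean_wtd_change_by_bin(elev, wtd_baseline, wtd_warm, width=100.0):
    """Mean change in water table depth (warm minus baseline) per elevation bin."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for z, base, warm in zip(elev, wtd_baseline, wtd_warm):
        key = elevation_bin(z, width)
        sums[key] += warm - base
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}
```

With this sign convention a positive value indicates a deeper water table under warming than in the baseline.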

Groundwater age and fraction of groundwater to streamflow via Lagrangian particle tracking

ParFlow–CLM handles all the water partitioning between the subsurface, surface and atmosphere, while tracking of individual parcels of water (also referred to as ‘particles’) requires an additional code in a Lagrangian framework. The EcoSLIM particle tracking code enables the quantification of flow pathways and water residence times in the subsurface65. ParFlow–CLM is run first and then used as a transient, hourly input for EcoSLIM. In EcoSLIM, particles enter the subsurface at the land surface at all timesteps and spatial coordinates with positive infiltration fluxes. Particles then move as a function of the simulated velocity fields via the advection–dispersion equation until exiting the subsurface either as exfiltration to surface water or as ET. Upon infiltration, particles are assigned an age of zero, and age increases linearly with time until the particle exits. Particle masses directly scale with ParFlow–CLM infiltration, ET and exfiltration fluxes, allowing for calculation of spatially and transiently distributed flux-weighted age distributions in storage and streamflow fluxes32,65,66.

To spin up the age and spatial distribution of EcoSLIM particles for the WY2015–2021 simulations, the final year of the ParFlow–CLM spin-up was used as the initial condition of the EcoSLIM spin-up. The EcoSLIM spin-up was initialized with one particle per model grid cell, with zero age. These particles are not given a precipitation phase (rain versus snow), unlike the transient input of particles in the simulation. The majority of ‘initial’ particles exit the simulation during spin-up (Supplementary Fig. 8). The spin-up run was recursively forced with WY2006 (similar to ParFlow, above) for over 50 years at hourly timesteps, at which point pseudo-steady-state conditions were reached on the basis of minimal change (<0.1%) in the groundwater particle age in the model domain. WY2015 through WY2021 were then simulated at hourly timesteps for the baseline and warming scenario models, all using the end of the EcoSLIM spin-up run as the initial particle conditions (equivalent across baseline and warming models).

The masses and ages of all EcoSLIM particles that discharged within the watershed boundary were aggregated at hourly timesteps and are assumed to represent the total groundwater contribution to surface water at the watershed outlet (WF-SFA Pumphouse ISCO). This assumption will overpredict the watershed-scale groundwater contributions to surface water because it does not explicitly account for exfiltrating particles that re-infiltrate in the stream channels and it sums over all exfiltrating particles, even those that are not connected to streamflow at the watershed outlet (for example, disconnected meadows). To limit the latter scenario, we only consider particles that exfiltrate in cells with pressure heads >0.005 m. This pressure threshold value was chosen such that the EcoSLIM mass-flux-based Q (QEcoSLIM) approximately equalled the ParFlow–CLM Q (QParFlow) at Pumphouse during baseflow conditions when surface water inputs via overland flow and direct precipitation are assumed minimal.

EcoSLIM tracks flow pathways and water age of groundwater only (that is, it does not track overland flow). To determine the fraction of streamflow derived from groundwater at Pumphouse, the ratio of EcoSLIM Q (discretized into age bins) to ParFlow Q (that is, QEcoSLIM/QParFlow) is calculated using annual mass-weighted averages (Fig. 6a). The remaining flux (that is, 1 − QEcoSLIM/QParFlow) was assumed to correspond to overland flow contributions to QParFlow.
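The groundwater fraction calculation can be sketched in Python as below. This is an illustration under stated assumptions, not the EcoSLIM post-processing itself: particles are represented as hypothetical (mass, age, pressure head) tuples, particle mass and QParFlow are assumed to be in the same units over the same period, and the age-bin edges are arbitrary.

```python
from bisect import bisect_right


def groundwater_fraction(particles, q_parflow, age_edges=(0.0, 1.0, 10.0),
                         head_threshold=0.005):
    """Fraction of streamflow from groundwater per age bin, plus the remainder.

    particles: iterable of (mass, age_years, pressure_head_m) for exfiltrating particles.
    Bins are [edge_i, edge_i+1), with the last bin open-ended.
    Returns (fractions_per_bin, overland_fraction).
    """
    fractions = [0.0] * len(age_edges)
    for mass, age, head in particles:
        if head <= head_threshold:
            continue  # skip cells unlikely to be connected to streamflow
        k = max(bisect_right(age_edges, age) - 1, 0)
        fractions[k] += mass / q_parflow
    overland = 1.0 - sum(fractions)
    return fractions, overland


example = [(30.0, 0.5, 0.02), (20.0, 5.0, 0.01), (10.0, 40.0, 0.05), (15.0, 2.0, 0.001)]
by_age, overland = groundwater_fraction(example, q_parflow=100.0)
```

In this example the fourth particle is excluded by the pressure-head threshold, so its mass does not count towards the groundwater fraction.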

Model validation with stable isotope observations

EcoSLIM provides key hydrologic information derived from the several millions of particles it simulates, including quantitative information on streamflow age, infiltration elevations and infiltration phase (rain or snow). To predict a continuous time series of \({\delta }^{18}{\rm{O}}\) from the EcoSLIM output at the watershed outlet, we use the following \({\delta }^{18}{\rm{O}}\) convolution integral:

$${\delta }^{18}{{\rm{O}}}_{\mathrm{Outflow}}(t)=\frac{\mathop{\sum }\limits_{p=1}^{np}\left[{m}_{p,\mathrm{rain}}\,{\delta }^{18}{{\rm{O}}}_{zp,\mathrm{rain}}+{m}_{p,\mathrm{snow}}\,{\delta }^{18}{{\rm{O}}}_{zp,\mathrm{snow}}\right]}{\mathop{\sum }\limits_{p=1}^{np}\left({m}_{p,\mathrm{rain}}+{m}_{p,\mathrm{snow}}\right)},$$

where \({\delta }^{18}{{\rm{O}}}_{\mathrm{Outflow}}\) is the predicted stable isotope of oxygen at the outflow of the watershed at time t, p is a particle exiting the model as surface water with mass m and np is the total number of exiting streamflow particles aggregated over 1 h. Exiting particles are tagged as rain or snow particles on the basis of precipitation phase upon infiltration at elevation zp. Particles for each phase are assigned the average \({\delta }^{18}{\rm{O}}\) of multi-year precipitation observations of rain and snow at a reference location within the watershed67 (Supplementary Fig. 9). For each particle, the associated δ18O is elevation adjusted given the infiltration elevation of the particle within the model on the basis of a lapse rate of −0.16‰ per 100 m elevation with respect to the reference location68.
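The mass-weighted mixing and elevation adjustment can be sketched in Python as follows. Only the lapse rate (−0.16‰ per 100 m) comes from the text; the reference δ18O values for rain and snow, the reference elevation and the particle list are hypothetical placeholders, and the function names are illustrative.

```python
LAPSE_PER_100M = -0.16  # per mil per 100 m, relative to the reference location


def particle_d18o(ref_d18o, infil_elev_m, ref_elev_m, lapse=LAPSE_PER_100M):
    """Elevation-adjusted d18O assigned to a particle at its infiltration elevation."""
    return ref_d18o + lapse * (infil_elev_m - ref_elev_m) / 100.0


def outflow_d18o(particles, ref_values, ref_elev_m):
    """Mass-weighted mean d18O of particles exiting as streamflow in one hour.

    particles: iterable of (mass, phase, infiltration_elevation_m), phase in ref_values.
    ref_values: reference d18O per phase, e.g. {"rain": ..., "snow": ...}.
    """
    weighted_sum = 0.0
    total_mass = 0.0
    for mass, phase, elev in particles:
        weighted_sum += mass * particle_d18o(ref_values[phase], elev, ref_elev_m)
        total_mass += mass
    return weighted_sum / total_mass


# Hypothetical reference values and particles, for illustration only.
reference = {"rain": -8.0, "snow": -18.0}
exiting = [(1.0, "rain", 3000.0), (3.0, "snow", 3500.0)]
mixed = outflow_d18o(exiting, reference, ref_elev_m=3000.0)
```

Because each particle carries either a rain or a snow tag, summing over particles with their phase-specific values is equivalent to the separate rain and snow terms in the equation above.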

A total of 7 years of approximately biweekly stream water stable isotope (δ18O) observations have been obtained in the East River and are compared against computed δ18O values from this particle tracking and convolution integral approach (Supplementary Fig. 10). The particle tracking model outputs generally show good agreement with observed stream water δ18O dynamics, including the seasonal and interannual temporal dynamics of δ18O. Specifically, the model reproduces the average seasonal variation within single years and also correctly captures years with less or more summer rain contributions. Given the intermittent sampling of the approximately biweekly measurements, there are minor mismatches between the model and observations during the summer rains and during the short period when summer baseflow recession begins, which occurs too quickly in the model when compared against the observations. In general, these results suggest the model correctly predicts input water phase (rain versus snow), infiltration elevation and travel time distributions of recent rainfall or snowmelt inputs, providing an additional line of evidence of adequate model performance.

Model limitations and further considerations

The delta approach considered here intentionally isolates the impact of warmer atmospheric air temperatures (for example, increased rain fractions, higher evapotranspiration, and so on) on the transient 7-year hydrodynamics alone and not the long-term transition of the watershed to an altered state or aridity. Because snowpack is a principal driver of hydrologic change in this system and retains no system memory between years, implications of this approach are mostly on the initial state of the groundwater storage system. Future studies may consider equilibration of the future scenarios with a new dynamic equilibrium initial condition; these types of simulations may also consider changes in land cover type including shifts in vegetation species and demographics. Current model limitations include fixed vegetation types that do not have dynamic structure and/or biomass including vegetation mortality and succession, which are new model capabilities in the ParFlow ecosystem of codes69, but have yet to be incorporated into this model. Furthermore, the coupled code structure permits a one-way interaction with the atmosphere, where atmospheric forcing is enforced a priori and not impacted by transient soil moisture or groundwater conditions, for example.

Further work is required to determine the interaction with disturbance such as vegetation change and/or mortality, as well as the role of montane groundwater in different water- and energy-limited environments that have established impacts on snow-driven water inputs70,71,72. Opportunities also exist to compare age distributions of ET with those of streamflow, enabled by the particle tracking approach used here, similar to Dennedy-Frank et al.66.