Introduction

Groundwater provides almost half of all drinking water worldwide, particularly in rural areas1. Yet, it faces substantial threats from socio-economic development and climate change2,3. Serving as a crucial source of drinking water for over 400 cities across China, groundwater is especially vital in the northern regions, where it accounts for two-thirds of the drinking water, half of the industrial water, and one-third of the irrigation water4,5. As the world’s largest industrial powerhouse and a major agricultural producer, China has experienced considerable groundwater quality problems over recent decades6. Extensive pollution discharges from agricultural and industrial activities, coupled with large-scale over-exploitation, have led to widespread groundwater pollution across the nation7,8. In 2020, a report by China’s Ministry of Ecology and Environment indicated that 33.7% of the 10,242 tested sites (dominated by shallow groundwater) had “marginal” groundwater quality, while another 43.6% were classified as “poor”9.

In response to the groundwater quality crisis, nations including China have developed environmental monitoring networks aimed at assessing and mitigating the negative impacts of contamination and over-exploitation on this vital resource10. These networks are important for detecting pollutants early, understanding groundwater flow dynamics, and formulating strategies to increase groundwater sustainability11. However, a key limitation of these networks is the sparse spatial distribution of their monitoring sites, which fails to provide comprehensive national coverage, and monitoring itself is often both costly and time-consuming12. This deficiency is particularly acute in emerging economies like China, where the low density of monitoring sites inadequately captures the spatial variability of groundwater quality and pollution incidents, compromising the accuracy and completeness of monitoring efforts13. Furthermore, considerable challenges persist in comprehensively quantifying the driving factors behind groundwater quality deterioration and in projecting future changes.

To address the lack of comprehensive data for groundwater quality assessment, researchers have increasingly utilized machine learning (ML) to enhance the mapping of contaminants like arsenic and fluoride, which often stem from natural geological sources14,15. Pioneering works by Amini et al.16 and Rodríguez-Lado et al.17 have incorporated environmental factors into ML models to create predictive maps. These efforts are notable for their ability to interpret complex environmental data—including climate, soil, geology, and topography—into useful predictive tools for assessing groundwater quality18,19,20,21. However, these studies primarily focused on static environmental factors as predictors, which may not sufficiently capture the dynamic and evolving impacts of human activities and climate change on groundwater systems.

In this study, we employed extensive groundwater survey data, along with natural and socio-economic variables across China, to develop an ML model for predicting groundwater quality. Utilizing this model, we mapped the annual probabilities of poor groundwater quality (PGQ, i.e., Class V based on the Chinese groundwater quality standard22; Table S1) from 1980 to 2100, under various future scenarios of socio-economic development and climate change. This approach allowed us to illuminate evolving patterns of groundwater quality and assess the profound influence of human activities and climate change on China’s groundwater quality. Our analysis offers crucial insights and guidance for crafting strategies to address the environmental threats that China’s groundwater resources are facing. We underscore the importance of adapting to changes in development patterns and climate conditions to effectively mitigate these challenges.

Results

Random forest modeling

In our study, we extracted groundwater quality data from 1977 published surveys and compiled them into a single dataset. This dataset incorporated geospatial information, including location and land-use type, as well as temporal information represented by the sampling year. The spatial distribution and temporal variation of the survey data are presented in Fig. S1. To obtain a robust and unbiased model, we divided these 1977 groundwater quality surveys into two distinct datasets: 90% for training and 10% for validation. To counterbalance potential issues arising from the relatively small sample size, we applied data augmentation, generating modified duplicates of the existing data to enlarge the training dataset. The validation dataset was augmented independently of the training dataset, ensuring no overlap and maintaining the integrity of our model’s validation process.
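The split-then-augment order described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the jitter-based duplication, the function names `stratified_split` and `augment`, the noise level, and the toy records are all assumptions, since the text only states that modified duplicates were generated and that the two subsets were augmented independently.

```python
import random

def stratified_split(records, label_key, val_fraction=0.1, seed=0):
    """Split records into train/validation while preserving the class ratio."""
    rng = random.Random(seed)
    by_label = {}
    for rec in records:
        by_label.setdefault(rec[label_key], []).append(rec)
    train, val = [], []
    for group in by_label.values():
        group = group[:]                      # copy so the input is not reordered
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val

def augment(records, feature_keys, copies=2, noise=0.02, seed=0):
    """Return the originals plus jittered duplicates (hypothetical scheme)."""
    rng = random.Random(seed)
    out = list(records)
    for rec in records:
        for _ in range(copies):
            dup = dict(rec)
            for key in feature_keys:
                dup[key] = rec[key] * (1.0 + rng.uniform(-noise, noise))
            out.append(dup)
    return out

# Toy surveys: 43 PGQ and 57 non-PGQ records with one numeric predictor.
surveys = [{"id": i, "precip": 400.0 + i, "pgq": int(i < 43)} for i in range(100)]

# Split first, then augment each subset on its own, so that no duplicate of a
# validation survey can leak into the training set.
train, val = stratified_split(surveys, "pgq")
train_aug = augment(train, ["precip"], seed=1)
val_aug = augment(val, ["precip"], seed=2)
```

Augmenting after the split is the design choice that matters here: duplicating first and splitting afterwards would place near-copies of the same survey on both sides of the split and inflate the validation scores.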

To establish a quantitative relationship between predictors and groundwater quality (PGQ or non-PGQ), we employed a non-parametric supervised ML technique known as random forest (RF). To develop a parsimonious classification model, redundant variables were eliminated from an initial selection of 51 potential predictor variables (Table S2). This variable pruning, guided by an evaluation of collinearity and independence (Fig. S2), ensured that only variables whose retention did not compromise the prediction accuracy of subsequent RF models were kept (“Model predictor selection” in Methods). Addressing multicollinearity among the predictors is crucial, as it helps prevent adverse effects such as overfitting, reduced interpretability, and, ultimately, compromised predictive performance of the model23. The 25 predictors finally retained (indicated in bold in Table S2) include factors related to soil properties, geographical and hydrogeological conditions, climate change, groundwater exploitation, pollution discharge, and land-use change. The out-of-bag error associated with each predictor, reflecting its impact on the model’s predictive accuracy24, is used to assess predictor importance (Fig. S3). The five most important predictors are depth-based groundwater type, air temperature, aridity index, precipitation, and sand content of soil (100–200 cm).
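One common way to prune collinear predictors is a greedy filter on pairwise Pearson correlation, sketched below. The 0.9 cutoff, the function names, and the toy columns are assumptions for illustration; the text does not report the criterion used, and the authors additionally checked that pruning did not reduce RF accuracy, which this sketch does not do.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def prune_collinear(columns, threshold=0.9):
    """Keep a predictor only if |r| with every already-kept predictor is below threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical predictor columns; the first two carry nearly the same information.
columns = {
    "elevation_mean": [300.0, 500.0, 800.0, 1200.0, 1600.0],
    "elevation_median": [290.0, 510.0, 790.0, 1190.0, 1620.0],
    "sand_content": [40.0, 22.0, 55.0, 31.0, 47.0],
}
kept = prune_collinear(columns)
print(kept)
```

The result depends on column order, because the first member of a correlated pair is the one retained; ranking the candidates by importance before filtering is one way to make that choice deliberate.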

The performance of the RF model on the validation set (10% of the data, randomly selected while maintaining the relative distribution of PGQ and non-PGQ) is summarized in the confusion matrix in Table S3. Despite a prevalence of PGQ of only 43% in the dataset, the model performs well in predicting both PGQ (sensitivity: 0.82) and non-PGQ (specificity: 0.87) at a probability cutoff of 0.50. The accuracy is correspondingly high at 0.85. Likewise, the model’s area under the receiver operating characteristic curve (AUC), which considers the full range of possible cutoffs, reaches 0.88 on the validation set (Fig. S4). Additionally, the model achieves agreement beyond chance in its classifications, with a Cohen’s kappa coefficient of 0.69. While slightly below the ideal, this value falls within the range of 0.61–0.80 that Landis and Koch defined as substantial agreement25. The slightly lower kappa value reflects the inherent complexity and variability of the environmental data and processes analyzed in this study. The concordance between our model’s predictions and the PGQ maps presented in the “2021 Annual Report of China’s Groundwater Monitoring Project”26 further attests to the model’s accuracy and reliability (Fig. S5).
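The reported accuracy and kappa can be cross-checked from the sensitivity, specificity, and prevalence alone. The sketch below does this arithmetic; the function name is an assumption, and because the inputs are the rounded values quoted in the text, the result is only an approximate reconstruction of Table S3.

```python
def metrics_from_rates(sensitivity, specificity, prevalence):
    """Derive accuracy and Cohen's kappa from class-wise rates and prevalence."""
    # Cell proportions of the 2x2 confusion matrix.
    tp = prevalence * sensitivity
    fn = prevalence * (1 - sensitivity)
    tn = (1 - prevalence) * specificity
    fp = (1 - prevalence) * (1 - specificity)

    accuracy = tp + tn
    # Agreement expected by chance: product of the marginals for each class.
    p_chance = (tp + fp) * prevalence + (tn + fn) * (1 - prevalence)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return accuracy, kappa

accuracy, kappa = metrics_from_rates(sensitivity=0.82, specificity=0.87, prevalence=0.43)
print(f"accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
# prints: accuracy = 0.85, kappa = 0.69
```

Both values agree with those reported, which indicates that the four summary statistics are mutually consistent.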

China’s groundwater quality during 1980–2020

Using the developed model, we produced annual maps of the probability of PGQ across China from 1980 to 2020 at a resolution of 1 km. In 1980, regions with PGQ probability exceeding 0.5 were primarily located in Southwest China, Northwest China, and parts of Northeast China, showing diverse levels of PGQ probability (Fig. 1a). As the 21st century unfolded, the situation evolved, with a substantial increase in PGQ probability observed in North China and Central China (Fig. 1b). These regions, which previously had relatively good groundwater quality, experienced a notable degradation by 2020 (Fig. 1c). The probability of being classified as having PGQ exceeds 0.9 in up to 2.1% of the national area, particularly across most regions of North China, signaling a notable deterioration in groundwater quality. In 1980, only about 17.3% of the national area was affected by PGQ (Fig. 2a). By 2000, this percentage had increased slightly to 22.2%. It then rose substantially to a peak of 41.6% in 2019, followed by a slight decrease to 40.8% in 2020.

Fig. 1: Spatial-temporal patterns of China’s groundwater quality and affected population.

Distributions of PGQ probability in China in 1980 (a), 2000 (b), and 2020 (c); Population in regions with PGQ probability greater than 0.5 in 1980 (d), 2000 (e), and 2020 (f). Map Source: Standard Map Service of the Ministry of Natural Resources, People’s Republic of China [Map Review Number: GS(2019)1822].

Fig. 2: Temporal evolution of PGQ area and affected population.

a Temporal evolution of the PGQ area ratio from 1980 to 2100; b Ratio of the population affected by PGQ over the same period. Future projections (2025–2100) are evaluated under four scenarios (‘Scenario settings’ in “Methods”). Here, regions with a PGQ probability exceeding 0.5 are classified as PGQ-affected areas.

Figure 3a illustrates the spatial distribution of groundwater quality category changes in China from 1980 to 2020, detailing regions of deterioration (non-PGQ changed to PGQ), improvement (PGQ changed to non-PGQ), and stability (unchanged). Across the country, 25.3% of the national area experienced deterioration in groundwater quality, 1.8% showed improvement, and 72.9% remained unchanged. Figs. S6a and S6b show that the area with deteriorated groundwater quality expanded from 6.8% during 1980–2000 to 20.9% during 2000–2020, suggesting an accelerated deterioration over recent decades. Between 1980 and 2000, regions with deteriorated groundwater quality were primarily scattered across Central China, North China, and coastal regions (Fig. S6a). In the last two decades, deterioration has spread to encompass more regions, including North China, Northeast China, Northwest China, and Southeast China (Fig. S6b).
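The three change categories follow directly from comparing each grid cell against the 0.5 probability cutoff in the two years. A minimal sketch is given below; the function names and toy probabilities are assumptions, and cells are treated as equal in area, which a real map projection may not guarantee.

```python
def classify_change(p_start, p_end, cutoff=0.5):
    """Label one grid cell by how its PGQ class changed between two years."""
    was_pgq, is_pgq = p_start > cutoff, p_end > cutoff
    if not was_pgq and is_pgq:
        return "deterioration"
    if was_pgq and not is_pgq:
        return "improvement"
    return "unchanged"

def category_shares(prob_start, prob_end, cutoff=0.5):
    """Fraction of cells in each change category (equal-area cells assumed)."""
    counts = {"deterioration": 0, "improvement": 0, "unchanged": 0}
    for a, b in zip(prob_start, prob_end):
        counts[classify_change(a, b, cutoff)] += 1
    n = len(prob_start)
    return {name: count / n for name, count in counts.items()}

# Toy PGQ probabilities for eight cells in the start and end years.
p_1980 = [0.2, 0.3, 0.7, 0.1, 0.6, 0.4, 0.2, 0.8]
p_2020 = [0.6, 0.4, 0.8, 0.7, 0.3, 0.45, 0.9, 0.85]
shares = category_shares(p_1980, p_2020)
print(shares)
# prints: {'deterioration': 0.375, 'improvement': 0.125, 'unchanged': 0.5}
```

The net change in PGQ area equals deterioration minus improvement. With the reported figures, 25.3% minus 1.8% gives 23.5 percentage points, matching the rise in PGQ area from 17.3% in 1980 to 40.8% in 2020.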

Fig. 3: Groundwater quality changes and the drivers.

a Groundwater quality changes from 1980 to 2020 across China, and the inset shows the ratio of regions with changes in groundwater quality category relative to the national territory area; b The spatial distribution of dominant drivers in regions experiencing groundwater quality deterioration during 1980–2020, and the inset shows the ratio of drivers’ dominant areas relative to the national territory area during this period; c The contributions of each driver category to changes in the PGQ area ratio from 1980 to 2020 in China. Here, “C”, “E”, and “L” represent climate, exploitation, and land use, respectively. Within the category of pollution, further subcategories are considered: “A” for agricultural discharge, “D” for domestic discharge, and “I” for industrial discharge. Map Source: Standard Map Service of the Ministry of Natural Resources, People’s Republic of China [Map Review Number: GS(2019)1822].

Compared with the expansion of the PGQ area, the affected population has changed even more markedly. In the late twentieth century, most regions with PGQ were not densely populated, with the majority having a population density of less than 100 people per km² (Fig. 1d). After 2000, however, population density within the newly emerged PGQ regions increased noticeably (Fig. 1e). Approximately one-third of the PGQ regions now have a population density exceeding 100 people per km² (Fig. 1f). From 1980 to 2020, the PGQ area ratio more than doubled, while the proportion of the population affected rose more than fivefold, from 6.8% in 1980 to 17.5% in 2000, peaking at 38.3% in 2018 before decreasing slightly to 36.0% in 2020 (Fig. 2b).
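The area ratio and the affected-population ratio use the same 0.5 cutoff but weight cells differently, which is why they can diverge. The sketch below illustrates the two calculations; the function name and the toy grid are assumptions, and cells are treated as equal in area.

```python
def pgq_ratios(probabilities, populations, cutoff=0.5):
    """Return (area ratio, population ratio) of cells classified as PGQ-affected."""
    affected = [p > cutoff for p in probabilities]
    area_ratio = sum(affected) / len(affected)
    exposed = sum(pop for pop, hit in zip(populations, affected) if hit)
    pop_ratio = exposed / sum(populations)
    return area_ratio, pop_ratio

# Toy grid: the two PGQ cells are sparsely populated (hypothetical values).
probs = [0.9, 0.7, 0.2, 0.1]
pops = [50, 50, 400, 500]
area_ratio, pop_ratio = pgq_ratios(probs, pops)
print(area_ratio, pop_ratio)
# prints: 0.5 0.1
```

In this toy case half of the area is affected but only a tenth of the population, mirroring the 1980 situation in which PGQ was concentrated in sparsely populated regions.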

By analyzing the drivers of groundwater quality changes from 1980 to 2020 (“Analysis of drivers” in Methods), we obtained the distribution of dominant factors (Fig. S7a) and their relative contributions (Fig. S7b). The analysis showed that the dominant drivers and their impacts vary markedly across regions.

Pollution-dominant regions encompassed the largest area, covering 16.2% of the national area (Fig. 3b). Among the pollution sources, agricultural discharge emerged as the predominant contributor, affecting 9.6% of the national area, mainly in North China, Northeast China, and Central China. Industrial discharge as the dominant driver was concentrated primarily in Northeast China and Central China, affecting 4.4% of the national area. In contrast, regions predominantly influenced by domestic discharge were concentrated in Southeast China, comprising 2.2% of the national area. Additionally, regions where groundwater exploitation was the main driver constituted 4.6% of the national area, notably in Northeast China and Northwest China. The distribution of regions dominated by land use was fragmented, with the smallest area ratio of 0.2%.

During 1980–2020, the drivers of groundwater quality change in China evolved both spatially and temporally. In regions experiencing groundwater deterioration, agricultural discharge and climate change were the primary drivers from 1980 to 2000, affecting 2.7% and 1.8% of the national area, respectively (Fig. S6c). From 2000 to 2020, the dominant drivers shifted to agricultural discharge and groundwater exploitation, which controlled 7.6% and 5.7% of the national area, respectively (Fig. S6d). This period also saw a marked escalation in the cumulative contribution of the drivers to the change in the PGQ area ratio. Specifically, the contribution of agricultural discharge to the PGQ area ratio climbed from 0.04% in 1985 to 10.7% in 2020, while that of industrial discharge rose from 0.09% to 5.3% over the same period (Fig. 3c). Similarly, the contribution of groundwater exploitation rose from 0.07% to 5.6%. Domestic discharge, though initially negligible, contributed an increase of 1.7% by 2020, up from 0.1% in 1985. Concurrently, climate change had a fluctuating yet persistent moderate influence, starting from a 0.1% contribution in 1985, peaking at 0.9% in 2000, and stabilizing at 0.5% in 2020. In contrast, land-use changes had a consistently minimal and predominantly positive impact on groundwater quality dynamics.

Post-2020 projections of groundwater quality changes

We analyzed the temporal evolution of groundwater quality from 2021 to 2100 under four distinct scenarios, enabling an assessment of the long-term influence of the key drivers on groundwater quality (Fig. 4). In Scenario I, we maintained the intensity of all influencing factors at their 2020 levels. To further investigate the responses of groundwater quality to future socio-economic development and climate change, we used the Shared Socio-economic Pathways (SSP) scenarios, i.e., SSP1-1.9 (Scenario II), SSP2-4.5 (Scenario III), and SSP5-8.5 (Scenario IV), from the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR6)27 (“Scenario settings” in Methods). Specifically, Fig. 4a–d shows the PGQ probability in 2050 under the four scenarios (I, II, III, and IV), while Fig. 4e–h depicts the corresponding affected populations. Additionally, Fig. S8 shows the changes in PGQ probability up to 2050, Fig. S9 displays the changes in groundwater quality categories between 2020 and 2050 under the four scenarios, and Fig. S10 extends the analysis to 2100, including the distribution of PGQ probability.

Fig. 4: Projections of China’s groundwater quality under four scenarios (I, II, III, and IV).

Projected probability of PGQ in China in 2050 (a–d); Population in regions with PGQ probability greater than 0.5 in 2050 (e–h). Map Source: Standard Map Service of the Ministry of Natural Resources, People’s Republic of China [Map Review Number: GS(2019)1822].

Scenario I projects slight further deterioration, with the PGQ area ratio rising from 40.8% in 2020 to 42.9% in 2050 and then to 43.1% in 2100 (affecting ~35.3% of the population) (Fig. 2). Key regions of concern by 2050 include Southeast China, Southwest China, and Central China (Fig. S9a), and deterioration is expected to expand in South China by 2100 (Fig. S10i).

In contrast, Scenario II projects substantial improvements, driven by proactive environmental policies and reduced emissions. This optimistic projection is evidenced by the PGQ area ratio’s decline to 37.9% in 2050, with expectations for a further reduction to 33.2% in 2100, illustrating a sustained positive trend (Fig. 2). The affected population is expected to decrease from 31.1% in 2050 to 21.1% in 2100. This improvement is predominantly observed in Northeast China and North China in 2050 (Fig. S9b), with additional progress anticipated in South China in 2100 (Fig. S10j).

Scenario III presents a concerning outlook, with the PGQ area ratio rising from 47.2% in 2050 to 49.1% in 2100 (Fig. 2). Approximately 39.1% of the population could be exposed to PGQ in 2050, increasing further to 40.5% in 2100. This scenario reflects a situation of medium pollution and persistent emissions, leading to widespread deterioration, particularly in Central China and Southwest China in 2050 (Fig. S9c). Moreover, the affected regions are projected to expand, with groundwater quality deterioration extending to Northwest China and South China in 2100 (Fig. S10k).

Scenario IV, which represents the highest emission trajectory, projects an even more alarming increase in the PGQ area ratio (48.3% in 2050 and 50.8% in 2100) (Fig. 2). Concurrently, approximately 40.7% of the population could be exposed to PGQ in 2050, with this proportion decreasing slightly to about 39.5% in 2100. The area affected by PGQ expands notably, first in the more industrialized and populated regions such as Northeast China and Central China in 2050 (Fig. S9d), and later in additional regions of Northwest China and South China in 2100 (Fig. S10l).

Across all four scenarios, relative to the spatial distribution pattern observed in 2020, the probability of encountering PGQ in northern regions, especially in North China and Northeast China, is expected to decrease slightly by 2050 (Fig. S8) and 2100 (Figs. S10e–h), though the overall groundwater quality is projected to remain poor (Figs. S9 and S10i–l).

Our analysis reveals a complex interplay of factors (i.e., climate change, groundwater exploitation, land use, and pollution discharge) driving changes in China’s groundwater quality during 2020–2100 (“Analysis of drivers” in Methods). The spatial distribution of dominant factors affecting PGQ probability in China during 2020–2050 is detailed in Fig. S11, specifically focusing on areas experiencing deterioration (Fig. 5a, b, d, e) or improvement (Fig. 5c). Temporal trends of these drivers across China are presented in Fig. S12.

Fig. 5: Dominant drivers of China’s groundwater quality change under four scenarios (I, II, III, and IV) during 2020–2100.

Subfigures (a, b, d, e) show the spatial distribution of dominant drivers in regions experiencing groundwater quality deterioration from 2020 to 2050 under the four scenarios; here, a corresponds to Scenario I, b to Scenario II, d to Scenario III, and e to Scenario IV. Subfigure c displays the dominant drivers in regions experiencing groundwater quality improvement from 2020 to 2050 in Scenario II, and the insets in (a–e) present the area ratios of regions with different dominant drivers over China. Subfigure f shows the contributions of each driver category to changes in the PGQ area ratio from 2020 to 2100 in China. Here, “C”, “E”, “L”, and “P” represent climate, exploitation, land use, and pollution drivers, respectively. Map Source: Standard Map Service of the Ministry of Natural Resources, People’s Republic of China [Map Review Number: GS(2019)1822].

Pollution discharge plays a critical role, especially in regions like Central China and Southwest China, where industrial, agricultural, and domestic activities are prevalent. It dominates the largest area under Scenario IV, at 3.0% of the national area (Fig. 5e), compared with 2.2% under Scenario III (Fig. 5d) and 2.6% under Scenario I (Fig. 5a). During 2020–2050, pollution is the leading contributor to the increase in the PGQ area ratio, particularly under Scenario IV, with an increase of 4.0% (Fig. 5f). This trend persists and intensifies by 2100, when pollution contributes an increase of 4.2% in the PGQ area ratio relative to 2020, reflecting the escalating effects of pollution under high-emission scenarios.

Land use, whose changes are driven by agricultural expansion, urbanization, and other human activities, predominantly impacts groundwater quality in Central China and Eastern China, particularly in regions such as the North China Plain, the middle and lower reaches of the Yangtze River, and the Northeast Plain, where intensive agricultural and urban development is prevalent. Under Scenario IV, land use dominates groundwater quality change in 5.6% of the national area, while under Scenario III the corresponding area ratio is 5.5% (Fig. 5d, e). Figure 5f reveals that land use is one of the key contributors to changes in the PGQ area ratio, with Scenario IV leading to a 1.5% increase and Scenario III to a 1.2% increase by 2100. Notably, Scenario II stands out with a substantial decrease in land-use-related impacts, reducing the PGQ area ratio by 0.9% in 2050 and by 1.8% in 2100, indicating a potential positive effect of controlled land-use change on groundwater quality.

Groundwater exploitation notably affects regions with high water demand, such as Northeast China, Northwest China, and North China. Under all scenarios, it contributes to groundwater quality deterioration in up to 0.3% of the national area. However, Scenario II stands out for its positive effects, improving groundwater quality across ~4.0% of the national area (Fig. 5c). From 2020 to 2050, reductions in groundwater exploitation, particularly under Scenario II, lower the PGQ area ratio slightly, by 0.2% (Fig. 5f). By 2100, continued efforts to reduce groundwater exploitation under Scenario II are projected to lower the PGQ area ratio by up to 0.6%, underscoring the importance of sustainable groundwater management.

Climate change exerts a moderate yet notable influence on groundwater quality across all scenarios, especially in Northwest China, Central China, and Southeast China. In Scenario I, climate change affects 1.2% of the national area (Fig. 5a), increasing to 3.1% in Scenario IV (Fig. 5e). The impact is similarly substantial in Scenario III, where it influences 3.1% of the national area. In 2050, climate change has a modest impact on the PGQ area ratio, subtly altering it by less than 1.0% across various scenarios (Fig. 5f). However, in 2100, its effect becomes slightly more pronounced, potentially increasing the PGQ area ratio by up to 1.8% in Scenario IV, reflecting a steady, though gradual, worsening of groundwater quality driven by this factor.

Discussion

In this study, our model ranks the factors that notably influence the prediction accuracy of groundwater quality, identifying depth-based groundwater type as the most critical predictor (Fig. S3). Sensitivity analysis, involving a ±10% change in each variable, highlighted permeability as the most sensitive factor (Fig. S13). In addition, we analyzed the interrelationships among the predictors (Fig. S2), as well as the partial dependence plots and feature interaction charts (Figs. S14, S15), which demonstrate the complex interactions among the predictors. Shallow groundwater is more vulnerable to surface contaminants, while deeper groundwater is less impacted28. Regions with lower precipitation, higher temperatures, and higher groundwater exploitation rates showed higher PGQ probabilities (Fig. S15). Climate-driven changes directly affect recharge rates and groundwater replenishment, while indirectly altering groundwater usage patterns29. In arid and semi-arid regions, reduced rainfall leads to over-extraction and higher pollutant concentrations due to decreased dilution, whereas humid regions see increased mobilization and dilution of pollutants30,31. As climate change continues to reshape global temperature and precipitation patterns, these dynamics are expected to become increasingly complex. Additionally, soil sand content (100–200 cm) notably affects recharge and contamination, with higher sand content increasing susceptibility to surface contaminants (Fig. S14f)32. It is essential to note that this analysis is based on a nationwide perspective, and the most critical predictors vary spatially, as shown in Fig. S7a. These insights highlight the importance of informing policymakers and stakeholders about the region-specific factors that must be managed to improve groundwater quality.
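A one-at-a-time sensitivity test with a ±10% change in each variable can be sketched as below. The logistic `toy_model` is a hypothetical stand-in for the trained RF, its coefficients and the baseline values are invented for illustration, and the use of the mean absolute shift in predicted probability as the sensitivity measure is an assumption, since the text does not state which measure was used.

```python
import math

def toy_model(x):
    """Hypothetical stand-in for the trained RF: maps predictors to a PGQ probability."""
    z = 0.002 * x["exploitation"] - 0.003 * x["precipitation"] + 0.05 * x["temperature"]
    return 1.0 / (1.0 + math.exp(-z))

def one_at_a_time_sensitivity(model, baseline, change=0.10):
    """Mean absolute shift in predicted probability when each input moves by +/- change."""
    p0 = model(baseline)
    result = {}
    for name, value in baseline.items():
        shifts = []
        for factor in (1.0 - change, 1.0 + change):
            perturbed = dict(baseline)        # all other inputs stay at baseline
            perturbed[name] = value * factor
            shifts.append(abs(model(perturbed) - p0))
        result[name] = sum(shifts) / len(shifts)
    return result

# Invented baseline values for three predictors.
baseline = {"exploitation": 500.0, "precipitation": 600.0, "temperature": 12.0}
sensitivity = one_at_a_time_sensitivity(toy_model, baseline)
most_sensitive = max(sensitivity, key=sensitivity.get)
print(most_sensitive)
# prints: precipitation
```

Sensitivity of this kind measures how strongly the output responds to a fixed relative change in an input, whereas importance measures how much predictive accuracy depends on it, so the two rankings need not agree.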

It is indisputable that increased pollution discharges and excessive exploitation have played critical roles in the deterioration of this vital resource over the past decades. Groundwater exploitation in China surged from 645.7 billion m³ in 1980 to a peak of 1133.8 billion m³ in 2012 (Fig. S16a). Similarly, the total volume of domestic and industrial wastewater roughly doubled, increasing from around 41.4 billion tons in 1980 to about 85.2 billion tons in 2020, with the majority of this growth attributed to domestic discharge (Fig. S16b). Pollution discharges have been the main drivers of groundwater quality degradation in China over the past four decades, particularly agricultural discharge in major regions like North China33 (Figs. 3 and S7). Excessive groundwater exploitation serves as a secondary driver, predominantly in North China and Northwest China. Moreover, industrial and domestic discharges notably impact groundwater quality34. North China, one of the most important agricultural and economic centers in China, bears the brunt of groundwater exploitation, with rising groundwater withdrawal and waste discharge intensifying the risk of pollutants seeping into aquifers and threatening the quality of deep groundwater resources35. Remarkably, changes in land use, such as optimized management of industrial and agricultural lands, conversion of agricultural land to forested areas or conservation zones under the widespread “grain-for-green” policy, establishment of urban green spaces, and restoration of wetlands, likely contribute to the improvement of groundwater quality by reducing pollution from agricultural chemicals36. Strengthening such environmentally friendly measures, including improving the efficiency of fertilizer and pesticide use, is crucial for protecting groundwater quality37.

Although groundwater quality in China has declined for decades, recent years have seen a notable change in this trend. This change, marked by the slight decrease in the PGQ area ratio in 2020 after years of consistent increase (Fig. 2), may be attributed to a variety of strategic interventions. By 2020, groundwater exploitation had decreased by about one-fifth relative to its peak intensity in 2012 (Fig. S16a), likely owing to the operation of major water conservancy projects such as the South-North Water Transfer Project38. The establishment and enforcement of groundwater pollution prevention laws and regulations, combined with environmental inspection systems, have played a vital role in curbing pollution sources39. Furthermore, the number of wastewater treatment facilities in China has risen (Fig. S17), owing to proactive environmental protection measures enacted by the government, such as the “Action Plan for Prevention and Control of Water Pollution”40. Innovative developments in pollution control technologies, spanning source control, process blocking, and end-point remediation (supported by dedicated scientific programs), along with their engineered applications, have further contributed to groundwater quality improvements41. These actions demonstrate how government initiatives can effectively address water quality challenges and set a precedent for achieving the United Nations’ Sustainable Development Goal 6 on sustainable water and sanitation42.

In response to diverse development and climate change trajectories, China urgently needs proactive and flexible strategies to manage groundwater resources and control industrial pollution. These strategies should enforce strict compliance and discharge standards for manufacturing enterprises, prioritize sustainable development goals, and integrate climate resilience tactics to ensure the long-term viability of groundwater resources, particularly in vulnerable areas43,44,45. In North China, particularly the North China Plain, reducing groundwater over-exploitation through stricter regulations and promoting water-saving irrigation is crucial46. Additionally, incentivizing sustainable agricultural practices, such as crop rotation and controlled fertilizer application, can mitigate groundwater contamination47. In Northwest China, policies should target reducing industrial water use and enhancing recharge programs, while encouraging water recycling and less water-intensive technologies48. In Southern China, particularly in regions prone to extreme weather events, effective flood control and improved land-use planning are needed to reduce contaminant transport during floods, alongside stronger waste management regulations to minimize contamination risks49,50.

Although visual comparisons and statistical evaluations suggest that the model estimates are generally reasonable, notable uncertainties remain. Monte Carlo simulations were applied to assess uncertainties in PGQ probability predictions, providing insights into spatial and temporal variability. As shown in Figs. S18d-f and S19e-h, uncertainties are moderate overall but higher in North and Northeast China from 1980 to 2100, likely due to complex interactions between socio-economic factors and groundwater systems. Under Scenario IV, uncertainties become more pronounced (Fig. S20), likely driven by increased discharges and groundwater exploitation. This highlights the need for adaptive groundwater management and pollution control strategies, particularly in high-emission scenarios, to mitigate adverse impacts and enhance prediction reliability in vulnerable regions.
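A generic Monte Carlo treatment of prediction uncertainty is sketched below: the inputs are perturbed at random, the model is re-run, and the spread of the resulting probabilities is summarized. Everything specific here is an assumption, including the hypothetical `toy_model` standing in for the RF, the 5% relative Gaussian noise, the number of draws, and the choice of summary statistics; the text does not describe how the simulations were configured.

```python
import math
import random
import statistics

def toy_model(x):
    """Hypothetical stand-in for the trained RF: maps predictors to a PGQ probability."""
    z = 0.002 * x["exploitation"] - 0.003 * x["precipitation"] + 0.05 * x["temperature"]
    return 1.0 / (1.0 + math.exp(-z))

def monte_carlo_uncertainty(model, baseline, rel_sd=0.05, n_draws=1000, seed=42):
    """Propagate assumed input noise through the model and summarize the output spread."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Each input is drawn around its baseline with a relative standard deviation.
        sample = {k: rng.gauss(v, abs(v) * rel_sd) for k, v in baseline.items()}
        draws.append(model(sample))
    draws.sort()
    return {
        "mean": statistics.fmean(draws),
        "sd": statistics.stdev(draws),
        "p05": draws[int(0.05 * n_draws)],   # approximate 5th percentile
        "p95": draws[int(0.95 * n_draws)],   # approximate 95th percentile
    }

# Invented baseline values for one grid cell.
baseline = {"exploitation": 500.0, "precipitation": 600.0, "temperature": 12.0}
summary = monte_carlo_uncertainty(toy_model, baseline)
print(summary)
```

Repeating this for every grid cell and year would give an uncertainty map; a wider interval between the 5th and 95th percentiles marks a cell whose predicted PGQ probability is less reliable.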

Building on these uncertainties, our study also encounters certain limitations, particularly concerning assumptions of socio-economic predictors and the spatiotemporal resolution of data. The use of regional averages instead of point-specific values, the variability in sampling methods, and the exclusion of highly correlated predictor variables—though necessary for model generalization and parsimony—may have led to an incomplete representation of complex interactions, potentially underestimating the contributions of correlated factors. Additionally, assuming steady-state conditions for some environmental variables like soil properties and water table depth may overlook key temporal variations, limiting the capture of dynamic contaminant transport over time. Furthermore, the lack of distinction between geogenic and anthropogenic sources of PGQ and differentiation among phreatic, confined, and unspecified groundwater samples could affect the accuracy of predictions. Moreover, the use of Class V as a threshold for PGQ is conservative, and stricter thresholds like Class IV could be considered. Our single-factor assessment focuses on the worst indicator, which may lead to stricter evaluations but offers a cautious approach to groundwater management. Lastly, our model prioritizes factors influencing overall groundwater quality rather than specific indicators, potentially differing from more targeted assessments.

Despite the inherent limitations, our approach offers distinct advantages over traditional methods, which are often labor-intensive and costly. By integrating regional data from previous research with accessible natural and socio-economic predictors, our methodology presents a comprehensive analytical framework for assessing groundwater quality. This strategy provides detailed patterns of groundwater quality evolution across China from 1980 to 2100, revealing that pollution discharge, groundwater over-exploitation, and land-use change are major contributors to its degradation, particularly in high-risk regions affecting large populations. Furthermore, our projections indicate that proactive environmental policies can substantially improve groundwater quality under diverse future scenarios. This approach is especially valuable for emerging economies, where limited groundwater data and PGQ are common. By offering a cost-effective solution, our model not only fills critical data gaps but also provides actionable insights into resource management and strategic planning. Although uncertainties remain, especially regarding socio-economic models and emission pathways, the findings underscore the need for adaptive, region-specific groundwater management practices to mitigate socio-economic and climate-related impacts. Future research can build on this model by refining data resolution and examining specific pollutant pathways to enhance predictive reliability and support long-term, sustainable groundwater policies.

Methods

Data collection and pre-processing

The groundwater quality data were obtained from published articles by searching the China National Knowledge Infrastructure (CNKI) and Web of Science. Initially, a total of 31,535 studies, comprising 25,200 articles in Chinese and 6335 articles in English, were retrieved using the following search formula: “Groundwater quality” (Topic) and “China” (Topic). Given the heterogeneity of the data, the retrieved publications were further filtered according to the following rules: (i) at least five groundwater quality indicators were determined; (ii) sampling locations were provided at the prefecture level; (iii) there were more than five sampling points and sufficient statistical information (at least the mean); (iv) land-use type was provided; (v) samples were taken between 1990 and 2020. To capture the contemporary groundwater quality status, data predating 1990 were deliberately omitted because of their limited representativeness of prevailing conditions. Finally, 753 articles were identified to establish our dataset, which contains a total of 1977 surveys, as shown in Fig. S1.

The groundwater quality data were classified into two categories: non-PGQ and PGQ, based on the criteria outlined in the Groundwater Quality Standards of the People’s Republic of China (GB/T 14848-2017)22. This classification process involves the following four steps:

Step 1: Collect all relevant groundwater quality indicators from the dataset, ensuring they contain more than five indicators, as specified by established testing standards.

Step 2: Assess each indicator individually by classifying it as Class V or not, based on Table S1.

Step 3: Determine the overall groundwater quality with the worst indicator.

Step 4: Categorize the groundwater quality into two types: non-PGQ (Class I to IV) and PGQ (Class V).
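The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used: the indicator names and limit values in `PLACEHOLDER_CLASS_V_LIMITS` are hypothetical stand-ins for Table S1, every indicator is assumed to be a simple "higher is worse" concentration, and the minimum indicator count is exposed as a parameter.

```python
# Hypothetical Class V limits (placeholders only); the real values are those
# of Table S1 / GB/T 14848-2017 and are NOT reproduced here.
PLACEHOLDER_CLASS_V_LIMITS = {
    "nitrate": 30.0,
    "fluoride": 2.0,
    "arsenic": 0.05,
    "chloride": 350.0,
    "sulfate": 350.0,
}


def classify_survey(means, limits=PLACEHOLDER_CLASS_V_LIMITS, min_indicators=5):
    """Return 1 (PGQ), 0 (non-PGQ), or None if too few indicators are reported."""
    # Step 1: keep only indicators that have a limit, and check the count.
    known = {name: value for name, value in means.items() if name in limits}
    if len(known) < min_indicators:
        return None
    # Step 2: flag each indicator as Class V (True) or not (False).
    is_class_v = {name: value > limits[name] for name, value in known.items()}
    # Steps 3 and 4: the worst indicator decides, so one Class V flag gives PGQ.
    return 1 if any(is_class_v.values()) else 0


clean = {"nitrate": 12.0, "fluoride": 0.8, "arsenic": 0.01,
         "chloride": 120.0, "sulfate": 90.0}
polluted = dict(clean, fluoride=2.6)
print(classify_survey(clean), classify_survey(polluted))
```

Because the worst indicator determines the category, a single exceedance is enough to label a survey as PGQ, which is why this single-factor assessment is stricter than an averaged index.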

In our analysis, we meticulously built the relationship between predictors and groundwater quality category (PGQ or non-PGQ). As listed in Table S2, the predictors included two classes of steady-state factors (soil properties and geographic and hydrogeologic characteristics) and four classes of time-variant factors (i.e., climate change, pollution discharge, groundwater exploitation, and land-use change).

Groundwater quality is influenced by various environmental and chemical processes, including redox reactions, contaminant transport, and natural filtration mechanisms51. Given China’s vast and diverse landscape, along with notable regional variations in data availability and quality, we prioritized accessible and quantifiable predictors to ensure broader applicability and consistency in groundwater quality predictions across different regions. Accordingly, we selected predictors that not only capture redox-related influences indirectly but also reflect other critical mechanisms affecting groundwater quality, providing a comprehensive foundation for reliable assessment. Specifically, we selected soil properties (e.g., porosity, permeability, soil type, and soil composition across different soil layers) to reflect contaminant adsorption and biological degradation, geographic and hydrogeologic characteristics (e.g., topographic wetness index, water-table ratio, and groundwater type) to indicate contaminant accumulation and natural dilution, and climatic factors (e.g., temperature, precipitation, and aridity) to represent groundwater recharge, evaporation, and pollutant mobility17,19,30. Additionally, pollution discharge (agricultural, industrial, and domestic sources) can substantially affect groundwater quality by increasing contaminant loads, while land-use change can disrupt the natural infiltration and recharge processes52. Notably, groundwater exploitation (e.g., groundwater supply and groundwater exploitation rate) emerges as a crucial anthropogenic perturbation factor53,54. Excessive groundwater extraction lowers water tables, concentrates contaminants, and, in coastal areas, can trigger seawater intrusion, increasing salinity35,55,56. It can also mobilize naturally occurring contaminants like arsenic and fluoride, further degrading water quality30.

To classify groundwater quality at the prefecture level, we implemented a series of data preprocessing steps. First, regional data were converted into 1 km² grid cells based on regional land-use types, with all cells within a prefecture assigned the corresponding groundwater quality data to ensure consistency. The groundwater quality data were then binarized, assigning a value of zero to non-PGQ and one to PGQ. The purpose of this approach was twofold: (i) to prioritize the fundamental health aspect of whether groundwater is safe (non-PGQ) or unsafe (PGQ) for drinking; and (ii) to address variations in precision resulting from the diverse analysis methods employed in different data sources. Next, natural factors in the predictors were extracted at a 1 km² resolution, assigning each grid cell a unique value representing local environmental conditions. Socio-economic factors were uniformly applied to all grid cells within a province, based on the sampling year. The dataset on land-use types covers specific benchmark years: 1980, 1990, 1995, 2000, 2005, 2010, 2015, 2020, and every five years between 2020 and 2100. For other years, land-use data are assumed to be identical to those of the nearest preceding benchmark year. For instance, land-use data in 1989 are taken to be the same as those in 1980. For other dynamic factors, their cumulative average values from 1980 to the target year were calculated as the model inputs. This accounts for the lag in pollutant migration and effectively captures the long-term impacts on groundwater quality. Fertilizer consumption and pesticide consumption were allocated evenly per km² of agricultural land, with non-agricultural land assigned zero. Industrial wastewater discharge, industrial solid waste discharge, domestic wastewater discharge, domestic solid waste discharge, and groundwater supply were allocated per km² based on population density for each province. Per capita data for these variables were calculated by dividing the total value by the population of each province, then distributed per km² based on population density. Groundwater exploitation rates were uniformly distributed across each province, with regions having a population density below 1 person/km² assigned a rate of zero. Other factors were allocated per km² based on the resolution of raster data or vector data. Finally, groundwater quality categories for each year and region were matched with the corresponding natural and socio-economic variables to create the training and validation datasets for the RF model.
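Two of the temporal rules in this paragraph, the land-use benchmark lookup and the cumulative average of dynamic factors, can be illustrated with a short sketch. The function names are hypothetical, and the cumulative average assumes that an annual value is available for every year from 1980 to the target year.

```python
from bisect import bisect_right

# Benchmark years listed in the text: irregular up to 2020, then every 5 years.
BENCHMARK_YEARS = [1980, 1990, 1995, 2000, 2005, 2010, 2015] + list(range(2020, 2101, 5))


def land_use_year(year):
    """Return the nearest preceding (or equal) land-use benchmark year."""
    position = bisect_right(BENCHMARK_YEARS, year)
    if position == 0:
        raise ValueError("no land-use benchmark exists before 1980")
    return BENCHMARK_YEARS[position - 1]


def cumulative_average(annual_values, target_year, start_year=1980):
    """Mean of the annual values from start_year through target_year, inclusive."""
    values = [annual_values[y] for y in range(start_year, target_year + 1)]
    return sum(values) / len(values)


print(land_use_year(1989))  # the example given in the text
print(cumulative_average({1980: 10.0, 1981: 20.0, 1982: 30.0}, 1982))
```

Using the running mean rather than the value of the target year means that a sharp change in discharge affects the model input only gradually, which is how the lag in pollutant migration is represented.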

Model development and evaluation

In this study, the RF model was constructed to predict the spatio-temporal dynamics of groundwater quality from natural and socio-economic factors. The RF model was chosen for its ability to handle large datasets with numerous predictor variables while efficiently capturing complex interactions among them24. The RF model uses the bootstrap resampling method to build multiple decision trees. For the construction of each tree, samples are independently selected; however, the distributions for all trees in the forest are the same, which guarantees the robustness of the model. In addition, the RF model offers key advantages such as preventing overfitting by introducing randomness at each decision node, which enhances generalization. By selecting random feature subsets for each tree, RF effectively addresses the challenges of high dimensionality in the data. It also reduces experimental noise and improves prediction accuracy through an ensemble approach, averaging predictions from multiple decision trees. Furthermore, RF provides inherent feature importance measures, helping identify key variables, and handles missing data efficiently, either through imputation or splitting based on available features. Because it requires minimal parameter tuning compared with other machine learning algorithms, RF is well suited to this study14,57,58.

We began by optimizing the hyperparameters of the RF model, with a particular focus on the number of decision trees. By systematically testing tree counts in increments of 10, from 10 to 200, we identified 50 trees as the optimal balance between accuracy and computational efficiency (Fig. S21). Other parameters, such as minimum leaf size, were kept at their default settings, as these have proven effective in similar applications.

To further validate the model’s performance and safeguard against overfitting, we conducted 10-fold cross-validation. The dataset was randomly partitioned into 10 subsets, with 9 used for training and 1 for validation in each iteration. This process was repeated across all subsets, ensuring comprehensive evaluations. The averaged performance metrics (Table S4 and Fig. S22) confirmed the model’s ability to generalize effectively, without overfitting.
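The partitioning behind 10-fold cross-validation can be sketched as follows. This is an illustration only: the helper name is hypothetical, and the text does not state whether the folds were formed over surveys or over grid points, so the sketch simply partitions generic sample indices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random partition
    folds = [indices[i::k] for i in range(k)]     # k nearly equal subsets
    for i, validation in enumerate(folds):
        # Train on the other k-1 folds, validate on the held-out fold.
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation


splits = list(k_fold_indices(25))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each sample appears in exactly one validation fold, so the averaged metrics use every sample once as unseen data.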

After confirming the model’s robustness through cross-validation, we proceeded to retrain it using 90% of the data (1779 surveys, comprising 67,315 points) for training and 10% (198 surveys, comprising 15,335 points) for validation, each preserving the proportion of PGQ and non-PGQ of the full dataset. This approach maximizes the model’s learning capacity by providing a larger training set while preserving an independent validation set to assess performance on unseen data. This final validation confirmed that the model maintained strong predictive accuracy, further supporting its generalization ability.

To precisely and comprehensively establish the relationship between model predictors and groundwater quality, 51 predictor variables were initially included. All original variables were first used to train the RF model, and Pearson correlation analysis was then conducted to reduce redundancy among them. A candidate variable was excluded if its Pearson correlation coefficient23 with any other variable was larger than 0.7. However, certain variables were retained despite exceeding this correlation threshold, because excluding them notably reduced the model’s predictive performance, demonstrating their essential contributions to accurately capturing the system’s dynamics. This step-by-step approach ensures that the final set of variables is less likely to include redundant information, thus improving the parsimony and interpretability of the model. Finally, 25 predictors were retained to predict groundwater quality in China. The data sources and detailed information about these predictors are described in the Supplementary Information.
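A correlation filter of this kind can be sketched as below. Several details are assumptions made for illustration, since the text does not specify them: the filter is greedy and order-dependent, it compares the absolute value of the coefficient with the threshold, and the `keep_always` argument stands in for the variables that were retained because dropping them reduced predictive performance. Columns are assumed to be non-constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.7, keep_always=()):
    """Greedily keep variables whose |r| with every kept variable is <= threshold."""
    kept = []
    for name, values in columns.items():
        redundant = any(abs(pearson(values, columns[k])) > threshold for k in kept)
        if name in keep_always or not redundant:
            kept.append(name)
    return kept


columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],   # nearly proportional to "a"
    "c": [5, 1, 4, 2, 3],
}
print(drop_correlated(columns))
print(drop_correlated(columns, keep_always=("b",)))
```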

The final RF model was trained with 50 trees using the selected 25 predictors as inputs to classify each 1 km2 grid as being PGQ or not. Each decision tree in the forest casts a ‘vote’ for a class label based on the input data, with the final class label determined by the majority vote across all trees. The model also calculates class-label probabilities based on the proportion of trees voting for each class. The performance of the RF model is systematically validated with the following metrics.

Prevalence (Prev) quantifies the proportion of actual positives in the dataset and is calculated as follows59:

$${Prev}=\frac{{TP}+{FN}}{{TP}+{FN}+{TN}+{FP}}$$
(1)

where TP (true positives) are samples correctly identified as PGQ; FN (false negatives) are samples incorrectly identified as non-PGQ; TN (true negatives) are samples correctly identified as non-PGQ; FP (false positives) are samples incorrectly identified as PGQ.

General statistical indices, such as accuracy (Acc), sensitivity (Sen), precision (Prec) and specificity (Spec), were employed to quantify the bias of the RF model, with higher values indicating better model effectiveness60. The definitions of TP, TN, FP, and FN shed light on the seemingly subtle distinctions among these statistics, which play a crucial role in assessing the model’s performance and its ability to minimize false positives and negatives59:

$${Acc}=\frac{{TP}+{TN}}{{TP}+{FN}+{TN}+{FP}}$$
(2)
$${Sen}=\frac{{TP}}{{TP}+{FN}}$$
(3)
$${Prec}=\frac{{TP}}{{TP}+{FP}}$$
(4)
$${Spec}=\frac{{TN}}{{TN}+{FP}}$$
(5)

The performance of the trained RF model was then assessed using the area under the receiver operating characteristic (ROC) curve (AUC). The ROC curve is primarily used to evaluate the performance of a classification model by showing the trade-off between the sensitivity and specificity across different thresholds61. AUC is a single measure that summarizes the overall performance of RF for classification. The higher the AUC, the better the model is at distinguishing between PGQ and non-PGQ. A perfect classifier would have an AUC of 1, while a completely random one would have an AUC of 0.5.
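As an illustration of what the AUC measures, it equals the probability that a randomly chosen PGQ sample receives a higher predicted probability than a randomly chosen non-PGQ sample, with ties counted as one half. The sketch below computes it directly from that definition; the labels and scores are made-up values, not model output.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # pair ranked correctly
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, so it is suitable for explanation but not for the grid-scale datasets described here.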

Additionally, Cohen’s kappa coefficient (K), which measures the agreement between the observed and predicted classifications corrected for chance, is used to further assess the model’s accuracy59. A kappa coefficient of 0.6–0.8 is generally considered good, and values above 0.8 are deemed excellent. It is calculated as:

$$K=\frac{{P}_{{{{\rm{o}}}}}-{P}_{{{{\rm{e}}}}}}{1-{P}_{{{{\rm{e}}}}}}$$
(6)

where Po is the observed agreement (Acc), and Pe is the expected agreement by chance:

$${P}_{e}=\frac{\left({TP}+{FP}\right)\times \left({TP}+{FN}\right)+({FN}+{TN})\times ({FP}+{TN})}{{({TP}+{FN}+{TN}+{FP})}^{2}}$$
(7)
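Equations (1) to (7) can be evaluated together from the four confusion-matrix counts. The sketch below is a direct transcription of those equations; the counts in the example are invented purely for illustration and are not results from this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Evaluate Eqs. (1)-(7) from confusion-matrix counts."""
    n = tp + fn + tn + fp
    acc = (tp + tn) / n                                            # Eq. (2)
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2   # Eq. (7)
    return {
        "prev": (tp + fn) / n,                                     # Eq. (1)
        "acc": acc,
        "sen": tp / (tp + fn),                                     # Eq. (3)
        "prec": tp / (tp + fp),                                    # Eq. (4)
        "spec": tn / (tn + fp),                                    # Eq. (5)
        "kappa": (acc - p_e) / (1 - p_e),                          # Eq. (6), P_o = Acc
    }


metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
print(metrics)
```

For these illustrative counts the accuracy is 0.85 and the chance agreement is 0.5, giving a kappa of 0.7, which shows how kappa discounts the part of the accuracy that would be expected by chance.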

Apart from evaluating the predictive abilities of the RF model, the spatial patterns of the estimates were validated by comparing them with nationally recognized maps, ensuring their consistency and reliability.

To assess predictor importance, we utilized the error of out-of-bag (OOB) samples among trees, a common metric employed in RF models. This approach involves randomly permuting the values of a specific predictor variable and calculating the resulting change in OOB error. OOB error refers to the prediction error computed on samples that were not included in the bootstrap sample used to train each decision tree in the RF model. If permuting a predictor leads to a considerable increase in OOB error, it suggests that the variable plays a critical role in prediction accuracy. This approach offers a reliable means of identifying the important features for model prediction.
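The permutation idea can be illustrated with a simplified sketch. Note the simplification: a true OOB importance permutes the predictor within each tree's own out-of-bag samples, whereas the sketch below permutes one column of a single held-out set and uses a hypothetical threshold rule in place of the trained forest. The predictor names are likewise hypothetical.

```python
import random


def error_rate(predict, rows, labels):
    """Fraction of rows whose predicted class differs from the label."""
    return sum(predict(r) != y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, column, seed=0):
    """Increase in error after shuffling one predictor across the rows."""
    baseline = error_rate(predict, rows, labels)
    shuffled = [r[column] for r in rows]
    random.Random(seed).shuffle(shuffled)
    permuted = [dict(r, **{column: v}) for r, v in zip(rows, shuffled)]
    return error_rate(predict, permuted, labels) - baseline


# Hypothetical stand-in for the trained model: only "discharge" matters.
def toy_predict(row):
    return int(row["discharge"] > 5)


rows = [{"discharge": d, "noise": z}
        for d, z in zip([1, 2, 3, 4, 6, 7, 8, 9], [3, 1, 4, 1, 5, 9, 2, 6])]
labels = [toy_predict(r) for r in rows]
print(permutation_importance(toy_predict, rows, labels, "discharge"))
print(permutation_importance(toy_predict, rows, labels, "noise"))
```

A predictor that the model never uses, such as `noise` here, has an importance of exactly zero because shuffling it cannot change any prediction.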

To further assess the sensitivity of the model’s predictions to each influencing factor62, we conducted a sensitivity analysis by systematically increasing or decreasing each continuous predictor by 10%. This approach allowed us to observe how these perturbations influenced the predicted average probability of PGQ over the historical period from 1980 to 2020. By comparing these modified predictions to the baseline scenario, where no changes to the predictors were applied, we were able to quantify the degree to which each factor contributed to variability in PGQ probability, providing insights into which predictors most strongly affect groundwater quality predictions under different conditions.
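The ±10% perturbation can be sketched as follows. The probability function is a hypothetical linear stand-in for the trained RF model, and the predictor name and values are illustrative only.

```python
def mean_probability(predict_proba, rows):
    """Average predicted PGQ probability over all rows."""
    return sum(predict_proba(r) for r in rows) / len(rows)


def sensitivity(predict_proba, rows, column, fraction=0.10):
    """Change in mean probability when one predictor is scaled up or down."""
    baseline = mean_probability(predict_proba, rows)
    changes = {}
    for label, factor in (("increase", 1 + fraction), ("decrease", 1 - fraction)):
        scaled = [dict(r, **{column: r[column] * factor}) for r in rows]
        changes[label] = mean_probability(predict_proba, scaled) - baseline
    return changes


# Hypothetical stand-in for the model: probability rises with discharge.
def toy_proba(row):
    return min(1.0, 0.05 * row["discharge"])


rows = [{"discharge": 2.0}, {"discharge": 4.0}, {"discharge": 6.0}]
print(sensitivity(toy_proba, rows, "discharge"))
```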

Following the identification of key predictors, partial dependence plots (PDPs) were generated to visualize each predictor’s marginal effect on the probability of PGQ. The PDPs were generated by fixing the focal predictor at each value across its empirical range in turn while averaging out the effects of all other predictors, thereby isolating the unique contribution of the predictor to the response63. This was achieved by averaging the model predictions over the distribution of the data, providing a clear visual representation of the dependency between the predictors and the probability of PGQ.
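The partial dependence calculation can be written compactly: for each grid value, the focal predictor is overwritten in every row and the predictions are averaged. In the sketch below the probability function and predictor names are hypothetical stand-ins for the trained RF model and its inputs.

```python
def partial_dependence(predict_proba, rows, column, grid):
    """Mean prediction with `column` forced to each grid value in turn."""
    curve = []
    for value in grid:
        forced = [dict(r, **{column: value}) for r in rows]
        curve.append(sum(predict_proba(r) for r in forced) / len(forced))
    return curve


# Hypothetical stand-in for the model, additive in two predictors.
def toy_proba(row):
    return min(1.0, 0.05 * row["discharge"] + 0.01 * row["depth"])


rows = [{"discharge": 3.0, "depth": 10.0},
        {"discharge": 5.0, "depth": 20.0},
        {"discharge": 7.0, "depth": 30.0}]
print(partial_dependence(toy_proba, rows, "discharge", [0.0, 4.0, 8.0]))
```

Forcing two predictors over a two-dimensional grid instead of one yields the surface shown in a feature interaction chart.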

In addition, feature interaction charts (FICs) were constructed to explore the combined effect of two predictors on the predicted probability of PGQ63. FICs take into account the range of values of the two predictors concerned. Unlike PDPs, which vary a single predictor, FICs vary both predictors simultaneously over their empirical ranges. The model’s predictions are then averaged over the joint distribution of the two predictors, and the combined effect is visualized in a two-dimensional color-coded map. This approach highlights how the interaction between two predictors influences the model output, thus providing a deeper understanding of the relationships within the data.

PGQ prediction

The RF model was used to reconstruct and predict the spatiotemporal distribution of groundwater quality from 1980 to 2100, enabling us to infer long-term trends in populations potentially exposed to PGQ. We applied a probability cutoff (\({{Prob}}_{{\rm{cut}}}\)) of 0.5 to identify the populations at risk in areas exceeding this threshold. The affected population in each grid cell (1 km²) was then calculated by multiplying the cell’s population (Pop) by its probability of PGQ (\({{Prob}}_{{\rm{poor}}}\)). The calculation of the potentially affected population (\({{Pop}}_{{\rm{affect}}}\)) is summarized in the following equation:

$${{Pop}}_{{{{\rm{affect}}}}}=\left\{\begin{array}{c}{Pop}\times {{Prob}}_{{{{\rm{poor}}}}},\,{{Prob}}_{{{{\rm{poor}}}}} \, > \, {{Prob}}_{{{{\rm{cut}}}}}\\ \,0,\,{{Prob}}_{{{{\rm{poor}}}}}\le {{Prob}}_{{{{\rm{cut}}}}}\end{array}\right.\,$$
(8)
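Equation (8) translates directly into a per-cell function. The population and probability values in the example are illustrative only.

```python
def affected_population(pop, prob_poor, prob_cut=0.5):
    """Eq. (8): population weighted by PGQ probability above the cutoff, else 0."""
    return pop * prob_poor if prob_poor > prob_cut else 0.0


# Three illustrative 1 km2 cells given as (population, probability of PGQ).
cells = [(1000, 0.8), (1000, 0.5), (250, 0.3)]
total = sum(affected_population(pop, prob) for pop, prob in cells)
print(total)
```

Because the inequality in Eq. (8) is strict, a cell whose probability equals the cutoff exactly contributes nothing.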

The dynamics of PGQ probability among different periods were controlled by the four categories of time-variant drivers (i.e., climate change, pollution discharge, groundwater exploitation, and land-use change). To isolate the impacts of a specific category of time-variant driver on PGQ, we designed factorial experiments between 1980 and 2020. In each experiment, we kept one specific driver category (e.g., climate) constant at its initial state in 1980, while allowing the other driver categories to vary with time as observed (i.e., following their real-world time series). For instance, \({{Prob}}_{{Climate}}\) denotes the simulation in which all drivers in the climate category remained constant at their 1980 values, while the drivers in the pollution and groundwater categories retained their real (time-variant) values.

By comparing the simulations between experiments, we could isolate the impact of a target driver category and quantify its contribution. To implement the comparison, we defined the difference between the simulation driven by all variables in period y (\({{Prob}}_{{All},y}\)) and that driven by partial time-variant driver categories (\({{Prob}}_{{Vi},y}\)) as the actual contribution of the target variable i (Vi) in period y (\({{Con}}_{{Vi},y}\)):

$${{Con}}_{{Vi},y}={{Prob}}_{{All},y}-{{Prob}}_{{Vi},y}$$
(9)

The relative contribution (%) of variable i (i = 1, 2, 3,…, n) in period y (y = 1, 2, 3,…, k) (\({{RelCon}}_{{Vi},y}\)) was defined as:

$${{RelCon}}_{{Vi},y}=\frac{\left|{{Con}}_{{Vi},y}\right|}{{\sum }_{i=1}^{n}\left|{{Con}}_{{Vi},y}\right|}\times 100\%$$
(10)
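Equations (9) and (10) can be illustrated for a single grid cell and period. The probabilities below are invented for the example, and the category names are shorthand for the four driver categories.

```python
def contributions(prob_all, prob_fixed):
    """Eq. (9) and Eq. (10) for one cell and one period.

    prob_all:   probability with all drivers varying.
    prob_fixed: {category: probability with that category held at 1980 values}.
    """
    con = {name: prob_all - p for name, p in prob_fixed.items()}        # Eq. (9)
    total = sum(abs(c) for c in con.values())
    rel = {name: (100.0 * abs(c) / total if total else 0.0)             # Eq. (10)
           for name, c in con.items()}
    return con, rel


con, rel = contributions(
    prob_all=0.40,
    prob_fixed={"climate": 0.38, "pollution": 0.28,
                "exploitation": 0.34, "land_use": 0.40},
)
print(rel)
```

The absolute values in Eq. (10) mean that a category that lowers the PGQ probability still receives a positive share; the sign of its effect is carried by Eq. (9) alone. The guard for a zero total is an addition of this sketch, since Eq. (10) is undefined when no category changes the probability.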

To quantify the contribution of each driver category to changes in the PGQ area ratio on the national scale, we defined the difference between the area ratio of PGQ driven by all variables in period y (\({{AR}}_{{All},y}\)) and that driven by partial time-variant driver categories (\({{AR}}_{{Vi},y}\)) as the primitive contribution of the target variable i (Vi) in period y (\({{ConAR}}_{{Vi},y}\)):

$${{ConAR}}_{{Vi},y}={{AR}}_{{All},y}-{{AR}}_{{Vi},y}$$
(11)

The difference between the area ratio of PGQ driven by all variables in period y (\({{AR}}_{{All},y}\)) and that driven by all variables in the baseline year (\({{AR}}_{{All},{BY}}\)) is defined as the actual growth (\({{ActAR}}_{{All},y}\)):

$${{ActAR}}_{{All},y}={{AR}}_{{All},y}-{{AR}}_{{All},{BY}}$$
(12)

Then, the relative contributions of each driver category to changes in the PGQ area ratio are calculated as:

$${{RelAR}}_{{Vi},y}=\frac{{{ActAR}}_{{All},y}}{{\sum }_{i=1}^{n}{{ConAR}}_{{Vi},y}}\times {{ConAR}}_{{Vi},y}$$
(13)
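Equations (11) to (13) rescale the primitive contributions so that they sum to the actual growth in PGQ area ratio. A minimal sketch with invented area ratios follows; it assumes the primitive contributions do not sum to zero, in which case Eq. (13) would be undefined.

```python
def area_ratio_contributions(ar_all_y, ar_all_by, ar_fixed):
    """Eqs. (11)-(13) for one period at the national scale.

    ar_all_y:  PGQ area ratio in period y with all drivers varying.
    ar_all_by: PGQ area ratio in the baseline year.
    ar_fixed:  {category: area ratio with that category held constant}.
    """
    con_ar = {name: ar_all_y - ar for name, ar in ar_fixed.items()}    # Eq. (11)
    act_ar = ar_all_y - ar_all_by                                      # Eq. (12)
    total = sum(con_ar.values())
    rel_ar = {name: act_ar / total * c for name, c in con_ar.items()}  # Eq. (13)
    return con_ar, act_ar, rel_ar


con_ar, act_ar, rel_ar = area_ratio_contributions(
    ar_all_y=0.30,
    ar_all_by=0.20,
    ar_fixed={"climate": 0.29, "pollution": 0.22,
              "exploitation": 0.26, "land_use": 0.25},
)
print(rel_ar)
```

In this example the primitive contributions sum to 0.18 while the actual growth is 0.10, so each contribution is scaled by 0.10/0.18 and the rescaled values add up to the actual growth.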

The grid-level annual mean relative contributions of the four driver categories were used to visualize the spatial patterns of the influence intensity of different driver categories on groundwater quality at the national scale. We defined the single dominant driver category of each pixel as the category with the maximum relative contribution. As pollution discharges are composed of agricultural, industrial, and domestic pollution, the dominant driver category of pollution discharge for each pixel is defined as the secondary driver category with the largest relative contribution among these three secondary categories.

To further elucidate the temporal dynamics of groundwater quality, the differences in the projected groundwater quality categories for the years 2020, 2000, and 1980 were calculated, generating maps of groundwater quality changes for the periods 1980–2000, 2000–2020, and 1980–2020. Regions of groundwater quality deterioration (non-PGQ to PGQ) during these periods were extracted as masks. This approach yielded detailed maps of the driving factors behind deteriorated groundwater quality regions across the three distinct periods.

In future scenarios, the analysis of dominant drivers focuses exclusively on the four primary categories (climate change, groundwater exploitation, land use, and pollution discharge), without individually detailing secondary pollution categories. We kept one specific driver category (e.g., climate) constant at its 2020 state, while allowing the other driver categories to vary with time.

We evaluated the temporal evolution of groundwater quality from 2021 to 2100 in four different scenarios, which allowed us to assess the long-term changes in groundwater quality by considering the continuity of the underlying factors over the specified time frame. To evaluate the impacts of climate change on groundwater quality, we utilized the SSP scenarios, namely SSP1-1.9, SSP2-4.5, and SSP5-8.5, as outlined in the IPCC AR664. These scenarios provide a framework for understanding and projecting different socio-economic and climate conditions, allowing us to examine the potential responses of groundwater quality to various development and climate change trajectories.

In Scenario I (Baseline), we assumed a consistent development intensity for all drivers. Climate, groundwater exploitation, and pollution discharges were maintained at the same levels as the cumulative average values in 2020. Similarly, land-use type in Scenario I was also assumed to remain unchanged from 2020. In contrast, Scenarios II–IV incorporated projections to account for future variability. Specifically, climate data from one CMIP6 model, EC-Earth3, were used as model input to project future PGQ probabilities65,66,67. Additionally, land-use data in these scenarios were derived from the SSP-RCP global 1 km land-use simulation dataset (2020–2100) for the SSP1-1.9, SSP2-4.5, and SSP5-8.5 scenarios, updated every five years68. To maintain consistency in representation, these land-use data were aligned with those used for the historical period from 1980 to 2020.

In Scenarios II–IV, annual change rates of −3%, 1%, and 3% were assigned to both groundwater exploitation and pollution discharge. These rates were determined based on historical trends69,70 and future projections71,72 related to industrial activities, agricultural expansion, and environmental policies. The low rate of −3% reflects a proactive approach toward reducing resource use and pollution discharge, aligning with the sustainability-focused SSP1-1.9 scenario73. In contrast, the medium rate of 1% corresponds to a moderate development pathway, as seen in SSP2-4.5. This scenario embodies a balanced approach, allowing socio-economic development to progress while placing emphasis on environmental protection74. The high rate of 3% aligns with the SSP5-8.5 scenario, characterized by rapid economic growth and minimal environmental regulation75. This scenario anticipates high energy demands, with substantial reliance on fossil fuels and industrial expansion. These rates were then applied to the corresponding values from 2021 to 2100, and the cumulative average was calculated as the model input for each year. We acknowledge that future development may exhibit more intricate patterns, and our current study may not fully capture this complexity and uncertainty. Please refer to Table S5 for detailed information on the specific scenario assumptions.
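The way an annual change rate feeds into the cumulative-average model input can be sketched as follows. One point is an assumption of this sketch: the rate is compounded year on year from the last historical value, which the text does not state explicitly. The function name and the short two-year history are illustrative.

```python
def project_cumulative_inputs(history, rate, end_year=2100):
    """Extend an annual series at a fixed rate and return cumulative averages.

    history: {year: value} for every historical year (1980-2020 in the study).
    rate:    annual change rate, e.g. -0.03, 0.01, or 0.03.
    ASSUMPTION: compound growth, value[y] = value[y - 1] * (1 + rate).
    """
    values = dict(history)
    running_sum = sum(values.values())
    inputs = {}
    for year in range(max(history) + 1, end_year + 1):
        values[year] = values[year - 1] * (1 + rate)
        running_sum += values[year]
        inputs[year] = running_sum / len(values)   # cumulative average to date
    return inputs


toy_history = {2019: 100.0, 2020: 100.0}           # illustrative, not real data
print(project_cumulative_inputs(toy_history, 0.03, end_year=2022))
print(project_cumulative_inputs(toy_history, -0.03, end_year=2022))
```

Because the model input is a running mean that includes the whole historical record, the input responds more slowly than the annual values themselves, even under the −3% and 3% rates.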

To evaluate the impact of uncertainties in climate change, groundwater exploitation, and pollution discharge on model predictions, we applied Monte Carlo (MC) simulations to quantify the variability in groundwater quality predictions due to input uncertainties76. This analysis was conducted for the period 1980–2100, with data processed at 5-year intervals for 1980–2015, annually from 2016 to 2020, and at 5-year intervals for future projections from 2025 to 2100 under the four scenarios.

For each scenario-year combination, MC simulations with 50 members were run to introduce variability in key natural and socio-economic predictors. Perturbations were applied to the dominant factors, such as climate, groundwater exploitation, and pollution discharge, reflecting their real-world uncertainties. For instance, temperature was varied within a ±2 °C range, while predictors such as precipitation, fertilizer use, and pesticide use were perturbed by up to ±20%.

The results of these simulations can be used to evaluate the range of possible groundwater quality outcomes based on uncertainties in input data. For each year and scenario, we calculated the mean and standard deviation of the predicted probabilities based on the ensemble outcomes, providing a robust estimate of both expectation and variability in groundwater quality predictions.
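A 50-member ensemble of this kind can be sketched as below. Two points are assumptions of this sketch: the perturbations are drawn from uniform distributions, since the text gives only the ranges, and a hypothetical linear function of temperature stands in for the trained RF model. The predictor names and input values are illustrative.

```python
import random
from statistics import mean, stdev


def monte_carlo(predict_proba, inputs, members=50, seed=0):
    """Return (mean, standard deviation) of predictions over perturbed inputs."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(members):
        perturbed = dict(inputs)
        # Temperature: additive perturbation within +/-2 degrees C (uniform assumed).
        perturbed["temperature"] = inputs["temperature"] + rng.uniform(-2.0, 2.0)
        # Other predictors: multiplicative perturbation within +/-20%.
        for name in ("precipitation", "fertilizer", "pesticide"):
            perturbed[name] = inputs[name] * (1 + rng.uniform(-0.2, 0.2))
        predictions.append(predict_proba(perturbed))
    return mean(predictions), stdev(predictions)


# Hypothetical stand-in for the model: probability depends on temperature only.
def toy_proba(x):
    return min(1.0, max(0.0, 0.02 * x["temperature"]))


cell = {"temperature": 15.0, "precipitation": 600.0,
        "fertilizer": 30.0, "pesticide": 1.2}
ensemble_mean, ensemble_std = monte_carlo(toy_proba, cell)
print(ensemble_mean, ensemble_std)
```

Repeating this for every grid cell, year, and scenario gives the maps of mean and standard deviation from which the spatial pattern of uncertainty is read.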

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.