Introduction

Clean water and sanitation are essential to society, as recognized in the United Nations’ Sustainable Development Goals (SDGs)1,2,3,4. Groundwater is a major freshwater resource that provides half of the drinking water available to the global population5,6. In China, this share can rise to as much as 70% for urban residents and 95% for rural residents7. Increasing population and expanding agriculture have continued to threaten global groundwater resources5,8. Groundwater nitrogen (N) accumulation has been reported globally, including in China9, Germany10, India11, and the United States12. Excessive reactive nitrogen, primarily nitrate, has caused cascading effects on ecosystem instability, groundwater degradation, and human health13. Although nitrate removal from groundwater has been achieved in some regions at considerable cost14,15, high nitrate concentrations in groundwater remain a serious and widespread issue, particularly in developing countries. Understanding whether this upward trend will persist under ongoing development and agricultural intensification is of paramount importance16,17.

Globally, agriculture and wastewater discharge are responsible for high nitrate concentrations in groundwater18,19, except in regions with naturally occurring nitrate from bedrock20. Although nitrogen fertilizers have historically been the primary contributor to high nitrate concentrations in groundwater21, the contribution of wastewater has gained growing importance with expanding urbanization22. In addition, changing environmental conditions across diverse aquatic ecosystems, including changing climate and intensifying anthropogenic activities23,24, could potentially modify nitrogen input and the pathways of reactive nitrogen25,26, which further complicates our understanding of nitrate dynamics in groundwater. Discerning the distinct impacts of individual influential factors remains a significant challenge, partly due to inadequate large-scale groundwater sampling and data. In addition, extrapolation of understanding from local-scale studies to regional or continental levels poses an elusive challenge.

Groundwater stores large quantities of nitrogen and facilitates its transport, transformation, and accumulation27,28,29. However, the spatial relationship between groundwater aquifer depth and nitrogen dynamics has rarely been characterized. Hence, understanding the effect of aquifer depth on groundwater nitrogen dynamics is essential to reveal mechanisms that regulate groundwater nitrate and groundwater quality. Nevertheless, it is challenging to obtain accurate data on aquifer depth and groundwater nitrate dynamics at a large scale9. Fortunately, wells have been the most extensive and direct means of withdrawing groundwater since ancient times, and their widespread distribution across China provides a promising avenue for investigating the dynamics of nitrate in groundwater. There are three main reasons: wells offer convenient access for groundwater sampling; groundwater nitrate dynamics have been extensively investigated throughout China since 2000; and previous studies have shown that the groundwater table can serve as a representative measure of local aquifer depth in numerous regions30,31. Therefore, it is reasonable and innovative to study the role of aquifer depth in nitrate dynamics by using well data.

Here, we compiled and analyzed available data on nitrate concentrations and isotopic data in groundwater across gradients of geographical features, climate zones, and economic regions in China. We aimed to answer the following questions: (1) What is the spatial pattern of groundwater nitrate concentrations in China? (2) How do the nitrate concentrations in groundwater vary with aquifer depth? (3) How do climate conditions, geography, and anthropogenic activities interact to influence nitrogen concentration in groundwater? We examined the influence of various factors, including groundwater aquifer depth, hydrochemical characteristics, land use, ecoregion, latitude, longitude, population density (PD), mean annual precipitation (MAP), and mean annual temperature (MAT). The results provide comprehensive datasets and identify influential drivers of nitrate dynamics in groundwater. The outcome of this study can provide valuable insights not only for managing nitrogen pollution in groundwater but also for offering theoretical support for developing groundwater quality management strategies that subsequently contribute to informed decision-making on both national and global scales.

Results and discussion

The dynamics of nitrate concentrations in groundwater across China

Groundwater aquifer depths range from 0 to 860 m (Fig. 1). The average depths of the groundwater in lowlands, middlelands, and highlands are 83.0 ± 127.5 m (mean ± SD), 62.7 ± 75.4 m, and 23.3 ± 51.0 m, respectively (Table S1). The data show that groundwater aquifer depths in southern China are generally shallower than those in northern China, owing to the higher precipitation and shallower groundwater tables in southern China (Fig. 1a).

Fig. 1: The spatial variation of nitrate concentrations in 4047 groundwater sites in China.

a The upper and right subgraphs show the counts of longitude and latitude distributions of the sample sites, respectively. The red bar chart above the map shows the number of sampling points in different years. b Mean concentrations of nitrate in eight economic regions: North Coast (NC), Middle of Yellow River (MYR), North East (NE), South West (SW), East Coast (EC), North West (NW), Middle of Changjiang River (MCR), and South Coast (SC). c The relative importance of influential drivers for groundwater nitrate concentrations, determined by random forest regression. The columns marked * (P < 0.05) and ** (P < 0.01) indicate drivers with a significant impact on nitrate concentrations, while ns (P > 0.05) indicates drivers with no significant impact. d, e Nitrate concentrations under different land use types and population densities, respectively (classification methods in “Materials and Methods”). In the box plots, the box limits show the upper and lower quartiles, the small cube shows the average value, and the center line shows the median.

Nitrate concentrations (reported as NO3- rather than N-NO3- throughout this study, unless otherwise specified) ranged from 0 to 824.6 mg L–1 (mean ± SD: 31.2 ± 69.4 mg L–1; median: 9.0 mg L–1), exceeding previous reports by Gu et al.9 (10.9 ± 18.1 mg L–1 in groundwater in China) and Ioannis et al.13 (5.5 ± 5.1 mg L–1 in global groundwater). Concentrations were highest in shallow groundwater, reaching 824.6 mg L–1 (70.3 ± 136.6 mg L–1). Geographically, nitrate concentrations exhibited significant spatial variation, with relatively higher concentrations in northern China than in southern China (Fig. 1a and Table S1). The economic regions in northern China (NC, MYR, and NE) exhibited the highest mean nitrate concentrations. Notably, NC stands out with a mean concentration of 56.8 ± 104.6 mg L–1, aligning with its highly developed economy (Fig. 1b). In addition, lowlands with dense populations displayed a higher average nitrate concentration (36.1 ± 76.7 mg L–1) than the middlelands and highlands. Groundwater nitrate concentrations also changed with land use types. As expected, groundwater in regions with minimal disturbance exhibited significantly lower concentrations than groundwater in cropland and urban regions (Fig. 1d). Aside from water quality variables (TDS), MAP and MAT are the second most important attributes driving nitrate concentrations, indicating the importance of climate control on solute concentrations32. This is also consistent with nitrate concentrations in rivers, for example, in the United States33, although mean nitrate concentrations of groundwater aquifers are usually higher in croplands. Similarly, regions with higher population densities showed slightly elevated nitrate concentrations, which could be attributed to increased nitrogen load (Fig. 1e).
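
Because concentrations here are reported as NO3- rather than N-NO3-, comparisons with values expressed as nitrate-nitrogen require a unit conversion. The following is a minimal sketch of that conversion based only on molar masses; the function names are illustrative and not part of the study.

```python
# Molar masses in g/mol from standard atomic weights (N = 14.007, O = 15.999).
M_N = 14.007
M_NO3 = M_N + 3 * 15.999  # nitrate ion, about 62.004 g/mol


def nitrate_to_nitrate_n(no3_mg_per_l: float) -> float:
    """Convert mg NO3- per litre to mg N per litre (nitrate-nitrogen)."""
    return no3_mg_per_l * M_N / M_NO3


def nitrate_n_to_nitrate(no3_n_mg_per_l: float) -> float:
    """Convert mg N per litre (nitrate-nitrogen) to mg NO3- per litre."""
    return no3_n_mg_per_l * M_NO3 / M_N


if __name__ == "__main__":
    # Summary statistics quoted in the text, expressed as NO3-.
    for label, value in [("mean", 31.2), ("median", 9.0), ("maximum", 824.6)]:
        as_n = nitrate_to_nitrate_n(value)
        print(f"{label}: {value} mg/L as NO3- is {as_n:.1f} mg/L as N")
```

The conversion factor is roughly 4.43 in one direction and 0.226 in the other, so a threshold stated in one form should not be compared directly with a concentration stated in the other.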

We used geospatial machine learning (i.e., a random forest model) based on predictor datasets (Table S2) and generated nitrate concentrations across China at a spatial resolution of 0.05° × 0.05° (Fig. 2). The model performed well (R2 = 0.53 and NSE = 0.50). The map identified more severe nitrate pollution in northern China than in southern China (Fig. 2a). For instance, the groundwater of agricultural regions (such as North Coast and Guanzhong Plain) and urban regions (such as Beijing, Tianjin, and Guangdong-Hong Kong-Macao Greater Bay Area) faces a greater threat from nitrate than that of natural land. In addition, model uncertainty was assessed using the ML model. High uncertainty was mainly concentrated in regions with limited measurements, such as the Qinghai-Tibet Plateau, northeastern China, and Yunnan Province (Fig. 2b). In contrast, model uncertainty was substantially lower in regions with abundant observations, such as the Yellow River Basin and Yangtze River Basin. Note that large uncertainty is also observed in regions with low concentrations, such as Yunnan Province, indicating the limitation of model capacity in regions with lower values34. Overall, more field measurements of nitrate concentrations are required to reduce prediction uncertainty in hotspot regions.
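
For readers unfamiliar with the two skill scores quoted above, the sketch below shows one common way to compute them from paired observed and simulated values. It assumes that R2 is the squared Pearson correlation and that NSE is the Nash-Sutcliffe efficiency; the text does not state which R2 definition was used, and the sample numbers are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


if __name__ == "__main__":
    # Hypothetical nitrate values (mg/L), for illustration only.
    obs = [5.0, 12.0, 30.0, 48.0, 90.0]
    sim = [9.0, 15.0, 26.0, 40.0, 70.0]
    print(f"R2 = {r_squared(obs, sim):.2f}, NSE = {nse(obs, sim):.2f}")
```

Under these definitions NSE penalizes bias as well as scatter, so it is never larger than the squared correlation for the same pair of series, which is consistent with the NSE reported here being slightly below R2.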

Fig. 2: Nitrate dynamics in Chinese groundwater.

a Simulated nitrate concentrations and (b) uncertainty in Chinese groundwater using a machine learning algorithm (random forest modeling), predicted by 66 spatially continuous environmental variables (Table S2). The nitrate values were grouped into 11 categories to generate a color gradient (on the right), where blue and red colors represent low and high values, respectively.

Nitrate concentrations decrease with groundwater aquifer depth

In contrast to existing studies that predominantly examined nitrogen concentrations at individual sites35,36, this study examined spatial variation and interconnected patterns of nitrate concentration in groundwater. Among all hydrochemical characteristics and other influencing factors, aquifer depth exerted the most predominant influence on nitrate concentration in groundwater (Fig. 1c and Fig. 3a–e). Nitrate concentrations decreased with increasing aquifer depths (R2 = 0.34, P < 0.001) (Fig. 3c). This general pattern was also evident in North China, where aquifer depth and nitrate concentrations varied tremendously (Fig. S1a). δ15N-NO3- increased and then decreased with increasing aquifer depth, peaking at about 8 m (Fig. 3d), suggesting that nitrate may have undergone two distinct biogeochemical stages. The first is that more anoxic conditions are coupled with increasing carbon sources (e.g., DOC) (Fig. S2a), resulting in a greater denitrification with increasing depth. The subsequent decline may be attributed to declining microbial abundance and carbon and nitrogen sources with increasing aquifer depth37. Values of δ18O-NO3- continued to increase, but also showed minimum values at peak values of δ15N-NO3- (Fig. 3e), which may be attributed to processes including water evaporation, which enriched δ18Owater and subsequent isotopic fractionation during nitrification13. These changes in nitrate dynamics and isotopes indicate combined influence of multiple drivers, which will be discussed in the subsequent section.

Fig. 3: Nitrate concentrations, isotopes, and groundwater aquifer depth.

a Nitrate concentrations at various aquifer depths and elevations. b The frequency distribution of NO3- concentrations and aquifer depths. c NO3- concentrations, including only samples in the first to third quartile nitrate concentrations for the four classes of aquifer depths, (d) δ15N-NO3-, (e) δ18O-NO3- as a function of aquifer depth. Nitrate concentrations generally decreased with increasing depth and MAP (c); values of δ15N-NO3- peak, whereas those of δ18O-NO3- reach lowest values at some intermediate depths (d, e).

Values of δ15N-NO3- and δ18O-NO3- varied with aquifer depth, but the correlations were insignificant (ANOVA, P > 0.05). The variability of δ15N-NO3- values decreased from shallow (CV = 1.03, <20 m) to very deep groundwater (CV = 0.40, >100 m), which may be caused by the diverse nitrate sources and/or varying degrees of biological activity38. Although the majority of samples fell within the theoretical range of nitrification (indicated by the box regions in Fig. S3a–d), some were outside this range, suggesting the potential role of other processes such as denitrification in regulating nitrate concentrations39. The lack of a discernible relationship between δ15N-NO3- and 1/NO3- in groundwater at different depths (Fig. S3e–h) implies that δ15N-NO3- is likely influenced by multiple sources and multiple biological removal processes40. Meanwhile, δ15N-NO3- and Ln(NO3-) were negatively correlated (Fig. S3i–l), especially in shallow groundwater (<20 m, Fig. S3i), indicating the occurrence of biological removal processes41. Therefore, groundwater biogeochemistry, including nitrification and denitrification, may play a key role in regulating N dynamics and turnover within groundwater.
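
The negative relationship between δ15N-NO3- and Ln(NO3-) is usually interpreted through a Rayleigh-type model, δ = δ0 + ε·ln(C/C0), in which the slope of δ15N against ln(C) approximates the enrichment factor ε. The sketch below fits that slope by ordinary least squares. It is an illustration of the general approach rather than the authors' procedure, and the synthetic data (C0 = 50 mg/L, δ0 = 5‰, ε = –10‰) are hypothetical.

```python
from math import log


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rayleigh_enrichment(no3_mg_per_l, d15n_permil):
    """Regress delta15N on ln(NO3-); the slope approximates the enrichment factor."""
    ln_c = [log(c) for c in no3_mg_per_l]
    return fit_line(ln_c, d15n_permil)


if __name__ == "__main__":
    # Synthetic samples generated from the Rayleigh equation itself (hypothetical).
    c0, delta0, epsilon = 50.0, 5.0, -10.0
    conc = [50.0, 40.0, 30.0, 20.0, 10.0]
    d15n = [delta0 + epsilon * log(c / c0) for c in conc]
    slope, intercept = rayleigh_enrichment(conc, d15n)
    print(f"estimated enrichment factor: {slope:.2f} permil")
```

A clearly negative slope is what removal by denitrification would produce, whereas simple mixing of sources tends to give a linear trend against 1/NO3- instead, which is why the two plots are examined together.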

Groundwater nitrate sources

To quantify the contribution of different sources to nitrate, we used a Bayesian stable isotope mixing model (MixSIAR). We considered atmospheric deposition, manure & sewage, soil nitrogen, and chemical fertilizer41,42 to estimate the potential end-members contributing to nitrate in groundwater (Table S3). The model results showed that nitrate in groundwater primarily originated from soil nitrogen and manure & sewage, accounting for 38.4 ± 23.0% and 31.8 ± 13.4%, respectively (mean ± SD); atmospheric deposition contributed only a small fraction (10.4 ± 3.3%). The contribution of soil nitrogen was greatest in shallow (44.5 ± 22.9%), deep (44.1 ± 26.7%), and very deep groundwater (40.5 ± 23.3%), whereas manure & sewage dominated the nitrate pool in medium groundwater (44.3 ± 22.1%) (Table S4). The unexpected dominance of manure & sewage in medium groundwater may be linked to irrigation practices from the 1970s, when manure & sewage were commonly used, although these practices are much less common in China today43. The spatial distribution of nitrogen input to the subsurface system varies across ecoregions. For example, in North China, groundwater nitrogen predominantly originates from manure & sewage (65.1%), with soil nitrogen, atmospheric deposition, and chemical fertilizer contributing 18.0%, 9.3%, and 7.6%, respectively44. Conversely, in South West, groundwater nitrate is derived more from chemical fertilizer (32%) than from soil nitrogen, atmospheric deposition, and manure & sewage, which contribute 25%, 18%, and 25%, respectively45.
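
To illustrate the idea behind a Bayesian isotope mixing model, the sketch below estimates source proportions for a single sample by importance sampling from a flat Dirichlet prior. It is not MixSIAR and does not reproduce its model structure; the end-member means and standard deviations are hypothetical placeholders, not the values in Table S3, and fractionation is ignored.

```python
import math
import random

# Hypothetical end-members: ((mean d15N, mean d18O), (SD d15N, SD d18O)) in permil.
SOURCES = {
    "atmospheric deposition": ((2.0, 60.0), (3.0, 8.0)),
    "manure & sewage": ((12.0, 5.0), (3.0, 3.0)),
    "soil nitrogen": ((4.0, 4.0), (2.0, 3.0)),
    "chemical fertilizer": ((0.0, 20.0), (2.0, 3.0)),
}


def sample_proportions(rng, k):
    """Draw one vector from a flat Dirichlet prior via normalized gamma draws."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def log_likelihood(sample, proportions, means, sds):
    """Gaussian log-likelihood of the sample given a proportion-weighted mixture."""
    ll = 0.0
    for j, obs in enumerate(sample):
        mu = sum(p * m[j] for p, m in zip(proportions, means))
        var = sum((p * s[j]) ** 2 for p, s in zip(proportions, sds))
        ll += -0.5 * math.log(2 * math.pi * var) - (obs - mu) ** 2 / (2 * var)
    return ll


def posterior_mean_proportions(sample, n_draws=20000, seed=0):
    """Importance-weighted posterior mean contribution of each source."""
    rng = random.Random(seed)
    names = list(SOURCES)
    means = [SOURCES[name][0] for name in names]
    sds = [SOURCES[name][1] for name in names]
    draws = [sample_proportions(rng, len(names)) for _ in range(n_draws)]
    log_w = [log_likelihood(sample, p, means, sds) for p in draws]
    max_lw = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - max_lw) for lw in log_w]
    total = sum(weights)
    return {
        name: sum(w * p[i] for w, p in zip(weights, draws)) / total
        for i, name in enumerate(names)
    }


if __name__ == "__main__":
    # Hypothetical sample: d15N = 8 permil, d18O = 6 permil.
    for name, share in posterior_mean_proportions((8.0, 6.0)).items():
        print(f"{name}: {share:.2f}")
```

With four sources and only two isotope tracers the system is underdetermined, which is why a probabilistic estimate with wide intervals, such as the standard deviations reported above, is more appropriate than a single deterministic solution.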

Uncertainties in the estimated source proportions of groundwater nitrate arise from the mixing of multiple sources and from transformation processes with different isotopic fractionation41. The high soil nitrogen contribution may reflect more samples from croplands than from other land use types (Table S4). As has been discussed extensively in the literature (e.g., Van Meter et al.18), agricultural lands have accumulated nitrogen originating from nitrogen fertilizers, manure, and other organic materials for decades to centuries46. The data here potentially indicate that the accumulation of legacy soil nitrogen contributed more to the observed nitrate than the more recent addition of fertilizer and atmospheric deposition. Thus, when nitrogen inputs are reduced, the nitrogen accumulated in soil organic matter may still be released and transported to aquifers, contributing to the nitrogen legacy in groundwater18,47. This also indicates that neglecting groundwater nitrogen isotope fractionation may blur source signals and lead to inaccurate results. For example, although the δ15N-NO3- values for N fertilizers are approximately 0‰, the volatilization of ammonia can introduce a bias of about 2‰, which may increase to as much as 4‰ when the nitrogen is first taken up by crops and later released as they degrade48,49.

Chronic pollution persists in North China and the Middle Yellow River

A comprehensive understanding of the dynamic relationships among influencing factors, groundwater aquifer depth, and nitrate dynamics is a prerequisite for the management of the global groundwater system27,50. The correlations among aquifer depth, geographical location, and climate may reflect an intrinsic relationship between aquifer depth and nitrate dynamics (Fig. 4). When the aquifer depth was greater than 140 m, most groundwater had low nitrate concentrations, whereas groundwater shallower than 140 m in lowlands and middlelands with high population density often had medium (7.5%) and poor (8.8%) water quality, as measured by nitrate concentrations (Fig. 4a). Notably, in middlelands with both very high and low populations, groundwater quality deteriorated markedly to a poor level (35%) when aquifer depth ranged from 0 to 18 m (Fig. 4a), which may be attributed to more recent nitrogen inputs9. According to the threshold of aquifer depth, Class I groundwater (≤140 m, 75.8%) was mainly located in lowlands (53.5%) and middlelands (33.0%), with cropland being the dominant land cover (49.4%). However, less than half of the groundwater (44.8%) met the water quality standard of 10 mg L–1 (Fig. S4). In contrast, Class II groundwater (>140 m, 24.2%) was mainly situated in lowlands (71.0%) and middlelands (22.8%), characterized by high population density levels (55.8%) and urban land cover types (44.0%), and the overall water quality was excellent (82.4%) (Fig. S4). Additionally, the threshold for nitrate dynamics under different influencing factors showed that most Class II groundwater (>140 m) is classified as excellent, whereas there is significant variability in the nitrate dynamics of Class I groundwater (≤140 m) (Fig. 4a and Fig. S4), highlighting the substantial impact of anthropogenic activities33. Note that there are two turning points in nitrate dynamics as aquifer depth increases: one is the turning point shown in Fig. 3e (aquifer depth ca. 8 m), and the other is the threshold from the decision tree-heatmap (Fig. 4a, aquifer depth ca. 140 m). This indicates that anthropogenic activities have impacted depths of up to ca. 140 m (i.e., Class I groundwater), and that denitrification has concurrently affected shallow groundwater (≤8 m) owing to abundant carbon sources51. This information can provide theoretical support for sustainable groundwater quality management by explaining why shallow groundwater is more susceptible to high nitrate levels.
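
The two depth groupings used in this study can be written as simple classification rules. The sketch below assumes the four depth classes given in the Fig. 6 caption (shallow <20 m, medium 20–50 m, deep 50–100 m, very deep >100 m) and the 140 m decision-tree threshold described above. The treatment of values falling exactly on 50 m and 100 m is an assumption, since the text does not specify it, and the function names are illustrative.

```python
def depth_class(depth_m: float) -> str:
    """Assign one of the four depth classes used for the vertical profiles."""
    if depth_m < 0:
        raise ValueError("aquifer depth must be non-negative")
    if depth_m < 20:
        return "shallow"
    if depth_m <= 50:  # assumption: exactly 50 m is counted as medium
        return "medium"
    if depth_m <= 100:  # assumption: exactly 100 m is counted as deep
        return "deep"
    return "very deep"


def threshold_class(depth_m: float, threshold_m: float = 140.0) -> str:
    """Class I is at or above the threshold depth (<=140 m); Class II is deeper."""
    if depth_m < 0:
        raise ValueError("aquifer depth must be non-negative")
    return "Class I" if depth_m <= threshold_m else "Class II"


if __name__ == "__main__":
    for depth in [8, 35, 75, 120, 300]:
        print(depth, depth_class(depth), threshold_class(depth))
```

Note that the two schemes overlap: part of the "very deep" class (100–140 m) still belongs to Class I, the group in which anthropogenic influence on nitrate is reported to be most variable.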

Fig. 4: The relationships between groundwater nitrate concentrations and influencing factors, including groundwater aquifer depth, geographical factors (including ecoregion, latitude, and longitude), and climatic factors (including MAP and MAT).

a A decision tree-heatmap for predicting the threshold for nitrate concentration based on environmental factors associated with anthropogenic activities. b, c Structural equation models describing the direct, indirect, and total effects of aquifer depth, geographical factors, climate factors, and hydrochemical characteristics on nitrate. Path coefficients (i.e., regression coefficients; numbers adjacent to the paths) represent the effect size (λ) of each relationship, and the coefficient of determination (R2 = 0.63) represents the proportion of variance explained by the relationships in the model. The direct effect represents the impact of the independent variable on the dependent variable, while the indirect effect reflects its influence through one or more intermediate variables. The total effect is the combined effect of the independent variable on the dependent variable, which is the arithmetic sum of the direct and indirect effects. The thickness of the arrows reflects the relative magnitudes of the standardized path coefficients. Red, blue, and gray denote positive, negative, and irrelevant effects, respectively. The dashed lines represent insignificant relationships; *P < 0.05, **P < 0.01, and ***P < 0.001.

The association among auxiliary data (including geographical, climatic factors, and hydrochemical characteristics), aquifer depth, and nitrate concentration using structural equation model showed interesting relationships (Fig. 4b, c). TDS (λ = 0.61), EC (λ = 0.44), and geographical factors (λ = 0.26) exhibited positive effects on nitrate, whereas aquifer depth (λ = –0.15) and T-water (λ = –0.14) showed negative effects. Furthermore, the total effect of climate factors (λ = 0.40) on nitrate was the highest, wherein the indirect effect of aquifer depth (λ = –0.22) and climate factors (λ = 0.46) surpassed the direct effect. However, the direct effect of geographical factors (λ = 0.26) remained greater than the combined indirect effects.
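
The decomposition of effects in a structural equation model follows a simple rule: the indirect effect along a path is the product of the path coefficients on that path, indirect effects are summed over paths, and the total effect is the sum of direct and indirect effects (as stated in the Fig. 4 caption). The sketch below shows that arithmetic. The two-step path in the first example is hypothetical, and the totals computed from the reported λ values are implied by this rule rather than quoted from the paper.

```python
from math import prod


def indirect_effect(paths):
    """Sum over mediated paths of the product of coefficients along each path."""
    return sum(prod(path) for path in paths)


def total_effect(direct, paths):
    """Total effect = direct effect + sum of indirect effects."""
    return direct + indirect_effect(paths)


if __name__ == "__main__":
    # Hypothetical mediated path X -> M -> Y with coefficients 0.5 and 0.4.
    print(f"indirect effect: {indirect_effect([[0.5, 0.4]]):.2f}")

    # Aquifer depth: direct -0.15 and aggregate indirect -0.22 as reported above.
    print(f"implied total effect of depth: {total_effect(-0.15, [[-0.22]]):.2f}")

    # Climate: reported total 0.40 and indirect 0.46 imply a small direct effect.
    print(f"implied direct effect of climate: {0.40 - 0.46:.2f}")
```

Read this way, climate acts on nitrate almost entirely through intermediate variables, while aquifer depth combines a direct negative effect with a larger negative indirect one.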

Groundwater nitrate concentrations were predominantly controlled by environmental factors, especially climate factors indirectly32,52,53 (Fig. 4c). Climatic conditions, including temperature and precipitation, can significantly influence the contact time between water and nitrogen-binding soils, thereby facilitating transformation of nitrate into other forms via processes such as denitrification54,55. Sadayappan et al.33 found that arid climate can elevate nitrate concentrations in water by reducing its transport and transformation through lower water content and flow, even without human-induced nitrogen inputs. The influence of climatic factors on nitrate dynamics was achieved through indirect processes (Fig. 4c) and biogeochemical reactions. Munz et al.56 demonstrated that temporal variations in temperature and redox zonation dominate the migration and degradation of nitrate within groundwater aquifer, which is also verified by the relationship between Oxidation-Reduction Potential (Eh) and NO3- in this study (Fig. S2b).

In addition, more humid regions with higher precipitation had lower nitrate concentrations at similar depths (Fig. 3c). To further explore this, we compared North Coast, which has a more arid climate, and South West, which has much more precipitation. The two regions also exhibited distinct agricultural practices, lithologies, aquifer depths, and nitrate dynamics (Fig. S1). The δ18O-NO3- enrichment differs between NC and SW, with isotopic enrichment occurring more in NC as aquifer depth increases in a more arid climate but not as much in humid SW (Fig. S1). The observed isotopic enrichment of δ18O-NO3- in NC with deeper groundwater likely arose from greater evaporation13. In contrast, this does not occur as much in the humid SW57. In effect, low precipitation in NC probably reduces transport and nitrogen use efficiency28,44. In contrast, in wet and rainy SW, abundant rainfall throughout the year facilitates the leaching of legacy nitrogen, as well as the utilization of nitrogen fertilizers through more sufficient soil moisture58,59. Although widespread irrigation practices are present in NC, they are primarily concentrated during the crop growing seasons60. For much of the year, soil moisture remains relatively low, limiting the transformation and mobility of soil nitrogen. This intermittent moisture availability may further restrict soil nitrogen transport and processing, contributing to lower nitrogen use efficiency outside the irrigation periods59. This is consistent with the general conclusion that solute concentrations tend to be higher in arid climates and lower in humid climates32. That study examined concentrations of 16 solutes, including nitrate, at hundreds of sites in the United States. The lower concentrations in humid climates were attributed to more rapid solute export from the systems (both surface water and groundwater). The influence of climate on nitrate concentrations in inland waters, especially groundwater, is generally underestimated in the literature. Data from the present study highlight the important role of climate in groundwater nitrate concentrations.

Unlike climatic factors, geographical factors had a direct impact on groundwater nitrate concentrations (Fig. 4c). The distribution of ecoregions, location, and population density determine land use types, which in turn strongly affect nitrogen input. Nakagawa et al.61 and Sadayappan et al.33 identified land use as the most significant factor affecting nitrate concentrations in rivers. Groundwater beneath urban land and cropland had higher nitrate concentrations (Fig. 1d), similar to observations for riverine nitrate concentrations in Zhi and Li62. Moreover, different developmental and anthropogenic activities also affect nitrate concentration in different regions. There is a roughly linear relationship between NO3- and agricultural N fertilizer (R2 = 0.11, P < 0.001), TN wastewater (R2 = 0.29, P < 0.05), and ammonia wastewater (R2 = 0.11, P > 0.05) across eight economic regions (Fig. S5). Further analysis revealed that nitrate concentration in the NC economic region significantly deviated from the fitting line for both agricultural N fertilizer and ammonia N wastewater (Fig. S5). This suggests that nitrate concentrations are primarily influenced by industrial and domestic wastewater, especially TN wastewater, rather than by agricultural activities. For groundwater in cropland, high nitrate concentrations resulted from N legacies due to low nitrogen use efficiency63. The global average nitrogen use efficiency was 0.46 in 201064, whereas in China, the world’s largest nitrogen fertilizer consumer, it remained below 0.3 until 201165. Agricultural activities and wastewater discharge containing nitrogen are important drivers of global groundwater N enrichment27,41.

Statistics indicated a consistent increase in the discharge of TN and ammonia wastewater prior to 2015 (Fig. S6). In response, the Chinese government implemented a series of comprehensive treatment measures in 2015, such as improved wastewater treatment, which effectively mitigated this trend66. Although the data used in this study are based on single sampling events, the data points across different economic zones are widely distributed during both 2000–2015 and 2016–2020 (see numbers of samples in different economic zones from 2000 to 2020 in Fig. 1a and Table S5). To better capture the temporal variations, we also calculated the overall means and standard deviations (SD) for the two periods: 2000–2015 and 2016–2020 (Fig. 5). Notably, overall groundwater nitrogen pollution significantly declined during 2016–2020 compared to 2000–2015, except in the NC and MYR (Fig. 5a). This is due to the significantly deeper groundwater in these two regions compared to other regions, such that they do not yet reflect the short-term nitrogen control measures. In other words, it would take a longer time for nitrogen management practices to be effective for deeper groundwater18. This reflects the complexity of transport and the persistence of legacy nitrogen, reinforcing the notion that recovery is gradual and varies depending on soil depth, land use, and hydrogeological characteristics52,67,68. There has been a significant reduction in agricultural nitrogen input in these two regions (Fig. 5b), indicating that the current agricultural treatment measures will have positive implications for the mitigation of groundwater nitrogen pollution in the future. However, rising populations and living standards may intensify the impacts of groundwater nitrate legacy.

Fig. 5: Regional-level changes in groundwater nitrate concentration and agricultural N fertilizer input for 2000–2015 and 2016–2020.

a Comparison of groundwater nitrate concentrations in eight economic regions in 2000–2015 and 2016–2020. b Comparison of per-unit agricultural N fertilizer input in eight economic regions in 2000–2015 and 2016–2020. This broad spatial coverage is crucial for capturing the major regional trends in nitrogen pollution, adequately representing the two distinct periods. Blue circles and yellow triangles indicate 2000–2015 and 2016–2020, respectively. Blue shading (and red ellipse) indicates that the 2016–2020 value is higher than the 2000–2015 value, whereas green shading (and blue ellipse) indicates the opposite.

Biogeochemistry shapes nitrate dynamics at different aquifer depths

The random forest analysis suggested that groundwater aquifer depth was the primary factor affecting the spatiotemporal variations in groundwater nitrate dynamics (Fig. 1c). The influential factors were ranked as follows: groundwater aquifer depth (GW depth), longitude, total dissolved solids (TDS), mean annual precipitation (MAP), mean annual temperature (MAT), latitude, electrical conductivity (EC), water temperature (T-water), ecoregion, and pH. Although there was a significant range in aquifer depth (0–860 m) and nitrate concentration (0–824.6 mg L–1) (Fig. 3b), nitrate concentrations were generally lower in deeper groundwater (Fig. 3c), highlighting the importance of depth in influencing nitrate concentrations36.

In theory, shallow groundwater with a short retention time69 and vegetation uptake51 offers more efficient removal of nitrogen pollution compared to deep groundwater. However, shallow groundwater also receives more nitrogen input from anthropogenic activities and is more susceptible to external environmental factors, as indicated by the higher anthropogenic N sources in shallow groundwater than in medium-depth groundwater (Fig. 6 and Table S4). The highest mean nitrate concentrations in shallow groundwater can be attributed to large nitrogen inputs from external environment70 and the presence of dissolved oxygen (DO) and nitrifying bacteria in shallow groundwater, which can transform more soil organic N, nitrite, and ammonia into nitrate through nitrification69,71. The lower nitrate concentration in deep groundwater may be influenced by multiple factors. DO concentrations decreased with increasing depth (Fig. 6c), leading to anoxic conditions that enhanced denitrification, which could lower nitrate concentrations in deep groundwater24. This was further supported by the significant positive correlation between δ15N-NO3- and δ18O-NO3- (Fig. S7). However, reducing minerals, such as pyrite, can also chemically react with nitrates, contributing to nitrate reduction in deeper groundwater. Therefore, further research is required to fully understand the relative contributions of denitrification and reducing minerals to nitrogen removal from deep groundwater. Nitrate accumulation in deep groundwater (50–100 m) is generally higher than that in medium groundwater (20–50 m) (Fig. S8 and Table S1), indicating that groundwater nitrogen pollution in China is no longer limited to shallow layers, but gradually accumulates in deep layers27. Therefore, different strategies are required to remediate nitrogen pollution at different depths.

Fig. 6: Vertical distributions of the mean values of δ15N-NO3-, δ18O-NO3-, NO3-, DO, dissolved organic carbon (DOC), and T-water in shallow (<20 m), medium (20–50 m), deep (50–100 m), and very deep (>100 m) from different locations across China.

a δ15N-NO3- (‰; N = 525) and δ18O-NO3- (‰; N = 401) at different depths. b NO3- concentrations (mg L−1; N = 4047) at different depths. c DO (mg L−1; N = 689) at different depths. d DOC (mg L−1; N = 278) at different depths. e T-water (°C; N = 1592) at different depths. Values are presented as mean ± SEM (Standard Error of Mean).
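
Because Fig. 6 reports mean ± SEM whereas most values in the text are given as mean ± SD, it may help to recall how the two relate: SEM is the sample standard deviation divided by the square root of the sample size. The sketch below computes both for one depth class; the input values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to estimate the SEM")
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical dissolved oxygen values (mg/L) for one depth class.
    do_values = [2.0, 4.0, 6.0, 8.0]
    avg, sem = mean_and_sem(do_values)
    print(f"mean = {avg:.2f}, SD = {stdev(do_values):.2f}, SEM = {sem:.2f}")
```

Since the SEM shrinks with sample size, the error bars for the large nitrate dataset are expected to be much narrower than the corresponding SD, and they describe uncertainty in the mean rather than the spread among wells.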

The mixing of different sources and biogeochemical processes regulates nitrate concentrations and its isotopic composition72. The variation pattern of nitrate concentrations and their isotopes with aquifer depth further highlights the crucial role of biological activity in nitrogen dynamics (Fig. 3). Specifically, more anoxic conditions and lower DOC (Fig. 6c, d and Fig. S2) in groundwater with increasing depth can explain the increase in δ15N-NO3- by denitrification in the early stage of the previous period (Fig. 3d), which is consistent with our previous hypothesis. Note the negligible impact of MAT (Fig. 1c) and water temperature (Fig. 6) on nitrate concentrations. Although it has been established that temperature influences nitrate dynamics, these effects are often driven by biological activities mediated by microbes73. In deeper subsurface, where microbial activity is scarce, temperature effects may drop to their minimum, as indicated here74,75. In addition, the change in groundwater temperature from shallow to deep layers was small (Fig. 6e), such that the water temperature was not an influential factor in determining groundwater N cycling.

Implications and uncertainties

This study explored the drivers of mean nitrate concentration in groundwater at different aquifer depths under gradients of climate, geographical factors and anthropogenic activities in China. Results revealed the predominant role of aquifer depth in determining nitrate dynamics across China, offering valuable recommendations for groundwater quality management in China and globally. However, it is important to note that the datasets of land use types, ecoregions, and population density were derived from remote sensing data, which inevitably contain some errors and uncertainties76. In addition, although the groundwater nitrate dataset was compiled as extensively as possible, there remains a lack of representative samples from the Qinghai-Tibet Plateau. Therefore, targeted studies in this region should be conducted in the future to enhance statistical significance, thereby strengthening the validity of findings.

These wells connect the ground surface and underground, providing an opportunity to investigate the impacts of anthropogenic activities on subsurface systems. However, the role of aquifer depth in nitrate dynamics and cycling processes remains largely unexplored, especially with regard to groundwater biogeochemical processes36. Despite the consensus that shallow groundwater is more susceptible to nitrogen contamination, this study found that deep groundwater tends to be distributed in lowlands with high population densities, thereby increasing the risk of future contamination and leading to nitrogen enrichment in deep groundwater. Hence, controlling the nitrogen input is a primary strategy for managing global groundwater quality. Moreover, considering the coexistence of deep groundwater and high nitrate concentration in northern China (Fig. 1), more effective nitrogen management policies should be implemented to address the dual challenge of maintaining underground ecosystem stability and protecting groundwater quality, including strengthening control over both point source and non-point source pollution70. Note that nitrogen pollution control is a long-term process that does not yield immediate results due to the presence of legacy nitrogen, particularly in regions with deeper aquifers. According to Van Meter et al.18, even with 100% effective nitrogen use, it would take decades to achieve the nitrogen loading target in the Gulf of Mexico because of legacy nitrogen. In general, comprehensive measures and countermeasures should be proposed to strengthen the supervision and management of groundwater system77,78. The environmental capacity for industrial development and land use changes must be fully considered to ensure the protection of underground ecosystems, promote sustainable development, reduce wastewater discharge, and improve nitrogen use efficiency.

This study revealed that nitrate dynamics were highly correlated with aquifer depth, geographical factors, and climatic factors. These findings provide insight into the vulnerability of shallow groundwater to nitrate contamination and the slow and unsatisfactory nature of remediation. In addition, our analyses revealed the frequently overlooked impact of aquifer depth on the strength of the nitrogen cascade79. That is, nitrogen cascade strength depends on groundwater biogeochemical processes, which are linked to aquifer depth. Considering the high cost of nitrogen removal, especially for groundwater, more carefully controlled experiments are needed to validate this inference and the control mechanism proposed above. Future revisions of the Water Pollution Prevention and Control Law of China should adopt more flexible management strategies for groundwater quality and prioritize inter-regional cooperation to effectively address the escalating nitrogen loading in groundwater.

Materials and methods

Data collection

Groundwater nitrate concentration and isotope data (including NO3-, δ15N-NO3-, and δ18O-NO3-) from a total of 4047 groundwater sites in China were compiled from available papers published since 2000 (Fig. 1 and supplemental data). Spring data were collected at a depth of 0 m. Several other parameters, including longitude, latitude, groundwater aquifer depth, total dissolved solids (TDS), electrical conductivity (EC), water temperature (T-water), pH, dissolved oxygen (DO), dissolved organic carbon (DOC), and oxidation-reduction potential (Eh), were also included. To ensure data reliability, significant outliers (below the first quartile minus 1.5 × interquartile range (IQR) or above the third quartile plus 1.5 × IQR) were excluded during data processing80. To ensure that the dataset covers the gradients of geographical, climatic, and human factors, we incorporated data from regions with diverse elevations, distinct climate zones, varied land use types, and varied population densities. This allowed us to capture wide-ranging groundwater conditions, facilitating a thorough analysis of spatial and environmental variability. Groundwater was grouped into four concentration levels according to drinking water standards for nitrate: <10 mg L–1 (excellent), 10–20 mg L–1 (good), 20–45 mg L–1 (medium), and >45 mg L–1 (poor). In addition, groundwater aquifer depth was divided into four groups31: <20 m (shallow), 20–50 m (medium), 50–100 m (deep), and >100 m (very deep) (Table S1). The country was grouped into eight economic regions based on economic development and geographical conditions4: North Coast (NC), Middle of Yellow River (MYR), North East (NE), South West (SW), East Coast (EC), North West (NW), Middle of Changjiang River (MCR), and South Coast (SC), as shown in Fig. S9.
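The outlier rule and the two grouping schemes described above can be sketched in a few lines of Python. This is only an illustration, not the authors' processing script: the function names are hypothetical, the quartile interpolation method is an assumption (the text does not state which one was used), and so is the class assigned to values that fall exactly on a boundary (10, 20, or 45 mg L–1; 20, 50, or 100 m).

```python
import statistics


def iqr_bounds(values):
    """Return the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR for a list of numbers."""
    # Assumption: "inclusive" quartile interpolation; the paper does not specify one.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def remove_outliers(values):
    """Keep only the values that lie within the IQR fences."""
    lower, upper = iqr_bounds(values)
    return [v for v in values if lower <= v <= upper]


def nitrate_class(conc_mg_l):
    """Map a nitrate concentration (mg/L) to the four quality levels."""
    # Assumption: boundary values go to the higher class, except 45 -> medium.
    if conc_mg_l < 10:
        return "excellent"
    if conc_mg_l < 20:
        return "good"
    if conc_mg_l <= 45:
        return "medium"
    return "poor"


def depth_group(depth_m):
    """Map an aquifer depth (m) to the four depth groups."""
    # Assumption: boundary values go to the deeper group, except 100 -> deep.
    if depth_m < 20:
        return "shallow"
    if depth_m < 50:
        return "medium"
    if depth_m <= 100:
        return "deep"
    return "very deep"
```

In practice the outlier filter would be applied per parameter (for example, to the nitrate concentrations) before the remaining sites are assigned to concentration levels and depth groups.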

The classification of ecoregions was based on distinct ecological, climatic, and topographical characteristics and elevation ranges, as defined by the United States Environmental Protection Agency81,82 (Table S6 and Fig. 1a). Lowlands (elevation ≤500 m) are typically flat with naturally fertile soils and well-developed drainage systems, making them suitable for agricultural activities and urban development (land use percentage of groundwater samples: natural (15.6%), cropland (51.8%), and urban (32.6%)). Middlelands (500 m < elevation ≤ 1500 m) have more moderate slopes and soils, supporting a mix of agriculture, forestry, and urban development but with less intensity compared to lowlands (land use percentage of groundwater samples: natural (41.8%), cropland (42.0%), and urban (16.2%)). Finally, highlands (elevation >1500 m) are characterized by steep slopes and poorly developed soils, making them generally unsuitable for agriculture or urbanization (land use percentage of groundwater samples: natural (75.7%), cropland (17.6%), and urban (6.7%)). The elevation data were obtained from the Ecological Forecasting Lab of the National Aeronautics and Space Administration (NASA) Ames Research Center (http://ecocast.arc.nasa.gov/) with a spatial resolution of 1 km × 1 km (Table S6 and Fig. S10). The land use types were mainly divided into natural, cropland, and urban areas (Fig. S11), and the average population density from 2000 to 2020 was divided into four categories according to the quartiles of the sampling site distribution: <56 persons km–2 (low), 56–316 persons km–2 (medium), 316–956 persons km–2 (high), and >956 persons km–2 (very high) (Table S6 and Fig. S12). These data, with a spatial resolution of 1 km × 1 km, were obtained from the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (RESDC) (http://www.resdc.cn). The mean annual precipitation (Fig. S13) and mean annual temperature (Fig. S14) for 1982–2015 were obtained from the China Meteorological Administration (CMA) (http://data.cma.cn/) with a spatial resolution of 0.25° × 0.25°. The data on agricultural N fertilizer usage, the total sown area of crops, and the discharge of total nitrogen (TN) and ammonia in wastewater were obtained from the National Bureau of Statistics of the People’s Republic of China (http://www.stats.gov.cn).

Statistical analysis

Random forest model

As an ensemble machine learning method based on bagging, the random forest (RF) can effectively quantify the relative importance of each factor to reveal potential influence mechanisms68. The RF model has been frequently applied in environmental science to map nitrate distributions and identify key influencing factors and control mechanisms. In this study, 66 spatially continuous environmental variables that may be directly or indirectly related to nitrate accumulation in groundwater were combined as potential predictors (Table S2). Prediction uncertainty was estimated with the bootstrap method: different bootstrap samples were generated to obtain multiple models, and the differences in predictions among these models capture the uncertainty inherent in the modeling process. This method effectively quantifies how variations in the input data can lead to variations in the model outputs, thereby reflecting the uncertainty in the predictions. In addition, the RF model based on the Gini index was employed to calculate the importance scores of 10 variables (GW depth, TDS, EC, MAT, longitude, ecoregion, MAP, T-water, latitude, and pH) for nitrate dynamics. VIM(Gini)j represents the score statistic for variable Xj, indicating the average change in node-split impurity for the jth variable across all trees in the RF. The Gini index is calculated as follows:

$${{{\rm{GI}}}}_{{{\rm{m}}}}={\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{\hat{{{\rm{p}}}}}_{{{\rm{mk}}}}\left(1-{\hat{{{\rm{p}}}}}_{{{\rm{mk}}}}\right)$$
(1)

Where K represents the number of categories in the dataset, and \({\hat{p}}_{{mk}}\) represents the estimated probability that samples in node m belong to category k. The importance of variable Xj at node m, i.e., the change in Gini index before and after the split at node m was calculated as follows:

$${{{\rm{VIM}}}}_{{{\rm{jm}}}}^{\left({{\rm{Gini}}}\right)}={{{\rm{GI}}}}_{{{\rm{m}}}}-{{{\rm{GI}}}}_{{{\rm{l}}}}-{{{\rm{GI}}}}_{{{\rm{r}}}}$$
(2)

Where GIl and GIr represent the Gini indices of the two new nodes created by the split at node m. If variable Xj appears M times in the ith tree, then the importance of variable Xj in that tree was calculated by:

$${{{\rm{VIM}}}}_{{{\rm{ij}}}}^{\left({{\rm{Gini}}}\right)} = {\sum }_{{{\rm{m}}}=1}^{{{\rm{M}}}}{{{\rm{VIM}}}}_{{{\rm{jm}}}}^{\left({{\rm{Gini}}}\right)}$$
(3)

Therefore, the Gini importance of Xj was calculated by:

$${{{\rm{VIM}}}}_{{{\rm{j}}}}^{\left({{\rm{Gini}}}\right)}=\frac{1}{{{\rm{n}}}}{\sum }_{{{\rm{i}}}=1}^{{{\rm{n}}}}{{{\rm{VIM}}}}_{{{\rm{ij}}}}^{\left({{\rm{Gini}}}\right)}$$
(4)

Where n represents the number of trees in the RF. In this study, regression prediction analysis with the RF model was performed using the “rfPermute” package in R v.4.3.2 to explore the relative effects of various factors on nitrate concentration.
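As a minimal sketch, Eqs. (1)–(4) can be written out directly in Python. This is not the “rfPermute” implementation, and all function names here are hypothetical. The split importance follows Eq. (2) exactly as written, that is, without weighting the child nodes by their sample counts, whereas many library implementations apply such weights. The input format (each split given as the class labels in the parent, left, and right nodes) is also an assumption made for illustration.

```python
from collections import Counter


def gini_index(labels):
    """Eq. (1): GI_m = sum_k p_mk * (1 - p_mk) over the classes in one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def split_importance(parent, left, right):
    """Eq. (2): GI_m - GI_l - GI_r, unweighted as written in the text."""
    return gini_index(parent) - gini_index(left) - gini_index(right)


def tree_importance(splits):
    """Eq. (3): sum over the M splits in one tree that use variable X_j.

    `splits` is a list of (parent, left, right) label lists.
    """
    return sum(split_importance(p, l, r) for p, l, r in splits)


def forest_importance(per_tree_splits):
    """Eq. (4): average the per-tree importance over the n trees."""
    n = len(per_tree_splits)
    return sum(tree_importance(s) for s in per_tree_splits) / n
```

For example, a node holding two samples of each of two classes has a Gini index of 0.5, and a split that separates the classes perfectly yields an importance of 0.5 for that node under Eq. (2).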

MixSIAR model

We also applied the Bayesian stable isotope mixing model (MixSIAR) with a Markov chain Monte Carlo (MCMC) algorithm to estimate the relative contributions of different NO3- sources to groundwater38. The calculation formula is as follows:

$${{{\rm{X}}}}_{{{\rm{ij}}}}={\sum }_{{{\rm{k}}}=1}^{{{\rm{K}}}}{{{\rm{P}}}}_{{{\rm{k}}}}\left({{{\rm{S}}}}_{{{\rm{jk}}}}+{{{\rm{C}}}}_{{{\rm{jk}}}}\right)+{{{\rm{\varepsilon }}}}_{{{\rm{ij}}}}$$
(5)
$${{{\rm{S}}}}_{{{\rm{jk}}}} \sim {{\rm{N}}}\left({{{\rm{\mu }}}}_{{{\rm{jk}}}},{{{\rm{\omega }}}}_{{{\rm{jk}}}}^{2}\right)$$
(6)
$${{{\rm{C}}}}_{{{\rm{jk}}}} \sim {{\rm{N}}}\left({{{\rm{\lambda }}}}_{{{\rm{jk}}}},{{{\rm{\tau }}}}_{{{\rm{jk}}}}^{2}\right)$$
(7)
$${{{\rm{\varepsilon }}}}_{{{\rm{jk}}}} \sim {{\rm{N}}}\left(0,{{{\rm{\sigma }}}}_{{{\rm{jk}}}}^{2}\right)$$
(8)

where Xij represents the δ value of isotope j in mixture measurement i (i = 1, 2, 3, …, N and j = 1, 2, 3, …, J); Pk represents the proportion of source k; Sjk represents the δ value of isotope j from the kth source, which follows a normal distribution with mean μjk and standard deviation (SD) ωjk; Cjk represents the fractionation coefficient of isotope j from the kth source, which follows a normal distribution with mean λjk and SD τjk; Ɛjk is the residual error representing the additional unquantifiable variance between individual mixtures, which follows a normal distribution with mean 0 and SD σjk. More detailed information on the MixSIAR model can be found in Ren et al.58 and Fadhullah et al.40. Note that the typical end-member isotopic values of atmospheric deposition, manure & sewage, soil nitrogen, and chemical fertilizer, and the fractionation factor (Ɛ) used in the MixSIAR model were derived from the relevant literature used for data collection (Table S3). The transport pathways of manure and sewage differ, with sewage often reaching groundwater through leaching or leaking pipes and manure typically through agricultural runoff. However, manure and sewage have overlapping isotopic signatures, which makes it challenging to differentiate them. We therefore grouped ‘manure & sewage’ into one class, as done similarly in the literature41,83. In this study, the MixSIAR model was run in R with the “MixSIAR” package to quantify the contributions of the different NO3- sources.
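To make the structure of Eq. (5) concrete, the following Python sketch evaluates only its deterministic part, that is, the expected δ value of a mixture given source proportions Pk, source means μjk, and mean fractionation λjk. It is not the Bayesian MCMC inference performed by MixSIAR, which estimates the proportions from measured mixtures. The function name and the dictionary layout are hypothetical, chosen only for illustration.

```python
def expected_mixture(proportions, source_means, fractionation_means):
    """Deterministic part of Eq. (5): sum_k P_k * (mu_jk + lambda_jk) per isotope j.

    proportions:          {source: P_k}, must sum to 1
    source_means:         {source: {isotope: mu_jk}}
    fractionation_means:  {source: {isotope: lambda_jk}}
    """
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("source proportions must sum to 1")
    # Take the isotope names from the first source; all sources share them.
    isotopes = list(next(iter(source_means.values())).keys())
    return {
        j: sum(
            p * (source_means[k][j] + fractionation_means[k][j])
            for k, p in proportions.items()
        )
        for j in isotopes
    }
```

With two hypothetical sources contributing equally, no fractionation, and source means of 0‰ and 10‰ for a given isotope, the expected mixture value is 5‰. The mixing model works in the opposite direction: it infers the posterior distribution of Pk that best reproduces the observed δ15N-NO3- and δ18O-NO3- values.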

We further used several other analysis and visualization tools. Statistical relationships among nitrate dynamics, climate conditions, geography, and anthropogenic activities were examined by one-way analysis of variance (ANOVA) with Tukey’s Honestly Significant Difference test13. The Sankey diagram, composed of nodes (each variable) and connecting lines, shows the process of nitrate enrichment caused by influencing factors and the integration of related parameter data, and was therefore used to visualize nitrogen sources84. The decision tree-heatmap is a comprehensive visualization method that integrates decision trees and heatmaps85. It displays the data as a heatmap at the tree’s leaf nodes and integrates the tree structure to explain how the tree nodes segment the feature space. The decision tree-heatmap can also reveal the correlation structure of the data and the importance of each feature. Here, we used decision tree-heatmaps to predict the threshold values of different nitrate concentrations using the “treeheatr” package based on the interactions among aquifer depth, population density, land use, and ecoregion characteristics. In addition, the structural equation model was used to explore the regularity of nitrate dynamics under different geographical and climate factors46. Structural equation modeling is hypothesis driven, focusing on causal relationships and the direct, indirect, and total effects between variables through path coefficients. It is well suited for validating theoretical frameworks and understanding the mechanisms behind variable interactions. In this study, geographical factors (ecoregion, latitude, and longitude), climate factors (MAP and MAT), hydrochemical characteristics (TDS, pH, EC, and T-water), and aquifer depth were set as the variables. Correlations and contributions between variables were analyzed using Amos software86.
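As an illustration of the one-way ANOVA step mentioned above, the F statistic can be computed from the between-group and within-group sums of squares. The sketch below is a plain-Python version with a hypothetical function name. It does not reproduce the authors' software workflow, and it omits both the p-value and the Tukey post hoc comparison, which require the F and studentized range distributions.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

Here each group would hold, for example, the nitrate concentrations of one depth class or one ecoregion; a large F value relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one group mean differs.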

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.