Introduction

The chemical composition of water comprehensively reflects natural factors such as geological strata, topography, climate, and watershed area, as well as human activities such as urban development and agricultural production1. Surface hydrochemical characteristics are primarily influenced by the combined effects of rock weathering, atmospheric deposition, evaporation and concentration, and human activities. Among these, water-rock interactions exert a foundational control on the composition of major ions, while human activities significantly alter the spatial heterogeneity of ions in water2. With rapid socio-economic development, however, the surface water chemical environment has evolved from being primarily influenced by the geological background to being jointly influenced by the geological background and human activities, and in some regions human activities have become the dominant influence. Human activities not only alter the original chemistry of surface water but also disrupt natural hydrological and geochemical evolutionary processes, significantly increasing the complexity of surface water chemical evolution3. Conversely, changes in the surface water chemical environment also affect human life and production activities4.

Currently, most studies of water chemistry characteristics focus on the watershed scale, and research on the water chemistry characteristics of heavily disturbed polder areas and the factors that influence them is limited, so the proportions of major ions in polder areas within a watershed remain unclear. As a unique geographical unit within a watershed, polder areas are characterized by intense human activity (such as agricultural fertilization, livestock and poultry farming, and domestic wastewater discharge), hydrological processes regulated by artificial gates and pumps, and obstructed water system connectivity that leads to poor water flow5. Changes in the water chemical environment are therefore more complex and variable, and the hydrochemical characteristics and controlling factors in polder areas, as well as the extent to which high-intensity human activities influence them, are poorly understood. An analysis of the hydrochemical characteristics and controlling factors in polder areas is therefore necessary. The primary methods currently used to analyze hydrochemical characteristics include Gibbs plots, ion ratio coefficients, Piper plots, multivariate statistics, and isotopes. Zheng et al. (2024) studied the Dianbu River sub-basin in the Chaohu Lake basin and found that the formation of hydrochemical characteristics in the region is influenced by both water-rock interactions and human factors, with water-rock interactions being the primary factor6. The impact of precipitation and evaporation was relatively minor, while human factors such as effluent from sewage treatment plants, domestic wastewater, and manure had significantly altered the hydrochemical characteristics of local water bodies. Ma (2023) used water chemistry diagrams and Gibbs diagrams to show that the primary ion sources in Taihu Lake and its inflowing rivers had gradually shifted from being controlled by “rock weathering” to being dominated by “human activities”7. Ma et al. (2011) used Gibbs diagrams to show that the chemical components of shallow groundwater along the northern shore of Chaohu Lake are primarily controlled by rock weathering, and used ion ratio coefficients to show that Na+ in the region originated primarily from salt rock dissolution, followed by silicate dissolution, with some cation exchange also occurring8. Zheng et al. (2022) conducted a comprehensive analysis using ion ratio coefficients and isotopes and concluded that the Chaohu Lake basin was influenced by the dissolution of evaporite salt rocks, the weathering of silicate rocks, and human activities9. Water chemistry analysis increasingly emphasizes revealing regional hydrogeochemical evolution processes at different scales and comprehensively understanding the influence of the natural background and human activities on the major ions in surface water.

The Chaohu Lake basin plays a crucial role in the Yangtze River Delta Regional Integration Development Strategy, and Chaohu Lake is one of the “Three Rivers and Three Lakes” designated as national priorities for water pollution control. Hundreds of polders of various sizes surround Chaohu Lake. As agricultural intensification and urbanization accelerate within these polders, the water environment faces multiple pressures, including non-point source pollution from fertilizers and reduced hydrological connectivity, leading to problems such as salt accumulation and water pollution. Existing studies have primarily focused on water environment assessment and pollutant distribution in Chaohu Lake and its surrounding rivers10,11,12,13. However, although surface water chemical components respond to human activities and can influence the environmental behavior of pollutants, systematic water chemical information on polder systems remains limited.

This study focuses on typical polder areas in the Chaohu Lake basin. Water chemistry data and environmental isotopes, combined with Gibbs plots, Piper plots, multivariate statistics, and ion ratio analysis, are used to analyze the spatiotemporal characteristics of major ions, identify the controlling factors of water chemistry components in the polder areas, explore the variation of river water chemistry under the influence of natural processes and human activities, and quantify the contributions of different factors. This study presents the first systematic analysis of the hydrochemical characteristics and their controlling factors in polder areas, thereby addressing a significant knowledge gap in the understanding of polder water chemistry. The findings provide foundational data and theoretical support for water environment management and protection within polders and in downstream regions.

Materials and methods

Study area

The Chaohu Lake basin (117°16′–117°51′E, 31°43′–32°03′N) is located in the central part of Anhui Province. It is an important tributary system on the left bank of the lower reaches of the Yangtze River, with a basin area of about 13,500 km² (Fig. 1). The basin has a subtropical monsoon climate, with an average annual temperature of 15.5–16.5 ℃ and an annual precipitation of 900–1100 mm, which is concentrated in June–August (accounting for more than 60% of the annual total). The study polders lie on the alluvial plain of the lower reaches of the Yangtze River, with a ground elevation of 5.8–8.2 m. Soils are dominated by yellow-brown soils and paddy soils, which together account for about 85% of all soils; the remainder are lime (rocky) soils, purple soils, and sandy-black soils14.

Four typical polder areas on the west bank of Chaohu Lake, namely Dongdawei, Yangziwei, Zhaozhuwei, and Shilianwei, with a total area of about 128.6 km², were selected for this study. The four polders differ in hydrological connectivity and land use: Dongdawei is dominated by ecological orchards, Zhaozhuwei by aquaculture, and Yangziwei by townships, while Shilianwei is a typical agricultural cultivation area dominated by rice and wheat. The topography of the Nanfei River basin is primarily characterized by undulating residual hill plains and river-lake lowland plains, with widespread development of Quaternary strata. The entire region is covered by Quaternary strata, and the strata in the area, from oldest to youngest, are Late Archean-Early Proterozoic, Cretaceous, Tertiary, and Quaternary. The Late Archean-Early Proterozoic strata are primarily composed of plagioclase gneiss; the Cretaceous strata consist of brick-red sandstone interbedded with gravelly sandstone and sandstone, with local interbedding; the Tertiary strata consist of brick-red fine sandstone and gravelly sandstone interbedded with mudstone; and the Quaternary strata are divided into the Gulu River Formation, Xia Shu Formation, and Wuhu Formation15. The Baishitianhe Shilianxu is dominated by arable land, which accounts for more than 70% of the total, and it is a typical agricultural area where rice and wheat are mainly grown. According to their genesis, the soil layers can be divided into Quaternary vegetative fill (Q4), a Quaternary alluvial soil layer (Q4), a Quaternary alluvial flooding soil layer (Q4 + pl), a Quaternary Upper Pleistocene alluvial flooding soil layer (Q4), and Jurassic strata (J). The geological map of the watershed is shown in Fig. 1.

Fig. 1

Geological map of the study area, modified in ArcGIS 10.7 software (China Geological Survey, CHN. https://www.cgs.gov.cn).

Sample collection and analysis

Sixty-nine surface water samples were collected from the study area in July and November 2024 and March 2025; details of the sampling points are given in Table 1. Of these, 15 were collected outside the polders (outer rivers: Nanfeihe, Ershibuhe, Dianbuhe, and Baishitianhe) and 54 inside the polders (ditches and ponds). The distribution of sampling points is shown in Fig. 2. Water samples were collected 20 cm below the water surface in pre-cleaned 2 L polyethylene bottles, which were rinsed 3–5 times with the water being sampled before collection. During collection, each bottle was filled quickly to ensure that no air bubbles entered. Water temperature (Temp.), pH, electrical conductivity (EC), dissolved oxygen (DO), total dissolved solids (TDS), air temperature, altitude, and other parameters were measured in the field with a portable multi-parameter water quality meter (EXO2, USA). The water samples were brought back to the laboratory and filtered through a 0.45 μm acetate membrane filter for the determination of anions. Samples for cation analysis were acidified with nitric acid to pH < 2.

Fig. 2

Map of the study area with sampling locations, modified in ArcGIS 10.7 software (Geospatial Data Cloud, CHN. https://www.gscloud.cn).

Table 1 Sampling point latitude and longitude.

Cations were determined with a microwave plasma atomic emission spectrometer (MP-AES 4200, USA). The instrument working conditions were: peristaltic pump, 5 channels; atomization chamber, glass swirl; reading time, 3 s; stabilization time, 15 s; background correction, auto; atomizer flow rate, 0.5–0.85 L/min. The limits of detection (LOD) for the cations ranged from 0.0002 to 0.0004 mg/L, with recoveries of 95.6%–97.3%16.

Anions were analyzed by ion chromatography (ICS900, USA) with an AS23 anion analysis column (250 mm × 4 mm) and an AS23 guard column (50 mm × 4 mm). The mobile phase was a 4.5 mmol/L sodium bicarbonate + 0.8 mmol/L sodium carbonate eluent, the suppressor regenerant was 100 mM sulfuric acid, the injection volume was 25 µL, and the flow rate was 1.0 mL/min. The limits of detection (LOD) ranged from 0.006 to 0.018 mg/L, and the recoveries ranged from 99.1% to 116.3%17. HCO3− was determined by acid-base titration (dilute hydrochloric acid as titrant, methyl orange as indicator)18.

δ2H and δ18O were measured with a liquid water isotope analyzer (LGR; Picarro, L2130-i). Results are expressed in δ notation (‰), defined as:

$$\delta=\frac{R_{Sample}-R_{VSMOW}}{R_{VSMOW}}\times 1000\,\text{‰}$$
(1)

where \(\delta\) is the isotopic deviation in ‰, \(R_{Sample}\) is the isotopic ratio of the sample, and \(R_{VSMOW}\) is the isotopic ratio of the international reference. The experimental data were calibrated against the International Atomic Energy Agency's Vienna Standard Mean Ocean Water (V-SMOW), with measurement errors below ±0.08‰ for δ18O and ±0.3‰ for δ2H.
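
As a minimal illustration of Eq. (1), the sketch below converts an isotope ratio into a δ value. The function name, the sample ratio, and the reference ratio used here are assumptions for illustration only (the reference value is a commonly cited 18O/16O ratio for VSMOW; a laboratory would use its own calibrated standards).

```python
# Commonly cited 18O/16O ratio of VSMOW; treated here as an illustrative assumption.
R_VSMOW_18O = 2005.2e-6


def delta_permil(r_sample: float, r_standard: float) -> float:
    """Return the isotopic deviation in per mil, following Eq. (1)."""
    return (r_sample - r_standard) / r_standard * 1000.0


# Hypothetical measured ratio: a sample depleted in 18O gives a negative delta.
r_sample = 1.9912e-3
print(round(delta_permil(r_sample, R_VSMOW_18O), 2))
```

A sample whose ratio equals the reference ratio gives δ = 0, and ratios below the reference give negative values, as is typical of meteoric waters.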

Data analysis methods

Principal component analysis

Principal component analysis (PCA) is one of the most widely used methods in multivariate data analysis because its results are relatively easy to interpret and simple to express19. In this study, principal factors with eigenvalues greater than 1 were extracted20. The core objective of PCA is to convert high-dimensional, correlated original variables into low-dimensional, linearly uncorrelated principal components while retaining as much of the variance of the original data as possible. A few factors can thus represent many variables, and the sources can then be identified by combining the factors with the distribution characteristics of each source.

The steps of principal component analysis are as follows. Let n samples and p indicators constitute the sample matrix X = (Xij) (i = 1, 2, …, n; j = 1, 2, …, p).

(1) Standardize the data. The standardized data matrix is still denoted X: Xij = (Xij − X̄j)/σj, where X̄j and σj are the mean and standard deviation of the jth indicator, respectively;

(2) Compute the correlation coefficient matrix of X, R = (Rij)p×p;

(3) Find the eigenvalues λ1 ≥ λ2 ≥ … ≥ λp of R and the corresponding eigenvectors η1i, η2i, …, ηpi;

(4) Determine the number of principal components (factors) k based on the cumulative contribution rate \(\sum_{i=1}^{k}\lambda_{i}/\sum_{i=1}^{p}\lambda_{i}\) of the first k principal components;

(5) Interpret the meaning of the principal components according to the content of the specific indicators and the size of the indicator variable coefficients;

(6) Use the contribution rate \(\lambda_{j}/\sum_{i=1}^{p}\lambda_{i}\) of each principal component as the weighting factor to obtain the multi-indicator comprehensive evaluation value.
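
Steps (1) to (4) can be sketched with NumPy as follows. This is an illustrative outline rather than the software workflow used in the study: the function name and the synthetic data are assumptions, and the eigenvalue-greater-than-1 rule is applied to choose k.

```python
import numpy as np


def pca_on_correlation(X):
    """PCA following steps (1)-(4): standardize, correlate, eigendecompose, select."""
    X = np.asarray(X, dtype=float)
    # Step 1: z-score each indicator (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation matrix of the indicators.
    R = np.corrcoef(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors, sorted in descending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: contribution rates and the eigenvalue > 1 rule.
    contrib = eigvals / eigvals.sum()
    k = int(np.sum(eigvals > 1.0))
    scores = Z @ eigvecs[:, :k]
    return eigvals, contrib, np.cumsum(contrib), k, scores


rng = np.random.default_rng(0)
# Hypothetical data: 30 samples, 4 indicators driven by two latent sources.
source = rng.normal(size=(30, 2))
X = np.column_stack([
    source[:, 0] + 0.1 * rng.normal(size=30),
    source[:, 0] + 0.1 * rng.normal(size=30),
    source[:, 1] + 0.1 * rng.normal(size=30),
    source[:, 1] + 0.1 * rng.normal(size=30),
])
eigvals, contrib, cum_contrib, k, scores = pca_on_correlation(X)
print(k, np.round(cum_contrib, 3))
```

Because the correlation matrix has unit diagonal, the eigenvalues sum to the number of indicators, so each contribution rate is simply the eigenvalue divided by p.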

Absolute principal component score-multiple linear regression (APCS-MLR) receptor model

Absolute principal component score-multiple linear regression (APCS-MLR) receptor modeling can be used to obtain quantitative information on each identified source. Its basic principle is to perform multiple linear regression on the results of principal component analysis: the principal component factors obtained from the factor analysis are used as independent variables, and the regression coefficients yield the contribution of the emission source corresponding to each factor to each substance in the receptor, which further validates the principal component analysis results21. The independent factors were determined by eigenvector decomposition of the pairwise correlation matrix between elemental contents. The computational equations are as follows:

$$C_{i}=\sum_{m}\left(a_{im}\times APCS_{m}\right)+b_{i}$$
(2)
$$PC_{im}=\frac{\left|a_{im}\times\overline{APCS_{im}}\right|}{\left|b_{i}\right|+\sum_{m}\left|a_{im}\times\overline{APCS_{im}}\right|}$$
(3)
$$PC_{iu}=\frac{\left|b_{i}\right|}{\left|b_{i}\right|+\sum_{m}\left|a_{im}\times\overline{APCS_{im}}\right|}$$
(4)

where \(C_{i}\) is the estimated content of element i; \(b_{i}\) is the constant term of the multiple linear regression model, which represents the contribution of unidentified sources; \(a_{im}\) is the regression coefficient of pollution source m for element i; \(APCS_{m}\) is the absolute principal component score of the mth factor; \(a_{im}\times APCS_{m}\) is the contribution of source m to \(C_{i}\); \(PC_{im}\) is the contribution rate of the identified source m to element i; \(PC_{iu}\) is the contribution rate of the unidentified sources u to element i; and \(\overline{APCS_{im}}\) is the average of the absolute principal component factor scores over all samples, as used for element i.
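
The sketch below outlines one common way to carry out the APCS-MLR calculation of Eqs. (2) to (4), under stated assumptions: unrotated principal components are used (published applications often apply a varimax rotation first), the absolute scores are obtained by subtracting the score of an artificial sample with zero concentrations, and the function name and synthetic data are hypothetical.

```python
import numpy as np


def apcs_mlr(C, n_factors):
    """Return APCS, coefficients a (elements x factors), intercepts b,
    and the contribution rates of Eqs. (3) and (4)."""
    C = np.asarray(C, dtype=float)
    mean, std = C.mean(axis=0), C.std(axis=0, ddof=1)
    Z = (C - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    top = np.argsort(eigvals)[::-1][:n_factors]
    W = eigvecs[:, top]                    # projection onto retained components
    scores = Z @ W
    z0 = (0.0 - mean) / std                # artificial sample with zero concentrations
    apcs = scores - z0 @ W                 # absolute principal component scores
    # Eq. (2): least-squares regression of each element on the APCS.
    A = np.column_stack([np.ones(len(C)), apcs])
    coef, *_ = np.linalg.lstsq(A, C, rcond=None)
    b, a = coef[0], coef[1:].T
    # Eqs. (3)-(4): shares of identified and unidentified sources.
    known = np.abs(a * apcs.mean(axis=0))  # elements x factors
    total = np.abs(b)[:, None] + known.sum(axis=1, keepdims=True)
    return apcs, a, b, known / total, np.abs(b) / total[:, 0]


rng = np.random.default_rng(1)
g = rng.uniform(0.5, 2.0, size=(40, 2))                       # hypothetical source strengths
f = np.array([[5.0, 4.0, 0.5, 0.2], [0.3, 0.5, 3.0, 2.0]])    # hypothetical source profiles
C = g @ f + rng.normal(0, 0.05, size=(40, 4))
apcs, a, b, pc_known, pc_unknown = apcs_mlr(C, n_factors=2)
print(np.round(pc_known, 2), np.round(pc_unknown, 2))
```

For each element, the identified and unidentified contribution rates sum to 1 by construction, which is a convenient check on any implementation.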

Positive matrix factorization (PMF) model

The PMF model is well suited to identifying and quantifying pollutant sources: it simplifies high-dimensional variables using correlation and covariance matrices while constraining the source contributions and factor profiles to be non-negative22. The model decomposes the original data matrix Xij into two factor matrices (the source contribution matrix gik and the source component spectral matrix fkj) and a residual matrix eij. The basic calculation is shown in Eq. (5), where the units of the parameters (except eij) are determined according to the actual situation.

$$X_{ij}=\sum_{k}g_{ik}f_{kj}+e_{ij}$$
(5)

where \(X_{ij}\) is the measured value of the jth element in the ith sample; k is the source index; \(g_{ik}\) is the contribution of the kth source to the ith sample (%); \(f_{kj}\) is the content of the jth element in the profile of the kth source; and \(e_{ij}\) is the residual. The factor matrices are obtained by minimizing the objective function Q, which is calculated as follows:

$$Q=\sum_{i=1}^{n}\sum_{j=1}^{m}\left(\frac{e_{ij}}{u_{ij}}\right)^{2}$$
(6)

where \(u_{ij}\) is the uncertainty of the jth element in the ith sample, which is calculated from the method detection limit (MDL) of the specific ion, the concentration, and the provided error rate (RSD). If the element concentration (C) is greater than the MDL, the uncertainty is calculated as follows:

$$u_{ij}=\sqrt{(RSD\times C)^{2}+MDL^{2}}$$
(7)

If the element concentration is less than the MDL, the uncertainty is calculated as follows:

$$u_{ij}=\frac{5}{6}\times MDL$$
(8)

Results

Statistical characteristics of water chemistry

Table 2 summarizes the statistics of the water chemical indicators in the typical polder areas of the Chaohu Lake basin in different seasons. The average pH values in the wet season, normal-flow period, and dry season were 7.65, 7.73, and 7.81, respectively; the water was weakly alkaline overall, and these values were lower than those of Chaohu Lake and its inflowing rivers23. The average total hardness (TH) values were 225.55, 186.75, and 238.20 mg/L, respectively. TH in the wet season was significantly higher than in the normal-flow period but lower than in the dry season, and TH was significantly higher than in Chaohu Lake and its inflowing rivers, indicating harder water. The average electrical conductivity (EC) values in the wet season, normal-flow period, and dry season were 470.26, 576.48, and 552.86 µS/cm, respectively; the values were relatively higher in the normal-flow period and dry season and significantly higher than those in Chaohu Lake and its inflowing rivers. The average TDS values in the wet season, normal-flow period, and dry season were 329.30, 611.96, and 416.57 mg/L, respectively.

Table 2 Statistical characteristics of water chemistry composition in typical polder areas of the Chaohu Lake basin during different seasons.

Distribution characteristics of water chemistry components

(1) Spatial distribution characteristics.

According to their location, the sampling points can be divided into three types: ditches, ponds, and outer rivers. The contents of K+ and Mg2+ are stable and do not change significantly with location (Fig. 3a). The Na+ content in ditches and outer rivers is similar and greater than that in ponds. The Ca2+ content decreases in the order ditch > outer river > pond. The anion contents at different locations are shown in Fig. 3b. The HCO3− content decreases in the order ditch > pond > outer river, and the contents of Cl− and SO42− are both lower in ponds than in rivers and ditches. The NO3− content varies greatly with location and is highest in the outer rivers, probably because the outer rivers flow through towns and receive domestic sewage discharges24. The contents of anions and cations are higher in the ditches than in the ponds, possibly because the ditches are surrounded by large areas of farmland and cultivated land and are therefore more susceptible than the ponds to tillage, irrigation and drainage activities, and stronger runoff scouring25.

Fig. 3

Contents of major ions at different locations (a cations and b anions).

(2) Characteristics of changes in different seasons.

In the wet season, normal-flow period, and dry season, the cations in the water of each polder are dominated by Na+ and Ca2+ (Fig. 4a), and their contents generally decrease in the order Ca2+ > Na+ > Mg2+ > K+. The Na+ content decreases in the order dry season > normal-flow period > wet season, which might be due to dilution by rainfall in the wet season, whereas evaporation in the normal-flow period concentrates Na+. The Ca2+ concentration is higher in the wet season than in the normal-flow period and slightly lower than in the dry season. This may be because rainfall in the wet season washes the ground surface, dissolves calcium salts in soil and rock, and carries them into the water, while in the dry season Ca2+ is concentrated by evaporation. The contents of Mg2+ and K+ vary similarly: they are slightly higher in the normal-flow period and dry season than in the wet season but show no significant seasonal changes.

The anions in the polder water during the wet season are dominated by HCO3− (Fig. 4b), with contents generally in the order HCO3− > SO42− > Cl− > NO3−. The HCO3− concentration decreases slightly in the normal-flow period, probably because water flow is weak and rock dissolution produces relatively less HCO3−, and because weaker atmospheric precipitation and gas exchange mean that less CO2 dissolves into the water. The contents of Cl− and SO42− are higher in the dry season than in other seasons. The NO3− content is highest in the normal-flow period and lowest in the dry season. Changes in Cl− concentration are closely related to human activities, especially inputs of agricultural fertilizer, domestic sewage, and industrial wastewater, and poor hydrodynamic conditions together with evaporative concentration may lead to Cl− enrichment. The high NO3− content in the normal-flow period may be related to agricultural tillage and fertilizer application during that period. The SO42− content changed little with season, showing only a slightly elevated trend.

Fig. 4

Bar chart comparison of major ions in water during different seasons (a cations and b anions).

The coefficients of variation (CV) of the major ions in each season are given in Table 2. The CV of K+ differed little between seasons (30.17%, 27.29%, and 27.30% in the wet season, normal-flow period, and dry season, respectively), indicating that its spatial distribution was little affected by seasonal variation and that its content was relatively stable across water bodies. The CV of Ca2+ was higher in the wet season (60.11%) than in the normal-flow period (30.25%) and the dry season (28.30%), indicating stronger spatial variability in the wet season. The CV of Mg2+ was similar in the wet season and normal-flow period (58.30% and 53.41%, respectively) and higher than in the dry season (39.79%), indicating a certain spatial variability in the wet season and normal-flow period. The CV of HCO3− was slightly higher in the wet season than in the normal-flow period, reflecting the influence of seasonal rainfall. The CV of NO3− was extremely high (103.72%, 98.14%, and 97.31% in the wet season, normal-flow period, and dry season, respectively), which may be related to the frequency of localized human activities. Cl− and SO42− also showed strong spatial variability.
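
For reference, the coefficient of variation discussed above is the standard deviation divided by the mean, expressed as a percentage. The short sketch below assumes the sample standard deviation is used (the text does not state which form was applied) and uses hypothetical NO3− values.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical NO3- contents (mg/L) at five sites.
no3 = [1.2, 0.4, 6.8, 2.5, 0.9]
print(round(coefficient_of_variation(no3), 1))
```

A CV near or above 100%, as reported for NO3−, means that the spread between sites is comparable to the mean itself.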

Water chemistry type analysis

The water chemistry in the wet season is relatively uniform. Most sample points fall within triangular regions 1, 5, and 6 (Fig. 5), which correspond to the calcium type and carbonate type, and project into region C of the diamond, so the water chemical type is HCO3-Ca. Some other points fall in regions B and D of the diamond, where the water chemistry type is mixed. During the wet season, water exchange is frequent and the water chemistry type varies little. HCO3-Ca type water is usually related to the dissolution of carbonate rocks, which release Ca2+ and Mg2+ during weathering, while the dissolution of carbonate minerals (e.g., calcite) produces HCO3−. At the same time, atmospheric CO2 dissolves into the water and promotes the dissolution of carbonate minerals, increasing the HCO3− concentration and forming the HCO3-Ca type.

The water chemistry is more complex during the normal-flow period and dry season: the sample points are more dispersed, and the overall distribution is shifted to the right compared with the wet season. In the cation triangle, most samples are concentrated in area 2 (mixed type), with a few in areas 1 (Ca type) and 4 (Na + K type). In the anion triangle, the samples are concentrated in areas 5 (HCO3 type) and 6 (mixed type), and only a few are distributed in area 8 (Cl type). Projected into the diamond, most sample points from the normal-flow period and dry season fall in regions B (mixed type), C (HCO3-Ca type), and D (Cl/SO4-Na type). With reduced rainfall and poor water mobility during the normal-flow period and dry season, human activities such as agricultural fertilizer use and domestic sewage discharge affect the water chemistry type, making it complex and variable and shifting it toward the mixed type.

Fig. 5

Piper trilinear diagram of water during different seasons.

Analysis of major controlling factors

The Gibbs diagram is used to analyze the chemical composition of water and its controlling factors26. It allows a qualitative assessment of which natural mechanisms, such as atmospheric precipitation, rock weathering, and evaporation-crystallization, control the chemical composition of groundwater or river water27. Total dissolved solids (TDS) are plotted on the vertical axis against the ratios Cl−/(Cl−+HCO3−) and Na+/(Na++Ca2+) on the horizontal axis, which makes it possible to determine whether the water is mainly affected by rock weathering, evaporation-concentration, or atmospheric precipitation.
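
The two horizontal-axis coordinates of the Gibbs diagram can be computed as in the sketch below. It assumes contents in mg/L, as in the conventional form of the diagram; the function name and sample values are hypothetical, and no attempt is made to encode the curved field boundaries of the diagram.

```python
def gibbs_ratios(na, ca, cl, hco3):
    """Return Na+/(Na+ + Ca2+) and Cl-/(Cl- + HCO3-) from contents in mg/L."""
    return na / (na + ca), cl / (cl + hco3)


# Hypothetical sample: Ca2+- and HCO3--rich water plots toward the left of both panels.
cation_ratio, anion_ratio = gibbs_ratios(na=10.0, ca=30.0, cl=5.0, hco3=95.0)
print(cation_ratio, anion_ratio)
```

Each pair of ratios is then plotted against the TDS of the same sample on a logarithmic vertical axis.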

The Gibbs plot shows that most sample points in the wet season are distributed at the rock-weathering end (Fig. 6), indicating that the ions in the water mainly come from rock weathering. Abundant precipitation and greater freshwater input during the wet season may dilute the dissolved substances in the water, but atmospheric precipitation does not dominate the supply of ions. In addition, the average Cl−/Na+ ratio of the water in the area is 0.78, lower than the ratio of 1.15 for world average precipitation28, suggesting that evapotranspiration affects the ionic composition of the water but is not a dominant controlling factor. The samples from the normal-flow period are distributed between the rock-weathering end and the evaporation end, reflecting seasonal changes with reduced rainfall and more significant evaporation. In the dry season, the shift of the sample points toward the evaporation-crystallization end is more pronounced than in the normal-flow period, indicating that evaporation-crystallization affects the water more strongly in the dry season.

Fig. 6

Gibbs diagram of water during different seasons (a TDS vs. Cl−/(Cl−+HCO3−) and b TDS vs. Na+/(Na++Ca2+)).

Discussion

Main ion correlation

To gain a more intuitive understanding of the proximity and similarity between the elements, variable cluster analysis and Pearson correlation analysis were performed on the major ions in the water of typical polders in the Chaohu Lake basin. The major ions in the water can be divided into three major categories. The first category includes Ca2+, Mg2+, and HCO3− (Fig. 7), each of which is significantly positively correlated with the others (p < 0.01). These ions exhibit similar geochemical behavior and controlling factors, with contents slightly higher than those in the main stem of the Yangtze River. They may share common or similar sources, primarily the dissolution of carbonate rocks such as dolomite and calcite. The second category includes K+, Na+, Cl−, SO42−, and NO3−. This category can be further divided into two subcategories: K+, Na+, Cl− and SO42−. In nature, Na+ and Cl− have widespread sources, of which rock weathering is one of the most important. The two ions are significantly correlated at the 0.01 level, with a correlation coefficient of 0.97, possibly because of a common source related to the dissolution of salt rocks. SO42− and Cl−, in addition to originating from the dissolution of rock minerals, are also significantly influenced by human activities; their correlation coefficient is 0.71 (significant at the 0.01 level), indicating a strong correlation. Na+ and K+ have a correlation coefficient of 0.86 (significant at the 0.01 level), which may be related to the dissolution of silicate minerals. NO3− forms a separate category and shows no correlation with other ions, suggesting that NO3− has a distinct source, potentially linked to agricultural fertilizer application or domestic wastewater discharge.

Fig. 7

Ion correlation heat map and cluster analysis.

Water-rock interaction

Rock weathering is one of the key processes controlling surface water quality in the study area. The relationships of Mg2+/Na+ with Ca2+/Na+ and of HCO3−/Na+ with Ca2+/Na+ were used to determine the influence of the weathering of different rock types on the study area29. As shown in Fig. 8a, the relationship between Mg2+/Na+ and Ca2+/Na+ indicates that the water samples are distributed near the silicate and evaporite end-members, indicating that silicate rocks and evaporite rocks are the primary factors controlling the origin and distribution of ions in the water. As shown in Fig. 8b, the relationship between HCO3−/Na+ and Ca2+/Na+ indicates that carbonate rocks also exert a certain control on ions in the water, particularly during the wet season, when the sample points lie closer to the carbonate end-member. The potential sources of Na+, K+, and Cl− include the dissolution of silicate rocks, the dissolution of evaporite rocks, and human inputs30.

Fig. 8

Component plots (a Mg2+/Na+ vs. Ca2+/Na+ and b HCO3/Na+ vs. Ca2+/Na+), all values are in meq/L.

The dissolution of evaporite rocks releases Na+, K+, and Cl−. A (Na++K+)/Cl− ratio close to 1 (Fig. 9a) indicates that Na+ and K+ primarily originate from the dissolution of evaporite minerals, whereas a ratio above 1 indicates that Na+ and K+ in the water originate from the dissolution of silicate minerals31. In terms of the milliequivalent contents of (Na++K+) and Cl−, most sample points in the region are distributed below and near the 1:1 line, with Na+/Cl− values ranging from 0.41 to 3.27 and an average of 1.20. This suggests that Na+ and Cl− in the region derive primarily from the weathering of salt rocks, with some contribution from the weathering of Na- and K-feldspar in silicate rocks. Human activities such as the use of agricultural fertilizers or the discharge of domestic wastewater can also increase the contents of K+ and Cl− in water32.
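
Because these diagnostic ratios are evaluated in equivalent units, contents measured in mg/L must first be converted to meq/L by multiplying by the ionic charge and dividing by the molar mass. The sketch below illustrates the conversion and the (Na++K+)/Cl− ratio; the dictionary, function names, and sample values are assumptions for illustration, and the molar masses are rounded standard values.

```python
# Molar mass (g/mol) and charge number of each major ion.
IONS = {
    "Na": (22.99, 1), "K": (39.10, 1), "Ca": (40.08, 2), "Mg": (24.305, 2),
    "Cl": (35.45, 1), "SO4": (96.06, 2), "HCO3": (61.02, 1), "NO3": (62.00, 1),
}


def to_meq(ion, mg_per_l):
    """Convert a content in mg/L to meq/L."""
    mass, charge = IONS[ion]
    return mg_per_l * charge / mass


def alkali_to_chloride(na, k, cl):
    """(Na+ + K+)/Cl- in equivalent units, with inputs in mg/L."""
    return (to_meq("Na", na) + to_meq("K", k)) / to_meq("Cl", cl)


# Halite dissolution releases Na+ and Cl- in equal molar amounts,
# so 22.99 mg/L Na+ with 35.45 mg/L Cl- lies on the 1:1 line.
print(round(alkali_to_chloride(na=22.99, k=0.0, cl=35.45), 2))
```

Any excess of Na+ and K+ over Cl− moves a sample above the 1:1 line, which the text attributes to silicate weathering.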

Research has shown that both sulfuric acid and carbonic acid may be involved in the weathering of carbonate and silicate rocks. However, under the same conditions, the weathering rate of carbonate rocks is significantly higher than that of silicate rocks33. When only carbonic acid is involved in the weathering of carbonate rocks, the equivalent ratio of (Ca2++Mg2+) to HCO3− should be 1; when only sulfuric acid is involved34, the ratio should be 2. During the wet season and normal-flow period, all sample points in the study area are located below the 2:1 line and above the 1:1 line (Fig. 9b), indicating that Ca2+, Mg2+, and HCO3− primarily originate from the weathering of carbonate rocks by sulfuric acid, with weathering by carbonic acid also required to balance Ca2++Mg2+. When SO42− is added, most of the samples from the normal-flow period are distributed along the 1:1 line (Fig. 9c), confirming that carbonic acid also participates in the dissolution of carbonate rocks.

To further assess the extent to which sulfuric acid and carbonic acid are involved in carbonate rock weathering within the basin, the relationship between the (Ca2++Mg2+)/HCO3− and SO42−/HCO3− equivalent ratios was analyzed35. When only CO2 participates in karstification, the (Ca2++Mg2+)/HCO3− ratio is 1 and the SO42−/HCO3− ratio is 0; when only sulfuric acid participates in the weathering of carbonate rocks, the (Ca2++Mg2+)/HCO3− ratio is 2 and the SO42−/HCO3− ratio is 1. In Fig. 9d, Point A represents carbonate rock weathering by sulfuric acid alone (Eq. 11), Point B represents weathering by both carbonic acid and sulfuric acid (Eq. 12), and Point C represents weathering by carbonic acid alone (Eq. 13). Most samples from the wet season and normal-flow period are distributed above point A and above point B, indicating that they are primarily controlled by the joint weathering of carbonate rocks by carbonic acid and sulfuric acid. A small portion of samples are distributed between points B and C, indicating that weathering of carbonate rocks by carbonic acid is more pronounced at individual sites.

$$2\mathrm{NaAlSi_3O_8}\,(\text{sodium feldspar}) + 2\mathrm{CO_2} + 3\mathrm{H_2O} \to 2\mathrm{HCO_3^-} + 2\mathrm{Na^+} + \mathrm{Al_2Si_2O_5(OH)_4} + 4\mathrm{SiO_2}$$
(9)
$$2\mathrm{KAlSi_3O_8}\,(\text{potassium feldspar}) + 11\mathrm{H_2O} \to 2\mathrm{K^+} + 2\mathrm{OH^-} + \mathrm{Al_2Si_2O_5(OH)_4} + 4\mathrm{H_4SiO_4}$$
(10)
$$\mathrm{A:}\; 2\mathrm{Ca}_{1-x}\mathrm{Mg}_{x}\mathrm{CO_3} + \mathrm{H_2SO_4} \to 2(1-x)\mathrm{Ca^{2+}} + 2x\mathrm{Mg^{2+}} + 2\mathrm{HCO_3^-} + \mathrm{SO_4^{2-}}$$
(11)
$$\mathrm{B:}\; 3\mathrm{Ca}_{1-x}\mathrm{Mg}_{x}\mathrm{CO_3} + \mathrm{H_2CO_3} + \mathrm{H_2SO_4} \to 3(1-x)\mathrm{Ca^{2+}} + 3x\mathrm{Mg^{2+}} + 4\mathrm{HCO_3^-} + \mathrm{SO_4^{2-}}$$
(12)
$$\mathrm{C:}\; \mathrm{Ca}_{1-x}\mathrm{Mg}_{x}\mathrm{CO_3} + \mathrm{H_2CO_3} \to (1-x)\mathrm{Ca^{2+}} + x\mathrm{Mg^{2+}} + 2\mathrm{HCO_3^-}$$
(13)
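The coordinates of the end-member points A, B, and C follow directly from the stoichiometry of Eqs. (11) to (13). The short Python sketch below derives them by converting the molar coefficients to equivalents; the function name and argument layout are illustrative assumptions, not part of the original analysis.

```python
def end_member(n_carbonate, n_hco3, n_so4):
    """Equivalent ratios for one weathering reaction.

    n_carbonate: moles of Ca(1-x)Mg(x)CO3 dissolved; each mole releases
                 one mole of divalent cations (Ca2+ plus Mg2+).
    n_hco3:      moles of HCO3- produced.
    n_so4:       moles of SO4 2- produced.
    Returns ((Ca+Mg)/HCO3, SO4/HCO3) in equivalents.
    """
    cation_eq = 2 * n_carbonate  # divalent cations carry 2 eq per mole
    hco3_eq = 1 * n_hco3         # monovalent anion
    so4_eq = 2 * n_so4           # divalent anion
    return cation_eq / hco3_eq, so4_eq / hco3_eq


# Molar coefficients read from the reaction equations.
POINTS = {
    "A": end_member(2, 2, 1),  # Eq. (11): sulfuric acid only
    "B": end_member(3, 4, 1),  # Eq. (12): carbonic and sulfuric acid
    "C": end_member(1, 2, 0),  # Eq. (13): carbonic acid only
}

for name, (cation_ratio, sulfate_ratio) in POINTS.items():
    print(name, cation_ratio, sulfate_ratio)
```

Under these assumptions, point A lies at (2, 1), point B at (1.5, 0.5), and point C at (1, 0) in the (Ca2++Mg2+)/HCO3− versus SO42−/HCO3− plane, so the three points fall on one straight line along which the sulfuric acid share increases from C to A.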
Fig. 9 Geochemical behavior of major cations and anions (a Cl− vs. (Na++K+), b (Ca2++Mg2+) vs. HCO3−, c (Ca2++Mg2+) vs. (HCO3−+SO42−), d (Ca2++Mg2+)/HCO3− vs. SO42−/HCO3−).

Cation exchange plays a significant role in shaping water chemistry. Typically, the relationship between the milliequivalent contents of (Na+ − Cl−) and (Ca2+ + Mg2+ − HCO3− − SO42−) is used to determine whether cation exchange occurs in a water body36. During the normal-flow period, most sample points of the polder area are distributed near the line with a slope of −1 (Fig. 10a), while those from the wet season are distributed above it, indicating that cation exchange is more intense during the normal-flow period than during the wet season. In the wet season, heavy rainfall dilutes ion contents and inhibits exchange. In the dry season, although ion contents are high, stagnant water may cause exchange sites to become saturated28. The normal-flow period is intermediate, with a weaker dilution effect and sufficient flow to promote exchange.

The intensity of cation exchange can be further analyzed based on the magnitude of the chlor-alkali index37. The formula for calculating the chlor-alkali index is:

$$\mathrm{CAI1}=\frac{N(\mathrm{Cl^-})-N(\mathrm{Na^+}+\mathrm{K^+})}{N(\mathrm{Cl^-})}$$
(14)
$$\mathrm{CAI2}=\frac{N(\mathrm{Cl^-})-N(\mathrm{Na^+}+\mathrm{K^+})}{N(\mathrm{SO_4^{2-}})+N(\mathrm{HCO_3^-})}$$
(15)

In Eqs. (14) and (15), N denotes the equivalent content of each ion. If both CAI1 and CAI2 are negative, Ca2+ and Mg2+ in the water have been exchanged with Na+ and K+ in the surrounding rock; conversely, if both are positive, Na+ and K+ in the water have been exchanged with Ca2+ and Mg2+ in the surrounding rock. In either case, the higher the absolute value, the stronger the ion exchange. As shown in Fig. 10b, most sample points have negative CAI1 and CAI2 values, indicating that Ca2+ and Mg2+ in the water have been exchanged with Na+ and K+ in the mineral phase, resulting in Na+ and K+ from the minerals entering the water. Overall, the chlor-alkali indices indicate that ion exchange intensity is higher during the wet season and dry season than during the normal-flow period.
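Equations (14) and (15) can be evaluated directly from the equivalent contents. The following minimal Python sketch does so for one hypothetical sample; the function name and the input values are assumptions made for illustration and are not data from the study.

```python
def chlor_alkali_indices(cl, na, k, so4, hco3):
    """Return (CAI1, CAI2) from ion contents in meq/L, per Eqs. (14) and (15)."""
    if cl <= 0:
        raise ValueError("Cl- content must be positive")
    if so4 + hco3 <= 0:
        raise ValueError("SO4 2- plus HCO3- content must be positive")
    excess = cl - (na + k)  # shared numerator of both indices
    return excess / cl, excess / (so4 + hco3)


# Hypothetical sample in meq/L.
cai1, cai2 = chlor_alkali_indices(cl=1.0, na=1.5, k=0.1, so4=1.0, hco3=2.0)
print(round(cai1, 3), round(cai2, 3))

if cai1 < 0 and cai2 < 0:
    print("Ca2+/Mg2+ in water exchanged for Na+/K+ from the solid phase")
elif cai1 > 0 and cai2 > 0:
    print("Na+/K+ in water exchanged for Ca2+/Mg2+ from the solid phase")
```

For this sample both indices are negative (about −0.6 and −0.2), which corresponds to the situation described for most sample points in Fig. 10b.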

Fig. 10 Component plots (a (Na+ − Cl−) vs. (Ca2+ + Mg2+ − HCO3− − SO42−) and b CAI1 vs. CAI2).

Evaporation

The hydrochemical composition of river water serves as a significant indicator of the natural environment in the areas it flows through. Hydrogen and oxygen isotopes from different water sources can reveal the moisture sources of atmospheric precipitation and help determine the flow paths and recharge relationships between groundwater and surface water38. Craig (1961) conducted mass spectrometric analysis of approximately 400 water samples from rivers, lakes, and precipitation, establishing the relationship between hydrogen and oxygen isotopes in atmospheric precipitation39. By plotting δ2H against δ18O and performing linear fitting, the Global Meteoric Water Line (GMWL) was derived as δ2H = 8δ18O + 10. Furthermore, the δ2H–δ18O plot can be used to assess the effects of evaporative fractionation. Given the geographical proximity and similar hydrological conditions between the study area and Nanjing, the Nanjing Meteoric Water Line (δ2H = 8.47δ18O + 17.52) is adopted as the Local Meteoric Water Line (LMWL) in this study40.

The deuterium excess (d = δ2H − 8δ18O) is a key parameter for evaluating the impact of evaporation on the physicochemical properties of surface water. A lower d-value generally indicates stronger evaporation41. The d-value represents the intercept when the slope is 8; for the GMWL, the d-value is +10‰. Deviation from this reference value suggests that the surface water has undergone secondary evaporation42. The calculated d-values for water bodies in the study area are presented in Table 3. Based on average values, the order of isotopic enrichment is: river (δ18O = −6.05‰, δ2H = −44.32‰) > pond (δ18O = −7.98‰, δ2H = −56.93‰) > trench (δ18O = −8.42‰, δ2H = −60.35‰), indicating that the river water is the most isotopically enriched, while the trench water is the most depleted. The average d-values of all water bodies (trench: 6.97‰; pond: 6.91‰; river: 4.08‰) are lower than the GMWL reference value, consistently indicating that they have undergone varying degrees of evaporative fractionation.
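As an arithmetic check of the definition d = δ2H − 8δ18O, the Python sketch below recomputes the deuterium excess from the mean δ values quoted in the paragraph above. The dictionary layout and function name are illustrative assumptions; only the six mean values come from the text.

```python
def d_excess(delta_2h, delta_18o):
    """Deuterium excess in per mil: d = delta2H - 8 * delta18O."""
    return delta_2h - 8.0 * delta_18o


# Mean isotope values quoted in the text, as (delta18O, delta2H) in per mil.
MEANS = {
    "river": (-6.05, -44.32),
    "pond": (-7.98, -56.93),
    "trench": (-8.42, -60.35),
}

d_values = {
    name: round(d_excess(d2h, d18o), 2) for name, (d18o, d2h) in MEANS.items()
}
print(d_values)
```

The river and pond results reproduce the reported mean d-values (4.08‰ and 6.91‰). The trench result computed from the rounded means is 7.01‰, which is within rounding of the reported 6.97‰, since an error of 0.005‰ in the mean δ18O shifts d by 0.04‰.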

River water exhibits the most enriched isotopic composition (mean δ18O = -6.05‰) and the lowest mean d-value (4.08‰), with even negative values observed (Min: -2.47‰), clearly reflecting the strongest evaporation. During evaporation, lighter isotopes (16O and 1H) escape preferentially, leading to the enrichment of heavier isotopes (18O and 2H) in the remaining water and a decrease in the d-value. The high coefficients of variation (CV) for river isotopes (35.47% for δ18O and 110.52% for d-value), significantly higher than those of trenches and ponds, suggest a more complex water source composition and hydrological dynamics, likely influenced by multiple sources mixing or human activities.

Pond water shows an isotopic composition intermediate between rivers and trenches, with d-values similar to those of trenches (around 6.9‰), indicating noticeable evaporation but less intense than in rivers. As standing water bodies, ponds are prone to evaporation. Their relatively high CV values (11.04% for δ18O and 32.17% for d-value) suggest poor water homogeneity, possibly due to uneven spatial evaporation or intermittent recharge.

Trench water is the most isotopically depleted and has the highest d-value, indicating the weakest evaporation and the closest resemblance to the initial isotopic signature of local precipitation. The lowest CV values reflect relatively stable isotopic composition over time and space, suggesting a more singular water source and simpler hydrological processes.

Table 3 δ2H-H2O, δ18O-H2O, and d statistical characteristics of water in the study area.

Fig. 11 shows the relationship between δ18O and δ2H in water samples from the polder area of the Chaohu Lake Basin, together with the GMWL and LMWL. The fitted Local Water Line (LWL) for all samples is δ2H = 6.197δ18O − 7.522 (R2 = 0.976), indicating a strong linear relationship and suggesting that all water bodies originate from initial precipitation with similar isotopic compositions but have diverged along the same evaporation line because of varying evaporation intensities. Compared to the LMWL, the lower slope and intercept of the LWL further confirm the influence of evaporation. Previous studies have suggested that slopes between 6 and 7 typically indicate post-precipitation evaporation43; the slope of approximately 6.197 in this study indicates that the water bodies in the area have generally experienced post-precipitation evaporation. Most samples are distributed below the GMWL and to the right of the LMWL (Fig. 11), indicating that the water primarily originates from atmospheric precipitation but that evaporation has caused the isotopic values to deviate from the meteoric water lines.
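A water line of this kind is obtained by ordinary least-squares regression of δ2H on δ18O. The Python sketch below shows the calculation in plain form. The input points are synthetic, generated exactly on the reported LWL for illustration, and are not the measured samples; the function name is likewise an assumption.

```python
def fit_water_line(d18o, d2h):
    """Ordinary least-squares fit of d2H = slope * d18O + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(d18o)
    mean_x = sum(d18o) / n
    mean_y = sum(d2h) / n
    sxx = sum((x - mean_x) ** 2 for x in d18o)
    syy = sum((y - mean_y) ** 2 for y in d2h)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(d18o, d2h))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic points placed on the reported LWL (not measured data).
d18o = [-9.0, -8.0, -7.0, -6.0, -5.0]
d2h = [6.197 * x - 7.522 for x in d18o]

slope, intercept, r2 = fit_water_line(d18o, d2h)
print(round(slope, 3), round(intercept, 3), round(r2, 3))

# Compare with the meteoric water lines quoted in the text.
GMWL_SLOPE, LMWL_SLOPE = 8.0, 8.47
print(slope < GMWL_SLOPE and slope < LMWL_SLOPE)
```

With real samples the points scatter around the line, so R2 falls below 1 (0.976 in this study); a fitted slope lower than both meteoric water line slopes is the feature interpreted as an evaporation signal.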

Fig. 11 Relationship between δ18O and δ2H.

Human impact

Human activities primarily influence the concentration of dissolved substances in river water44, particularly components such as SO42−, Cl−, NO3−, and Na+. Among these, mining activities are often the primary source of SO42−, while agricultural fertilization, livestock manure, and municipal wastewater discharge can influence Cl−, NO3−, and Na+ contents45. When water is influenced by human activities such as agriculture and domestic use, the milliequivalent ratios of NO3−/Na+ and Cl−/Na+ are generally higher46. As shown in Fig. 12a, during the wet season and normal-flow period, some samples in the polder area plot closer to the human activity end-member, while during the dry season they plot farther from it, indicating that water in the polder area is more strongly influenced by human activities during the wet season and normal-flow period. Most sample points are distributed near the evaporite end-member, indicating a more significant influence from evaporite weathering. Related studies indicate that a high NO3−/Cl− ratio with a low Cl− content suggests NO3− contributions from agricultural fertilization, while a low NO3−/Cl− ratio with a high Cl− content indicates NO3− derived primarily from domestic wastewater and feces. A low NO3−/Cl− ratio with a low Cl− content suggests that NO3− originates from soil nitrogen47. As shown in Fig. 12b, the sampling points for each season are primarily located near the fecal and domestic wastewater end-member, indicating that water in the polder area is influenced more by domestic wastewater discharge than by agricultural fertilization. This is mainly because the study area is located primarily in rural areas, where untreated domestic wastewater is discharged into the polder area. Additionally, the NO3−/Cl− ratio was significantly higher during the wet season than during the dry season and normal-flow period, indicating that agricultural activities have a greater impact on water chemistry during the wet season.

Fig. 12 Component plots (a NO3−/Na+ vs. Cl−/Na+ and b NO3−/Cl− vs. Cl−).

Main ion sources

Principal component analysis

Factor analysis was performed on K+, Na+, Ca2+, Mg2+, HCO3−, SO42−, Cl−, and NO3− in the water within the area. Generally, a Kaiser-Meyer-Olkin (KMO) value greater than 0.50 and a Bartlett test p-value less than 0.05 indicate that the data are suitable for principal component analysis. The KMO value for the eight ions was calculated in SPSS, and Bartlett’s sphericity test was performed. The resulting KMO value and Bartlett’s test p-value were 0.589 and 0.00, respectively, indicating that these eight conventional ions are suitable for principal component factor loading analysis48. Factor loadings were rotated using the varimax (maximum variance) method, and components were retained if their initial eigenvalues were greater than 1. Two principal components, PC1 and PC2, were obtained.

The first two principal components together explain 77.5% of the total variance, with the first principal component (PC1) accounting for 48.6% and the second (PC2) for 28.9% (Fig. 13). Variables with relatively high loadings on PC1 include K+, Na+, NO3−, and Cl−, whereas Mg2+, Ca2+, and HCO3− show higher loadings on PC2. As presented in Figs. 4, 5, 6, 7 and 8, most sample points fall within the 95% confidence interval. The high loadings (> 0.7) of K+, Na+, NO3−, and Cl− on PC1 suggest significant influences from evaporite dissolution, silicate weathering, and anthropogenic activities (such as agricultural fertilization and sewage discharge). Meanwhile, Mg2+, Ca2+, and HCO3− display strong positive loadings on PC2, with values of 0.846, 0.859, and 0.966, respectively. These three components are strongly intercorrelated, indicating that their sources are likely governed by carbonate weathering.

Fig. 13 Ion loading plot.

APCS-MLR model contribution analysis

Based on principal component analysis, the APCS-MLR model can be used to quantify the contribution of each source to each ion. The coefficient of determination R2 measures the agreement between modeled and observed values; the closer it is to 1, the better the linear fit and the simulation result. Negative values may occur when contribution rates are calculated in the APCS-MLR model, which can distort the source apportionment. To address this issue, all negative values are converted to absolute values, so that every source contribution to a water quality parameter is expressed as a positive contribution rate. As shown in Table 4, the R2 values of all ions except NO3− and SO42− are greater than 0.7, indicating that the model is feasible.
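The absolute-value treatment described above can be sketched as follows. This is a minimal illustration that assumes the common form of the calculation, in which each source term is the regression coefficient multiplied by the mean absolute principal component score and the regression intercept is assigned to the unknown source; the exact formula used in this study is not stated, and the coefficients below are hypothetical.

```python
def apcs_mlr_contributions(intercept, coefs, mean_apcs):
    """Percentage contributions of each source and of the unknown source.

    intercept: regression constant for one ion (assigned to unknown sources).
    coefs:     regression coefficients, one per source.
    mean_apcs: mean absolute principal component score, one per source.
    Negative terms are replaced by their absolute values before normalizing.
    """
    terms = [abs(b * m) for b, m in zip(coefs, mean_apcs)]
    unknown = abs(intercept)
    total = unknown + sum(terms)
    if total == 0:
        raise ValueError("all terms are zero; contributions are undefined")
    shares = [100.0 * t / total for t in terms]
    return shares, 100.0 * unknown / total


# Hypothetical regression result for one ion with two sources.
shares, unknown_share = apcs_mlr_contributions(
    intercept=2.0, coefs=[3.0, -1.0], mean_apcs=[1.0, 5.0]
)
print(shares, unknown_share)
```

In this example the second source has a negative coefficient, yet it still receives a positive share (50%) after the absolute-value step, and the three shares sum to 100%.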

As shown in Fig. 14, the APCS-MLR model resolved three sources: Source 1, Source 2, and an unknown source. Source 1 is primarily loaded by K+, Na+, NO3−, Cl−, and SO42−, indicating that it originates from evaporite and silicate rocks. The coefficient of determination R2 for NO3− is 0.557, indicating a poor fit and suggesting that NO3− may have other sources. Mg2+, Ca2+, and HCO3− exhibit better fits.

Table 4 Linear regression model coefficient statistics.

Considering that the alluvial layers in the area are primarily composed of silty clay, sub-clay, and sandstone, Mg2+ and Ca2+ primarily originate from the weathering of carbonate rocks, indicating that the source controlled by Mg2+, Ca2+, and HCO3− (Source 2) is associated with carbonate rock weathering29. Additionally, the proportion of unknown sources reaches 16.59%, which may reflect the complex hydrogeological conditions and human activities in the area and introduces uncertainty into the contribution analysis. According to the APCS-MLR model, the contribution rate of Source 1 (evaporite and silicate rock weathering) ranges from 2.60% to 83.47%, with an average of 37.74%, while that of Source 2 (carbonate rock dissolution) ranges from 3.64% to 88.68%, with an average of 45.66%.

Fig. 14 APCS-MLR contribution rate analysis.

PMF model contribution analysis

Because the APCS-MLR model alone produced errors and its fits for NO3− and SO42− were not ideal, the PMF model was introduced to refine the APCS-MLR results. Based on the results of principal component analysis, four factors were selected to represent the pollution source categories. After four iterations, a low Q value of 44.5 was obtained, with all eight water quality parameters classified as “Strong” (S/N > 1). Most residuals for all water quality parameters were within the range of −3 to 3. The R2 between measured and predicted contents ranged from 0.50 to 0.99, indicating that the PMF model has reasonable predictive capability49. The source profiles of each ion resolved by the PMF model are shown in Fig. 15.
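The Q value and the residual check mentioned above both come from the PMF objective function, which sums the squared residuals between measured and reconstructed contents after scaling each by its measurement uncertainty. The Python sketch below evaluates that quantity for given factor matrices; it does not perform the PMF optimization itself, and all matrices are small hypothetical examples rather than outputs of the model used in this study.

```python
def pmf_q(x, g, f, u):
    """Evaluate the PMF objective Q and the scaled residuals.

    x: measured contents, n samples by m species.
    g: factor contributions, n samples by p factors.
    f: factor profiles, p factors by m species.
    u: measurement uncertainties, same shape as x.
    Returns (Q, scaled_residuals).
    """
    q = 0.0
    scaled = []
    for i, row in enumerate(x):
        scaled_row = []
        for j, x_ij in enumerate(row):
            modeled = sum(g[i][k] * f[k][j] for k in range(len(f)))
            e_ij = (x_ij - modeled) / u[i][j]
            scaled_row.append(e_ij)
            q += e_ij * e_ij
        scaled.append(scaled_row)
    return q, scaled


# Hypothetical example: 2 samples, 2 species, 2 factors.
x = [[5.0, 3.0], [4.0, 6.0]]
g = [[1.0, 1.0], [2.0, 0.0]]
f = [[2.0, 3.0], [2.5, 0.5]]
u = [[0.5, 0.5], [0.5, 0.5]]

q_value, residuals = pmf_q(x, g, f, u)
all_within = all(abs(e) <= 3 for row in residuals for e in row)
print(q_value, all_within)
```

A solution is usually judged acceptable when Q is low and the scaled residuals fall mostly between −3 and 3, which is the criterion applied to the eight water quality parameters here.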

Factor 1 is primarily loaded by Ca2+, Mg2+, and HCO3−, which account for 45.5%, 77.5%, and 76.4%, respectively; other ions have lower loadings. Based on cluster analysis and the geological background, this factor is likely related to the dissolution of carbonate rocks, so Factor 1 represents a carbonate rock source. Factor 2 is mainly loaded by Na+ and Cl−, which account for 65.2% and 71.7%, respectively, indicating an association with the dissolution of salt rocks. Factor 3 is mainly loaded by Ca2+ and SO42−, which account for 52.4% and 77.7%, respectively, and is typically associated with the dissolution of gypsum. Therefore, Factors 2 and 3 can both be attributed to the dissolution of evaporite rocks. Factor 4 is primarily loaded by NO3−, which accounts for 72.6%. Under normal circumstances, NO3− in aquatic environments originates primarily from human activities, such as domestic wastewater and fertilizers. Agricultural activities are frequent in the study area, and rivers within the region flow through urban residential areas. Therefore, Factor 4 can be considered to represent domestic wastewater discharge and agricultural fertilizer application, i.e., human activity impacts.

Based on the factor fingerprints obtained from the PMF analysis, the contribution rates of the four factors to the eight water quality parameters can be determined, as shown in Fig. 15b. Factor 1 (carbonate rock dissolution) has a contribution rate ranging from 0.2% to 77.5%, with an average of 32.25%. Factor 2 (evaporite dissolution) has a contribution rate ranging from 0% to 71.7%, with an average of 27.08%. Factor 3 (gypsum dissolution) has a contribution rate ranging from 0% to 77.7%, with an average of 21.61%. Thus, Factors 1, 2, and 3 are all controlled by rock weathering, with a combined contribution rate of 80.95%. Factor 4 (human activity) has a contribution rate ranging from 0.1% to 72.6%, with an average of 19.05%. Human activity therefore makes a substantial contribution to the water chemistry of the area and exerts a significant influence.

Based on the combined results of the APCS-MLR and PMF model analyses, it is evident that PMF can further analyze the unknown sources identified by the APCS-MLR model, providing a more detailed breakdown of ion sources and achieving superior ion fitting results, thereby enhancing the reliability of the outcomes. This indicates that the PMF model is more suitable for ion source analysis in this study.

The polder is affected by human activities, especially intensive agriculture and decentralized rural domestic sewage discharge, mainly in the following respects: (1) the use of agricultural fertilizers increases the input of ions into natural water, most noticeably potassium and nitrate ions; (2) tillage loosens the soil, intensifies surface scouring by rainfall, and further enhances weathering; (3) as the polders are relatively closed, evaporation is stronger, especially during the normal-flow period and the dry season, which may further increase ion contents; (4) as the polders are mainly inhabited by scattered rural residents, most domestic sewage is discharged directly into the water, which may further increase the ion contents of the polders and the risk of water pollution.

Based on the above points, the management of the polder area should be strengthened in the following ways: (1) regulate fertilizer use in the intensively farmed polder area, for example by recommending organic fertilizers wherever possible, in order to improve fertilizer use efficiency; (2) improve water and nitrogen use efficiency in the polder area by adopting recycled irrigation, so as to increase nitrogen retention in farmland and the nitrogen uptake efficiency of crops; (3) collect rural domestic sewage in septic tanks as centrally as possible to form “artificial fertilizers”, and apply these nitrogen- and phosphorus-rich “artificial fertilizers” to farmland and dry land on sunny days, so as to reduce rural domestic sewage discharge, reduce the use of agricultural fertilizers, and save agricultural production costs.

Fig. 15 Component plots (a source composition spectrum and b contribution rate) of the study area obtained from the PMF model.

Analysis of the APCS-MLR model shows that it identifies only sources associated with different geological backgrounds and cannot effectively resolve anthropogenic pollution sources. Furthermore, unknown sources account for as much as 16.59%, indicating limited source identification capability. In contrast, the PMF model resolves sources more comprehensively: it not only identifies geological sources but also discriminates anthropogenic contributions, with human activity sources representing 19.05%, a clear improvement over the APCS-MLR results.

Conclusion

The water in the typical polder area of the Chaohu Lake Basin is weakly alkaline, and its TDS content is much higher than that of Chaohu Lake and the rivers flowing into the lake. The cations are dominated by Na+ and Ca2+ and are higher in the dry season than in other seasons; the anions are dominated by HCO3− and are slightly higher in the wet season than in the other two seasons. Ion contents in the ditches are generally higher than those in the ponds. The water chemistry type is mainly HCO3-Ca in the wet season, whereas in the normal-flow period and the dry season the types are Cl/SO4-(Ca-Mg), HCO3-Ca, and Cl/SO4-Na. Under the influence of natural processes and human activities, the water chemistry changes from a single type to mixed types.

The hydrochemical characteristics of water in typical polder areas of the Chaohu Lake Basin are mainly controlled by rock weathering but are also affected by anthropogenic activities; NO3− in the area is mainly affected by the discharge of domestic sewage. The hydrochemical characteristics of the polder also show seasonal changes: the wet season is mainly controlled by rock weathering, while the normal-flow period and dry season are mainly controlled by rock weathering and evaporation-crystallization. The multivariate statistical analysis shows that the PMF model is more suitable than the APCS-MLR model for analyzing the main ion sources in this study. The main ions in the Chaohu Lake polder come mainly from rock weathering, which contributes 80.95%, while anthropogenic activities account for about 19.05%.