Introduction

Human activities have altered the original physical and chemical properties of soil elements. Determining the extent of this change and establishing a scientific and reasonable reference standard for soil element content are of great significance for studying the degree of soil pollution and its evolution trends1,2,3,4. Currently, the reference standards for soil geochemistry include the environmental background value, background value, reference value, and baseline value5,6,7. These concepts appear similar but differ significantly in practice. Using inappropriate evaluation criteria can easily lead to “excessive” or “insufficient” evaluation of soil pollution8,9,10,11.

The environmental background value is the natural content of chemical elements in the environment without the influence of human activities. It no longer exists under current environmental conditions, is difficult to obtain, and has little practical application value12. The background value is the content of chemical elements in a relatively clean part of a study area under current environmental conditions; it includes a certain degree of human influence and represents the transition from the environmental background value to the environmental status value2,13. The reference value is the maximum dose or concentration of pollutants in the environment that does not have adverse or harmful effects on specific objects14. The baseline value describes the natural variation of the concentration of chemical substances in the Earth’s surface material1,2,3,15,16,17,18. More specifically, it expresses the natural variation in the abundance of an element in a specific medium (such as soil, sediment, or rock) when a region or data set is used as a reference, and it serves as a single limit for distinguishing geochemical background from anomalies19. The geochemical baseline value is essential for characterizing natural and anthropogenic environmental impacts. It is defined as the upper limit of the geochemical background (not disturbed by human activities) or the lower limit of the effects of human activities. It is an essential reference for assessing the effects of human activities on the enrichment of elements at the Earth’s surface and provides a necessary basis for subsequent environmental risk assessment20,21. Research on soil pollution mainly adopts regional or even global soil background values22,23,24 and ignores spatiotemporal differences in soil heavy metal content. The resulting evaluations generally indicate severe pollution and struggle to reflect the actual pollution status of the soil in the study area. Therefore, geochemical baselines should be delineated and used as reference values to scientifically evaluate the degree of soil pollution in the study area25,26.

The Yangtze River is the longest river in China. The Yangtze River Basin is a vast region that connects the western, central, and eastern parts of China and spans more than ten provinces. It is an important economic center and industrial corridor. The watershed has abundant soil resources and diverse soil types, and its arable land accounts for 33.3% of the country’s arable land area. With the rapid economic development of the Yangtze River Basin, soil heavy metal pollution and agricultural product safety problems have occurred frequently, posing a significant threat to the ecological environment and food safety27,28,29,30,31,32. Research has found that nearly half of China’s key heavy metal prevention and control areas are located in the Yangtze River Economic Belt, and concentrated, contiguous heavy metal pollution occurs in areas such as the Yangtze River Delta, the middle reaches of the Yangtze River, and the Chengdu-Chongqing urban agglomeration33. The soil heavy metal content in China’s southwestern region is higher than in other areas34. The Cd content in rice grains grown in farmland soil around an industrial zone in Hunan Province severely exceeded the standard35. Soil heavy metal content is relatively high at the junction of Jiangsu Province, Zhejiang Province, and Shanghai City, where urbanization is progressing rapidly36. The exceedance rates of Hg and Cd in some farmland soils in Suzhou are relatively high, and high-risk areas are mainly concentrated in industrially developed towns and around factories37. In addition, China’s multi-objective geochemical survey has found abnormal enrichment of heavy metals such as Cd in soil along the Yangtze River38. Soil heavy metal pollution in the Yangtze River Economic Belt is therefore a prominent problem that poses a significant threat to the local natural environment and human health. In response, China has successively introduced a series of policies that prioritize the protection and restoration of the Yangtze River ecological environment and comprehensively promote the construction of a beautiful China.

Taking Wushan County in the Yangtze River Basin as an example, this paper constructs the environmental geochemical baseline of heavy metals from a survey of heavy metal content in surface soil. Using the geochemical baseline value and the environmental background value as evaluation criteria, the geo-accumulation index method is applied to evaluate pollution and analyze its causes. The aim is to explore a soil pollution evaluation model suitable for small watersheds that reflects and monitors the impact of human activities on the geological environment, and to provide scientific support for the prevention and control of soil heavy metal pollution and for environmental quality management in the Yangtze River Basin.

Materials and methods

Study area

The study area is located in the southern part of Wushan County, in the center of the Three Gorges Reservoir area. It is bounded to the south, east, and west by the county boundary and to the north by township boundaries, and it includes Miaoyu Town, Tonggu Town, Guandu Town, Baolong Town, Jianping Town, Peishi Town, Duping Town, Dengjia Town and Hongchun Town (Fig. 1). The area is 1002 km² (the core area of the nature reserve is 134.50 km²). The terrain is high in the south and low in the north. Low mountains form a belt across the middle of the area, with scattered hills and flat dams. The highest peak, Shiliping on the southern margin, has an altitude of 2104 m; the lowest point is on the northern side along the Yangtze River, where the altitude changes with the water storage of the reservoir and has a minimum of 145 m (Fig. 1). The relative height difference is 1959 m. The area has many mountains, valleys, and rivers and intense surface erosion. Intermittent uplift of the mountains during the long-term neotectonic movement has formed a landscape of steep, overlapping mountains, developed karst, dense valleys, and deep canyons. The primary mineral resources in the territory are coal, gypsum, limestone, copper, iron, shale for bricks and tiles, and stone coal (associated with vanadium, molybdenum, and selenium). The soil types are mainly limestone soil, followed by yellow soil and purple soil, with minor amounts of yellow-brown soil, paddy soil, and fluvo-aquic soil.

Sample collection and testing

Generally speaking, the surface soil layer (0–20 cm) is the main zone of plant growth and the soil layer in closest contact with human activities. This layer contains not only the natural geological background content of chemical elements inherited from the parent material but also various exogenous chemical substances added by human activities. Analyzing the chemical elements of this layer therefore reflects the impact of human activities on the soil environment, and the actual condition of that environment, more effectively.

Fig. 1

Geographical location and sampling point distribution map of the study area. (a) Map of China; (b) Image map of Chongqing, China (prepared with Bigemap GIS Office, version 30.0.0.0, URL: https://gac-geo.googlecnapps.club/maps/vt?lyrs=s&x={x}&y={y}&z={z}); (c) Elevation distribution map of the study area; (d) Sampling point distribution map of the study area (prepared with ArcGIS, version 10.7.1, URL: https://www.esri.com/en-us/arcgis/products/arcgis-desktop/resources); (e) Field sample collection and preprocessing.

Sample collection strictly followed the Technical Guidelines for the Geological Survey of Land Quality in Chongqing (1:50000) (revised in 2020). Sampling points were laid out on a grid adjusted for land use patterns, and samples were mainly collected in areas with thick soil layers, such as farmland, vegetable fields, orchards, and grasslands. A total of 90 surface soil sampling points were designed (Fig. 1). At each point, the position fixed with the sampling terminal was taken as the sampling center, and 4–6 sub-sampling points were set by radiating 30–50 m outward; the sub-samples were combined in equal parts into one mixed sample. Sub-sampling points were arranged in an ‘S’ shape when the sampling plot was rectangular and in an ‘X’ or chessboard shape when the plot was approximately square. A shovel was used to dig the sampling pit; a wooden shovel was then used to remove the soil that had been in contact with the shovel and to cut a 0–20 cm soil column. After each sample was collected, residual soil was immediately removed from the sampling tool before the next collection to keep the samples clean. The soil at each sampling point was crushed, and impurities such as roots, straw, stones, and insect bodies were removed. The sample was mixed thoroughly, about 1.0–1.5 kg was retained by the quartering method, and the sample was placed into a sample bag. The sample bags were clean, sturdy cotton bags; damp samples were additionally lined with a plastic bag. Each bagged sample was labeled with its sample number.

After natural air drying, the sample was gently crushed until it passed through a 10-mesh nylon sieve and was then split into one sample and one sub-sample by the reduction method. The sample was placed in a kraft paper bag, sealed in a self-sealing bag, and stored in a box. A sample delivery list was filled out and the sample was sent to the Chongqing Mineral Resources Testing Centre of the Ministry of Natural Resources for analysis; the sub-sample was kept as a duplicate.

The soil sample was dried thoroughly in a constant-temperature oven below 60 ℃, and non-soil impurities were removed. After sieving (10 mesh), it was thoroughly mixed, ground to 200 mesh, and bagged for testing and analysis. Following the regional geochemical sample analysis method DZ/T 0279–2016, the specific testing process was as follows. A 0.1000 g soil sample was weighed into a digestion tank, dissolved with HF, HNO3, and a small amount of HClO4, and evaporated to dryness. The residue was extracted with nitric acid and made up to constant volume in a 50 mL sample tube. Cd, Co, and Mo were determined by inductively coupled plasma mass spectrometry (ICP-MS, Agilent 7900, USA). A further 0.1000 g sample was weighed, dissolved with HF, HNO3, and HClO4, and evaporated to dryness. The residue was extracted with hydrochloric acid and made up to constant volume in a 10 mL sample tube. The contents of Cu, Ni, and Zn were determined by inductively coupled plasma atomic emission spectrometry (ICP-AES, Thermo ICAP6300, USA). A 4.0 g sample was pressed under 30 t pressure, and the Pb content was determined by X-ray fluorescence spectrometry (XRF, PANalytical PW4400). For As, a 0.5000 g sample was weighed into a digestion vessel with aqua regia and heated on an electric hot plate until approximately 2 mL of liquid remained. The liquid was extracted with hydrochloric acid and transferred to a 50 mL volumetric flask, where it was made up to volume with a mixture of thiourea and ascorbic acid. The sample was then reduced with potassium borohydride, and the As content was determined by atomic fluorescence spectrometry (AFS, Beijing Haiguang AFS-9760). For Hg, a 0.5000 g sample was weighed into a digestion tank with aqua regia, digested on an electric heating plate, and evaporated to about 2 mL. The solution was transferred to a 50 mL volumetric flask and made up to volume with nitric acid and potassium dichromate solution. After reduction with potassium borohydride, Hg was determined by atomic fluorescence spectrometry (AFS, XGY1011A).

Using the above testing methods, 12 parallel measurements were conducted on 12 national first-class standard materials for soil. The logarithmic difference between the average measured values and the standard values ranged from −0.031 to 0.028, and the relative standard deviation ranged from 1.37% to 8.86% (Table 1). The test results are therefore accurate and meet the quality control standards.

Research methods

Baseline value determination method

According to Reference39, the baseline level refers to the level of soil environmental quality and its ecological service function in a region before environmental pollution or ecological damage has occurred. In baseline determination, priority should be given to historical data, followed by survey data from a control area and reference environmental quality standards. This study used soil heavy metal survey data, processed them with the cumulative frequency method to determine the baseline, and then analyzed the soil heavy metal pollution status in the study area.

Table 1 Accuracy and precision of soil sample analysis methods.

The cumulative frequency curve is one of the ideal methods for displaying the distribution characteristics of geochemical data and can sensitively reflect anomalies in the data40. Bauer et al.41 introduced the cumulative frequency curve into soil environmental quality assessment to determine environmental geochemical baselines. If one inflection point exists in the cumulative frequency curve of element content, the values above the inflection point represent the part affected by human activities, and the baseline value is taken as the average element content (data within a 95% confidence interval) below the inflection point. If two inflection points exist, the higher point represents the lower limit of the anomaly, which is the part affected by human activities; the lower point represents the upper limit of the baseline, and the average element content (data within a 95% confidence interval) below this point is taken as the baseline value.
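The cumulative frequency procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' workflow: the function names and the sample contents are hypothetical, the inflection point is assumed to have been read from the curve by the analyst, and the 95% confidence-interval trimming is omitted because the text does not specify how it was applied.

```python
from statistics import mean


def cumulative_frequency(values):
    """Return the sorted contents and their relative cumulative frequency (%)."""
    ordered = sorted(values)
    n = len(ordered)
    freq = [100.0 * (i + 1) / n for i in range(n)]
    return ordered, freq


def baseline_from_inflection(values, lower_inflection):
    """Average the contents below the lower inflection point (assumed known)."""
    below = [v for v in values if v < lower_inflection]
    if not below:
        raise ValueError("no values below the inflection point")
    return mean(below)


# Hypothetical contents in mg/kg, for illustration only (not the study data).
contents = [0.04, 0.05, 0.06, 0.07, 0.08, 0.10, 0.12, 0.15, 0.30]
ordered, freq = cumulative_frequency(contents)
baseline = baseline_from_inflection(contents, lower_inflection=0.09)
```

Plotting `ordered` against `freq` gives the curve on which the inflection points are located; values between two inflection points are left out of both the baseline and the anomaly.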

Evaluation method of geo-accumulation index

The geo-accumulation index (Igeo), proposed by German scholar Müller42, can reflect the potential degree of pollution of heavy metal elements in soil and is widely used to quantitatively evaluate the degree of accumulation of heavy metal pollution. The calculation method is:

$$I_{geo} = \log_{2}\left[\frac{C_{n}}{k B_{n}}\right]$$
(1)

In Eq. 1, Igeo is the geo-accumulation index; Cn is the measured content of the element in the soil, mg·kg⁻¹; Bn is the background value, mg·kg⁻¹; and k is a coefficient that accounts for variation in the background value caused by differences in parent rock (in general, k is 1.5). According to the calculated Igeo, the degree of heavy metal accumulation is divided into seven levels (Table 2).

Table 2 Grading of geo-accumulation index.
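As a hedged illustration of Eq. 1 and the seven-level grading, the sketch below computes Igeo and maps it to a class. The function names are hypothetical. The class limits are the commonly used Müller limits (class 0 for Igeo ≤ 0, then one class per unit up to class 6 for Igeo > 5), which are assumed here to match Table 2, and the labels follow the wording used in this paper.

```python
import math

# Labels follow the wording used in this paper; the class limits are the
# commonly used Mueller limits and are an assumption about Table 2.
CLASS_LABELS = [
    "non-polluting",
    "light pollution",
    "moderate pollution",
    "moderate-strong pollution",
    "strong pollution",
    "strong to extremely strong pollution",
    "extremely strong pollution",
]


def igeo(c_n, b_n, k=1.5):
    """Geo-accumulation index: log2(Cn / (k * Bn)), contents in mg/kg."""
    if c_n <= 0 or b_n <= 0:
        raise ValueError("contents must be positive")
    return math.log2(c_n / (k * b_n))


def igeo_class(value):
    """Map an Igeo value to a class from 0 to 6."""
    if value <= 0:
        return 0
    return min(math.ceil(value), 6)


# Illustration only: a content of 0.99 mg/kg against a reference of 0.22 mg/kg.
example = igeo(0.99, 0.22)
label = CLASS_LABELS[igeo_class(example)]
```

With these inputs the ratio Cn/(k·Bn) is about 3, so the index is about 1.58 and falls in the moderate class. Boundary handling (whether a value of exactly 1 belongs to the lower or upper class) differs between published tables and should follow Table 2.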

Data statistics and analysis

In this study, ArcGIS 10.2 was used to arrange sampling points; WPS Office and IBM SPSS Statistics 27 were used for the statistical analysis of heavy metal content in the soil. OriginPro 2018c and Surfer 15.0 were used to construct data charts. The cumulative frequency curve was used to determine the geochemical baseline, and the geo-accumulation index was used to analyze the accumulation degree and spatial distribution of heavy metals in the study area.

Analysis and discussion

Statistical analysis of heavy metals in soil

According to the statistical results of soil heavy metal content in the study area (Table 3), the average values of As, Hg, Co, Cd, Pb, Cu, Ni, Zn, and Mn are 14.06, 0.09, 19.39, 0.99, 31.01, 37.08, 47.77, 121.10 and 960.20 mg/kg, respectively. The mean values of heavy metals in the soil of the study area are significantly higher than the geochemical background values of surface soil in Sichuan Province. The coefficient of variation reflects the uniformity and variability of heavy metals in soil. A higher coefficient of variation indicates a more uneven distribution of an element in soil and a stronger influence of human activities43. The coefficients of variation of Co and Pb in the soil of the study area are 15–35%, which corresponds to moderate variation and is mainly controlled by landform and soil parent material44. The coefficients of variation of As, Hg, Cd, Cu, Ni, Zn, and Mn are greater than 35%, which corresponds to high variation and indicates that their distribution in the soil is uneven and significantly affected by human activities.

In addition, the skewness and kurtosis values are 4.83 and 27.20 for Cd, 3.95 and 18.71 for Ni, 4.15 and 20.92 for Zn, 2.15 and 6.92 for Hg, and 2.39 and 8.49 for Cu. The probability distributions of these data are right-skewed, and the contents of Cd, Ni, Zn, Hg, and Cu deviate from a normal population, reflecting that human activities have had a certain impact on the soil quality in this area.
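The variability measures discussed above can be reproduced with a short sketch. It is illustrative only: the function names are hypothetical, population (not sample-adjusted) moments are used, so results can differ slightly from SPSS output, and the "low" label for coefficients below 15% is an assumption because the text only names the moderate (15–35%) and high (>35%) ranges.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Coefficient of variation in percent (population standard deviation)."""
    return 100.0 * pstdev(values) / mean(values)


def variation_level(cv_percent):
    """Classify variability; the 'low' label below 15% is an assumption."""
    if cv_percent > 35:
        return "high"
    if cv_percent >= 15:
        return "moderate"
    return "low"


def skewness(values):
    """Moment-based skewness; positive values indicate a right-skewed sample."""
    m = mean(values)
    s = pstdev(values)  # must be non-zero, i.e. the values are not all equal
    return sum(((v - m) / s) ** 3 for v in values) / len(values)
```

A sample with a few very high contents, such as a handful of Cd hot spots, yields both a high coefficient of variation and a clearly positive skewness.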

Evaluation results and analysis

The geo-accumulation index (Igeo) classifies soil heavy metal content into grades (pollution degrees) based on the ratio of the measured content to its background value. The calculation results are shown in Table 4. The geo-accumulation indices range from −1.82 to 0.76 (As), −2.90 to 2.04 (Hg), −1.66 to 0.35 (Co), 0.02 to 7.06 (Cd), −1.40 to −0.17 (Pb), −2.46 to 1.72 (Cu), −0.90 to 2.21 (Ni), −1.06 to 1.65 (Zn) and −3.13 to 1.21 (Mn). The average geo-accumulation indices of As, Hg, Co, Cd, Pb, Cu, Ni, Zn, and Mn are −0.29, −0.36, −0.47, 1.88, −0.60, −0.54, −0.18, −0.17 and −0.18, respectively. The order of the average values from high to low is Cd > Zn > Ni = Mn > As > Hg > Co > Cu > Pb; Cd is at the moderate pollution level, and the other heavy metals are at the pollution-free level. According to the classification standard of the geo-accumulation index, the soil heavy metals in the study area are mainly non-polluting or lightly polluting. However, Cd, Hg, Cu, Ni, Zn, and Mn each show a certain proportion of moderate pollution (43, 7, 2, 3, 2, and 1 points, respectively). Hg and Ni each have one soil point at the moderate–strong pollution level. Cd has six strong pollution points, four strong to extremely strong pollution points, and five extremely strong pollution points.

Calculation of geochemical baseline

The geochemical baseline value is determined with the relative cumulative frequency method, which requires the inflection points of the cumulative frequency curve to be located accurately. To this end, box plots of the soil heavy metal elements in the study area were drawn first. Values lying more than 1.5 times the interquartile range beyond the quartiles were identified as outliers and excluded step by step, with a normal distribution test applied after each step18. After screening, the absolute values of the skewness coefficient (SK) and the kurtosis coefficient (K) are less than 1 (Table 3), which conforms to the normal distribution (Fig. 2).
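The box-plot screening step can be sketched as an iterative filter. This is a minimal illustration under stated assumptions: the function name, the stopping rule, and the sample data are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from the quartile convention of the software used in the study. The normality check that accompanies each round is not included.

```python
from statistics import quantiles


def screen_outliers(values, factor=1.5, max_rounds=50):
    """Repeatedly drop values beyond factor * IQR from the quartiles."""
    kept = list(values)
    for _ in range(max_rounds):
        if len(kept) < 4:
            break  # too few points left for meaningful quartiles
        q1, _median, q3 = quantiles(kept, n=4)
        iqr = q3 - q1
        low, high = q1 - factor * iqr, q3 + factor * iqr
        inside = [v for v in kept if low <= v <= high]
        if len(inside) == len(kept):
            break  # no further outliers: screening has converged
        kept = inside
    return kept


# Hypothetical contents with one extreme value, for illustration only.
screened = screen_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

Here the value 100 lies above the upper fence and is removed in the first round; the second round finds no further outliers, so the remaining eight values are returned.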

Fig. 2

Statistical box diagram and normal distribution test of soil heavy metal content in the study area (“b” represents before screening of outliers, “a” represents after screening of outliers).

Table 3 Descriptive statistics of heavy metals in soil in the study area.
Table 4 Geo-accumulation index and statistical analysis based on different evaluation criteria (N = 90).

The cumulative frequency curve of each element in the study area is shown in Fig. 3. Taking Hg as an example, the number of statistical samples is 81, and the relative cumulative frequency curve of Hg was drawn. The curve has two inflection points: inflection point 1 is 0.09 mg/kg, and inflection point 2 is 0.13 mg/kg. Therefore, the lower limit of the values affected by human activities is 0.13 mg/kg, and the contents below 0.09 mg/kg can be used to calculate the baseline value. The average of the data in this part was taken as the baseline value of Hg (0.06 mg/kg). The points with contents of 0.09–0.13 mg/kg may be affected by human activities or related to naturally high background values. The inflection point values of the other eight heavy metal elements were determined in the same way (Table 5), and the baseline values of As, Co, Cd, Pb, Cu, Ni, Zn, and Mn are 12.69, 16.47, 0.22, 30.15, 24.61, 34.54, 104.40 and 628.32 mg/kg, respectively.

Fig. 3

Relationship between heavy metal element contents and relative cumulative frequency and its inflection point.

Table 5 Geochemical baselines of heavy metals based on relative cumulative frequency (mg/kg).
Fig. 4

Spatial distribution of soil pollution based on geochemical baselines.

Spatial distribution characteristics of soil pollution based on geochemical baselines

In general, soil heavy metals mainly come from the soil parent rock. Within the same geological background area, the difference in soil heavy metal content is not significant under natural conditions. However, the redistribution of heavy metals in the secondary layer during natural soil formation, together with later human interventions of different types and intensities, such as urban automobile exhaust, domestic waste stacking, mineral resource development and utilization, and industrial and agricultural production, further widens the differences in soil pollution.

The spatial distribution map of heavy metal elements based on the geo-accumulation index was drawn (Fig. 4) to clarify the soil pollution status and its sources in the study area. The spatial distribution of soil heavy metal pollution varies with the degree of regional economic development. The high-value regions are in Jianping Township, Baolong Town, and Duping Township. The low-value regions are primarily in Miaoyu Town, Tonggu Town, and Guandu Town. For example, the distributions of As, Pb, Mn, and Cu pollution in the study area are consistent. Their high-value regions are located in Jianping Township, Baolong Town, Duping Township, the western part of Miaoyu Town, Hongchun Township and the southern part of Guandu Town; their low-value areas are located in Miaoyu Town, Tonggu Town, Guandu Town and Peishi Township. According to the analysis, Hg and Ni each have one moderate–strong pollution point, while Cd has six strong pollution points, four strong to extremely strong pollution points and five extremely strong pollution points. Cd pollution is mainly distributed in the west of Miaoyu Town, the south of Baolong Town, Dengjia Township, and Duping Township. On-the-spot investigation indicates that this region mainly develops agriculture and animal husbandry and that its industry is relatively developed, mainly including coal, power generation, casting, machine-made charcoal, and building materials.

In general, the spatial distribution of heavy metals in the soil of Wushan County differs markedly between areas. The soil quality is generally good, but heavy metal pollution is not negligible, and Cd pollution in particular should be investigated further. In view of the distribution characteristics and pollution status of soil heavy metals in the study area, effective measures should be implemented to prevent and control soil pollution. These include guarding against the leakage of heavy metals during industrial production and transportation, banning mining enterprises that cause severe heavy metal pollution, using irrigation water that meets agricultural water standards, reducing or avoiding fertilizers with high heavy metal content, and improving the utilization rate of fertilizers and pesticides. Such measures, taken across all sectors of society, can improve the current soil status and promote the comprehensive prevention and control of soil heavy metal pollution.

Discussion

Environmental geochemical baselines and environmental background values are both commonly used to evaluate heavy metal pollution. However, geochemical baseline values differ from geochemical background values. The geochemical background value represents the concentration of elements in natural materials without the impact of human activities. In contrast, the baseline value represents the concentration of elements measured at a given time at sites in areas disturbed by human activities, which is usually not the natural background. Taking Hg in this study as an example, the number of statistical samples was 81, and two inflection points were found in the relative cumulative frequency curve. Inflection point 1 was 0.09 mg/kg, and inflection point 2 was 0.13 mg/kg. The lower limit of the values affected by human activities was 0.13 mg/kg. The contents below 0.09 mg/kg were used to calculate the baseline value (0.06 mg/kg). The points with contents of 0.09–0.13 mg/kg may be affected by human activities or related to naturally high background values. The background value is usually more difficult to determine than the baseline value because of the wide range of human activities. Therefore, using baseline values for environmental quality assessment is more scientific and can more accurately reflect the pollution input that the local soil environment has received over a period of time, which helps to prevent and control soil environmental risks in advance45,46,47,48.

In addition, selecting appropriate evaluation criteria helps to distinguish the status of soil heavy metal pollution accurately. Taking Cd as an example, when environmental background values were used as the evaluation criteria, no sample points were non-polluting; all sample points showed some degree of pollution, including 5 sample points with extremely serious pollution. When the soil geochemical baseline values were used as the evaluation criteria, 50 sample points were uncontaminated and none showed extremely serious pollution. The main reason for these differences lies in the selection of evaluation criteria. The evaluation results based on large-scale environmental background values indicate a relatively high level of heavy metal pollution in the study area. In contrast, the evaluation results based on environmental geochemical baselines indicate a relatively low level of heavy metal pollution and reflect the soil pollution situation more accurately.
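The effect of the evaluation criterion can be made explicit with Eq. 1. Writing the two reference values as $B_{n}^{background}$ and $B_{n}^{baseline}$ (notation introduced here for illustration only), the measured content and the coefficient k cancel, so for a given element the two indices differ by the same constant at every sample point:

$$I_{geo}^{baseline} - I_{geo}^{background} = \log_{2}\left[\frac{C_{n}}{k B_{n}^{baseline}}\right] - \log_{2}\left[\frac{C_{n}}{k B_{n}^{background}}\right] = \log_{2}\left[\frac{B_{n}^{background}}{B_{n}^{baseline}}\right]$$

When the local baseline is higher than the large-scale background value, this constant is negative, and every sample point moves toward a lower pollution class, which is consistent with the shift described for Cd.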

With the long-term accumulation of human activities and the rapid development of modern industry and agriculture, soil environments that are entirely unaffected by human activities are difficult to find. In studies of soil environmental quality in some regions (such as mining areas), if reference standards are selected by adopting environmental background values from a larger scale, the evaluation results may be blurred or distorted by the superposition of multiple influencing factors. Therefore, establishing an environmental geochemical baseline for the research area is essential to compensate for the ambiguity in evaluation results caused by the lack of baseline values. Otherwise, the evaluation cannot truthfully and accurately reflect the actual pollution status of the soil and may even directly affect the formulation of comprehensive soil pollution control policies and the final treatment effect.

It should be emphasized that this study used only the relative cumulative frequency curve method to calculate the soil geochemical baseline values in the study area, so the research method is relatively simple. The methods for determining soil geochemical baseline values usually include standardization, statistical methods, geochemical comparison methods, and geological dating methods43,44. In the future, we will attempt to derive geochemical baseline values using multiple methods and compare the differences between them to identify the most suitable analytical approach. In addition, this article used only the geo-accumulation index method for soil pollution assessment. We will explore multiple pollution assessment methods to evaluate soil pollution status comprehensively based on the determined geochemical baseline values.

Conclusion

Based on the above analysis, the following conclusions can be drawn:

(1) The average values of As, Hg, Co, Cd, Pb, Cu, Ni, Zn, and Mn in the soil of the study area are 14.06, 0.09, 19.39, 0.99, 31.01, 37.08, 47.77, 121.10 and 960.20 mg/kg, respectively.

(2) The geo-accumulation index indicates that heavy metals in the soil of the study area are mainly non-polluting or lightly polluting. However, Cd, Hg, Cu, Ni, Zn, and Mn show a certain proportion of moderate pollution (43, 7, 2, 3, 2, and 1 points, respectively). Hg and Ni each have one soil point at the moderate to strong pollution level, while Cd has six strong pollution points, four strong to extremely strong pollution points, and five extremely strong pollution points.

(3) Based on the distribution characteristics of the cumulative frequency curves, the geochemical baseline values for Hg, As, Co, Cd, Pb, Cu, Ni, Zn, and Mn are 0.06, 12.69, 16.47, 0.22, 30.15, 24.61, 34.54, 104.40 and 628.32 mg/kg, respectively.

(4) The environmental geochemical baseline provides evaluation criteria for small-scale spatial areas and compensates for the ambiguity in evaluation results caused by the lack of reference values. The soil heavy metal environmental geochemical baseline determined here can scientifically and accurately identify the current concentration of pollutants in the research area.