Introduction

Superior crop quality and yield are intricately linked to the geochemical background of ecological and geological environments. In agricultural practice, high-quality crops often degrade across different regions even with identical crop varieties, cultivation techniques, climate, vegetation, topography, and companion crops1,2,3,4,5. Rare earth element (REE) distribution in plants is influenced by soil composition and biogeochemical properties6,7,8,9,10,11. The study of soil geochemistry aids in identifying the most suitable crops for various soil genetic types and recommends soil enhancement measures through the supplementation of specific trace elements and rock fertilizers to improve crop yields, thereby fundamentally enhancing the quality of arable land12.

Phosphorus promotes seed development and starch accumulation in tubers and roots13. It is critical for root development; phosphorus deficiency causes dull green leaves, stunted growth, and impairs reproductive organ formation and fruit development14. Potassium is essential for carbohydrate synthesis and translocation, indirectly influencing photosynthesis. It also enhances nitrogen metabolism, regulates plant protoplasm colloidal states, and improves resistance to cold, drought, lodging, and pests15. As soil potassium and phosphorus levels directly affect crop quality and yield, monitoring these elements is vital for understanding soil chemistry, tracking their geochemical behavior, and supporting agricultural cultivation and planning16,17,18,19,20,21.

Geochemical soil survey is a systematic method to quantify elemental content and other geochemical properties in soil22,23,24. Studies have explored how geomorphology, landscape features, climate, soil genesis, and elemental migration mechanisms influence the efficacy of this method25. Residual soil measurement is a mature and effective approach in chemical exploration. The effectiveness of soil measurement in the transport layer depends on the conditions of the measurement area. In wind-formed sand areas, soil measurement has advanced through sampling size interception tests; in organic soil areas, it has been enhanced by partial extraction techniques. Compared with traditional geochemical methods, remote sensing inversion for soil geochemistry offers advantages such as wide coverage, abundant data, advanced technology, rapid data acquisition, frequent updates, and dynamic monitoring.

The ZY1–02D satellite, launched in September 2019, is equipped with an embedded multispectral camera and hyperspectral imager (AHSI). It can acquire 9 band multispectral data over a 115 km swath and 166 bands hyperspectral data over a 60 km swath, covering the spectral range from the visible (400 nm) to the short - wave infrared (2,500 nm). The full chromatic band has a resolution of up to 2.5 m, and the multispectral bands have a resolution of 10 m. The AHSI can capture the spectral features of elements across 166 narrow bands, which is crucial for the inversion and analysis of soil chemical elements.

Shilan Felegari et al. utilized multi-temporal imagery, employing Support Vector Regression (SVR), Partial Least Squares Regression (PLSR), and Artificial Neural Networks (ANN) to estimate the concentration of Cd. The results indicated that the SVR model, using the original imagery as input, provided the most accurate estimation of Cd concentration in the region, ranging from 8 to 26 mg/kg. This method, however, relies on multispectral data with fewer bands, making it difficult to effectively extract the characteristic bands of Cd26. Mohammad Esmaeili et al. proposed the ResMorCNN model, which integrates 3D CNN with morphological feature residual injection to extract spatial-spectral features from hyperspectral images. On datasets such as Indian Pines, it achieved an Overall Accuracy (OA) of 99.71%, significantly outperforming traditional CNN and attention models, validating the enhancement provided by morphological features for classification. However, this approach relies on Principal Component Analysis (PCA) for dimensionality reduction (using the first three principal components), which may lead to the loss of crucial spectral information, thus affecting the differentiation of complex land cover27. Saeideh Marzvan et al. analyzed the expansion trend of Azolla filiculoides in the Anzali Lagoon, Iran, using Landsat time series (1988–2018) and Spectral Angle Mapper (SAM). The use of multispectral data, however, hindered fine-scale (SAM). The use of multispectral data, however, hindered fine-scale resolution28. Gomez Cécile explored remote sensing monitoring of soil geochemical elements, focusing on the continuum removal method and partial least squares regression (PLSR) for predicting soil clay and calcium carbonate content using visible and near-infrared (VNIR, 400–1200 nm) and short-wave infrared (SWIR, 1200–2500 nm). Results showed PLSR’s advantage with Hymap hyperspectral data, especially when soil spectral features are weak29. Hummel John W applied multiple regression techniques, including MLSR, ANN, random forest, and BP neural networks, to invert the content of soil elements like As, Fe, Hg, Cu, Mo, Zn, and Pb in mine soils, confirming the effectiveness of remote sensing spectroscopy in detecting these elements29,30,31. Additionally, research on soil rare earth element inversion in Anxin County achieved an R² of 0.982 using partial least squares, random forest, and BP neural networks32. Yuehan Qin, using GF-5 data, measured Au content in the Chahuazhai gold mining area, demonstrating that the geographically weighted regression (GWR) model, combined with S-G filtering, provided better fitting for Au content, though requiring a uniform sample distribution33. Mehrdad Daviran developed hybrid models combining Particle Swarm Optimization (PSO) with Support Vector Machine (SVM) and Random Forest (RF) to predict copper mineralization in southeastern Iran, showing superior performance over traditional methods, despite high computational costs and kernel selection issues34.

Seyed Mahdi Mirhoseini Nejad et al. proposed the ConvLSTM-ViT model, which integrates 3D-CNN, ConvLSTM, and Vision Transformer (ViT) to predict soybean yield using multispectral remote sensing data. Experiments showed that the model outperformed traditional methods in terms of Root Mean Square Error (RMSE) and correlation coefficients. This method, however, requires long-term, high-quality multispectral data and suffers from an abundance of model parameters, which complicates parameter setting35. Arezou Akhtarmanesh et al. improved the UNet model by incorporating attention blocks in the decoder and addressing class imbalance in the DeepGlobe dataset through data augmentation (slicing, rotation). The model achieved an accuracy of 98.33% in road extraction. However, based on the attention mechanism, large data volumes demand significant computational power, making it resource-intensive36. Hadi Mahdipour et al. utilized multiple satellite images and adopted an “ultrafusion” method for land cover segmentation. Compared to the latest similar methods, the overall accuracy, Kappa coefficient, and F1 score improved by approximately 0.86%, 0.52%, and 1.03%, respectively. The method faces challenges related to multi-source data matching and high computational requirements37. Mohammad Mahdi Safari et al. combined the U-Net architecture with backbone networks such as VGG and DenseNet, using Sea Surface Temperature (SST) data to identify mesoscale vortices in the Atlantic. Using sparse classification cross-entropy loss, the model achieved an accuracy of 99.37% on multiple datasets, demonstrating the potential of deep learning in ocean dynamics research. However, the method is limited by small image resolution37. Nizom Farmonov et al. proposed the HypsLiDNet framework, which integrates 3D/2D CNN and morphological attention mechanisms, combining hyperspectral (HSI) and LiDAR data for crop classification. Morphological operators are used to extract geometric features, and the attention mechanism optimizes feature fusion. Experiments on the DESIS and Houston datasets achieved high classification accuracy (OA of 98.67%), outperforming traditional machine learning and deep learning methods. The method requires synchronized acquisition of LiDAR and HSI data, which presents significant challenges. The model is also complex and processing is slow38. Alireza Vafaeinejad et al. utilized the Segmentation Anything Model (SAM) for high-precision segmentation, achieving a significant Intersection over Union (IoU) score of 92%, which notably surpasses traditional methods. However, this method struggles with the recognition of small-area features and lacks sufficient accuracy in complex scenes. Despite GPU acceleration, large-area computations still require considerable processing power39. Alireza Sharifi et al. developed a deep learning model based on Transformer architecture to enhance the spatial resolution of Sentinel-2 images. The model outperforms advanced methods such as ResNet, Swin Transformer, and ViT. Despite its lightweight design, the model’s long training times restrict its application to large-scale areas40.

Against the backdrop of advancements in remote sensing technology and machine learning for environmental monitoring, this study focuses on using hyperspectral data for soil geochemical analysis. While previous research has explored remote sensing applications across diverse environmental domains, significant gaps remain in high-precision soil geochemical inversion—particularly for potassium (K) and phosphorus (P) content. Traditional methods such as support vector regression (SVR) and partial least squares (PLS) have achieved notable results in solving complex problems like pattern recognition, regression prediction, and time-series analysis. However, these approaches often require extensive feature selection, manual parameter tuning, or complex kernel functions, which can limit model efficiency and scalability. Additionally, although deep learning techniques are powerful, they often incur high computational costs in certain scenarios and are prone to overfitting. In contrast, the Extreme Learning Machine (ELM) has emerged as a promising alternative due to its ability to capture nonlinear data features through random feature mapping. This unique characteristic of ELM not only reduces the complexity of feature engineering but also enhances computational efficiency, offering faster training speeds and simpler parameter optimization.

Therefore, this study aims to achieve high-precision inversion of soil geochemical potassium content by exploring the relationship between AHSI hyperspectral data from the ZY1-02D satellite and soil K/P geochemical contents. A novel processing approach is proposed, integrating Savitzky-Golay filtering, Minimum Noise Fraction (MNF) transformation, spectral differentiation, and continuum removal to enhance spectral features. A “vegetation-bare soil” zonal inversion paradigm is established, using the ELM algorithm to invert soil K/P geochemical contents. Model reliability is evaluated using four parameters-R², MSE, RMSE, and training time-to explore the potential of hyperspectral data in high-precision soil geochemical inversion and propose a scalable framework for regional geochemical mapping. The research findings not only provide a new method for soil nutrient status analysis but also offer critical support for precision agriculture practices and cultivated land quality improvement in plateau regions.

Materials and methods

Study area

The study area (97°31′ and 98°16′ E longitude and 24°24′ and 25°20′ N latitude) is located in Yingjiang County, Dehong Dai Jingpo Autonomous Prefecture, Yunnan Province, with a total study area of 100 km2and it is 3.8 km from Yingjiang County. The region experiences a southern subtropical monsoon climate, characterized by an average annual temperature of 19.3 °C and an average annual precipitation of 1464 m.

The regional stratigraphy of the Yingjiang area is mainly composed of the Paleozoic Gaoligongshan Group and the Holocene and Pleistocene. The rock formations consist of black cloud schist, dolomite schist, black cloud diorite meta granite, and black cloud dio-rite gneiss. The study area is predominantly granite, with significant mica content in the soils, particularly in the rice-growing regions. The soils are chiefly composed of brick red loam, red loam, yellow loam, yellow-brown loam, and brown loam. The majority of soils in the county are phosphorus-deficient, acidic, and exhibit an imbalanced nutrient ratio. The soil types and their altitudinal distribution are as follows: brick red loam at 210–600 m, red loam at 600–1400 m and 1400–2000 m, yellow loam at 2000–2300 m, yellow-brown loam at 2300–2800 m, and brown loam at 2800–3400 m. Notably, the brown loam is found at altitudes around 2800 m (Fig. 1).

In 2021, the grain cultivation area in the study area is 117.63 million hectares, in which the rice cultivation area is 29.92 million hectares41,42.

Fig. 1
figure 1

(a) Map of China; (b) Map of Yunnan province; (c) Location of the study area.

Data

ZY1-02D hyperspectral data

The study primarily utilized ZY1-02D hyperspectral data from December 14, 2020, purchased from the Yunnan Remote Sensing Center, with the data ID “ZY1E_AHSI_E97.85_N24.61_20201214_006588_L1A0000208280.” The surface reflectance of the L1A data was obtained through radiometric calibration and atmospheric correction (Table 1). Subsequently, noise removal, spectral smoothing, and data enhancement were performed on the hyperspectral data using Savitzky-Golay filtering and spectral transformation techniques, including differential processing and continuum removal.

Table 1 The parameters of the advanced hyperspectral imager (AHSI) data of the ZY1-02D satellite.

Geochemical data

A total of 856 soil geochemical samples were collected in the study area. Sampling density ranged from 4 to 16 points per square kilometer, with most samples collected from cultivated land. Other land use types were sampled at 4 points per square kilometer to ensure comprehensive coverage and avoid gaps. Soil samples from arable and forest lands were collected at a depth of 0–20 cm. Using each GPS positioning point as the center, 4–6 subsamples were taken within a 50–100 m radius to form a mixed sample of equal volume.

Collected samples were fully air-dried and sieved through a 10-mesh screen. A 200-g subsample was weighed and sent for analysis, while at least 300-g subsamples were stored in clean plastic bottles and transferred to a sample repository. Analytical procedures strictly followed the Specification for Multi-target Regional Geochemical Survey (DZ/T 0295–2016) and Technical Requirements for Analysis of Ecological Geochemical Evaluation Samples (Trial) (DD2005-03). Potassium content was measured by inductively coupled plasma atomic emission spectrometry (ICP-AES), and phosphorus content was determined using the alkali fusion-molybdenum antimony anti-spectrophotometric method.

Analytical quality was ensured through a combination of external quality control (EQC) and internal quality monitoring (IQM). EQC involved inserting password-coded standard control samples; accuracy and precision of external standards ranged from 97 to 100%, correlation coefficients (r) were 0.966–0.999, and F-test values for two-sample ANOVA were 1.02–1.09 (all < one-tailed F critical values). IQM tracked quality parameters including accuracy/precision of national standard substance analyses, reporting rates, repeatability tests, and anomaly checks. For IQM: accuracy/precision pass rates for all indices were 100%; reporting rates ranged 98.2–100%; repeatability test pass rates were 95.8–100%; and anomaly repeatability test pass rates were 95.3–100% (Table 2).

The average potassium content is 28.42 g/kg, falling within the medium range for agricultural land (20–30 g/kg), but with significant spatial heterogeneity (standard deviation 6.82, variance 46.50). The range is as wide as 45.05 g/kg (3.49–48.54), indicating a phenomenon of both local enrichment and depletion. The left-skewed distribution (skewness − 0.62) and low kurtosis (1.37) suggest that the data is concentrated in the lower value range and has a relatively flat distribution, which may be related to the differences in mountain parent materials and the mobile nature of potassium. Agricultural management should focus on regions with potassium depletion (where low-value areas dominate) and hotspots of enrichment (where the maximum value exceeds the mean by 1.7 times).

The average phosphorus content is 0.78 g/kg, which is below the critical value for effective phosphorus in farmland (1.0 g/kg). Overall, it shows the characteristic of “general deficiency, local excess.” The standard deviation of 0.26 (33% of the mean) and the strong right-skewed distribution (skewness 1.37) indicate that phosphorus is deficient in most areas, but there are a few high-value outliers (the maximum value is 2.04 g/kg, 2.6 times the mean). The peaked distribution (kurtosis 3.17) shows that the data is highly concentrated in the lower value range. Combined with a range of 1.85 g/kg (0.19–2.04), this reflects an imbalance in the spatial application of phosphorus fertilizers.

Using ZY1-02D hyperspectral remote sensing data, vegetation index normalization categorized sampling points in the study area into 235 within vegetated areas and 621 in bare areas for chemical analysis (Fig. 2).

Table 2 The statistical treatments of Raw values of 2 analyzed elements.
Fig. 2
figure 2

The distribution of the geochemical soil samples.

Fig. 3
figure 3

The flow chart of the research process.

Research process

The research flowchart is shown in Fig. 3. This study primarily includes three parts: (1) Hyperspectral data preprocessing: Noise is addressed via S-G filtering and MNF methods, followed by differentiation and continuum removal transformations to eliminate variations, thereby enhancing absorption and reflection features in spectral curves; (2) Selection of characteristic spectral bands with significant correlations to corresponding soil geochemical elements; (3) Development of predictive models for the two elements using Multiple Linear Regression (MLR), Partial Least Squares Regression (PLSR), BP, and ELM regression analysis, with the optimal model selected for soil element content inversion in the Yingjiang region.

Data preprocessing

First, the raw ZY1-02D hyperspectral data are pre-processed, which includes radiometric calibration, atmospheric correction, and orthorectification. Specifically, radiometric calibration converts the digital number (DN) of the image to radiometric brightness, reflectance, or surface temperature. Atmospheric correction compensates for atmospheric effects using the MODTRAN radiative transfer model. Additionally, orthorectification corrects the image for tilt and projection distortions.

After removing atmospheric and geometric distortion effects, the differences between the spectral curves after atmospheric correction and the USGS standard spectral library are compared in the selected broad-leaved woodland area (Fig. 4).

Fig. 4
figure 4

The ZY1-02D spectral profile after radiometric calibration and FLAASH.

Savitzky-Golay filter

Due to the influence of imaging instruments and environmental factors, the hyperspectral data contain varying degrees of noise in both the imaging and pre-processing stages. This noise affects both the spatial and spectral domains, reducing the advantages of hyperspectral resolution. To address this issue, the pre-processed ZY1-02D hyperspectral data were filtered using the Savitzky-Golay (S-G) filtering method43.

The SG filtering method is based on local polynomials in the time domain using least squares fitting. It efficiently reduces noise compared to traditional methods by directly moving a “window” in the time domain. This approach preserves spectral characteristics such as extrema and width while effectively eliminating noise.

$$\:{R}_{i}^{{\prime\:}}=\frac{1}{H}\sum\:_{j=-k}^{k}{R}_{i+j}*{h}_{j}$$
(1)

where \(\:{R}_{i}^{{\prime\:}\:}\) is the fitted value, \(\:{R}_{i+j}\) is the original value of the pixel, \(\:\frac{{h}_{j}}{H}\) is the smoothing coefficient, which is obtained by fitting a polynomial by the least squares method, k is the number of unilateral bands to be fitted.

After SG filtering, the spectral curve becomes smoother compared to the atmospherically corrected result. The absorption and reflection intervals of the spectral curve are clearly visible (Fig. 5).

Fig. 5
figure 5

The spectral profile after S-G filtering.

Minimum noise fraction rotation(MNF)

Minimum Noise Fraction (MNF) is commonly used in hyperspectral data processing, particularly in the field of remote sensing. Its fundamental concept involves applying a linear transformation to hyperspectral data in order to separate the noise components from the valid signal components, thereby enhancing the effectiveness of subsequent processing. MNF mainly involves two principal component transformations. The first transformation uses the noise covariance matrix to reorder and separate the noise in the data(Fig. 6); the second transformation is mainly the standard principal component change of the noise-whitened data, and it alleviates the impact of noise on image quality compared to principal component analysis (PCA)44.

$$\:P=U*{D}_{N}^{-\frac{1}{2}}$$
(2)

where \(\:{D}_{N}\) is the diagonal matrix of \(\:{C}_{N}\) eigenvalues in descending order, \(\:I\) is the unit matrix, and \(\:P\) is the transform matrix. When \(\:P\) is applied to the image data \(\:X\), the original image is projected into the new space by applying the \(\:Y=PX\) transformation, and the noise in the resulting transformed data has unit variance and is not correlated between bands.

$$\:{C}_{D-adj}={P}^{T}*{C}_{D}*P$$
(3)

where \(\:{C}_{D}\) is the covariance matrix of image X; \(\:{C}_{D-adj}\) is the matrix after P-transformation, and it is further diagonalized into a matrix \(\:{D}_{D-adj}\)

$$\:{D}_{D-adj}={V}^{T}*{C}_{D-adj}-V$$
(4)

where, \(\:{D}_{D-adj}\) is the diagonal matrix of eigenvalues of \(\:{C}_{D-adj}\) in descending order; \(\:V\) is the orthogonal matrix composed of eigenvectors.

Fig. 6
figure 6

The MNF analysis curve.

Continuum removal

The continuum removal method is a commonly used spectral analysis method to compare feature values with other spectral curves, which can highlight the absorption and reflection features of the spectral curve by normalizing the absorption features of the spectrum to a consistent spectral background. The band depth is subtracted from the envelope spectrum by 1 46, indicating the depth of the formed absorption valley due to the absorption spectral features of the soil chemical components at a certain wavelength point with lower reflectance than the neighboring bands. The calculation formula is calculated as follows:

$$\:{R}_{H}^{{\prime\:}}\left({\lambda\:}_{i}\right)=1-\frac{R\left({\lambda\:}_{i}\right)}{{R}_{C}\left({\lambda\:}_{i}\right)}$$
(5)

where \(\:{\lambda\:}_{i}\:\)is the spectral reflectance of the i-th band, \(\:R\left({\lambda\:}_{i}\right)\) is the spectral reflectance of the corresponding wavelength, \(\:{R}_{C}\left({\lambda\:}_{i}\right)\) is the band reflectance value on the envelope of the band, \(\:\frac{R\left({\lambda\:}_{i}\right)}{{R}_{C}\left({\lambda\:}_{i}\right)}\) is the envelope line, and \(\:{R}_{H}^{{\prime\:}}\left({\lambda\:}_{i}\right)\:\)is the depth of the band.

First-order differential

The spectral differential transform is a calculation of the spectral curve to increase the trend of the spectral curve and expand the spectral features while eliminating the noise effect of the spectral data, and the difference of the spectrum is generally used as a finite approximation of the differential in the actual calculation46.

$$\:{R}_{H}^{{\prime\:}}\left({\lambda\:}_{i}\right)=\frac{\left[R\left({\lambda\:}_{i+1}\right)-R\left({\lambda\:}_{i-1}\right)\right]}{\left({\lambda\:}_{i+1}-{\lambda\:}_{i-1}\right)}$$
(6)

where \(\:{\lambda\:}_{i+1}\), \(\:{\lambda\:}_{i}\), and \(\:{\lambda\:}_{i-1}\) are adjacent wavelengths; \(\:\text{R}\left({\lambda\:}_{i+1}\right)\),\(\:\:\text{R}\left({\lambda\:}_{i}\right)\), and \(\:\text{R}\left({\lambda\:}_{i-1}\right)\) are the spectral reflectance of the corresponding wavelengths; \(\:{R}_{H}^{{\prime\:}}\left({\lambda\:}_{i}\right)\:\)is the first-order differential of the wavelength \(\:{\lambda\:}_{i}\).

Second-order differential

$$\:{R}^{{\prime\:}{\prime\:}}\left({\lambda\:}_{i}\right)=\frac{\left[{R}^{{\prime\:}}\left({\lambda\:}_{i+1}\right)-{R}^{{\prime\:}}\left({\lambda\:}_{i-1}\right)\right]}{\left({\lambda\:}_{i+1}-{\lambda\:}_{i-1}\right)}$$
(7)

where \(\:{{\uplambda\:}}_{\text{i}}\) is the adjacent wavelength, \(\:{\text{R}}^{{\prime\:}}\left({{\uplambda\:}}_{\text{i}}\right)\) is the derivative of the corresponding wavelength, and \(\:{\text{R}}^{{\prime\:}{\prime\:}}\left({{\uplambda\:}}_{\text{i}}\right)\) is the second-order differentiation of the wavelength \(\:{{\uplambda\:}}_{\text{i}}\).

Characteristic spectra selection

To investigate the relationship between geochemical element contents and the reflectance in soil, the original and the transformed spectra are correlated with soil geochemical elements, respectively. The correlation coefficients are calculated as follows.

$$\:\text{R}=\frac{1}{n-1}\sum\:_{i=1}^{n}\left(\frac{{X}_{i}-\stackrel{-}{X}}{{S}_{X}}\right)\left(\frac{{Y}_{i}-\stackrel{-}{Y}}{{S}_{Y}}\right)$$
(8)

where \(\:X\) and \(\:Y\:\)are the spectral reflectance and the measured elemental content at the i-th wavelength, respectively; \(\:\overline{X}\) and \(\:\overline{Y}\:\)are the sample mean of the spectral reflectance and the measured elemental content, respectively; \(\:{S}_{X}\) and \(\:{S}_{Y}\) are the sample variance of the spectral reflectance and the measured elemental content, respectively; \(\:n\) is the number of samples; \(\:R\) is the correlation coefficient.

The images after S-G filtering are filtered by Pearson correlation analysis with the potassium and phosphorus elements of the measured chemical sampling sites, and the bands with a higher correlation of the respective elements are selected (Figs. 7 and 8).

Fig. 7
figure 7

(a) Correlation curves of soil geochemical potassium sample analysis data from vegetation cover areas to the removal of hyperspectral image bands from primary, differential, and continuum systems; (b) Correlation curves of soil geochemical potassium sample analysis data in bare areas to the removal of hyperspectral image bands by primary, differential, and continuum systems.

Fig. 8
figure 8

(a) Correlation curves of soil geochemical phosphorus sampling and analysis data from vegetation cover areas to the removal of hyperspectral image bands from primary, differential, and continuum systems; (b) Correlation curves of soil geochemical phosphorus sampling and analysis data in bare areas to the removal of hyperspectral image bands by primary, differential, and continuum.

Pearson correlation analysis revealed that the correlation of potassium content in vegetated areas is stronger compared to bare areas, with a significant correlation at the 0.01 significance level. The correlation coefficient curve for differential transformation in vegetated areas exhibits significant fluctuations, whereas the spectral curve of continuous unified removal transformation performs optimally in the ranges of 400–500 nm, 1100–1200 nm, and around 1650 nm. Following the third-order differential transformation, the correlation coefficients for potassium in bare areas are pronounced, with stronger correlations observed in the ranges of 1000–1100 nm, 1650–1700 nm, 1850–1900 nm, and 2200–2300 nm, generally exceeding 0.4. The correlation between phosphorus elements and the spectra is not evident, with most bands showing no significant correlation, and the correlation coefficients remaining within 0.20. After first-order differential transformation, significant correlations were observed at the 0.05 significance level in the ranges of 1650–1700 nm, 2150–2250 nm, and 2350–2400 nm (Table 3).

Table 3 The ZY1-02D hyperspectral remote sensing data and element correlation.

Based on data collection and previous experience, the characteristic spectral bands in regions of spurious correlation were eliminated. Ultimately, the characteristic spectral bands necessary for model construction were selected within the near-infrared (NIR) band.

Following the third-order differential transformation of the spectra, the potassium (K) content in the vegetation zone exhibited significant correlations at 979 nm, 1031 nm, 1929 nm, and 2334 nm. The potassium in the bare soil area showed significant correlations at 1014 nm, 1677 nm, 1880 nm, and 2216 nm. The third-order differential transformed spectral data were selected as the characteristic bands for potassium in the vegetation zone. Phosphorus in vegetated areas was significantly correlated at 1812 nm, 2031 nm, 2216 nm, and 2501 nm. The phosphorus in the bare soil area was significantly correlated at 816 nm, 1005 nm, 2216 nm, and 2501 nm. The third-order differential transformed spectral data were selected as the characteristic bands for phosphorus in the vegetation area.

Regression models

In this study, hyperspectral spectral data are modeled with measured soil geochemical data using multiple linear regression (MLR), partial least squares regression (PLSR), the BP neural network, and the ELM regression. Subsequently, the ZY1-02D hyperspectral spectral data obtained from the aforementioned correlation analysis are modeled with measured soil geochemical data for potassium (K) in the vegetation cover area, potassium in the bare area, phosphorus (P) in the bare area, and phosphorus in the bare area. Response models are established between the ZY1-02D hyperspectral spectral data and the soil geochemical measurements for potassium in the bare area and phosphorus in the vegetation cover area

In the construction of the mathematical models, 70% of the total sample data are used for model construction and the rest for model validation. In the construction of the two neural network models, 170 and 65 pieces of sample data are involved in model training and model validation for the vegetation cover area, respectively. 500, and 121 pieces of sample data are involved in model training and validation for the bare area, respectively

Multiple linear regression (MLR)

Multiple linear regression analysis predicts the dependent variable by Establishing a regression equation between multiple independent variables and the dependent variable, providing greater accuracy than traditional univariate regression analysis models22

$$\:\text{Y}={\beta\:}_{0}+{\beta\:}_{1}{X}_{1}+\dots\:+{\beta\:}_{j}{X}_{j}+\dots\:+{\beta\:}_{n}{X}_{n}+\epsilon\:$$
(9)

where \(\:Y\) denotes the characteristic to be analyzed, which is the content of soil geochemical elemental abundance in this study; \(\:{X}_{j}\) denotes the jth independent variable, which is the extracted spectral feature in this study is; \(\:{\beta\:}_{j}\) denotes the regression coefficient corresponding to the jth independent variable; \(\:\epsilon\:\:\)is the random error of the regression equation, and \(\:n\) is the number of independent variables used in modeling.

Partial least squares regression (PLSR)

Partial least squares regression (PLSR) is a novel multivariate statistical data analysis method that addresses the issue of multicollinearity among variables by performing a covariance test to identify the presence of covariance, determining the number of components, and establishing a partial least squares regression model. The model is straightforward to calculate, offers high predictive accuracy, and is easy to interpret qualitatively29.

$$\:\text{X}=\text{T}{P}^{T}+E$$
(10)
$$\:\text{Y}=\text{U}{Q}^{T}+F$$
(11)

where \(\:X\) is a matrix of predictors (element content), and \(\:Y\:\)is a matrix of responses (ZY1-02D spectra value); T and U are projections of \(\:X\) and \(\:Y\), respectively; P and Q are orthogonal loading matrices; matrices E and F are error terms. \(\:X\) and \(\:Y\) are decomposed to maximize the covariance between T and U.

Back propagation (BP) neural network

The BP neural network is a multilayer feedforward network trained using the error backpropagation algorithm, capable of learning and storing numerous input-output pattern mapping relationships32. It consists of an input layer, a hidden layer, and an output layer (Fig. 9).

The core idea of the BP algorithm is to utilize gradient descent for searching the hypothesis space of possible weight vectors to find the optimal weight vector that best fits the samples. Specifically, using a loss function, the algorithm iteratively adjusts weights and biases in the direction of the negative gradient until the loss function reaches a minimum value. Additionally, the backpropagation algorithm computes gradients, the partial derivatives of the loss function with respect to weights and biases in each layer, updating initial weights and biases iteratively until either the loss function minimizes or a predefined number of iterations is completed. This approach is crucial for optimizing parameters within neural networks.

The parameter setting of the neural network is mainly the number of layers and neurons in each layer. After several experiments, the initial parameter setting of the BP neural network for samples in the vegetation area is: the network is set to contain three layers with 30, 10, and 30 neurons respectively, the maximum number of epochs of the training network is 1000, the learning rate is 0.01, the maximum number of verification checks is set to 10, and the TrainParam goal is set to 0.0000000004.

The initial parameter setting of the BP neural network for samples in the bare area is: the network is set to contain three layers with 20, 20, and 20 neurons respectively, the maximum epoch of the training network is 1000, the learning rate is 0.1, the maximum number of validation checks is set to 5, and the TrainParam goal is set to 0.0000000004.

Fig. 9
figure 9

The structure of the backpropagation neural network model.

Excess learning machine (ELM)

This study attempts to apply it to the construction of inversion models for soil geochemical content. ELM is a class of machine learning systems or methods constructed on the Feedforward-Neuron-Network (FNN) for supervised and unsupervised learning problems. ELM is regarded as a special class of FNN or an improvement of FNN and its backpropagation algorithm, and it has several characteristics: the weights of the nodes in the hidden layer are randomly or artificially given and do not need to be updated, and the learning process only calculates the output weights. The standard ELM adopts the Single Layer Feedforward neuron Network (SLFN) structure. Specifically, the SLFN is composed of an input layer, an implicit layer, and an output layer (Fig. 10).

ELM has found applications in various fields that require the processing of complex data, pattern recognition, regression prediction, and time series analysis. Whether in finance, healthcare, intelligent transportation, or in areas such as natural language processing and image processing, ELM has demonstrated exceptional performance. Furthermore, as research progresses, its areas of application continue to expand. Traditional mathematical models, such as Support Vector Regression (SVR) and polynomial regression, typically require manual feature selection or the use of complex kernel methods to enhance model capability. In contrast, ELM captures the nonlinear characteristics of data through random feature mapping, reducing the workload of feature engineering. By mapping to a high-dimensional space in a random manner, ELM can effectively address nonlinear problems, enabling it to outperform traditional models such as linear regression or SVM in nonlinear regression tasks. Additionally, compared to deep learning, it offers advantages such as faster training speeds, simpler parameters, and a reduced risk of overfitting.

Fig. 10
figure 10

The structure of the excess learning machine model.

ELM can randomly initialize the input weights and thresholds and obtain the corresponding hidden node outputs. In terms of the structure of the neural network, the transcendental learning machine is a simple single-layer forward neural network, which contains three layers: the input layer, the hidden layer, and the output layer. The hidden layer has L neurons, L is much smaller than N, and the output layer outputs a vector of m dimensions.

$$\:\sum\:_{i=1}^{L}{\beta\:}_{i}g\left({a}_{i}*{x}_{i}+{b}_{i}\right)={f}_{L}\left({x}_{j}\right),j=\text{1,2},3,\cdots\:,N$$
(12)

where \(\:g\left(x\right)\) is the activation function; \(\:{{a}_{i}=({a}_{i1},{a}_{i2},\cdots\:,{a}_{in})}^{T}\) is the input weight of the ith hidden unit; \(\:{\beta\:}_{i}={({\beta\:}_{i1},{\beta\:}_{i2},\cdots\:,{\beta\:}_{in})}^{T}\) is the bias of the ith hidden unit; \(\:{f}_{L}\left({x}_{j}\right)\) is the output weight of the ith hidden unit.

Results

Model accuracy evaluation

After distinguishing between vegetation-covered and bare areas, the vegetation cover area and bare areas, the vegetation cover area was modeled using 170 soil geochemical test data points, with model performance evaluated using 65 measured sample data points and the corresponding predicted values. For the bare area, the model was developed based on 500 measured analytical data points and assessed by comparing 121 measured data points with the predicted values. The correlation between the measured and predicted values was evaluated using the coefficient of determination (R²), where a higher R² indicates a stronger correlation. The model was constructed by integrating the aforementioned parameters, and the models derived from the regression of two elements using different methods were compared and analyzed (Table 4).

For the K model in the vegetated area, the Partial Least Squares Regression (PLSR) and Multiple Stepwise Regression (MSR) models exhibit comparable accuracies, with an R² value of approximately 0.31. The accuracy of the ELM model substantially surpasses those of the MSR, PLSR, and Backpropagation (BP) models. The ELM model achieves the highest accuracy, with an R² of 0.654. While preserving the R² value, this model significantly diminishes the prediction error, and the model training time is reduced to 0.023 s. For the K-element model in the bare area, when compared to the MSR model (R² = 0.023), the PLSR model effectively enhances the inversion accuracy, with an R² of 0.262. Both the Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) are also notably reduced. The BP model yields an R² of 0.359, demonstrating approximately a 37% performance improvement relative to the PLSR model. It reduces the MSE and RMSE while maintaining model accuracy. The ELM model sustains a high R², and the model training time is 0.0032 s.

For the P models in both the vegetated and bare areas, the MSR and PLSR models have similar R², MSE, and RMSE values, indicating poor accuracy. Compared with traditional mathematical models and the BP model, the ELM model shows a huge improvement in accuracy, reaching approximately 0.33, and the model construction time is shortened to 0.003 s.

The performance stability of ELM in heterogeneous vegetation-bare land environments highlights its unique environmental adaptation mechanism. In vegetated areas, ELM’s prediction accuracy for potassium (R² = 0.654) was notably higher than that in bare land (R² = 0.617), with both significantly surpassing traditional methods. This indicates that the randomly generated hidden-layer basis functions of ELM can autonomously separate the spectral masking effects of vegetation canopies (e.g., chlorophyll absorption valleys) from soil background spectral features—a dynamic decoupling unattainable by linear methods like PLSR due to their fixed basis functions (exhibiting an R² difference of 0.355 in bare land). Furthermore, the MSE for phosphorus was lower in bare land (0.035) than in vegetated areas (0.041), suggesting that ELM is more robust to high-frequency surface noise (such as abrupt mineral reflectance variations). Additionally, during training, unlike the vanishing gradient problem encountered in BP for phosphorus prediction, ELM directly solves for output weights, significantly outperforming BP and enabling a much faster training process.

Table 4 The regression model analysis of the vegetation cover area with potassium.

The table demonstrates that, for both potassium and phosphorus, neural network-based models significantly outperform traditional mathematical models in accuracy. Specifically, the ELM model for potassium in vegetated areas exhibits a 10% improvement over the conventional BP model, with an R² of 0.654 and an MSE of 22.686. This suggests that the neural network regression model achieves optimal modeling accuracy. Considering both R² and MSE, the ELM model was selected as the inversion model for potassium. The residual plot, histogram, and Q-Q plot of the potassium element model in the vegetation zone are presented below (Fig. 11).

Fig. 11
figure 11

(a) Plot of standardized residuals; (b) Histogram of residual distribution. (c) Q-Q plot.

The regional ELM model shows a nearly 71% improvement over the BP model, with an R² of 0.617 and an MSE of 9.102. The extreme learning machine regression model was selected as the inversion model for potassium in the bare area. The residual plot, histogram, and Q-Q plot of the potassium element model in the bare soil zone are presented below (Fig. 12).

Fig. 12
figure 12

(a) Plot of standardized residuals; (b) Histogram of residual distribution. (c) Q-Q plot.

For phosphorus in vegetated areas, among the four models, the neural network-based inversion model exhibits a substantial accuracy improvement, approximately reaching 400%. The ELM model has an R² of 0.3354 and an MSE of 0.0414. Considering both R² and MSE, the ELM model was selected as the inversion model for phosphorus in the vegetated area. The residual plot, histogram, and Q-Q plot of the phosphorus element model in the vegetation zone are presented below (Fig. 13).

Fig. 13
figure 13

(a) Plot of standardized residuals; (b) Histogram of residual distribution. (c) Q-Q plot.

The inversion model accuracy for regional phosphorus has increased by 900%, with the ELM model showing an additional 58% improvement compared to the BP model. The R² of the ELM was 0.3314 with an MSE of 0.0351, and the accuracy of the ELM model was higher, so the ELM model was used for the inversion model of the phosphorus in the bare area. The residual plot, histogram, and Q-Q plot of the phosphorus element model for the bare soil zone are shown below (Fig. 14).

Fig. 14
figure 14

(a) Plot of standardized residuals; (b) Histogram of residual distribution. (c) Q-Q plot.

Element inversion

The optimal excess learning machine model obtained by training was inverted based on the obtained excess learning machine model for the potassium in the vegetation cover area, potassium in the bare area, phosphorus in the vegetation cover area, and phosphorus in the bare area (Figs. 15 and 16).

Fig. 15
figure 15

(a) The results of potassium inversion without differentiating between vegetation cover and bare areas. (b) The results of potassium inversion with differentiating between vegetation cover and bare areas.

Fig. 16
figure 16

(a) The results of phosphorus inversion without distinguishing vegetation cover and bare areas. (b) The results of potassium inversions in overlaid vegetation cover and bare areas.

Discussion

Actual data comparison

The distribution maps of potassium and phosphorus elements derived from the Extreme Learning Machine regression model showed a strong correspondence with the measured sample data from Taiping Street and Nongzhang Street. The maximum potassium content detected was 5.321 mg/kg in bare areas and 3.891 mg/kg in vegetated areas. Correspondingly, the highest phosphorus content observed was 2.390 mg/kg in bare areas and 1.974 mg/kg in vegetated areas.

The predicted geochemical potassium anomalies exhibited a variation trend analogous to that of the field geochemical exploration sampling analysis data (Fig. 17). Following the differentiation between vegetated and bare areas, the inversion results revealed a strong correspondence between potassium concentrations in areas A, C, and D of the study area and the measured sample processing data. Notwithstanding, in areas B, despite its low vegetation coverage, the results remained inconsistent with the measured data.

Fig. 17
figure 17

(a) The inversion results of potassium content in differentiated vegetation cover bare areas; (b) The interpolated content data of potassium samples.

The phosphorus inversion results after vegetation differentiation aligned with the high anomalies in the measured sample data for areas A, C, and E of the study area. Concurrently, phosphorus high-anomaly zones were concentrated in areas B, D, and F, though discrepancies with measured sample data were observed, with only some high-anomaly zones remaining consistent. The inversion results were indistinct in the critical vegetation-covered and bare soil areas of D and E (Fig. 18).

Fig. 18
figure 18

(a) The inversion results of phosphorus content in bare areas with differentiated vegetation cover; (b) The interpolated phosphorus content data of the measured samples.

Through field verification and comparison with the potassium element inversion distribution, it was observed that at field points YJ01-R, YJ02-R, YJ03-R, YJ05-R, YJ06-R, YJ07-R, YJ08-R, YJ09-R, YJ11-R, YJ12-R, YJ13-R, YJ14-R, YJ15-R, YJ17-R, YJ18-R, YJ19-R, YJ20-R, and YJ25-R, the majority of soils were rich in mica. Mica minerals contain abundant potassium, explaining why soils with higher mica content generally exhibit elevated potassium levels. This observation closely aligns with the potassium element inversion distribution (Fig. 19).

Fig. 19
figure 19

(a) Field-based verification overlaid with the potassium element inversion distribution; (b) On-site verification of soil conditions.

Undifferentiated vegetation comparison

After differentiating vegetation using normalized vegetation indices, the distribution of soil geochemical potassium elements in the study area was generally consistent with the inversion results from undifferentiated vegetation (see Fig. 20). Vegetation cover was categorized into four areas: A, B, C, and D. Among these, Zones A and D exhibited higher vegetation cover. The results showed scattered high anomalies of soil geochemical potassium elements in vegetated areas, with overall low potassium content in these regions. Both methods indicated that Zone C had the highest potassium content across the entire study area.

However, the ELM model was used to select distinct characteristic spectra for inverting bare areas within vegetated zones. Results indicated that this approach emphasized potassium high-anomaly areas in the southern Dayingjiang River region. Compared with the undifferentiated vegetation geochemical inversion method, the ELM model delineated the spatial distribution of potassium high anomalies across the entire study area while preserving the accuracy of low-potassium-content regions, effectively filtering high-value areas.

Fig. 20
figure 20

(a) The inversion results of potassium content after distinguishing vegetation (b) The inversion results of potassium content without distinguishing vegetation.

The inversions of phosphorus exhibited significant variations in vegetation areas A and D. In area A, the inversions showed a notable increase in the continuous distribution of high-value areas after vegetation differentiation, while the results in area B highlighted some high anomalous areas. The results for bare area C highlighted the high anomalous areas of phosphorus, which were consistent with the measured soil geochemical data, confirming the accuracy of extracting anomalies from the differentiated vegetation (Fig. 21).

Fig. 21
figure 21

(a) The inversion results of phosphorus content after distinguishing vegetation (b) The inversion results of phosphorus content without distinguishing vegetation.

Conclusions

This study presents a high-precision soil geochemical element inversion method by leveraging large-scale soil geochemical analysis data and ZY1-02D hyperspectral data. Through data transformations including denoising and enhancement, an ELM model is established to derive more accurate soil geochemical anomaly information for potassium. Experimental results show that this model currently exhibits higher operational efficiency and improved accuracy compared to the BP model.

The development of a high-precision soil geochemical survey method can, on one hand, effectively provide a scientific basis for regional agricultural planning, such as rice cultivation. By mapping the precise distribution of soil geochemical data, it offers recommendations for high-quality rice planting areas, thereby supporting the development of plateau-specific agriculture. On the other hand, this method can be extended to environmental pollution assessment by constructing relevant models for soil heavy metal concentrations, which are critical for engineering projects and human daily activities. This approach also serves as an important complement to existing technical methods. Compared with traditional soil geochemical surveys, which are time-consuming and labor-intensive, hyperspectral remote sensing technology enables more accurate prediction of soil geochemical content over larger areas.

The study is subject to certain limitations. The model construction focuses solely on the selected feature bands, without integrating the feature band selection with the model-building algorithm. In the future, methods for selecting feature bands will be incorporated to further streamline the model-building process, enhancing both its speed and accuracy, with the aim of achieving a more precise soil geochemical inversion model. Additionally, the number of sample points in vegetation-covered areas is comparatively smaller, which affects the model’s accuracy. In practical applications, high-density vegetation and varying vegetation types in these areas can significantly influence model precision. Therefore, when constructing the model, factors such as the region’s vegetation cover should be considered based on specific regional requirements. Moreover, other external factors may also impact the feature bands and model accuracy.