Introduction

The Loess Plateau is one of the most significant ecological barriers in China and globally1. The sustainability of its ecosystem services is closely linked to the lifeblood of the region’s environmental security and socioeconomic development. Habitat quality, as a core indicator of ecosystem services, directly reflects the region’s ability to maintain biodiversity and the overall health of its ecosystems2. However, the Loess Plateau has long been plagued by soil erosion, land degradation, and ecological fragility3. This has resulted in a steady decline in habitat quality and the deterioration of ecosystem services4,5. In recent years, climate change, rapid urbanization, and intensified human activities have made the Loess Plateau’s environmental problems more prominent6. It is therefore urgent to investigate the spatial and temporal evolution of habitat quality and its driving mechanisms in the region.

China has long attached great importance to environmental management on the Loess Plateau. From the 1950s to the 1970s, management focused on “soil and water conservation,” mitigating soil erosion through measures such as tree planting and terrace construction1,7. In the 1980s and 1990s, policy gradually shifted toward “ecological restoration,” with reforestation, grassland restoration, and environmental migration becoming the main tasks8. In the 21st century, the management strategy was further upgraded, and the concepts of “sustainable civilization construction” and “green development” became guidelines. The promotion of the national strategy of “environmental protection and high-quality development of the Yellow River Basin” has injected new policy impetus into the ecological restoration of the Loess Plateau8,9,10. Despite increasing policy support, the natural environment in this region still faces many difficulties, and the spatial and temporal evolution of habitat quality and its driving mechanisms have not been thoroughly studied.

Habitat quality is influenced by both natural and human factors. Natural factors, including precipitation, temperature, and topography, play a significant role11. Precipitation determines the water available for vegetation growth, but it is also closely related to soil erosion12. Adequate and evenly distributed precipitation promotes vegetation growth, which enhances habitat quality. Temperature affects vegetation growth cycles and species distribution13. Extreme temperature events threaten organism survival and reduce habitat quality14. Topography determines land drainage and the distribution of light and heat, and different terrains, such as mountains, hills, and plains, have different impacts on habitat quality15. For example, vertical climatic variations in mountainous areas create diverse ecological environments. These environments are conducive to biodiversity but may also exacerbate soil erosion risk due to the steepness of the terrain16. Human activities also affect habitat quality. Urbanization encroaches on natural habitats, leading to habitat fragmentation and the compression of living space17. Agricultural activities, such as excessive cultivation and irrational irrigation methods, can lead to soil degradation and desertification and reduce habitat suitability18. Pollution caused by industrial activities, such as air, water, and soil pollution, directly jeopardizes organism survival and reproduction and seriously damages habitat quality.

In previous studies, scholars used traditional methods such as correlation coefficient analysis, multiple regression analysis, spatial regression analysis, and geo-detectors19,20. In practice, however, there are often complex nonlinear interactions between factors21. When dealing with such nonlinearity and multidisciplinary datasets, these methods suffer from insufficient explanatory power and a limited ability to capture complex relationships22. To overcome these limitations, domestic and international studies have begun to apply more advanced models, including machine learning models such as elastic network regression (ENR) and random forest (RF), as well as geographically weighted regression (GWR) and multiscale geographically weighted regression (MGWR)23,24,25. XGBoost, as an advanced machine learning algorithm, has unique advantages in ecological research. Based on the gradient boosting framework, XGBoost integrates multiple decision trees to optimize prediction performance step by step26,27. It not only handles large-scale data efficiently but also achieves excellent prediction accuracy28. When dealing with the large amount of data involved in studying habitat quality on the Loess Plateau, XGBoost can quickly analyze complex data on environmental factors29 and explore potential patterns therein. Using the SHAP (Shapley Additive Explanations) algorithm, which is based on the Shapley value in game theory, this study further enhances the interpretability of the model30,31. SHAP quantifies each feature’s contribution to the model output and provides local and global explanations for the XGBoost model32. Combining the XGBoost and SHAP algorithms can therefore provide insights into the key factors affecting habitat quality.

This study makes two main innovations. First, the InVEST model is combined with XGBoost-SHAP in a multi-model fusion approach: the InVEST model assesses habitat quality, and the XGBoost-SHAP model analyzes the driving mechanisms in depth, thereby compensating for the shortcomings of a single model. Second, by revealing the complex nonlinear relationships between the factors affecting habitat quality, the XGBoost-SHAP model can provide a more scientific basis for ecological governance than linear methods33. SHAP values quantify the contribution of each influencing factor and thereby identify the key factors, which helps target environmental management strategies. This study aims to provide a solid theoretical foundation and practical guidance for scientific management and ecological restoration on the Loess Plateau. Using high-precision spatial and temporal analyses, the study reveals the dynamic characteristics of habitat quality changes on the Loess Plateau; identifies the key drivers of habitat quality and their respective contributions using the XGBoost-SHAP model; and analyzes in depth how the dominant factors drive habitat quality, so as to provide a scientific basis for ecological management policies. The results not only contribute to a deeper understanding of the ecosystem service functions of the Loess Plateau but also provide a valuable reference for the sustainable development of eco-sensitive areas.

Data sources and methods

Study area

Located in central China (103°24′–114°33′ E, 33°43′–41°16′ N), the Loess Plateau covers approximately 640,000 km2 across Shaanxi, Gansu, Ningxia, Shanxi, Henan, Qinghai, and Inner Mongolia (Fig. 1). It is the largest loess deposit globally, characterized by loess hills, plateaus, ridges, and mounds. Elevations range from 800 to 3,000 m, ascending westward. The deep loess layer, tens to hundreds of meters thick, consists of loose, porous aeolian loess with well-developed vertical joints. With temperatures averaging 6–14 °C and precipitation ranging from 300 to 600 mm, the region experiences a temperate continental monsoon climate. Water resources are limited, and the climate is arid. There is a lack of fertility and organic matter in the soil, primarily yellow loessal and dark loessal. The ecosystem is very fragile due to the sparse forest cover of only 15.29%24. In China, the Loess Plateau is a key area for soil erosion control and environmental restoration. Natural factors and human activities influence habitat quality here, varying significantly over time and space. The ability to understand these dynamics and their driving mechanisms is critical to sustainable development.

Fig. 1

The study area. Administrative boundary data of the Loess Plateau and 2020 land use data with a spatial resolution of 30 m were obtained from the Resource and Environmental Science Data Platform (https://www.resdc.cn/); DEM data with a spatial resolution of 30 m were obtained from the Geospatial Data Cloud (https://www.gscloud.cn/). The map data of China are based on the GS (2022) 1873 standard map downloaded from the National Administrative Division Information Query Platform (http://xzqh.mca.gov.cn/map). The map was generated from the above data in ArcGIS 10.7; the boundaries of the base map remain unchanged.

Data sources

The primary data utilized in this study are as follows:

(1) Land use data: Raster datasets of land use for the Loess Plateau for the years 1980, 1990, 2000, 2010, and 2020 were sourced from the Resource and Environment Science and Data Center (https://www.resdc.cn). These datasets are categorized into 7 major land types and 24 minor land types, with a spatial resolution of 30 m.

(2) Climate and environmental data: Digital Elevation Model (DEM) data were obtained from the Geospatial Data Cloud (https://www.gscloud.cn/), and slope and Terrain Ruggedness Index (TRI) data were derived from the DEM. All these datasets have a spatial resolution of 30 m. Temperature (Tem), precipitation (Pre), and soil erosion intensity data were provided by the Institute of Mountain Hazards and Environment, Chinese Academy of Sciences (https://imde.cas.cn/), all with a spatial resolution of 30 m. Normalized Difference Vegetation Index (NDVI) data were sourced from the Resource and Environment Science and Data Center (https://www.resdc.cn), with a spatial resolution of 250 m.

(3) Socio-economic data: Gross Domestic Product (GDP) data were obtained from the Resource and Environment Science and Data Center (https://www.resdc.cn), with a spatial resolution of 1 km. Nighttime Light (NTL) data were extracted from the Defense Meteorological Satellite Program’s Operational Linescan System (DMSP-OLS) Version 4 Nighttime Light Time Series (https://eogdata.mines.edu/), with a spatial resolution of 500 km. Population Density (Pop) data were derived from the WorldPop project (https://www.worldpop.org/), which provides a high-resolution dataset at 100 m.

(4) Other data: The map data of China are based on the GS (2022) 1873 standard map downloaded from the National Administrative Division Information Query Platform (http://xzqh.mca.gov.cn/map). River data were sourced from the National Geographic Information Resource Catalog Service System (https://www.webmap.cn/). Administrative division data of the Loess Plateau were sourced from the Resource and Environmental Science Data Platform (https://www.resdc.cn/).

Before analysis, all of the above data were harmonized to a common projected coordinate system, Krasovsky_Albers_1940.

Research methods

Habitat quality assessment

This study uses the Habitat Quality module of the InVEST model to evaluate the habitat quality of the Loess Plateau. The habitat quality index, ranging from 0 to 1, reflects the ecosystem’s capacity to sustain species23. The degree of habitat degradation is quantified using the following formula:

$$\begin{array}{*{20}c} {D_{xj} = \sum\limits_{r = 1}^{R} \sum\limits_{y = 1}^{Y_{r}} \left( \frac{\omega_{r}}{\sum\nolimits_{r = 1}^{R} \omega_{r}} \right) r_{y}\, i_{rxy}\, \eta_{x}\, S_{jr}} \\ \end{array}$$
(1)

Where: \(D_{xj}\) is the habitat quality stress intensity index of grid x in land use type j; r represents a threat factor, and R represents the number of threat factors; \(\omega_{r}\) is the weight of threat factor r, ranging between 0 and 1, where a weight closer to 1 indicates a greater influence on habitat quality; \(S_{jr}\) is the sensitivity of land use type j to threat factor r, ranging between 0 and 1, where a greater value indicates stronger sensitivity; \(r_{y}\) is the value of threat source r in grid y; and \(i_{rxy}\) is the threat level that \(r_{y}\) in grid y poses to grid x. The model provides two ways to calculate \(i_{rxy}\), with the following formula:

$$\begin{array}{*{20}c} {i_{{rxy}} = \left\{ {\begin{array}{*{20}c} {1 - \left( {\frac{{d_{{xy}} }}{{d_{{r~~\max }} }}} \right)\left( {Linear~decay} \right)} \\ {\exp \left[ { - \left( {\frac{{2.99}}{{d_{{r~~\max }} }}} \right)d_{{xy}} } \right]\left( {Exponential~decline} \right)} \\ \end{array} } \right.} \\ \end{array}$$
(2)

In the formula: \(d_{xy}\) denotes the distance between grids x and y, and \(d_{r\,\max}\) denotes the maximum influence range of threat factor r. A higher value of \(D_{xj}\) indicates a greater impact of the threat factors on habitat quality and hence a higher degree of habitat degradation. The habitat quality assessment formula is as follows25:

$$\begin{array}{*{20}c} {Q_{{xj}} = H_{j} \left( {1 - \frac{{D_{{xj}}^{z} }}{{D_{{xj}}^{z} + K^{z} }}} \right)} \\ \end{array}$$
(3)

Where: \(Q_{xj}\) is the habitat quality index of grid x in land use type j; \(H_{j}\) is the habitat suitability of land use type j, ranging between 0 and 1, where values closer to 1 indicate stronger suitability; z is the normalization constant; and K is the semi-saturation constant, generally set to half of the maximum value of \(D_{xj}\).

Based on the parameter settings in the relevant literature23,26,27 and the InVEST model manual, the reference value ranges for this model were determined; the specific parameter settings are shown in Tables 1 and 2. The semi-saturation parameter K adopts the default value of 0.5.
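To make Eqs. (1) to (3) concrete, the following minimal Python sketch evaluates them for a single grid cell. It is an illustration under stated assumptions, not the InVEST implementation: the threat table, distances, and suitability value are hypothetical; the factor \(\eta_{x}\) from Eq. (1) is simply passed in as `eta_x` (defaulting to 1); the linear decay is clamped at zero beyond the maximum distance; and z = 2.5 is assumed, since the text does not report the value used.

```python
import math


def decay_linear(d_xy, d_max):
    """Linear form of Eq. (2); clamped at 0 beyond d_max (an assumption)."""
    return max(0.0, 1.0 - d_xy / d_max)


def decay_exponential(d_xy, d_max):
    """Exponential form of Eq. (2)."""
    return math.exp(-(2.99 / d_max) * d_xy)


def degradation(threats, eta_x=1.0):
    """Eq. (1): total threat level D_xj for a single grid cell x."""
    total_weight = sum(t["weight"] for t in threats)
    d_xj = 0.0
    for t in threats:
        decay = decay_linear if t["decay"] == "linear" else decay_exponential
        # Each source is (r_y, d_xy): threat value in cell y and its distance to x.
        for r_y, d_xy in t["sources"]:
            i_rxy = decay(d_xy, t["d_max"])
            d_xj += (t["weight"] / total_weight) * r_y * i_rxy * eta_x * t["sensitivity"]
    return d_xj


def habitat_quality(h_j, d_xj, k=0.5, z=2.5):
    """Eq. (3): habitat quality Q_xj from suitability H_j and degradation D_xj."""
    d_z = d_xj ** z
    return h_j * (1.0 - d_z / (d_z + k ** z))


# Hypothetical threat table for one cell (all values are illustrative only).
example_threats = [
    {"weight": 1.0, "sensitivity": 0.8, "decay": "exponential", "d_max": 8.0,
     "sources": [(1.0, 2.0), (1.0, 6.0)]},
    {"weight": 0.6, "sensitivity": 0.5, "decay": "linear", "d_max": 4.0,
     "sources": [(1.0, 1.0)]},
]
d_example = degradation(example_threats)
q_example = habitat_quality(h_j=0.9, d_xj=d_example)
print(f"D_xj = {d_example:.3f}, Q_xj = {q_example:.3f}")
```

Two properties of Eq. (3) are easy to check with this sketch: when \(D_{xj} = 0\) the quality equals the suitability \(H_{j}\), and when \(D_{xj} = K\) the quality is exactly half of \(H_{j}\), which is why K is called the semi-saturation constant.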

Table 1 Threat factor parameter setting.
Table 2 Sensitivity of different landscape type to habitat threat factors.

Selection of influencing factors

In this study, we selected topographic, climatic, vegetation, soil, and socio-economic factors to construct a comprehensive evaluation system, aiming to systematically analyze the spatial differentiation of habitat quality and its driving factors on the Loess Plateau. In terms of natural carrying capacity, DEM, slope, and the terrain ruggedness index (TRI) regulate soil and water conservation capacity and the redistribution of material and energy through the topographic gradient effect, which directly affects ecosystem stability and the threshold of resistance to disturbance. Temperature (Tem) and precipitation (Pre) are important climatic factors that influence plant growth and ecosystem functioning by affecting the balance of heat and water. NDVI, as a quantitative measure of vegetation cover, together with soil erosion intensity, constitutes a dual indicator for diagnosing ecosystem health, with the former reflecting the level of primary productivity and the latter revealing the risk of surface cover degradation. In terms of anthropogenic pressure, GDP, nighttime light (NTL), and population density (Pop) map the intensity of human activities through their spatial heterogeneity; their gradient characteristics are closely coupled with the distribution of ecological footprints and can quantitatively characterize the pressure that industrialization, urbanization, and agricultural development exert on natural habitats. River data, as a carrier of hydrological connectivity, have special geographic value for maintaining ecological corridor function and conserving biodiversity. This systematic combination of data from different sources can show how the fragile ecosystem of the Loess Plateau responds to natural conditions, explain how human activities and the environment affect each other, and offer theoretical support for understanding the complex causes of regional differences in habitat quality.

XGBoost regression

This study used XGBoost-SHAP to evaluate the factors affecting habitat quality on the Loess Plateau. XGBoost iteratively integrates multiple decision trees to improve prediction accuracy and can handle complex nonlinear relationships and high-dimensional data19. Compared with traditional gradient boosting algorithms, XGBoost weights data points during processing to prioritize key observations, uses regularization to prevent overfitting, supports parallel computation to accelerate training, and has built-in handling of missing values. Owing to its strong learning capability and generalization performance, it excels in large-scale, high-dimensional data processing scenarios28,29,30. The XGBoost prediction is given by the following formula17:

$$\begin{array}{*{20}c} {\hat{Y}_{q}^{K} = \mathop \sum \limits_{{k = 1}}^{K} f_{k} \left( {C_{q} } \right) = \hat{Y}_{q}^{{K - 1}} + f_{K} \left( {C_{q} } \right)} \\ \end{array}$$
(4)

Here, \(\hat{Y}_{q}^{K}\) represents the predicted result for the q-th sample after K iterations, \(f_{k}\) is the function fitted by the k-th tree model, and \(C_{q}\) is the input value of the q-th sample.
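The additive update in Eq. (4) can be illustrated with a deliberately simplified sketch. The code below is plain gradient boosting for squared error with one-split regression stumps on a single input, not XGBoost itself: it has no regularization, no second-order terms, and no missing-value handling. The function names, the toy data, and the shrinkage factor `eta` (folded into each \(f_{K}\)) are assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Fit the best single-split regression stump (least squares) on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # every split leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=30, eta=0.3):
    """Build an additive model: each round adds one stump fitted to the residuals."""
    base = sum(ys) / len(ys)          # constant starting prediction
    trees = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        # Eq. (4): prediction after K trees = prediction after K-1 trees + f_K
        preds = [p + eta * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + eta * sum(tree(x) for tree in trees)

    return predict


# Toy data: a nonlinear relationship that a single linear term would miss.
toy_x = [float(i) for i in range(10)]
toy_y = [x * x for x in toy_x]
toy_model = boost(toy_x, toy_y)
print([round(toy_model(x), 1) for x in toy_x])
```

Each round can only reduce the training error, which is the step-by-step optimization the text describes; XGBoost adds regularization and other refinements on top of this basic scheme.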

SHAP algorithm

The resultant model exhibits the characteristics of an inscrutable “black box.” To conduct an in-depth analysis of the XGBoost regression model and to gain a more comprehensive understanding of the impact of various influencing factors on habitat quality, the SHAP model was employed to interpret the XGBoost model, thereby elucidating the specific contribution of each driving factor17. Furthermore, the SHAP model has the capability to unveil the precise manner in which each driver influences habitat quality21,30,31.

$$\begin{array}{*{20}c} {g\left( h \right) = \varphi _{0} + \mathop \sum \limits_{{j = 1}}^{L} \varphi _{j} } \\ \end{array}$$
(5)
$$\begin{array}{*{20}c} {\varphi_{j} = \sum\limits_{C \subseteq E\backslash \left\{ j \right\}} \frac{\left| C \right|!\left( \left| E \right| - \left| C \right| - 1 \right)!}{\left| E \right|!}\left\{ f\left( C \cup \left\{ j \right\} \right) - f\left( C \right) \right\}} \\ \end{array}$$
(6)

where g is the SHAP explanation model, \(\varphi_{0}\) is the average of all predicted values, L is the number of features, and \(\varphi_{j}\) is the SHAP value of the j-th feature. \(E\) is the set of all features, and \(C\) iterates over all subsets of \(E\) that exclude feature \(j\). The core term, \(f\left(C \cup \left\{j\right\}\right) - f\left(C\right)\), is the difference in model predictions before and after adding feature \(j\). The weighting coefficient, \(\frac{\left|C\right|!\left(\left|E\right| - \left|C\right| - 1\right)!}{\left|E\right|!}\), assigns weights to subsets \(C\) of different sizes.
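Eq. (6) can be evaluated exactly by enumerating subsets when the number of features is small. The sketch below does this in plain Python for a toy value function; it is meant only to illustrate the formula, not the tree-specific algorithm that SHAP uses for XGBoost models. The feature names `a`, `b`, `c` and the toy function `toy_value` are hypothetical stand-ins for \(f(C)\).

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every subset C of E without j (Eq. 6)."""
    n = len(features)
    phi = {}
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(n):
            # Weight |C|! (|E| - |C| - 1)! / |E|! depends only on the subset size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                c = frozenset(subset)
                total += weight * (value_fn(c | {j}) - value_fn(c))
        phi[j] = total
    return phi


def toy_value(present):
    """Hypothetical model output when only the features in `present` are known."""
    value = 1.0                                   # baseline, plays the role of phi_0
    if "a" in present:
        value += 0.3
    if "b" in present:
        value += 0.2
    if "a" in present and "c" in present:
        value += 0.1                              # interaction between a and c
    return value


toy_phi = shapley_values(toy_value, ["a", "b", "c"])
print(toy_phi)
```

In this toy case the interaction term is split equally between `a` and `c`, and the values satisfy Eq. (5): the baseline plus the sum of all \(\varphi_{j}\) equals the output with every feature present.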

Results and analysis

Spatio-temporal patterns of habitat quality

Habitat quality spatiotemporal characteristics

The habitat quality index serves as a crucial indicator of habitat suitability within a region. Overall, the mean habitat quality index of the Loess Plateau in 1990, 2000, 2010, and 2020 was 0.527, 0.525, 0.526, and 0.522, respectively, consistently remaining at a favorable level (Fig. 2). From a temporal perspective, habitat quality in the region declined from 1990 to 2000. A recovery was observed from 2000 to 2010, primarily attributed to the implementation of ecological conservation policies, such as the Grain-for-Green Program initiated in 1999, along with the continuous strengthening of regional environmental protection and management. After 2010, however, with the rapid progress of urbanization, construction land expanded significantly and continually encroached upon agricultural and ecological land, resulting in a renewed decline in the mean habitat quality index of the Loess Plateau.

To illustrate the evolution of habitat quality more intuitively, this study draws on the relevant literature and the specific conditions of the Loess Plateau24,32 and classifies the habitat quality index into five levels (Fig. 3): poor (0.0–0.2), relatively low (0.2–0.4), medium (0.4–0.6), relatively high (0.6–0.8), and high (0.8–1.0). Analysis of area changes (Fig. 4) reveals significant differences in the quantitative structure of habitat quality levels from 1990 to 2020. Overall, high-quality and low-quality habitat areas have increased, with increases of 1046.38 km² and 113.89 km², respectively. In contrast, low, medium, and relatively high-quality habitats have generally decreased, with reductions of 9829.24 km², 106.85 km², and 2615.36 km², respectively. Spatially, the distribution generally exhibits a pattern of higher quality in the southeast and lower quality in the northwest. High-quality habitat areas are primarily located in the southeastern regions of the Luliang Mountains and Taihang Mountains, as well as in the southern part of the Shanbei Plateau and the western part of the Longzhong Plateau. Relatively high-quality habitat areas are concentrated in the northern and central parts of the Loess Plateau and the eastern grassland regions of Inner Mongolia. Low-quality areas are predominantly found in the Mu Us Sandy Land and Kubuqi Desert regions, with scattered patches in the southeast. Generally, poor-quality areas are mainly distributed in a strip-like pattern in the north and southeast, while medium-quality areas are distributed around relatively low-quality areas.
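The five-level classification and the area tally behind such comparisons can be sketched as follows. This is a minimal illustration, not the authors' workflow: the label strings are placeholder names chosen for this sketch, values falling exactly on a break are assigned to the upper class (the text does not say how boundaries were handled), and the area conversion assumes square 30 m cells.

```python
from bisect import bisect_right
from collections import Counter

# Class breaks from the text; labels are placeholder names for the five levels.
BREAKS = [0.2, 0.4, 0.6, 0.8]
LEVELS = ["poor", "relatively low", "medium", "relatively high", "high"]


def classify(q):
    """Map a habitat quality index in [0, 1] to one of the five levels."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("habitat quality index must lie in [0, 1]")
    # bisect_right sends a value equal to a break into the upper class (assumption).
    return LEVELS[bisect_right(BREAKS, q)]


def area_by_level(values, cell_size_m=30.0):
    """Total area (km^2) per level for a flat list of cell values."""
    cell_km2 = cell_size_m ** 2 / 1e6
    counts = Counter(classify(q) for q in values)
    return {level: counts.get(level, 0) * cell_km2 for level in LEVELS}


# Hypothetical cell values for two dates; the change is the difference in area.
cells_start = [0.10, 0.35, 0.55, 0.55, 0.90]
cells_end = [0.10, 0.35, 0.35, 0.55, 0.90]
area_start = area_by_level(cells_start)
area_end = area_by_level(cells_end)
change = {level: area_end[level] - area_start[level] for level in LEVELS}
print(change)
```

Applied to two habitat quality rasters, the same difference of per-level areas yields increases and decreases of the kind reported above.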

Fig. 2

Spatiotemporal patterns of habitat quality in different periods. (a) 1990, (b) 2000, (c) 2010, (d) 2020, (e) sample of spatial variation in habitat quality. This map was created using the Habitat Quality module in InVEST 3.10.2 and ArcGIS 10.7, and it is based on the results obtained by running the land use classification result data.

Fig. 3

Degree of habitat quality degradation and quantitative changes in habitat quality.

Habitat quality degradation characteristics

Habitat degradation is classified into five levels (Figs. 3 and 4): no degradation, mild degradation, moderate degradation, high degradation, and severe degradation. In terms of area proportion, from 1990 to 2020 the areas of non-degraded and severely degraded habitats decreased by 6.97% and 0.14%, respectively, whereas the areas of mild, moderate, and high degradation increased by 4.05%, 2.49%, and 0.57%, respectively. As shown in Fig. 4, in 1990, habitat quality degradation on the Loess Plateau was primarily concentrated in the northwest and central regions and was characterized by moderate and high degradation. By 2000, the extent of degradation had expanded, particularly in the central and eastern regions, where the intensity of degradation increased and severely degraded areas began to emerge. By 2010, the degradation trend had further intensified, with severely degraded areas extending to the southeast and covering more extensive regions. By 2020, habitat quality degradation on the Loess Plateau had reached its peak, with severely degraded areas dominating most of the region, especially the eastern and southeastern parts, where degradation was most pronounced. Overall, the degradation of habitat quality on the Loess Plateau intensified from the northwest to the southeast over the study period.

Fig. 4

Degree of habitat quality degradation. (a) 1990, (b) 2000, (c) 2010, (d) 2020, (e) sample areas of significant degradation. This map was created using the Habitat Quality module in InVEST 3.10.2 and ArcGIS 10.7, and it is based on the results obtained by running the land use classification result data.

Analysis of the driving factors of habitat quality

The regression results of the XGBoost-SHAP model

The XGBoost regression model, after hyperparameter tuning, achieved an optimal parameter combination as follows: nrounds = 186, max_depth = 5, eta = 0.05, gamma = 0.3, colsample_bytree = 1, min_child_weight = 1, and subsample = 0.8. The test results demonstrated that the XGBoost regression model exhibits strong generalization capabilities (R²=0.802, MSE = 0.0287, RMSE = 0.169), indicating that the XGBoost model has high accuracy and reliability in explaining and predicting the impact of habitat quality. The high R² value and low error metrics suggest that the selected influencing factors in this study can largely explain habitat quality spatial variability.
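The three reported test metrics are linked by simple definitions, which the following sketch spells out in plain Python. The function name and the sample values are illustrative assumptions; only the reported MSE and RMSE figures come from the text.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, and RMSE for paired observed and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mse = ss_res / n
    return {"r2": 1.0 - ss_res / ss_tot, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical observed vs. predicted habitat quality values.
observed = [0.20, 0.45, 0.60, 0.80, 0.95]
predicted = [0.25, 0.40, 0.65, 0.75, 0.90]
print(regression_metrics(observed, predicted))

# Consistency check on the reported figures: RMSE is the square root of MSE.
reported_mse = 0.0287
print(round(math.sqrt(reported_mse), 3))
```

Because RMSE is the square root of MSE, the reported values of 0.0287 and 0.169 are mutually consistent; R² additionally depends on the variance of the observed habitat quality values, so it cannot be recovered from the MSE alone.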

SHAP-based model interpretation results

SHAP values were used to complement the XGBoost predictions and to determine the extent to which each influencing factor affected habitat quality. The influencing factors covered multiple dimensions, including topography, climate, vegetation, and human activities, for a total of 11 feature variables. As shown in Fig. 5, the factors affecting the habitat quality of the Loess Plateau are listed in descending order of importance as follows: X3 (TRI) > X8 (Pop) > X6 (NDVI) > X4 (Tem) > X11 (Soil Erosion Intensity) > X5 (Pre) > X1 (DEM) > X2 (Slope) > X10 (Distance to River) > X9 (GDP) > X7 (NTL).

X3 (TRI), X8 (Pop), X6 (NDVI), X4 (Tem), X11 (Soil Erosion Intensity), and X5 (Pre) all showed positive correlations with habitat quality (Fig. 5). TRI made the largest contribution, 8.12%, indicating that the ecological diversity brought about by complex topography significantly enhanced habitat quality. Population density contributed 6.16%, suggesting that high-density urban areas might contribute to habitat improvement by implementing ecological protection measures or greening initiatives. Vegetation cover, as indicated by NDVI, influenced habitat quality positively with a contribution of 3.85%. Temperature and precipitation were also positively correlated with predicted habitat quality, acting through suitable thermal conditions and precipitation recharge.

In contrast, X1 (DEM), X2 (slope), X10 (distance from river), X9 (GDP), and X7 (NTL) were negatively correlated with habitat quality. With increasing distance from the river (X10), the negative effect of reduced water recharge on habitat quality increased; GDP and nighttime light (NTL) contributed 0.78% and 0.54%, respectively, indicating a slight negative effect of the intensity of human economic activity and light pollution on habitat quality; slope (X2) and elevation (X1) were likewise inversely correlated with habitat quality. Overall, TRI, Pop, and NDVI are the core positive factors driving changes in habitat quality, whereas anthropogenic factors such as NTL, although weaker in magnitude, still exert a non-negligible negative perturbation on habitat quality.

Fig. 5

SHAP value importance swarm chart and bar chart. (a) X1: Digital elevation model (DEM); X2: Slope; X3: Terrain ruggedness index (TRI); X4: Temperature (Tem); X5: Precipitation (Pre); X6: Normalized difference vegetation index (NDVI); X7: Night-time light (NTL); X8: Population density (Pop); X9: GDP; X10: Distance to rivers; X11: Soil erosion intensity. (b) The percentages on the right represent the contribution of each feature to the model’s predicted results, i.e., the magnitude of the impact of changes in each feature on the predicted results.

The dependency relationship between dominant factors and model prediction results

To better elucidate the effects of the dominant factors on habitat quality, the top six factors by SHAP importance, X3 (TRI) > X8 (Pop) > X6 (NDVI) > X4 (Tem) > X11 (Soil Erosion Intensity) > X5 (Pre), were selected for analysis in this study. This approach is intended to visualize the effects of these factors on habitat quality, thereby increasing the credibility of the model (Fig. 6).

The dependence of habitat quality on each factor showed marked nonlinear characteristics. The SHAP value of X3 (TRI) increased with the terrain ruggedness index: its effect on habitat quality was weak at low TRI values, and its positive effect was markedly enhanced at high values. The SHAP value of X8 (Pop) first increased and then decreased with population density: the effect was small at low density, positive at medium density, and turned negative once density exceeded a threshold. X6 (NDVI) had a limited effect on habitat quality when vegetation cover was sparse; SHAP values fluctuated upward as NDVI increased, and the positive enhancement of habitat quality was particularly prominent at high NDVI values. In addition, the SHAP values of X4 (Tem), X5 (Pre), and X1 (DEM) gradually shifted from low fluctuation to a marked increase with increasing temperature, precipitation, and elevation, respectively, suggesting that low temperature, low rainfall, and low elevation have a weak effect on habitat quality, whereas the positive effect is more pronounced under high temperature, abundant precipitation, and high elevation. Together, these results show that the strength of each factor’s effect on habitat quality varied across its range of values and that all of the relationships were nonlinear.

Fig. 6

Dependence relationship between the dominant factors and the model prediction results.

Conclusions and discussion

Conclusion

This study innovatively integrates the InVEST-XGBoost-SHAP framework to assess the spatiotemporal evolution and driving mechanisms of habitat quality on the Loess Plateau from 1990 to 2020. The key findings are as follows:

(1) Habitat quality on the Loess Plateau exhibited a declining trend from 1990 to 2000. It had a distinct spatial distribution pattern characterized by higher quality in the southeast and lower quality in the northwest.

(2) The degradation of habitat quality on the Loess Plateau has intensified, spreading from the northwest to the southeast over the study period.

(3) The habitat quality of the Loess Plateau was positively driven by the core factors of terrain complexity (TRI), population density (Pop), and vegetation cover (NDVI), which contributed 8.12%, 6.16%, and 3.85%, respectively, whereas human economic activities (GDP, NTL) and terrain elevation (DEM) showed a negative inhibitory effect.

Discussion

Spatiotemporal distribution of habitat quality

Habitat quality on the Loess Plateau declined from 1990 to 2000, followed by a significant improvement from 2000 to 2010. Overall, however, habitat quality on the Loess Plateau demonstrated a downward trend from 1990 to 2020. This aligns with existing research findings24,26,33. Policy interventions significantly influence habitat quality and ecosystem functions on the Loess Plateau, as these changes indicate. Ecological restoration initiatives, such as the “Grain for Green” program and the “Three-North Shelterbelt Forest” project, have played a pivotal role in enhancing habitat quality. The recent decline in habitat quality may be attributed to the Land Management Law’s increasingly stringent requirements for quality management of cultivated land, forests, and grasslands. Moving forward, it is imperative to strengthen ecological protection and restoration efforts, enhance vegetation coverage and soil quality, and promote the rational utilization and planning of land resources. The goal is to mitigate environmental degradation caused by overexploitation and unsustainable practices.

Habitat quality driving factors analysis

From 1990 to 2020, habitat quality on the Loess Plateau as a whole showed a decreasing trend. Spatially, it was relatively high in the southeast and low in the northwest, with the degree of degradation gradually increasing from the northwest to the southeast. This pattern is mainly shaped by positively contributing factors: the terrain ruggedness index (TRI), population density (Pop), the normalized difference vegetation index (NDVI), temperature (Tem), precipitation (Pre), and soil erosion intensity (X11). TRI had the highest contribution, 8.12%: complex terrain provides diverse habitat conditions for a wide range of organisms, promoting ecosystem stability and biodiversity. Pop contributed 6.16%. Within a certain range, higher population density can improve habitat quality through ecological protection inputs or urban greening projects. Excessive density, however, drives land use change and ecological degradation, especially in the southeast, where greater population pressure exacerbates habitat degradation. NDVI contributed 3.85%, as increased vegetation cover directly enhances ecosystem stability and biodiversity. Tem and Pre made positive but relatively low contributions through suitable heat conditions and water recharge. Soil erosion intensity (X11) was positively correlated with habitat quality, although increases in erosion may themselves reflect reduced vegetation cover and other ecological degradation processes, so erosion control could still enhance habitat quality. In contrast, elevation (DEM), slope (X2), distance from river (X10), gross domestic product (GDP), and nighttime light (NTL) had weaker, mainly inhibitory effects on habitat quality. High-altitude and steep-slope areas had weaker vegetation cover and ecosystem function owing to natural conditions, and increased distance from rivers reduced water recharge and weakened ecosystem stability. GDP and NTL contributed 0.78% and 0.54%, respectively, reflecting the slight negative effects of economic activity intensity and light pollution, which mostly stem from overdevelopment and accelerated urbanization that lead to habitat fragmentation and degradation. Overall, TRI, Pop, and NDVI were the core positive drivers of habitat quality change, whereas anthropogenic factors such as NTL, although weaker, still exerted negative perturbations on habitat quality.
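The contribution percentages and effect directions discussed above can be illustrated with a minimal sketch. It assumes that a feature’s contribution is its share of the total mean absolute SHAP value, and that the direction of its effect is the sign of the covariance between the feature’s values and its SHAP values. Both are common conventions, not necessarily the procedure used in this study. The function names and all numbers below are hypothetical toy values, not the study’s data.

```python
from statistics import mean


def contribution_shares(shap_rows, feature_names):
    """Share (%) of each feature in the total mean |SHAP| (assumed convention)."""
    mean_abs = [
        mean(abs(row[j]) for row in shap_rows)
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * m / total for name, m in zip(feature_names, mean_abs)}


def effect_direction(feature_values, shap_values):
    """Sign of the covariance between a feature and its SHAP values."""
    f_bar, s_bar = mean(feature_values), mean(shap_values)
    cov = sum((f - f_bar) * (s - s_bar) for f, s in zip(feature_values, shap_values))
    if cov > 0:
        return "positive"
    if cov < 0:
        return "negative"
    return "neutral"


# Toy SHAP matrix (rows = grid cells, columns = features); made-up values.
features = ["TRI", "NDVI", "NTL"]
shap_rows = [
    [0.04, 0.02, -0.01],
    [-0.04, -0.02, 0.01],
    [0.02, 0.01, -0.01],
    [-0.02, -0.01, 0.01],
]
# Toy feature values for the same four cells; also made up.
tri_values = [0.9, 0.1, 0.7, 0.3]
ntl_values = [10.0, 1.0, 8.0, 2.0]

shares = contribution_shares(shap_rows, features)
for name in features:
    print(f"{name}: {shares[name]:.2f}% of total mean |SHAP|")
print("TRI effect:", effect_direction(tri_values, [r[0] for r in shap_rows]))
print("NTL effect:", effect_direction(ntl_values, [r[2] for r in shap_rows]))
```

In practice the SHAP matrix would come from an explainer applied to the fitted XGBoost model. The sketch only shows how a ranking and a sign could be read from such a matrix once it exists.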

To improve the ecological environment of the Loess Plateau, ecological protection zones should be demarcated around areas with high Terrain Ruggedness Index (TRI) values. An “ecology-economy” synergistic development model should be implemented in urban-rural transition zones, using ecological compensation to involve communities in vegetation restoration while limiting uncontrolled growth in densely populated areas. In climate-sensitive areas, environmental resilience should be enhanced by matching vegetation configuration to temperature and precipitation thresholds. In high-altitude zones, the environmental risks of slope development should be reduced by returning farmland to grassland and implementing comprehensive soil erosion management. In areas where the economy is developing rapidly and the nighttime light index (NTL) is increasing, an “environmental red line + industrial access” system should be adopted to restrict the siting of highly polluting industries and to promote green infrastructure that reduces light pollution and land hardening. Further, policy support for environmental restoration and protection should be strengthened, land-use planning should be optimized, and urbanization and population density should be regulated to prevent the destruction of natural habitats. A green development model should be promoted to balance economic and environmental development in the southeastern region. Climate change adaptation measures should be strengthened, and rainfall should be used efficiently to enhance ecosystem resilience. An ecological compensation mechanism should be developed to motivate local governments and residents to participate actively in environmental protection, and a framework for collaborative governance should be built across society. Through these comprehensive strategies, the habitat quality of the Loess Plateau can be gradually restored and improved, laying the foundation for sustainable economic, environmental, and social development.

Limitations and prospects

The InVEST-XGBoost-SHAP framework adopted in this study offers valuable insights into habitat quality and its drivers on the Loess Plateau, but it has certain limitations. Although the XGBoost model can capture complex nonlinear relationships, its “black-box” nature makes its internal mechanisms difficult to interpret. SHAP values explain feature importance and dependencies, but this explanation is itself limited: the values reflect the average impact of each feature on the model’s predictions and do not fully reveal complex interactions between features. Additionally, the assumption of functional independence among features may not fully hold, so the model may fail to capture interdependencies and synergies among drivers in real ecosystems. For example, vegetation cover (NDVI) and soil erosion intensity (X11) may have a mutual feedback mechanism: an increase in vegetation cover reduces soil erosion, and a decrease in soil erosion favors vegetation growth. Such a dynamic relationship may not be adequately captured by simple feature importance ranking and dependency analyses. Finally, although soil physicochemical properties (e.g., organic carbon, texture) may influence local habitats, their effects are predominantly overridden by erosion-deposition processes at the landscape scale of the Loess Plateau. Considering these limitations, future studies should:

(1) Explore the framework’s internal mechanisms in greater depth and combine them with other explanatory models to overcome XGBoost’s “black box” nature;

(2) Break through the assumption of functional independence by analyzing interdependencies and synergistic effects (e.g., NDVI–X11 feedbacks) through complex ecological models or multidisciplinary approaches;

(3) Incorporate additional soil factors and improve the spatial resolution of selected factors to reduce scaling errors.
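The second recommendation can be illustrated with a minimal sketch of how a pairwise interaction is separated from per-feature attributions. It computes exact Shapley values and a Shapley interaction index for a hypothetical two-feature model. The model `toy_model`, its coefficients, the baseline, and the use of baseline replacement for “absent” features are all assumptions made for illustration. None of this is the model fitted in this study.

```python
from itertools import combinations
from math import factorial


def toy_model(ndvi, erosion):
    """Hypothetical habitat-quality response with an NDVI x erosion term."""
    return 0.2 + 0.5 * ndvi - 0.3 * erosion + 0.4 * ndvi * erosion


def coalition_value(model, x, baseline, coalition):
    """Model output with features outside the coalition set to baseline."""
    mixed = {k: (x[k] if k in coalition else baseline[k]) for k in x}
    return model(**mixed)


def shapley_value(model, x, baseline, i):
    """Exact Shapley value of feature i by enumerating all coalitions."""
    others = [k for k in x if k != i]
    n = len(x)
    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for subset in combinations(others, size):
            s = set(subset)
            gain = (coalition_value(model, x, baseline, s | {i})
                    - coalition_value(model, x, baseline, s))
            total += weight * gain
    return total


def shapley_interaction(model, x, baseline, i, j):
    """Shapley interaction index for the pair (i, j)."""
    others = [k for k in x if k not in (i, j)]
    n = len(x)

    def v(c):
        return coalition_value(model, x, baseline, c)

    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
        for subset in combinations(others, size):
            s = set(subset)
            total += weight * (v(s | {i, j}) - v(s | {i}) - v(s | {j}) + v(s))
    return total


x = {"ndvi": 1.0, "erosion": 1.0}          # toy observation
baseline = {"ndvi": 0.0, "erosion": 0.0}   # toy reference point

phi_ndvi = shapley_value(toy_model, x, baseline, "ndvi")
phi_erosion = shapley_value(toy_model, x, baseline, "erosion")
interaction = shapley_interaction(toy_model, x, baseline, "ndvi", "erosion")
print(phi_ndvi, phi_erosion, interaction)
```

The per-feature values fold half of the joint term into each feature, so a ranking built from them alone hides the coupling that the interaction index exposes. For tree models, the `shap` library’s `TreeExplainer` provides `shap_interaction_values` for the same purpose at scale. Either way, the result describes a static interaction inside the fitted model, not the dynamic vegetation-erosion feedback itself, which would still require process-based modeling.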