Introduction

Liriomyza sativae Blanchard is a species in the genus Liriomyza, order Diptera. It is recognized as a major agricultural pest of global significance1. This pest predominantly infests the foliage of diverse plants, and the morphological and textural characteristics of the leaves strongly influence its foraging and oviposition behaviors, which in turn leads to disparities in larval survival rates2. L. sativae consumes leaf tissue, creating irregular serpentine tunnels on the leaves. As the affected area expands, photosynthesis and transpiration in the leaves decline markedly, substantially impeding the normal growth and development of the host plants3. The host range of the species is broad and encompasses a variety of important fruits and vegetables, such as cucumber (Cucumis sativus L.)4, common bean (Phaseolus vulgaris L.)5, and lettuce (Lactuca sativa L.)6. It also affects economic crops like tobacco (Nicotiana tabacum L.)7 and cotton (Gossypium hirsutum L.)8, as well as various ornamental plants, and therefore poses a substantial threat to agricultural production. Additionally, L. sativae exhibits protracted oviposition cycles, high fecundity, and rapid development9. Together, these traits make effective control of this pest considerably more difficult.

Global surface temperature rose by 0.8°C in the 20th century10. In the 21st century, human activities such as excessive fossil fuel use, deforestation, and soil degradation may accelerate Earth’s warming from the projected 1.5°C to up to 5°C by 210011. A previous study demonstrated a strong correlation between temperature and the development time, adult lifespan, and reproductive capacity of L. sativae. The reproductive potential of L. sativae peaks when the temperature is between 20°C and 30°C12, and its development cycle shortens as temperature increases13. Collectively, these factors have expedited the proliferation of L. sativae across China, posing a significant challenge to crop production in infested areas. First detected in Sanya City, Hainan Province in the 1990s, L. sativae has spread rapidly nationwide, causing considerable damage to agriculture and the environment14. By 2000, it had spread to 23 provinces and 4 autonomous regions, with its rapid proliferation linked to the frequent movement of plant material in greenhouses and plastic sheds15,16.

Given the substantial damage and persistent spread of L. sativae, ongoing monitoring and predictive analysis of its habitat patterns and climatic influences are imperative. Ecological Niche Modeling (ENM) identifies a species’ ecological requirements by relating environmental variables to known distribution points, and from these defines optimal habitat conditions17. This approach has been widely applied in fields such as early warning of pest invasions18, prediction of potential geographic habitat19, and climate change impact analysis20. However, studies on the suitable habitat areas of L. sativae in China are still lacking. This study employs the MaxEnt-SDM (Maximum Entropy Species Distribution Model) and ecological niche simulation tools to construct a climate-distribution relationship for this important agricultural pest. The purpose is to understand how climate change affects its geographical spread and to assist in effective pest management. The model predicts the potential suitable habitats for L. sativae within China and explores potential expansion under future climates. This provides insight into the ecological characteristics of L. sativae as well as a scientific basis for targeted control strategies.

Materials and methods

Occurrence sites

Distribution data for L. sativae were primarily sourced from field sampling records, domestic and international published literature, and the Global Biodiversity Information Facility (GBIF, https://www.gbif.org/). The distribution data were filtered as follows. First, invalid and duplicate distribution points were removed. Second, the remaining points were refined using the buffer analysis tool in the ArcGIS 10.4.1 toolbox: within a grid of 2.5 arc-minute resolution, surplus distribution points were randomly eliminated so that each grid cell contained only a single distribution point, thereby mitigating spatial autocorrelation20. After this screening process, a total of 132 valid global distribution points were identified, of which 102 were located within China (Fig. 1).
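The thinning step can be illustrated with a minimal sketch. It is not the ArcGIS buffer workflow used in the study; it is a simple grid-based substitute that assumes records are (longitude, latitude) pairs in decimal degrees and that one randomly chosen record is kept per 2.5 arc-minute cell. The function name `thin_records` and the coordinates are hypothetical.

```python
import math
import random

CELLS_PER_DEGREE = 60 / 2.5  # a 2.5 arc-minute grid has 24 cells per degree


def thin_records(records, cells_per_degree=CELLS_PER_DEGREE, seed=0):
    """Keep one randomly chosen (lon, lat) record per grid cell."""
    rng = random.Random(seed)  # fixed seed so the thinning is repeatable
    cells = {}
    for lon, lat in set(records):  # set() drops exact duplicates
        key = (math.floor(lon * cells_per_degree),
               math.floor(lat * cells_per_degree))
        cells.setdefault(key, []).append((lon, lat))
    # Sort before choosing so the result does not depend on set ordering.
    return [rng.choice(sorted(points)) for _, points in sorted(cells.items())]


# Hypothetical coordinates: two points share a cell, one is an exact duplicate.
records = [(109.51, 18.26), (109.52, 18.27), (109.51, 18.26), (116.40, 39.90)]
thinned = thin_records(records)
print(len(thinned), thinned)
```

A fixed grid is easier to reason about than distance buffers, but it can keep two points that sit close together on opposite sides of a cell boundary.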

Fig. 1

Geographical distribution of sampling data of L. sativae in China. Note: The base map used (Map Approval Number: GS (2024) 0650) comes from the Standard Map Service Platform of the Ministry of Natural Resources of the People’s Republic of China (https://www.tianditu.gov.cn/?type=1), and the geographical distribution of sampling data was illustrated using ArcGIS 10.4.1 software (https://www.esri.com/en-us/home). The base map itself has not been modified and is only used for the visualization of data.

Acquisition and screening of climate data

The environmental data used in this study were sourced from the WorldClim global climate database (http://www.worldclim.org/). This database offers bioclimatic data for 1970 to 2000, 2041 to 2060 (which represents the 2050 scenario), and 2061 to 2080 (which represents the 2070 scenario). It includes 19 bioclimatic variables (Bio01 to Bio19) that significantly influence species distribution, in addition to 1 altitude variable (Alt). All data have a spatial resolution of 2.5 arc-minutes. To forecast the impact of future climate change on species distribution, this study employs two emission scenarios under the BCC-CSM2 climate system model: an optimistic scenario (SSP126) and a pessimistic scenario (SSP585). These scenarios correspond to an increase in radiative forcing relative to pre-industrial levels of 2.6 W/m² and 8.5 W/m² by 2100, respectively21,22. Using these two extreme scenarios allows the changes in the potential suitable distribution pattern of L. sativae to be bracketed.

The screening of environmental variables began by eliminating those with a contribution rate below 1%. Pearson correlation analysis was then performed: if the absolute value of the correlation coefficient (|r|) between two variables exceeded 0.7, only the one with the higher contribution rate was retained23,24. This screening resulted in the selection of seven environmental variables: Alt, Bio01, Bio02, Bio03, Bio15, Bio16, and Bio17 (Fig. 2).
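The two-step screening can be sketched as follows. This is one possible reading of the rule, not the authors' script: variables below the 1% contribution cutoff are dropped, and the rest are visited in order of decreasing contribution, keeping a variable only if its |r| with every variable already kept is at most 0.7. The variable names and all numbers in the toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def screen_variables(values, contribution, min_contribution=1.0, max_abs_r=0.7):
    """Greedy screening: highest contribution first, skip correlated variables."""
    candidates = [v for v in values if contribution[v] >= min_contribution]
    candidates.sort(key=lambda v: contribution[v], reverse=True)
    kept = []
    for var in candidates:
        if all(abs(pearson_r(values[var], values[k])) <= max_abs_r for k in kept):
            kept.append(var)
    return kept


# Hypothetical values sampled at six occurrence points.
values = {
    "temp_mean": [1, 2, 3, 4, 5, 6],
    "temp_max": [2, 4, 6, 8, 10, 12.5],  # nearly collinear with temp_mean
    "precip_cv": [3, 1, 4, 1, 5, 2],     # weakly correlated with temp_mean
    "precip_dry": [6, 1, 2, 9, 3, 4],    # contribution below the 1% cutoff
}
contribution = {"temp_mean": 60.0, "temp_max": 5.0,
                "precip_cv": 9.0, "precip_dry": 0.4}

print(screen_variables(values, contribution))
```

Visiting variables in order of contribution guarantees that, for any correlated pair, the variable with the higher contribution rate is the one that survives.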

Fig. 2

Correlation coefficient matrix of environmental variables

Modelling approach

The current study used MaxEnt version 3.4.4 (https://biodiversityinformatics.amnh.org/open_source/maxent/)25. MaxEnt offers five feature types: linear (L), quadratic (Q), product (P), threshold (T), and hinge (H)23. To optimize the parameter configuration, this study employs the ‘ENMeval’ package (https://github.com/danlwarren/ENMTools/)26 within R 4.3.2 (https://cran.r-project.org/)27. The Kuenm package was used to tune both the feature combination (FC) and the regularization multiplier (RM). The five feature types yield 31 possible FCs, and the RM was varied from 0.1 to 4.0 in increments of 0.1, giving 40 values. The Kuenm package was therefore used to evaluate the 1,240 candidate models generated from the combination of 40 RMs and 31 FCs26. The optimal models under current climate conditions were identified with the Kuenm package (https://github.com/marlonecobos/kuenm/) using the following criteria: (1) statistically significant models with omission rates of ≤ 5%, and (2) among these, the lowest delta-corrected Akaike Information Criterion (AICc) values (delta AICc ≤ 2)28. For each run, 75% of the distribution data were randomly selected as the training dataset and the remaining 25% were used for testing. The model was run with 10 replicates using the subsample replicate type. The output file format was set to ‘Asc’, and all other parameters were left at their default values27. Results were output in “Logistic” format.
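The size of the candidate grid and the selection criteria can be illustrated with a short sketch. It does not call kuenm or ENMeval; it only enumerates the 31 non-empty feature combinations and the 40 RM values, then applies a plausible reading of the selection rule to a list of evaluated models. The dictionary keys (`significant`, `omission_rate`, `aicc`) and the toy results are hypothetical and do not reflect any package's output format.

```python
from itertools import combinations

FEATURES = ["L", "Q", "P", "T", "H"]


def feature_combinations(features=FEATURES):
    """All non-empty subsets of the feature types, e.g. 'L', 'LQ', 'LQPTH'."""
    return ["".join(combo)
            for size in range(1, len(features) + 1)
            for combo in combinations(features, size)]


def regularization_multipliers(start=0.1, stop=4.0, step=0.1):
    """RM values from start to stop inclusive, rounded to avoid float drift."""
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 1) for i in range(count)]


def select_models(results, max_omission=0.05, max_delta_aicc=2.0):
    """Filter by significance and omission rate, then by delta AICc."""
    passing = [m for m in results
               if m["significant"] and m["omission_rate"] <= max_omission]
    if not passing:
        return []
    best_aicc = min(m["aicc"] for m in passing)
    kept = [m for m in passing if m["aicc"] - best_aicc <= max_delta_aicc]
    return sorted(kept, key=lambda m: m["aicc"])  # best model first


candidates = [(fc, rm)
              for fc in feature_combinations()
              for rm in regularization_multipliers()]
print(len(feature_combinations()), len(regularization_multipliers()), len(candidates))

# Hypothetical evaluation results, for illustration only.
toy_results = [
    {"fc": "LQ", "rm": 1.0, "significant": True, "omission_rate": 0.04, "aicc": 2500.0},
    {"fc": "LQP", "rm": 2.0, "significant": True, "omission_rate": 0.04, "aicc": 2501.5},
    {"fc": "LQH", "rm": 0.5, "significant": True, "omission_rate": 0.09, "aicc": 2490.0},
    {"fc": "T", "rm": 2.0, "significant": False, "omission_rate": 0.03, "aicc": 2480.0},
]
print([m["fc"] for m in select_models(toy_results)])
```

In this reading, delta AICc is measured relative to the best model among those that already pass the significance and omission-rate filters, so a model with a lower AICc but an unacceptable omission rate cannot set the baseline.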

Model accuracy evaluation

In the development of ecological niche models, a critical step involves comparing the theoretical (predicted) omission rate with the test omission rate at a predefined threshold. A significant discrepancy between these rates indicates that further data preprocessing and model optimization are needed29. This study constructs omission rate curves and computes the area under the curve (AUC) to quantify the precision of model predictions. The AUC value, ranging from 0 to 1, indicates the model’s predictive efficacy; a higher value denotes superior predictive performance30. Typically, AUC values are divided into five categories: Failure (≤0.6), Poor (0.6 ~ 0.7], Fair (0.7 ~ 0.8], Good (0.8 ~ 0.9], and Excellent (0.9 ~ 1.0]31.
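The five AUC categories translate directly into a lookup. The sketch below is only an illustration of the intervals listed above, with each upper bound treated as inclusive; the function name `classify_auc` is hypothetical.

```python
def classify_auc(auc):
    """Map an AUC value in [0, 1] to the five categories used in the text."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc <= 0.6:
        return "Failure"
    if auc <= 0.7:
        return "Poor"
    if auc <= 0.8:
        return "Fair"
    if auc <= 0.9:
        return "Good"
    return "Excellent"


for value in (0.55, 0.75, 0.9, 0.95):
    print(value, classify_auc(value))
```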

Division of suitable habitats

Following Aidoo et al. (2021)32, this study used the Maximum Test Sensitivity plus Specificity Logistic Threshold (MTSPS), obtained from the MaxEnt model built with the distribution point data of L. sativae and the seven dominant environmental variables. The average MTSPS value was 0.266. The spatial analysis function of ArcToolbox in ArcGIS 10.4.1 (https://www.esri.com/en-us/home) was used to extract the suitable habitats25: areas with a predicted probability less than or equal to 0.266 were defined as non-suitable habitat, and areas with a predicted probability greater than 0.266 were defined as suitable habitat. The latter was further divided into lowly suitable habitat (0.266 ~ 0.4], moderately suitable habitat (0.4 ~ 0.6] and highly suitable habitat (> 0.6)23.
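The reclassification of logistic output into habitat grades can be sketched in a few lines. This is an illustration of the thresholds stated above rather than the ArcGIS procedure; the small grid of predicted probabilities is hypothetical.

```python
from collections import Counter

MTSPS = 0.266  # threshold reported in the text


def classify_suitability(probability, threshold=MTSPS):
    """Assign a habitat grade to one predicted probability."""
    if probability <= threshold:
        return "non-suitable"
    if probability <= 0.4:
        return "lowly suitable"
    if probability <= 0.6:
        return "moderately suitable"
    return "highly suitable"


# Hypothetical 2 x 3 raster of logistic output values.
grid = [[0.10, 0.266, 0.30],
        [0.40, 0.55, 0.61]]
counts = Counter(classify_suitability(p) for row in grid for p in row)
print(dict(counts))
```

Multiplying each class count by the cell area would give the area per habitat grade, provided the raster is in an equal-area projection.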

Results

Optimization of model parameters and verification of precision

Here, we tested 31 different FCs (covering all possible combinations of L, Q, P, T and H) and 40 different regularization multiplier values (ranging from 0.1 to 4.0 with a step size of 0.1). Among these, the H feature combination with a regularization multiplier of 3.7 had the lowest delta AICc score and was therefore selected as the optimal scheme for constructing the model of the suitable habitat of L. sativae in China (Fig. 3).

Fig. 3

The analysis of 31 different feature combinations using ENMeval

The sample omission rate curve aligned closely with the predicted omission rate curve (Fig. 4A), suggesting that the MaxEnt model exhibited robust predictive performance. The receiver operating characteristic (ROC) curve was constructed using 1-specificity as the X-axis and sensitivity (1 - omission rate) as the Y-axis (Fig. 4B). For the test dataset, the mean area under the ROC curve (AUC) was 0.895, with a standard deviation of 0.014. For the training dataset, the mean AUC was 0.908, with a standard deviation of 0.005. Both values exceeded 0.8, so the current model meets at least the “Good” standard. Consequently, the outcomes of this modeling exercise offer valuable insights for predicting potential suitable habitats for L. sativae in China.

Fig. 4

Omission rate and ROC curve of MaxEnt model. Note: In (A), the red curve indicates the mean background prediction derived from ten replicate runs, while the blue margins illustrate the standard deviation (SD) calculated across these replicates. Similarly, the green curve shows the mean omission rate on test data obtained from ten replicate runs, with the yellow margins representing the SD across these replicates. In (B), the red curve denotes the mean ROC curve of ten replicate runs, and the blue margins indicate the standard deviation across these replicates.

The correlation between the habitat of L. sativae and environmental variables

Among the tested environmental factors influencing the potential habitat of L. sativae, annual mean temperature (Bio01) was the most important, contributing 61.3%. Precipitation of the wettest quarter (Bio16) followed with a contribution of 14.0%. Precipitation seasonality (Bio15) and altitude (Alt) contributed 9.4% and 6.5%, respectively. The contribution of each remaining variable did not exceed 5%. In permutation importance, Bio01 also ranked highest at 58.1%. Both metrics confirmed that Bio01 was the primary factor determining the potential habitat of L. sativae (Table 1).

Table 1 Contribution rate and important value of dominant environmental variables

The seven selected environmental variables were further analyzed using the jackknife test. As illustrated in Fig. 5, the length of the blue bar indicates the importance of each variable to the species habitat when used alone: a longer bar denotes a greater impact. Bio01, Bio16 and Bio03 stand out as key environmental factors by this measure. The green bars represent the model gain when a variable is omitted, and they show that omitting Bio01, Bio15 or Bio16 has the most pronounced effects. Taken together, Bio01, Bio16 and Bio15 are the principal environmental variables affecting the potential suitable habitats for L. sativae.

Fig. 5

Jackknife test of regularized training gain. Red represents the model with all variables, green represents the model without the given variable, and blue represents the model with only the given variable.

A presence-probability threshold of 0.266 was applied to the response curves of the environmental variables to delineate the suitable range of each. The suitable range of Bio01 was 4.2-26.3 °C, of which the highly suitable range was 11.6-18.9 °C. Within the range of 4.2-26.3 °C, the suitability of L. sativae increased significantly in the interval of 4.4-15.5 °C, peaking at 15.5 °C, and gradually decreased once this temperature was exceeded. For precipitation seasonality, the suitable range was 13.0-255.2. The optimal suitability was attained when precipitation seasonality fell within the range of 98.2-112.1; beyond this range, suitability decreased as the coefficient of variation increased. Finally, for precipitation of the wettest quarter, the suitable range was 137.2-4100.8 mm (Fig. 6).
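Reading a suitable range off a response curve amounts to finding where the curve crosses the threshold. The sketch below assumes a single-peaked curve sampled at discrete points and uses linear interpolation between neighbouring samples; the curve values are hypothetical and are not the study's response curve, and the function names are illustrative.

```python
def crossing(x0, y0, x1, y1, threshold):
    """x position where the segment (x0, y0)-(x1, y1) reaches the threshold."""
    return x0 + (threshold - y0) * (x1 - x0) / (y1 - y0)


def suitable_interval(xs, ys, threshold):
    """Lower and upper bounds of the region where ys exceeds the threshold."""
    above = [i for i, y in enumerate(ys) if y > threshold]
    if not above:
        return None
    first, last = above[0], above[-1]
    low = xs[first] if first == 0 else crossing(
        xs[first - 1], ys[first - 1], xs[first], ys[first], threshold)
    high = xs[last] if last == len(xs) - 1 else crossing(
        xs[last], ys[last], xs[last + 1], ys[last + 1], threshold)
    return low, high


# Hypothetical response curve: temperature (deg C) vs. logistic suitability.
temperature = [0, 5, 10, 15, 20, 25, 30]
suitability = [0.0, 0.3, 0.5, 0.7, 0.6, 0.3, 0.1]
print(suitable_interval(temperature, suitability, threshold=0.266))
```

Raising the threshold to 0.6 in the same function would give the bounds of the highly suitable range instead.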

Fig. 6

Response curves illustrating the relationship between the probability of pest presence and three bioclimatic variables. Note: The red curve represents the average of ten replicate runs, with blue margins indicating the standard deviation (SD) calculated across these replicates. Bio01: Annual mean temperature (°C); Bio15: Precipitation seasonality (coefficient of variation); Bio16: Precipitation of wettest quarter (mm).

Predicting suitable habitats under existing climatic conditions

The current climate conditions support a widespread habitat of L. sativae across China, encompassing all provinces, autonomous regions, and municipalities. The suitable habitat for this species spans 540.8×10⁴ km², comprising approximately 56.3% of China’s total land area. Highly suitable habitats are mainly distributed in North China, Central China, East China, and South China, as well as the southern part of Northeast China. These areas collectively cover 288.2×10⁴ km², representing roughly 30.0% of China’s total land area. They include all of Beijing, Tianjin, Shandong, Hebei, Henan, Shaanxi, Anhui, Jiangsu, Zhejiang, Fujian, Guangdong, Guangxi, Jiangxi, Hubei, Hunan, Chongqing, and Guizhou. Additionally, they partially overlap with southwest Heilongjiang, west Jilin, middle and west Liaoning, east Inner Mongolia, middle and south Shanxi, east Sichuan, southeast Yunnan, the northern coastal area of Hainan, and the northern and central coastal areas of Taiwan. Moderately suitable habitats, on the other hand, are situated around the periphery of the highly suitable regions and are dispersed throughout parts of North China, Southwest China, and Northeast China, covering an area of 124.6×10⁴ km², which accounts for 13.0% of China’s total land area. These areas primarily include the southwest region of Heilongjiang, central Jilin, northeastern Liaoning, central Inner Mongolia, northern Hebei, central and northern Shanxi, Ningxia, western Shaanxi, eastern Gansu, southeastern Sichuan, Yunnan, Hainan Island, and central Taiwan. The area encompassing lowly suitable habitat spans 128.0×10⁴ km², constituting 13.3% of China’s total land area. These zones are predominantly located in the eastern regions of Heilongjiang and Jilin, the western and central areas of Inner Mongolia, the border region between Gansu and Inner Mongolia, the central and eastern sectors of Xinjiang, the southwestern part of Yunnan, and the southeastern region of Tibet. Additionally, there are sporadic distributions in the central and eastern areas of Qinghai, Sichuan, and Taiwan (Fig. 7).
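The area shares quoted above can be checked with simple arithmetic. The sketch below assumes a total land area of 960×10⁴ km² for China; that figure is a commonly cited round number and an assumption here, since the text does not state the denominator it used.

```python
CHINA_LAND_AREA = 960.0  # in 10^4 km^2; assumed, not stated in the text

# Areas of each habitat grade reported in the text, in 10^4 km^2.
areas = {"highly": 288.2, "moderately": 124.6, "lowly": 128.0}


def share(area, total=CHINA_LAND_AREA):
    """Percentage of the total land area, rounded to one decimal place."""
    return round(100 * area / total, 1)


total_suitable = round(sum(areas.values()), 1)
print("total suitable:", total_suitable, share(total_suitable))
for grade, area in areas.items():
    print(grade, area, share(area))
```

Under this assumption, the three grades sum to the reported total suitable area and the computed shares agree with the percentages given in the text.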

Fig. 7

Potential suitable habitat regions of L. sativae in China under current climate conditions. Note: White, unsuitable habitat; green, lowly suitable habitat; yellow, moderately suitable habitat; red, highly suitable habitat. MTSPS, Maximum Test Sensitivity plus Specificity Logistic Threshold. The base map used in this paper (Map Approval Number: GS (2024) 0650) comes from the Standard Map Service Platform of the Ministry of Natural Resources of the People’s Republic of China (https://www.tianditu.gov.cn/?type=1), and the figure was produced from this base map using ArcGIS 10.4.1 software (https://www.esri.com/en-us/home). The base map itself has not been modified and is only used for the visualization of data.

Forecasting suitable habitats for future climate scenarios

Under the projected climate scenarios, the areas of the different habitat grades change considerably (Table 2). Under the SSP126 scenario, the total suitable habitat increases from the present to the 2050s by a substantial 21.28%. However, this growth is largely reversed by the 2070s, when the total suitable habitat shows only a modest 4.06% increase relative to the present. The extent of highly suitable habitats declines progressively: a reduction of 77.11% is projected by the 2050s, and a decrease of 86.39% by the 2070s. The area of moderately suitable habitats increases by 121.33% from the present to the 2050s, but the increase moderates to 59.84% between the present and the 2070s, indicating an initial expansion followed by a contraction. The area of lowly suitable habitats, on the other hand, expands strongly, increasing by 145.37% from the present to the 2050s. This growth continues, albeit at a slower rate, with an additional increase of 7.96% by the 2070s.
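The percentage changes for the individual habitat grades can be cross-checked against the change in total suitable area. The sketch below applies the SSP126 2050s percentages quoted above to the present-day areas reported for current conditions; the implied future areas are back-calculated for illustration and are not values taken from Table 2.

```python
# Present-day areas by habitat grade, in 10^4 km^2 (from the current-climate results).
present = {"highly": 288.2, "moderately": 124.6, "lowly": 128.0}

# Percentage change from the present to the 2050s under SSP126 (from the text).
change_2050s = {"highly": -77.11, "moderately": 121.33, "lowly": 145.37}


def apply_change(area, percent):
    """Area after a relative change given in percent."""
    return area * (1 + percent / 100)


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return 100 * (new - old) / old


implied = {grade: apply_change(present[grade], change_2050s[grade])
           for grade in present}
total_change = percent_change(sum(implied.values()), sum(present.values()))

for grade, area in implied.items():
    print(grade, round(area, 1))
print("implied change in total suitable area (%):", round(total_change, 2))
```

The implied change in total area lands within rounding distance of the reported 21.28%, which suggests the per-grade percentages and the total are mutually consistent.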

Table 2 Areas of suitable habitats for L. sativae under future climate scenarios.

Under the SSP585 scenario, the total area of suitable habitats increases significantly by the 2050s, by 25.90%, but the increase relative to the present narrows to 21.38% by the 2070s. In this scenario, moderately and lowly suitable habitats expand markedly, while highly suitable habitats contract significantly, with reductions in area of 68.46% and 66.15% in the 2050s and 2070s, respectively. These findings indicate that rapid warming associated with high emission scenarios may accelerate the degradation of highly suitable habitats. Concurrently, moderately and lowly suitable habitats may expand due to more favorable conditions within climatically marginal zones. This underscores the diverse ecological responses elicited by different emission pathways.

From the present to the 2050s, under the SSP126 scenario, the suitable habitats for L. sativae exhibit a pronounced expansion in two directions across China: southwest and northeast. This expansion is primarily centered on northeastern Inner Mongolia, northern Heilongjiang, Qinghai, Tibet, central Gansu, and western Sichuan. In contrast, areas showing contraction are fewer and fragmented, largely found in Xinjiang, western Inner Mongolia, eastern Gansu, southern Tibet, northwestern Yunnan, southern Shaanxi, southeastern Hunan, southern Jiangxi, and southern Fujian. Additionally, there are sporadic reductions in Sichuan, Chongqing, Henan, Hubei, and Taiwan. Under the SSP585 scenario, the expansion of L. sativae’s suitable habitats mirrors that of the low emission scenario (SSP126), but notable differences arise in southern China. For instance, in the SSP126 scenario, both southern Jiangxi and southern Fujian experience marked reductions, with scattered decreases in southeastern Zhejiang. Conversely, under the SSP585 scenario, there is a significant reduction in eastern Zhejiang and scattered reductions in western Jiangxi, but no change in Fujian’s suitable area.

In the SSP126 scenario, the expansion of suitable habitats for L. sativae in the 2070s is less pronounced than the contraction. Areas of expansion are primarily located in the southwest and east of Tibet, the southwest of Qinghai, and the central and northeastern parts of Gansu. In contrast, regions experiencing contraction are predominantly found in the western segment of Inner Mongolia, Xinjiang, southern Tibet, the southern reaches of Gansu and Shaanxi, and several provinces in southern China including Yunnan, Guizhou, Chongqing, Hubei, Hunan, Anhui, Jiangsu, Zhejiang, Fujian, and Taiwan, all exhibiting varying degrees of reduction. In the SSP585 scenario, however, the expansion of suitable habitats is considerably stronger than the contraction. Here, expansion areas are largely concentrated in the western regions of the country, encompassing Qinghai, Tibet, and the eastern portion of Sichuan, as well as the northeastern section of Inner Mongolia and the northern part of Heilongjiang. Conversely, areas of contraction are predominantly situated in Xinjiang, eastern Inner Mongolia, southern Tibet, and the northwestern corner of Yunnan. Comparing the two scenarios, the expansion of suitable habitats for L. sativae under the high emission scenario (SSP585) is markedly stronger than under the low emission scenario (SSP126) (Fig. 8).

Fig. 8

Potential suitable habitat regions of L. sativae under two shared socioeconomic pathways (SSPs) in the 2050s and 2070s. Note: Red represents range expansion; light green represents range contraction; brownish-yellow represents no change. SSP126 characterizes a low-emission scenario (SSP1-2.6), emblematic of a sustainable development trajectory marked by moderate growth and a swift reduction in greenhouse gas emissions. Conversely, SSP585 depicts a high-emission scenario (SSP5-8.5), indicative of a development path predominantly driven by fossil fuels, with a consistent escalation in greenhouse gas emissions. The timeframes “2050s” and “2070s” refer to the climate simulation periods spanning 2041-2060 and 2061-2080, respectively.

Discussion

Optimizing model parameters in MaxEnt

The MaxEnt model has been extensively used in diverse scientific studies and practical domains, including species habitat prediction and ecological niche modeling. Its predictions are frequently regarded as accurate and reliable33. However, despite the model’s significant achievements, users should not underestimate the importance of model optimization. Unoptimized models can introduce fitting biases, which may compromise the accurate evaluation of species’ ecological niches and potentially misguide the development of conservation and management policies34. The optimal model should strive for simplicity while ensuring that the selected features provide a comprehensive and precise representation of species habitat35,36. Within the MaxEnt model, the FC and RM are pivotal parameters that significantly affect the accuracy of model predictions. The FC determines which transformations of the environmental variables the model may use, and therefore how flexibly habitat suitability can respond to each variable. The RM, in turn, balances the model’s complexity against the risk of overfitting. A small RM value yields a more localized predicted habitat that aligns closely with observational records, but it also increases the susceptibility to overfitting. Conversely, a large RM value enhances the model’s generalizability, albeit potentially at the expense of some accuracy25,37. Consequently, this study used the Kuenm package in R to optimize the FC and RM, with the objective of constructing the optimal model for the suitable habitats of L. sativae in China. Following analysis and testing, the optimal FC was the H feature combination, with the RM set to 3.7. This configuration mitigated overfitting while improving the model’s generalization capability and prediction accuracy.

The influence of climatic factors on the potential habitat suitability of L. sativae in China

Climatic conditions play a crucial role in the establishment and expansion of insect populations. Specifically, temperature and precipitation, the core climatic variables, significantly influence the density, rate of diffusion, and developmental stages of these populations, and thereby the potential geographical habitat range of insects38,39,40. This study employs the MaxEnt model, incorporating 7 dominant environmental variables. The findings indicate that temperature-related factors account for a contribution rate of approximately 63.6%, making them the primary determinant of the habitat of L. sativae. Precipitation factors are the second-most important contributor, with a contribution rate of 27%, while altitude has a relatively minor impact, accounting for only 6.5%. This suggests that temperature is the predominant factor limiting the geographical habitat of L. sativae, followed by precipitation.

Through the application of Jackknife tests and single-variable response curve analysis of the dominant environmental factors, we established that Bio01 is the key determinant of the habitat pattern of L. sativae. The suitable annual mean temperature ranged between 4.2 °C and 26.3 °C, with a particularly favorable range of 11.6-18.9 °C. Under current climatic conditions, the central and eastern regions of China, with annual mean temperatures of 10 °C to 20 °C, fall within the suitable temperature conditions for this pest. Consequently, these regions could potentially serve as suitable habitats for L. sativae. This model prediction aligns closely with the climatic characteristics of these regions41. During the warmer months of May through June, the first generation of L. sativae requires a development period of 15 days, while in the cooler months of October through November, this period extends to an estimated 25 to 30 days16. These data suggest that L. sativae completes more generations, and therefore builds larger populations, in areas with higher temperatures. Notably, these flies are highly sensitive to low temperatures, and their pupae cannot tolerate prolonged exposure to temperatures of 0-5 °C. As a result, they are unable to survive outdoors during the winter in the colder northern regions of China42.

The precipitation seasonality (Bio15, ranging from 13 to 255.2) is also significant. While the influence of humidity on L. sativae is less pronounced than that of temperature, it remains a crucial factor for its survival and reproduction. Both highly arid and excessively humid environments cause pupae to detach from leaves prematurely. In conditions of low humidity, the likelihood of adult emergence diminishes significantly, and the adults often struggle to expand their wings. Conversely, heavy rainfall, leading to water accumulation in the soil, poses a severe risk to pupal development and adult survival43. Prior studies have demonstrated that the ideal relative humidity for the larval stage of L. sativae is 50%, while a relative humidity of 90% is preferred during the pupal stage. These high humidity conditions contribute positively towards the survival and normal development of the larvae13,44. The findings of this study are consistent with these conclusions, revealing that L. sativae tolerates a wide range of precipitation during the wettest quarter, spanning from 137.2 mm to 4100.8 mm. This highlights the species’ broad ecological niche. Because environmental factors such as temperature and precipitation shape the habitat and survival of these flies, understanding these mechanisms is crucial for effective pest management.

Systematic analysis of MaxEnt model limitations and data bias

The MaxEnt ecological niche model, based on the assumption that species distribution is mainly limited by climate factors, fails to fully account for non-niche mechanisms during biological invasions45. Especially in anthropogenic contexts, the unintentional introduction of species into new habitats is occurring at an unprecedented rate. Once a stable population of an invasive species is established, eradication becomes nearly impossible46,47. Non-native pests spread primarily through human-mediated transport, such as international trade and the movement of goods48. In global trade, particularly in the exchange of vegetables, live plants, and related commodities, pests often travel undetected on infested plant materials, soil, containers, or even on machinery and footwear. Because they are hard to spot during transport, such hitchhiking pests are the primary cause of invasive pest introductions49. Elevated trade and interception rates raise the risk of pest invasion50. Wang et al. found that L. sativae populations peak during the dry season in certain crops, independent of climate, indicating that crop phenology and agricultural practices (such as planting period and greenhouse use) affect their dynamics. Greenhouses also help them survive in otherwise unsuitable cold regions51. In January 2019, inside a plastic greenhouse at Tikrit University, the temperature was consistently maintained at 12.5 °C with 92% relative humidity. Under these conditions, the average count of L. sativae larvae per plant was 35.3, peaking at 62.852. These results align with the findings of this study.

Existing research indicates that relying on readily available data can help avoid biases from unreliable or missing records53. Museum collections often overrepresent rare species while underrepresenting common ones, revealing a clear bias in collectors’ preferences during sample collection54. Consequently, the relative quantities of specimens in collections do not reliably reflect the true abundance of species in the natural environment. Furthermore, uneven survey efforts, data storage, and mobilization, worsened by geographical differences, introduce spatial biases in species occurrence records55. Our results suggest that agriculturally significant provinces or economically advanced regions, such as Shandong, Yunnan, and Jiangsu, possess more comprehensive species record data due to more frequent surveys. Conversely, provinces like Fujian and Jiangxi may have undocumented occurrences owing to limited survey effort or insufficient survey frequency. This discrepancy could reduce the accuracy of species distribution models and introduce prediction bias56. However, the adverse effects of spatial bias are not always evident57.

Furthermore, excluding pivotal environmental variables from the model could affect prediction accuracy. Vellend et al. (2014) found that in mid-to-high elevation areas, soil factors have a stronger influence on species distribution than climatic factors58. In low-altitude regions, land cover, inclusive of human-modified landscape elements, plays a more pivotal role in regulating species distribution and can diminish the dominance of climate variables59. This suggests that the actual suitability potential in complex landscapes may be underestimated if variables such as soil type and topographical complexity are ignored. Host plant factors (such as species, quantity, and growth cycle) are also crucial, directly affecting pest population dynamics and survival. For example, the abundance and reproductive success of L. sativae can vary depending on the host plant. Oviposition typically occurs on the underside of cowpea leaves, while traits such as fecundity and longevity are influenced by plant species and leaf characteristics60.

Based on the “Logistic” threshold, suitable habitats can be categorized into three grades: non-suitable habitat (0-MTSPS), medium-lowly suitable habitat (MTSPS-0.5), and highly suitable habitat (0.5-1). Alternatively, they may be classified into four grades: non-suitable habitat, lowly suitable habitat, moderately suitable habitat, and highly suitable habitat23,25. To better differentiate habitat suitability, this study adopts a more refined four-grade classification (0-MTSPS, MTSPS-0.4, 0.4-0.6, 0.6-1). To improve the accuracy of the MaxEnt ecological niche model in predicting invasive species distribution, future research should: (1) integrate multi-source survey data to mitigate recording biases; (2) include crucial variables, notably the distribution of host plants, soil characteristics, and topographical elements; (3) refine modelling techniques to capture the complex mechanisms of biological invasions.

Conclusions

This study uses the MaxEnt model to simulate the distribution range and spatial pattern of suitable areas for L. sativae under current and future climate scenarios in China. Under current climate conditions, the most suitable habitats for L. sativae are primarily found in North China, East China, Central China, South China, Southwest China, and the southern parts of Northeast China. Under the SSP126 and SSP585 scenarios, the total suitable habitat is projected to increase by the 2050s and 2070s. Highly suitable habitats are anticipated to contract significantly, while moderately and lowly suitable habitats are expected to expand significantly, highlighting the differences in ecological response under different emission pathways. Annual mean temperature, precipitation seasonality and precipitation of the wettest quarter are the key determinants of the potential geographical distribution of L. sativae. Our results offer valuable insights for adapting pest management strategies in response to future climate change.