Introduction

Mountain regions globally cover more than 27% of the Earth’s land surface and are home to over 1 billion people, representing approximately 14% of the global population1. These regions are often characterized by steep and unstable slopes, which create significant challenges for their inhabitants. Between 2005 and 2014, countries with mountainous terrains accounted for over 70% of disaster-related fatalities2,3. However, these statistics likely underrepresent the broader impact of recurrent and localized mass movements, which are considerable barriers to sustainable livelihoods and regional development4,5,6,7,8. Landslides, which are among the deadliest natural hazards, account for approximately 17% of all natural disasters9,10. This trend is expected to rise in the future, exacerbated by increasing urbanization, deforestation, and the impacts of climate change11,12. Extreme precipitation, often amplified by global warming, is a key driver for landslide occurrences, with the potential to trigger nonlinear increases in landslide activity13. The annual global economic losses due to landslides are estimated to reach approximately 18 billion Euros, with significant social and environmental repercussions14. From 1980 to 2017, data from the Emergency Events Database (EM-DAT) documented 631 landslide disaster events, resulting in 44,541 fatalities15. Further, the global catastrophic landslide database, covering 2004–2010, reveals an average of 374 catastrophic landslides annually, leading to an average of 4,617 fatalities each year9.

Numerous studies have also highlighted the temporal and spatial distribution of landslides, underlining their status as a major global hazard with substantial human and economic consequences every year16,17,18,19,20. Countries such as Colombia, Peru, Brazil, Nicaragua, El Salvador, Italy, Nepal, India, China, and New Zealand are particularly prone to landslides21. A recent study22 projects a significant increase in landslide-related population risk across different timeframes (1971–2000, 2031–2060, and 2066–2095), with countries like China, India, Turkey, the Philippines, and Nepal expected to face heightened risks. India, in particular, ranks second globally, with projected increases in population risk from 360 to 760 (111%) between 2031 and 2060, and a sustained high level of 690 (92%) from 2066 to 2095. This highlights the growing landslide risks in Asia, especially in the coming decades. Asia is the continent with the highest landslide risk, particularly along the Himalayan arc10,23. High Mountain Asia has long been recognized as a hotspot for landslide risks24, and studies suggest that landslide hazards in this region are expected to increase further in the future17. According to EM-DAT, Asia accounted for 58% of all global landslides and 41% of all disaster-related incidents from 1982 to 202225. Southern Asia, home to countries like India, Nepal, and Pakistan, is particularly susceptible, representing 34% of Asia’s landslide occurrences25. The Himalayan mountain range alone accounts for 89% of landslides in Southern Asia12.

The risk of casualties from landslides is expected to rise markedly in the forthcoming decades, especially in Asia, which comprises 60% of the ten countries with the highest risk22. India is projected to incur an average of 760 fatalities annually from 2031 to 2060, ranking after China (1,670) and before Afghanistan, the Philippines, and Indonesia22. Relative to the 1971–2000 baseline, these nations are anticipated to experience annual increases of over 200 casualties22. This escalation is driven by increased extreme precipitation and anticipated population growth in mountainous, landslide-prone regions22.

India, being one of the most landslide-prone countries, faces considerable challenges in mitigating these risks. With an estimated population of 1.46 billion in 2025, India represents approximately 17.78% of the global population26. According to the Geological Survey of India (GSI), around 0.42 million square kilometers, or 12.6% of India’s land area, is prone to landslides27. This includes the North-West Himalaya (NWH), North-East Himalaya (NEH), and Ghats regions of India. Landslides in India are primarily triggered by precipitation, with the Himalayan region also susceptible to earthquake-triggered landslides due to its location in the highest seismic zones (Zones IV and V) of India28. The region has experienced numerous devastating earthquakes, such as the Shillong earthquake (8.1 magnitude) in 1897, which triggered significant landslides29.

Landslide susceptibility (LS), hazard, and risk zoning are three interconnected approaches that play a critical role in land-use planning, offering valuable insights into the potential impacts of landslides on communities and infrastructure30,31,32,33. Susceptibility mapping, in particular, serves as the initial step in assessing the likelihood of landslides in a given area by analyzing the influence of various geo-environmental factors, excluding the temporal element of landslide frequency34,35,36. High-resolution landslide susceptibility zonation (LSZ) studies at the national or global scale have been limited due to the unavailability of comprehensive landslide inventories and the high computational demands required for analysis37,38. Presently, the majority of extensive LSZ maps depend on heuristic methodologies and are often generated at low spatial resolutions. Recent studies have investigated the application of machine learning (ML) and deep learning (DL) techniques; however, these endeavors frequently utilize restricted input datasets. Table 1 encapsulates prior studies on LSZ, spanning global to national levels. It underscores the diversity of approaches, databases, and spatial resolutions employed in global, continental, and national research. However, with advancements of technology and availability of open-source data, it is now possible to develop more detailed susceptibility maps.

A variety of strategies are employed in landslide susceptibility zoning, and the choice of method substantially influences predictive performance. These methods are typically divided into two broad categories: qualitative (knowledge-driven) approaches, which rely on the expert judgment of researchers32,36,39,40, and quantitative (data-driven) approaches, which apply mathematical or statistical models to correlate landslides with their controlling factors41,42. Recent studies indicate that the Analytic Hierarchy Process (AHP) and statistical models such as Yule's coefficient (Yc) and the frequency ratio (FR) are among the most commonly used methods for both large- and small-scale LSZ32,43,44,45,46,47. This study uses a combination of these models to mitigate the potential bias of any single method, with the results validated through receiver operating characteristic (ROC) analysis.

This research enhances National Landslide Susceptibility Zonation (NLSZ) at high resolution by overcoming significant limitations in prior evaluations. Although ref. 48 produced a landslide susceptibility map for the entire country of India, that work was limited to a small number of causative factors and did not consider important predictors such as lithology, geomorphology, land use/cover, and earthquake density, all of which influence the likelihood of landslides. This research incorporates these essential predictors together with additional geo-environmental predictors, markedly improving the predictive accuracy of susceptibility mapping. The identification of key predictors relies on comprehensive expert insight, an exhaustive literature review, and a careful multicollinearity assessment, ensuring a robust model. This work uses a spatial resolution of 90 m, finer than the 100 m resolution of ref. 48, hence offering greater spatial detail. To further refine susceptibility evaluations, the analysis is confined to landslide-prone States and Union Territories (SUTs), omitting areas with gentler slopes to reduce classification errors. In addition, this study spans heuristic to data-driven approaches to avoid dependence on a single model and to strengthen the results through comparative assessment of the models. The susceptibility models are validated by receiver operating characteristic (ROC) curve analysis, using extensive landslide datasets (all landslides, training landslides, and testing landslides) to ensure reliability. Upon validation, the best susceptibility zonation map is used to evaluate landslide risk, providing essential insights for disaster mitigation.

The establishment of a novel, high-resolution methodological framework for the purpose of landslide susceptibility zonation and risk assessment on a national scale is one of the most important contributions that this study has made. The approach improves the precision, reliability, and usefulness of susceptibility models by systematically integrating major causal elements at a spatial resolution of ninety meters across the entire area. The results of this research have significant value not only for the scientific community but also for those who make decisions regarding climate and environmental policy, international funding agencies like the World Bank, governmental bodies, and industries that are involved in infrastructure planning, land-use management, and disaster risk reduction.

Table 1 Overview of previous studies on LSM from global to national scale.

Study area

The chosen eighteen SUTs exemplify India’s most landslide-prone areas, featuring varied physiographic zones and hazard characteristics. The NWH and NEH include steep gradients, elevated seismic activity, and substantial monsoonal precipitation, resulting in recurrent landslides. The Peninsular region includes states such as Kerala, Goa, Maharashtra, Tamil Nadu, and Karnataka, which feature the Western Ghats, where orographic rainfall and mountainous topography increase the risk of landslides. SUTs with gentler slopes and low susceptibility are excluded to preserve the geo-environmental relevance necessary for precise LSZ. The chosen study area encompasses the complete geographical and hazard diversity of India’s landslide-prone regions, rendering it essential for formulating effective national strategy. This extensive coverage facilitates the formulation of region-specific mitigation plans within a unified national framework.

The study area (Fig. 1), covering approximately 1,365,826 square kilometers, includes regions in India highly susceptible to landslides, such as the NWH and NEH, as well as the Western Ghats and Konkan hills (https://ndma.gov.in). Landslides are most prevalent in the NWH, followed by the NEH and the Western Ghats, as per the records of GSI27. The river systems in these regions primarily flow from north to south, driven by steep gradients and the presence of glaciers in the northern and northeastern parts of India. Mawsynram in Meghalaya holds the record for the highest annual rainfall in India and is recognized as the wettest place on Earth, with an average annual rainfall of 11,872 millimeters59. In 1985, Mawsynram set a world record by receiving an astounding 26,000 millimeters of rainfall, according to the Guinness Book of World Records. The Himalayan belt, being a young and tectonically active geological region, is predominantly composed of meta-sedimentary rocks that are highly prone to denudation and erosion12. The combination of steep slopes and rapidly flowing rivers in this region leads to significant toe erosion, resulting in unstable slopes. Besides geological factors, anthropogenic activities and natural triggers, such as the intensity and duration of rainfall, play a critical role in the spatial and temporal occurrence of landslides. The majority of landslides in these areas occur on the windward sides of the southwest slopes of the Western Ghats and Himalayas, where rainfall is heaviest. Human activities, particularly the construction and widening of roads, further increase the risk of landslides in these susceptible regions.

The research area is mostly situated in seismically active regions28, notably Seismic Zones IV and V, which indicate a significant potential for earthquake-induced hazards, including landslides. Elevation varies from 12 to 8,546 m above mean sea level, indicating significant topographic diversity. The region has significant geological, physiographic, and geomorphic diversity, featuring lithological strata from the Holocene to the Paleoproterozoic era27.

Fig. 1

Location of the study area on the world map, highlighting landslide occurrences. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

Materials and methods

The GSI serves as the nodal agency for landslide-related activities across the country and has made significant contributions to the field through extensive work in landslide inventory and susceptibility mapping in various regions of India27. To achieve comprehensive national coverage, GSI initiated the National Landslide Susceptibility Mapping (NLSM), conducted on a 1:50,000 scale, between 2013 and 2020. Recognizing the reliability and authenticity of GSI’s database, the inventory, along with lithology, structural lineament, and other relevant datasets, has been sourced from the GSI website (www.gsi.gov.in) for use in this research work. The specific details of additional datasets utilized in this research, which were obtained from their respective sources, are outlined in Table 2.

Table 2 Overview of datasets used in the study, including data type, scale, and source.

The primary objective of this research is to develop an LSZ map for landslide-prone SUTs in India through a comparative analysis of various methods34,46. The selection of causative factors in statistical and data-driven methods is typically determined by the specific characteristics of the study area and the availability of data36,60,61,62. In the absence of universal criteria for selecting landslide conditioning factors, this study considers ten factors chosen on the basis of terrain conditions, an extensive literature review, and data availability60,63,64,65,66,67,68,69,70. These factors encompass topographical, hydro-geological, geomorphological, geological, and anthropogenic influences, as well as triggering factors. The inclusion of a diverse range of causative factors enables a comprehensive analysis of the interrelationships between each factor and landslide distribution, ultimately aiding in the identification of the most suitable LSZ. In this study, the widely used data-driven FR and Yc models and the knowledge-driven AHP are used to calculate the weights of the conditioning factors, which are then applied to generate the NLSZ. The comparative analysis is instrumental in determining the most appropriate model for the Indian terrain. The detailed methodology adopted in the study is presented in Fig. 2.

Fig. 2

Methodological framework for developing Landslide Susceptibility and Risk Maps. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/ and Microsoft PowerPoint URL: https://www.microsoft.com/microsoft-365/powerpoint).

Multicollinearity

In LSZ, evaluating multicollinearity among factors is very important to make sure that the model is reliable and correct71. Multicollinearity happens when predictor variables are strongly linked, which makes coefficient values unstable and makes it harder to understand the model72. The Variance Inflation Factor (VIF) and Tolerance (TOL) are two common ways to find multicollinearity73. A high TOL value, approaching 1, indicates minimal multicollinearity, while a low TOL value, nearing 0, signifies significant multicollinearity, which may pose challenges74. A VIF exceeding 10 is typically regarded as a benchmark for significant multicollinearity. These metrics can be computed using RStudio software (RStudio-2025.09.2–418.exe https://posit.co/download/rstudio-desktop/) and are theoretically represented as:

$$TOL = \frac{1}{VIF} = 1 - R_{i}^{2}$$
$$VIF = \frac{1}{TOL_{i}} = \frac{1}{1 - R_{i}^{2}}$$

In this case, Ri² denotes the coefficient of determination derived from regressing the i-th predictor against all other predictors. TOL represents the fraction of variance in a predictor that remains unexplained by other independent variables. The VIF quantifies the degree to which the variance of a regression coefficient is augmented as a result of collinearity.
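The two relations above can be checked numerically. The following is a minimal Python sketch rather than the RStudio workflow used in the study; the function name and the R² values are hypothetical and chosen only to illustrate how TOL and VIF move together.

```python
def tolerance_and_vif(r_squared):
    """Return (TOL, VIF) for a predictor whose regression on the other
    predictors yields the given coefficient of determination R_i^2."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    tol = 1.0 - r_squared   # share of variance not explained by the other predictors
    vif = 1.0 / tol         # inflation of the regression coefficient's variance
    return tol, vif


# Hypothetical R^2 values, for illustration only.
for r2 in (0.0, 0.5, 0.9):
    tol, vif = tolerance_and_vif(r2)
    print(f"R^2={r2:.1f}  TOL={tol:.2f}  VIF={vif:.1f}")
```

Under these definitions, the commonly cited VIF benchmark of 10 corresponds to R² = 0.9, that is, a TOL of 0.1.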

Analytic hierarchy process (AHP)

The AHP is a widely used multi-criteria decision-making (MCDM) method that assesses landslide susceptibility systematically by assigning weights to factors based on expert opinion40. AHP works well for combining many geo-environmental factors, and it has been used successfully in landslide risk assessments32. Implementing AHP in this work involves several steps. The first step is to select the landslide influencing factors according to their importance for landslide occurrence. Then, a Pairwise Comparison Matrix (PCM) is constructed to rate the importance of each factor relative to the others. On the 9-point scale, 1 means that two factors are equally important, 3 that one factor is moderately more important than the other, 5 strongly more important, 7 very strongly more important, and 9 extremely more important75. Intermediate values (2, 4, 6, and 8) are used to refine the comparisons. Following65,76, experts assign scores based on their knowledge of the features of landslide-prone terrain, so that the ranking reflects how each factor actually affects the likelihood of a landslide. After the PCM has been built, the eigenvector method is used to obtain adjusted weights for each factor65. The adjusted weight (Wi) for each predictor is computed as:

$$W_{i} = \frac{\sum_{j=1}^{n} A_{ij}}{n}$$

where Aij represents the value of the ith row and jth column in the PCM, and n is the total number of factors. This step ensures that all conditioning factors are assigned appropriate weights based on their relative significance in landslide occurrence.
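As an illustration of the weighting step, the sketch below derives weights from a small PCM in Python. It assumes that Aij in the weight equation refers to the PCM after each column has been divided by its column sum, which is the usual row-average approximation to the eigenvector method; the source does not state this explicitly. The three-factor matrix is hypothetical and perfectly consistent, so the approximation is exact here.

```python
def ahp_weights(pcm):
    """Approximate AHP weights: normalise each PCM column to sum to 1,
    then average across each row (assumed reading of the weight equation)."""
    n = len(pcm)
    col_sums = [sum(pcm[i][j] for i in range(n)) for j in range(n)]
    normalised = [[pcm[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in normalised]


# Hypothetical 3-factor PCM (factor 1 vs factor 2 vs factor 3), illustration only.
# Entry [i][j] states how much more important factor i is than factor j.
pcm = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = ahp_weights(pcm)
print([round(w, 3) for w in weights])
```

The resulting weights sum to one, so they can be used directly in a weighted overlay.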

The final weights derived from the AHP analysis are used to compute the Landslide Susceptibility Index (LSI) through the Weighted Linear Combination (WLC) method43,77. The LSI for AHP is calculated using the following equation:

$$\begin{aligned}LSI_{AHP} =\; &(\text{Aspect} \times W_{AHP}) + (\text{Convexity} \times W_{AHP}) + (\text{Drainage} \times W_{AHP}) + (\text{Earthquake} \times W_{AHP})\\&+ (\text{Fault} \times W_{AHP}) + (\text{Geomorphon} \times W_{AHP}) + (\text{LULC} \times W_{AHP}) + (\text{Rainfall} \times W_{AHP})\\&+ (\text{Slope} \times W_{AHP}) + (\text{Lithology} \times W_{AHP})\end{aligned}$$

WAHP represents the weight assigned to each conditioning factor of the landslide.
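The weighted linear combination can be sketched for a single raster cell as follows. This is an illustrative Python example, not the GIS workflow of the study: it uses only three of the ten factors to stay short, and the class ratings and weights are hypothetical.

```python
def lsi_wlc(ratings, weights):
    """Weighted linear combination for one cell: sum of rating x weight."""
    missing = set(weights) - set(ratings)
    if missing:
        raise KeyError(f"no rating supplied for: {sorted(missing)}")
    return sum(ratings[name] * weights[name] for name in weights)


# Hypothetical class ratings for one cell and hypothetical AHP weights.
cell_ratings = {"slope": 5, "rainfall": 3, "aspect": 1}
ahp_factor_weights = {"slope": 0.5, "rainfall": 0.3, "aspect": 0.2}

lsi_value = lsi_wlc(cell_ratings, ahp_factor_weights)
print(round(lsi_value, 2))
```

In practice the same sum is evaluated for every cell of the co-registered factor rasters.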

Frequency ratio (FR)

The FR method is a prevalent bivariate statistical technique employed to evaluate landslide susceptibility by examining the correlation between historical landslides and their causal factors78. This method quantifies the likelihood of landslide occurrence for each class of each contributing factor, offering an objective assessment of susceptibility79,80. The initial step entails the preparation of a detailed landslide inventory dataset, assembled from field surveys, remote sensing, and historical documentation. The conditioning factors affecting slope stability are chosen based on previous research and their significance in the study area81. All thematic layers are processed using GIS methods to guarantee a consistent spatial resolution for analysis. The FR value for each factor class is then calculated using the following equation68:

$$FR = \frac{\%\ \text{of landslide pixels in predictor class}}{\%\ \text{of area covered by predictor class}}$$

The numerator denotes the percentage of all landslide pixels that fall within a given factor class, while the denominator is the percentage of the study area occupied by that class. The FR value indicates the relative probability of landslides occurring within a specific class of a conditioning factor. An FR greater than 1 signifies a positive association between the factor class and landslides, whereas an FR less than 1 indicates diminished susceptibility69.
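The FR calculation can be illustrated with a short Python sketch. The slope classes and pixel counts below are hypothetical and serve only to show the arithmetic; they are not values from the study.

```python
def frequency_ratio(landslide_pixels, class_pixels):
    """FR per class = (% of landslide pixels in class) / (% of area in class)."""
    total_landslides = sum(landslide_pixels.values())
    total_area = sum(class_pixels.values())
    fr = {}
    for cls, area in class_pixels.items():
        pct_landslides = 100.0 * landslide_pixels.get(cls, 0) / total_landslides
        pct_area = 100.0 * area / total_area
        fr[cls] = pct_landslides / pct_area
    return fr


# Hypothetical pixel counts per slope class (degrees), illustration only.
slope_landslides = {"0-15": 10, "15-30": 30, "30-45": 60}
slope_area = {"0-15": 500, "15-30": 300, "30-45": 200}

slope_fr = frequency_ratio(slope_landslides, slope_area)
for cls, value in slope_fr.items():
    print(cls, round(value, 2))
```

In this made-up example the steepest class holds 60% of the landslide pixels on 20% of the area, giving an FR of 3, while the gentlest class falls well below 1.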

The LSIFR is determined by aggregating the FR values of all chosen parameters81,82.

$$\begin{aligned}LSI_{FR} = \sum FR_{i} =\; &FR_{\text{Aspect}} + FR_{\text{Convexity}} + FR_{\text{Drainage}} + FR_{\text{Earthquake}} + FR_{\text{Fault}}\\&+ FR_{\text{Geomorphon}} + FR_{\text{LULC}} + FR_{\text{Rainfall}} + FR_{\text{Slope}} + FR_{\text{Lithology}}\end{aligned}$$

FR stands for the frequency ratio of each factor class. The final LSI map is then classified into different susceptibility zones.
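To show how the summation works on rasters, the Python sketch below replaces each cell's class label with the FR of that class and adds the layers cell by cell. The 2 × 2 grids, class names, and FR tables are hypothetical, and only two factors are used for brevity.

```python
def lsi_fr(class_rasters, fr_tables):
    """Sum, cell by cell, the FR value of the class each factor takes there."""
    names = list(class_rasters)
    rows = len(class_rasters[names[0]])
    cols = len(class_rasters[names[0]][0])
    lsi = [[0.0] * cols for _ in range(rows)]
    for name in names:
        for r in range(rows):
            for c in range(cols):
                lsi[r][c] += fr_tables[name][class_rasters[name][r][c]]
    return lsi


# Hypothetical classified rasters and FR lookup tables, illustration only.
rasters = {
    "slope": [["low", "high"], ["high", "low"]],
    "rainfall": [["dry", "wet"], ["dry", "wet"]],
}
tables = {
    "slope": {"low": 0.2, "high": 3.0},
    "rainfall": {"dry": 0.5, "wet": 1.5},
}

lsi_grid = lsi_fr(rasters, tables)
print(lsi_grid)
```

The scheme used to split the resulting index into susceptibility zones is a separate choice and is not part of this sketch.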

Yule's coefficient (Yc)

The Yc method, together with the Landslide Occurrence Frequency Score (LOFS), offers a statistical framework for assessing the correlation between landslide events and conditioning factors83,84,85. This approach improves LSZ by integrating categorical factor correlations and the frequency of landslide events within designated factor classes86. The initial phase entails the compilation of a landslide inventory and thematic layers that represent the essential conditioning factors87. The landslide inventory is used to delineate landslide-affected and unaffected regions for statistical analysis. A contingency table is created for each class of each conditioning factor to quantify the relationship between landslide occurrences and that class. Yule’s coefficient (Yc) is computed as described in83:

$$Yc = \frac{\sqrt{\dfrac{F_{ab}}{F_{a'b}}} - \sqrt{\dfrac{F_{ab'}}{F_{a'b'}}}}{\sqrt{\dfrac{F_{ab}}{F_{a'b}}} + \sqrt{\dfrac{F_{ab'}}{F_{a'b'}}}}$$

Fab denotes a positive match, where both landslide and factor class are present. Fa’b and Fab’ represent areas of mismatch, with Fa’b indicating the absence of a specific factor class despite the presence of landslides, while Fab’ signifies the presence of a factor class in the absence of landslides. Fa’b’ indicates a negative match, where both landslide and factor class are absent. Like the Pearson correlation coefficient, Yc ranges from −1 to +1, with positive values signifying a positive spatial association and negative values indicating the opposite. The LOFS is calculated from the Yc values of all predictor maps:

$$LOFS = \begin{cases} 0, & \text{for } Yc \le 0 \\ \dfrac{Yc}{Yc_{max}}, & \text{for } Yc > 0 \end{cases}$$

LOFS represents the degree of influence of each factor class on susceptibility to failure, ranging from zero to one, whereas Ycmax is the maximum Yc among all classes within a spatial predictor.
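A worked example may make the contingency terms concrete. The Python sketch below computes Yc and LOFS for two classes of one predictor; the argument names map to Fab, Fa’b, Fab’ and Fa’b’ in that order, and the pixel counts are hypothetical.

```python
from math import sqrt


def yule_coefficient(both_present, landslide_only, class_only, both_absent):
    """Yc from the four contingency counts (Fab, Fa'b, Fab', Fa'b').
    A zero in landslide_only or both_absent raises ZeroDivisionError."""
    a = sqrt(both_present / landslide_only)
    b = sqrt(class_only / both_absent)
    return (a - b) / (a + b)


def lofs(yc_by_class):
    """Rescale positive Yc values by the class maximum; others score 0."""
    yc_max = max(yc_by_class.values())
    return {cls: (yc / yc_max if yc > 0 else 0.0)
            for cls, yc in yc_by_class.items()}


# Hypothetical counts: 50 landslide pixels and 500 non-landslide pixels in total.
yc_values = {
    "class_A": yule_coefficient(40, 10, 100, 400),
    "class_B": yule_coefficient(10, 40, 400, 100),
}
lofs_values = lofs(yc_values)
print(yc_values, lofs_values)
```

Here class_A concentrates landslides (positive Yc, LOFS of 1), while class_B is negatively associated and therefore receives a LOFS of 0.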

Different spatial factors may be associated with landslides in different ways. However, because landslides are associated with multiple interacting variables, an investigation of inter-predictor weights is essential for predictive susceptibility modeling. A comprehensive understanding of the predictors of landslides may enhance the analysis; yet expert knowledge is subjective and may assign arbitrary weights to the variables. Using Eq. 3, a predictor weight (Wi) was calculated for each geo-factor, based on the strength of its spatial association with landslides.

$$W_{i} = \frac{Yc_{max} - Yc_{min}}{\left.\left(Yc_{max} - Yc_{min}\right)\right|_{min}}$$

To determine the predictor weight for each geo-factor, the absolute difference between its maximum and minimum Yc values was divided by the smallest such difference found among all geo-factors, so that the least discriminating factor receives a weight of 1.
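The weighting rule can be sketched in Python as follows, reading the denominator as the smallest Yc range among all geo-factors. The factor names and Yc values are hypothetical.

```python
def predictor_weights(yc_by_factor):
    """W_i = (Yc_max - Yc_min) of factor i divided by the smallest such
    range over all factors. A factor with zero range would make the
    divisor zero and raise ZeroDivisionError."""
    spans = {factor: max(values) - min(values)
             for factor, values in yc_by_factor.items()}
    smallest = min(spans.values())
    return {factor: span / smallest for factor, span in spans.items()}


# Hypothetical per-class Yc values for two geo-factors, illustration only.
yc_per_factor = {
    "slope": [0.6, -0.6, 0.1],
    "aspect": [0.2, -0.1, 0.0],
}
factor_weights = predictor_weights(yc_per_factor)
print({k: round(v, 2) for k, v in factor_weights.items()})
```

With these made-up numbers, slope spans a Yc range four times wider than aspect and so receives four times the weight.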

The LSI for Yc was computed by integrating the LOFS values for each geo-factor class (Eq. 2) and the Wi values for each geo-factor map (Eq. 3) through the weighted multi-class index overlay method in the GIS platform.

$$\begin{aligned}LSI_{Yc} = \frac{\sum_{i}^{n}\left(LOFS_{ij} \times W_{i}\right)}{\sum_{i}^{n} W_{i}} = [\,&(LOFS_{\text{Aspect}} \times W_{\text{Aspect}}) + (LOFS_{\text{Convexity}} \times W_{\text{Convexity}})\\&+ (LOFS_{\text{Drainage}} \times W_{\text{Drainage}}) + (LOFS_{\text{Earthquake}} \times W_{\text{Earthquake}})\\&+ (LOFS_{\text{Fault}} \times W_{\text{Fault}}) + (LOFS_{\text{Geomorphon}} \times W_{\text{Geomorphon}})\\&+ (LOFS_{\text{LULC}} \times W_{\text{LULC}}) + (LOFS_{\text{Rainfall}} \times W_{\text{Rainfall}})\\&+ (LOFS_{\text{Slope}} \times W_{\text{Slope}}) + (LOFS_{\text{Lithology}} \times W_{\text{Lithology}})\,]\,/\,\text{Cumulative weight}\end{aligned}$$

where i indexes the predictor maps (n in total), and j denotes the class of a predictor within its map. The generated map (LSI) illustrates a probabilistic model of landslide susceptibility, offering insights into the probability of slope failures in the examined region60.
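For a single cell, the weighted multi-class index overlay reduces to a weighted average of LOFS values. The Python sketch below illustrates this with two hypothetical factors; the LOFS values and weights are invented for the example.

```python
def lsi_yc(lofs_by_factor, weight_by_factor):
    """Weighted mean of the LOFS values found at one cell."""
    cumulative_weight = sum(weight_by_factor.values())
    weighted_sum = sum(lofs_by_factor[factor] * weight
                       for factor, weight in weight_by_factor.items())
    return weighted_sum / cumulative_weight


# Hypothetical LOFS of the classes present at one cell, and factor weights.
cell_lofs = {"slope": 1.0, "aspect": 0.5}
cell_weights = {"slope": 4.0, "aspect": 1.0}

cell_lsi = lsi_yc(cell_lofs, cell_weights)
print(round(cell_lsi, 2))
```

Because each LOFS lies between 0 and 1 and the weights are normalised by their sum, the index also stays between 0 and 1.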

Landslide inventory

A landslide inventory systematically documents landslide occurrences together with their locations and characteristics, serving as the basis for susceptibility mapping, risk evaluation, and early warning systems88,89. The premise is that future landslides will occur under conditions similar to those of past events, so a comprehensive inventory is needed for dependable models90. Landslides can be delineated by field surveys, aerial images, and remote sensing, depending on the scope of the study and the availability of data91. This study utilized landslide data from the GSI27, which maintains a comprehensive national inventory that integrates polygon and point datasets representing regional landslide attributes. Polygon and point data from the Himalayas and Ghats regions were standardized by converting polygons into points and subsequently cleaned to eliminate duplicates, yielding an inventory of 109,504 landslides at a resolution of 90 m. Figure 1 illustrates the spatial distribution of landslides across the landslide-susceptible SUTs of India. Rainfall-induced landslides predominate in the Himalayas and Ghats, whereas earthquake-triggered incidents are concentrated in the tectonically active belt of the Himalayas.
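The standardization step can be sketched in Python under two assumptions that the text does not confirm: that each polygon is reduced to its centroid, and that duplicates are defined as points falling in the same 90 m grid cell. Coordinates are hypothetical and assumed to be in a projected system with metre units.

```python
from math import floor


def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    twice_area = cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


def deduplicate(points, cell_size=90.0):
    """Keep the first point encountered in each grid cell (assumed rule)."""
    seen = {}
    for x, y in points:
        key = (floor(x / cell_size), floor(y / cell_size))
        seen.setdefault(key, (x, y))
    return list(seen.values())


# Hypothetical inventory: two mapped points and one mapped polygon.
point_records = [(10.0, 10.0), (200.0, 50.0)]
polygon_records = [[(0, 0), (100, 0), (100, 100), (0, 100)]]

all_points = point_records + [polygon_centroid(p) for p in polygon_records]
inventory = deduplicate(all_points)
print(inventory)
```

In this toy case the polygon centroid lands in the same cell as the first point, so the cleaned inventory keeps two records rather than three.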

Landslide predictor maps

The selection of suitable predictor variables is essential for producing dependable LSZ34,92. This work compiles an extensive array of predictors for both Himalayan and non-Himalayan terrains, ensuring spatial and geomorphological diversity53,93,94,95. The predictors are classified into geological (lithology, fault density), topographic (slope, aspect, convexity), geomorphological (geomorphons), hydrological (drainage density), climatic (annual rainfall), anthropogenic (land use/land cover), and seismological (earthquake magnitude) factors, each of which influences slope instability64,66,69,96,97,98,99,100,101. The raw predictor datasets, which are available in diverse resolutions and formats (vector and raster), are standardized by converting all layers to raster format and resampling them to align with the 90 m resolution of the Digital Elevation Model (DEM). To alleviate collinearity among predictors, an issue that might misrepresent statistical relationships, this research uses TOL and VIF102. Variables in the top decile with VIF values beyond five are progressively eliminated until all remaining variables have VIF values beneath this threshold, hence providing statistical robustness103. After eliminating collinear variables, the predictor datasets are synchronized with the training data for LSZ modeling. Altitude was omitted from the final model due to its disproportionately high impact on LSZ outputs, perhaps resulting from the considerable altitudinal variation within the study area. The predictors employed in the analysis and their significance are summarized below:
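The progressive elimination can be sketched in plain Python. This is a simplified illustration rather than the procedure run in RStudio: it drops the single predictor with the highest VIF in each round instead of working by deciles, and it obtains each VIF as the corresponding diagonal entry of the inverse correlation matrix, which equals 1 / (1 − Ri²). Predictor names and sample values are hypothetical.

```python
from math import sqrt


def correlation_matrix(columns):
    names = list(columns)
    n = len(columns[names[0]])
    dev = {}
    for k in names:
        mean = sum(columns[k]) / n
        dev[k] = [x - mean for x in columns[k]]
    norm = {k: sqrt(sum(d * d for d in dev[k])) for k in names}
    return [[sum(a * b for a, b in zip(dev[p], dev[q])) / (norm[p] * norm[q])
             for q in names] for p in names]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (fails if singular)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_scores(columns):
    inverse = invert(correlation_matrix(columns))
    return {name: inverse[i][i] for i, name in enumerate(columns)}


def drop_collinear(columns, threshold=5.0):
    """Remove the worst predictor until every VIF is at or below threshold."""
    kept = dict(columns)
    while len(kept) > 1:
        scores = vif_scores(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del kept[worst]
    return kept


# Hypothetical samples: "slope_copy" nearly duplicates "slope".
predictors = {
    "slope": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "slope_copy": [1.1, 2.1, 2.9, 3.9, 5.1, 6.1],
    "rainfall": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
kept = drop_collinear(predictors)
print(sorted(kept))
```

One of the two near-duplicate columns is removed, after which the remaining predictors fall below the threshold of five.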

Aspect

The orientation of slopes affects microclimatic variables, including solar radiation, evapotranspiration, and weathering. In the Indian subcontinent, south-facing slopes frequently undergo accelerated weathering due to extended solar exposure, whilst other orientations may maintain elevated soil moisture levels due to reduced evaporation, hence heightening the risk of saturation-induced failures21. Aspect classification is illustrated in Fig. 3a.

Convexity

Terrain convexity (Fig. 3b) significantly influences material transit and accumulation. Concave zones generally serve as depositional sites, accumulating loose material, while convex zones are more susceptible to erosion and operate as detachment locations, hence heightening the probability of landslide initiation12.

Drainage

Proximity to drainage networks is a pivotal feature affecting landslide susceptibility, as regions adjacent to streams and rivers frequently endure elevated erosion, soil saturation, and slope undercutting. Proximity to drainage channels generally signifies areas with heightened surface runoff and diminished infiltration capacity, especially in places experiencing heavy monsoonal precipitation, thereby increasing the likelihood of slope failure48,59. This study utilized GIS-based Euclidean distance analysis to determine the distance to drainage, which was integrated as a crucial conditioning factor in the susceptibility model (Fig. 3c).
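A Euclidean distance raster of this kind can be illustrated with a brute-force Python sketch. It is meant only to show what the output represents: the 3 × 3 mask is hypothetical, distances are measured between cell centres, and GIS packages use far more efficient algorithms for large grids.

```python
from math import hypot


def distance_to_features(feature_mask, cell_size=90.0):
    """Distance (in metres) from each cell centre to the nearest feature cell."""
    rows, cols = len(feature_mask), len(feature_mask[0])
    sources = [(r, c) for r in range(rows) for c in range(cols)
               if feature_mask[r][c]]
    if not sources:
        raise ValueError("mask contains no feature cells")
    return [[min(hypot(r - sr, c - sc) for sr, sc in sources) * cell_size
             for c in range(cols)] for r in range(rows)]


# Hypothetical mask: 1 marks cells crossed by a drainage line.
drainage_mask = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
]
drainage_distance = distance_to_features(drainage_mask)
for row in drainage_distance:
    print(row)
```

The same function applies unchanged to a fault mask, since only the source cells differ.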

Earthquake

Earthquakes induce landslides by ground shaking, liquefaction, and the reduction of shear strength in unconsolidated materials46,47. A seismic density map was created using nearly a century of magnitude-based seismic recordings, resampled to a 90-meter resolution for spatial modelling (Fig. 3d).

Fault

The proximity to active faults and thrust zones significantly influences landslide susceptibility, as these regions are frequently structurally weakened by tectonic fracturing, seismic activity, and cumulative ground deformation66,67. This study assessed the distance to faults using GIS-based Euclidean distance analysis, which was included as a vital conditioning factor in the susceptibility modeling framework (Fig. 3e).

Geomorphon

Geomorphon-based classification46 identifies landform features including ridges, valleys, and hollows, offering a structured approach to describing terrain morphology (Fig. 3f). This classification improves the recognition of geomorphic zones more susceptible to mass movements68,69. Geomorphons were produced using SAGA GIS and incorporated into the modeling process (Jasiewicz & Stepinski, 2013).

LULC

Anthropogenically induced land use alterations, including deforestation, agriculture, and urbanization, disrupt natural slope equilibrium and heighten susceptibility, while vegetated regions bolster slope stability through root reinforcement and hydrological regulation53,63. The global LULC dataset from ESRI was employed and resampled to a resolution of 90 m (Fig. 3g).

Rainfall

Prolonged and intense precipitation is a principal trigger of both shallow and deep-seated landslides, as it increases pore-water pressure, diminishes effective stress, and decreases the shear strength of slope materials47. This work used gridded precipitation data from the India Meteorological Department (IMD, 2023) within a GIS environment and resampled it to a spatial resolution of 90 m for consistency with other model inputs (Fig. 3h).

Slope

The slope angle is a critical topographic element affecting landslide initiation, as steeper inclines amplify gravity forces and diminish stability. The probability of failure is especially elevated when steep slopes coincide with weak lithology and areas of significant precipitation30. Standard slope categorization methodologies were utilized during the modeling process (Fig. 3i).

Fig. 3

Spatial distribution maps of landslide predictors: (a) Aspect, (b) Convexity, (c) Drainage distance, (d) Earthquake magnitude, (e) Fault distance, (f) Geomorphon, (g) Land Use/Land Cover (LULC), (h) Rainfall, and (i) Slope. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

Lithology

Lithologies differ in their extent of weathering, permeability, and mechanical strength. Highly fractured, jointed, or weathered lithologies have a markedly greater susceptibility to failure compared to large, competent rock blocks33,41. Lithological units were categorized by geological age to improve model interpretability and performance, hence managing dataset complexity (Fig. 4).

Fig. 4

Spatial distribution of lithologies, categorized according to their geochronological age. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

Results

The VIF values reached a maximum of 2.55, while TOL values remained at or above 0.998, indicating no problematic multicollinearity among the predictor variables (see Figs. 5a, b). These values are well clear of the standard thresholds of concern (e.g., VIF > 10, TOL ≈ 0), supporting the reliability of the model inputs. The topographic predictors, guided by expert knowledge and existing literature, demonstrate both statistical robustness and representativeness, thereby reinforcing the credibility of the LSZ maps.
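As a hedged illustration of this multicollinearity check, the sketch below computes VIF as the diagonal of the inverse correlation matrix and TOL as its reciprocal (TOL = 1/VIF). The sample values and function names are hypothetical, and the authors' actual software and sampling scheme are not stated in this section.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tol(columns):
    """Return (VIF, TOL) lists; VIF_j is the j-th diagonal entry of the inverse correlation matrix."""
    n = len(columns)
    corr = [
        [1.0 if i == j else pearson(columns[i], columns[j]) for j in range(n)]
        for i in range(n)
    ]
    inverse = invert(corr)
    vif = [inverse[j][j] for j in range(n)]
    tol = [1.0 / v for v in vif]
    return vif, tol


# Hypothetical sampled values for three predictors (illustration only).
slope = [5.0, 12.0, 25.0, 33.0, 41.0, 18.0]
rainfall = [800.0, 1200.0, 1500.0, 1100.0, 2100.0, 950.0]
fault_distance = [4000.0, 150.0, 2600.0, 900.0, 5200.0, 3100.0]
vif, tol = vif_and_tol([slope, rainfall, fault_distance])
```

A predictor would be flagged only if its VIF exceeded the chosen threshold, for example 10.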

Factors vs landslide distribution

The examination of landslide frequency among several predictors indicates a distinct disparity in susceptibility (Fig. 6). Slopes oriented between 157.5° and 202.5° exhibited the highest landslide frequency at 20.1%, followed by those between 112.5° and 157.5° at 19.2%, and slopes between 202.5° and 247.5° at 17% (Fig. 6a). Conversely, the intervals between 337.5°–360° and 292.5°–337.5° had the lowest frequencies at 2.8% and 6.3%, respectively. No landslides occurred on level terrain (0°), underscoring the impact of slope gradient and orientation. Regarding convexity, regions with low values (0–21.7) had no landslide occurrences, signifying enhanced stability (Fig. 6b). The incidence of landslides escalated significantly with increased convexity, reaching a maximum of 56% within the 47.9–80.4 range and 38.2% within the 42.9–47.9 range. The data indicate that highly convex slopes are more susceptible to failure, possibly due to diminished support, elevated shear stress, and water infiltration.

Fig. 5

Assessment of multicollinearity among predictors using (a) VIF and (b) TOL values, indicating minimal collinearity.

Landslide susceptibility varies significantly with distance from drainage networks (Fig. 6c). The highest frequency (34.66%) occurs at 2000–5000 m, suggesting mid-range distances are more prone to slope failure, likely due to factors such as soil saturation, erosion, and steep topography. In contrast, areas close to drainage (0–100 m: 3.8%) or far from it (10,000–50,000 m: 3.31%) show minimal landslide activity. Frequency increases steadily from 3.8% at 100 m to 16.7% at 2000 m, before declining beyond 5000 m (17.73% at 5000–10,000 m; 3.31% at 10,000–50,000 m), indicating reduced susceptibility at greater distances from water channels. Similarly, landslide distribution relative to tectonic faults shows a peak frequency (25.39%) at 2000–5000 m (Fig. 6d), possibly due to the interplay of structural weakening and adverse geomorphic conditions. A marked increase is also observed at 1000–2000 m (14.21%), with lower frequencies beyond 5000 m (17.92% at 5000–10,000 m; 17.97% at 10,000–50,000 m). Regions within 0–100 m of mapped faults have the lowest landslide frequency (2.66%), suggesting that immediate fault proximity has limited influence at the national scale. In tectonically active regions such as the Himalayas, however, fault proximity directly increases landslide frequency, pointing to regional variation in fault-related susceptibility.

Earthquakes of intermediate magnitude (4.57–4.92 Mw) correlate with the highest frequency of landslides (34.7%), succeeded by those in the 4.25–4.57 Mw range (29.3%), demonstrating that low to moderate seismic events substantially influence slope collapses (Fig. 6e). These magnitudes are likely to induce landslides on pre-weakened or degraded inclines. Conversely, lower-magnitude earthquakes (1.8–4.25 Mw) constitute merely 11.4% of occurrences, presumably owing to inadequate seismic energy. High-magnitude earthquakes (5.39–8.5 Mw) account for just 7.1%, likely due to their infrequency, occurrence in geological formations, or propensity to induce fewer yet more devastating failures.

Slope-related landforms within geomorphon classes serve as a principal determinant of landslide occurrence, illustrating the intrinsic instability of steep terrains influenced by gravitational forces, severe weathering, and mass-wasting processes (Fig. 6f). Hollows (19.7%) and spurs (16.7%) demonstrate the highest frequency of landslides, suggesting that both concave and convex shapes contribute to instability due to their tendency for water collection, uneven weathering, and structural degradation. Valleys exhibit significant activity (14.6%), perhaps influenced by toe-cutting and erosional processes related to surface drainage networks. Conversely, landforms generally linked to stability, such as flat regions (0%), shoulders (0.1%), and footslopes (0.1%), have little landslide activity due to their mild gradients, which diminish gravitational stress and restrict erosional pressures. Peaks (1.2%) and pits (3.6%) exhibit minimal activity, possibly associated with localized erosion or discrete collapses. Ridges exhibit a comparatively low landslide incidence (9.1%), probably due to lithological resilience, vegetative cover, and minimal human interference.

The results of the LULC analysis indicate that the majority of landslides occur in forested areas with tree heights of 15 feet or more, accounting for 61.9% of all landslides (Fig. 6g). The elevated frequency of landslides in forest regions may partly reflect the lack of classification based on forest density; classifying forest regions into sparse, moderate, and dense density levels may reveal significant patterns or correlations. Although vegetation is generally associated with slope stability, the high frequency is likely due to the prevalence of steep topography in densely forested mountainous regions. Under some conditions, intense precipitation, prolonged soil saturation, and pre-existing geological weaknesses can overwhelm the stabilizing effect of root systems, resulting in both shallow and deep-seated landslides. Rangelands account for 28.7% of landslide incidents, underscoring their susceptibility due to sparse vegetation and exposed soils that facilitate erosion and hydrological instability. Conversely, built-up areas (4.1%) and bare terrain (3.2%) have lower frequencies, possibly attributable to constructed slope protections or inherently stable conditions. Croplands demonstrate negligible landslide activity (0.7%), perhaps due to their level topography and cultivated environments.

In regions with minimal annual precipitation (52–500 mm), landslides are rare, comprising approximately 3.6% of incidents (Fig. 6h), suggesting restricted hydrological triggering potential in dry to semi-arid environments. The frequency of landslides escalates markedly with rainfall, peaking in the 1000–1500 mm range. A minor decrease is seen in the 1500–2000 mm category (24.3%), but susceptibility persists, indicating the destabilizing impact of prolonged rainfall on slope materials. Above 2000 mm, the frequency of landslides is lower, at 18.2% (2000–2500 mm), 9% (2500–3000 mm), and 10% (> 3000 mm), potentially attributable to geological resilience, saturation limits, or vegetation cover. These trends underscore the significance of moderate to high precipitation in the incidence of landslides, particularly when coupled with additional factors such as slope and lithology.

The distribution of landslides in relation to slope gradient indicates that the majority of occurrences take place on slopes ranging from 30° to 45° (37.9%) and from 20° to 30° (31.3%), where gravitational forces surpass slope resistance under specific triggering conditions (Fig. 6i). Moderately inclined terrains (10°–20°) constitute 18.8% of landslides, suggesting that such slopes may destabilize under particular geological or hydrological conditions. The most inclined slopes (45°–87°) have the lowest frequency (4.9%), presumably attributable to restricted soil accumulation, exposed bedrock, and inherent stabilizing elements such as surface roughness and vegetation.
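The class-wise landslide shares reported above are the raw ingredient of a frequency ratio. The sketch below uses the common definition, the share of landslides in a class divided by the share of area in that class. That definition is assumed here, since the paper's exact formulation is not shown in this section, and the class names and counts are hypothetical.

```python
def frequency_ratio(landslides_per_class, cells_per_class):
    """Frequency ratio per class: (landslide share) / (area share)."""
    total_landslides = sum(landslides_per_class.values())
    total_cells = sum(cells_per_class.values())
    ratios = {}
    for name, cells in cells_per_class.items():
        landslide_share = landslides_per_class.get(name, 0) / total_landslides
        area_share = cells / total_cells
        ratios[name] = landslide_share / area_share
    return ratios


# Hypothetical counts for three slope classes (illustration only).
landslides = {"gentle": 10, "moderate": 60, "steep": 30}
cells = {"gentle": 500, "moderate": 300, "steep": 200}
fr = frequency_ratio(landslides, cells)
```

A ratio above 1 marks a class that hosts more landslides than its areal share would suggest. A ratio below 1 marks the opposite.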

Fig. 6

Landslide susceptibility (%) variation across key controlling factors in the study area. (a) Aspect classes, (b) Convexity ranges, (c) Geomorphon types, (d) Land Use/Land Cover (LULC) categories, (e) Slope ranges (degrees), (f) Rainfall ranges (mm), (g) Drainage distance classes (m), (h) Fault distance classes (m), and (i) Earthquake magnitude (Mw) intervals. Bar plots illustrate categorical variables, while line plots depict trends in ordered continuous variables, with percentage values annotated for clarity. These visualizations highlight the relative influence of each factor on landslide occurrence.

Significance of factors

In AHP, predictor weights were obtained using Saaty’s Pairwise Comparison Scale (1–9), yielding a consistency ratio of 0.096 (< 0.1), hence affirming the trustworthiness of expert assessments (Tables 3 and 4). The AHP results indicated that slope (22.8%), land use/land cover (19.8%), and lithology (14.8%) are the most significant factors, highlighting the impact of terrain steepness and human activities on landslide incidence (Fig. 7). The FR model, utilizing frequency ratios, identified convexity (24.6%), land use/land cover (12.5%), and slope (11.4%) as predominant variables, underscoring the significant impact of terrain curvature. Conversely, the Yc model allocated approximately uniform weights (~ 10.7%) to slope, lithology, geomorphons, aspect, and fault proximity, indicating a more equitable contribution among the components. The effect of earthquakes was negligible in all models, with the lowest weights attributed in AHP (2.7%) and Yc (3.6%), signifying a restricted effect on susceptibility throughout the research area.
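The AHP weighting and consistency check can be sketched as follows. This is a minimal illustration assuming the row geometric-mean approximation of the priority vector and Saaty's published random index values. The 3 × 3 comparison matrix is hypothetical and is not the matrix in Table 3.

```python
import math

# Saaty's random index (RI) by matrix order n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix):
    """Approximate the priority vector with normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    weighted_sums = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(ws / w for ws, w in zip(weighted_sums, weights)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    return ci / ri if ri > 0 else 0.0


# Hypothetical pairwise judgements for slope, LULC, and lithology.
pairwise = [
    [1.0, 2.0, 3.0],
    [1 / 2, 1.0, 2.0],
    [1 / 3, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
```

Judgements are conventionally accepted when CR stays below 0.1, which is the criterion applied to the reported value of 0.096.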

Table 3 AHP pairwise comparison matrix for factor weights.
Table 4 Consistency index (CI), random index (RI), and consistency ratio (CR) Calculation.
Fig. 7

Comparative significance of predictor variables in the AHP, FR, and Yc models, based on their assigned weight values for landslide susceptibility assessment.

National landslide susceptibility zonation

The spatial distribution of landslide susceptibility (LS) exhibits slight but notable variation among the AHP, FR, and Yc models (Figs. 8, 9 and 10). The FR model identifies the most extensive region of very high landslide susceptibility (VHLS) at 10.9%, followed by Yc at 10.1% and AHP at 9.7% (Fig. 11), demonstrating no major variation. In the high susceptibility (HLS) category, AHP accounts for 17%, whilst Yc represents 14.9% and FR constitutes 13.7%. All models provide approximately comparable coverage for moderate susceptibility (MLS), with 19.4% (AHP), 19.6% (Yc), and 19.5% (FR), indicating consistent agreement on transitional susceptibility zones. The low susceptibility (LLS) zone is more prevalent in Yc (24.7%) and FR (24%) than in AHP (19.4%), whereas AHP accounts for the highest proportion of very low susceptibility (VLLS) at 34.6%, followed by FR (31.9%) and Yc (30.6%), reflecting a more conservative classification by AHP.

At the national level (total area ≈ 3,287,263 km²), VHLS zones encompass 4.5% (FR), 4.2% (Yc), and 4.0% (AHP) of India’s landmass, consistent with previous findings (4.7%) obtained with ML techniques48. The combined susceptible area (HLS + VHLS) comprises 11.0% (AHP), 10.4% (Yc), and 10.2% (FR), aligning with earlier estimates (12.6% by GSI; 13.17% by Sharma et al.48), which supports the models’ accuracy. AHP identifies Sikkim as having the highest VHLS proportion at 54.9%, followed by Uttarakhand at 44.1% and Himachal Pradesh at 40.5%. Arunachal Pradesh encompasses the most extensive VHLS area, approximately 32,942 km² (Table 5). The Yc model ranks Nagaland (55.0%), Mizoram (53.1%), and Arunachal Pradesh (52.1%) as the highest, with Arunachal Pradesh encompassing around 44,392 km² (Table 5). Likewise, FR designates Nagaland (60.8%), Mizoram (57.9%), and Uttarakhand (51.2%) as having the largest proportions, whereas Arunachal Pradesh encompasses around 43,860 km² classified as VHLS (Table 5).

Notwithstanding model-based variances, the consistency in pinpointing highly susceptible regions, particularly the Himalayas and Western Ghats, highlights the geodynamic fragility of these areas. The results underscore the significance of multi-model methodologies for effective LSZ mapping and its relevance in focused land-use planning, risk reduction, and disaster preparedness initiatives.

Fig. 8

This figure illustrates the landslide susceptibility map of India, derived through the AHP method, showcasing varying susceptibility zones across the country. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

Fig. 9

This figure illustrates the LSZ map of India, derived through the FR method, showcasing varying susceptibility zones across the country. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

Fig. 10

LSZ of India generated using the Yc method, highlighting spatial variations in susceptibility zones across the country. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

Fig. 11

Percentage of landslide susceptibility area across different classes in AHP, Yc, and FR Models.

Table 5 This table presents a comparison of different landslide susceptibility models, highlighting the approach, VHLS percentage focus, VHLS area, strengths, and limitations of each model.

Identification of the top 10 susceptible SUTs of India

Sikkim demonstrated the highest VHLS in the AHP model (54.9%), with 52.1% in the FR model and 45% in the Yc model (Fig. 12). Arunachal Pradesh (AP) and Mizoram demonstrated significant VHLS, with AP exhibiting values from 38.7% in the AHP model to 52.1% in Yc and 51.5% in FR, whilst Mizoram ranged from 31.5% in AHP to 53.1% in Yc and 57.9% in FR. The VHLS of Nagaland exhibited significant variation, with AHP at 25%, and Yc and FR at 55% and 60.8%, respectively. Manipur demonstrated reduced susceptibility, with VHLS scores of 14.7% (AHP), 30.6% (Yc), and 25.2% (FR). Uttarakhand (UK) and Himachal Pradesh (HP), both susceptible to frequent landslides, demonstrated moderate to high VHLS, with UK registering 44.1% in AHP, 40% in Yc, and 51.2% in FR, whilst HP reported 40.5% in AHP, 28.9% in Yc, and 37.6% in FR. Jammu & Kashmir (J&K) and Meghalaya demonstrated reduced VHLS, with J&K recording 27.4% (AHP), 20.2% (Yc), and 25.1% (FR), whereas Meghalaya reported 10.9% (AHP), 22% (Yc), and 18.7% (FR). Tripura exhibited VHLS ≤ 8% across all models.

The elevated VHLS values in the Yc and FR models presumably indicate their data-driven responsiveness to variables such as deforestation and rainfall intensity. The results underscore the considerable influence of steep gradients, intense precipitation, and unstable geological formations, but the minor differences across models indicate variances in sensitivity and methodological approach.

Fig. 12

Model-wise variation of VHLS (%) across the top ten most landslide susceptible Indian SUTs.

Validation

The Receiver Operating Characteristic (ROC) curve is an effective tool in LSZ, as it offers an evaluation of the model’s predictive capability for landslides81. This curve facilitates the identification of the ideal balance between precise landslide prediction and minimizing false alerts. The ROC curve enables researchers to evaluate the reliability of susceptibility models, optimize parameter values, and enhance overall predictive accuracy104. The susceptibility maps were validated using independent testing datasets comprising 30% of both landslide and non-landslide occurrences. The validation results yielded area under the curve (AUC) values of 0.889 for the Yc model, 0.891 for the AHP model, and 0.905 for the FR model (Fig. 13), indicating strong predictive capability. Additionally, cross-validation using the precision-recall method produced AUC values of 0.856 for Yc, 0.867 for AHP, and 0.872 for FR (Fig. 14). Collectively, these evaluations confirm the high performance and reliability of all three models.
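Both validation metrics can be sketched on a toy test set. The ROC AUC below uses the rank-based (Mann-Whitney) formulation. The precision-recall area is approximated by average precision, which is an assumption, because the integration method behind the reported values is not stated. Labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a landslide cell outranks a non-landslide cell."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values at each true positive, ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("no positive samples")
    return precision_sum / hits


# Hypothetical test samples: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)            # 8/9 for this toy set
ap = average_precision(labels, scores)   # 2.75/3 for this toy set
```

In practice the same quantities are usually obtained from a statistics library. The explicit form is shown here only to make the definitions visible.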

Fig. 13

Sensitivity and specificity curve illustrating the performance of various susceptibility models.

Fig. 14

Precision-recall curve illustrating the performance of various susceptibility models.

Validation of maps through landslide distribution

The maps are also validated by examining the spatial distribution of landslides across susceptibility classes. The VHLS zones include the majority of landslides, with 60.6%, 58.0%, and 66.8% for the AHP, Yc, and FR models, respectively (Fig. 15). This strong relation between VHLS zones and landslides verifies the models’ ability to identify susceptible areas. The HLS zones contain around 28.2% of landslides in the AHP model, 30.7% in the Yc model, and 25.4% in the FR model, confirming the models’ predictive dependability. The MLS zones contain fewer landslides, from 6.7% (FR) to 9.4% (Yc), indicating a transitional zone between stable and unstable terrain. The LLS and VLLS zones contain a low percentage of landslides, not exceeding 1.9%, indicating that these areas are more stable. Together, the VHLS and HLS zones contain large landslide concentrations, demonstrating that the models perform well in predicting locations susceptible to landslides.
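As a quick arithmetic check on the figures quoted above, the share of mapped landslides captured by the two highest classes together can be summed per model. The dictionary names are illustrative, and the percentages are those reported in this section.

```python
# Percent of mapped landslides per class, as reported in the text.
vhls_share = {"AHP": 60.6, "Yc": 58.0, "FR": 66.8}
hls_share = {"AHP": 28.2, "Yc": 30.7, "FR": 25.4}

# Combined capture of the high and very high classes, rounded to one decimal.
combined_share = {
    model: round(vhls_share[model] + hls_share[model], 1) for model in vhls_share
}
```

On these numbers the combined capture is 88.8% (AHP), 88.7% (Yc), and 92.2% (FR).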

Fig. 15

Percentage of landslide numbers across different susceptibility classes in AHP, Yc, and FR Models.

Ground validation

Field observations indicate that the majority of recent landslides, including historically devastating events, are concentrated within zones classified as HLS and VHLS (Fig. 16). This strong spatial correlation between observed landslide occurrences and the predicted high-risk areas provides robust empirical validation for the assessment model. Such consistency not only confirms the accuracy of the susceptibility mapping but also enhances confidence in its use for disaster risk management, land-use planning, and early warning systems.

Fig. 16

Landslides superimposed on susceptibility map, showing strong alignment with high and very high susceptibility and risk zones. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/ and Microsoft PowerPoint URL: https://www.microsoft.com/microsoft-365/powerpoint).

Best fit model

The LSZ map produced by the Yc model was deemed the most appropriate for further analysis, based on various evaluation criteria, including the significance of predictor weights (Fig. 7), ROC values (Figs. 13 and 14), and spatial alignment with documented landslide occurrences (Fig. 15). As a result, the Yc-derived LSZ map was chosen for further landslide risk mapping. Figure 17 displays the LSZ map generated by the Yc model, superimposed with the relevant SUT boundaries to improve geographical analysis and visualization.

Fig. 17

LSZ of India generated using Yc method, delineating varying degrees of susceptibility across SUTs. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

National landslide risk map (NLRM) at qualitative scale

The earlier study105 undertaken in Gopeshwar Township, Chamoli District, Uttarakhand, India, concentrated on a risk assessment based on building types within a delineated area of 8.39 km². In contrast, our study employs a generalized methodology that offers a comprehensive qualitative risk evaluation at the national level. The qualitative risk map was created by combining built-up area and road network layers with the HLS and VHLS zones obtained from the Yc model (Fig. 18). The analysis indicates that around 8,606.13 km² of the study area is situated within high to very high-risk zones, representing about 0.26% of the nation’s total land area. Approximately 2,398.66 km² is designated as very high risk, whilst 6,207.47 km² is categorized as high risk within the study area.
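The overlay logic and the reported area shares can be sketched as below. The rule used here is an assumption for illustration, since the exact overlay rule is not specified in the text. Under it, exposed cells (built-up or road) in HLS count as high risk and exposed cells in VHLS count as very high risk. The 90 m cell size and the small grids are hypothetical. The final lines only re-derive the quoted national share from the reported areas.

```python
CELL_AREA_KM2 = (90.0 * 90.0) / 1_000_000  # 0.0081 km² per cell, assuming a 90 m grid


def risk_areas(susceptibility, exposed):
    """Area (km²) of exposed cells falling in HLS (high) and VHLS (very high) zones."""
    counts = {"high": 0, "very_high": 0}
    for sus_row, exp_row in zip(susceptibility, exposed):
        for sus_class, is_exposed in zip(sus_row, exp_row):
            if not is_exposed:
                continue  # no built-up area or road in this cell
            if sus_class == "VHLS":
                counts["very_high"] += 1
            elif sus_class == "HLS":
                counts["high"] += 1
    return {level: n * CELL_AREA_KM2 for level, n in counts.items()}


# Hypothetical 2 x 3 grids: susceptibility classes and an exposure mask.
susceptibility = [["VHLS", "HLS", "MLS"], ["HLS", "VHLS", "LLS"]]
exposed = [[1, 0, 1], [1, 1, 0]]
areas = risk_areas(susceptibility, exposed)

# Arithmetic on the areas reported in the text.
very_high_km2 = 2398.66
high_km2 = 6207.47
india_km2 = 3_287_263
total_risk_km2 = very_high_km2 + high_km2
national_share_pct = 100 * total_risk_km2 / india_km2
```

The reported areas sum to 8,606.13 km², which is about 0.26% of the national total.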

Fig. 18

Landslide risk map of SUTs, derived through the Yc method. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/).

Discussion

This research presents a detailed framework for the creation of a NLSZ and a NLRM, based on expert consultation, thorough literature review, and stringent multicollinearity analysis. All three models (FR, Yc and AHP) exhibited consistent outcomes, with VHLS zones encompassing 10.9% (FR), 10.1% (Yc), and 9.7% (AHP) of the study area, reflecting negligible variation in pinpointing the most susceptible zones. The AHP model assigned around 17% of the area to HLS zones, in contrast to 14.9% for Yc and 13.7% for FR, while moderate susceptibility zones had similar proportions across models, indicating a consistent classification of transitional terrain. Zones of low and very low susceptibility exhibited minimal fluctuation, with FR encompassing the largest area (55.9%), closely followed by Yc (55.4%) and AHP (53.9%). The obtained very high susceptibility coverage in India, varying from 4.0% (AHP) to 4.5% (FR), roughly aligns with previous results48 (e.g., 4.7% derived using machine learning models). The combined HLS and VHLS zones (10.2%–11.0%) align with the estimates from GSI and Sharma et al. (2024)48 (12.6% and 13.17%, respectively), hence affirming the trustworthiness and robustness of the models for national-scale landslide susceptibility assessment.

The state-wise analysis indicates that the AHP model identifies the highest percentage of VHLS in Sikkim (54.9%), followed by Uttarakhand (44.1%) and Himachal Pradesh (40.5%). Arunachal Pradesh, meanwhile, possesses the most extensive VHLS area, measuring 32,942 km². The Yc and FR models identify Nagaland, Mizoram, and Arunachal Pradesh as the most vulnerable, underscoring regional variability influenced by geological, climatic, and anthropogenic causes106. Minor inconsistencies among models indicate variations in methodological sensitivity, with data-driven models (Yc, FR) exhibiting more responsiveness to recent environmental alterations, including deforestation and infrastructural development107. Model validation produced AUC values between 0.874 and 0.905, signifying robust predictive efficacy. A significant proportion of mapped landslides (58.0%–66.8%) falls within VHLS zones, corroborating the spatial correlation between anticipated risk and recorded occurrences108. The VLLS class has the most extensive geographic area across all models, indicating that a significant portion of India faces minimal landslide danger.

Among the models, Yc has exceptional performance in terms of classification accuracy and spatial sensitivity. The highest VHLS is identified in Nagaland (55.0%), Mizoram (53.1%), and Arunachal Pradesh (52.1%), followed by Sikkim (45.0%), Uttarakhand (40.0%), Manipur (30.6%), Himachal Pradesh (28.9%), Jammu & Kashmir (20.2%), Meghalaya (22.0%), Tripura (8.0%), West Bengal (1.9%), Assam (2.9%), Goa (3.7%), Kerala (2.9%), Ladakh (0.7%), Karnataka (0.3%), Tamil Nadu (0.3%), and Maharashtra (0.2%). These findings highlight the model’s sensitivity to essential variables including topography, rainfall, and land-use intensity. The choice of model profoundly affects risk perception and policy results. Incorporating multi-model methodologies with GIS and real-time monitoring can enhance landslide mapping and facilitate more efficient mitigation techniques109.

Comparison with previous studies at global and national scale

This work enhances prior national and worldwide landslide susceptibility evaluations by using high-resolution thematic datasets, several causative elements, and an extensive inventory of 109,504 landslide occurrences, leading to improved susceptibility zonation accuracy. Globally, findings correspond with previous studies110 indicating significant vulnerability in the Himalayas and Western Ghats (Fig. 19a), while enhancing accuracy through improved data and broader factor incorporation. The results partially corroborate previous global maps19, particularly concerning elevated susceptibility in the Himalayas (Fig. 19b), while rectifying overgeneralizations of moderate susceptibility in non-prone areas. Likewise, the global map by111 delineates susceptible areas in the Himalayan and Western Ghats regions (Fig. 19c), but overestimates susceptibility in stable areas; this study avoids such errors by using focused datasets. Unlike the previous study48, which created a NLSM, this work concentrates on the most vulnerable SUTs (Fig. 19d). This study also incorporates additional relevant parameters, many of which were overlooked in earlier studies48, and presents them with greater clarity. The latest findings also uncover new high-to-very high susceptibility zones, particularly in Ladakh and select J&K regions, that were not previously identified.

Fig. 19

Individual landslide susceptibility maps from selected past studies, including Lin et al. (2017), Stanley & Kirschbaum (2017), Titti et al. (2021), and Sharma et al. (2024), used to contextualize and compare susceptibility patterns with the present study. (Software: QGIS, Version: 3.4.4, URL: https://qgis.org/download/ and Microsoft PowerPoint URL: https://www.microsoft.com/microsoft-365/powerpoint).

Significance, limitations, uncertainty, and future research directions

This work enhances landslide susceptibility and risk mapping in India, highlighting the necessity for periodic updates to the GSI landslide inventory to augment precision. Future studies must include high-resolution geoenvironmental data and revised inventories that account for dynamic vulnerability shaped by environmental and anthropogenic influences.

Current maps provide general overviews and use LULC-based built-up and road data for risk modeling, which constrains accuracy. Variations stem from model selection, inventory classification, and data granularity. Incorporating more detailed statistics, supplementary vulnerable components (such as agriculture and infrastructure), and landslide inventories directly into risk models would improve regional precision and risk evaluation.

Conclusion

This study offers a comprehensive national-scale assessment of landslide susceptibility and risk in India, utilizing a hybrid framework that combines heuristic (AHP) and data-driven (FR, Yc) models. The incorporation of ten carefully chosen predictor variables from geological, topographic, hydrological, climatic, anthropogenic, and seismological domains, in conjunction with a comprehensive landslide inventory, improves model thoroughness and dependability. The comparative assessment reveals strong predictive performance for all three models (ROC AUC of 0.889 to 0.905), and the Yc model was selected as the best fit on the basis of predictor weights, validation curves, and geographic concordance with actual landslide distributions. Lithology and land use/land cover were identified as the primary factors influencing susceptibility, followed by slope, geomorphons, and proximity to faults. Although rainfall and drainage proximity contribute significantly, seismic variables exhibited minimal impact within the Yc framework.

The FR model identified the largest VHLS area at 10.9%, followed by Yc at 10.1% and AHP at 9.7%. In HLS, AHP yielded approximately 17%, above Yc (14.9%) and FR (13.7%). MLS exhibited a comparable distribution across models, signifying agreement on transitional terrain. Areas of LLS and VLLS were most prevalent in FR (55.9%), followed by Yc (55.4%) and AHP (53.9%).

At the national level, VHLS zones encompass approximately 4.0%, 4.2%, and 4.5% of India’s land area according to the AHP, Yc, and FR models, respectively, consistent with prior research. The zones of combined HLS and VHLS account for approximately 11.0% in AHP, 10.4% in Yc, and 10.2% in FR models. Validation demonstrated robust prediction accuracy, with AUC values ranging from 0.874 to 0.905. Risk mapping with the Yc model reveals that around 0.26% of India (~ 8,606 km²) is situated within high to very high landslide risk zones.

These findings offer essential geospatial insights for disaster risk mitigation, highlighting the necessity for focused slope monitoring, public awareness, and the incorporation of landslide risk into land-use planning at the mapped scale.