Introduction

Glaciers are crucial parts of the Earth’s cryosphere: they act as enormous reservoirs of freshwater and play a fundamental role in governing global climate and sea level change. Glaciers hold approximately 69% of the world’s freshwater resources, supporting ecosystems, agriculture, and human communities, especially in mountainous areas such as the Himalayas, Andes, and Alps1. The stability and integrity of these natural features are strongly affected by climate change. Accelerated global warming has altered glacier dynamics and initiated global-scale retreat and thinning of glaciers2. Rising temperatures have accelerated mass loss, and research shows that, between the 1960s and 2016, glaciers lost more than 9 trillion tons of ice3. This accelerated deglaciation has not only contributed to sea level rise but also destabilized glacial ecosystems, creating problems for higher-elevation areas. One of the most notable effects of this change is the formation and expansion of glacial lakes, which are water bodies that occupy depressions created by receding glaciers and are usually dammed by moraines or glacier ice4.

Glacial lakes are usually harmless, but they pose considerable dangers when they expand and become unstable, sometimes causing glacial lake outburst floods (GLOFs). A GLOF is a sudden, often unexpected release of water caused by the collapse of a natural dam or barrier such as a moraine or ice wall5, and GLOFs are among the most devastating types of flood. They are highly dangerous to downstream communities, infrastructure, and ecosystems because they release torrential volumes of water, sediment, and debris within a short span of time. In addition to GLOFs, these lakes can contribute to landslides, avalanches, and erosion, increasing the vulnerability of mountainous regions6. Historically, GLOF events such as the 1941 Huaraz disaster in Peru and the 1985 Dig Tsho flood in Nepal have claimed lives, destroyed houses, and disrupted livelihoods, demonstrating their highly destructive nature7,8. The impact of a GLOF is complex and involves loss of life, displacement, and economic destruction. Communities near glaciers, which usually rely on glacial meltwater for agriculture and hydroelectric power, are at risk of being wiped out as GLOFs ravage houses, roads, and fields9. In the Himalayas alone, millions of residents occupy possible flood paths that lack early warning and adaptive infrastructure10.

As global warming continues to intensify, these events are predicted to become more frequent and intense, fuelled by the accelerating expansion of glacial lakes in High Mountain Asia (HMA), where satellite images show that thousands of new lakes have appeared over recent decades11. Satellite data reveal that the volume, area and count of glacial lakes have increased significantly, especially in HMA, where warming is greater than the global average12,13,14,15. This rapid development of glacial lakes is an urgent issue, as deglaciation has gained momentum and erratic weather conditions favour lake formation. Glacial lakes can be identified from spectral indices such as the Normalised Difference Water Index (NDWI) and the Normalised Difference Snow Index (NDSI), together with other satellite-derived features, by applying supervised machine learning techniques such as support vector machines and random forests14,16.

Deep learning, particularly convolutional neural networks (CNNs), has improved feature extraction and classification even under difficult conditions such as mountainous terrain, clouds, shadows, or mixed pixels17,18,19. For example, Kaushik et al. 202220 developed GLNet, a deep convolutional neural network trained for glacial lake mapping using multisource remote sensing inputs, including multispectral, thermal, microwave, and Digital Elevation Model (DEM) data. Their fully automatic method achieved high accuracy (0.98) with good spatial transferability. Wangchuk et al. 202021 addressed the problems of small lake sizes, cloud cover, seasonal snow, and changing turbidity by combining Sentinel-1 Synthetic Aperture Radar, Sentinel-2 Multi-Spectral Instrument, and DEM data in a random forest classifier. Their fully automated Python library, “GLakeMap”, maps glacial lakes in alpine areas across geographic and climatic differences. Mustafa et al. 202422 combined Sentinel-1 radar backscatter parameters, Sentinel-2 spectral indices, and topographic parameters to train artificial neural network (ANN), support vector machine (SVM), and random forest models. The ANN achieved the highest accuracy at 95%, and CNNs were able to identify glacial lakes with minimal human intervention22. Long short-term memory (LSTM) networks and recurrent neural networks (RNNs) have also been applied to model temporal variations in hazards and enhance forecasting23. Nevertheless, high-altitude terrain restricts training data quality and quantity, and extensive validation is needed to account for regional heterogeneity. Addressing these challenges will involve concerted efforts to augment datasets, increase algorithm resilience, and combine ML with conventional techniques.

Improvements in machine learning algorithms have greatly advanced the analysis and observation of glacial lakes despite their remote and dynamic nature. The present study therefore focuses on developing an automated and robust multi-level methodology for accurate detection of glacial lakes in Himachal Pradesh using high-resolution multisource remote sensing data and machine learning techniques. With the growth and destabilization of glacial lakes, understanding and minimizing their threats is imperative for scientists, policymakers, and affected populations.

Study area

This work was carried out in the state of Himachal Pradesh, which covers an area of 55,673 km² in the Western Himalaya. The state shares its eastern border with China; Jammu and Kashmir lies to the north and north-west, Uttarakhand to the southeast, and Punjab to the southwest. The elevation of the state varies from ~250 to 7026 m above mean sea level, creating a diverse topography and climate. The major rivers are the Beas, Chenab, Indus, Ravi, and Satluj, which, together with their tributaries, support local ecosystems and livelihoods. There are approximately 2100 glaciers, spanning an area of ~3799 km², or approximately 6.8% of the state’s area24. The longest glacier in the state is Bara Shigri Glacier, which is approximately 26 km long25. The Indian Summer Monsoon (ISM) and western disturbances influence the region’s snowfall, with maximum snowfall occurring from December to March26,27.

Fig. 1

Location map of the study area. The map illustrates the study area situated in Himachal Pradesh, India, marked by the blue inset box.

Glacial lake inventory

Studies indicate that these glaciers have been retreating since the end of the Little Ice Age, leading to the expansion of glacial lakes in previously glaciated areas28,29. Earlier inventories mapped 958 glacial lakes larger than 500 m2 during 2011–201329. A more recent study reported that the number of glacial lakes in the Satluj River basin of the Himachal Himalaya nearly doubled, increasing from 562 in 2019 to 1048 in 202330. We created a glacial lake inventory for Himachal Pradesh through manual digitization using Google Earth Pro, which provides high-resolution images from Maxar and Airbus. We mapped 651 glacial lakes in Himachal Pradesh for the year 2017 and 1130 lakes for the year 2022. This rapid growth highlights the urgent need for automated approaches to map glacial lakes and generate frequently updated inventories. To address this need, we selected 95 lakes from the 2022 inventory for our study area, shown by the blue inset box in Fig. 1b, and used them as the response variable for training and testing our model.

Table 1 Summary of satellite datasets used in the study along with their key characteristics.

Methodology

In this study, we used a combination of spectral, radar, and topographic variables as predictors and a glacial lake inventory as the response variable for lake classification. We extracted the predictor variables from Sentinel-1 (10 m), Sentinel-2 (10 m), SRTM DEM (30 m) and PlanetScope (3 m) data from 2022 and 2023. We used September 2023 satellite images because cloud-free data for 2022 were unavailable for the entire study area. The glacial lake inventory, prepared for 2022, served as the response variable, allowing us to train the model on past lake distributions (2022) and apply it to detect lake presence in the following year (2023). Table 1 summarizes the satellite datasets used in this work, including their spatial resolution, temporal resolution and swath width. Although we focused on detection for a single time step, we designed the methodology to remain scalable and adaptable for multi-temporal applications in future studies.

For Level I classification, we calculated spectral indices from Sentinel-2 data: the Normalized Difference Snow Index (NDSI), two variants of the Normalized Difference Water Index (NDWI) based on the blue and green bands, the Normalized Difference Vegetation Index (NDVI), and the Normalized Difference Glacier Index (NDGI). We used radar backscatter data (VV and VH) from Sentinel-1 and derived topographic variables such as elevation, aspect, and slope from the SRTM DEM. For Level II classification, we used the NIR band, NDVI, and NDGI calculated from high-resolution PlanetScope data, along with the same radar and topographic variables. All predictor variable layers were resampled to a common resolution of 10 m using bilinear interpolation to maintain spatial uniformity and ensure accurate pixel alignment within the image stack used for training.

We used a glacial lake inventory as the response variable to train and test the Random Forest model as shown in Fig. 2. We evaluated the model’s performance using metrics such as AUC-ROC, recall, precision, accuracy, and F1-score. We applied an automated post-processing workflow and validated the results using high-resolution Planet images.

Fig. 2

Workflow chart illustrates the methodology, encompassing data acquisition, preprocessing, classification, analysis, and validation steps.

Level I

For Level I, we used 2023 SAR data from Sentinel-1 and optical data from the Sentinel-2 MSI, together with the SRTM DEM. Optical images help identify spectral contrasts between glacial lakes and their surroundings and were used to compute several spectral indices. The NIR band from Sentinel-2 helps in water-land discrimination because water strongly absorbs NIR radiation31. The Sentinel-2 blue and green bands were used to derive two NDWIs, namely:

$$NDWI_{Blue}=\frac{B2-B8}{B2+B8}$$
(1)
$$NDWI_{Green}=\frac{B3-B8}{B3+B8}$$
(2)

where B2 is the blue band, B3 is the green band and B8 is the NIR band of Sentinel-2. Glacial lakes that reflect more strongly in the B2 band are better enhanced by NDWI_blue (Eq. 1), whereas NDWI_green is better suited to glacial lakes that reflect more strongly in the B3 band (Eq. 2)21. Using both indices therefore allows glacial lakes to be identified more accurately. NDSI discriminates ice and snow from water and land, enabling accurate delineation of glacial lake boundaries in glaciated zones. Along with NDSI, we used NDVI and NDGI for glacial lake identification. NDVI helps differentiate water bodies, which typically show low or negative values, from vegetated areas that reflect strongly in the near-infrared region. NDGI enhances the detection of snow-covered and glaciated areas by emphasizing the contrast between ice and surrounding terrain32. NDVI, NDSI and NDGI were calculated from Sentinel-2 data using Eqs. (3), (4) and (5), respectively.

$$NDVI=\frac{B8-B4}{B8+B4}$$
(3)
$$NDSI=\frac{B3-B10}{B3+B10}$$
(4)
$$NDGI=\frac{B3-B4}{B3+B4}$$
(5)

where B3 is the green band, B4 is the red band, B8 is the NIR band and B10 is the SWIR band of Sentinel-2.
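All five indices in Eqs. (1) to (5) share the same normalized-difference form, so they can be sketched with one helper. The snippet below is a minimal illustration, not the authors' code: the function names are hypothetical, the band arguments follow the band definitions given in the text, and the reflectance values are invented for a single water-like pixel.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def sentinel2_indices(b2, b3, b4, b8, swir):
    """Compute the five indices; `swir` is the band called B10 in the text."""
    return {
        "NDWI_Blue": normalized_difference(b2, b8),   # Eq. 1
        "NDWI_Green": normalized_difference(b3, b8),  # Eq. 2
        "NDVI": normalized_difference(b8, b4),        # Eq. 3
        "NDSI": normalized_difference(b3, swir),      # Eq. 4
        "NDGI": normalized_difference(b3, b4),        # Eq. 5
    }

# Hypothetical reflectances for one water-like pixel (not real data).
idx = sentinel2_indices(b2=[0.08], b3=[0.06], b4=[0.03], b8=[0.02], swir=[0.01])
```

For this invented pixel both NDWI variants are positive and NDVI is negative, which is the pattern the text associates with open water.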

Because SAR data can penetrate cloud cover, which tends to interfere with optical images in high-altitude regions33, we used VV and VH backscatter from Sentinel-1 data after applying the pre-processing steps outlined in the methodology flowchart (Fig. 2). The pre-processing framework included orbit file application, thermal noise removal, radiometric calibration, speckle filtering, terrain correction, and final conversion from linear scale to decibels (dB)22,34,35.
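The last pre-processing step, conversion from linear scale to decibels, is a simple logarithmic transform. The sketch below assumes calibrated linear backscatter as input; the function name and the small `floor` value used to avoid taking the logarithm of zero are illustrative choices, not details reported in the study.

```python
import numpy as np

def linear_to_db(sigma0_linear, floor=1e-10):
    """Convert linear backscatter to decibels: 10 * log10(sigma0)."""
    sigma0 = np.asarray(sigma0_linear, dtype=float)
    # Clip at a tiny positive floor so zero-valued pixels do not yield -inf.
    return 10.0 * np.log10(np.maximum(sigma0, floor))

# Hypothetical linear backscatter values (not real data).
vv_db = linear_to_db([1.0, 0.1, 0.01])
```

Each tenfold drop in linear backscatter corresponds to a 10 dB decrease, which compresses the wide dynamic range of SAR data into a scale that is easier to compare across pixels.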

We used spectral indices (NDSI, NDWI green and blue, NDVI, and NDGI); the NIR spectral band from both Sentinel-2 and Planet; radar backscatter (VV and VH) from Sentinel-1; and topographic variables including elevation, aspect, and slope as predictor variables. Nine of the eleven input layers, along with the training and testing points, are shown in Fig. 3. We resampled all eleven layers to a common spatial resolution of 10 m using bilinear interpolation, normalized them, and stacked them as input for the RF model and further processing. This resampling step ensures spatial uniformity across all input layers and proper alignment within the image stack used for model training, so that all datasets can be integrated within a unified framework suitable for pixel-based classification.
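The normalize-and-stack step can be sketched as follows. This is a minimal illustration under stated assumptions: the layers are taken to be already resampled onto one 10 m grid, min-max scaling is assumed because the text does not name the normalization method, and the function names and tiny 2 × 2 layers are hypothetical.

```python
import numpy as np

def min_max_normalize(layer):
    """Rescale one layer to [0, 1], ignoring NaN (no-data) cells."""
    layer = np.asarray(layer, dtype=float)
    lo, hi = np.nanmin(layer), np.nanmax(layer)
    if hi == lo:
        return np.zeros_like(layer)  # a constant layer carries no contrast
    return (layer - lo) / (hi - lo)

def build_stack(layers):
    """Stack named 2D layers into an array of shape (rows, cols, n_layers)."""
    names = list(layers)
    arrays = [np.asarray(layers[name], dtype=float) for name in names]
    if len({a.shape for a in arrays}) != 1:
        raise ValueError("all layers must share one grid before stacking")
    stack = np.stack([min_max_normalize(a) for a in arrays], axis=-1)
    return names, stack

# Two invented 2 x 2 layers standing in for the eleven predictors.
layers = {
    "elevation": [[4200.0, 4300.0], [4400.0, 4600.0]],
    "ndwi_green": [[-0.2, 0.1], [0.4, 0.6]],
}
names, stack = build_stack(layers)
# One row per pixel, one column per predictor, as a pixel-based classifier expects.
samples = stack.reshape(-1, stack.shape[-1])
```

Reshaping the stack into a pixels-by-predictors table is what makes the same array usable both for extracting training values and for predicting every pixel in the scene.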

Fig. 3

Spatial distribution maps of various predictor variables used in the study. a Aspect Map, b Elevation Map, c Slope Map, d NDGI Map, e NDSI Map, f NDVI Map, g NDWI Blue Map, h NDWI Green Map, and i VH Map. The maps display training and testing points marked as triangles and circles, respectively, over the study area, with corresponding color gradients representing the range of values for each variable.

Level II

For Level II, in addition to Sentinel-1 SAR data and the SRTM DEM, we used PlanetScope optical images from 2023. The NIR, red and green bands from PlanetScope images were used to calculate NDVI and NDGI using Eqs. (3) and (5). As in Level I, we used Sentinel-1 VV and VH backscatter after pre-processing. We created an ensemble dataset consisting of training point values extracted from Planet’s NDVI, NDGI, and NIR, along with values from SRTM’s elevation, aspect, and slope, Sentinel-1’s VV and VH backscatter, and Sentinel-2’s NDSI, NDWI_Green, and NDWI_Blue. We then applied the model trained on this ensemble dataset to the integrated image stack used in Level I.

Training and testing data preparation

We mapped 95 glacial lakes in the study area for the year 2022 and generated 950 data points within the glacial lake polygons to obtain a robust dataset for training. We also generated 950 non-glacial lake points within the study area to ensure equal representation of both classes. However, the first iteration showed severe misclassification, with the model incorrectly classifying streams as glacial lakes. To address this, we generated an additional 500 points within streams to help the model better differentiate between lakes and streams. By representing stream features more accurately, this approach improved the model’s accuracy and reduced false positives in glacial lake detection.

We resampled all predictor variables to a spatial resolution of 10 m and tested for multicollinearity to ensure consistency in the data. We extracted the values of the predictor variables at 950 glacial lake points and 1450 non-glacial lake points using the ‘Extract Multi-Value to Points’ tool in ArcGIS Pro. We then constructed a binary dataset by assigning a value of 1 to glacial lake points and 0 to non-glacial lake points. Finally, we split the dataset into training and testing sets, using 80% for training and reserving the remaining 20% for testing.
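The labelling and 80/20 split described above can be sketched with the standard library alone. The point counts come from the text; the random shuffle, the seed, and the function name are assumptions, since the study does not state how the split was drawn.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

# 950 glacial lake points labelled 1, 1450 non-glacial lake points labelled 0.
labels = [1] * 950 + [0] * 1450
train_idx, test_idx = train_test_split_indices(len(labels))
```

With 2400 points in total, an 80/20 split yields 1920 training and 480 testing points.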

Random forest

Random Forest (RF) is an ensemble machine learning algorithm that is widely used for classification and regression problems. It works by building multiple decision trees (DTs)36, each trained on a different subset of the training samples and features through a process referred to as bagging (bootstrap aggregating). Combining the trees in an ensemble reduces overfitting and improves generalization, unlike a single DT, which tends to be biased and prone to overfitting37. In RF, each tree is constructed from a bootstrap sample of the training data, and the samples left out of that bootstrap are used to calculate the out-of-bag (OOB) error, which aids in model performance evaluation. The performance of the algorithm relies on several important hyperparameters, such as the number of trees (ntree), the number of features considered at each split (mtry), and the node size, which controls the depth of the trees38,39. The number of trees is usually increased until the OOB error converges. By combining the predictions of many decision trees, RF improves classification accuracy and provides stable predictions for glacial lake identification.
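As a rough illustration of these ideas, the sketch below fits a random forest with scikit-learn on synthetic two-class data and reads off the OOB error. The library choice, the hyperparameter values, and the data are all assumptions made for illustration; the study does not report its implementation or settings here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the predictor table: two classes, four features.
rng = np.random.default_rng(0)
n_per_class = 300
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_class, 4)),  # class 0 (non-lake)
    rng.normal(2.0, 1.0, size=(n_per_class, 4)),  # class 1 (lake)
])
y = np.array([0] * n_per_class + [1] * n_per_class)

model = RandomForestClassifier(
    n_estimators=200,      # number of trees (ntree); illustrative value
    max_features="sqrt",   # features tried at each split (mtry)
    min_samples_leaf=1,    # node size, which limits tree depth
    oob_score=True,        # score each sample with trees that never saw it
    random_state=0,
)
model.fit(X, y)

oob_error = 1.0 - model.oob_score_
# Probability of the positive class for the first five samples.
lake_probability = model.predict_proba(X[:5])[:, 1]
```

Refitting with increasing `n_estimators` and watching `oob_error` level off is one simple way to choose the number of trees.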

Accuracy assessment

The performance of the glacial lake detection model (RF) was assessed using several statistical metrics: accuracy, precision, recall, F1-score and the area under the receiver operating characteristic curve (ROC-AUC). Accuracy is the ratio of correct predictions to all predictions and is most informative when the class distribution is balanced (Eq. 11). ROC-AUC is preferable when dealing with class imbalance, as it measures the model’s ability to discriminate between classes from a plot of the true positive rate, or sensitivity (Eq. 6), against the false positive rate, or 1 − specificity (Eq. 7).
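ROC-AUC can also be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way on invented scores; the function name and data are hypothetical, and the pairwise loop is written for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

# Invented labels and predicted lake probabilities (not study data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

Here eight of the nine positive/negative pairs are ranked correctly, so the AUC is 8/9; the only error is the lake sample scored 0.4 below the non-lake sample scored 0.5.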

Precision measures the proportion of predicted positives that are correctly identified (Eq. 8), whereas recall captures the model’s capacity to recognize actual positives (Eq. 9). The F1-score, which is the harmonic mean of recall and precision, balances the two measures (Eq. 10) and is particularly useful where both false positives and false negatives must be minimized. The performance metrics were calculated as follows:

$$Sensitivity=\frac{TP}{TP+FN}$$
(6)
$$Specificity=\frac{TN}{TN+FP}$$
(7)
$$Precision=\frac{TP}{TP+FP}$$
(8)
$$Recall=\frac{TP}{TP+FN}$$
(9)
$$F1\text{-}score=\frac{2\times Precision\times Recall}{Precision+Recall}$$
(10)
$$Accuracy=\frac{TP+TN}{TP+FN+TN+FP}$$
(11)

where TP (true positive) denotes pixels correctly identified as glacial lakes; TN (true negative) denotes pixels that are not glacial lakes and were not classified as such; FP (false positive) denotes pixels incorrectly identified as glacial lakes by the model; and FN (false negative) denotes glacial lake pixels that were missed by the model.
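Eqs. (6) to (11) can be checked with a few lines of code. The sketch below uses the Level I confusion counts reported later in the Results (213 TP, 411 TN, 15 FP, 27 FN); the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Evaluate Eqs. (6)-(11) from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                         # Eq. 6
    specificity = tn / (tn + fp)                         # Eq. 7
    precision = tp / (tp + fp)                           # Eq. 8
    recall = tp / (tp + fn)                              # Eq. 9 (same as Eq. 6)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. 10
    accuracy = (tp + tn) / (tp + fn + tn + fp)           # Eq. 11
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }

# Level I (Predict_S) counts as reported in the Results.
level1 = classification_metrics(tp=213, tn=411, fp=15, fn=27)
```

These counts give a precision of about 0.93, a recall of about 0.89, an F1-score of about 0.91 and an accuracy of about 93.7%, in line with the Level I values reported in the Results.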

Results

Multicollinearity test of predictor variables

Multicollinearity refers to correlation among predictor variables, which can create redundancy and affect model performance40. To assess it, we conducted a Pearson correlation test between the predictor variables (Fig. 4). NDGI and NDVI have a high positive correlation (r = 0.94), which indicates possible redundancy. However, feature contribution analysis with Random Forest (RF) importance and SHapley Additive exPlanations (SHAP) values revealed that NDGI contributes much more than NDVI to the predictions. Similarly, NDWI_Green and NDWI_Blue have an almost perfect correlation (r = 0.98), suggesting that they carry largely the same spectral information. Despite this, SHAP-based and RF feature importance showed that NDWI_Green makes a valuable contribution to model prediction, justifying its inclusion among the predictor variables. Additionally, NDVI and NDWI_Blue have a strong negative correlation (r = −0.85). Although high correlation values may signal possible redundancy, our investigation shows that correlation alone is not adequate for selecting variables: even though some predictor variables are highly correlated, the RF and SHAP analyses indicate that each contributes to model performance. Therefore, we retained these variables on the basis of their predictive significance rather than dropping them solely on the basis of correlation cut-offs.
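A Pearson correlation screen of this kind can be sketched as below. The data are synthetic, the feature names are placeholders, and the 0.8 cut-off is an illustrative assumption; the study does not state a specific threshold.

```python
import numpy as np

def correlated_pairs(table, names, threshold=0.8):
    """List feature pairs whose absolute Pearson r meets the threshold."""
    r = np.corrcoef(np.asarray(table, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs

# Synthetic table: "b" is nearly a copy of "a", "c" is independent.
rng = np.random.default_rng(1)
base = rng.normal(size=500)
table = np.column_stack([
    base,
    base + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])
pairs = correlated_pairs(table, ["a", "b", "c"])
```

Such a screen only flags candidate redundancy; whether a flagged variable should be dropped still depends on its contribution to the model, which is the point made above.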

Fig. 4

Pearson correlation matrix illustrating the multicollinearity among the predictor variables used in the study. The color scale represents the strength and direction of the correlation, ranging from −1 (strong negative correlation) to 1 (strong positive correlation).

Independence and importance of predictor variables

Machine learning and deep learning techniques successfully replicate the nonlinearity among variables; nonetheless, they are black-box models and do not, by default, provide insight into the contribution of each factor toward the end prediction. To tackle this challenge, the feature importance of predictor variables was assessed through SHAP values (Fig. 5), which not only quantify the contribution of each variable but also show the direction and magnitude of their influence on the classification outcome. For more insight, the overall variable ranking from the RF feature importance analysis is shown in Supplementary Fig. S1, which complements the SHAP results by illustrating the relative contribution of features based on their role.
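For reference, SHAP attributions are based on the Shapley value from cooperative game theory. In generic notation, which is not taken from this study, the contribution of predictor i for a sample x is usually written as below, where F is the full set of predictors, S ranges over subsets of F that exclude i, and f_S denotes the model output when only the predictors in S are known:

$$\phi_i=\sum_{S\subseteq F\setminus\{i\}}\frac{|S|!\,(|F|-|S|-1)!}{|F|!}\left[f_{S\cup\{i\}}(x)-f_{S}(x)\right]$$

A positive value means that the predictor pushes the prediction for that sample towards the glacial lake class, and a negative value pushes it away.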

Fig. 5

The plot illustrates the impact of individual features on the model’s prediction using SHAP values. Features are ranked by their importance, with NDSI, NDWI_Green, and Slope having the most significant influence.

The SHAP values derived for the study area show that the NDSI is the most significant feature, as positive SHAP values confirm that high values of the NDSI make classification of a glacial lake more likely. NDWI_Green also shows a considerable contribution, where high values (red) contribute positively to lake classification and low values (blue) decrease the probability of the presence of a lake. This conclusion is consistent with earlier research supporting the effectiveness of spectral water indices in the detection of water bodies41. Terrain characteristics such as slope and elevation are also significant, with gentle slopes and lower elevations promoting the formation of lakes, as also observed in high elevation lake distributions22.

NIR and NDWI_Blue are moderately significant, indicating their sensitivity to surface reflectance and water content. Aspect also emerged as an important factor, as it may influence lake formation patterns. Radar-based features such as ascending VV and VH backscatter are less important but still provide useful information on surface roughness and water presence, as indicated by the findings of Shen et al. 202242. The insights from SHAP and RF confirm the relative importance of the variables used in the study. The detailed ranking of variables from the RF model is provided in Supplementary Fig. S1 and discussed in detail in the supplementary note.

Classification results

Level I

The performance analysis of the Random Forest (RF) model for Level I classification yielded promising results (Fig. 6). The model achieved an overall accuracy of 93.69% on the test dataset, which means that it was largely successful in separating glacial lakes from other terrain. The statistical measures for class 1 (glacial lakes) also support the reliability of the model: the precision is 0.93, the recall is 0.89, and the F1-score is 0.91. These results show that the model predicts glacial lakes accurately with a good balance between recall and precision. The validation was conducted visually using high-resolution PlanetScope satellite data.

Fig. 6

Classification result using RF for Level I (Predict_S). The figure shows classified raster results obtained using Sentinel-2 optical data and related indices (NDWI, NDSI, NDVI, NDGI). Areas with a probability of glacial lake presence > 0.75 are shown in blue, while areas with ≤ 0.75 are depicted in white. Red boundaries indicate actual glacial lake boundaries used for validation.

Despite the high accuracy, there was some degree of misclassification in the form of false positive and false negative pixels. These classification mistakes are mainly due to spectral similarities of glacial lakes with adjacent features, including wet ice pixels, shadows, and frozen glacial lakes. This similarity makes it difficult to correctly separate water bodies from adjacent terrain.

Level II

To address the above-mentioned misclassification errors, we integrated the NIR band and other remote sensing indices, such as the NDVI and NDGI extracted from Planet data (Level II). Multi-value extraction was performed on training points using Planet’s NIR, NDVI, and NDGI, and these points were then merged with the original training dataset. This ensemble dataset was subsequently executed on an integrated SRTM + Sentinel-2 + Sentinel-1 image stack at 10 m resolution. With this upgraded methodology, we observed an increase in classification performance (Fig. 7), with a test accuracy of 94.44%, precision of 0.94, and recall of 0.92 for class 1 (glacial lakes). In addition, the number of false negatives decreased from 27 in Level I to 20 in Level II, while false positives decreased from 15 in Level I to 13 in Level II. This slight reduction in misclassification indicates that combining multiple data sources enhanced the model’s ability to distinguish glacial lakes from spectrally similar neighbouring features.

Fig. 7

Classification result using RF for Level II (Predict_P). The figure shows classified raster results obtained by training on indices derived from Planet data and applying the model to the Sentinel-2 band stack. Areas with a probability of glacial lake presence > 0.75 are shown in blue, while areas with ≤ 0.75 are depicted in white. Red boundaries indicate actual glacial lake boundaries used for validation.

Post-processing

The Random Forest model outputs class probabilities by aggregating predictions from all decision trees in the ensemble. For each pixel, the model calculates the probability of belonging to the “glacial lake” or “non-glacial lake” class based on the proportion of trees that vote for each class. The probability score therefore conveys the confidence of the classification as well as the predicted label, supporting both label assignment and uncertainty assessment. To refine the results, post-processing operations were applied to the RF output. A probability threshold of 0.75 was used, and only pixels with values above 0.75 were kept as glacial lake pixels to minimize misclassification. We then vectorized the thresholded RF classification.
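The vote-proportion and thresholding logic can be sketched as follows. The vote matrix is invented (four trees, four pixels) and the function names are hypothetical; only the 0.75 threshold comes from the text.

```python
import numpy as np

def lake_probability(tree_votes):
    """Fraction of trees voting 'glacial lake' (1) for each pixel."""
    return np.asarray(tree_votes, dtype=float).mean(axis=0)

def apply_threshold(probability, threshold=0.75):
    """Keep only pixels whose probability is strictly above the threshold."""
    return np.asarray(probability) > threshold

# Rows are trees, columns are pixels; 1 = lake vote, 0 = non-lake vote.
votes = np.array([
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
])
prob = lake_probability(votes)
mask = apply_threshold(prob)
```

In this toy case the pixels with exactly three of four votes sit at 0.75 and are therefore excluded, matching the figure captions in which values ≤ 0.75 are treated as non-lake.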

Geoprocessing techniques were employed to enhance the classification. Two classification outputs were considered: one derived from Level I (Predict_S) and the other from the ensembled Level II (Predict_P). A geoprocessing workflow was applied to Predict_S to ensure that the classified lake pixels in Predict_S were located within 10 m of Predict_P. Subsequently, Predict_S was restricted to encompass Predict_P. The two outputs were then combined using the union operation before being dissolved and smoothed to produce the final shapefile of the glacial lakes mapped by the RF model. This strategy facilitated the refinement of lake boundaries and improved the accuracy of the final map.

Accuracy assessment and validation

The RF model has excellent classification accuracy, with 93.69% for Level I (Predict_S), thus reaffirming its efficiency in classifying glacial lakes. The model’s AUC-ROC score of 0.984 (Fig. 8) also indicates its high discrimination ability between two classes. For class 1 (glacial lakes), the model performed with a precision of 0.93, recall of 0.89, and F1-score of 0.91, thus maintaining a well-balanced performance between precision and recall. The macro average values of precision, recall, and the F1-score are 0.94, 0.93, and 0.93, respectively, whereas the weighted average is consistently at 0.94. These outcomes affirm the strength of the RF model for both classes.

Fig. 8

AUC-ROC Curve for Level I classification. The ROC curve displays the performance of the Random Forest model used for glacial lake detection, with an Area Under the Curve (AUC) of 0.984.

Table 2 Model parameter assessment of the models including Precision, recall and F1-score for the two different levels.

Similarly, for the Predict_P dataset (Level II), the AUC-ROC score was 0.983 (Fig. 9), verifying the effectiveness of the model in distinguishing classes. The precision, recall, and F1-score for class 1 are 0.94, 0.92, and 0.93, respectively, highlighting its efficiency in classification. The macro and weighted averages increased to 0.95, 0.94 and 0.95, validating both improvement and consistency of the performance. The test accuracy was 94.44%, indicating the high classification ability of the model. Table 2 summarizes the precision, recall, and F1-score metrics for Predict_S and Predict_P, highlighting the classification performance for both the glacial lake and non-glacial lake classes. Moreover, the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) were calculated to provide further insight into the performance of the model. These are shown in Table 3: for Level I (Predict_S) the model recorded 213 TP, 411 TN, 15 FP, and 27 FN, while for Level II (Predict_P) TP increased to 220, FN fell to 20, and FP dropped to 13, with TN rising slightly to 413.

Fig. 9

AUC-ROC Curve for Level II classification based on Planet derived indices. The ROC curve displays the performance of the Random Forest model used, with an Area Under the Curve (AUC) of 0.983.

Table 3 Classification performance of the random forest model for two levels, showing true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

Additionally, to validate the final results visually, we used high-resolution PlanetScope images (3 m), as these images offer enhanced visibility of surface details. The results are depicted in Fig. 10, which shows model-derived glacial lake boundaries (red outlines) overlaid on PlanetScope images after applying post-processing steps to the model output. Lakes are precisely identified and delineated in Fig. 10a, b, d, h, i, j, k and l, demonstrating the method’s ability to capture boundaries with high spatial precision. A few smaller lakes that were not detected are highlighted with yellow boxes in Fig. 10f and g, while Fig. 10c and e show partial omissions, where portions of lakes were missed. Such errors occurred infrequently and were mainly due to factors such as small lake size, shadow interference, and spectral confusion with snow or debris-covered ice. This visual inspection supports the quantitative evaluation by confirming that the proposed RF approach performs reliably in mapping glacial lakes, with only minimal misclassifications.

Fig. 10

This figure illustrates the final delineation of glacial lakes, where red boundaries represent the detected lake outlines overlaid on 3 m spatial resolution Planet imagery. The post-processing step significantly improves the accuracy and clarity of lake boundaries by filtering out false positives and enhancing classification precision. Yellow boxes in panels (f) and (g) highlight regions where supraglacial lakes were misclassified due to spectral and textural similarities with nearby wet or saturated glacial surfaces.

Discussion

Proposed method for glacial lake mapping

In the last hundred years, the Himalayas have experienced rising temperatures well above the global average, leading to faster recession of glaciers and growth in glacial lakes11,16. Glacial lakes are valuable sources of fresh water for consumption, hydropower, and agriculture, but when such lakes grow rapidly and are structurally unstable, they pose considerable risks. Since the formation of glacial lakes is dynamic, frequently updated lake inventories are needed to observe and manage these hazards properly. The accelerated growth of glacial lakes in a warming climate calls for accurate and scalable automated mapping solutions.

This study provides an operational framework for detecting glacial lakes in high-altitude regions. It addresses critical needs in hazard assessment, monitoring of environmental change, and resource management, especially in rapidly evolving glacial environments. The proposed method meets these needs by overcoming the main drawbacks of traditional manual and semi-automated techniques. Manual digitization43 is accurate but time-consuming and impractical for large areas or frequent updates. Semi-automatic techniques44 usually rely on thresholding and involve substantial human intervention and post-correction, which restricts their consistency across diverse terrain. In contrast, our fully automated approach enables reproducible and accurate detection and delineation of glacial lakes in the remote Himalayan region by integrating multi-source remote sensing data as predictor variables within a Random Forest-based machine learning framework.

Although the importance of NDWI-green and NIR has been well established21,22, the present study highlights the role of NDWI-blue as an auxiliary index for delineating lake boundaries with greater accuracy under complex terrain conditions. SAR data are particularly valuable because they can be acquired under all weather conditions, including cloud cover. Consistent with the findings of previous studies42,45, our results demonstrate that integrating optical and radar data provides synergistic benefits for classification performance. The high importance of topographic parameters, particularly elevation and slope, is consistent with the findings of Mustafa et al. 202422. The RF classifier performs well in classifying glacial lakes. The Sentinel-based classification (Predict_S) model has an overall accuracy of 93.69%, with precision and recall of 0.93 and 0.89, respectively, for glacial lakes. Similarly, the Planet-based classification (Predict_P) model has an overall accuracy of 94.44%, with precision and recall of 0.94 and 0.92, respectively, for glacial lakes, confirming the reliability of the model.
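As a quick consistency check, the reported precision and recall values can be combined into an F1-score using the standard harmonic-mean definition. The short sketch below is illustrative only; the function name `f1_score` is our own and is not part of the study's code. The results agree with the F1-scores of 0.91 (Sentinel) and 0.93 (PlanetScope ensemble) quoted in the comparison with existing literature.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Glacial-lake class values reported in the text.
f1_sentinel = f1_score(0.93, 0.89)  # Predict_S
f1_planet = f1_score(0.94, 0.92)    # Predict_P

print(round(f1_sentinel, 2), round(f1_planet, 2))  # 0.91 0.93
```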

Furthermore, this study uses a probability threshold of 0.75 to reduce false positives in the classification output. In high-altitude regions, glacial lakes can resemble features such as streams, shadows, and supraglacial meltwater, which can lead to misclassification. Using a high probability threshold such as 0.75 retains only the most confident predictions. This reduces false positives and improves boundary accuracy, especially in complex glacial areas. However, the fixed threshold has a limitation: small or partially shadowed lakes with lower confidence scores may be left out, which leads to false negatives and an underestimation of lake area. While this approach increases precision, it limits sensitivity.
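The thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the per-pixel probabilities are made-up values, the helper `threshold_lake_mask` is hypothetical, and whether the comparison is inclusive (`>=`) or strict is not specified in the text. In practice the probabilities would come from the trained RF classifier.

```python
PROB_THRESHOLD = 0.75  # value stated in the text


def threshold_lake_mask(prob_map, threshold=PROB_THRESHOLD):
    """Return a binary mask: 1 where lake probability >= threshold, else 0.

    The inclusive comparison is an assumption made for this sketch.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Hypothetical 3x4 grid of per-pixel lake probabilities (illustrative only).
prob_map = [
    [0.10, 0.55, 0.80, 0.92],
    [0.05, 0.70, 0.88, 0.95],
    [0.02, 0.30, 0.60, 0.78],
]

mask_default = threshold_lake_mask(prob_map, threshold=0.5)
mask_strict = threshold_lake_mask(prob_map)  # uses 0.75

kept_default = sum(map(sum, mask_default))
kept_strict = sum(map(sum, mask_strict))
print(kept_default, kept_strict)  # 8 5
```

With these illustrative values, raising the threshold from 0.5 to 0.75 discards the three least confident pixels along the lake margin, which is the behaviour that suppresses false positives but can also remove genuine low-confidence lake pixels.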

Although deep learning models such as GLNet, investigated by Kaushik et al. 202217, perform well owing to their sophisticated architectures, Random Forest proved to be a practical alternative. As a classical machine learning model, RF yields high accuracy while remaining easier to use, since it does not require familiarity with deep learning46. This makes it an effective and accessible option for a wide variety of applications.

Comparison with existing literature

Compared with the existing literature, especially the GLNet model introduced by Kaushik et al. 202220 and the GLaKeMap approach presented by Wangchuk et al. 202021, our Random Forest-based approach shows comparable or better performance on most of the important metrics. For example, GLNet reported F1-scores between 0.70 and 0.91 across different test sites. Although GLNet was reported to map both large (>1 km²) and small (<0.5 km²) glacial lakes correctly without human intervention, its F1-score fell to 0.70 at a few locations. Our algorithm, in contrast, produced stable F1-scores of 0.91 (Sentinel) and 0.93 (PlanetScope ensemble), indicating strong performance in detecting even smaller glacial lakes over rugged Himalayan landscapes. Visual verification confirms that our approach consistently identified smaller lakes that are frequently missed or mislabeled by other methods. A detailed table comparing the proposed methodology with existing methods is provided in the supplementary material (Supplementary Table 1). Although the computational requirements were not explicitly stated in their paper20, GLNet is based on a Convolutional Neural Network (CNN) architecture, which typically relies on high-end GPU infrastructure for both training and inference. By contrast, our machine learning-based model is computationally light, resource-efficient, and does not require specialized GPU hardware. This makes it more accessible for operational use, especially where advanced computational resources are unavailable.

The GLaKeMap technique also showed good detection and delineation performance, achieving accuracies of 94.2–96.9% and mapping accuracies as high as ~97.96%. However, its accuracy was calculated by comparing the area of manually digitized lakes with that of automatically delineated lakes. This calculation, although simple, differs from standard pixel-based accuracy assessment. Our findings (Fig. 7f) reveal that, while automatically extracted lake areas may closely match digitized polygons in size, their spatial positions can differ substantially. These positional errors are not captured by area-based performance measures and can cause performance scores to be overestimated. In contrast, our assessment uses pixel-level matching and statistical measures such as precision, recall, and F1-score, which provide a more detailed measure of classification performance.
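The difference between area-based and pixel-level assessment can be illustrated with a small toy example. The sketch below is hypothetical: the lake geometry is invented, and `area_agreement` is a simple area ratio chosen for illustration, not necessarily the formula used in the GLaKeMap study. A predicted lake with exactly the same area as the reference, but shifted by two pixels, scores perfectly on the area measure while its pixel-level precision, recall, and F1-score all drop to 0.5.

```python
def pixel_metrics(reference, predicted):
    """Precision, recall and F1 from two sets of (row, col) lake pixels."""
    tp = len(reference & predicted)   # pixels mapped as lake in both
    fp = len(predicted - reference)   # predicted lake, not in reference
    fn = len(reference - predicted)   # reference lake, missed by prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def area_agreement(reference, predicted):
    """Ratio of the smaller to the larger mapped area (1.0 = equal area).

    Illustrative area-only measure; it ignores where the pixels are.
    """
    a, b = len(reference), len(predicted)
    return min(a, b) / max(a, b) if max(a, b) else 1.0


# Hypothetical 4x4-pixel lake and an equally sized prediction
# shifted two pixels to the east.
reference = {(r, c) for r in range(4) for c in range(4)}
predicted = {(r, c + 2) for r in range(4) for c in range(4)}

print(area_agreement(reference, predicted))  # 1.0
print(pixel_metrics(reference, predicted))   # (0.5, 0.5, 0.5)
```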

Additionally, our incorporation of multisource data sets such as optical indices (NDWI, NDVI, NDGI), radar backscatter (VV, VH), and topographic variables (slope, elevation, aspect) improved the RF classifier’s ability to identify glacial lakes against spectrally similar terrain features like shadows and snow cover. This is especially advantageous in high-altitude, snow-covered, and cloud-ridden environments where single-source data tend to underperform.
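To make the multi-source predictor set concrete, the sketch below assembles a per-pixel feature vector from optical, radar, and topographic inputs. It is an illustration under assumptions rather than the study's code: the dictionary keys, the helper names, and the sample values are hypothetical, and the indices use the common normalized-difference forms (NDWI-green from green and NIR, NDWI-blue from blue and NIR, NDVI from NIR and red). NDGI is omitted because its band combination is not given in this section.

```python
def normalized_difference(a: float, b: float) -> float:
    """Generic (a - b) / (a + b) index; returns 0.0 if the denominator is 0."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def build_feature_vector(pixel: dict) -> list:
    """Assemble optical, SAR and topographic predictors for one pixel."""
    return [
        normalized_difference(pixel["green"], pixel["nir"]),  # NDWI-green
        normalized_difference(pixel["blue"], pixel["nir"]),   # NDWI-blue
        normalized_difference(pixel["nir"], pixel["red"]),    # NDVI
        pixel["vv"],         # SAR backscatter, VV polarization
        pixel["vh"],         # SAR backscatter, VH polarization
        pixel["elevation"],  # from the DEM
        pixel["slope"],
        pixel["aspect"],
    ]


# Hypothetical water-like pixel: low NIR reflectance, low SAR backscatter.
sample_pixel = {
    "blue": 0.08, "green": 0.10, "red": 0.06, "nir": 0.02,
    "vv": -22.0, "vh": -28.0,
    "elevation": 5100.0, "slope": 1.5, "aspect": 180.0,
}

features = build_feature_vector(sample_pixel)
print(features)
```

For the sample pixel, both NDWI variants are positive and NDVI is negative, which is the typical signature of open water; the SAR and topographic terms supply information that the optical indices alone cannot.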

Although both GLNet and GLaKeMap contributed meaningfully to glacial lake mapping automation, our approach builds and improves on these with a more flexible, interpretable, and efficient solution. It overcomes limitations of threshold-dependency for image segmentation and area-only accuracy measures and exhibits better performance, specifically in detecting small and spectrally ambiguous lakes. These advancements are critical for operational GLOF risk assessment systems and long-term environmental monitoring in remote Himalayan basins.

Limitations and future work

The classification results appear promising after visual validation using high-resolution Planet images. However, we still observed some misclassifications of supraglacial lakes. Despite the use of both optical and SAR data and rigorous post-processing, detecting supraglacial lakes remains challenging because of their spectral similarity to, and spatial ambiguity with, surrounding glacier surfaces. As seen in Fig. 10f and g (yellow boxes), some of these lakes were not detected by the model.

Additionally, a fixed probability threshold of 0.75 was applied to reduce false positives. However, this fixed threshold creates a trade-off between precision and recall. Lakes that are small, partially shadowed or ice-covered may be assigned lower probabilities and thus excluded from the final classification. This can lead to false negatives and a slight underestimation of lake area in specific regions. Although the threshold significantly improves output quality, future studies could explore adaptive or region-specific thresholds and incorporate temporal information to improve detection accuracy. Extending the model to time series analysis could also help track lake evolution and support GLOF risk assessments and early-warning systems.
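The precision-recall trade-off introduced by the threshold can be examined by sweeping candidate values over a labelled validation sample, which is one simple starting point for choosing a region-specific threshold. The sketch below is illustrative only: the probabilities and labels are invented, and `precision_recall_at` is a hypothetical helper, not part of the study's workflow.

```python
def precision_recall_at(probs, labels, threshold):
    """Precision and recall for the lake class at a given threshold."""
    predicted = [p >= threshold for p in probs]
    tp = sum(1 for pred, y in zip(predicted, labels) if pred and y == 1)
    fp = sum(1 for pred, y in zip(predicted, labels) if pred and y == 0)
    fn = sum(1 for pred, y in zip(predicted, labels) if not pred and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical validation pixels: predicted lake probability and true label
# (1 = lake, 0 = non-lake). Values are illustrative only.
probs = [0.95, 0.90, 0.82, 0.78, 0.70, 0.62, 0.58, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

for threshold in (0.50, 0.75):
    precision, recall = precision_recall_at(probs, labels, threshold)
    print(f"threshold={threshold:.2f} "
          f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy sample, the threshold of 0.75 removes the single false positive but also drops two true lake pixels with probabilities of 0.70 and 0.58, so precision rises while recall falls.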

Conclusion

This study contributes to cryospheric remote sensing by presenting a fully automated methodology for glacial lake detection in the Himalayas. It combines remotely sensed data from various sources with Random Forest (RF) classification and SHAP analysis. The model uses the SRTM DEM, Sentinel-1 SAR, Sentinel-2 MSI, and high-resolution PlanetScope data. This setup allows for accurate identification of glacial lakes.

This combination of sensors, particularly the use of PlanetScope for training, is critical for capturing fine-scale glacial features. The approach also achieves high classification performance across all key metrics, including AUC-ROC, F1-score, accuracy, precision and recall.

This method offers clear advantages over existing approaches. Unlike GLakeMap, which relies on manually defined thresholds for segmentation, our Random Forest model learns complex decision boundaries directly from the data. This improves adaptability across different terrains and reduces omission errors. Compared with deep learning models such as GLNet, which require large labelled datasets and substantial computational resources, our model provides comparable classification accuracy, achieving 94.44% overall accuracy at Level II (PlanetScope ensemble), while remaining computationally efficient and accessible to researchers without GPU infrastructure. The use of SHAP analysis adds model transparency by highlighting the importance of specific input features, which addresses a key issue with black-box models. False positives from features such as streams, shadows, and supraglacial meltwater were minimized by a post-processing step using a probability threshold of 0.75. This filtering retained only high-confidence lake pixels and greatly improved boundary accuracy in complex glacial environments.

Validation against high-resolution PlanetScope images confirmed strong agreement with model predictions, which supports the reliability of this approach. The method holds significant potential for operational use in assessing hazards associated with climate-driven glacial lake expansion, including the risk of Glacial Lake Outburst Floods (GLOFs). The framework presented here bridges the gap between algorithmic performance and practical usability and contributes to both scientific understanding and real-world glaciological monitoring.