Introduction

Water, the lifeblood of our planet, has been essential for sustaining all forms of life since the beginning of time. It is a vital resource that nurtures ecosystems and supports human existence. However, the quality and availability of water have become significant global concerns in recent years, primarily because of climate change and population growth1. Access to clean water is a pressing issue worldwide, particularly in underdeveloped countries. Recognizing this challenge, the Millennium Development Goals were established to improve access to safe drinking water sources for communities around the globe2. By focusing on this fundamental aspect of human well-being, we can address one of the most basic yet essential needs of individuals and empower communities to thrive. However, managing water resources effectively has become increasingly challenging in the face of escalating human activities. Activities such as the discharge of wastewater and sewage into surface and groundwater (GW) resources pose significant threats to water quality and overall ecosystem health3. The consequences of these actions extend far beyond human well-being and impact the delicate balance of our planet’s biodiversity.

Furthermore, rapid urbanization and population growth have placed immense pressure on GW resources, leading to their overuse and subsequent depletion4. Aquifers, the underground reservoirs that provide a significant portion of our freshwater supply, are at risk of irreversible decline. This highlights the urgent need for sustainable water management practices and the development of innovative solutions to ensure the long-term viability of this finite resource5. In recent decades, the combination of climate variability, rapid population growth, and overexploitation has had detrimental effects on valuable GW resources worldwide6,7,8,9. Climate variability, including changes in rainfall patterns, surface runoff, evaporation, and rising temperatures, has further exacerbated the issue10. Despite these challenges, GW remains a crucial resource, with subsurface water reserves exceeding those found in streams and freshwater lakes by a factor of more than 100.

Countries facing water scarcity and impacted by climate change-related changes in surface and GW quantity and quality encounter difficulties in understanding the dynamics of their GW resources11. This is particularly true for countries like Pakistan, where rapid GW use, especially in agriculture-dependent regions, has become a pressing concern for decision-makers responsible for managing sustainable GW resources. As a result, comprehending the GW resource regime has become a critical task for countries grappling with water shortages and climate change impacts11. Pakistan is among the most water-stressed countries in the world, and the growing GW quality problems are of grave concern for the increasing population12,13. Water pollution due to industrial, agricultural, and domestic sources has led to widespread contamination, affecting approximately 80% of the population with access to unsafe drinking water14. Urgent measures are needed to address these issues to ensure sustainable water management and achieve water-related UNSDGs by 203015. In recent years, the use of remote sensing (RS) and Geographic Information Systems (GIS) has proven effective in identifying potential GW zones16,17,18. RS provides valuable information and a comprehensive view of spatial and temporal distribution, enabling a more efficient assessment of regional GW flow dynamics19. The integration of RS science, GIS, and ground-based field research is crucial for predicting GW dynamics. Various factors such as geology, drainage patterns, slope, elevation, land use/land cover, lineament density, soil type, rainfall, topography, lithology, porosity, and climatic conditions play a significant role in understanding the occurrence, movement, and delineation of potential GW zones20. Various statistical and machine learning (ML) models have been developed and employed for GW prediction. 
Models such as the Analytic Hierarchy Process (AHP)21,22, Weight of Evidence (WOE)23, Artificial Neural Networks24,25, Support Vector Machine (SVM)26,27, Adaptive Network-based Fuzzy Inference System (ANFIS)28, Extreme Gradient Boosting (XGB)29, Fuzzy Logic30, and the M5 model have been widely adopted for GW prediction. Choubin and Rahmati31 adopted the Random Forest (RF) method for GW potential mapping in a fractured bedrock aquifer in Iran. Choubin, et al.32 generated a GW potential map for the Firoozeh watershed in Iran using the Classification and Regression Trees (CART) algorithm. Their study incorporated eleven conditioning factors and achieved an accuracy of 88%. Mosavi, et al.33 evaluated four ensemble models (GamBoost, AdaBoost, Bagged CART, and RF) for GW mapping using 339 GW resource locations and spatial conditioning factors. Their findings highlight that bagging models, particularly RF, demonstrated superior performance (accuracy = 0.86), with topographic and hydrological variables playing crucial roles in the modeling process. Recently, Saha, et al.34 employed an ML and geospatial data integration approach to manage urban and peri-urban aquifer sustainability in Vizianagaram, Southern India. Using RF and SVM models with hydrogeological and geo-environmental data, they categorized GW potential with prediction accuracies exceeding 80%, notably 88.40% for RF.

In our study, we chose to employ SVM and WOE techniques for several reasons that align with the specific objectives and characteristics of our research area. SVM was preferred due to its ability to handle the complex, nonlinear relationships inherent in GW potential mapping, offering high accuracy and robustness in classification tasks35. The method makes effective use of high-dimensional data and generalizes well when predicting GW potential in unobserved areas without significant risk of overfitting, which is crucial for spatial predictions. SVM has gained popularity in hydrology, as evidenced by its increasing adoption26,29,36. Additionally, WOE was selected for its simplicity and interpretability, allowing for the transparent integration of diverse thematic layers based on their statistical relevance to GW occurrence37,38. This approach facilitates a clear understanding and communication of results to stakeholders, which is essential for informed decision-making in GW management.

Water-shortage countries, particularly those affected by climate change impacts on surface and GW quantity and quality, face challenges in comprehending the dynamics of their GW resources. The rapid expansion of GW use, especially in agrarian-based developing nations like Pakistan, raises concerns for decision-makers responsible for managing sustainable GW resources. Monitoring GW observations from natural springs and wells is essential to identify stress zones losing GW storage and assess GW quality in the area. A comprehensive study of GW storage is vital for understanding the hydrological cycle and its relationship with climate change11. In the Northwestern Himalayas, where cities are located along river banks, water quality is adversely affected by rapid population growth39. Pakistan has a significant portion of the population lacking access to safe drinking water, below the standards set by the World Health Organization (WHO). In northern Pakistan, the population relies on natural springs and wells for drinking water, which also serve as irrigation sources for nearby agricultural areas. In Pakistan Administered Kashmir (PAK), over 80% of illnesses have been attributed to the consumption of poor-quality water from both surface and GW sources40. Proper management of GW quality is crucial for protecting these resources from contamination11. The Muzaffarabad municipality is facing challenges related to GW management and water quality41. The rapid population growth, coupled with limited access to safe drinking water, highlights the need for effective GW potential mapping and water quality assessment. Therefore, there is a pressing need to address these issues and develop sustainable strategies for GW resource management in the Muzaffarabad municipality. There is a significant research gap in the study area, regarding baseline studies for GW potential mapping and quality assessment. 
Despite the critical importance of sustainable water management in the region, there is a lack of comprehensive studies addressing these aspects. This study aims to fill this gap by systematically mapping GW potential and assessing water quality in Muzaffarabad, providing essential data for informed decision-making and sustainable water resource management. The research objectives are to map GW potential using SVM and WOE, assess water quality through geochemical tests, evaluate the effectiveness of the mapping techniques, identify factors influencing GW potential and water quality, and provide recommendations for sustainable GW resource management. The novelty of this research lies in the combined use of SVM and WOE for GW potential mapping, together with a comprehensive assessment of water quality, in the specific context of the Muzaffarabad municipality. The two methods were used in a complementary way rather than merged into a single model: the strengths of each method were used to compensate for the weaknesses of the other. The findings of this study will contribute valuable insights into the field of GW resource management and serve as a basis for sustainable strategies in the municipality.

Study area characteristics

The study area is situated in the northeastern Himalayas of Pakistan, between 34°20′00″ and 34°25′00″ N latitude and 73°26′00″ and 73°30′45″ E longitude, and covers an area of 52.1 km² (Fig. 1). Topographically, it lies at the confluence of the Neelum and Jhelum rivers and has rugged topography dissected by several minor and major streams. It is a hilly, mountainous area with elevation varying from 611 m to 1,630 m above mean sea level (MSL). Muzaffarabad, with a population of 149,005, faces challenges in GW availability due to its rugged terrain, particularly away from streams. The region’s aquifers, predominantly unconfined and situated near the Neelum River, rely on recharge from seasonal rainfall and the river itself42. The area experiences an average annual rainfall of 1400 mm, with temperatures ranging from 18 °C to 39 °C in winter and summer respectively. The local ecosystem is under significant pressure due to high population density, highlighting the critical link between GW potential and human sustainability43. Lithologically, the study area consists of several sedimentary rock units. These include the Precambrian Hazara Formation, the Cambrian Muzaffarabad Formation, the Paleocene-Eocene sequence, the Miocene Murree Formation, and the Quaternary alluvium deposits (Fig. 2). The study area faces severe challenges regarding GW potential and quality. The rugged topography, combined with the presence of numerous minor and major streams, obstructs the availability and accessibility of GW resources. This challenging terrain and the complex hydrogeological conditions make it imperative to establish a dependable method for identifying viable GW sources in the research area. Addressing these issues is of utmost importance, as the scarcity of reliable GW sources adversely impacts the well-being of local inhabitants and hinders sustainable development in the region. 
Therefore, conducting a comprehensive study to determine the GW potential and availability is crucial to developing effective strategies for sustainable water management and ensuring the prosperity of the local community in the study area.

Figure 1

Geographic location of the study area: (a) location with respect to the world map, (b) location of the study area in the NE part of Pakistan, (c) location map of the study area. (Source: panels a and b were generated using ESRI online resources.)

Figure 2

Geological map of the study area (digitized from37). 

Methodology

The GW potential mapping and water quality assessment methodology for the Muzaffarabad municipality consists of several key steps (Fig. 3). It begins with the collection of relevant data, including geological information, drainage patterns, slope, elevation, land use/land cover, and lineament density. Additionally, water quality data from springs and well bores were obtained from the Environmental Protection Agency (EPA), AJK, Pakistan. The collected data then underwent preprocessing, which involved cleaning and normalization. The cleaning process primarily focused on removing outliers and correcting errors, particularly those originating from locations outside the study area. For normalization, we applied the min-max approach, which scales each variable to a fixed range (typically 0 to 1) so that the variables are uniform and comparable. For GW potential mapping using SVM, the dataset was divided into training and testing sets. To reduce potential bias in the model, we adopted a randomization strategy: the dataset was randomly shuffled before being split in a 70:30 ratio, which helps keep the distribution of data points similar across both sets. A total of 39 samples were collected for the study, of which 27 (70%) were allocated for training and the remaining 12 (30%) were reserved for testing the model. The SVM algorithm was then applied to the training dataset to construct a predictive model. During this process, an appropriate kernel function was selected, and hyperparameters were fine-tuned to optimize the model’s performance. 
The SVM model was evaluated and validated on the testing dataset using three metrics: accuracy, F1-score, and the area under the ROC curve (AUC). Once validated, the trained SVM model was extended to cover the entire study area, generating a comprehensive GW potential map. Simultaneously, GW potential mapping was also conducted using the WOE technique. WOE values were calculated for each predictor variable based on their relationship with GW potential. These WOE values were then weighted to determine their relative importance in predicting GW potential. The weighted WOE values were combined to calculate an overall WOE score for each location within the study area. As a result, the study area was classified into different GW potential zones, leading to the creation of a GW potential map. Alongside GW potential mapping, a water quality index was developed using the collected water quality data. Various geochemical parameters, including the physical and chemical properties of water, were analyzed following the standard procedures outlined by the WHO. The water quality index was calculated from these parameters to assess the overall water quality at different locations within the municipality. In summary, this methodology combines the SVM and WOE techniques for GW potential mapping and integrates geochemical tests for water quality assessment, enabling the creation of GW potential maps and water quality index maps for the Muzaffarabad municipality.
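The preprocessing and splitting steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' workflow: the function names, the random seed, and the sample values are assumptions introduced here, and a plain shuffle (not a stratified one) is assumed.

```python
import random

def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def shuffled_split(samples, train_fraction=0.7, seed=42):
    """Shuffle a copy of the samples, then cut it into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary, hypothetical choice
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical elevation values (m) spanning the range reported for the study area.
elevations = [611, 850, 1120, 1630]
scaled = min_max_scale(elevations)

# 39 placeholder sample IDs stand in for the 39 field samples.
train, test = shuffled_split(range(39))
print(len(train), len(test))
```

With 39 samples, truncating 0.7 × 39 = 27.3 gives the 27 training and 12 testing samples reported in the text.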

Figure 3

Methodological flow diagram.
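The three validation metrics named in the methodology (accuracy, F1-score, and AUC) can be computed from first principles as sketched below. The labels and scores are invented for illustration and are not results from the study; the 0.5 decision threshold is likewise an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test labels (1 = GW present) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three are often reported together.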

Dataset used

In this study, the GW potential model was computed using eight thematic layers based on topographical, hydrological, geological, and ecological parameters. The topographical factors, including elevation, slope gradient, aspect, and curvature, were derived from the Phased Array type L-band Synthetic Aperture Radar (PALSAR) Digital Elevation Model (DEM) with a 12.5 m resolution obtained from the Advanced Land Observing Satellite (ALOS). The Spatial Analyst tool of ArcMap 10.8 was used for this purpose.

An area’s topographic characteristics play a crucial role in determining the water table elevation2. Additionally, geographical factors such as surface drainage and stream networks are significantly influenced by elevation. The maximum elevation in the area was 1630 m, which was divided into five classes (Fig. 4a). The decision to categorize elevation into five classes was based on a combination of geological and hydrogeological considerations, as well as practical considerations for the GW potential mapping study. The slope gradient indicates the degree of inclination or steepness of a surface and is relevant for determining runoff. The slope of the research area was classified into six classes: very mild slope (0–10), mild slope (11–19), gentle slope (20–26), moderate slope (27–33), steep slope (34–46), and extremely steep slopes (above 46) (Fig. 4b). The slope aspect, representing the direction a slope faces, is also essential as it affects snow melting and water infiltration. The slope aspect of the study area was classified into eight classes (Fig. 4c). Curvature, indicating the nature of the surface profile as concave or convex, influences water accumulation in the study area. The curvature of the slope was classified into three classes (Fig. 5a). Hydrological parameters, such as drainage patterns, are significant as they illustrate how quickly rainwater percolates into the soil2. The geological formations, soil absorption capacity, permeability, and slope of an area’s geology all impact its drainage system44. The drainage network of the study area was divided into five classes: 0 to 25 m, 25 to 50 m, 50 to 100 m, 100 to 250 m, and > 250 m, serving as evidence characteristics (Fig. 5b).
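The slope reclassification described above amounts to a lookup against class boundaries. The sketch below uses the six class limits given in the text; it assumes the values are slope angles in degrees and that a continuous value falling between two listed limits (for example 10.5) belongs to the higher class. The function and constant names are hypothetical.

```python
from bisect import bisect_left

# Upper limits of the first five slope classes, as listed in the text.
SLOPE_UPPER_BOUNDS = [10, 19, 26, 33, 46]
SLOPE_LABELS = [
    "very mild",
    "mild",
    "gentle",
    "moderate",
    "steep",
    "extremely steep",
]

def classify_slope(slope):
    """Return the slope class label for a slope value (assumed to be in degrees)."""
    # bisect_left finds the first upper bound that is >= slope, so a value
    # equal to a bound (e.g. 10) stays in the lower class.
    return SLOPE_LABELS[bisect_left(SLOPE_UPPER_BOUNDS, slope)]

for value in (5, 10, 11, 30, 46, 60):
    print(value, classify_slope(value))
```

The same pattern applies to the distance-to-drainage classes (0 to 25 m, 25 to 50 m, and so on) by swapping in the corresponding bounds and labels.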

Figure 4

Groundwater potential influencing parameters: (a) Elevation, (b) Slope gradient, (c) Slope aspect.

Figure 5

Groundwater potential influencing parameters: (a) Curvature, (b) Distance to streams, (c) Lineaments density.

GW replenishment is notably influenced by surface and subsurface geological parameters, including lithology and geological structure45. The area’s lithology is a crucial factor in determining GW potential zones due to its influence on hydraulic conductivity46. The lithological units were digitized from37 using ArcGIS 10.8 (Fig. 2). Lineaments, recognized as curvilinear features on satellite images, play a significant role in hydrogeology by indicating permeable zones and GW transport pathways47. Lineaments were extracted from Landsat-8 satellite imagery, and lineament density was then reclassified into four classes (Fig. 5c). The satellite image processing workflow involved both pre-processing and post-processing. In pre-processing, atmospheric correction was applied to the acquired images using a radiometric calibration tool, so that atmospheric interference was minimized or removed. Post-processing involved applying a smoothing operation to reduce noise and enhance the visual quality of the images. One of the most important ecological parameters for determining GW occurrence is land use and land cover (LU/LC). Changes in LU/LC have a direct impact on GW flow and also directly influence hydraulic conductivity48. The LU/LC map of the area was prepared in Exelis ENVI from Sentinel-2 satellite imagery with a cell size of 10 m × 10 m and less than 2% cloud cover, acquired from the Copernicus Open Access Hub (EU). The LU/LC was classified into five classes: Water Bodies, Green Land, Forest Land, Barren Land, and Urban Land (Fig. 6). In this work, both input data and output results are on a regional scale. The input data, collected from various regional sources, include a DEM at 12.5 m resolution, Landsat imagery at 30 m resolution, and a geological map at a scale of 1:50,000. 
The final GW potential map is presented at a resolution of 30 m, ensuring regional relevance and applicability. Overall, this study integrated diverse data layers from topography, hydrology, geology, and ecology to compute the GW potential model, providing valuable insights into GW availability and quality within the Muzaffarabad municipality.

Figure 6

Landcover map of the Muzaffarabad municipality area.

Groundwater potential modeling methods

Weight of evidence (WOE)

The WOE is a bivariate statistical approach that has been employed in many scientific investigations to assess environmental phenomena for almost six decades49,50,51. The approach was initially created for the diagnosis and prediction of diseases49, and in the 1980s it was applied to the probability assessment of minerals52. The WOE is based on a Bayesian statistical approach and evaluates the geospatial correlation of a phenomenon with its influencing factors by assigning a weight to each class of each causative factor. For every class, the approach computes a positive weight (W+), which reflects the presence of the evidence class where the predicted variable occurs, a negative weight (W−), which reflects the absence of the evidence class, and a contrast (C), which quantifies the strength of the relationship between the class of the influencing factor and the evidence. The positive and negative weights are given in Eq. 1 and Eq. 2:

$$W^{+}=\ln\left(\frac{\dfrac{\text{Happening area in observed class}}{\text{Total happening area}}}{\dfrac{\text{Non-happening area in observed class}}{\text{Total non-happening area}}}\right)$$
(1)
$$W^{-}=\ln\left(\frac{\dfrac{\text{Happening area in other classes}}{\text{Total happening area}}}{\dfrac{\text{Non-happening area in other classes}}{\text{Total non-happening area}}}\right)$$
(2)

and the contrast (C) value can be computed by the following formula (Eq. 3)

$$C=W^{+}-W^{-}$$
(3)
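A small worked example may make Eqs. 1 to 3 more concrete. The sketch below computes W+, W−, and C for a single class of one factor; the function name and all counts are hypothetical and are not taken from Table 1.

```python
import math

def woe_weights(occ_in_class, occ_total, non_in_class, non_total):
    """Return (W+, W-, C) for one class of one influencing factor.

    occ_* are GW occurrence (happening) counts or areas, non_* are
    non-occurrence counts or areas; 'in_class' refers to the observed class.
    """
    # Eq. 1: share of occurrences inside the class vs share of non-occurrences inside it.
    w_plus = math.log((occ_in_class / occ_total) / (non_in_class / non_total))
    # Eq. 2: the same ratio computed for everything outside the class.
    w_minus = math.log(
        ((occ_total - occ_in_class) / occ_total)
        / ((non_total - non_in_class) / non_total)
    )
    # Eq. 3: contrast.
    return w_plus, w_minus, w_plus - w_minus

# Hypothetical numbers: 12 of 27 training points fall in the class,
# which covers 200 of 1000 non-occurrence cells.
w_plus, w_minus, contrast = woe_weights(12, 27, 200, 1000)
print(w_plus, w_minus, contrast)
```

Here the class holds a larger share of occurrences (12/27) than of non-occurrence area (200/1000), so W+ is positive, W− is negative, and C is positive. A class with no occurrences, or one that contains all of them, would need special handling because the logarithm is undefined at zero.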

The WOE assessment evaluates the effectiveness of each influencing factor by correlating it with GW occurrence at different locations. Based on this correlation, specific weights are assigned to each class of each influencing factor. These weights signify the relative importance of each factor in predicting GW potential, and the technique thus helps in determining the contribution of different factors to GW availability and quality in the study area. W+ and W− are the two measurements used to determine these weights. W+ is obtained from the training points located inside the class: it is larger than 0 if the class contains more training points than would be expected by chance. W− is obtained from the training points located outside the class: it is smaller than 0 if the number of training points outside the class is fewer than would be expected by chance. W+ and W− are combined to form the contrast (C), a measure of the overall relationship between a class and the training points. The weights of each influencing factor are determined from W+ and W−. The weighted influencing-parameter maps are then combined into a probability map using the following summation (Eq. 4).

$$\mathrm{WoE}_{GW}=\sum_{i=1}^{n}W(i)=W(1)+W(2)+\dots+W(n)$$
(4)
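Eq. 4 is a per-location sum of the weights of the classes that the location falls into, one per thematic layer. The sketch below illustrates this for a single raster cell; the layer names, class names, and weight values are invented for illustration, and which quantity serves as W(i) (for example the contrast C) is an assumption, since the text does not specify it.

```python
def woe_score(cell_classes, layer_weights):
    """Sum the class weight of every thematic layer for one location (Eq. 4)."""
    return sum(layer_weights[layer][cls] for layer, cls in cell_classes.items())

# Hypothetical per-class weights for two of the eight thematic layers.
layer_weights = {
    "slope": {"very mild": 0.8, "steep": -0.5},
    "lithology": {"surficial deposits": 1.1, "Murree Formation": -0.3},
}

# Classes observed at one hypothetical raster cell.
cell = {"slope": "very mild", "lithology": "surficial deposits"}
score = woe_score(cell, layer_weights)
print(score)
```

Applying the same sum to every cell, and then grouping the resulting scores into intervals, yields the classified GW potential zones.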

In this study, WOE bivariate statistical analysis is performed using the ArcSDM (Spatial Data Modeller) which is an extension of the ArcGIS52. It is used to determine the weight and related statistics of influencing elements and to develop a posterior probability GW model.

Support vector machine (SVM)

The SVM, sometimes known as the maximum margin classifier, is another popular classification approach based on statistical learning theory. It was developed in the 1990s and is among the most widely used approaches because of its performance across a variety of problems. SVM is grounded in optimization and statistical learning theory, and it is typically used to achieve the highest possible generalization capability by accounting for both the empirical risk and the confidence interval. The kernel function is described by Eq. 5.

$$K\left(x_{a},x_{a'}\right)=\exp\left(-\lambda\sum_{b=1}^{n}\left(x_{ab}-x_{a'b}\right)^{2}\right)$$
(5)

In Eq. 5, \(x_{ab}\) and \(x_{a'b}\) are the values of the bth predictor for the pair of observations a and a′, n is the number of predictors, \(\lambda\) is a tuning parameter that accounts for the smoothness of the decision boundary (the gamma, γ, of the kernel), and K stands for the kernel function. Hyper-parameter tuning is a critical step in ML, involving systematic testing and optimization to improve model performance. We employed a Radial Basis Function (RBF) kernel, which is particularly effective for classification and regression tasks involving complex relationships. In this study, the best tuning parameters were determined using the grid search (GS) technique. The process began with training the SVM model with the RBF kernel and using GS to find the optimal parameters, cost (C) and gamma (γ). The GS analysis identified the optimum values as a gamma of 0.3 and a cost of 1, which were used in our model. For faster modelling, the LSM tool pack (FE tool) developed by53 was used in ArcGIS to generate the SVM-GW model.
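To illustrate Eq. 5, the RBF kernel can be written directly in Python. This is only a sketch of the kernel itself, not of the LSM tool or of the full SVM training; the default value of 0.3 follows the gamma reported in the text, and the feature vectors are hypothetical min-max-scaled predictor values.

```python
import math

def rbf_kernel(x_a, x_a_prime, gamma=0.3):
    """RBF kernel of Eq. 5: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((u - v) ** 2 for u, v in zip(x_a, x_a_prime))
    return math.exp(-gamma * squared_distance)

# Two hypothetical observations described by three scaled predictors.
obs_1 = [0.10, 0.80, 0.35]
obs_2 = [0.40, 0.75, 0.90]

print(rbf_kernel(obs_1, obs_1))  # identical observations give the maximum similarity
print(rbf_kernel(obs_1, obs_2))
```

A larger gamma makes the kernel decay faster with distance, producing a more flexible (less smooth) decision boundary, which is why gamma is tuned together with the cost parameter in the grid search.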

Water quality analysis

Water quality analysis begins at the sampling site immediately after collecting the samples. The reliability of the analysis heavily relies on having the necessary facilities, tools, and equipment, as well as following recommended sampling procedures. During the sampling process, utmost care is taken to eliminate any external factors that could alter the composition of the samples and affect the results. To facilitate this, water samples for physical, biological, and chemical analysis were collected using 1-litre capacity polystyrene bottles. The location coordinates of each spring were recorded using the Global Positioning System (GPS).

At the sampling site, physical parameters such as Temperature, Electrical Conductivity (EC), Turbidity, Odour, Colour, and Taste were measured, and the GPS coordinates of each sampling site were recorded. All collected samples were placed in a chill box to maintain their integrity and then transferred to the laboratory for further bacterial and chemical analysis. Advanced multi-parameter equipment was used to analyze parameters like pH, conductivity, temperature, and turbidity. For the chemical parameter analysis, a high-tech Spectrophotometer Lovibond XD-7500 was employed. The chemical parameters included total hardness, Ca, Mg, Chloride, Nitrite, Nitrate, and Sulphate. Bacterial test kits were also incubated for 48 h, after which they were carefully examined. Following the analysis, a table was created to record the results for each parameter based on the collected samples. The spatial distribution map of each parameter in the study area was developed using ArcMap 10.8. Each parameter was then reclassified into different classes, ranging from least to suitable, based on the standards set by the WHO for drinking water. In summary, the water quality analysis process is conducted meticulously, starting from the sampling site to the laboratory analysis and mapping, ensuring accurate and reliable results for each parameter in the study area.

Water quality index (WQI) modeling

The Water Quality Index (WQI) is an effective way to summarize all quality parameters into a single value, expressing the suitability of water resources for human consumption. As described by54, the WQI is a comprehensive ranking that considers the combined impact of multiple factors affecting water quality. The WQI is determined by assessing the cumulative effect of both human-induced and naturally occurring activities, based on specific parameters in the hydro-geometry properties of the water samples.

To calculate the WQI values for each sample location, the average concentration of determinants (TDS, EC, Cl⁻, HCO₃⁻, PO₄³⁻, Ca²⁺, NO₃⁻, Na⁺, K⁺, and Mg²⁺) is used for both dry and wet periods. The weighted overlay method, commonly used for multi-criteria analysis such as site selection and appropriateness models, is employed to develop the WQI model. Each thematic layer is assigned a weight based on its significance using the Analytic Hierarchy Process (AHP) method. The developed thematic layers in raster format are integrated into ArcMap 10.8, and the model is constructed using the weighted overlay tool. Each reclassified thematic layer is then weighted according to its importance based on the WHO drinking water standards.
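The AHP weighting and weighted overlay steps can be sketched as follows. The pairwise comparison matrix, the three-parameter setup, and the suitability scores are hypothetical, and the row geometric-mean method is used here as a common approximation of the AHP priority vector; the study does not state which AHP variant or comparison values were used.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights from a pairwise comparison matrix.

    Uses the geometric mean of each row, normalized so the weights sum to 1.
    """
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def weighted_overlay(class_scores, weights):
    """Weighted sum of reclassified scores for one location."""
    return sum(s * w for s, w in zip(class_scores, weights))

# Hypothetical comparisons among three parameters (e.g. TDS, NO3-, Cl-):
# entry [i][j] states how much more important parameter i is than parameter j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical reclassified suitability scores (1 = least suitable, 5 = most suitable).
wqi_value = weighted_overlay([4, 2, 5], weights)
print(weights, wqi_value)
```

In a GIS the same weighted sum is evaluated cell by cell over the reclassified rasters, which is what a weighted overlay tool does internally.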

Results and discussion

Factor analysis

The slope of an area has a significant impact on both GW potential and GW infiltration. Table 1 illustrates the significance of slope as a key topographic factor, with flat to mild slopes being more suitable for GW presence in the study area. GW potential is greater in flat and gently sloping terrain55, where there is a higher probability that GW can accumulate.

Table 1 Cumulative weights analysis for the causative parameters.

Topographic factors are also essential in determining the elevation of the GW56. The spatial significance of slope aspect as a topographic factor is shown in Table 1. Southwest-facing slopes have the maximum GW potential in the study area, based on the maximum weight value in the WOE analysis. This is because southwest-facing slopes in the study area receive more rainfall and moisture than slopes facing other directions. Table 1 also shows the WOE spatial analysis of elevation. The analysis reveals that areas in the elevation range of 611–687 m have greater GW potential. Low-elevation areas are more susceptible to runoff57, and water absorption and recharge are greater in low-elevation plain terrain58. Slope curvature was also analyzed with WOE as a topographic factor (Table 1). The analysis revealed that slopes with concave geometry have the maximum GW potential, since this curvature category holds more water over a long period57. Concave areas often act as recharge zones, collecting water runoff, while flat areas may indicate favourable conditions for aquifer development.

Proximity to the drainage network is important for delineating GW in an area, because the drainage pattern reflects the characteristics of surface and subsurface formations. The WOE assessment of the spatial relationship between drainage proximity and GW is shown in Table 1. The study concluded that areas within 50 m of the drainage network have the maximum GW potential in the study area. The spatial significance of lithology as an effective geological element is also shown in Table 1. The WOE analysis concluded that the surficial deposits of the study area have the highest GW potential, as they have the maximum weight value among the lithological units. This is because most of Muzaffarabad’s municipality area comprises surficial deposits. The predominance of surficial deposits, coupled with low elevation, suggests a high potential for GW due to the characteristics of these lithological units. This aligns with findings from59, highlighting how such geological conditions can significantly influence GW availability and potential in Muzaffarabad’s municipality area. In hard rock terrains, lineaments play a significant role in GW recharge; adjacent to lineament zones, the GW occurrence potential is high60. Lineament intersections are considered good GW potential zones. The spatial relation between GW and lineament density in the study area is analyzed in Table 1. The analysis concluded that the lineament density class above 4.5 has more GW potential, as lineaments provide pathways for groundwater movement. The GW potential is high near high-density lineament zones and vice versa61. These results are in line with a recent study conducted by62 in Lahore, Pakistan. In contrast, Doke, et al.59 observed that high-density lineament zones may not universally indicate very high GW potential. This discrepancy could be attributed to additional factors such as lithology, soil cover, and gradient, which play crucial roles in influencing subsurface water recharge dynamics. Land use/land cover is also an important factor for GW63. Table 1 indicates the spatial significance of land use/land cover as an ecologically effective element. Water bodies, followed by barren areas, have the maximum GW potential in the study area.
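The class weights reported in Table 1 come from the WOE analysis. As a rough illustration of how such weights are commonly computed, the sketch below uses the widely used positive weight, negative weight, and contrast formulation. The exact formulation used in the study is not shown in this section, so treat this as an assumption; the function name and all counts are hypothetical.

```python
import math


def woe_weights(occ_in_class, occ_total, cells_in_class, cells_total):
    """Return (W+, W-, contrast) for one class of a thematic layer.

    occ_* count cells containing a GW inventory point; cells_* count all cells.
    """
    non_in_class = cells_in_class - occ_in_class
    non_total = cells_total - occ_total
    p_class_given_occ = occ_in_class / occ_total
    p_class_given_non = non_in_class / non_total
    w_plus = math.log(p_class_given_occ / p_class_given_non)
    w_minus = math.log((1 - p_class_given_occ) / (1 - p_class_given_non))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts: 20 of 40 inventory points fall in a class that covers
# 1,020 of 10,040 cells.
w_plus, w_minus, contrast = woe_weights(20, 40, 1020, 10040)
```

A positive contrast indicates that inventory points are over-represented in the class relative to its areal extent, which is the sense in which a class is described here as having "more potential".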

Ground water potential model

Planning and long-term development of a region depend heavily on a sound understanding of its GW potential, and sustainable GW management depends on this kind of knowledge. A detailed GW resource evaluation is necessary because GW availability varies in time and space. Eight thematic maps were developed and analyzed to produce WOE and SVM composite responses for identifying GW potential zones in the municipality of Muzaffarabad (Fig. 7a, b). Based on the natural breaks (Jenks) classification, the study area was divided into five categories: very poor, poor, moderate, good, and excellent GW potential zones (Table 2). The results showed that 31.1% (16.21 km2) of the area had excellent GW potential based on the WOE model, whereas the SVM model placed only 20.3% (10.59 km2) of the area in the excellent potential zone. According to the WOE model, the good GW potential zone covers 6.9% (3.58 km2), while in the SVM model it covers 12.9% (6.74 km2). In the WOE model, 6.1% (3.20 km2) had moderate GW potential, 8.9% (4.65 km2) had poor GW potential, and 47% (24.47 km2) had very poor GW potential, whereas in the SVM model, 25.7% (13.47 km2) had moderate, 22.3% (11.61 km2) poor, and 18.7% (9.75 km2) very poor GW potential. The differences observed between the SVM and WOE models occur mainly outside the Muzaffarabad municipality, particularly in regions with very poor to moderate GW potential. These differences are minimal in the core area of interest and do not significantly impact the study’s findings. The variation can be attributed to the distinct methodologies: SVM is sensitive to kernel selection and parameter tuning, which influences its ability to model complex patterns, while WOE relies on the logistic transformation of input variables, providing a clear interpretation of the relationship between categorical and binary variables. 
Consequently, some degree of variation between the models is expected, particularly in areas outside the primary focus of our study. In the GW potential model, the influence of topographic conditions, particularly slope and elevation, is evident. Areas with high elevation and steep slopes, often composed of Precambrian and Cambrian formations, exhibit low GW potential due to their limited infiltration capacity. These formations are typically characterized by low permeability, which hinders the infiltration of rainwater and contributes to rapid surface runoff, reducing GW recharge. Conversely, areas with low elevation and gentle slopes have higher GW potential. These regions are often covered by Quaternary deposits, which are more permeable and allow for greater infiltration. The nearly flat terrain in these areas slows down runoff, providing ample time for rainwater to percolate into the ground and thus enhancing groundwater recharge. Our results also indicate that areas with concave morphology exhibit higher GW potential, as this terrain naturally channels and retains surface water, enhancing infiltration and contributing to increased groundwater recharge. GW potential is also highest near drainage networks because of the increased availability of surface water, which promotes infiltration. The proximity to water bodies ensures a continuous supply of water, while high lineament density enhances the permeability of the subsurface, allowing more water to percolate and recharge the aquifers. This combination of factors creates ideal conditions for sustaining higher groundwater levels in these areas.
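The area shares quoted above follow from the zone areas by simple division. A minimal sketch using the WOE zone areas stated in the text is given below; the function name and the dictionary layout are illustrative choices, and the last two keys correspond to the two lowest potential classes.

```python
# Zone areas in km2 for the WOE model, as quoted in the text.
WOE_AREAS_KM2 = {
    "excellent": 16.21,
    "good": 3.58,
    "moderate": 3.20,
    "poor": 4.65,
    "very poor": 24.47,
}


def area_shares(areas_km2):
    """Return each zone's share of the total mapped area, in percent."""
    total = sum(areas_km2.values())
    return {zone: round(100.0 * area / total, 1) for zone, area in areas_km2.items()}


shares = area_shares(WOE_AREAS_KM2)
```

Applying the same function to the SVM areas gives the corresponding shares for that model; small rounding differences from the reported percentages are possible.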

Figure 7

Groundwater potential map produced through (a) WOE method, (b) SVM model.

Table 2 The total area of GW potential zones using WOE and SVM model.

In terms of area, Chella Bandi, Lower Plate, Domel, Lower Chatter, and Ambore exhibit higher GW potential than Tariqabad, Dhanni, and Mujajar Colony. The GW potential maps produced by both SVM and WOE show that the highest potential classes are concentrated in the watershed area, particularly around the river systems. This finding aligns with previous studies by 32 and 64, which also observed higher GW potential near rivers. The maps indicate that GW potential decreases as distance from rivers and lineaments increases. Low-elevation areas show a high potential for GW, which is in line with the findings of26, who found that elevation is the most influential factor for groundwater in Markazi Province, Iran.

Validation of ground water potential model

To measure the accuracy of the models, the area under the curve (AUC) was calculated from the SRC (training) and PRC (testing) datasets. Model performance is considered excellent for AUC values of 0.9–1, good for 0.8–0.9, medium for 0.7–0.8, sufficient for 0.6–0.7, bad for 0.5–0.6, and poor for values below 0.5. In this study, 70% of the GW inventory points from the study area are used to develop the models, while the remaining 30% are used to validate them. The predictive performance of the models is evaluated using receiver operating characteristic (ROC) curves. Figure 8a, b shows the AUC of the models, which fall in the good class. In terms of AUC-ROC, SVM outperforms the WOE model, with higher PRC and SRC values. To improve model performance and prevent overfitting, a feature selection process was adopted65. Pearson’s correlation method was used to assess correlations among variables. This analysis was conducted using the Semi-Automatic Feature Selection module integrated within the ArcGIS Environment Toolbox. The correlation matrix plot facilitated the visual examination of correlation coefficients, allowing the manual removal of highly correlated features from the dataset. In addition, two different techniques, WOE and SVM, were employed to mitigate bias and overfitting.
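The correlation screening described above can be illustrated with a short standalone sketch. It is not the Semi-Automatic Feature Selection module; the feature values and the 0.8 cut-off are hypothetical, since the study reports a manual inspection of the correlation matrix rather than a fixed threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlated_pairs(features, threshold=0.8):
    """List feature pairs whose absolute correlation reaches the threshold."""
    names = list(features)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(features[first], features[second])
            if abs(r) >= threshold:
                flagged.append((first, second, round(r, 3)))
    return flagged


# Hypothetical values sampled at five inventory points.
features = {
    "slope": [2.0, 5.0, 9.0, 14.0, 20.0],
    "elevation": [610.0, 640.0, 690.0, 750.0, 820.0],
    "drainage_distance": [40.0, 300.0, 120.0, 500.0, 60.0],
}
flagged = correlated_pairs(features)
```

Pairs returned by `correlated_pairs` are candidates for review; which member of a pair to drop remains a judgment call, as in the manual procedure described in the text.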

Figure 8

AUC-ROC curves: (a) SRC curve (b) PRC curve.
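For readers who want to reproduce this kind of validation, the sketch below computes the AUC with the pairwise (rank-based) definition and maps it onto the performance classes listed in the text. The labels and scores are hypothetical, and the handling of exact class boundaries (for example, an AUC of exactly 0.9) is an assumption.

```python
def auc_score(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def rate_auc(auc):
    """Map an AUC value onto the performance classes used in the text."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "medium"
    if auc >= 0.6:
        return "sufficient"
    if auc >= 0.5:
        return "bad"
    return "poor"


# Hypothetical validation set: 1 = GW inventory point, 0 = non-occurrence.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = auc_score(labels, scores)
rating = rate_auc(auc)
```

In this toy example eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9 and falls in the "good" class.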

Spatial distribution of water quality parameters

Bacterial contamination

A total of 39 water samples, including boreholes/wells and springs, were collected within the municipality limits of Muzaffarabad City, PAK, for an in-depth analysis of GW quality. The primary focus was to assess the water quality parameters and identify potential bacterial and chemical contamination. The water samples were analyzed for the presence of bacterial contamination. Notably, 30 out of the 39 water samples (about 77%) were found to be contaminated with bacteria (Table 3). Specifically, E. coli was detected in these 30 samples, indicating a significant risk to public health.

Table 3 Descriptive statistics of various GW parameters.

Alarming results were observed regarding the suitability of the water for drinking purposes. Only 9 out of the 39 samples were deemed fit for consumption without any treatment. This implies that the vast majority of the collected water samples require some form of treatment before they can be considered safe for human consumption. The presence of E. coli in drinking water samples, as highlighted by66 in Taunsa Sharif, is a critical indicator of recent fecal contamination and poses significant health risks due to potential pathogenicity. The detection of E. coli in 30 of the 39 analyzed samples indicates widespread contamination, suggesting inadequate sanitation or treatment practices at the water sources. The findings from this GW quality analysis emphasize the importance of regularly monitoring water sources to ensure public health and safety. The high prevalence of bacterial contamination and chemical impurities underscores the need for proper water treatment and management strategies in the study area. Further investigation and implementation of appropriate remediation measures are essential to safeguard the well-being of the local population.

Physical parameters

Electrical conductivity (EC)

The EC of water is influenced by a wide range of geological processes and human activities, such as ion exchange, reverse ion exchange, evaporation, silicate weathering, water-rock interactions, sulphate reduction, oxidation processes, and anthropogenic influences67,68. The research findings, as shown in Table 3, indicate that the examined samples comply with the WHO (2022) standard for EC, which is 1,500 µS/cm. All 39 collected samples had EC values below this limit, with no value exceeding 987 µS/cm (Fig. 9a). Two samples lie in the range of 200–400 µS/cm, 8 in the range of 400–600 µS/cm, 18 in the range of 600–800 µS/cm, and 11 in the range of 800–1000 µS/cm (Fig. 10a; Table 3).
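The class counts reported above amount to a simple binning of the measured values. The following sketch shows one way to do this; the sample values are hypothetical, and the treatment of values that fall exactly on a class boundary (assigned to the higher class, except at the top edge) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right

# Class edges in µS/cm, taken from the ranges quoted in the text.
EC_EDGES = [200, 400, 600, 800, 1000]

# Counts per class as reported in the text (2, 8, 18, and 11 samples).
REPORTED_EC_COUNTS = [2, 8, 18, 11]


def bin_counts(values, edges):
    """Count values per class; classes are left-closed, the last one is closed."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        if value < edges[0] or value > edges[-1]:
            raise ValueError(f"value {value} lies outside the class range")
        index = min(bisect_right(edges, value) - 1, len(counts) - 1)
        counts[index] += 1
    return counts


# Hypothetical EC readings, for illustration only.
example_counts = bin_counts([250, 410, 599, 600, 780, 987], EC_EDGES)
```

As a consistency check, the reported class counts sum to the 39 samples collected.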

Figure 9

Spatial distribution maps: (a) Electrical conductivity (b) pH (c) Turbidity (d) TDS.

Figure 10

Spatial distribution graphs of various chemical tests: (a) EC (b) pH, (c) TDS (d) Calcium.

Hydrogen ion concentration (pH)

The pH measures the acidity or basicity of water and is influenced by dissolved chemicals and the carbon dioxide-bicarbonate-carbonate equilibrium system. Water pH plays a crucial role in determining its usability for various purposes. The WHO’s recommended pH range for most water sources is between 6.5 and 8.5, which is considered permissible. The pH expresses the hydrogen ion concentration of water, with a neutral pH indicating a balanced hydrogen ion concentration. In the current study, the observed pH varied from 6.2 (minimum) to 8.4 (maximum), with an average of 7.03 (Fig. 9b; Table 3). The values therefore lie largely within the acceptable limit of 6.5 to 8.5, although the minimum is slightly below the lower limit. Out of 39 samples, 20 lie within the pH range of 6–7, 12 within the range of 7–7.5, 4 within the range of 7.5–8, and 3 within the range of 8–8.5 (Fig. 10b).

Turbidity

Turbidity in water is caused by suspended materials such as clay, silt, organic particles, plankton, and microorganisms. It is a measure of the light-scattering and light-absorbing characteristics of water. Out of the 39 collected samples, only one bore well sample was found to be turbid and exceeded the permissible limit. The remaining spring and bore well samples were within the WHO’s guideline value for turbidity, which is 1 NTU (Fig. 9c; Table 3). Fahimah, et al.69 analyzed the statistical relationship between topography and turbidity in Bandung Regency, Indonesia, and found that turbidity was lower in areas of high topography than in areas of low topography.

Total dissolved solids (TDS)

TDS refers to the combined presence of inorganic salts and low concentrations of organic matter in water, including major ions such as carbonate, bicarbonate, chloride, sulphate, nitrate, sodium, potassium, calcium, and magnesium. TDS affects taste, hardness, corrosion properties, and encrustation tendencies. TDS levels beyond the limit can lead to gastrointestinal irritation. The WHO recommended guideline value for TDS is 1000 mg/l. All of the samples in our study fall within this recommended range, with a minimum value of 85 mg/l and a maximum value of 582 mg/l (Fig. 9d; Table 3). Out of 39 samples, 9 lie within the TDS range of 50–150 mg/l, 4 within the range of 150–300 mg/l, 19 within the range of 300–450 mg/l, and 7 within the range of 450–600 mg/l (Fig. 10c). By comparison, Rasheed, et al.70 found that 74 out of 82 GW samples were unsafe due to excessive TDS in District Jhelum, Upper Indus, Pakistan.

Chemical parameters

Among the 39 collected samples, 15 were chemically contaminated. Turbidity exceeded the permissible limit in 1 sample, calcium in 2 samples, magnesium in 7 samples, and sulphate in 10 samples, while the other water quality parameters were within the permissible range (Table 3).
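The screening behind these counts is a comparison of each measured value with its guideline value. A minimal sketch is given below, using the WHO guideline values quoted in this section; the parameter keys, the function name, and the sample values are hypothetical.

```python
# Upper guideline values as quoted in this section
# (EC in µS/cm, turbidity in NTU, all other parameters in mg/l).
WHO_UPPER_LIMITS = {
    "EC": 1500,
    "Turbidity": 1,
    "TDS": 1000,
    "Ca": 200,
    "Mg": 150,
    "TH": 500,
    "Cl": 250,
    "NO3": 50,
    "SO4": 250,
}
WHO_PH_RANGE = (6.5, 8.5)


def exceedances(sample):
    """Return the parameters of one sample that violate the guideline values."""
    flagged = [
        name
        for name, limit in WHO_UPPER_LIMITS.items()
        if name in sample and sample[name] > limit
    ]
    ph = sample.get("pH")
    if ph is not None and not (WHO_PH_RANGE[0] <= ph <= WHO_PH_RANGE[1]):
        flagged.append("pH")
    return flagged


# Hypothetical sample, for illustration only.
sample = {"EC": 987, "TDS": 582, "Ca": 210, "Mg": 120, "SO4": 300, "pH": 7.0}
flagged_parameters = exceedances(sample)
```

Running this check over all samples and counting how often each parameter is flagged yields per-parameter tallies of the kind summarized above.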

Calcium (Ca2+)

Calcium is one of the most abundant constituents of GW and contributes to water hardness when present along with magnesium. This mineral is crucial for various bodily processes, including blood clotting, nerve impulse transmission, and heart rhythm stabilization. The occurrence of calcium in GW is often attributed to deposits of limestone, dolomite, calcite, gypsum, and gypsiferous shale. The permissible calcium concentration is generally below 200 mg/l.

Among the 39 collected water samples, calcium contamination was observed in 2 samples, exceeding the WHO guideline value of 200 mg/l (Fig. 10d). Figure 11a indicates that the majority of samples fell within this range, except for two GW samples - one from a spring in Domail and another from a bore well in Sund Gali - where calcium levels were found to exceed the limit (Table 3).

Figure 11

Spatial distribution maps: (a) Calcium (b) Magnesium (c) Total hardness (d) Chloride.

Magnesium (Mg2+)

Magnesium is another commonly found element in water and is also a significant cause of water hardness. The WHO guideline value for magnesium in water is 150 mg/l (Table 3). Among the 39 samples, only 6 exhibited elevated magnesium levels, while the majority of GW in the study area ranged between 100 and 150 mg/l, as shown in Figs. 11b and 12a.

Figure 12

Spatial distribution graphs of various chemical tests: (a) Magnesium (b) Hardness, (c) Chloride (d) Nitrate.

Total hardness (TH)

Water hardness is caused by dissolved calcium, magnesium, or ferrous iron salts, typically present as chlorides, sulphates, or bicarbonates. The terms “hard water” and “soft water” are commonly used to describe water with varying hardness levels. An acceptable compromise between corrosion and incrustation issues is typically achieved at a hardness level of approximately 100 mg of CaCO3 per litre. However, WHO recommends a general guideline value of 500 mg/l for hardness (Table 3). In the study area, the water hardness values observed during the survey fell within the permissible range, as illustrated in Figs. 11c and 12b.

Chloride (Cl−)

Chloride concentrations in GW tend to be higher than in surface water and can indicate the presence of organic pollutants. High chloride levels in GW may result from factors such as lithological deposits, pollutant infiltration from sewerage systems, or seawater intrusion. Natural sources of chloride include NaCl, KCl, and CaCl2 salts. The WHO guideline value for chloride in water is 250 mg/l (Table 3). In this study, all 39 samples analyzed showed chloride values below the WHO limit, as depicted in Figs. 11d and 12c.

Nitrate (NO3−)

Nitrate contamination in water is often attributed to various sources, such as fertilizer usage, decomposition of plant and animal waste, residential effluent, sewage sludge, industrial discharges, agricultural leachates, and climatic effects. Elevated nitrate levels in water can lead to health issues, including the “blue baby syndrome.” The WHO guideline value for nitrate in water is 50 mg/l (Table 3). The study area’s analyzed samples indicated nitrate levels below the WHO limit, as shown in Figs. 12d and 13a.

Figure 13

Spatial distribution maps: (a) Nitrate (b) Sulphate.

Sulphate (SO₄²-)

Sulphate in groundwater can naturally originate from minerals like gypsum (CaSO4·2H2O), epsomite (MgSO4·7H2O), and barite (BaSO4). Anthropogenic sources, such as waste discharge from mines, smelters, pulp and paper mills, textile mills, and tanneries, can also contribute to sulphate levels. The WHO guideline value for sulphate in water is 250 mg/l (Table 3). Out of 39 samples, 10 water samples exceed the WHO permissible limit (Figs. 13b and 14).

Figure 14

Spatial distribution graphs of sulphate.

Groundwater quality index model

The groundwater index (GWI) model for the study area was developed by integrating the spatial distribution of the physical and chemical properties of the groundwater samples. The GWI model, as depicted in Fig. 15, reveals that the majority of the locations exhibit moderate to good water quality, while some locations have excellent and some have poor water quality. Bacterial contamination was detected in 30 samples and is primarily attributed to poor solid waste disposal practices, open dumping, and inadequate sewage infrastructure leading to GW contamination. Alsalme, et al.71 investigated the GW quality for chemical and microbial contamination in Bhimber (PAK) and found that almost all of the samples were grossly contaminated with E. coli. They also found that chlorite ion concentration is below the limits of WHO. Another study conducted by Khalid, et al.72 in Poonch (PAK) underscores significant concerns regarding water quality. While chemical parameters generally met WHO standards, elevated lead levels pose a specific concern. However, the presence of biological contaminants, as indicated by positive results in coliform, total microbial load, and fungal tests, suggests widespread biological contamination. The improper disposal of liquid waste through open sewage and poorly designed septic tanks has contributed to the contamination issue70. Additionally, the prevalence of open water sources, such as springs, increases the risk of contamination during the rainy season due to water runoff. To safeguard the GW quality and protect public health, addressing these contamination sources is imperative. Implementing proper solid waste management practices, improving sewage infrastructure, and promoting the use of closed water sources can significantly reduce the risk of bacterial contamination. 
Such measures will contribute to maintaining and improving the overall GW quality in the study area, ensuring a safe and sustainable water supply for the local population.

Figure 15

Groundwater quality index map of the Muzaffarabad municipality.

Conclusion

This study aimed to address the pressing water management issues faced by Muzaffarabad Municipality, Pakistan, through the integration of SVM and WOE techniques for GW potential mapping and water quality assessment. The combination of these methods allowed for a comprehensive understanding of GW availability and quality in the region. GW potential mapping using SVM and WOE provided valuable predictive models, allowing the identification of potential GW zones. The weights assigned to each influencing factor in the WOE analysis helped in understanding the relative importance of different factors in predicting GW potential. Water quality analysis covered various chemical parameters, some of which exceeded the WHO drinking water standards. The Water Quality Index was employed to summarize the water quality parameters, providing a comprehensive assessment of the suitability of water resources for human consumption. The study demonstrates that flat and gently sloping terrains, combined with low-elevation areas, exhibit higher GW potential due to enhanced water accumulation and absorption. The presence of concave slope geometries and proximity to drainage and high-density lineament zones further contribute to increased GW potential. However, some areas are classified as having very poor and poor GW potential, highlighting the need for targeted GW management strategies. Moreover, the assessment of GW quality indicates that bacterial contamination is a major concern, with a substantial share of the spring water and overall samples contaminated with E. coli. This underscores the importance of implementing effective solid waste management practices and improving sewage infrastructure to prevent further contamination and safeguard public health. The study’s findings can be used to guide sustainable GW management and conservation strategies in the Muzaffarabad municipality. 
Adequate measures to protect GW resources and improve water quality are vital to ensuring a safe and reliable water supply for the local population. Further research and continuous monitoring are recommended to track changes in GW potential and quality over time and assess the effectiveness of implemented management measures. By prioritizing efficient GW resource management, we can safeguard this precious resource for future generations and ensure the well-being and prosperity of communities worldwide.