Introduction

Hydrogen is touted as one of the foremost environmentally friendly fuels, emerging as a potent clean energy carrier1,2,3. With the advent and enhancement of renewable energy sources (RESs), the costs associated with electricity production are on a downward trajectory, ensuring medium-term benefits4,5,6. Notably, using surplus, low-cost renewable electricity for hydrogen conversion offers the dual advantage of storage for later use and significant economic efficiency augmentation of renewable energy systems7.

RESs, however, suffer from intermittency, highlighting an urgent need for robust hydrogen storage solutions8,9. Geological structures, encompassing rock salt deposits, depleted hydrocarbon deposits, and aquifers, emerge as viable candidates for large-scale hydrogen storage10,11. While preliminary assessments of such structures are available, they often overlook vital surface and underground factors, potentially limiting their suitability for storage12.

Recent years have witnessed a surge in the application of artificial intelligence algorithms, particularly machine learning (ML), as computational tools for simulating intricate phenomena across academic disciplines13,14,15. The strength of these tools, especially artificial neural networks (ANN), lies in their ability to learn from data without assumptions about its statistical distribution and in their proficiency in handling non-linear problems16,17,18. ML methods have also been increasingly used in research on underground hydrogen storage in geological structures. Numerous studies have confirmed that such algorithms can effectively predict critical parameters, such as wettability, that affect the storage capacity of porous rocks19,20,21. ML algorithms have also been used successfully to predict interfacial tension in brine-hydrogen systems22,23,24,25,26. Other studies27,28,29,30 focus on developing data-driven ML models for predicting hydrogen solubility in water and brines under various pressure and temperature conditions. Further research concerns the optimization of hydrogen storage parameters and the design of energy systems supported by underground energy storage31,32,33, as well as the characterization of hydrogen storage reservoirs through the prediction of thermodynamic parameters34,35. The convergence of machine learning and Geographic Information Systems (GIS) offers new insights into optimal locations for underground hydrogen storage (UHS).

Although UHS is a frequently discussed topic with numerous reviews available36,37,38,39,40,41,42, a meticulous assessment of storage potential and the associated technicalities remains paramount. Salt caverns in particular have emerged as the leading contenders for UHS, given their established use in industrial contexts such as the petrochemical industry43,44,45,46. Their storage efficacy is evidenced by operational examples such as Teesside in the UK and Clemens, Moss Bluff, and Spindletop in the USA47. Additionally, research worldwide emphasizes the immense hydrogen storage potential of rock salt deposits48,49,50,51,52,53,54,55,56. The sheer size and adaptable shape of these caverns, together with the tightness of rock salt and its inertness with respect to hydrogen, make them suitable for storing very large hydrogen volumes57,58,59,60,61,62,63,64.

The technological intricacies involved in UHS in salt caverns are manifold, including evaluating cavern dimensions, rock salt properties, and the associated impact on storage capacity58,59,60,61,62,63,64. Beyond these technological considerations, UHS site selection is critically determined by rock salt deposit characteristics, like thickness and depth. A holistic approach for optimal site determination incorporates environmental, technical, economic, and social criteria65,66. The innovative integration of GIS with ML streamlines site selection and impact assessment, with successful applications observed in diverse fields17,66,67,68,69,70,71,72,73.

Our research introduces an artificial intelligence framework that combines eight distinct machine-learning algorithms to generate suitability maps for hydrogen storage in rock salt deposits. By complementing machine learning with spatial data analysis, the study is a pioneering effort in this domain. The methodology offers enhanced accuracy in determining hydrogen storage potential and equips stakeholders with a practical tool that could substantially improve the decision-making process for hydrogen storage locations.

Materials and methods

This study developed a comprehensive methodology to identify optimal locations for Underground Hydrogen Storage (UHS) within rock salt formations, focusing on the Na1 rock salt deposit in the Fore-Sudetic Monocline, southwest Poland. The Na1 unit, a part of the Upper Permian rock salt bearing formation extending across the Polish Lowland, was chosen for its favourable characteristics for hydrogen storage53,74,75. A part of the Na1 rock salt deposit with a thickness of over 130 m, occurring up to 1,800 m below ground level, was selected for analysis (Fig. 1).

Figure 1

The Na1 rock salt deposit selected for analysis76. The map was generated using ArcGIS Pro 2.8 software. The base map was developed by Esri using HERE data, DeLorme base map layers, OpenStreetMap contributors, Esri base map data, and select data from the GIS user community. For more information about Esri® software, please visit http://www.esri.com.

Methodological framework

  • Overview of Integrated Approach

The methodology integrated Artificial Intelligence (AI) algorithms, Multi-Criteria Decision Analysis (MCDA), and Geographic Information System (GIS) spatial analysis. The Analytic Hierarchy Process (AHP) was employed to break down this complex issue into manageable components, establishing evaluation criteria, their weights, and a conclusive ranking of site alternatives.

The process entailed:

Defining Evaluation Criteria: Establishing the parameters for UHS site selection.

AI Algorithm Integration: Implementing eight machine-learning algorithms (KNN, SVM, LightGBM, XGBoost, MLP, CatBoost, GBR, and MLR) on a unified platform.

Data Segmentation: Dividing criteria-based data into a training set (70%) and a validation-testing set (30%).

Performance Assessment: Evaluating algorithmic performance using standard error metrics and the Coefficient of Determination (R2).

Optimal Algorithm Selection: Choosing the most effective algorithm based on performance metrics.

GIS Visualization: Mapping the spatial distribution of potential UHS sites.

Suitability Mapping: Creating a UHS suitability map from the selected algorithm's outputs.

Final Algorithm Formulation: Establishing a protocol for future research applications.

  • Exclusion and Evaluation Criteria

The study incorporated both exclusion and evaluation criteria. Exclusion criteria, guided by Polish environmental regulations, eliminated areas within protected zones, residential and industrial areas, transportation networks, bodies of water, and active mining sites. Evaluation criteria focused on the rock salt layer's storage capacity, land development, access to water resources, road infrastructure, proximity to gas pipelines, energy demand, and the level of geological exploration.

Data resources and preparation

  • Data Collection

The study employed twelve standardized raster maps, with each map corresponding to a specific evaluative criterion. These criteria encompass hydrogen storage capacity, hydrological features, transportation infrastructure, gas pipeline network, land use development above the deposit, energy consumption across administrative units, and locations of geological research boreholes. The storage capacity map was developed by Lankof and Tarkowski53, while the remaining maps were acquired from various spatial information portals and the National Transmission System. All maps were harmonized in terms of extent and pixel specifications, ensuring consistency in pixel size and dimensions across the dataset.

  • Map Transformation and Criteria Weighting

The hydrological features, transportation infrastructure, gas pipeline, and boreholes maps were transformed into proximity maps and then, together with other maps, normalized to a 1–10 scale, with higher values indicating greater suitability for UHS. The AHP method facilitated pairwise criteria comparison to establish weights, incorporating expert opinions from various fields.
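The pairwise comparison step can be illustrated with a small sketch. The paper does not state how the weights were extracted from the comparison matrix, so the row geometric-mean approximation used here is an assumption, as are the function names and the example judgments (three hypothetical criteria, not the study's actual matrix). The consistency ratio uses Saaty's commonly tabulated random index values.

```python
import math

# Saaty's random consistency index for matrices of size 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Estimate lambda_max from (A w)_i / w_i and return CI / RI."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments: criterion A is 3x as important as B and 5x as C.
example_matrix = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
example_weights = ahp_weights(example_matrix)
example_cr = consistency_ratio(example_matrix, example_weights)
```

A consistency ratio below 0.1 is conventionally taken to mean the expert judgments are acceptably consistent; larger values suggest the comparisons should be revisited.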

Machine learning algorithms overview

Artificial Intelligence (AI) is a branch of computer science dedicated to creating intelligent systems capable of learning and improving from experience. Machine learning (ML) is a critical domain within AI, exploring how systems can autonomously improve their performance. ML includes various techniques like representation learning and deep learning. These methods enable machines to automatically discover patterns in raw data and learn representations necessary for tasks such as detection or classification. The advancement of big data and AI technologies, especially in GPU computing power, has significantly impacted geological sciences. AI applications in geology include geological surveys, mineral recognition, and geochemical anomaly detection. This study focuses on using AI to evaluate potential sites for Underground Hydrogen Storage (UHS) in geological formations. Acknowledging the indispensable role of preprocessing in enhancing model reliability, we refined our dataset through systematic cleaning, normalization, and feature engineering processes. This ensured that our ML models were trained on data that accurately represented the underlying geological phenomena, laying a solid foundation for trustworthy estimations.

  • K-Nearest Neighbours (KNN) KNN is an intuitive and straightforward machine-learning algorithm for regression and classification77. It is a lazy learning algorithm: it builds no explicit model from the training dataset. Instead, for each point to be forecast in the testing dataset, it calculates the distance to the points in the training dataset and identifies the k closest neighbours. The prediction for the forecasting point is then derived from the values of those neighbours.

  • Support Vector Machine (SVM) SVM is a robust algorithm for regression and classification tasks78,79. In its regression form (SVR), it includes a parameter, ε, which sets the width of the tube around the fitted function within which errors are not penalized80,81,82. In classification, the main goal of the SVM is to find the best possible dividing boundary, or 'hyperplane', which creates the widest possible gap between distinct categories of data points. It can capture both simple and complex patterns by employing specialized functions known as kernels. Widely used across various fields, particularly in the Earth sciences, this algorithm is known for its accuracy and reliability83.

  • Light Gradient Boosting Machine (LightGBM) LightGBM is a gradient-boosting framework that concentrates on speed and efficiency. It introduces a sampling technique known as Gradient-based One-Side Sampling (GOSS) to reduce the number of data instances used during training. LightGBM also uses a histogram-based method to bucket continuous features into discrete bins. Memory efficiency and training speed are thereby enhanced, and distributed and parallel computing is supported for large-scale datasets. LightGBM has been applied successfully in different domains, such as recommender systems, online advertising, and fraud detection.

  • Extreme Gradient Boosting (XGBoost) XGBoost is a gradient-boosting framework with strong performance and speed. It builds a powerful predictive model by combining weak learners, typically decision trees. The model is constructed stage-wise, with each successive tree trying to correct the errors made by the previous trees. During training, XGBoost uses gradient-based optimization to minimize a specified loss function. Its accuracy, scalability, and interpretability have been critical to its extensive adoption across different domains, such as anomaly detection, click prediction, and web analytics.

  • Multilayer Perceptron (MLP) Neural networks have recently attracted a great deal of attention84. ANNs are inspired by biological neural networks and build non-linear models relating dependent and independent variables, mimicking the learning of the biological neuron system85. An MLP is a kind of feed-forward neural network that includes multiple layers of interconnected artificial neurons. Each neuron applies a non-linear activation function to the weighted sum of its inputs. Given enough hidden units and suitable activation functions, an MLP can approximate any arbitrary function86.

  • Categorical Boosting (CatBoost) CatBoost addresses several common implementation problems of gradient boosting by applying an ordering principle. Dorogush et al.87 developed CatBoost as an enhanced GBDT toolkit similar to XGBoost. CatBoost solves the problems of gradient bias and prediction shift. Its advantages include an algorithm that automatically converts categorical features into numerical ones. Moreover, it uses combinations of categorical features, taking advantage of the connections between features and enriching the feature dimensions. It also adopts a symmetric tree model to decrease overfitting and enhance the generalizability and accuracy of the algorithm.

  • Gradient Boosting Regressor (GBR) GBR is an ensemble model with good stability and high performance. It was proposed by Friedman to extend the boosting algorithm to regression problems. The algorithm uses the negative gradients of the loss function to search for its minimum. Gradient boosting optimizes arbitrary differentiable loss functions by constructing an additive model in a forward stage-wise procedure. At each stage, a regression tree is fitted to the negative gradient of the given loss function.

  • Multiple Linear Regression (MLR) MLR models the linear relationship between multiple independent variables and a dependent variable, optimizing the fit through the minimization of a loss function.
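The stage-wise idea shared by the boosting methods above can be made concrete with a minimal sketch. This is generic gradient boosting for squared-error loss on a single feature, using one-split regression trees (stumps) as weak learners; it is not the CatBoost, XGBoost, or LightGBM implementation, and the function names, learning rate, and toy data are illustrative assumptions.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression tree on a single feature by least squares.

    Requires at least two distinct values in xs.
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_stages=50, learning_rate=0.1):
    """Build an additive model; each stage fits the current residuals."""
    base = sum(ys) / len(ys)  # initial constant prediction
    stumps = []
    preds = [base] * len(ys)
    for _ in range(n_stages):
        # For the loss 0.5 * (y - p)^2 the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


def training_mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data: a smooth non-linear target on a single feature.
toy_x = [i / 10.0 for i in range(20)]
toy_y = [x * x for x in toy_x]
```

With squared-error loss, each added stump can only lower the training error, which is why the training error after many stages is smaller than after one; held-out data are still needed to judge generalization.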

AI approach in UHS site evaluation

Our approach used the Fuzzy Analytic Hierarchy Process (FAHP) model to generate a target database for training the ML algorithms. The input parameters of the ML algorithms were Conservation Area, Geological exploration, Water reservoir, Accessibility, Ecological Site, Energy Consumption, Land Use, Natural Gas Pipelines, Natural forest, Protected Area, Special Protection Area, and Storage Capacity. The output parameter was the AHP output derived from Lankof and Tarkowski88.

A representative sample from the study area, comprising 1000 evenly spaced points, was selected for this purpose. The data was divided into a training set (70%) and a validation-testing set (30%).

The methodology employed in the present study is captured in Fig. 2, which outlines a multi-stage process integrating both Fuzzy Analytic Hierarchy Process (FAHP) and various Machine Learning (ML) algorithms to create a suitability map in rock salt deposits.

  • Fuzzy Analytic Hierarchy Process (FAHP)

Figure 2

The workflow of the methodology.

Our approach begins with the FAHP, which combines fuzzy logic with the traditional Analytic Hierarchy Process (AHP) to handle the inherent uncertainties in evaluating multiple criteria. This process starts by assigning scores to various natural and anthropogenic factors such as natural reserves, water reservoirs, and land use. These factors are categorized under broader evaluation criteria like storage capacity, accessibility, and energy consumption.

The FAHP enhances the AHP by incorporating fuzzy membership functions, which allow for the expression of vagueness and imprecision in human judgment89. We then normalize the assigned weights using FAHP, ensuring consistency across all criteria.

Subsequently, AHP's pairwise comparison matrix calculates the relative weights of these criteria, addressing both the importance and the interdependency among them. The integration of these weighted criteria through Geographic Information System (GIS) analysis, along with fuzzy functions, creates an overlay of layers that form the basis for further analysis by ML algorithms.
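The overlay of weighted layers can be sketched as a weighted linear combination with an exclusion mask. The paper does not give the exact aggregation formula or the fuzzy functions used, so this is only one plausible reading; the function name, layer names, and the tiny 2×2 rasters are hypothetical.

```python
def weighted_overlay(layers, weights, excluded=None):
    """Combine normalized criterion rasters into one suitability raster.

    layers   : dict mapping criterion name -> 2D list of scores (e.g. 1-10)
    weights  : dict mapping criterion name -> weight (e.g. from AHP)
    excluded : optional 2D list of booleans; True marks cells removed by
               exclusion criteria, which receive None in the result
    """
    names = list(layers)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    total_weight = sum(weights[name] for name in names)
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            if excluded is not None and excluded[r][c]:
                row.append(None)  # cell ruled out before scoring
            else:
                score = sum(weights[name] * layers[name][r][c] for name in names)
                row.append(score / total_weight)
        result.append(row)
    return result


# Hypothetical two-criterion example on a 2x2 grid.
demo_layers = {
    "storage_capacity": [[10.0, 1.0], [5.0, 5.0]],
    "pipeline_proximity": [[1.0, 10.0], [5.0, 5.0]],
}
demo_weights = {"storage_capacity": 0.75, "pipeline_proximity": 0.25}
demo_excluded = [[False, False], [False, True]]
demo_map = weighted_overlay(demo_layers, demo_weights, demo_excluded)
```

Dividing by the total weight keeps the result on the same scale as the input layers even if the supplied weights do not sum exactly to one.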

To refine the suitability map, we employed various ML algorithms: CatBoost, Gradient Boosting Tree (GBT), k-Nearest Neighbors (KNN), Light Gradient Boosting Machine (LGBM), Multilayer Perceptron (MLP), Linear Regression (LR), Support Vector Regression (SVR), and XGBoost. These algorithms were trained using 70% of the collected data, with the remaining 30% split equally between validation and testing.
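The 70/15/15 partition described here can be sketched as follows. The shuffling procedure, the seed, and the function name are assumptions for illustration; the paper does not state how the points were assigned to each subset.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle sample indices and split them into train/validation/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to testing
    return train, val, test


# For the 1000 sample points mentioned in the text: 700 / 150 / 150.
train_idx, val_idx, test_idx = split_indices(1000)
```

Because the sample points are spatially distributed, a purely random split may place neighbouring points in different subsets; a spatially blocked split is a common alternative when spatial autocorrelation is a concern.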

  • Performance Metrics

The selected ML algorithms were trained and tested on this dataset, and their performance was evaluated using Mean Squared Error (MSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), and the Coefficient of Determination (R2). These metrics (Eqs. 1–5)17,90,91 were crucial in assessing the algorithms' effectiveness and ensuring the model's accuracy before applying it to the entire dataset. The ultimate goal was to identify optimal locations for UHS.

$$\text{MSE}=\frac{1}{\text{N}}\sum_{\text{i}=1}^{\text{N}} {\left({\text{y}}_{\text{i}}-{\widehat{\text{y}}}_{\text{i}}\right)}^{2}$$
(1)
$$\text{MAE}=\frac{1}{\text{N}}\sum_{\text{i}=1}^{\text{N}} \left|{\text{y}}_{\text{i}}-{\widehat{\text{y}}}_{\text{i}}\right|$$
(2)
$$\text{MAPE}=\frac{1}{\text{N}}\sum_{\text{i}=1}^{\text{N}} \left|\frac{{\text{y}}_{\text{i}}-{\widehat{\text{y}}}_{\text{i}}}{{\text{y}}_{\text{i}}}\right|$$
(3)
$${\text{RMSE}} = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{{\text{N}}} \left( {{\text{y}}_{{\text{i}}} - {\hat{\text{y}}}_{{\text{i}}} } \right)^{2} }$$
(4)
$${\text{R}}^{2}=1-\frac{\sum_{\text{i}} {\left({\text{y}}_{\text{i}}-{\widehat{\text{y}}}_{\text{i}}\right)}^{2}}{\sum_{\text{i}} {\left({\text{y}}_{\text{i}}-\overline{\text{y} }\right)}^{2}}$$
(5)

where yi is the ith observed value, \({\widehat{\text{y}}}_{\text{i}}\) is the corresponding predicted value, \(\overline{\text{y}}\) is the mean of the observed values, and N is the number of observations.
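For reference, Eqs. 1–5 translate directly into code. The sketch below uses plain Python; the function name and the returned dictionary keys are illustrative choices. MAPE is returned as a fraction, as in Eq. 3, and is undefined when any observed value is zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE, MAPE, RMSE and R2 as defined in Eqs. 1-5."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Eq. 3 divides by the observed value, so y_true must not contain zeros.
    mape = sum(abs(e / y) for e, y in zip(errors, y_true)) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


# Small worked example with four observations.
example_metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 8.0])
```

In this example two of the four predictions are off by one unit, giving MSE = 0.5, MAE = 0.5, and R2 = 1 − 2/20 = 0.9.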

Data preprocessing and analysis

The data must be preprocessed before being fed to any ML model. Preprocessing includes multiple data transformation steps such as data resampling, standardization or normalization, noise elimination, and outlier removal77. These steps help to enhance the forecasting accuracy of data-driven algorithms.

After data cleaning, the first step is normalization, for which Eq. 6 is used.

$${X}_{N}= \frac{\left({X}_{R}-{X}_{\text{minimum }}\right)}{\left({X}_{\text{maximum }}-{X}_{\text{minimum}}\right)}$$
(6)

Here, XN represents the normalized value, XR is the raw value to be normalized, and Xminimum and Xmaximum are the minimum and maximum values, respectively, of the variable in question92.
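Eq. 6 can be written as a short helper. The second function, which stretches the normalized values onto the 1–10 suitability scale mentioned for the criterion maps, is an assumed linear mapping added for illustration; both function names are hypothetical.

```python
def min_max_normalize(values):
    """Apply Eq. 6: map values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; Eq. 6 is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def rescale_to_range(values, new_min=1.0, new_max=10.0):
    """Assumed linear stretch of the normalized values onto [new_min, new_max]."""
    return [new_min + x * (new_max - new_min) for x in min_max_normalize(values)]


normalized = min_max_normalize([2.0, 4.0, 6.0])   # [0.0, 0.5, 1.0]
suitability = rescale_to_range([2.0, 4.0, 6.0])   # [1.0, 5.5, 10.0]
```

For proximity layers where a smaller distance is more favourable, the scale would need to be inverted before use; that choice is not specified here.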

In evaluating machine-learning algorithms for predicting the suitability of sites for underground hydrogen storage in Poland, our findings are depicted in Figs. 3, 4, 5, 6, 7, 8, 9 and 10. Each figure provides a comprehensive overview of the algorithm's performance, showcasing the strong predictive accuracy and reliability of the models used. The CatBoost Regressor, as shown in Fig. 3, demonstrated exceptional performance with a high correlation between observed and predicted values (R2 = 0.888). This strong correlation was supported by a consistent R2 value for both training and test data and a learning curve indicating the model's ability to generalize without overfitting.

Figure 3

CatBoost performance.

Figure 4

GBT performance.

Figure 5

KNN performance.

Figure 6

LGBM performance.

Figure 7

MLP performance.

Figure 8

LR performance.

Figure 9

SVR performance.

Figure 10

XGBoost performance.

Figure 4 illustrates the Gradient Boosting Regressor's robust performance with a solid R2 of 0.867. The model's residuals and learning curve further suggest stable performance and good generalization capabilities, even as the number of training instances increases.

The K-Nearest Neighbours (KNN) algorithm, analysed in Fig. 5, also showed significant prediction accuracy (R2 = 0.861). The consistency of its training score and the improvement in the cross-validation score with additional data indicate its effectiveness in learning from the increasing dataset.

The Light Gradient Boosting Machine (LGBM) Regressor, discussed in Fig. 6, displayed a strong correlation between observed and predicted values (R2 = 0.883) and a learning curve demonstrating the model's growing accuracy with more data points.

Figure 7 highlighted the performance of the Multilayer Perceptron (MLP) Regressor, revealing a reliable predictive accuracy (R2 = 0.862) and a stable performance across varying sizes of training data, suggesting the model's proficiency in learning effectively.

Linear Regression (LR) model efficacy, represented in Fig. 8, confirmed a strong linear relationship and predictive capability (R2 = 0.842). The model's learning curve indicates consistent and generalizable performance throughout training. Support Vector Regression (SVR), shown in Fig. 9, achieved a strong predictive accuracy (R2 = 0.877) with a learning curve reflecting a positive performance trajectory as more data is introduced, underscoring the model's generalization strength. Lastly, the XGBoost Regressor, detailed in Fig. 10, exhibited high predictive accuracy (R2 = 0.877) and a learning curve suggestive of robust learning ability, with an upward trend in cross-validation scores as the number of training instances expanded.

These figures substantiate the high accuracy and generalization capabilities of the ML algorithms employed, with the CatBoost model being particularly noteworthy for its superior performance. This empirical observation from the CatBoost model's analysis has yielded actionable insights into site suitability for UHS. Such contributions are significant to the research field, laying a strong groundwork for enhancing the methodologies used in future site selection and policy planning.

Table 1 summarizes the performance of various machine learning algorithms evaluated based on four key metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE). The algorithms assessed include CatBoost, LightGBM (Lgbm), XGBoost, Gradient Boosting Regressor (Gbr), K-Nearest Neighbors (Knn), Linear Regression (Lr), Support Vector Regression (Svr), and Multilayer Perceptron (MLP). The results indicate that CatBoost outperforms other algorithms across all metrics, suggesting its superior predictive capability within the evaluated dataset.

Table 1 Comparative performance metrics of machine learning algorithms for predictive modelling.

In assessing the suitability of the Na1 rock salt deposit in the Fore-Sudetic Monocline for underground hydrogen storage, our research meticulously compared machine-learning algorithms, concluding that CatBoost outperforms its counterparts. It achieved the most favourable error metrics, with an MAE of 0.1994, MSE of 0.0816, RMSE of 0.2833, and a notably low MAPE of 0.0163, underscoring its precision in predictive modelling. Other evaluated algorithms, namely SVR, MLP, KNN, and XGBoost, yielded higher errors, with MAE values ranging from 0.2065 to 0.2461, MSE values from 0.0905 to 0.1159, RMSE values from 0.2982 to 0.3372, and MAPE values from 0.0169 to 0.0202. Given these results, CatBoost is identified as the most reliable algorithm for forecasting the suitability of salt caverns for hydrogen storage in the geological context of Poland.

The SHAP values illustrated in Fig. 11 also show a clear trend. SHAP (SHapley Additive exPlanations) is a model-agnostic tool for feature importance analysis. Grounded in game theory, SHAP values estimate each feature's contribution to the model's prediction. Because the approach is model-agnostic, underlying patterns can be examined with ML/AI models without assuming that the model represents the data perfectly, which mitigates interpretative bias and offers a more robust understanding of feature impact. The SHAP summary plot in Fig. 11 shows the influence of each feature on the model's output; points are colour-coded by feature value (red indicating high and blue indicating low values). Features like Storage Capacity and Energy Consumption have a significant impact, marking them as key determinants for assessing the viability of salt caverns for hydrogen storage. In contrast, distances to Conservation Areas and Special Protection Areas display variable impacts, indicating different levels of influence on the model's predictions across the dataset.
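The game-theoretic idea behind SHAP can be shown with an exact Shapley computation on a toy model. This sketch is not the SHAP library and not the procedure used to produce Fig. 11: it enumerates every feature coalition, which is only feasible for a handful of features, and it replaces "absent" features with a single baseline vector rather than averaging over a background dataset. The function name and toy model are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline)."""
    n = len(x)

    def coalition_value(present):
        # Features in the coalition take their real value, others the baseline.
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|!(n-|S|-1)!/n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = coalition_value(s | {i}) - coalition_value(s)
                contribution += weight * gain
        phi.append(contribution)
    return phi


def toy_model(v):
    # Two additive terms plus one interaction between features 0 and 2.
    return 2.0 * v[0] + 3.0 * v[1] + v[0] * v[2]


toy_phi = shapley_values(toy_model, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

The attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is shared equally between the two features involved; the same additivity property underlies SHAP summary plots.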

Figure 11

Feature importance of CatBoost model.

Results and discussion

The seminal work by Lankof and Tarkowski76 provides a robust foundation for site selection methodology using multi-criteria decision analysis and GIS. Their approach represents a significant step in identifying suitable locations for hydrogen storage within bedded salt deposits. Our present study builds upon this foundation and introduces an innovative artificial intelligence (AI) framework that enhances the site-selection process.

While Lankof and Tarkowski76 focus on the application of a multi-criteria decision analysis in a GIS setting, the present study expands this by incorporating a suite of eight AI algorithms. This inclusion goes beyond the traditional GIS analyses by enabling a data-driven, machine-learning approach that offers increased accuracy and computational efficiency. Notably, our work emphasizes the superior performance of the CatBoost algorithm in evaluating the suitability of salt caverns for hydrogen storage, which complements and quantitatively surpasses the earlier methodologies.

Furthermore, our research provides a comprehensive comparison between traditional methods, such as the Analytic Hierarchy Process (AHP), and advanced machine learning techniques, showcasing the latter's enhanced capabilities in creating detailed suitability maps. This methodological advancement is crucial for stakeholders involved in the strategic development of underground hydrogen storage facilities. By employing AI algorithms, the present study presents a cutting-edge methodology that can inform decision-making for governmental bodies, geological services, and the renewable energy industry.

Moreover, our results contribute to the ongoing scientific discourse on underground hydrogen storage by offering empirical evidence of the effectiveness of AI in the site selection process. The adaptability of our AI framework underscores its potential application on an international scale, supporting the strategic infrastructure development for renewable energy storage. So, our research not only aligns with the objectives of Lankof and Tarkowski's work76 but also extends it by leveraging the latest advancements in AI, thereby providing a novel and empirically validated approach to the selection of underground hydrogen storage sites.

This research advances the application of an artificial intelligence (AI) approach to strategically selecting prime locations for Underground Hydrogen Storage (UHS) within bedded rock salt formations. Historically, multi-criteria decision analysis has been harnessed in site-selection studies, particularly for evaluating distinct salt structures for hydrogen storage. However, the dedicated application of AI algorithms for identifying optimal UHS sites in bedded rock salt deposits is a novel exploration presented within this paper.

Machine Learning (ML) methods, which learn from data and make inferences without being explicitly programmed, are employed herein. Concurrently, the Analytic Hierarchy Process (AHP) is used to assign relative importance to the various criteria, a technique that is especially useful where factors are difficult to quantify precisely, as in the morphometric analysis of watersheds, where rainfall or soil characteristics may be imprecisely known.

Despite their respective advantages, both AHP and ML methodologies are subject to practical limitations. AHP's reliance on expert judgment for rule definition can result in models that are difficult to interpret and validate. Conversely, ML's efficacy depends on the quality and volume of the data, as well as on the choice of algorithm and parameters, which, if not judiciously made, can lead to overfitting or underfitting, thereby diminishing the model's predictive capability on new datasets.

The present study divided the data into two subsets: a training set constituting 70% of the total data, and a validation-testing set forming the remaining 30%. The performance of the ML algorithms was evaluated on both the training and testing datasets, as illustrated in Figs. 3, 4, 5, 6, 7, 8, 9 and 10. These figures juxtapose the target values derived from AHP against the predictions made by the algorithms, from which the model errors were calculated. The corresponding error metrics for the KNN, SVM, LightGBM, XGBoost, MLP, CatBoost, GBR, and MLR methods are presented in Table 1. The CatBoost model, in particular, exhibited enhanced performance compared to its counterparts.

Twelve input data layers were processed through the AI algorithm to identify suitable UHS locations within Poland: Conservation Area, Geological exploration, Water reservoir, Accessibility, Ecological Site, Energy Consumption, Land Use, Natural Gas Pipelines, Natural forest, Protected Area, Special Protection Area, and Storage Capacity. After selecting the optimal method from the suite of evaluated ML algorithms, the chosen model was applied to the entire study area (Fig. 12). The resulting performance was assessed against the outcomes derived from the AHP technique, with the ML model demonstrating greater accuracy and computational efficiency than the AHP model, thereby supporting the potential of AI in streamlining UHS site selection.

Figure 12

Map displaying the outcomes obtained through the implementation of the optimal machine-learning algorithm, delineating potential sites for hydrogen storage within the region, generated using ArcGIS Pro 2.8 software. The base map was developed by Esri using HERE data, DeLorme base map layers, OpenStreetMap contributors, Esri base map data, and select data from the GIS user community. For more information about Esri® software, please visit http://www.esri.com.

Our research demonstrates that the selected methodology markedly affects the generated suitability maps and is an efficient instrument for swiftly pinpointing optimal locations for Underground Hydrogen Storage (UHS). The comparison of the spatial output derived from AI algorithms with the findings of Lankof and Tarkowski76 supports the accuracy of the algorithms used88.

The present study's results, as depicted in Fig. 12, illustrate the sites deemed suitable for Underground Hydrogen Storage (UHS) within Poland, which were identified through the application of advanced machine learning (ML) algorithms. A comparison with Lankof and Tarkowski's research76 shows some discrepancies in the potential sites across different regions in Poland.

As described above, the dataset was divided into a training set, constituting 70% of the total data, and a validation-testing set for the remaining 30%. The ML algorithms were trained on the former and then tested and validated on the latter to check the generalizability of the predictions. The performance metrics, detailed in Table 1, reflect the algorithms' accuracy and predictive quality. Specifically, the CatBoost model exhibited superior performance, reflecting its ability to capture the complex interrelations among the criteria that define suitability for UHS.

The variations in suitable sites between the studies can be attributed to the diverse analytical mechanisms intrinsic to different ML algorithms compared to the GIS-based MCDA employed by Lankof and Tarkowski76. The ML approach takes into account a broader range of factors and their interactions, allowing for the identification of patterns that may not be apparent through traditional methods.

To interpret the results obtained from our AI algorithms, we carried out a feature importance analysis. This analysis, using techniques such as SHAP (SHapley Additive exPlanations), clarifies the contribution of each criterion to the predictive models. This step is crucial for understanding how specific factors such as conservation areas, geological exploration, and energy consumption influence the algorithms' output, thereby making the ML process more transparent. Together, this analysis and comparison demonstrate the efficacy and accuracy of ML algorithms in identifying suitable UHS locations, and show that our selected methodology can supplement and potentially improve upon traditional approaches as an efficient means of rapidly identifying prime locations for UHS.

The data were rendered into a raster map, giving a final visualization of the potential of various locations for UHS. This suitability map delineates areas within the rock salt strata that hold promise for hydrogen storage, allowing for straightforward identification of prospective sites. The most favourable sites, characterized by high storage capacity and favourable ratings across all assessed criteria, are predominantly located in the central-western segment of the study area. Furthermore, the map differentiates areas of high suitability based on a composite of criteria. In the western regions of the monocline, the most advantageous areas are those with substantial storage volumes and extensive geological investigation. Conversely, in the eastern sectors of the surveyed region, high suitability correlates with factors such as elevated energy demand, the extent of geological exploration, and proximity to existing gas pipeline infrastructure.
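The Shapley-based attribution mentioned in the feature importance analysis above can be illustrated with a small, exact computation. The sketch below enumerates all coalitions of features and replaces absent features with a baseline value. This is a simplification chosen for clarity; SHAP implementations generally rely on faster model-specific or sampling-based estimators. The function name, the three-feature linear model, and its weights are invented for the example and are not taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attributions of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value. The cost
    grows as 2**n model calls, so this is only practical for small n.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical linear suitability model over three criteria (invented weights).
weights = [0.5, 0.3, 0.2]


def linear_model(z):
    return sum(w * v for w, v in zip(weights, z))


cell = [1.0, 0.5, 0.0]        # criterion values for one raster cell
reference = [0.0, 0.0, 0.0]   # baseline cell
phi = shapley_values(linear_model, cell, reference)
```

For a linear model each attribution reduces to the weight multiplied by the difference between the cell value and the baseline value, and the attributions always sum to the difference between the prediction for the cell and the prediction for the baseline. For non-linear models such as gradient-boosted trees, the attributions additionally apportion the interaction effects among the criteria involved.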

Conclusion

This study systematically applied eight artificial intelligence algorithms (KNN, SVM, LightGBM, XGBoost, MLP, CatBoost, GBR, and MLR) to identify viable underground hydrogen storage (UHS) locations within Poland. The research established an AI-informed framework by leveraging a multifaceted dataset comprising storage capacity, proximity to water sources, transportation networks, pipelines, boreholes, energy consumption, and land use. Our comparative analysis identified the CatBoost algorithm as the most accurate tool for delineating favourable UHS sites within the rock salt strata, providing a numerical assessment of their potential. The machine learning approach was benchmarked against the Analytic Hierarchy Process (AHP), with CatBoost demonstrating enhanced accuracy and computational efficiency. These results offer actionable information and new strategic options for stakeholders, including policy planners, geological services, renewable energy producers, and entities within the chemical and petrochemical sectors, who are invested in the strategic development of UHS facilities. The implications of our work extend to governmental and European Union institutions, which are key players in the infrastructure development for renewable energy storage. The outcomes of this research may also contribute to the ongoing discourse within the scientific community regarding hydrogen storage solutions, offering empirical data to inform policy decisions. The adaptability of the proposed AI methodology suggests potential for broader international application in selecting sites for underground energy storage, subject to region-specific modifications and criteria. Future research directions include conducting comparative analyses of these contemporary AI methodologies against traditional site selection practices. Such studies would be instrumental in identifying new, sustainable UHS sites, further streamlining the site selection process, enhancing operational efficiency, and saving time and resources in future UHS ventures.