Introduction

Rockburst (RB) is a dynamic failure (seismic event) caused by the sudden and violent release of elastic energy accumulated within coal or rock formations. It can have severe consequences, including the failure of underground working spaces, potential casualties, deformation of support systems, damage to machinery, and delays in construction activities1,2,3,4. Given these destructive effects, predicting this phenomenon deserves attention in underground excavation projects. There are two types of RB prediction: long-term and short-term5,6,7,8. Long-term prediction is typically conducted during the early stages of excavation and project design, and it serves as a guide for the subsequent excavation phases. Such predictions typically rely on intrinsic rock mechanics parameters (including stiffness, strength, energy storage capacity, and brittleness) to assess the occurrence of RB at a specific site. Short-term prediction, on the other hand, is primarily used during the life of the project to quickly detect the occurrence of RB events. This enhances the coordination of industrial activities and reduces the risk of severe accidents. Generally, short-term RB prediction involves assessing the risk of RB occurrences in the near future based on in situ techniques. Among these techniques, microseismic (MS) monitoring is one of the most widely used because it offers real-time monitoring, a wide detection range, large data volumes, and no interference with production8,9. In this technique, sensors laid out spatially at different azimuths capture the MS waves released during rock fracture. By analyzing these waves, precursory features of RB events can be identified and used to predict the risk of this phenomenon10,11,12.

The mechanism of RB occurrence is complex and influenced by a combination of factors. Because of this complexity, RB prediction without the aid of computer models is challenging13. Recently, machine learning (ML) methods have been employed to predict RB owing to their capacity to tackle complex and nonlinear problems7. These methods mainly focus on long-term prediction using unsupervised learning, supervised learning, and comparative decision strategies14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43, whereas only a few studies have addressed short-term prediction using ML methods6,44,45,46,47,48. According to the literature, ML methods have proven effective for real-time prediction of RB intensity based on MS monitoring data. Despite their reliable and precise outputs, most algorithms are not readily applicable in practice owing to their non-interpretable “black box” nature. In addition, developing ML models requires tuning hyperparameters, which can drastically affect model performance49. Hyperparameter tuning is computationally challenging because the search space is large, and a variety of methods exist for finding the optimal hyperparameters. In most previously developed models, hyperparameters were adjusted manually, which requires considerable time and trial and error50. Recently, optimization algorithms have been used to address this problem, as they offer a practical balance between computation time and model performance.

To address the mentioned limitations of traditional ML models, this study proposes two novel hybrid models for short-term prediction of RB intensity. Two models are developed in the Python environment using the random forest (RF) method, with whale optimization algorithm (WOA) and coati optimization algorithm (COA) applied to adjust the hyperparameters. The RF technique was chosen due to its outstanding capability in addressing complex and nonlinear problems such as RB prediction. Additionally, the Shapley additive explanations (SHAP) method is applied to explain the prediction process and assess the influence of input features.

Database description

The proposed models were developed using a database collected from MS monitoring events at the Jinping-II hydropower project in China51. The project is located at the Jinping bend on the Yalong River, and its main characteristics are presented in Table 1. During construction, RB events occurred frequently and caused significant safety problems. According to site observations and surveys, all three types of RB (strain burst, pillar burst, and slip-fault burst) occurred, with strain bursts being the most common52. In strain bursts, the location of the seismic event coincides with the location of the damage, which makes it possible to predict RB using MS monitoring data6. The MS monitoring system was built on the Integrated Seismic System (ISS) and included a server, smart sensors, a geophysical seismometer (GS), an intelligent uninterruptible power system (UPS), optional communication elements (I-Splitter, moxa, fiber, DSL, and TP-Link), a junction box, cables, etc. A 54-channel MS monitoring system was used, and the GS had a wide sampling rate (3–48,000 Hz)10,52. The MS sensors had a natural frequency of 14 Hz and a usable frequency range of approximately 7 to 2000 Hz. Two groups of MS sensors were set up just behind the working face and were moved with it every 30–40 m (i.e. manually removed, relocated, and set up again). A sectional velocity model was used to locate MS events. Details of the MS data acquisition system have been discussed in Feng et al.10.

Table 1 Characteristics of Jinping-II hydropower project45,53,54.

The database includes 93 RB case histories. The input parameters for developing the hybrid intelligent models are cumulative seismic energy (CE), cumulative microseismic events (ME), cumulative apparent volume (CV), event rate (ER), apparent volume rate (AVR), and seismic energy rate (LER), and the output is the RB severity of each case51. According to the literature, these six input parameters are the MS features most commonly used to predict RB severity in real time. ME measures microfracture density, while CV and CE denote the extent of damage and the fracture strength of the rock mass, respectively. These three parameters are fundamental indicators representing the attributes of microfractures during the development of RB events10. To integrate temporal aspects into the process, three time-related parameters (ER, AVR, and LER) are considered. ER denotes the rate of MS activity occurrence, the process of rock mass failure, and the average response pattern over time. LER indicates the energy emitted by the rock mass through MS radiation during a specific time period, while AVR refers to the volume of rock within the deformation region experiencing inelastic behavior over the same time frame6. The values of LER, AVR, CV, and CE are expressed on a logarithmic scale, primarily to reduce the skewness towards large values in the RB database. Figure 1 shows the histograms, cumulative distributions, and basic statistical description of the input parameters. The RB severity (output parameter) is classified into four classes (none, light, moderate, and severe) based on the classification by Chen et al.52; the distribution of these classes is shown in Fig. 2.
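The effect of the logarithmic scaling described above can be illustrated with a minimal sketch. The energy values below are invented for illustration and are not taken from the database, and the use of base 10 is an assumption, since the base of the logarithm is not stated here.

```python
import math
import statistics


def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


# Hypothetical cumulative seismic energies in joules (NOT from the RB database).
raw_energy = [1.2e2, 3.5e2, 8.0e2, 2.1e3, 9.4e3, 5.6e4, 7.3e5]

# Assumed base-10 logarithm; it compresses the long right tail of the raw values.
log_energy = [math.log10(e) for e in raw_energy]

print("skewness before:", round(skewness(raw_energy), 3))
print("skewness after: ", round(skewness(log_energy), 3))
```

On data spanning several orders of magnitude, the transformed values are far less dominated by the few largest events, which is the motivation given for the logarithmic scale.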

Fig. 1 The histogram, cumulative distribution, and basic statistical description of input parameters.

Fig. 2 Distribution of the RB severity classes.

Development of models

RF is an ensemble learning method based on decision trees. It enhances the generalizability and accuracy of the model by constructing multiple decision trees and either averaging their predictions or selecting the final prediction by majority vote55. The RF algorithm is an extension of the bagging ensemble method. It combines bagging with feature randomness to generate a set of weakly correlated decision trees, which increases model diversity56. This approach mitigates the risk of overfitting, enhances model robustness, facilitates feature importance assessment, and enables parallelized training, thereby significantly reducing training time. In the development of real-time models, the selection of hyperparameters affects the performance and accuracy of the models. One of the algorithms used in this study to tune the hyperparameters is the WOA, a meta-heuristic optimization algorithm introduced by Mirjalili et al.57 in 2016. The algorithm searches for an optimum by mimicking the hunting behavior of humpback whales, replicating their bubble-net feeding technique through a spiral pattern58. The bubble-net feeding technique comprises three sequential steps: encircling the prey, creating a spiral bubble-net to trap the prey, and then locating the next target59. Since the method and mathematical aspects of WOA have been extensively documented in the literature59,60,61,62, they are not explained in this study. In this section, the RF algorithm was implemented in the Python environment and serves as the framework for the model, with the WOA used to determine the best hyperparameters. The models were developed using the train-test (hold-out validation) method. The database was randomly partitioned into training and testing segments in an 8:2 ratio. The model is first trained on 80% of the data and subsequently validated using the remaining 20%. It is important to note that the test database is independent of the training database and plays no role in the model’s development. Figure 3 shows the algorithm of the RF hybrid models. Table 2 displays the optimal arrangement of RF-WOA hyperparameters.
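The three WOA moves (encircling, spiral bubble-net, and search for new prey) can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions, not the implementation used in this study: `toy_error` is a hypothetical stand-in for the validation error of an RF model, and the two tuned quantities (`n_trees`, `max_depth`) and their bounds are invented for the example rather than taken from Table 2.

```python
import math
import random


def whale_optimize(objective, bounds, n_whales=20, n_iter=100, seed=0):
    """Minimize `objective` inside the box `bounds` with a basic WOA loop."""
    rng = random.Random(seed)

    def clip(pos):
        return [min(max(x, lo), hi) for x, (lo, hi) in zip(pos, bounds)]

    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_score = objective(best)

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        for i, whale in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - x) for r, x in zip(ref, whale)]
            else:
                # Spiral (bubble-net) move towards the best whale, shape constant 1.
                l = rng.uniform(-1.0, 1.0)
                new = [abs(b - x) * math.exp(l) * math.cos(2.0 * math.pi * l) + b
                       for b, x in zip(best, whale)]
            whales[i] = clip(new)
            score = objective(whales[i])
            if score < best_score:
                best, best_score = whales[i][:], score
    return best, best_score


def toy_error(position):
    """Hypothetical error surface standing in for an RF validation error."""
    n_trees, max_depth = position
    return ((n_trees - 120.0) / 100.0) ** 2 + ((max_depth - 8.0) / 10.0) ** 2


# Hypothetical search ranges for the two illustrative hyperparameters.
BOUNDS = [(10.0, 500.0), (2.0, 30.0)]
best, best_score = whale_optimize(toy_error, BOUNDS)
print(best, best_score)
```

In a real workflow, `toy_error` would be replaced by a function that trains an RF on the training data with the candidate hyperparameters and returns a validation error; integer-valued hyperparameters would also need rounding before use.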

Fig. 3 The RF-WOA/COA hybrid architecture.

Table 2 The optimal arrangement of RF-WOA hyperparameters.

The second optimization algorithm used in this study to tune the hyperparameters of the RF algorithm is the COA. The COA is a metaheuristic algorithm introduced in 2023 by Dehghani et al.63 and inspired by the natural behavior of coatis. It imitates two specific behaviors of coatis: hunting iguanas and evading predators. The implementation of this algorithm includes three steps: 1) initialization, 2) exploration phase, and 3) exploitation phase, the details of which have been described by various researchers63,64. As with the RF-WOA model, the RF-COA model is first developed in the Python environment using the training database and then evaluated on the test database. Table 3 displays the optimal arrangement of RF-COA hyperparameters.

Table 3 The optimal arrangement of RF-COA hyperparameters.

Performance evaluation and comparison of models

The RF-WOA and RF-COA hybrid models are evaluated using the testing dataset. Their performance is assessed in this section using the accuracy, precision, recall, and F1-score measures65. These measures are computed from the confusion matrix of each model, as defined in Fig. 4. Figures 5 and 6 show the confusion matrices of the RF-WOA and RF-COA models, respectively (for both the training and testing datasets). The resulting evaluation metrics are listed in Table 4. Based on these results, the WOA optimizer yields better RF performance in evaluating the occurrence of RB than the COA optimizer.
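As a minimal sketch of how these four measures follow from a multi-class confusion matrix, the function below computes accuracy together with precision, recall, and F1-score averaged over the classes. The confusion matrix is invented for illustration, and the support-weighted averaging is an assumption, since the averaging scheme used across the four severity classes is not stated here.

```python
def classification_metrics(cm):
    """Accuracy plus support-weighted precision, recall, and F1-score.

    cm[i][j] counts the samples of true class i that were predicted as class j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n_classes)) / total

    precision = recall = f1 = 0.0
    for k in range(n_classes):
        tp = cm[k][k]
        support = sum(cm[k])                                  # true members of class k
        predicted = sum(cm[i][k] for i in range(n_classes))   # predicted as class k
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2.0 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                              # assumed weighting scheme
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical confusion matrix for the classes none, light, moderate, severe.
# These counts are NOT the results reported in Figs. 5 and 6.
cm = [
    [6, 0, 0, 0],
    [1, 4, 0, 0],
    [0, 1, 5, 0],
    [0, 0, 0, 3],
]
metrics = classification_metrics(cm)
print(metrics)
```

With support weighting, the averaged recall is identical to the accuracy by construction, whereas precision and F1-score generally differ from it.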

Fig. 4 Performance evaluation measures for classification problems4,6.

Fig. 5 Confusion matrix of RF-WOA model: (a) Training, (b) Testing.

Fig. 6 Confusion matrix of RF-COA model: (a) Training, (b) Testing.

Table 4 The performance indices for the developed models.

To thoroughly assess the established models, they were compared with recently developed models in the literature for predicting short-term RB potential (Table 5). This comparison showed that the proposed RF-WOA model predicts RB severity with high accuracy. The model addresses the limitations of previous ML models by applying the WOA optimization algorithm to tune the RF hyperparameters and the SHAP method to assess the influence of input features on RB severity (described in the next section). As a result, it not only exhibits high prediction performance but also promotes transparency in the prediction process. Project managers can therefore use this model to predict this phenomenon and implement the necessary control measures to increase safety and productivity.

Table 5 The comparison between the models generated in this study and the models developed in previous studies.

Finally, to demonstrate its applicability and practicability, the RF-WOA model (the best of the developed models) was used to predict 19 new RB cases that were not included in the original 93 cases. These new validation cases were collected through MS monitoring at the Jinping-II hydropower project in China, the Ashele Copper Mine in China, the Neelum–Jhelum Hydroelectric Tunnel in Pakistan and the Qinling Water Conveyance Tunnel in China72,73. Table 6 presents the MS monitoring parameters together with the real and predicted RB severity for each case. As shown in Table 6, the predictions of the RF-WOA model coincide with the real RB severity except for cases 3 and 4, which yields an accuracy of 86.667%. This indicates the good generalization and effectiveness of the proposed RF-WOA model.

Table 6 Validation results of RF-WOA model for 19 new RB cases.

Model interpretability and parameters importance

Despite its high performance, the developed model cannot by itself interpret the effects of local samples on the model output or quantify the correlations between different features, which leaves the prediction process insufficiently explained. Furthermore, due to its complexity, the model lacks interpretability74. Therefore, an integrated model interpretation framework called SHAP is introduced to enhance the understanding of the RB risk prediction mechanism. The framework combines Shapley values, derived from cooperative game theory, with local model explanation methods. SHAP is a recently developed technique that uses Shapley values to explain the predictions generated by machine learning algorithms74. Using the SHAP method, it is possible to determine the specific contribution of each input parameter to the model output, as well as whether its effect on the result is positive or negative. Although SHAP is generally used to provide local explanations, an overall view can also be obtained by analyzing the Shapley values across all samples, as shown in Fig. 7. The horizontal axis of Fig. 7 represents the SHAP values, whereas the vertical axis lists the input features. The color bar represents the value of each feature. The features along the vertical axis are presented in descending order of importance, with the feature having the highest mean absolute SHAP value positioned at the top.
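The idea behind Shapley-value attribution can be illustrated with a minimal, exact computation on a toy problem. The sketch below is not the SHAP library and not the procedure applied to the RF-WOA model; `toy_model`, the background rows, and the sample are all hypothetical. It enumerates every coalition of features, which is only feasible for a handful of inputs.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values obtained by enumerating all feature coalitions."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_model(x):
    """Hypothetical predictor with one additive term and one interaction term."""
    return 0.5 * x[0] + 0.3 * x[1] * x[2]


# Hypothetical background data and the sample to be explained.
background = [[0.0, 0.0, 0.0], [1.0, 2.0, 1.0], [2.0, 1.0, 3.0]]
sample = [3.0, 2.0, 2.0]


def coalition_value(subset):
    """Mean model output when features in `subset` are fixed to the sample's values."""
    outputs = [toy_model([sample[j] if j in subset else row[j] for j in range(3)])
               for row in background]
    return sum(outputs) / len(outputs)


phi = shapley_values(coalition_value, 3)
base_value = coalition_value(frozenset())  # average output over the background
prediction = toy_model(sample)             # output for the explained sample
print(phi, base_value, prediction)
```

The contributions sum to the difference between the prediction for the sample and the base value, which is the additive property that force and waterfall plots visualize. Ranking features by the mean absolute contribution over many samples gives the kind of global ordering shown in a SHAP summary plot.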

Fig. 7 The SHAP plot of RF-WOA model.

Based on the importance of the input parameters, ME, CE, and CV have been identified as the most influential parameters in the occurrence of the RB phenomenon. These parameters are fundamental indicators representing the attributes of microfractures during the development of RB events, as stated in Feng et al.10. Using the SHAP values of these three parameters (Fig. 7), their dependency plots can be presented to interpret the behavior of this phenomenon. Figure 8 presents the SHAP force plot and waterfall plot. This figure shows how the output of the RF-WOA model is shifted step by step from the base value to the final predicted value by the individual parameters. In Fig. 8b, E[f(x)] represents the model’s baseline (the average predicted output without considering specific parameters), while f(x) indicates the final predicted value for this sample75,76,77. The CE parameter has a contribution of +0.09, making it the largest positive contributor to the prediction. As shown in Figs. 8a and 8b, the predicted value matches the actual value for this sample, which is consistent with the high accuracy of the developed model. By quantifying the contribution of features using SHAP values and providing local explanations for individual samples, the feature-specific effects on a given prediction can be investigated in detail, which significantly enhances the model’s interpretability. Figure 9 shows the three-dimensional dependency plot of these three parameters. Based on this figure, the behavior of RB (relative to the most influential parameters) follows a parabolic pattern. When the values of these parameters are low, the risk of RB is also low: as the microfracture density, extent of damage, and fracture strength of the rock mass decrease, the burst power of the rock mass decreases as well. As the values of these parameters increase, the risk of RB increases. However, when these values surpass a certain limit, the strength of the rock mass increases to a point where the risk of RB decreases. By identifying this limit, appropriate control measures can be implemented to reduce the potential risk of this phenomenon. This may entail reinforcing the rock mass, implementing ground support systems, or adjusting mining techniques to minimize the likelihood of RB53,78. Understanding the relationship between these parameters and the risk of RB is crucial for ensuring the safety of workers in underground mining operations.

Fig. 8 The SHAP force and waterfall plot.

Fig. 9 3D dependency plot obtained from SHAP analysis for the most effective RB parameters.

Summary and conclusions

Microseismic activity, mining disturbances, and geological factors influence the occurrence and consequences of RB. Hence, it is crucial to take these factors into account when issuing early risk warnings. Furthermore, precise and easily understandable prediction models are needed to improve the dependability and practicality of ML in predicting RB threats. In this study, real-time models for short-term RB prediction were developed using the RF algorithm, with two meta-heuristic algorithms (WOA and COA) used to tune the hyperparameters. The models were developed on a database of 93 RB case histories described by the MS parameters that affect the occurrence of this phenomenon, and the SHAP method was employed to interpret the resulting model. The performance results showed that the RF-WOA model outperformed the RF-COA model (Accuracy = 0.944, Precision = 0.950, Recall = 0.944, and F1-score = 0.943). Additionally, a comparison with previously developed models demonstrated that the RF-WOA model evaluates this phenomenon with high accuracy and low uncertainty. The analysis of input parameter importance showed that ME, CE, and CV have the greatest impact on the occurrence of RB. Finally, the analysis of the SHAP values of these three parameters indicated that the behavior of this phenomenon follows a parabolic-like pattern. Specifically, at low values of these three parameters, the risk of RB is low. As their values increase, the risk of occurrence also increases, and after surpassing a certain limit, the risk begins to decrease. The obtained results can be a valuable resource for predicting the occurrence of short-term RB in operational conditions. Project managers can implement the necessary control and management measures based on these results to reduce the risk of this phenomenon. However, this study has limitations, such as the limited number of cases in the database and the lack of examination of geological and mining parameters. The authors are currently studying these parameters; including them is expected to make the models more comprehensive.