Introduction

Coal is an important basic energy source and raw material, and it plays an important role in economic development1. China is currently the largest coal producer in the world2, and it is also one of the countries most seriously affected by water inrush in coal mining3. Water inrush is the flooding of a large amount of groundwater into the roadway when mining exposes water-bearing media4. It usually occurs violently and floods the underground working area in a short time5, which seriously threatens mining production and causes casualties6. Therefore, monitoring and early warning of water inrush are of great significance to the safe production of coal mines.

When the groundwater comes from the coal seam floor, the event is called water inrush of coal seam floor. Under the action of mining stress, the damage depth of the floor develops continuously and the thickness of the intact water-resisting layer decreases continuously. At the same time, the mining stress can also activate faults in the floor, and confined water then fills the fractures, making the water level rise continuously. Once the mine failure zone is connected with the fracture, confined water flows into the roadway from the aquifer along the water inrush channel, and water inrush of coal seam floor occurs. Therefore, for early warning of water inrush of coal seam floor, electrical resistivity tomography (ERT), which is sensitive to low-resistivity anomalous bodies such as water, has unique advantages7. ERT is an exploration method that solves geological problems by observing and studying the underground distribution and characteristics of an artificial electric field, based on the differences in electrical conductivity between underground media8, and it is one of the most widely used geophysical methods9,10.

However, ERT can only realize static positioning of water-bearing fractures. To realize dynamic tracking of the upward extension process of water-bearing fractures, time-lapse ERT (TL-ERT) should be used. TL-ERT is a method derived from ERT, whose essence is to analyze the change of resistivity (or apparent resistivity) of an area with time through repeated measurements11. At present, it has been widely used in hydrogeological monitoring12, geothermal monitoring13, volcanology studies14, contaminant monitoring15, saline intrusion monitoring16, infrastructure stability17, landslide monitoring18, permafrost monitoring19, vegetation studies20, and gas dynamics monitoring21. When time-lapse measurements are carried out, the measured data are often evaluated by independent inversion, ratio inversion, difference inversion, and cross model constraint inversion22,23,24,25. However, although the above methods can determine the change in resistivity images with time, they cannot directly determine which points in the measured profile are the risk points of water inrush.

Therefore, we propose a classification method to judge whether the points in the measured profile are risk points of water inrush. Among the commonly used classification algorithms, the random forest classification algorithm and the kNN classification algorithm can be used to classify measured points26,27. However, the limitation of such classification algorithms is that they cannot determine the risk magnitude of a measured point. The naive Bayes classification algorithm is based on the Bayes theorem and the assumption of conditional independence among features28, and its classification results can be expressed in the form of probability29. It has the advantages of high efficiency30, easy implementation31, and wide application fields32. Therefore, it is suitable for determining the risk magnitude of measured points.

In summary, the goal of this study is to use the naive Bayes classification algorithm to obtain the probability contour map of the measured coal seam floor, so that a warning is issued for any measured point whose probability value exceeds the threshold value of water inrush risk. This method solves the problem that the conventional methods cannot determine which points in the measured profile are the risk points of water inrush, and displaying the water inrush risk in the form of probability makes the contour map of the coal seam floor much easier to read. In addition, since the changes among several groups of measured data are expressed in only one contour map, the comparison process in conventional methods is omitted, which improves the efficiency of early warning.

Materials and methods

Before mining starts, the coal seam floor is in a relatively stable state, and the water-resisting zone between the coal seam and the aquifer can prevent the confined water from rising. Under normal circumstances, there are natural fractures in the water-resisting zone, and confined water fills these natural fractures and rises to a certain height, which is called the natural intrusion height. The corresponding area is called the natural intrusion zone, which generally does not destroy the stability of the coal seam floor. After mining begins, under the combined action of mining stress and pressure of confined water, fractures in the natural intrusion zone extend further upward, and confined water fills the extended fractures and rises to a new height, which is called the progressive intrusion height. The corresponding area is called the progressive intrusion zone. At the same time, the mining stress increases the maximum depth of coal seam floor failure33, and the corresponding area is called the mine failure zone. Once the mine failure zone is connected with the fracture, confined water flows into the roadway from the aquifer along the water-conducting passage, and water inrush of coal seam floor occurs34, as shown in Fig. 1.

Fig. 1

The mechanism of water inrush of coal seam floor. Under the combined action of mining stress and pressure of confined water, fractures in the natural intrusion zone extend further upward, which forms the progressive intrusion zone. Once the mine failure zone is connected with the progressive intrusion zone, the water-conducting passage is formed, and confined water flows into roadway from aquifer along the water-conducting passage, and water inrush of coal seam floor occurs.

As can be seen from Fig. 1, the key to realizing water inrush early warning is to monitor the upward extension process of the water-bearing fracture in the coal seam floor. Researchers usually use independent inversion, ratio inversion, difference inversion, and cross model constraint inversion to process long-term monitoring data. However, although these methods can determine the change in resistivity images with time, they cannot determine which points in the measured profile are the risk points of water inrush. Therefore, we propose to use the naive Bayes classification algorithm to generate probability contour maps to solve this problem.

The probability of event A occurring is assumed as \(P\left( A \right)\), and the probability of event B occurring is assumed as \(P\left( B \right)\). On the premise that event A occurs, the probability of event B occurring is

$$P\left( {B\left| A \right.} \right)=\frac{{P\left( {AB} \right)}}{{P\left( A \right)}}$$
(1)

where \(P\left( {AB} \right)\) is the probability that events A and B occur simultaneously.

Similarly, on the premise that event B occurs, the probability of event A occurring is

$$P\left( {A\left| B \right.} \right)=\frac{{P\left( {AB} \right)}}{{P\left( B \right)}}$$
(2)

The Bayes formula can be derived from Eqs. (1) and (2) as follows:

$$P\left( {A\left| B \right.} \right)=\frac{{P\left( {B\left| A \right.} \right)P\left( A \right)}}{{P\left( B \right)}}$$
(3)

It is assumed that the decision variables in the training set are set as \({x_1},{x_2}, \cdots ,{x_n}\), and the target variable is set as Y, then it can be obtained from Eq. (3) as follows:

$$P\left( {Y\left| {{x_1}{x_2} \cdots {x_n}} \right.} \right)=\frac{{P\left( {{x_1}{x_2} \cdots {x_n}\left| Y \right.} \right)P\left( Y \right)}}{{P\left( {{x_1}{x_2} \cdots {x_n}} \right)}}$$
(4)

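According to the conditional independence assumption of the naive Bayes classifier, that is, the decision variables are assumed to be mutually independent given the target variable, we can get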

$$P\left( {{x_1}{x_2} \cdots {x_n}\left| Y \right.} \right)=\prod\limits_{{i=1}}^{n} {P\left( {{x_i}\left| Y \right.} \right)}$$
(5)

By substituting Eq. (5) into Eq. (4), we can get

$$P\left( {Y\left| {{x_1}{x_2} \cdots {x_n}} \right.} \right)=\frac{{P\left( Y \right)\prod\limits_{{i=1}}^{n} {P\left( {{x_i}\left| Y \right.} \right)} }}{{P\left( {{x_1}{x_2} \cdots {x_n}} \right)}}$$
(6)

As can be seen from Eq. (6), after the training set is determined, the probability of the target variable occurring at a measured point can be obtained through the set of numerical values of the decision variables. When the target variable is water inrush occurring, the probability calculated by Eq. (6) can be used to determine the water inrush risk of the measured point.
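Eq. (6) can be illustrated with a minimal Python sketch for binary decision variables. The function names, the toy data, and the Laplace smoothing constant `alpha` are assumptions made for illustration and are not taken from this study. The denominator of Eq. (6) is evaluated here by summing the numerator over both values of Y, which is the usual way to normalize a naive Bayes posterior.

```python
from collections import Counter


def fit_naive_bayes(X, y, alpha=1.0):
    """Estimate P(Y) and P(x_i = 1 | Y) from binary training data.

    alpha is a Laplace smoothing constant (an assumption, not from the text);
    it keeps the estimated probabilities away from exactly 0 or 1.
    """
    n_features = len(X[0])
    class_counts = Counter(y)
    priors = {c: class_counts[c] / len(y) for c in class_counts}
    cond = {}
    for c in class_counts:
        rows = [x for x, label in zip(X, y) if label == c]
        cond[c] = [
            (sum(r[i] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for i in range(n_features)
        ]
    return priors, cond


def posterior(x, priors, cond):
    """Eq. (6): P(Y | x_1 ... x_n), normalized over the classes."""
    joint = {}
    for c, prior in priors.items():
        p = prior
        for xi, p1 in zip(x, cond[c]):
            p *= p1 if xi == 1 else 1.0 - p1  # P(x_i | Y = c)
        joint[c] = p
    total = sum(joint.values())  # plays the role of P(x_1 ... x_n)
    return {c: joint[c] / total for c in joint}


# Toy data (hypothetical): two binary decision variables, four training rows.
X_train = [[1, 1], [1, 0], [0, 0], [0, 1]]
y_train = [1, 1, 0, 0]
priors, cond = fit_naive_bayes(X_train, y_train)
post = posterior([1, 1], priors, cond)
print(round(post[1], 3))
```

The value `post[1]` is the quantity that would be contoured as the water inrush probability of a measured point.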

In this study, there are five decision variables in the training set, which are derived from the data of four consecutive measurements. Their physical meanings, judgment conditions, and assignments are shown in Table 1, where \({\rho _i}\) is the apparent resistivity value of each measured point of the ith measurement.

Table 1 Physical meanings, judgment conditions, and assignments of decision variables in the training set.

If actual measured data are used as the training set, the value sets consisting of all 0s or all 1s appear in large numbers, while the probability of other value sets appearing is basically 0. As a result, the probability values of measured points in the generated probability contour map are mostly 0 or 1. To improve the generalization ability of the proposed method, the training set used is a pseudo-random matrix generated by MATLAB. The purpose is to traverse the various value sets of \(\left\{ {{x_1},{x_2},{x_3},{x_4},{x_5}} \right\}\) and avoid the possibility that some value sets do not appear when actual measured data are used as the training set. To better conform to the statistical law and avoid the occurrence probability of a certain value set being too large or too small, the training set is a \(1000 \times 5\) pseudo-random matrix of 0s and 1s. When at least three of the decision variables are assigned 1, Y is set to 1; otherwise, Y is set to 0. Next, the proposed method is verified by experiments.
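The construction of the training set described above can be sketched as follows. This is an illustrative Python version, not the MATLAB code used in this study; the seed, the function names, and the Laplace smoothing constant `alpha` are assumptions. The sketch builds a \(1000 \times 5\) pseudo-random 0/1 matrix, labels each row with the "at least three 1s" rule, and evaluates the naive Bayes probability of Y = 1 for all 32 possible value sets.

```python
import itertools
import random


def make_training_set(n_rows=1000, n_vars=5, seed=0):
    """Pseudo-random 0/1 matrix; Y = 1 when at least three variables are 1."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    X = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(n_rows)]
    y = [1 if sum(row) >= 3 else 0 for row in X]
    return X, y


def train(X, y, alpha=1.0):
    """Return {class: (prior, [P(x_i = 1 | class)])} with Laplace smoothing."""
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        p_one = [
            (sum(r[i] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for i in range(len(X[0]))
        ]
        model[c] = (prior, p_one)
    return model


def risk_probability(x, model):
    """Naive Bayes posterior P(Y = 1 | x), normalized over both classes."""
    joint = {}
    for c, (prior, p_one) in model.items():
        p = prior
        for xi, p1 in zip(x, p_one):
            p *= p1 if xi == 1 else 1.0 - p1
        joint[c] = p
    return joint[1] / (joint[0] + joint[1])


X, y = make_training_set()
model = train(X, y)
table = {x: risk_probability(x, model) for x in itertools.product((0, 1), repeat=5)}

# Range of probabilities grouped by how many decision variables equal 1.
for k in range(6):
    probs = [p for x, p in table.items() if sum(x) == k]
    print(k, round(min(probs), 3), round(max(probs), 3))
```

Under these assumptions, the probability is graded between the all-0 and all-1 value sets rather than collapsing to 0 or 1, which is the behavior the pseudo-random training set is intended to produce.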

Results

Physical simulations

The physical simulations are conducted in a plastic water tank with a water level of 0.2 m, and the array type is set as the Wenner α array. Nineteen brass electrodes are fixed vertically downward on a square wooden beam with an electrode spacing of 0.1 m. The water entry depth of the electrode tips is the same (approximately 0.01 m), so the electrodes can be regarded as point contacts35. Iron blocks with a length of 0.2 m and a height of 0.05 m are gradually put into the water. Because iron blocks are low-resistivity bodies relative to water, they can be used to simulate the water-bearing fracture in the coal seam floor. Four different upward extension processes of water-bearing fractures are simulated, and the steps are as follows:

Physical simulation (1): Step 1, no iron block is put into the water; Step 2, the first iron block is placed directly below electrodes 9#, 10#, and 11#; Step 3, the second iron block is placed above the first one and shifted 0.05 m to the left; Step 4, the third iron block is placed above the second one and shifted 0.05 m to the left, which is directly below electrodes 8#, 9#, and 10#. The experiment simulates the continuous upward extension of the water-bearing fracture, and the experimental layout is shown in Fig. 2.

Fig. 2

Iron blocks are gradually put into the water to simulate the process of water-bearing fracture extending upward to the left in coal seam floor.

Physical simulation (2): Step 1, no iron block is put into the water; Step 2, the first iron block is placed directly below electrodes 9#, 10#, and 11#; Step 3, the second iron block is placed directly below electrodes 13#, 14#, and 15#; Step 4, the third iron block is placed above the second one and shifted 0.05 m to the right. The experiment simulates the discontinuous upward extension of the water-bearing fracture, and the iron blocks are placed in sequence of left-right-right. The experimental layout is shown in Fig. 3.

Fig. 3

Iron blocks are gradually put into the water to simulate the process of water-bearing fracture extending upward discontinuously. The iron blocks are placed in sequence of left-right-right.

Physical simulation (3): Step 1, no iron block is put into the water; Step 2, the first iron block is placed directly below electrodes 9#, 10#, and 11#; Step 3, the second iron block is placed above the first one and shifted 0.05 m to the left; Step 4, the third iron block is placed directly below electrodes 13#, 14#, and 15#. The experiment simulates the discontinuous upward extension of the water-bearing fracture, and the iron blocks are placed in sequence of left-left-right. The experimental layout is shown in Fig. 4.

Fig. 4

Iron blocks are gradually put into the water to simulate the process of water-bearing fracture extending upward discontinuously. The iron blocks are placed in sequence of left-left-right.

Physical simulation (4): Step 1, no iron block is put into the water; Step 2, the first iron block is placed directly below electrodes 9#, 10#, and 11#; Step 3, the second iron block is placed directly below electrodes 13#, 14#, and 15#; Step 4, the third iron block is placed above the first one and shifted 0.05 m to the left. The experiment simulates the discontinuous upward extension of the water-bearing fracture, and the iron blocks are placed in sequence of left-right-left. The experimental layout is shown in Fig. 5.

Fig. 5

Iron blocks are gradually put into the water to simulate the process of water-bearing fracture extending upward discontinuously. The iron blocks are placed in sequence of left-right-left.

The measured data of the physical simulations are processed with the naive Bayes classification algorithm, and the obtained probability contour maps are shown in Fig. 6.

Fig. 6

Probability contour maps obtained from the measured data of physical simulations processed by the naive Bayes classification algorithm. (a) Physical simulation (1); (b) Physical simulation (2); (c) Physical simulation (3); (d) Physical simulation (4).

Actual monitoring of coal seam floor

The actual monitored coal mining face is Ji 17-33200 coal mining face of No.10 Mine of Pingdingshan Tian’an Coal Industry Co., LTD., which is located in Weidong District, Pingdingshan City, Henan Province, China. The water level of Cambrian limestone water in No. 26 observation hole near Ji 17-33200 coal mining face is -675.9 m, and the pressure of Cambrian limestone water under coal mining face is 2.86 MPa. According to the actual exposure of the coal mining face and the analysis of the exposure data of the limestone water borehole in the coal seam floor roadway, the thickness from the bottom of coal seam to the roof of Cambrian limestone is approximately 86 m.

Water inrush coefficient T is commonly used to characterize the risk degree of water inrush, and its expression is as follows36:

$$T=\frac{P}{M}$$
(7)

where P is the hydrostatic pressure of aquifer, and M is the thickness of water-resisting zone.

The corresponding values are substituted into Eq. (7), and the water inrush coefficient is obtained as \(T=0.033\,\mathrm{MPa/m}\), which is less than the threshold value \(\left[ T \right]=0.06\,\mathrm{MPa/m}\). Therefore, the normal section of the water-resisting zone can withstand a water pressure greater than the actual water pressure, but the water pressure resistance of the weak section is reduced, and water inrush may occur there. Therefore, it is necessary to monitor the coal seam floor.
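The substitution into Eq. (7) can be checked with a short Python sketch. The function and variable names are illustrative assumptions; the numerical inputs (2.86 MPa, 86 m, and the threshold 0.06 MPa/m) are the values quoted in the text.

```python
def water_inrush_coefficient(pressure_mpa, thickness_m):
    """Eq. (7): T = P / M, in MPa/m."""
    if thickness_m <= 0:
        raise ValueError("thickness of the water-resisting zone must be positive")
    return pressure_mpa / thickness_m


T_LIMIT = 0.06  # threshold value [T] quoted in the text, MPa/m

T = water_inrush_coefficient(pressure_mpa=2.86, thickness_m=86.0)
print(f"T = {T:.3f} MPa/m, below threshold: {T < T_LIMIT}")
```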

A total of 50 brass electrodes with the length of 0.25 m and the diameter of 0.01 m are laid in the roadway at one side of the coal mining face, in which 30 electrodes are located in the effective monitoring zone (electrodes 21# to 50#). They are laid at the junction of the coal seam and its floor along the strike, with an electrode spacing of 10 m. To ensure good coupling between the electrodes and the coal seam floor, the electrodes are completely inserted into the coal seam floor by hammering, and the gaps are filled with yellow mud.

After the installation of the instrument, daily measurements are taken with a measurement interval of approximately 24 h as the coalface advances; that is, measurements are taken at approximately the same time each day. The probability contour map obtained from the data measured by the Wenner α array on September 7th, September 8th, September 9th, and September 10th, 2022 processed by the naive Bayes classification algorithm is shown in Fig. 7 (a), and the probability contour map obtained from the data measured by the Wenner α array on September 8th, September 9th, September 10th, and September 11th, 2022 processed by the naive Bayes classification algorithm is shown in Fig. 7 (b).
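The two maps in Fig. 7 use overlapping groups of four consecutive daily measurements, which can be read as a sliding window over the measurement series. A minimal Python sketch of this grouping is given below; the function name and the five-day list of dates are illustrative assumptions based on the dates quoted above.

```python
from datetime import date, timedelta


def sliding_windows(items, size=4):
    """Return every run of `size` consecutive items, advancing one item at a time."""
    return [items[i:i + size] for i in range(len(items) - size + 1)]


# Daily measurements from September 7th to September 11th, 2022.
days = [date(2022, 9, 7) + timedelta(days=k) for k in range(5)]
windows = sliding_windows(days, size=4)

for w in windows:
    print(w[0].isoformat(), "to", w[-1].isoformat())
```

Each window would supply the four consecutive measurements from which the five decision variables of one probability contour map are derived.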

Fig. 7

Probability contour maps of the coal seam floor of Ji 17-33200 coal mining face. (a) The probability contour map obtained from the data measured by the Wenner α array on September 7th, September 8th, September 9th, and September 10th, 2022; (b) The probability contour map obtained from the data measured by the Wenner α array on September 8th, September 9th, September 10th, and September 11th, 2022.

Discussions

Figures 2, 3, 4, and 5 simulate four upward extension processes of water-bearing fractures. Figure 2 simulates the continuous upward extension of the water-bearing fracture, so the water inrush risk is the highest. Figure 3 simulates the case in which the water-bearing fracture that originally appears on the left side of the coal seam floor transfers to the right side and extends upward there, so the water inrush risk is high. Figure 4 simulates the case in which the water-bearing fracture that originally extends upward on the left side of the coal seam floor transfers to the right side; since the upward extension trend is interrupted, the water inrush risk is low. Figure 5 simulates the case in which the water-bearing fractures appear at random locations with no trend of continuous upward extension, so the water inrush risk is the lowest. Under the combined action of mining stress and pressure of confined water, the actual upward extension process of a water-bearing fracture is more complicated, but it can be regarded as the superposition of these four basic processes, so these four extension processes are representative.

For the convenience of observation, the upper limit of the color scale of all probability contour maps in Fig. 6 is uniformly set to 0.5; that is, 0.5 is the threshold value of water inrush risk. Fig. 6 (a) contains the most measured points with probability values greater than 0.5, indicating the highest water inrush risk. There are also high-risk measured points in Fig. 6 (b), but fewer than in Fig. 6 (a), indicating a lower water inrush risk than that of Fig. 6 (a). There is no high-risk measured point in either Fig. 6 (c) or Fig. 6 (d), but the maximum probability value in Fig. 6 (d) is smaller than that in Fig. 6 (c), indicating that Fig. 6 (d) has the lowest water inrush risk. All these results are consistent with the above analysis, which shows that the probability contour maps generated by the proposed method can effectively display high-risk areas.

The threshold value of water inrush risk \(\left[ R \right]\) can be optimized according to the hydrogeological conditions of the monitored coal seam floor. If the water inrush coefficient of the coal seam floor is large, the threshold value of water inrush risk can be appropriately reduced. On the contrary, if the water inrush coefficient of the coal seam floor is small, the threshold value of water inrush risk can be appropriately raised. The calculation method of optimization is as follows:

$$\left[ R \right]=1 - \frac{T}{{\left[ T \right]}}$$
(8)

By substituting the water inrush coefficient calculated in the section "Actual monitoring of coal seam floor" into Eq. (8), we can get \(\left[ R \right]=0.45\), so the threshold value of water inrush risk in Fig. 7 is set as 0.45. Since there is no high-risk measured point in either Fig. 7 (a) or Fig. 7 (b), there is no need to issue a water inrush early warning.
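The threshold calculation of Eq. (8) and the subsequent check against a probability map can be sketched in Python as follows. The function names and the sample probability values are hypothetical and serve only to illustrate the comparison; the inputs 0.033 MPa/m and 0.06 MPa/m are the values quoted in the text.

```python
def risk_threshold(t, t_limit):
    """Eq. (8): [R] = 1 - T / [T]."""
    if t_limit <= 0:
        raise ValueError("[T] must be positive")
    return 1.0 - t / t_limit


def flag_risk_points(probabilities, threshold):
    """Indices of measured points whose probability exceeds the threshold."""
    return [i for i, p in enumerate(probabilities) if p > threshold]


R = risk_threshold(t=0.033, t_limit=0.06)
print(f"[R] = {R:.2f}")

# Hypothetical probabilities of four measured points (not measured data).
sample = [0.10, 0.44, 0.47, 0.62]
flagged = flag_risk_points(sample, R)
print("points to warn:", flagged)
```

An empty list from `flag_risk_points` corresponds to the situation in Fig. 7, where no early warning is needed.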

According to the calculation of Eq. (7), the normal section of the water-resisting zone can withstand a water pressure greater than the actual water pressure, so the probability of water inrush there is small and a measurement interval of 24 h can meet the monitoring requirements. For other coal mining faces with a higher risk of water inrush, the measurement interval should be shortened to 12 h or less.

Next, the proposed method is compared with the conventional method, and the used conventional method is the difference rate calculation based on cascaded reference value, which is similar to the ratio inversion. The calculation method is as follows37:

$$\Delta {\rho _a}\left( {x,y} \right)\% =\frac{{{\rho _{a,i}}\left( {x,y} \right) - {\rho _{a,i - 1}}\left( {x,y} \right)}}{{{\rho _{a,i - 1}}\left( {x,y} \right)}}$$
(9)

where \(\left( {x,y} \right)\) is the coordinate of each measured point in the apparent resistivity profile, \(\Delta {\rho _a}\left( {x,y} \right)\%\) is the apparent resistivity difference rate of each measured point, \({\rho _{a,i}}\left( {x,y} \right)\) is the apparent resistivity value of each measured point of the ith measurement, and \({\rho _{a,i - 1}}\left( {x,y} \right)\) is the apparent resistivity value of each measured point of the previous measurement.
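Eq. (9) can be sketched in Python as below. Each measurement is compared with the one immediately before it, which is how the cascaded reference value is interpreted here; the function name and the sample apparent resistivity values are hypothetical. The rates are returned as fractions, so multiplying by 100 gives percentages.

```python
def cascaded_difference_rates(measurements):
    """Apply Eq. (9) to consecutive measurements.

    measurements: list of equally long lists, one list of apparent
    resistivity values (one value per measured point) per measurement.
    Returns one list of difference rates per consecutive pair.
    """
    rates = []
    for previous, current in zip(measurements, measurements[1:]):
        if len(previous) != len(current):
            raise ValueError("all measurements must cover the same points")
        if any(value == 0 for value in previous):
            raise ValueError("reference apparent resistivity must be non-zero")
        rates.append([(c - p) / p for p, c in zip(previous, current)])
    return rates


# Hypothetical apparent resistivities (ohm-m) of two measured points
# over three measurements.
rho_a = [
    [100.0, 200.0],
    [90.0, 200.0],
    [81.0, 220.0],
]
rates = cascaded_difference_rates(rho_a)
for step, row in enumerate(rates, start=2):
    print(step, [round(100 * r, 1) for r in row])
```

With N measurements this yields N - 1 difference rate maps that have to be compared with one another.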

The measured data of Physical simulation (1) are substituted into Eq. (9) in turn, and the calculated results are sent to MATLAB for imaging. The obtained apparent resistivity difference rate contour maps are shown in Fig. 8. The measured data of Physical simulation (1) are processed with the naive Bayes classification algorithm, and the obtained probability contour map is shown in Fig. 9.

Fig. 8

Apparent resistivity difference rate contour maps of Physical simulation (1) obtained by difference rate calculation based on cascaded reference value. The red arrows indicate the sequence of measurement steps. The areas marked by white boxes are the locations of the new-added iron blocks in each step in Fig. 2.

Fig. 9

The probability contour map obtained from the measured data of Physical simulation (1) processed by the naive Bayes classification algorithm.

As can be seen from Fig. 8, the apparent resistivity difference rate contour maps obtained by difference rate calculation based on cascaded reference value can only show the change in apparent resistivity images with time; they cannot determine which points in the measured profile are the risk points of water inrush. In contrast, as can be seen from Fig. 9, the probability contour map displays the water inrush risk of each measured point in the form of probability. This not only makes the contour map of the coal seam floor much easier to read, but also expresses the changes among several groups of measured data in only one contour map, so the comparison among the different contour maps in Fig. 8 is omitted, which improves the efficiency of early warning.

Since the processes of processing measured data by ratio inversion and difference inversion are similar to difference rate calculation based on cascaded reference value, the above comparison proves the superiority of the proposed method in determining water inrush risk of coal seam floor compared with conventional methods.

Conclusions

In the process of long-term monitoring of coal seam floor, the naive Bayes classification algorithm based on pseudo-random matrix is used to process the measured data to solve the problem that the conventional methods cannot determine which points in the measured profile are the risk points of water inrush. The physical simulations are carried out in a plastic water tank. The results of the experiments show that the probability contour maps generated by the proposed method can effectively display high-risk areas. The actual monitoring is carried out on Ji 17-33200 coal mining face of No.10 Mine in Pingdingshan Tian’an Coal Industry Co., LTD. Since there is no measured point whose probability value exceeds the threshold value of water inrush risk in the probability contour maps obtained from the data measured by the Wenner α array for 4 consecutive days, there is no need to issue water inrush early warning. Since the changes among several groups of measured data are expressed in only one contour map, the comparison process in conventional methods is omitted, which improves the efficiency of early warning.