Introduction

Formation permeability is a critical parameter in petroleum engineering, influencing fluid flow within reservoirs and impacting hydrocarbon recovery efficiency. Traditional methods for estimating permeability, such as core sampling and well testing, can be costly and time-consuming, often yielding limited spatial coverage of reservoir properties. Moreover, the inherent heterogeneity of geological formations can lead to significant discrepancies in permeability values, complicating reservoir characterization. As a result, there is an increasing need for innovative approaches that leverage advanced data analytics and machine learning techniques to enhance the accuracy and efficiency of permeability estimation from available well log data and other indirect measurements1.

Understanding and evaluating formation permeability is essential for several reasons:

  1. Hydrocarbon flow: Formation permeability directly affects the movement of oil and gas within the reservoir. High permeability allows for easier flow of hydrocarbons to the wellbore, which is essential for efficient extraction. Conversely, low permeability can hinder production rates and increase extraction costs1.

  2. Reservoir management: Understanding permeability is vital for effective reservoir management and development strategies. Engineers use permeability data to predict reservoir behavior under various production scenarios, optimize well placement, and enhance recovery techniques2.

  3. Enhanced oil recovery (EOR): In enhanced oil recovery methods, knowledge of formation permeability is crucial for selecting appropriate techniques, such as water flooding or gas injection. The effectiveness of these methods often depends on the permeability characteristics of the reservoir rock3.

  4. Modeling and simulation: Accurate permeability measurements are essential for creating reliable reservoir models and simulations. These models help engineers forecast production performance, evaluate the economic viability of projects, and make informed decisions regarding drilling and completion strategies4.

Therefore, the precise measurement of this parameter is of utmost importance. The prediction of reservoir permeability has long been of key interest in the industry, since investment decisions based on the volume of hydrocarbon resources depend on its accuracy5. Considerable time and money have therefore been allocated to taking advantage of technological advances in data collection from cores, well testing, and other activities to reduce the uncertainty in permeability data and improve reservoir performance prediction6. However, such conventional methods often face problems such as a lack of data in some regions, which makes it impossible to determine formation permeability there.

Machine learning (ML) is a subset of artificial intelligence (AI) that focuses on the development of algorithms and statistical models that enable computers to perform tasks without explicit programming. Instead of being programmed with specific instructions, ML systems learn from data, identifying patterns and making decisions based on that information. In the past decade, advancements in computer science, especially in artificial intelligence and machine learning, have enabled us to effectively extract essential information from real-world data collected in oil industry applications and employ it for better reservoir characterization7,8,9,10. ML is broadly acknowledged to improve our understanding of wells11, production, and reservoir areas12. In particular, ML is most broadly utilized in reservoir management and has accomplished important outcomes such as permeability, porosity, and tortuosity prediction13, modeling the minimum miscibility pressure of CO2-oil systems14, shale gas production forecasting15, reservoir characterization16, predicting formation damage in oil fields17, digital 3D core reconstruction18,19,20, well test interpretation21,22, shale gas production optimization23,24, well log processing25, modeling wax deposition of crude oils26, and history matching27,28. This has motivated numerous researchers to gradually abandon multiple linear regression models and empirical correlations in favor of ML when forecasting significant reservoir petro-physical properties29.

ML approach for permeability prediction

The artificial neural network (ANN) is a well-known approach for predicting permeability30. In recent years, several authors have studied reservoir characterization problems from different aspects, including soft computing methods31,32,33. The results of such studies revealed that soft computing models outperform regression models; the advantage of computational methods lies in the fact that elemental uncertainties or heterogeneities are not explicitly captured by regression methods32.

Permeability can be evaluated by interpreting in situ measurements taken by formation testers, well-testing equipment, and well logging. Transient well testing yields the average permeability-thickness product and thereby provides a wealth of information about the flow capacity of the reservoir. Another worthwhile method for measuring the absolute permeability of reservoirs is to conduct flow experiments on representative core samples34. A reliable and accurate permeability estimate therefore helps geoscientists manage the production process effectively.

Several studies have been proposed to estimate permeability. Mohagueg et al.35 presented three main approaches to permeability estimation, analytical, statistical, and computational, using well-log data. Chehrazi and Rezaee36 introduced a classification scheme for permeability prediction models, comprising analytical models, soft computing models, and porous phase models, using well-log data. Rezaee et al.37 presented the results of a research project that investigated permeability prediction for the Precipice Sandstone of the Surat Basin, in which machine learning techniques were applied to estimate permeability from multiple wireline logs. Tembely et al.38 emphasized the important role of feature engineering in predicting physical properties with machine and deep learning; their framework, which integrates various learning, rock imaging, and modeling algorithms, can rapidly and accurately estimate petrophysical properties to facilitate reservoir simulation and characterization. Okon et al.39 presented an ANN model, developed from logs of fifteen fields, to forecast the physical properties of reservoirs, namely porosity, permeability, and water saturation. Yasin et al.40 applied a joint inversion technique based on a multilayer linear calculator and particle swarm optimization to estimate the spatial variation of important petrophysical parameters, e.g. porosity, permeability, and saturation, and essential geo-mechanical properties (Poisson's ratio and Young's modulus) for downhole zones using seismic data. Anifowose et al.41 conducted a rigorous parametric study to examine the comparative accuracy of ML techniques in estimating the permeability of a Middle East carbonate reservoir by integrating seismic attributes and wireline data. Akande et al.42 studied the predictability and impact of feature engineering on the precision of support vector machines in estimating carbonate reservoir permeability using well-log data. Bruce et al.43 applied an ANN to permeability estimation using wireline logs. El Ouahed et al.44 proposed combining an ANN with fuzzy logic to account for fractured reservoirs using well-log data. Al Khalifah et al.45 used an ANN and genetic algorithms to estimate core permeability measured in laboratory experiments.

The studies mentioned in the previous paragraph are summarized in Table 1, which compares the permeability estimation research of different authors by method and type of data. It can be seen that researchers have used statistical tools and artificial intelligence to estimate formation permeability, and that the data used include well log data, rock imaging data, seismic data, and core data.

Table 1 Tools and data types used by researchers to estimate permeability.

Lost circulation vs. formation permeability

Lost circulation is a prevalent drilling problem, especially in formations with high permeability and natural or induced fractures46,47. Lost circulation can occur in a variety of formations, ranging from shallow, unconsolidated geological layers to well-consolidated layers that are disrupted by the hydrostatic pressure of the drilling fluid48,49. Two conditions are necessary for loss of circulation in the borehole to occur: first, the pressure at the bottom of the well must exceed the pore pressure, and second, there must be a fluid flow path for the lost circulation50. The underground routes through which lost circulation occurs can be divided into the following classes:

  • Cavernous formations: Some formations contain caverns and empty spaces; when such formations are drilled, a large amount of mud loss occurs (Fig. 1a).

  • Natural fractures: A natural fracture network, created by tectonic activity in the formation, can act as a conduit for leakage; the amount of leakage depends on several factors discussed below (Fig. 1b).

  • Induced fractures (e.g. from quick tripping or blowouts): In this mechanism, drilling operations such as tripping and events such as blowouts increase the bottom-hole pressure, and fractures are induced. Because some formations have low strength, the additional stress applied to the formation causes it to fracture (Fig. 1c).

  • Highly permeable formations: In permeable formations, the pressure difference between the bottom hole and the formation causes a large amount of drilling fluid to leak into the formation (Fig. 1d).

Fig. 1 Schematic classification of lost circulation51.

Fractures are an important cause of drilling fluid loss to formations; the severity of lost circulation depends on the fracture opening width, fracture density, orientation, distribution, the fracture network, etc.52,53,54.

The loss rate indicates the paths of lost circulation and can show which remedial technique should be employed to counteract the loss. Lost circulation severity can be divided into four classes as follows55,56,57:

  • Seepage losses: less than 1 m3/h.

  • Partial losses: 1–10 m3/h.

  • Severe losses: more than 15 m3/h.

  • Complete losses: no fluid comes out of the annulus.

In this study, for the first time, a novel method is proposed that puts mud loss data to another important use. To this end, synthetic data derived from a commercial reservoir simulator are used to train and build the AI models. Simulator-generated mud loss data are used to estimate formation permeability with Deep Jointly Informed Neural Networks (DJINN) and Convolutional Neural Networks (CNN), taking formation type, formation thickness, mud density, mud viscosity, drilling depth, and mud loss rate as inputs and yielding an accurate prediction of formation permeability.

Methodology

The model development diagram is shown in Fig. 2, and the method preparation is discussed in Sects. 2.1 to 2.5. The model development flowchart begins with data generation, in which all of the parameters related to mud loss are generated. The generated data are then subjected to statistical analysis. Next, the data undergo preprocessing to make them suitable for modeling. The modeling phase begins with initializing the hyper-parameters of the deep learning models. Once the hyper-parameters are initialized, the models are trained with the adaptive moment estimation (Adam) optimizer. The hyper-parameters are then adjusted iteratively by trial and error until the model shows good performance metrics with minimum error; a sketch of this loop is given after the flowchart below.

Fig. 2 Deep learning model development flowchart.
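As a rough illustration of this trial-and-error loop, the sketch below iterates over a small hyper-parameter grid, trains a model with the Adam optimizer, and keeps the configuration with the lowest validation RMSE. The grid values, the placeholder architecture in build_model, and the training settings are illustrative assumptions, not the settings used in this study.

```python
# Minimal sketch of the trial-and-error hyper-parameter search described above.
# Grid values, build_model() and training settings are illustrative assumptions.
import itertools
import numpy as np
import tensorflow as tf

def build_model(n_units, learning_rate, n_features):
    """Small fully connected regressor used as a placeholder architecture."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(n_units, activation="elu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mse")
    return model

def tune(x_train, y_train, x_val, y_val):
    best_rmse, best_cfg = np.inf, None
    for n_units, lr in itertools.product([32, 64, 128], [1e-2, 1e-3]):
        model = build_model(n_units, lr, x_train.shape[1])
        model.fit(x_train, y_train, epochs=100, batch_size=32, verbose=0)
        rmse = float(np.sqrt(model.evaluate(x_val, y_val, verbose=0)))
        if rmse < best_rmse:                      # keep the best configuration
            best_rmse, best_cfg = rmse, (n_units, lr)
    return best_cfg, best_rmse
```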

Data generation

As shown in Fig. 3, the drilling fluid loss process is similar to fluid injection into a porous medium. According to Darcy's law, the loss rate depends on the bottom-hole pressure, the formation pressure, the viscosity of the drilling fluid, and the formation permeability. In this study, reservoir simulator software (Eclipse E100) was used to simulate the drilling fluid loss process and generate the mud loss data used to train the models.

Fig. 3 Lost circulation process (h: formation thickness, k: formation permeability, Pw: bottom-hole pressure).
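For orientation, the loss process sketched in Fig. 3 can be idealized as steady-state radial Darcy flow from the wellbore into the formation; a minimal, illustrative form in consistent units (not necessarily the exact equation implemented in the simulator) is

$$q = \frac{{2\pi kh(P_{w} - P_{e} )}}{{\mu \ln (r_{e} /r_{w} )}}$$

where q is the mud loss rate, k the formation permeability, h the formation thickness, \(P_{w}\) the bottom-hole pressure, \(P_{e}\) the formation pressure, \(\mu\) the mud viscosity, and \(r_{e}\) and \(r_{w}\) the drainage and wellbore radii.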

The data available in the drilling process include mud weight, mud viscosity, drilling depth, mud loss rate, formation type, and the thickness of the drilled formation up to that depth. Therefore, 810 series of mud loss data were generated according to the following assumptions (an illustrative sketch of the resulting data structure follows the list):

  1. Lost circulation limits: 1–250 bbl/hr.

  2. Fluid type: water-based mud.

  3. Increasing mud viscosity with increasing mud weight.

  4. Increasing mud weight with increasing depth in general.

  5. 10 layers with different pore pressures.
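A minimal sketch of how such a dataset could be assembled under the assumptions above is given below. The numeric ranges, trends, and column names are illustrative assumptions; in this study the mud loss rate and the corresponding permeability were produced by the Eclipse E100 simulator rather than drawn at random.

```python
# Illustrative structure of the synthetic mud loss dataset (810 records).
# Ranges, trends and column names are assumptions, not the study's settings.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 810

formation_type = rng.integers(1, 11, size=n)                                 # 10 layers
depth = rng.uniform(500.0, 4000.0, size=n)                                   # m (assumed range)
mud_weight = 1.0 + 0.0002 * depth + rng.normal(0.0, 0.02, n)                 # rises with depth
mud_viscosity = 10.0 + 30.0 * (mud_weight - 1.0) + rng.normal(0.0, 1.0, n)   # rises with weight
thickness = rng.uniform(5.0, 100.0, size=n)                                  # m (assumed range)
mud_loss_rate = rng.uniform(1.0, 250.0, size=n)                              # bbl/hr, stated limits
permeability = rng.uniform(1.0, 1000.0, size=n)                              # md, placeholder target

data = pd.DataFrame({
    "formation_type": formation_type,
    "depth_m": depth,
    "mud_weight": mud_weight,
    "mud_viscosity_cp": mud_viscosity,
    "thickness_m": thickness,
    "mud_loss_bbl_hr": mud_loss_rate,
    "permeability_md": permeability,
})
print(data.describe())
```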

Statistical analysis of generated data

Data analysis of the mud loss dataset focused on descriptive and inferential statistics, with an emphasis on univariate analysis. The data are summarized by the distribution of each parameter in Table 2, and Fig. 4 shows the histogram of each variable.

Table 2 Data summary of the generated mud loss data.
Fig. 4 Histogram of mud loss data.

The correlation coefficient (CC) is mainly used to test the linear association between parameters. It can be expressed as follows:

$$CC = \frac{{\sum\limits_{i = 1}^{n} {(x_{m} - \overline{x}_{m} )(x_{p} - \overline{x}_{p} )} }}{{\sqrt {\sum\limits_{i = 1}^{n} {(x_{m} - \overline{x}_{m} )^{2} \sum\limits_{i = 1}^{n} {(x_{p} - \overline{x}_{p} )^{2} } } } }}$$
(1)

where n represents the number of experimental data, \({x}_{m}\) and \({x}_{p}\) define the measured and predicted parameters, respectively, and \(\overline{x}_{m}\) and \(\overline{x}_{p}\) signify their average values58.
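Equation (1) is the standard Pearson correlation coefficient. A brief sketch of how it, and the full matrix shown in Fig. 5, could be computed is given below; the DataFrame name data refers to the illustrative table sketched earlier and is an assumption.

```python
# Pearson correlation coefficient (Eq. 1) and the full CC matrix.
import numpy as np

def pearson_cc(x_m: np.ndarray, x_p: np.ndarray) -> float:
    """Direct implementation of Eq. (1)."""
    dm, dp = x_m - x_m.mean(), x_p - x_p.mean()
    return float((dm * dp).sum() / np.sqrt((dm ** 2).sum() * (dp ** 2).sum()))

# Full matrix over all generated variables (pandas uses Pearson by default).
cc_matrix = data.corr()
```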

Figure 5 shows the CC matrix for the analyzed variables. According to these data, the mud loss rate is the main parameter influencing permeability, while the other parameters have insignificant effects on it. The CC matrix also shows that the generated data are comparable to operational data; for example, the high correlation between fluid viscosity and fluid density is similar to the relation between these parameters under real conditions.

Fig. 5 Correlation coefficient matrix for mud loss data.

Figure 6 shows the pairwise relationships between the variables. Some pairs, e.g. depth versus formation type and drilling fluid viscosity versus density, show an approximately linear relation, while other pairs are clearly non-linear; this reflects the random conditions assumed for drilling and the absence of a linear relationship between those parameters under real conditions.

Fig. 6 Non-linear relationships between variables.

Data arrangement

Mud loss data were passed through three steps before being fed to the deep learning models: organizing the categorical data, pre-processing the data with a normalization scaler, and splitting the data into training and testing sets.

The categorical variable organized in the data collection stage is formation type, which was assigned the values 1 to 10. Normalization was then performed to scale the variables to between 0 and 1. Such normalized data are used to train the neural network models since normalization enhances the learning process and reduces computational cost59. The formula used for normalization is as follows:

$$\hat{x}_{i} = \frac{{x_{i} - x_{\min } }}{{x_{\max } - x_{\min } }}$$
(2)

In Eq. (2), \(x_{i}\) is the value of the variable for the ith observation, \(x_{\min }\) is the minimum value of the variable, and \(x_{\max }\) is its maximum value. Finally, the reshaped data were split into training and testing sets using an 80:20 ratio.
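A minimal sketch of these arrangement steps, min-max normalization per Eq. (2) followed by an 80:20 split, is shown below; the column names follow the earlier illustrative DataFrame and are assumptions.

```python
# Data arrangement sketch: min-max scaling (Eq. 2) and an 80:20 train/test split.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# `data` is the generated table; `permeability_md` is the target (illustrative names).
X = data.drop(columns=["permeability_md"]).values
y = data["permeability_md"].values.reshape(-1, 1)

x_scaler, y_scaler = MinMaxScaler(), MinMaxScaler()
X_scaled = x_scaler.fit_transform(X)      # each feature mapped to [0, 1]
y_scaled = y_scaler.fit_transform(y)

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y_scaled, test_size=0.2, random_state=0)   # 80:20 split
```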

Convolutional neural networks (CNN)

In the past decade, Convolutional Neural Networks have been responsible for breakthroughs in computer vision and image processing 60,61,62, obtaining state-of-the-art results on a range of benchmark and real-world tasks. More recently, one-dimensional CNNs have shown great promise in processing structured linguistic data in tasks such as machine translation 63,64 and document classification 65,66. Bai et al. 67 in 2018 indicated that, for many sequence modeling tasks, 1D-CNNs using current best practices such as dilated convolution often perform better than other recurrent neural network architectures.

A convolutional neural network is a type of feedforward neural network that consists of multiple convolution stages, which perform feature extraction, and a single output stage that combines the extracted high-level features to predict the desired output68. Figure 7 shows an example of a 1D-CNN architecture for the forecasting model.

Fig. 7 Example of one-dimensional convolutional neural network (1D-CNN) architecture.

Table 3 shows the elements of the neural network used in this study, which includes one 1D convolution layer, one flatten layer, two dropout layers with a rate of 0.2, and two fully connected layers. The exponential linear unit (ELU) activation function is applied in the convolution and fully connected layers (an illustrative sketch of this architecture is given after Table 3).

Table 3 Details of the used 1D-CNN model.
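A hedged Keras sketch of the architecture summarized in Table 3 is given below: one Conv1D layer, a flatten layer, two dropout layers with a rate of 0.2, two fully connected layers, and ELU activations. The filter count, kernel size, and dense-layer width are illustrative assumptions, since the exact values from Table 3 are not reproduced here.

```python
# Sketch of the 1D-CNN regressor described in Table 3.
# Filter count, kernel size and dense width are assumptions.
import tensorflow as tf

n_features = 6   # formation type, thickness, mud density, viscosity, depth, loss rate

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features, 1)),              # each sample as a short 1D sequence
    tf.keras.layers.Conv1D(filters=32, kernel_size=2, activation="elu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="elu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1),                                   # normalized permeability
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
```

Before fitting, the scaled feature matrix would be reshaped to (samples, n_features, 1), e.g. X_train.reshape(-1, n_features, 1).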

Deep jointly informed neural networks (DJINN)

The DJINN algorithm determines the appropriate deep neural network architecture and initializes the weights using the dependency structure of the decision tree trained on the data. The algorithm can be divided into three steps: building a set of decision trees, mapping the tree to a neural network, and fine-tuning the neural network through backpropagation 69.

Decision tree construction

The first step of the DJINN algorithm is to build a decision-tree-based model. This can be a single decision tree, generating one neural network, or an ensemble of trees such as a random forest70, which creates a set of neural networks. The depth of the trees is often limited to avoid creating neural networks that are too large; the maximum tree depth is a hyper-parameter that must be tuned for each data set69.

Mapping decision trees to deep neural networks

The DJINN algorithm selects a deep neural network architecture and a set of initial weights based on the structure of the decision tree. The mapping is not intended to reproduce the decision tree; instead, it uses the decision paths as a guide for the network architecture and weight initialization. Neural networks are initialized layer by layer, whereas decision trees are typically stored by decision path. Each path starts at the top branch of the tree and follows each decision to the left and then to the right until it reaches a leaf (prediction). Because of the way trees are stored, it is difficult to traverse the tree by depth, but it is easy to traverse it recursively. Mapping from a tree to a neural network is easiest if the structure of the tree is known before the neural network weights are initialized. Therefore, the decision paths are traversed twice: first to determine the structure and then to initialize the weights69.
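The exact weight-mapping scheme is given in Ref. 69 and is not reproduced here. As a simplified illustration of the general idea, the sketch below uses a depth-limited decision tree only to choose hidden-layer widths (one layer per tree depth, sized by the number of split nodes at that depth) and then builds a plain dense network; the full DJINN weight initialization is omitted.

```python
# Simplified, DJINN-inspired sketch: size hidden layers from a decision tree.
# This is NOT the full DJINN mapping; weights here are default-initialized.
import numpy as np
import tensorflow as tf
from sklearn.tree import DecisionTreeRegressor

def tree_informed_widths(X, y, max_depth=4):
    """Fit a depth-limited tree and count split nodes at each depth."""
    tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, y).tree_
    depths = np.zeros(tree.node_count, dtype=int)
    for node in range(tree.node_count):                       # children have larger indices
        for child in (tree.children_left[node], tree.children_right[node]):
            if child != -1:
                depths[child] = depths[node] + 1
    is_split = tree.children_left != -1                        # mask of non-leaf nodes
    return [max(1, int(np.sum(is_split & (depths == d)))) for d in range(max_depth)]

def build_tree_informed_network(n_features, widths):
    layers = [tf.keras.layers.Input(shape=(n_features,))]
    layers += [tf.keras.layers.Dense(w, activation="relu") for w in widths]
    layers += [tf.keras.layers.Dense(1)]                       # regression output
    return tf.keras.Sequential(layers)
```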

Optimizing the neural networks

As soon as the tree is mapped to the initialized neural network, the weights are adjusted using backpropagation. Here, the deep neural network is trained with Google's deep learning library TensorFlow. The activation function used in each hidden layer is the rectified linear unit, which generally works well in deep neural networks71,72 and can retain the values of neurons in previous hidden layers. The Adam optimizer73 is used to minimize the cost function (mean squared error (MSE) for regression, cross-entropy with logits for classification)74.
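A short sketch of this fine-tuning step, assuming the tree-informed network and the train/test arrays from the earlier sketches, might look as follows; the learning rate, epoch count, and batch size are assumptions.

```python
# Fine-tuning the mapped network with backpropagation and the Adam optimizer.
import tensorflow as tf

widths = tree_informed_widths(X_train, y_train, max_depth=4)
model = build_tree_informed_network(n_features=X_train.shape[1], widths=widths)

# Regression (this study): mean squared error.  For classification one would
# instead use, e.g., tf.keras.losses.BinaryCrossentropy(from_logits=True).
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          epochs=300, batch_size=32, verbose=0)
```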

Model performance evaluation

It is necessary to define the criteria used to evaluate model performance. In this work, root mean squared error, mean absolute error, mean absolute percentage error, R-squared, and relative error were used as statistical indicators to evaluate the performance of the models.

Root mean squared error (RMSE)

The root mean squared error is used to see how well the network output matches the desired output; smaller RMSE values indicate better performance. It is defined as follows75:

$$RMSE = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {(x_{m} - x_{p} )^{2} } }$$
(3)

Mean absolute error (MAE)

The mean absolute error is the average of the absolute differences between the predicted and actual values, with all individual errors weighted equally. Furthermore, MAE is the most natural and unambiguous measure of the average error level58.

$$MAE = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {x_{p} - x_{m} } \right|}$$
(4)

Mean absolute percentage error (MAPE)

The mean absolute percentage error is calculated by dividing the absolute error of each observation by the corresponding observed value and then averaging these percentages. This approach is useful when the size or dimension of the predicted variable is important in assessing prediction accuracy76,77. MAPE indicates the magnitude of the forecast error relative to the actual value.

$$MAPE = \frac{1}{n}\sum\limits_{i = 1}^{n} {\frac{{\left| {x_{p} - x_{m} } \right|}}{{x_{m} }}} \times 100\%$$
(5)

R-squared (R2)

An important index to check the correctness of the regression algorithm is \({R}^{2}\), which ranges from 0 to 1. \({R}^{2}\) is defined as follows58:

$$R^{2} = 1 - \frac{{\sum\limits_{i = 1}^{n} {(x_{m} - x_{p} )^{2} } }}{{\sum\limits_{i = 1}^{n} {(x_{m} - \overline{x}_{m} )^{2} } }}$$
(6)

where n represents the number of observations, \({x}_{m}\) and \({x}_{p}\) define the measured and predicted parameters, respectively, and \(\overline{x}_{m}\) signifies the average of the measured parameters.

Relative error (RE)

The relative error is defined as the ratio of the difference between the predicted and measured values to the measured value. If \({x}_{m}\) is the measured value of a quantity and \({x}_{p}\) is the predicted value, the relative error can be calculated using the formula below78.

$$RE = \frac{{x_{p} - x_{m} }}{{x_{m} }}$$
(7)
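For completeness, a compact implementation of Eqs. (3)–(7), with x_m and x_p as NumPy arrays of measured and predicted values, could look like this:

```python
# Performance metrics from Eqs. (3)-(7).
import numpy as np

def rmse(x_m, x_p):
    return float(np.sqrt(np.mean((x_m - x_p) ** 2)))           # Eq. (3)

def mae(x_m, x_p):
    return float(np.mean(np.abs(x_p - x_m)))                   # Eq. (4)

def mape(x_m, x_p):
    return float(np.mean(np.abs(x_p - x_m) / x_m) * 100.0)     # Eq. (5), percent

def r_squared(x_m, x_p):
    ss_res = np.sum((x_m - x_p) ** 2)
    ss_tot = np.sum((x_m - x_m.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)                        # Eq. (6)

def relative_error(x_m, x_p):
    return (x_p - x_m) / x_m                                   # Eq. (7), element-wise
```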

Results and discussion

Based on the previously mentioned methods, the structural parameters of the CNN and DJINN models for predicting formation permeability were determined, and the models were trained and tested. Real versus predicted values of permeability (md) for the training and testing data are displayed as cross-plots in Figs. 8 and 9. R2 represents an alternative measure of forecast accuracy: as a precision indicator, it represents the proportion of the variance of the dependent variable that can be predicted from the chosen independent variables. If R2 = 1, the permeability of the formation can be predicted without error from the selected independent variables.

Fig. 8 Cross-plots of the real value versus predicted values of the normalized permeability by 1D-CNN for training data.

Fig. 9 Cross-plots of the real value versus predicted values of the normalized permeability by DJINN for testing data.

As can be seen from Table 4, the 1D-CNN prediction model has sufficiently high accuracy on training and test data (for training data: R2 = 0.968, RMSE = 50.78, MAE = 37.50, MAPE = 16.39; for test data: R2 = 0.962, RMSE = 58.17, MAE = 42.95, MAPE = 11.29). As shown in Table 5, the DJINN prediction model has also sufficiently high accuracy on training and test data (for training data: R2 = 0.973, RMSE = 46.15, MAE = 34.34, MAPE = 9.57; for test data: R2 = 0.970, RMSE = 51.39, MAE = 39.56, MAPE = 13.53).

Table 4 1D-CNN model accuracies.
Table 5 DJINN model accuracies.

Figures 10 and 11 show the relative error for the 1D-CNN and DJINN models, respectively. They indicate that the accuracy for data with low values is lower than for data with high values; therefore, these models are more suitable for predicting high-valued data.

Fig. 10 Training and testing relative error for 1D-CNN.

Fig. 11 Training and testing relative error for DJINN.

Figure 12 compares the computational errors on the training and test data for the two algorithms. It shows that the RMSE, R2, MAE, and MAPE of the DJINN model are better than those of the 1D-CNN model; therefore, DJINN is the better algorithm for predicting formation permeability.

Fig. 12 Comparing RMSE, MAE, MAPE, and R2 accuracies for training and testing data.

Table 6 compares the results of this study with those of recent permeability estimation studies. While most studies have used well log, seismic, rock imaging, and core data to estimate formation permeability with analytical, statistical, computational, and artificial intelligence tools, this study uses deep learning algorithms to estimate formation permeability from mud loss data.

Table 6 Comparison of the results obtained with those of recent permeability estimation studies.

Conclusions

Permeability is a key parameter in reservoir characterization. There are various methods to evaluate the formation and estimate its permeability, but in some cases the evaluation cannot be performed, or is not performed correctly.

This study estimated formation permeability using drilling fluid data and two deep learning algorithms, 1D-CNN and DJINN. Drilling data, including depth, formation type, fluid density, fluid viscosity, formation thickness, and mud loss rate, were generated with reservoir simulator software to resemble real-world conditions.

The results show that DJINN (R2 of 0.973 on training data and 0.970 on test data) is a more accurate model than 1D-CNN (R2 of 0.968 on training data and 0.962 on test data) for this problem. Therefore, this study presents a novel method that uses mud loss data to estimate formation permeability accurately with deep learning algorithms (Supplementary Table S1).