Introduction

In the past few decades, the effective mechanical properties of porous materials have been widely explored. Several analytical methods in the literature predict effective mechanical properties, but these models are approximations, and their accuracy is usually limited. For this reason, most authors obtain the macroscopic properties of composite materials through multi-scale homogenization, in most cases using the finite element method (FEM). This makes it possible to relate microstructural parameters (such as pore size and volume fraction) to the material response and to study the roles of defects, interfaces, and nonlinearity in material behavior.

When estimating the overall properties of porous materials, homogenization theory can be applied by replacing the inclusion material with pores, and the effective properties can then be obtained through analytical or finite element methods. Eshelby’s equivalent inclusion theory is one of the analytical methods1, but it is not applicable when there are many pores or the structure is even slightly complex. Sakata et al. used an approximate stochastic homogenization method based on polynomial approximation techniques to analyze the influence of microscopic variations on the homogenized elastic properties of particle-reinforced composite materials2.

Pian et al.3 proposed the hybrid stress element method (HSEM), which achieves higher accuracy on coarse meshes by assuming high-order stress fields within the elements and avoids the complex displacement interpolation required when a traditional displacement-based finite element has too many edges4. For the stability problem of rock slopes, Yang et al.5 studied the contact algorithm of polygonal hybrid elements, exploiting the advantages of polygonal hybrid stress elements in constructing stress functions and integration regions so that they adapt to complex model boundaries and material boundaries. Because the number of nodes in a hybrid stress element is adjustable, it can be used with the quadtree algorithm and is widely used in the analysis of particle-reinforced composite materials. Ghosh et al.6,7 proposed a Voronoi cell finite element method based on hybrid stress elements, in which each element contains one inclusion and an eigenstrain is introduced to account for microstructural features. Guo et al.8,9,10,11 proposed improved Voronoi elements based on hybrid stress element theory that consider thermal strain, plasticity, and creep, making hybrid stress elements more versatile. Wang12 proposed a quadtree algorithm that speeds up hybrid stress element calculations. The influence of the volume fraction, number, and aspect ratio of random inclusions on the homogenized equivalent modulus of materials was studied using homogenization and Monte Carlo methods. Zhao13 studied the mechanical properties of particle-reinforced composite materials in terms of equivalent modulus, equivalent Poisson’s ratio, and principal stress, using the scaled boundary finite element method (SBFEM) and the Monte Carlo method.

Data science and machine learning have developed rapidly in the past few years and are increasingly applied in materials engineering. Mai14 proposed a two-dimensional temperature field inversion method for turbine blades using a physics-informed neural network (PINN) and a finite number of discrete temperature measurement points. The network can be trained on a small amount of data to predict the temperature field of the turbine blades: a PINN trained on only 62 temperature data points achieves an average relative error of less than 2% and an \(R^{2}\) score higher than 0.95 on the test set. Nattavadee15 used multi-layer neural network (MNN) and convolutional neural network (CNN) algorithms to predict permeability in computational rock physics, training them by gradient descent with Bayesian regularization and backpropagation. Most of these studies establish a one-to-one relationship between microstructure and material properties. Yang16 designed and implemented a deep learning method for modeling the elastic homogenization structure-property relationship in high-contrast composite material systems. Li et al.17,18,19 applied data analysis and supervised machine learning to accurately predict the macroscopic stiffness and yield strength of unidirectional composite materials subjected to transverse plane loading. Wu proposed a framework that uses image recognition neural networks to quickly predict permeability directly from images, a novel pore-scale modeling method with great potential. Bom20,21,22,23 proposed a method that combines deep learning and Bayesian methods to estimate permeability and porosity from image logs simultaneously, together with their uncertainty. This method can be used to derive a probability density function (PDF) for each prediction.

Rock is a naturally porous material that contains a large number of irregular, cross-scale pores inside. These pores directly affect the rock’s macroscopic physical, mechanical, and chemical properties, such as strength, elastic modulus, permeability, electrical conductivity, wave velocity, particle adsorption force, rock reservoir productivity, etc24. Exploring the intrinsic relationship between pore structure and rocks’ macroscopic physical and mechanical properties is of great significance for solving practical problems in petroleum, geology, mining, metallurgy, civil engineering, and hydraulic engineering25.

Rock is essentially a heterogeneous material, and the microscopic pore structure influences its mechanical properties. Although current research has, to some extent, revealed the relationship between the pore structure of rocks and their macroscopic mechanical properties, due to the complexity and disorder of pore structure, as well as the lack of theoretical and experimental methods, people are still unable to accurately describe the pore characteristics inside rocks26. There are many factors that affect the mechanical properties of rocks. Considering the complexity of the problem, this article focuses on the influence of two-dimensional pore distribution on the mechanical properties of rocks, that is, the analysis of the effective mechanical properties of porous materials.

This article proposes a framework for predicting the mechanical properties of porous materials. We developed a data-driven supervised machine learning model and applied it to porous materials. The framework uses the quadtree algorithm and a convolutional neural network to establish an implicit mapping between the generated RVE models and their mechanical parameters, and it can quickly predict the effective elastic modulus and Poisson’s ratio of porous material models. The \(R^{2}\) of the framework’s predictions of the equivalent elastic modulus is 0.98. For randomly generated non-circular irregular pore microstructures, the maximum prediction error of the model is 3.6%. The framework consists of (1) generation of porous media samples, (2) calculation of mechanical parameters through the quadtree algorithm, (3) training of a convolutional neural network (CNN) with the simulated data, and (4) validation of the simulation.

Methodology

Basic principle of hybrid stress element

Fig. 1

Arbitrary polygonal hybrid stress element.

A typical hybrid stress element is shown in Fig. 1. The outer boundary of the element consists of a prescribed displacement boundary \(\partial\varOmega_{u}\), a prescribed force boundary \(\partial\varOmega_{t}\), an inter-element boundary \(\partial\varOmega_{e}^{\prime}\) shared with neighboring elements, and a free boundary \(\partial\varOmega_{f}\), i.e. \(\partial\varOmega_{e}=\partial\varOmega_{e}^{\prime}\cup\partial\varOmega_{u}\cup\partial\varOmega_{t}\cup\partial\varOmega_{f}\). Based on the principle of minimum complementary energy, with the stress \(\varvec{\sigma}\) taken as the field variable, the complementary energy functional of the element can be written as:

$${\varPi}_{c}={\int}_{{\varOmega}_{e}}^{}\frac{1}{2}\varvec{\sigma}:\varvec{S}:\varvec{\sigma}\text{d}\varOmega-{\int}_{\partial{\varOmega}_{u}}^{}\mathbf{T}\cdot\overline{\varvec{u}}\text{d}\partial\varOmega$$
(1)

Here, \(S\) is the elastic compliance tensor, written as a \(3\times 3\) matrix for the plane problem, \(E\) is Young’s modulus, and \(\mu\) is Poisson’s ratio:

$$\begin{array}{c}S=\left[\begin{array}{ccc}\frac{1}{E}&-\frac{\mu}{E}&0\\-\frac{\mu}{E}&\frac{1}{E}&0\\0&0&\frac{2\left(1+\mu\right)}{E}\end{array}\right]\end{array}$$
(2)

\(\varvec{\sigma}\) is the equilibrium stress field within the domain \(\varOmega_{e}\), \(\overline{\varvec{u}}\) is the known displacement on the boundary, and \(\mathbf{T}\) is the boundary surface force (traction) of the element. In addition, the force boundary condition must be satisfied on the prescribed surface force boundary:

$$\varvec{n}\cdot\varvec{\sigma}=\overline{\mathbf{T}}$$
(3)

The continuity condition of the surface force must be satisfied on the boundary between elements:

$${\varvec{n}}^{+}\cdot\varvec{\sigma}={\varvec{n}}^{-}\cdot\varvec{\sigma}$$
(4)

Introducing the boundary displacement \(\varvec{u}\) as a Lagrange multiplier to enforce the two constraints in Eqs. (3) and (4), the modified complementary energy functional is obtained as follows:

$${\varPi}_{mc}=\sum_{e}\left({\int}_{{\varOmega}_{e}}\frac{1}{2}\varvec{\sigma}:\varvec{S}:\varvec{\sigma}\,\text{d}\varOmega-{\int}_{\partial{\varOmega}_{u}}\mathbf{T}\cdot\overline{\varvec{u}}\,\text{d}\partial\varOmega-{\int}_{\partial{\varOmega}_{e}^{\prime}}\varvec{n}\cdot\varvec{\sigma}\cdot\varvec{u}\,\text{d}\partial\varOmega-{\int}_{\partial{\varOmega}_{t}}\left(\varvec{n}\cdot\varvec{\sigma}-\overline{\mathbf{T}}\right)\cdot\varvec{u}\,\text{d}\partial\varOmega\right)$$
(5)

According to the definition of surface force (traction), the surface force on the boundary is expressed as:

$$\mathbf{T}=\varvec{n}\cdot\varvec{\sigma}$$
(6)

The displacement \(\overline{\varvec{u}}\) prescribed on the displacement boundary \(\partial\varOmega_{u}\) is equal to the displacement \(\varvec{u}\) interpolated from the element nodal displacements, that is:

$$\overline{\varvec{u}}=\varvec{u}$$
(7)

Substituting Eqs. (6) and (7) and collecting the boundary terms over the whole element boundary \(\partial\varOmega_{e}\) (the free boundary \(\partial\varOmega_{f}\) carries no surface force), the modified complementary energy functional simplifies to:

$${\varPi}_{mc}=\sum_{e}\left({\int}_{{\varOmega}_{e}}\frac{1}{2}\varvec{\sigma}:\varvec{S}:\varvec{\sigma}\,\text{d}\varOmega-{\int}_{\partial{\varOmega}_{e}}\varvec{n}\cdot\varvec{\sigma}\cdot\varvec{u}\,\text{d}\partial\varOmega+{\int}_{\partial{\varOmega}_{t}}\overline{\mathbf{T}}\cdot\varvec{u}\,\text{d}\partial\varOmega\right)$$
(8)

In the hybrid stress element model, the equilibrium stress field within the element is represented by the stress coefficients \(\varvec{\beta}\), and is written as:

$$\varvec{\sigma}=\mathbf{P}\varvec{\beta}$$
(9)

Here, \(\varvec{\beta}\) is a column vector containing \(m\) stress parameters, and for two-dimensional problems, \(\mathbf{P}\) is a \(3\times m\) matrix. The boundary surface force can therefore be expressed as the product of the generalized force matrix \(\mathbf{R}\) and the stress coefficients \(\varvec{\beta}\):

$$\mathbf{T}=\varvec{n}\mathbf{P}\varvec{\beta}=\mathbf{R}\varvec{\beta}$$
(10)

where \(\varvec{n}\) is the unit vector in the normal direction of the interface.

The boundary displacement \(\varvec{u}\) is obtained by linear interpolation of the nodal values:

$$\varvec{u}=\mathbf{L}\mathbf{d}$$
(11)

Here, \(\mathbf{d}\) is the generalized nodal displacement, and \(\mathbf{L}\) is the displacement interpolation function.

The prescribed surface force satisfies the condition:

$$\overline{\mathbf{T}}-\mathbf{T}=0$$
(12)

Substituting Eqs. (9) to (11) into Eq. (8), we obtain the modified complementary energy functional after finite element discretization:

$${\varPi}_{mc}=\sum_{e}\left(\frac{1}{2}{\varvec{\beta}}^{T}\mathbf{H}\varvec{\beta}-{\varvec{\beta}}^{T}\mathbf{G}\mathbf{d}+{\overline{\mathbf{f}}}^{T}\mathbf{d}\right)$$
(13)

Here, \(\mathbf{H}\) and \(\mathbf{G}\) are, respectively:

$$\mathbf{H}={\int}_{{\varOmega}_{e}}{\mathbf{P}}^{T}\mathbf{S}\mathbf{P}\,\text{d}\varOmega$$
(14)
$$\mathbf{G}={\int}_{\partial{\varOmega}_{e}}{\mathbf{R}}^{T}\mathbf{L}\,\text{d}\partial\varOmega$$
(15)

The load vector \(\overline{\mathbf{f}}\) is:

$${\overline{\mathbf{f}}}^{T}=\sum_{e}\int_{\partial{\varOmega}_{t}}{\overline{\mathbf{T}}}^{T}\mathbf{L}\,\text{d}\partial\varOmega$$
(16)

According to the stationarity condition of the complementary energy functional, its partial derivatives with respect to the stress parameters must vanish:

$$\frac{\partial{\varPi}_{mc}}{\partial{\beta}_{i}}=0\quad\left(i=1,2,\dots,m\right)$$
(17)

The stress parameters within each element are then obtained as:

$$\varvec{\beta}={\mathbf{H}}^{-1}\mathbf{G}\mathbf{d}$$
(18)

Substituting Eq. (18) into Eq. (13), the modified complementary energy functional becomes:

$${\varPi}_{mc}=\sum_{e}\left(\frac{1}{2}{\left({\mathbf{H}}^{-1}\mathbf{G}\mathbf{d}\right)}^{T}\mathbf{H}\left({\mathbf{H}}^{-1}\mathbf{G}\mathbf{d}\right)-{\left({\mathbf{H}}^{-1}\mathbf{G}\mathbf{d}\right)}^{T}\mathbf{G}\mathbf{d}+{\overline{\mathbf{f}}}^{T}\mathbf{d}\right)$$
(19)

Because \(\mathbf{H}\) is symmetric, this simplifies to:

$${\varPi}_{mc}=\sum_{e}\left(-\frac{1}{2}{\mathbf{d}}^{T}{\mathbf{G}}^{T}{\mathbf{H}}^{-1}\mathbf{G}\mathbf{d}+{\overline{\mathbf{f}}}^{T}\mathbf{d}\right)$$
(20)

According to the stationarity principle of the modified complementary energy, the system of equations for the generalized displacements is obtained from \(\delta{\varPi}_{mc}=0\):

$$\mathbf{K}\mathbf{D}=\overline{\mathbf{f}}$$
(21)

Here, \(\mathbf{D}\) is the global displacement vector, and the stiffness matrix is:

$$\mathbf{K}=\sum_{e}{\mathbf{K}}_{e}=\sum_{e}{\mathbf{G}}^{T}{\mathbf{H}}^{-1}\mathbf{G}$$
(22)
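
The element matrices above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single four-node rectangular element centered at the origin, a five-parameter equilibrium stress field (a common textbook choice for hybrid stress elements), and linear displacement interpolation along each edge. The function names `compliance`, `stress_basis`, and `hybrid_element` are hypothetical.

```python
import numpy as np

def compliance(E, mu):
    """Plane-stress compliance matrix S of Eq. (2)."""
    return np.array([[1 / E, -mu / E, 0.0],
                     [-mu / E, 1 / E, 0.0],
                     [0.0, 0.0, 2 * (1 + mu) / E]])

def stress_basis(x, y):
    """Assumed 5-parameter equilibrium stress field, sigma = P @ beta (Eq. 9)."""
    return np.array([[1.0, 0.0, 0.0, y, 0.0],
                     [0.0, 1.0, 0.0, 0.0, x],
                     [0.0, 0.0, 1.0, 0.0, 0.0]])

GAUSS = np.array([-1.0, 1.0]) / np.sqrt(3.0)  # 2-point Gauss rule, unit weights

def hybrid_element(a, b, E, mu):
    """Return H, G, K for the rectangle [-a, a] x [-b, b] with 4 corner nodes."""
    S = compliance(E, mu)
    nodes = np.array([[-a, -b], [a, -b], [a, b], [-a, b]])  # counter-clockwise

    # H: domain integral of P^T S P (Eq. 14), 2 x 2 Gauss quadrature
    H = np.zeros((5, 5))
    for gx in GAUSS:
        for gy in GAUSS:
            P = stress_basis(a * gx, b * gy)
            H += P.T @ S @ P * (a * b)

    # G: boundary integral of R^T L with R = n P (Eqs. 10 and 15)
    G = np.zeros((5, 8))
    for k in range(4):
        k1 = (k + 1) % 4
        p0, p1 = nodes[k], nodes[k1]
        edge = p1 - p0
        length = np.linalg.norm(edge)
        nx, ny = edge[1] / length, -edge[0] / length  # outward unit normal
        n_mat = np.array([[nx, 0.0, ny],
                          [0.0, ny, nx]])
        for g in GAUSS:
            s = 0.5 * (1.0 + g)          # position along the edge in (0, 1)
            x, y = p0 + s * edge
            R = n_mat @ stress_basis(x, y)
            L = np.zeros((2, 8))         # linear interpolation on this edge
            L[0, 2 * k], L[1, 2 * k + 1] = 1 - s, 1 - s
            L[0, 2 * k1], L[1, 2 * k1 + 1] = s, s
            G += R.T @ L * (0.5 * length)

    K = G.T @ np.linalg.solve(H, G)      # element stiffness, Eq. (22)
    return H, G, K
```

For a single element, the resulting 8 × 8 stiffness matrix is symmetric and has three zero eigenvalues corresponding to the rigid body modes; the stress parameters for given nodal displacements `d` follow from `np.linalg.solve(H, G @ d)`, as in Eq. (18).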

Quadtree method

To estimate the effective properties of porous materials, the corresponding results for composite materials can be used by replacing the inclusions in the composite with pores. The equivalent inclusion method is a popular approach. The direct homogenization method instead obtains the macroscopic effective coefficients by averaging the microscopic stress and strain of the representative volume element (RVE) over its area or volume, and then calculates the equivalent elastic modulus from the averaged stress-strain relationship.
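
As a rough illustration of direct homogenization, the sketch below computes an effective modulus from area-averaged stress. It assumes uniaxial loading in the x-direction, that the stress in each solid cell is already known, that pores carry zero stress but still count toward the RVE area, and that the macroscopic strain is the applied displacement divided by the RVE length. The function name and arguments are hypothetical.

```python
import numpy as np

def effective_modulus(cell_areas, cell_sigma_xx, rve_area, applied_strain):
    """Area-averaged x-stress of the RVE divided by the applied macroscopic strain."""
    cell_areas = np.asarray(cell_areas, dtype=float)
    cell_sigma_xx = np.asarray(cell_sigma_xx, dtype=float)
    # Pores contribute no stress, so only solid cells appear in the sum,
    # but the average is taken over the full RVE area.
    mean_stress = np.sum(cell_areas * cell_sigma_xx) / rve_area
    return mean_stress / applied_strain
```

For a fully solid RVE with uniform stress, this returns the matrix modulus; removing material while keeping `rve_area` fixed lowers the result accordingly.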

From the calculation process of the hybrid stress element, it can be seen that computing the element stiffness matrix is the most time-consuming part. To improve computational efficiency, Wang et al. proposed modeling with the quadtree mesh technique. The quadtree technique is simple and efficient, but it has not been widely used with ordinary displacement-based finite elements. Sukumar and Tabarraei27 proposed a general polygonal element technique with an arbitrary number of edges to satisfy the displacement compatibility condition. With this approach, hybrid stress elements can produce results quickly and accurately.

Fig. 2

Schematic diagram of quadtree grid partitioning process.

Fig. 2 illustrates the quadtree division process. In (a), the two material regions are shown in blue and white. First, it is determined whether the square region contains only one material. If multiple materials are present, the region is divided into four equal parts, as shown in (b). If a region contains only one material, its division stops; for example, the blue region in the bottom right corner of (c) contains a single material and is not divided further. This process is repeated recursively until no region requires further division, at which point the quadtree grid is complete.
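
The division rule described above can be sketched as a short recursive function. This is a minimal illustration rather than the solver used in this work: it assumes a square binary image whose side length is a power of two, and the name `quadtree_leaves` and the `min_size` cutoff are hypothetical.

```python
import numpy as np

def quadtree_leaves(image, x0=0, y0=0, size=None, min_size=1):
    """Return leaf cells (x0, y0, size, material) of a quadtree over a binary image."""
    if size is None:
        size = image.shape[0]
    block = image[y0:y0 + size, x0:x0 + size]
    # Stop when the block holds a single material or the cell cannot shrink further.
    # At the size limit, a mixed cell is labeled with its first pixel (an approximation).
    if block.min() == block.max() or size <= min_size:
        return [(x0, y0, size, int(block.flat[0]))]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_leaves(image, x0 + dx, y0 + dy, half, min_size)
    return leaves
```

A uniform image yields a single leaf, while a 4 × 4 image with one differing corner pixel is split into three 2 × 2 cells and four 1 × 1 cells.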

When calculating the effective elastic properties of porous material models, we consider a plane stress problem and distribute the pore positions randomly. Assuming an isotropic and statistically uniform porous model, we calculate the effective elastic properties of the RVE parallel to the load direction in each calculation. The RVE size we have chosen is small enough at the macro level and large enough at the micro level. For porous material models, our proposed calculation method is applicable to most sizes. In estimating the effective elastic modulus, Farkash et al. reported that the maximum relative deviation between the results of 2D and 3D solutions is about 9%29. Here, we consider the 2D case. One specific calculation condition is shown in Fig. 3. The validation model is a \(2\,\text{mm} \times 2\,\text{mm}\) square matrix containing 9 randomly distributed pores with a diameter of 0.15 mm, accounting for 15.9% of the volume. The displacements of the upper, lower, and left edges of the matrix are constrained to 0 mm, and a displacement load of 0.001 m is applied to the right edge. The elastic modulus of the matrix material is 160 GPa, and the Poisson’s ratio is 0.22.

Fig. 3

Quadtree calculation of porous material model: (a) Microstructure diagram; (b) model working condition diagram; (c) A quadtree diagram of microstructure partitioning; (d) Microstructure calculation result chart.

Miled et al. derived the equivalent elastic modulus based on the mean-field Eshelby homogenization scheme30, while Chung et al. used graph neural networks to predict the effective elastic modulus of rocks from digital CT scan images31. Wang et al. proposed the quadtree grid partitioning technique, which can quickly calculate the mechanical properties of porous material models. The hybrid stress element calculation was compared with the Mori-Tanaka method, and its efficiency and accuracy have been verified12. Compared to ordinary finite elements, the hybrid stress element method constructs elements from high-order stress fields, so fewer elements are needed to capture the stress concentration between two closely spaced holes. In Fig. 3, (a) is the porous material RVE model, (b) is the model working condition diagram, (c) is the model divided by the quadtree algorithm, and (d) is the x-direction stress map (in MPa) under horizontal tension, calculated using the quadtree algorithm. We used this method to quickly obtain the equivalent elastic modulus and Poisson’s ratio of a large number of porous material models, providing data for the subsequent neural network training.

Training and testing of convolutional neural networks

In this section, we briefly explain the basic theory of deep neural networks, construct a suitable convolutional neural network, and train and test this network.

Convolutional Neural Network (CNN)

This section uses convolutional neural networks, which belong to a specific type of artificial neural network20. Convolutional neural networks typically include convolutional, pooling, and fully connected layers. Convolutional neural networks need to consider fewer parameters than other artificial neural network structures. Due to the introduction of convolutional and pooling layers, they have more advantages in image processing. Like other neural networks, convolutional neural networks can also be trained using backpropagation algorithms.

Fig. 4

Convolution operation.

When the input data is an image, it usually contains three-dimensional information: height, width, and channels. Traditional fully connected layers need to flatten the three-dimensional data into one-dimensional data and treat all inputs as neurons of the same dimension, so shape information is often lost. The distinguishing feature of convolutional neural networks is the addition of convolutional layers and kernels. Thanks to the convolutional kernels, convolutional layers can receive image information in three-dimensional form and pass it to the next convolutional layer in the same three-dimensional format. The purpose of convolutional layers is to extract different features of the input, starting with low-level features such as edges and lines. As more convolutional layers are added, more complex features can be built iteratively from the low-level features. As shown in Fig. 4, the convolutional layer multiplies the elements at each position by the corresponding filter elements, sums the products, and finally adds a bias to obtain the output.
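
The multiply, sum, and add-bias operation can be written out directly in NumPy. This is a minimal single-channel sketch without padding and with stride 1; the function name `conv2d_valid` is hypothetical and is not part of any framework.

```python
import numpy as np

def conv2d_valid(image, kernel, bias=0.0):
    """Slide the kernel over the image; at each position multiply, sum, add bias."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel) + bias
    return out
```

For a 4 × 4 input and a 2 × 2 kernel, the output is 3 × 3, since the kernel fits at three positions along each axis.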

Fig. 5

Maximum value pooling operation.

The pooling layer imitates the human visual system to reduce the dimensionality of data and represent images with higher-level features. The pooling layer is usually used after the convolutional layer, mainly for dimensionality reduction of feature maps, increasing the receptive field of subsequent feature maps, improving the scale invariance and rotation invariance of the model, and preventing overfitting of the model. The typical operations of the pooling layer include the following: max pooling, mean pooling, random pooling, median pooling, and combination pooling. This article uses max pooling, as shown in Fig. 5, which takes the maximum value within a particular area. Its advantage is that it can learn the edges and texture structure of the image.
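
Max pooling over non-overlapping windows can be sketched in a few lines of NumPy. The function name `max_pool2d` is hypothetical, and rows or columns that do not fill a complete window are simply dropped in this sketch.

```python
import numpy as np

def max_pool2d(feature_map, window=2):
    """Take the maximum over non-overlapping window x window blocks."""
    h, w = feature_map.shape
    h2, w2 = h // window, w // window
    trimmed = feature_map[:h2 * window, :w2 * window]
    # Split each axis into (block index, position within block) and reduce.
    return trimmed.reshape(h2, window, w2, window).max(axis=(1, 3))
```

Applied to a 4 × 4 feature map with a 2 × 2 window, this halves each dimension and keeps only the largest value of every block.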

The convolutional, activation, and pooling layers can be regarded as the feature extraction layers of the CNN. A flatten layer then converts the feature maps into a one-dimensional vector, and fully connected layers apply the learned features to classification or regression tasks.

Inspired by the above mechanism, we establish a multi-layer convolutional neural network to predict the mechanical properties of porous materials, as shown in the above figure. Lower-level layers extract features of small patches in the slice images, while higher-level layers capture features of the entire slice image. Two fully connected layers are added after the last layer to output the equivalent mechanical properties. In this way, the network establishes an implicit mapping between the microstructure of slices and their equivalent mechanical properties. In the next section, we discuss the construction and training of this convolutional neural network.

The basic training and learning process of CNN

The flowchart of the image-based mechanical performance prediction modeling we constructed is shown in Fig. 6, and the specific steps are as follows.

Step 1: Generate 1000 randomly distributed microstructures.

Step 2: Conduct the mechanical analysis of the microstructure as described earlier.

Step 3: Preprocess the images in the dataset. The most convenient method is binarization.

Step 4: Use the image parameters as the given input x and the mechanical properties as the target label y.

Step 5: Train the network, using forward propagation to compute predictions and backpropagation to optimize the parameters.

Step 6: Test the accuracy of the model.
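
Steps 1 and 3 can be illustrated with a small generator of binarized microstructure images. This is a sketch under assumptions: circular pores that may overlap, a 255 × 255 image, and hypothetical values for the number of pores and the pore radius in pixels. The function names are illustrative only.

```python
import numpy as np

def random_porous_image(size=255, n_pores=9, radius=10, seed=None):
    """Binary image: 1 = matrix material, 0 = pore. Pores may overlap."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:size, 0:size]
    image = np.ones((size, size), dtype=np.uint8)
    for _ in range(n_pores):
        # Keep each pore center at least one radius away from the edges.
        cx, cy = rng.uniform(radius, size - radius, size=2)
        image[(xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2] = 0
    return image

def porosity(image):
    """Fraction of the image area occupied by pores."""
    return 1.0 - float(image.mean())
```

Each generated image would then be paired with the mechanical properties computed for it (Step 2) to form one input-label sample for training (Step 4).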

Fig. 6

Flow chart of image-based mechanical performance prediction modeling.

Elastic modulus is the most essential mechanical parameter of rock materials; it characterizes the resistance of rock to deformation under compression. Rock is essentially a heterogeneous material, and its macroscopic mechanical properties are closely related to its microscopic structure. The elastic modulus of rock can be obtained from the stress-strain curve. The elastic properties of rocks calculated by the finite element method are consistent with experimental results28. The mechanical properties of porous materials are determined by multiple factors, of which the matrix material, porosity, and pore positions are the main ones. We select the reconstructed two-dimensional slices of porous materials as input and consider the influence of pore size and position on the final mechanical properties for the same matrix material.

Training neural networks requires a large dataset, and obtaining it experimentally requires much time and effort. We therefore randomly generate pore structures in the matrix material to quickly produce two-dimensional microstructure models of porous materials, and obtain their mechanical properties with the quadtree algorithm described above. We randomly generated 1000 samples and selected rock as the matrix material, with a Young’s modulus of 16 GPa and a Poisson’s ratio of 0.22 (ref. 24). The models are passed to the quadtree calculation to obtain mechanical parameters such as the equivalent elastic modulus and Poisson’s ratio, which are used as labels for training the convolutional neural network. Figure 7 shows randomly generated porous material images, where yellow represents the matrix material and purple represents the empty pores. The labels below the images give the equivalent elastic modulus obtained using the quadtree algorithm.

Fig. 7

Equivalent elastic modulus information of randomly generated porous images (the images have been binarized).

Fig. 8

Convolutional neural network structure diagram.

We used a CNN model to predict the mechanical properties of porous materials, combining convolution and pooling layers with a final fully connected network. The model was implemented in TensorFlow. The receptive field is an essential consideration in designing CNN models; it is defined as the size of the input region that influences a given element of a feature map. A reasonable receptive field ensures that the model extracts the critical information from the input completely. For example, when recognizing cat and dog images, a small receptive field may only capture local features (such as mouth shape), while a large receptive field can capture global features (such as body shape). A reasonable receptive field can also make the model converge faster. The common way to increase the receptive field is to add more convolutional and pooling layers. However, excessive convolution and pooling operations increase computational overhead, and storing convolutional features also requires a large amount of memory. As shown in Fig. 8, the model uses four convolutional layers and pooling layers to extract fine-scale features from the original microstructure. The convolution kernel size is 5 × 5, the stride is 1, and the activation function is ReLU. Because ReLU is cheap to compute, the model can be trained and evaluated in less time, and it mitigates the vanishing-gradient problem. The pooling layers use max pooling over a 2 × 2 window. After pooling, the fine-scale detail of each pore is blurred, and only the coarse-scale features are retained.
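
To make the receptive field discussion concrete, the following sketch traces the feature map size and receptive field through a stack of layers. It assumes no padding and a pooling stride of 2, neither of which is stated in the text, and takes the 255 × 255 input size used later in this article; the function name `trace_layers` is hypothetical.

```python
def trace_layers(input_size, layers):
    """Return (output size, receptive field) after a list of (kernel, stride) layers."""
    size, rf, jump = input_size, 1, 1
    for kernel, stride in layers:
        size = (size - kernel) // stride + 1   # unpadded output size
        rf += (kernel - 1) * jump              # growth of the receptive field
        jump *= stride                         # input-pixel spacing between outputs
    return size, rf

# Four blocks of 5 x 5 convolution (stride 1) followed by 2 x 2 pooling (stride 2)
layers = [(5, 1), (2, 2)] * 4
print(trace_layers(255, layers))  # (12, 76)
```

Under these assumptions, each element of the final 12 × 12 feature map sees a 76 × 76 pixel region of the input, which is large compared with a single pore but smaller than the whole image; the fully connected layers then combine these regions.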

ReLU is also used as the activation function of the hidden layers. After each forward pass, the difference between the network output and the labeled data is propagated backward through the network, and the parameters are updated by stochastic gradient descent. To prevent overfitting, dropout is applied after the flatten layer, discarding a certain proportion of neurons during training. The training time is longer than that of other algorithms, but the prediction accuracy is higher. The entire dataset is randomly divided into two parts, with 70% of the data used for model training and 30% for model testing. Training terminates when the specified target accuracy is reached.

We use the mean squared error (MSE) as the loss function, where \(N\) is the number of samples, \(f_{i}\) is the predicted value of the i-th sample, and \(y_{i}\) is the actual value of the i-th sample. The network parameters are optimized by backpropagation to minimize the MSE, after which the mechanical properties can be predicted.

$$MSE=\frac{1}{N}\sum_{i=1}^{N}{\left({f}_{i}-{y}_{i}\right)}^{2}$$
(23)
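
Eq. (23) translates directly into NumPy. The function name `mse` is illustrative; deep learning frameworks provide their own equivalents.

```python
import numpy as np

def mse(predicted, actual):
    """Mean squared error of Eq. (23)."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean((predicted - actual) ** 2))
```

For example, predictions of 10 and 12 against actual values of 11 and 14 give squared errors of 1 and 4, so the MSE is 2.5.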

Results and discussion

In this section, we generated 1000 random porous material samples to train the convolutional neural network. Binarizing the slice images of porous materials reduces the complexity of the network. The input slice images are converted to 255 × 255 grayscale images and fed into the network for training. Training was conducted on a desktop computer equipped with an Intel i5-12400 CPU, 32 GB of RAM, and an Nvidia RTX 3060 GPU. Training ran for 1000 iterations and took approximately 10 min. As mentioned earlier, the purpose of training is to determine all weights and biases of the convolutional neural network.

Fig. 9

Loss of different optimization algorithms and learning rate models: (a) Loss graphs of different optimization algorithms; (b) Different learning rate loss graphs.

As shown in Fig. 9(a), we compared the effects of different optimization algorithms, namely stochastic gradient descent (SGD), Adam, and RMSprop, on the test set loss. With SGD, the loss decreased steadily before the 150th epoch and then oscillated repeatedly; the optimizer could not descend quickly where the gradient is small, resulting in unstable and slow convergence. The RMSprop optimizer adjusts the learning rate dynamically using a running average of squared gradients. For example, if the initial learning rate is high and convergence is fast, the effective step size is subsequently attenuated so that the parameters can be fine-tuned toward a stable optimum. However, in this model, the loss is still relatively large after 200 epochs. Adam, proposed in 2015, combines momentum-based SGD with the adaptive learning rate of RMSprop and largely resolves these convergence problems. The loss dropped below 0.5 around the 25th epoch, so the Adam optimizer works best for this network.
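
For reference, the Adam update can be sketched in NumPy as follows. This is the standard textbook form of the update rule with its usual default coefficients, not code taken from this work; the function name `adam_minimize` and the use of a user-supplied gradient function are assumptions for illustration.

```python
import numpy as np

def adam_minimize(grad, x0, lr=0.0003, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a function with the Adam update rule, given its gradient."""
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)  # running mean of gradients (momentum term)
    v = np.zeros_like(x)  # running mean of squared gradients (RMSprop-like term)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x
```

On a simple quadratic such as \((x-3)^{2}\), calling `adam_minimize(lambda x: 2 * (x - 3), [0.0], lr=0.1, steps=2000)` drives `x` close to 3; the much smaller default learning rate mirrors the value adopted in this article and would need far more steps on this toy problem.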

Having selected the Adam optimizer, we compared the training error during training for different learning rates with otherwise identical model parameters, as shown in Fig. 9(b). The model converged well before the 50th epoch but oscillated to varying degrees afterwards. With a learning rate of 0.001, the rate was too large and the model oscillated severely after 100 epochs of training. With a learning rate of 0.0001, convergence was slow, and the loss was still large after 200 epochs. In the end, we adopted a learning rate of 0.0003, which gave good stability and the lowest loss after convergence.

Fig. 10

Simulated and predicted values of the CNN model: (a) Equivalent elastic modulus test set; (b) Poisson’s ratio and equivalent elastic modulus test set.

The predictive ability of the model was tested on 800 training samples and 200 test samples. In Fig. 10(a), blue represents the training set data and red represents the test set data. For the equivalent elastic modulus, most of the data lie between 9.5 GPa and 16 GPa, and the predictions for the training set are more accurate than those for the test set. In Fig. 10(b), the horizontal axis is the equivalent elastic modulus and the vertical axis is Poisson’s ratio, for the 200 test samples. Blue represents the values calculated by the quadtree algorithm, and red represents the values predicted by the CNN model. Most of the data from the two sets are close, which shows that the CNN has good predictive ability for the mechanical properties of porous materials. The mechanical properties predicted through this framework are close to the computed ones, with high accuracy and computational efficiency. Finally, for a more quantitative assessment, the coefficient of determination \(R^{2}\) was used to evaluate the predictions. The \(R^{2}\) of the equivalent modulus reached 0.98, indicating that the model has good predictive ability.
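
The coefficient of determination can be computed from predicted and reference values as sketched below, using its standard definition (one minus the ratio of the residual sum of squares to the total sum of squares). The function name `r2_score` is illustrative.

```python
import numpy as np

def r2_score(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

A perfect prediction gives a value of 1, while always predicting the mean of the reference values gives 0.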

The proposed convolutional neural network architecture reduces computational cost, and the saving depends on the workload of the finite element simulation it replaces. In the case discussed in this article, the finite element model is relatively light: calculating the mechanical properties of a porous material with the commercial finite element software MARC takes about 20 min. The trained network performs the same prediction in about 0.5 s, which is about 2400 times faster. For models involving millions of porous materials, the time saved can be several orders of magnitude greater, because the running time of finite element analysis is much longer, while the proposed network architecture can still predict the equivalent mechanical properties quickly after appropriate training.
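The speed-up quoted above follows directly from the two approximate timings. As a short worked check, with \(t_{\text{FEM}}\) and \(t_{\text{CNN}}\) introduced here only as labels for the finite element and network run times:

```latex
\[
\frac{t_{\text{FEM}}}{t_{\text{CNN}}}
\approx \frac{20\ \text{min} \times 60\ \text{s/min}}{0.5\ \text{s}}
= \frac{1200\ \text{s}}{0.5\ \text{s}}
= 2400
\]
```

Because both timings are approximate, the ratio should be read as an order-of-magnitude estimate rather than an exact figure.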

Fig. 11

Prediction of mechanical properties using randomly generated microstructures.

We further evaluated the model’s generalization ability on randomly generated non-circular, irregular pore microstructures, which better simulate the pores in rocks. In Fig. 11, yellow represents the matrix material and purple represents empty pores. Below each microstructure, the first row label gives the equivalent elastic modulus predicted by the model, the second row label gives the equivalent elastic modulus calculated by the quadtree algorithm, and the remaining row label gives the percentage error. The model shows a certain extrapolation ability: for these randomly generated non-circular, irregular pore microstructures, the maximum prediction error is 3.6%.
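The percentage error reported for each microstructure can be computed as sketched below. We assume the error is taken relative to the quadtree result, which the text does not state explicitly; the function names and the sample moduli are hypothetical and are not the values shown in Fig. 11.

```python
def percent_error(predicted, reference):
    """Relative error of a prediction with respect to a reference value, in percent."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(predicted - reference) / abs(reference) * 100.0

def max_percent_error(predicted_values, reference_values):
    """Largest percentage error over paired predictions and references."""
    if len(predicted_values) != len(reference_values) or not predicted_values:
        raise ValueError("inputs must be non-empty and of equal length")
    return max(percent_error(p, r)
               for p, r in zip(predicted_values, reference_values))

# Made-up moduli in GPa, for illustration only (not the values in Fig. 11).
cnn_moduli = [12.3, 10.1, 14.0]        # assumed network predictions
quadtree_moduli = [12.0, 10.0, 14.0]   # assumed quadtree reference values
print(f"maximum error: {max_percent_error(cnn_moduli, quadtree_moduli):.2f}%")
```

Reporting the maximum rather than the mean gives a worst-case bound over the sampled microstructures, which is the quantity quoted in the text.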

Conclusions

In practical engineering, the mechanical properties of porous materials are crucial. Deep-learning neural networks can quickly predict the equivalent mechanical properties of porous materials. In this work, we propose a framework for predicting the mechanical properties of porous materials, and the main conclusions are as follows:

  1. Based on hybrid stress elements, a large dataset of mechanical properties for porous material models was calculated quickly using the quadtree algorithm. A neural network model of the mechanical properties of porous materials was then obtained that accounts for multiple factors such as porosity, pore size, and pore position distribution. The model takes two-dimensional slices of porous material models as input and returns the corresponding mechanical properties as output.

  2. To improve prediction accuracy, separate models were trained for the two mechanical properties, the elastic modulus and Poisson’s ratio, and used for the final prediction analysis. The effectiveness of the models was verified on the test set. Because the model takes two-dimensional porous material slices directly as input, it overcomes two limitations of traditional approaches: empirical formulas that do not consider pore positions, and long experimental periods. The method predicts the mechanical properties of porous materials quickly and accurately.

  3. After comparing different optimization algorithms and learning rates, the Adam optimizer was selected with a learning rate of 0.0003. The accuracy and extrapolation ability of the model were verified, and its effectiveness was validated through \(R^{2}\). The \(R^{2}\) of the model for the equivalent elastic modulus was 0.98, and for randomly generated non-circular, irregular pore microstructures the maximum prediction error was 3.6%.

In this study, we focused on the equivalent elastic modulus and Poisson’s ratio, but the method can be extended to predict the overall response of elastic-plastic materials. Its rapid predictions can provide reference information on porous materials and can be combined with experimental data, empirical formulas, and other prediction methods to increase the accuracy of mechanical property prediction for porous materials. By modifying the input data, a similar method driven by experimental data can be designed, in which micrographs of the microstructure and the forces measured during mechanical testing are used to train the network, rather than computer-generated microstructures and the corresponding finite element analysis. When slender holes are present in only one direction, the mechanical behavior may not be isotropic, and we will investigate such cases further. Similar methods can also be designed to explore data-driven prediction of the response of porous materials under pore-dominated performance degradation, such as predicting mechanical parameters under creep, matrix cracking, chemical aging, and other conditions. We leave these topics for future research.