Introduction

Concrete is one of the most widely used construction materials and plays an important role in infrastructure development1,2. Randomly and evenly dispersed steel fibers can effectively hinder the development of microcracks and delay the formation of macrocracks, thereby increasing the material’s stability and toughness3. Over their service life, concrete structures are subjected not only to a variety of normal loads but also to accidental fires arising from sources such as fuel and vapour explosions. Building fires occur from time to time, and fire has always been one of the most serious disaster threats to building structures4,5. Concrete exposed to fire or high temperature undergoes a series of internal physical and chemical changes that degrade its mechanical properties, which can shorten the service life of the building and endanger its occupants. It is therefore of great importance to establish a constitutive identification model for high-temperature damage of steel fiber concrete that can accurately predict its thermodynamic response.

A growing number of scholars have studied damage constitutive models of concrete. Conventional constitutive models can be loosely divided into two categories. The first is based on the theory of composite reinforcement and microscopic damage mechanics. For example, Hameed et al.6 suggested a damage constitutive model for steel fiber reinforced concrete (SFRC), based on the damage behaviour law of plain and fiber reinforced concrete and considering the fiber-matrix bond damage of sliding fibers. Wang et al.7 used 4D CT in-situ experiments to study the fine-scale structural changes and mechanical behaviour of steel fiber concrete during cyclic uniaxial compression, and proposed a revised stiffness degradation damage constitutive model that takes into account the orientation distribution of the steel fibers and the characteristics of the interfacial transition zone (ITZ). Yu et al.8 suggested a model for the fracture behaviour of SFRC that uses a discrete-continuous coupled finite element method to follow the microscopic cracks accurately during bending, taking into account key factors such as the spatial positioning of coarse aggregate and steel fibers, fiber content, length and diameter, and the bonding properties at the ITZ between fiber and mortar. The second category comprises phenomenological constitutive models, which are based on the macromechanical response of SFRC and on fitting and correlation analysis of experimental data. For example, Bi et al.9 studied the influence of the steel fiber volume fraction and the matrix strength on the mechanical characteristics of concrete, improved the HJC model and proposed a constitutive model suitable for SFRC. Chen et al.10 conducted compression and splitting tensile tests on SFRC with various admixtures, discussed the strain hardening effect and its relationship to the energy dissipation of the specimens, and proposed a constitutive model for SFRC that takes damage evolution into account. Wen et al.11 conducted compression experiments on concrete at different temperatures (20 ℃, 200 ℃ and 400 ℃) in order to provide a form of damage evolution equation applicable to concrete under high-temperature conditions.

These studies of traditional constitutive models provide a valuable reference for the design and application of steel fiber concrete structures, and the resulting models are mathematical equations expressed as explicit functions. However, the factors that affect the behaviour of steel fiber concrete (e.g. temperature, strain and damage) are often coupled, which makes it difficult to determine the material parameters accurately, and some parameters (e.g. damage) cannot be determined directly at all. In addition, different regression forms (exponential fit, polynomial fit, etc.) yield different mathematical expressions for the same data. If too many parameters are included, the equations become too complex to solve and lack practicality. If too few are included, the accuracy of the model drops sharply: it cannot fully reproduce the experimental data and its repeatability is poor. Consequently, no constitutive model is currently available that is easy to apply in engineering and that accurately and comprehensively characterises the factors influencing concrete performance and their coupled effects.

With the development of computers, machine learning methods have gradually been introduced into civil engineering materials research because of their strong information processing capabilities, but most of the related work has focused on predicting concrete strength12,13,14,15 and on image-based damage identification16,17,18. For example, a parametric study of image-based crack identification for orthotropic steel bridge decks, using captured images with complex backgrounds, showed that the trained ACDN model can identify fatigue cracks and that the accuracy of crack detection improves when the training parameters are optimised16. Ghaboussi et al.19 were the first to use neural network theory to construct constitutive models of materials. Since then, this type of model has been applied by many scholars to soil, concrete and other materials20,21,22. For example, Wang et al.23 identified the constitutive response and damage evolution laws of polymers through various input-output modes based on SHPB test data. Xu et al.24 investigated polymer constitutive models with and without damage evolution using a new approach that combines experimental techniques with a back-propagation (BP) neural network procedure. Ning et al.25,26 proposed an Artificial Neural Network (ANN) model based on the BP method for predicting concrete damage behaviour; a comparison of prediction accuracy shows that the ANN model performs better than the commonly used empirical equations. Zaidi et al.27 applied the Levenberg-Marquardt (LM) algorithm to construct an ANN that predicts the residual stress-strain curves of plain and fiber concrete under axial compression after fire. The predicted stress-strain curves agree well with the experimental curves.
Neural networks have advantages over traditional analysis methods in terms of the accuracy, complexity and stability of the material constitutive relationships they can represent. A constitutive model determined by a neural network does not depend on the failure mechanism, which is not yet well understood, and requires no complex parameter analysis: the network learns from the given sample data and extracts the underlying rules directly, which ensures accuracy while saving considerable time. At present, constitutive identification models for concrete mostly use BP neural networks. Although the BP network is one of the earliest and most widely used neural networks, it has limitations such as low identification accuracy and poor generalisation ability. Recurrent Neural Networks (RNN) retain information about previous states when processing sequential data, which gives them a ‘memory’ function, and they have demonstrated great performance in forecasting the behaviour of construction materials28. Long Short-Term Memory (LSTM), a variant of the RNN, is time sensitive and introduces a gating mechanism that overcomes the RNN’s difficulty with long-term dependencies and can increase prediction accuracy. Recently, neural networks have been applied in many fields, such as earthquake risk prediction, image processing, medicine and finance, with encouraging results. For instance, Xu et al.29 suggested an LSTM neural network for predicting the seismic response of nonlinear structures for records of arbitrary length and sampling rate. Their findings show that the LSTM model reproduces the global and specific characteristics of the time histories of four kinds of structural response datasets, with good accuracy and generalisability. However, the LSTM has numerous hyperparameters, such as the neuron count, learning rate and thresholds. These hyperparameters strongly influence the prediction results, and selecting them by experience alone is time-consuming. Zhang et al.30 used the whale optimisation algorithm (WOA) to optimise the number of hidden neurons, the time step and the batch size of the LSTM, constructing a WOA-LSTM model to predict gas inflow. Their results indicate that the WOA-LSTM model outperforms the LSTM, RNN and BP neural network models.

In summary, few studies have applied machine learning to damage constitutive models of steel fiber concrete at high temperature, and most of those focus on single models such as BP neural networks. From the viewpoint of damage accumulation, concrete is a ‘memory’ material: earlier loads affect the response to subsequent loads, and the stress-strain relationship is nonlinear, whereas traditional neural networks are inefficient at handling time series data. LSTM can effectively predict data with nonlinear and temporal characteristics31,32. However, its prediction performance depends strongly on the choice of parameter values, and using an intelligent algorithm to optimise these parameters can increase prediction accuracy33. When searching for the best neural network parameters, traditional meta-heuristic optimisation techniques are prone to becoming trapped in local optima.

To address these challenges, this study proposes a Long Short-Term Memory (LSTM) neural network optimised by an Improved Whale Optimisation Algorithm (IWOA) as a constitutive identification model for steel fiber concrete at high temperatures. Building on the principle of the whale algorithm, a Laplace crossover operator strategy, an optimal neighbourhood perturbation strategy, an adaptive weighting strategy and a variable spiral position updating strategy are used to overcome its original shortcomings. To verify the optimisation ability of IWOA, it is compared with the WOA, CPO, BOA and GWO algorithms on fifteen benchmark test functions from the CEC2005 dataset; the IWOA algorithm shows high accuracy and fast convergence. LSTM, WOA-LSTM and IWOA-LSTM models were then established. The error analysis shows that the IWOA-LSTM model gives the most accurate predictions, with a prediction error below 0.4 both with and without consideration of damage. Using the IWOA-LSTM constitutive identification model, the damage evolution curves of SFRC at various temperatures were obtained and damage was decoupled from plastic strain, which confirms the validity of the IWOA-LSTM high-temperature constitutive identification model for steel fiber concrete established in this paper. This paper considers only the high-temperature constitutive identification of concrete under quasi-static conditions; identification under impact loading is outside the scope of this study.

Materials and test

The specimens are cylinders measuring φ50 mm × 100 mm. The cement is slag cement with a compressive strength of 42.5 MPa; the coarse aggregate is crushed stone of 5–10 mm; the fine aggregate is sand with a fineness modulus of 2.3; and the steel fibers are ultra-fine, ultra-short fibers with a copper-plated surface, 6 mm long and 0.175 mm in diameter. The static uniaxial compression tests were performed on an MTS810 electro-hydraulic servo testing machine fitted with a high-temperature heater to meet the specimen heating requirements. The design flow of the experiment is shown in Fig. 1. A complete description of specimen preparation and curing, the heating procedure, the experimental set-up and the test procedure is given in Li et al.34.

Fig. 1

Flow chart of experimental design.

During heating, the colour of the specimen changes as the temperature rises. Up to 200 ℃ the appearance changes little; at 400 ℃ the specimen turns light yellow; at 520 ℃ the yellowing deepens and the edges turn grey-white. The damage state of steel fiber concrete under high-temperature conditions (for specimens with steel fiber contents Vf = 0.5% and Vf = 1.5%) is shown in Fig. 2.

Fig. 2

Damage morphology of SFRC at different temperatures.

Principle of IWOA algorithm and simulation experiments

Whale optimisation algorithm (WOA)

The Whale Optimisation Algorithm (WOA) is a meta-heuristic optimisation algorithm developed by Mirjalili et al.35. The algorithm imitates three main feeding behaviours of whales: encircling prey, bubble-net attacking and searching for prey. Whales first search for and gradually gather information about their prey, then encircle it and update their positions along a spiral until they reach it; in optimisation terms, they find an optimal solution to the problem. The flowchart of the WOA algorithm is shown in Fig. 3.

Encircling prey

While feeding, whales can communicate with each other to locate and surround their prey. Because the global optimal position in the search space is unknown, the position of the whale closest to the prey is taken as the current approximate optimum, and the other whales update their positions by closing in on it. This encircling strategy lets the algorithm search a smaller region, improving its search efficiency. The mathematical model for the encircling search is:

$$\:\begin{array}{c}D=\left|C{\cdot\:X}^{*}\left(t\right)-X\left(t\right)\right|\end{array}$$
(1)
$$\:\begin{array}{c}X\left(t+1\right)={X}^{*}\left(t\right)-A\cdot\:D\end{array}$$
(2)

where t is the iteration number, \(X^{*}(t)\) is the position vector of the best whale found so far, and \(X(t)\) is the position vector of each remaining whale. A and C are coefficient vectors, and the position of \(X(t)\) relative to the optimal solution is controlled by adjusting their values. When |A| ≥ 1, the algorithm widens its scope and performs a global search; when |A| < 1, it narrows its scope and performs a local search. The coefficient vectors A and C are given by:

$$\:\begin{array}{c}A=2a\cdot\:r-a\end{array}$$
(3)
$$\:\begin{array}{c}C=2\cdot\:r\end{array}$$
(4)

where \(r\in[0,1]\) is a random vector, and a is a control parameter that decreases linearly from 2 to 0 over the iterations.
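
As a minimal sketch of Eqs. (1)-(4), the encircling update can be written in plain Python. The function names `control_parameter` and `encircle_prey`, the per-dimension sampling of r and the fixed seed are assumptions made for illustration; they are not taken from the paper.

```python
import random


def control_parameter(t, t_max):
    """Control parameter a: decreases linearly from 2 (t = 0) to 0 (t = t_max)."""
    return 2.0 * (1.0 - t / t_max)


def encircle_prey(x, x_best, a, rng=random):
    """One encircling update of a whale at x towards the best whale x_best."""
    new_x = []
    for x_j, best_j in zip(x, x_best):
        A = 2.0 * a * rng.random() - a      # Eq. (3): A lies between -a and a
        C = 2.0 * rng.random()              # Eq. (4): C lies between 0 and 2
        D = abs(C * best_j - x_j)           # Eq. (1)
        new_x.append(best_j - A * D)        # Eq. (2)
    return new_x


rng = random.Random(0)  # fixed seed (an assumption, for repeatability only)
whale = [4.0, -3.0]
best = [1.0, 1.0]
# Late in the run (t = 400 of 500) a is small, so the whale lands close to the best.
moved = encircle_prey(whale, best, control_parameter(t=400, t_max=500), rng)
```

With a = 0 the update returns the best position exactly, which is the limiting behaviour of the shrinking search at the final iteration.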

Bubble-net attack

During feeding, whales circle upward around their prey, releasing bubbles that form a net closing in on it. In WOA, a shrinking encircling mechanism and a spiral position update mechanism are designed to describe this bubble-net hunting behaviour. This pattern of behaviour helps to prevent the algorithm from getting stuck in local optima and allows it to search for optimal solutions more quickly. The shrinking encircling mechanism is realised by decreasing the value of a in Eq. (3), and the mathematical model for the spiral update of individual positions is as follows:

$$X(t+1)=\begin{cases}X^{*}(t)-A\cdot D, & p<0.5\\ X^{*}(t)+D\cdot e^{bl}\cdot \cos(2\pi l), & p\ge 0.5\end{cases}$$
(5)

where \(l\) is a random number in [−1,1]; \(p\in[0,1]\) is a random number that selects which position update is used; and b is a constant that determines the shape of the logarithmic spiral.
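
The two branches of Eq. (5) can be sketched as follows. The function name `bubble_net_update`, the default b = 1 and the per-dimension sampling of the random coefficients are assumptions for illustration. The sketch reuses D from Eq. (1) in the spiral branch, as the text does; the original WOA formulation uses the plain distance |X*(t) − X(t)| there.

```python
import math
import random


def bubble_net_update(x, x_best, a, b=1.0, rng=random):
    """Position update of Eq. (5): shrink towards x_best or spiral around it."""
    p = rng.random()                 # selects the branch, p in [0, 1]
    l = rng.uniform(-1.0, 1.0)       # spiral parameter, l in [-1, 1]
    new_x = []
    for x_j, best_j in zip(x, x_best):
        C = 2.0 * rng.random()
        D = abs(C * best_j - x_j)    # Eq. (1), used in both branches as in the text
        if p < 0.5:
            A = 2.0 * a * rng.random() - a
            new_x.append(best_j - A * D)                 # shrinking encirclement
        else:
            spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
            new_x.append(best_j + D * spiral)            # logarithmic spiral
    return new_x


new_position = bubble_net_update([2.0, -1.0], [1.0, 0.5], a=1.0, rng=random.Random(3))
```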

Search for prey

The algorithm uses a global search strategy to simulate a whale searching for prey, which lets it escape local optima and explore a wider solution space. Taking a randomly selected whale at position \(X^{\prime}(t)\) as the reference, the update equations for searching for prey are:

$$\:\begin{array}{c}X\left(t+1\right)={X}^{{\prime\:}}\left(t\right)-A\cdot\:D\end{array}$$
(6)
$$\:\begin{array}{c}D=\left|C\cdot\:{X}^{{\prime\:}}\left(t\right)-X\left(t\right)\right|\end{array}$$
(7)

where \(X^{\prime}(t)\) denotes the position vector of a randomly selected whale.
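
Putting the three behaviours together, a compact WOA loop might look like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the sphere objective, the function name `woa_minimise`, the scalar A and C per whale, b = 1, the fixed seed and the clipping to the search bounds are all choices made here.

```python
import math
import random


def sphere(x):
    """Test objective (an assumption for this sketch): minimum 0 at the origin."""
    return sum(v * v for v in x)


def woa_minimise(f, dim, lower, upper, n_whales=30, t_max=500, b=1.0, seed=0):
    rng = random.Random(seed)
    whales = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_whales)]
    best = list(min(whales, key=f))
    best_val = f(best)
    for t in range(t_max):
        a = 2.0 * (1.0 - t / t_max)              # decreases linearly from 2 to 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a       # Eq. (3), scalar per whale here
            C = 2.0 * rng.random()               # Eq. (4)
            p = rng.random()
            l = rng.uniform(-1.0, 1.0)
            if p < 0.5:
                # |A| >= 1: explore around a random whale, Eqs. (6)-(7);
                # |A| < 1: exploit around the best whale, Eqs. (1)-(2).
                ref = rng.choice(whales) if abs(A) >= 1.0 else best
                new = [r_j - A * abs(C * r_j - x_j) for r_j, x_j in zip(ref, x)]
            else:
                # Spiral branch of Eq. (5).
                s = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [b_j + abs(C * b_j - x_j) * s for b_j, x_j in zip(best, x)]
            # Clip to the search bounds (a common convention, not stated in the text).
            whales[i] = [min(max(v, lower), upper) for v in new]
        for x in whales:
            val = f(x)
            if val < best_val:                   # keep the best position found so far
                best, best_val = list(x), val
    return best, best_val


best, best_val = woa_minimise(sphere, dim=5, lower=-10.0, upper=10.0, t_max=200)
```

Because the best position is only replaced when a strictly better one is found, `best_val` can never be worse than the best value in the initial population.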

Fig. 3

Flowchart of the WOA algorithm.

Improvement of Whale search algorithm

The WOA algorithm tends to fall into local optima, and its simple parameter tuning makes it difficult to balance global exploration against local exploitation. This paper therefore presents an improved WOA algorithm with the following four improvements: (1) a Laplace crossover operator is introduced to increase population diversity, remedy slow convergence and improve optimisation accuracy; (2) an optimal neighbourhood perturbation strategy is introduced to improve the algorithm’s capacity to escape local optima and avoid premature convergence; (3) an adaptive weighting strategy gradually strengthens the influence of the optimal position and speeds up convergence to some extent; (4) a variable spiral search is introduced to expand the whales’ capacity to explore unexplored regions, improving the algorithm’s global search capability. The flowchart of the IWOA algorithm is shown in Fig. 4.

Laplace crossover operator strategy

The Laplace crossover operator, proposed by Deep et al.36, improves global search and helps the algorithm escape local extrema by performing a dynamic crossover on the optimal whale positions obtained in each iteration. In early iterations it generates offspring farther from the parents; in late iterations it produces offspring closer to the parents, narrowing the search range and improving solution accuracy. The Laplace density function and the crossover formulae are as follows:

$$\:\begin{array}{c}f\left(x\right)=\frac{1}{2b}{e}^{-\frac{\left|x-a\right|}{b}},-\infty\:<x<+\infty\:\end{array}$$
(8)
$$\:\begin{array}{c}{U}_{1i}={x}_{1i}+\beta\:\left|{x}_{1i}-{x}_{2i}\right|\end{array}$$
(9)
$$\:\begin{array}{c}{U}_{2i}={x}_{2i}+\beta\:\left|{x}_{1i}-{x}_{2i}\right|\end{array}$$
(10)

where \(U_{1i}\) and \(U_{2i}\) are the positions of the offspring generated by the Laplace crossover; \(a\in R\) is the location parameter; b is the scale parameter; \(x_{1i}\) and \(x_{2i}\) are the positions of the two individuals with the highest fitness in the solution space; and \(\beta\) is a random number drawn from the Laplace distribution. The improved whale optimisation algorithm position update formula is:

$$X(t+1)=\begin{cases}X^{*}(t)+\left(X^{*}(t)-X(t)\right)\cdot \exp\left(-\left|r_{1}\right|\right)+(1-r_{2})\cdot\left(\bar{X}(t)-X(t)\right), & t<\frac{t_{max}}{2}\\ X^{*}(t)+\left(X^{*}(t)-X(t)\right)\cdot \frac{\exp\left(-\left|r_{1}\right|\right)}{2}+(1-r_{2})\cdot\left(\bar{X}(t)-X(t)\right), & t\ge \frac{t_{max}}{2}\end{cases}$$
(11)

where \(t_{max}\) is the maximum number of iterations, \(r_{1}, r_{2}\in[0,1]\) are uniformly distributed random numbers, and \(\bar{X}(t)\) is the mean position of all whales.
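
The crossover of Eqs. (8)-(10) and the update of Eq. (11) can be sketched as below. The sketch draws β by standard inverse-transform sampling of the Laplace density, which may differ from the sampling rule in ref. 36; the function names, the defaults a = 0 and b = 1, and the use of a single β per parent pair are assumptions made for illustration.

```python
import math
import random


def laplace_random(a=0.0, b=1.0, rng=random):
    """Draw beta from the Laplace density of Eq. (8) by inverse-transform sampling."""
    u = rng.random()
    while u == 0.0:                  # avoid log(0)
        u = rng.random()
    if u < 0.5:
        return a + b * math.log(2.0 * u)
    return a - b * math.log(2.0 * (1.0 - u))


def laplace_crossover(x1, x2, a=0.0, b=1.0, rng=random):
    """Offspring of Eqs. (9)-(10) from the two fittest parents x1 and x2."""
    beta = laplace_random(a, b, rng)     # one beta per parent pair (assumption)
    u1 = [p + beta * abs(p - q) for p, q in zip(x1, x2)]
    u2 = [q + beta * abs(p - q) for p, q in zip(x1, x2)]
    return u1, u2


def crossover_position_update(x, x_best, x_mean, t, t_max, rng=random):
    """Position update of Eq. (11); the pull towards x_best halves after t_max / 2."""
    r1, r2 = rng.random(), rng.random()
    scale = math.exp(-abs(r1))
    if t >= t_max / 2:
        scale /= 2.0
    return [b_j + (b_j - x_j) * scale + (1.0 - r2) * (m_j - x_j)
            for x_j, b_j, m_j in zip(x, x_best, x_mean)]


child1, child2 = laplace_crossover([3.0, 0.0], [1.0, 4.0], rng=random.Random(2))
```

Both offspring are shifted by the same amount β|x1 − x2|, so the spread between them equals the spread between the parents; identical parents reproduce themselves.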

Optimal neighbourhood perturbation strategy

In the iterative process of the WOA algorithm, the optimal position is updated only when a better position is found. Such updates become infrequent, which reduces the accuracy of the algorithm. The optimal neighbourhood perturbation strategy therefore performs a random search near the current optimal whale position before the search for the global optimum continues, which helps the algorithm jump out of local optima.

The neighbourhood perturbation formula is given below, where \(\tilde{X}(t)\) is the newly generated position. If the generated position is better than the original, it replaces the global optimum; otherwise the optimum remains unchanged.

$$\tilde{X}(t)=\begin{cases}X^{*}(t)+0.5\cdot randn\cdot X^{*}(t), & p_{1}<0.5\\ X^{*}(t), & p_{1}\ge 0.5\end{cases}$$
(12)

where \(randn\) is a normally distributed random number, \(p_{1}\) is a uniformly distributed random number in [0,1], and \(\tilde{X}(t)\) is the newly generated position.

A greedy strategy is applied to the generated neighbourhood locations to decide whether to retain them or not, using the following formula:

$$X^{*}(t)=\begin{cases}\tilde{X}(t), & f\left(\tilde{X}(t)\right)\le f\left(X^{*}(t)\right)\\ X^{*}(t), & f\left(X^{*}(t)\right)< f\left(\tilde{X}(t)\right)\end{cases}$$
(13)

where \(f(x)\) is the fitness value of position \(x\).
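
A minimal sketch of the perturbation of Eq. (12) followed by the greedy selection of Eq. (13), for a minimisation problem. The function name `perturb_and_select`, the use of a standard normal draw for randn and the independent draw per dimension are assumptions for illustration.

```python
import random


def perturb_and_select(x_best, f, rng=random):
    """Perturb the best position (Eq. 12) and keep the better of the two (Eq. 13)."""
    if rng.random() < 0.5:                       # p1 < 0.5: random neighbour
        candidate = [v + 0.5 * rng.gauss(0.0, 1.0) * v for v in x_best]
    else:                                        # p1 >= 0.5: position unchanged
        candidate = list(x_best)
    # Greedy selection for minimisation: accept the candidate only if it is not worse.
    return candidate if f(candidate) <= f(x_best) else list(x_best)


def squared_norm(x):
    """Illustrative fitness function (an assumption): smaller is better."""
    return sum(v * v for v in x)


improved = perturb_and_select([1.0, -2.0], squared_norm, random.Random(5))
```

Because of the greedy step, the returned position never has a worse fitness than the position passed in.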

Adaptive weighting strategy

To balance the global and local search capabilities of the whale optimisation algorithm, an adaptive weight w that varies with the iteration number is added to the position update in order to gradually increase the influence of the optimal position. Larger weights give the algorithm a stronger global optimisation capability, which favours global search; smaller weights give it a stronger local optimisation capability, which allows fast convergence and guarantees the accuracy of the results. The adaptive inertia weight, a function of the iteration number t, is chosen as follows:

$$w=\begin{cases}0.2\cos\left(\frac{\pi}{2}\cdot\left(1-\frac{t}{t_{max}}\right)\right), & p_{2}>0.5\\ 0.2\sin\left(\frac{\pi}{2}\cdot\left(1-\frac{t}{t_{max}}\right)\right), & p_{2}\le 0.5\end{cases}$$
(14)

Dynamically adjusting w keeps each whale closely tied to the optimal whale position at every stage of the search, while ensuring that the rest of the population converges to the optimal position as quickly as possible. The improved whale optimisation algorithm position update formula is:

$$X(t+1)=\begin{cases}w\cdot X^{*}(t)-A\cdot\left|C\cdot X^{*}(t)-X(t)\right|, & p<0.5\\ w\cdot X^{*}(t)+D\cdot e^{bl}\cos(2\pi l), & p\ge 0.5\end{cases}$$
(15)
$$\:\begin{array}{c}X\left(t+1\right)=w\cdot\:{X}^{{\prime\:}}\left(t\right)-A\cdot\:\left|C\cdot\:{X}^{{\prime\:}}\left(t\right)-X\left(t\right)\right|\end{array}$$
(16)
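
The weight of Eq. (14) and the weighted updates of Eqs. (15)-(16) can be sketched as follows. The text does not define p2, so the sketch assumes it is a uniform random number in [0,1]; the function names and the per-dimension sampling of A and C are also assumptions.

```python
import math
import random


def adaptive_weight(t, t_max, rng=random):
    """Adaptive weight w of Eq. (14); the result always lies in [0, 0.2]."""
    phase = (math.pi / 2.0) * (1.0 - t / t_max)
    p2 = rng.random()      # assumed uniform in [0, 1]; the text does not define p2
    return 0.2 * math.cos(phase) if p2 > 0.5 else 0.2 * math.sin(phase)


def weighted_encircle(x, x_ref, w, a, rng=random):
    """First branch of Eq. (15) when x_ref is the best whale,
    or Eq. (16) when x_ref is a randomly selected whale."""
    new_x = []
    for x_j, ref_j in zip(x, x_ref):
        A = 2.0 * a * rng.random() - a
        C = 2.0 * rng.random()
        new_x.append(w * ref_j - A * abs(C * ref_j - x_j))
    return new_x


rng = random.Random(7)
w = adaptive_weight(t=100, t_max=500, rng=rng)
step = weighted_encircle([1.0, 3.0], [2.0, -1.0], w, a=1.6, rng=rng)
```

The cosine branch grows from 0 to 0.2 over the run while the sine branch shrinks from 0.2 to 0, so the random choice between them mixes the two trends at every iteration.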

Variable spiral position update strategy

When hunting, whales adjust the distance moved in each position update according to the shape of the spiral connecting the prey’s location to their own. In Eq. (5), b is generally set as a constant, which makes the whales’ spiral movement too uniform and makes the algorithm prone to falling into local optima. Following the idea of variable spiral search, this paper therefore makes b a parameter that varies with the iteration number. Combined with the adaptive weight, the new spiral position update formula is:

$$\left\{\begin{array}{l}X(t+1)=w(t)X^{*}(t)+bD\cdot e^{l}\cos(2\pi l)\\ b=e^{5\cdot \cos\left(\pi\cdot\left(1-\frac{t}{t_{max}}\right)\right)}\end{array}\right.$$
(17)
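
To make the schedule of b concrete, the sketch below evaluates Eq. (17) as it is written in the text. The function names are hypothetical, and D is taken from Eq. (1) with C = 2r, which is an assumption about which distance the equation intends.

```python
import math
import random


def spiral_coefficient(t, t_max):
    """Iteration-dependent b of Eq. (17): grows from e**-5 at t = 0 to e**5 at t = t_max."""
    return math.exp(5.0 * math.cos(math.pi * (1.0 - t / t_max)))


def variable_spiral_update(x, x_best, w, t, t_max, rng=random):
    """Spiral update of Eq. (17), transcribed as written in the text."""
    b = spiral_coefficient(t, t_max)
    l = rng.uniform(-1.0, 1.0)
    spiral = math.exp(l) * math.cos(2.0 * math.pi * l)
    new_x = []
    for x_j, best_j in zip(x, x_best):
        D = abs(2.0 * rng.random() * best_j - x_j)   # Eq. (1) with C = 2r
        new_x.append(w * best_j + b * D * spiral)
    return new_x


schedule = [spiral_coefficient(t, 500) for t in (0, 250, 500)]
updated = variable_spiral_update([1.0, 2.0], [0.5, 0.5], w=0.1, t=50, t_max=500,
                                 rng=random.Random(11))
```

The coefficient is close to 1 at the midpoint of the run, so the spiral term is strongly damped early on and strongly amplified late.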
Fig. 4

Flowchart of IWOA.

Simulation experiments

To verify the optimisation performance of the IWOA algorithm, 15 benchmark test functions were selected from CEC2005. The single-peak functions f1-f5 have one global minimum and no local minima and are frequently used to verify the convergence of an algorithm; they are listed in Table 1. The basic multi-peak functions f6, f7 and f9-f11 have multiple local optima and are used to assess how well the algorithm balances global exploration of the decision space with local exploitation; they are listed in Table 2. The extended multi-peak and hybrid composite functions f13, f15 and f21-f23 are more complex, with multiple global and local optima, and are used to check performance on hybrid optimisation problems, as shown in Tables 3 and 4.

Table 1 Functions for the single-peak problem.
Table 2 Basic multi-peak problem functions.
Table 3 Extended multi-peak and hybrid composite problem functions.
Table 4 Parameter combinations.

Analysis of test results

Algorithm iteration process

The Crested Porcupine Optimizer (CPO), with its unique perturbation strategy, is competitive on complex multi-peak problems. The Butterfly Optimization Algorithm (BOA) is efficient on high-dimensional nonlinear problems. The Grey Wolf Optimization Algorithm (GWO) has fast convergence, simple parameters and stable performance in multi-domain optimization problems. The Whale Optimization Algorithm (WOA) is the algorithm from which IWOA is derived and is widely used in engineering optimization problems. Therefore, to analyse the performance of IWOA in terms of convergence speed, stability and optimization accuracy, WOA, CPO, BOA, GWO and IWOA are compared on the test functions. To ensure a fair comparison, all algorithms use identical experimental parameters: population size N = 30 and maximum number of iterations T = 500, with 30 independent runs per algorithm. The results of the five algorithms are illustrated in Fig. 5.

Fig. 5

Convergence curve of the test function.

The convergence curves in Fig. 5(a-e) show that the improved whale optimisation algorithm converges faster than the WOA, CPO, BOA and GWO algorithms, completing convergence between 250 and 450 iterations. In terms of convergence accuracy, the IWOA algorithm reaches the optimum value of the test functions, whereas none of the other algorithms reaches it within 500 iterations. IWOA therefore has a better capability to find the optimum of single-peak functions, and its iteration is fast and stable. Fig. 5(f-j) shows that the convergence rate of IWOA has a clear advantage: it escapes local extrema rapidly at the beginning of the iteration and reaches the global optimum. In Fig. 5(h), IWOA reaches the global optimum after only 10 iterations. IWOA thus has faster convergence and better convergence accuracy on multi-peak functions. Fig. 5(k-o) shows that WOA easily falls into local optima on complex problems, while IWOA has a stronger global search ability, which indicates that the improvements in this paper help WOA escape local optima. In Fig. 5(l), IWOA reaches the global optimum in 25 iterations, whereas CPO needs 370 iterations. This shows that IWOA copes well with complex problem functions.

Algorithm evaluation indicators

The assessment metrics for the optimisation algorithms are the best value, the standard deviation, the mean and the worst value. To show the stability of IWOA more intuitively, box plots are given in Fig. 6. The metrics of the five algorithms on the benchmark functions are presented in Table 5. For the single-peak problems, the IWOA algorithm finds the theoretical optimum on functions f1-f4, whereas none of the other algorithms does. IWOA does not find the theoretical optimum of f5, but its result is closer to the optimum than those of the other algorithms. For the multi-peak problems, the IWOA, WOA, CPO and BOA algorithms all obtain the theoretical optimum of f9, while the standard deviation, mean and worst values of IWOA and CPO are all 0. The IWOA, WOA, CPO and GWO algorithms all find the theoretical optimum of f11, while the standard deviation, mean and worst values of IWOA and CPO are 0. On f6, f7 and f10, IWOA finds the optimum more accurately than any of the other algorithms. For the extended multi-peak and hybrid composite problems, there is some error between the optimum found and the theoretical optimum; for example, for f15 the optimum found by IWOA is 0.000317, whereas the theoretical optimum is 0.0003. On f13, f21 and f22, IWOA has the lowest error between the found and theoretical optima among the compared algorithms, and thus finds the optimum with high accuracy. The box plots also show that the performance of IWOA is relatively stable, with essentially no outliers, and the medians are all close to the target optimum with very small deviations. In summary, the experimental comparison shows that IWOA has good convergence efficiency and robustness, and its performance is stable.
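
The four metrics can be computed from the final objective values of the independent runs, as in the sketch below. The random-search routine is only a hypothetical stand-in for the compared optimisers, the sphere function and the reduced evaluation budget are assumptions chosen to keep the example fast, and the sample standard deviation is used because the text does not say which form was applied.

```python
import random
import statistics


def summarise_runs(final_values):
    """Best, mean, standard deviation and worst final value (minimisation)."""
    return {
        "best": min(final_values),
        "mean": statistics.mean(final_values),
        "std": statistics.stdev(final_values),   # sample standard deviation (assumption)
        "worst": max(final_values),
    }


def sphere(x):
    """Illustrative objective (an assumption): minimum 0 at the origin."""
    return sum(v * v for v in x)


def random_search(f, dim, lower, upper, n_evals, seed):
    """Placeholder optimiser, a hypothetical stand-in for IWOA, WOA, CPO, BOA or GWO."""
    rng = random.Random(seed)
    return min(f([rng.uniform(lower, upper) for _ in range(dim)])
               for _ in range(n_evals))


# 30 independent runs with different seeds; the budget of 1500 evaluations per run
# is reduced from N = 30 and T = 500 to keep the sketch quick.
finals = [random_search(sphere, 5, -100.0, 100.0, 1500, seed) for seed in range(30)]
summary = summarise_runs(finals)
```

The list `finals` is also the data one would pass to a box plot to inspect the spread and outliers of each algorithm.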

Fig. 6

Box plot.

Table 5 Algorithm evaluation metrics.

Model principles and evaluation metrics

LSTM neural network model

The LSTM is an improvement of the recurrent neural network (RNN) that adds an input gate, a forget gate and an output gate to control the memory cell, primarily to solve the problems of vanishing and exploding gradients when training on long sequences. The structure of the LSTM model is illustrated in Fig. 7.

Fig. 7 Structure of the LSTM model.

The forget gate filters the incoming information: it uses the sigmoid function \(\sigma\) to decide what fraction of the historical information is retained, and its weights and biases are adjusted during training to fit the data. The input \({x}_{t}\) at time t and the hidden-layer output \({h}_{t-1}\) at time t-1 are mapped by \(\sigma\) to the range [0,1], where 0 means that all historical information is discarded and 1 means that all historical information is retained. The calculation formula is as follows:

$$\:\begin{array}{c}{f}_{t}=\sigma\:\left({W}_{f}\cdot\left[{h}_{t-1},{x}_{t}\right]+{b}_{f}\right)\end{array}$$
(18)

where \({W}_{f}\) represents the forget gate weight matrix and \({b}_{f}\) represents the forget gate bias term.

The input gate processes the current element of the input sequence and decides which new information is kept to update the cell state. \({i}_{t}\) indicates how much of the newly added information is admitted, and the candidate state \(\stackrel{\sim}{{c}_{t}}\) is generated by the \(\tanh\) function. They are calculated as follows:

$$\:\begin{array}{c}{i}_{t}=\sigma\:\left({W}_{i}\cdot\left[{h}_{t-1},{x}_{t}\right]+{b}_{i}\right)\end{array}$$
(19)
$$\:\begin{array}{c}\stackrel{\sim}{{c}_{t}}=\tanh\left({W}_{c}\cdot\left[{h}_{t-1},{x}_{t}\right]+{b}_{c}\right)\end{array}$$
(20)
$$\:\begin{array}{c}{c}_{t}={f}_{t}{\cdot\:c}_{t-1}+{i}_{t}\cdot\:\stackrel{\sim}{{c}_{t}}\end{array}$$
(21)

where \({W}_{i}\) and \({b}_{i}\) represent the weight matrix and bias term of the input gate; \({W}_{c}\) and \({b}_{c}\) represent the weight matrix and bias term of the candidate cell state; \(\stackrel{\sim}{{c}_{t}}\) is the candidate state that carries the new information, \({c}_{t}\) is the updated memory cell state, \({c}_{t-1}\) is the previous memory cell state, and \(\sigma\) is the sigmoid activation function.

The output gate determines how much information is output based on the updated cell state. \({o}_{t}\) represents the degree to which the fused historical and input information is passed on, and \({h}_{t}\) represents the hidden-layer output, i.e. the predicted data. They are calculated using the following formulas:

$$\:\begin{array}{c}{o}_{t}=\sigma\:\left({W}_{o}\cdot\:\left[{h}_{t-1},{x}_{t}\right]+{b}_{o}\right)\end{array}$$
(22)
$$\:\begin{array}{c}{h}_{t}={o}_{t}\cdot\tanh\left({c}_{t}\right)\end{array}$$
(23)

where \({W}_{o}\) represents the output gate weight matrix and \({b}_{o}\) represents the output gate bias term.
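To make the gate equations concrete, Eqs. (18)-(23) can be written as a single time step in NumPy. This is only a sketch of the standard LSTM cell, not the authors' implementation; the function name `lstm_step`, the dictionary layout of the weights and the layer sizes are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Eqs. (18)-(23).

    W[k] has shape (hidden, hidden + inputs) and b[k] has shape (hidden,)
    for k in {"f", "i", "c", "o"}.
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate, Eq. (18)
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate, Eq. (19)
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate state, Eq. (20)
    c_t = f_t * c_prev + i_t * c_tilde       # cell state update, Eq. (21)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate, Eq. (22)
    h_t = o_t * np.tanh(c_t)                 # hidden output, Eq. (23)
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 2, 4   # illustrative sizes, e.g. inputs (strain, temperature)
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for k in "fico"}
b = {k: np.zeros(n_hidden) for k in "fico"}
h, c = lstm_step(np.array([0.002, 0.4]), np.zeros(n_hidden), np.zeros(n_hidden), W, b)
```

With all weights and biases set to zero, every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved at each step; this is a quick sanity check on the update in Eq. (21).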

Modelling IWOA-LSTM

An IWOA-LSTM model is developed to identify the constitutive model of steel fiber concrete at high temperatures. The IWOA is used to optimise three LSTM hyperparameters: the learning rate, the number of hidden layer units and the regularisation coefficient. The flowchart of the IWOA-LSTM model is shown in Fig. 8.

Fig. 8 Flowchart of the IWOA-LSTM model.
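One way to couple a population-based optimiser to the three LSTM hyperparameters is to let each whale position be a point in the unit cube and decode it before training. The sketch below illustrates this idea only; the search bounds, the log-scale mapping and the name `decode_position` are assumptions, since the actual ranges are not stated in this section.

```python
import math

# Hypothetical search bounds (not taken from the paper)
BOUNDS = {
    "learning_rate": (1e-4, 1e-1),   # searched on a log scale
    "hidden_units": (16, 256),       # rounded to an integer
    "l2_coeff": (1e-6, 1e-2),        # searched on a log scale
}

def _log_scale(p, lo, hi):
    """Interpolate between lo and hi on a base-10 logarithmic scale."""
    return 10 ** (math.log10(lo) + p * (math.log10(hi) - math.log10(lo)))

def decode_position(position):
    """Map a whale position in [0, 1]^3 to LSTM hyperparameters."""
    p = [min(1.0, max(0.0, v)) for v in position]   # clamp to the unit cube
    lo, hi = BOUNDS["hidden_units"]
    return {
        "learning_rate": _log_scale(p[0], *BOUNDS["learning_rate"]),
        "hidden_units": int(round(lo + p[1] * (hi - lo))),
        "l2_coeff": _log_scale(p[2], *BOUNDS["l2_coeff"]),
    }

params = decode_position([0.5, 0.5, 0.5])
print(params)
```

In such a scheme, the fitness of a position would typically be the prediction error of an LSTM trained with the decoded hyperparameters, so that the optimiser minimises the error directly.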

Performance indicators

Three measures commonly used in machine learning are adopted to assess the model's prediction accuracy: the coefficient of determination (\({R}^{2}\)), the mean square error (MSE) and the root mean square error (RMSE). These metrics are calculated by comparing the predicted values with the true values.

(1) Formula for MSE:

$$\:\begin{array}{c}MSE=\frac{1}{n}\sum\:_{i=1}^{n}{\left({y}_{i}-{\stackrel{\sim}{y}}_{i}\right)}^{2}\end{array}$$
(24)

(2) Formula for RMSE:

$$\:\begin{array}{c}RMSE=\sqrt{\frac{\sum\:_{i=1}^{n}{\left({y}_{i}-{\stackrel{\sim}{y}}_{i}\right)}^{2}}{n}}\end{array}$$
(25)

(3) Formula for \({R}^{2}\):

$$\:\begin{array}{c}{R}^{2}=1-\frac{\sum_{i=1}^{n}{\left({y}_{i}-{\stackrel{\sim}{y}}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left({y}_{i}-\bar{y}\right)}^{2}}\end{array}$$
(26)

where \(n\) represents the number of samples, \({\stackrel{\sim}{y}}_{i}\) represents the fitted value, \({y}_{i}\) represents the true value, and \(\bar{y}\) represents the mean of the true values.
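Eqs. (24)-(26) translate directly into a few lines of NumPy. The sketch below is illustrative; the function names and the sample values are assumptions and do not come from the test data.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error, Eq. (24)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (25)."""
    return float(np.sqrt(mse(y_true, y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (26)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Illustrative values only
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [11.0, 19.0, 31.0, 39.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Because the RMSE is the square root of the MSE, the two metrics always rank models in the same order; the RMSE has the advantage of being expressed in the same unit as the stress.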

Analysis of results

The failure of concrete materials results from the growth of internal microcracks, and the steady propagation and coalescence of these cracks ultimately leads to macroscopic failure. The static damage of concrete is therefore a time-dependent process rather than an instantaneous event. The continuous damage \(D\) is defined as follows:

$$\:\begin{array}{c}D=\frac{{\sigma}_{0}-\sigma\:}{{\sigma}_{0}}=1-\frac{\sigma}{{\sigma}_{0}}\:\:(0\le\:D\le\:1)\end{array}$$
(27)

where \(\:{\sigma\:}_{0}\) represents the stress of the perfect material without damage and \(\:\sigma\:\) represents the stress of the real material containing damage.
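Eq. (27) can be evaluated pointwise once the damaged and undamaged stresses are available at the same strains. The sketch below assumes a strictly positive undamaged stress \({\sigma}_{0}\); the function name `damage` and the sample stresses are hypothetical.

```python
import numpy as np

def damage(sigma_damaged, sigma_undamaged):
    """Continuous damage D = 1 - sigma / sigma_0, clipped to [0, 1] as in Eq. (27).

    Assumes sigma_undamaged > 0 at every point.
    """
    sigma = np.asarray(sigma_damaged, float)
    sigma0 = np.asarray(sigma_undamaged, float)
    return np.clip(1.0 - sigma / sigma0, 0.0, 1.0)

# Illustrative stresses in MPa: no damage, 10% damage, 50% damage
D = damage([30.0, 27.0, 15.0], [30.0, 30.0, 30.0])
print(D)
```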

In principle, material damage develops with deformation, so the damage D is a function of the strain \(\epsilon\). Because of the loading environment, the evolution of damage during loading depends on both the strain and the temperature, i.e. D = D(\(\epsilon\), T). The damage D cannot be measured directly in the experiment. It can, however, be accounted for indirectly by regarding the time t as an inverse function of D, so that the inverse function \({t}^{-1}\left(D\right)\) stands in for the damage. From the standpoint of systems science, determining a constitutive relation is a system identification problem, i.e. determining the relationship between a system's cause (input) and consequence (output). Therefore, the one-dimensional constitutive relationships of steel fiber concrete at different temperatures may be represented as follows:

$$\:\begin{array}{c}\sigma\:=f\left[\epsilon\:,T\right]\:\:\:\:\:\:\:\epsilon\:\le\:{\epsilon\:}_{th}\end{array}$$
(28)
$$\:\begin{array}{c}\sigma\:=f\left[\epsilon\:,T,D\right]=f\left[\epsilon\:,T,{t}^{-1}\left(D\right)\right]\:\:\:\:\:\:\epsilon\:>{\epsilon\:}_{th}\end{array}$$
(29)

where \(\:{\epsilon\:}_{th}\) is the threshold strain (according to the relevant literature37,38 and experimental data, in this paper \(\:{\epsilon\:}_{th}\) is taken as 0.75 times the peak strain), which represents the ultimate elastic strain.
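The threshold rule can be illustrated on a sampled stress-strain curve: the peak strain is read off at the maximum stress, \({\epsilon}_{th}\) is taken as 0.75 times that value, and the points beyond it are those to which Eq. (29) applies. The function name `threshold_strain` and the sample curve are assumptions made for this sketch.

```python
import numpy as np

def threshold_strain(strain, stress, ratio=0.75):
    """Return eps_th = ratio * (strain at peak stress)."""
    strain, stress = np.asarray(strain, float), np.asarray(stress, float)
    return ratio * strain[np.argmax(stress)]

# Illustrative curve (strain dimensionless, stress in MPa); not test data
strain = np.array([0.0000, 0.0010, 0.0020, 0.0032, 0.0040, 0.0050])
stress = np.array([0.0, 12.0, 22.0, 28.0, 30.0, 24.0])

eps_th = threshold_strain(strain, stress)
beyond_threshold = strain > eps_th   # True where Eq. (29) applies, False where Eq. (28) applies
print(eps_th, beyond_threshold)
```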

Model identification results

The settings of the relevant parameters of the IWOA-LSTM model are shown in Table 6.

Table 6 Optimal hyperparameters of the model.

Model identification results

Since test datasets are the foundation of all neural network models, the quality of the gathered data greatly affects the model's performance. To predict the stress-strain relationship of steel fiber concrete with the model and to avoid chance results, the tests were repeated 3–5 times under each set of conditions so that at least three valid results were obtained, with a standard deviation of the stress-strain curves of the specimens at each temperature of less than 5%, and an average stress-strain curve was derived from the valid results. The experimental curve closest to the average curve was then selected for subsequent analysis to ensure a more reliable and consistent representation of the material behaviour. The data obtained from the uniaxial compression tests were fed to the neural networks in two cases. First case: strain \(\epsilon\) and temperature T as inputs and stress \(\sigma\) as output (without considering damage evolution). Second case: strain \(\epsilon\), temperature T and time t as inputs and stress \(\sigma\) as output (considering damage evolution). The experimental data of steel fiber concrete (Vf = 0.5%) were divided into training and test sets: 70% of the pre-processed data were used as the training set and the remaining 30% as the test set. Taking a steel fiber concrete specimen at 400 °C as an example, the test data were fed into the LSTM, WOA-LSTM and IWOA-LSTM models, respectively. The models were trained to predict the stress-strain curves of steel fiber concrete after high temperature, and the predictions were compared with the experimental curves. Figure 9 displays the identification results.
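The two input modes and the 70/30 split can be sketched as follows. Whether the original split is random or sequential is not stated, so the shuffled split here is an assumption, as are the helper names `build_inputs` and `split_70_30` and all sample values.

```python
import numpy as np

def build_inputs(strain, temperature, time=None):
    """Stack model inputs column-wise.

    Case 1 (no damage evolution): columns (strain, T).
    Case 2 (damage evolution): columns (strain, T, t) when `time` is supplied.
    """
    columns = [strain, temperature] if time is None else [strain, temperature, time]
    return np.column_stack(columns)

def split_70_30(X, y, seed=0):
    """Shuffle the samples and split them into 70% training and 30% test data."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_train = int(round(0.7 * len(X)))
    train, test = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[test], y[test]

n = 10
strain = np.linspace(0.0, 0.005, n)
temperature = np.full(n, 400.0)       # one specimen at 400 °C
time = np.linspace(0.0, 90.0, n)      # hypothetical loading time in seconds
stress = np.linspace(0.0, 30.0, n)    # placeholder stresses, not test data

X1 = build_inputs(strain, temperature)         # first case
X2 = build_inputs(strain, temperature, time)   # second case
X_train, y_train, X_test, y_test = split_70_30(X2, stress)
print(X1.shape, X2.shape, X_train.shape, X_test.shape)
```

The only difference between the two cases is the extra time column, which is how the damage information reaches the network in the second case.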

Fig. 9 Comparison of model identification results.

As seen in Fig. 9, the LSTM model is the least accurate of the three constitutive identification models. Its predicted curves do not closely match the experimental curves when damage is taken into account, and without damage its predictions fall below the experimental curves. Compared with Fig. 9(a), the LSTM models optimised by WOA or IWOA describe the damage more accurately: both can capture the macroscopic continuous damage, and their prediction curves match the experimental curves better once the damage data are added, which verifies the accuracy of the damage definition. Up to \({\epsilon}_{th}\), the WOA-LSTM and IWOA-LSTM prediction curves match the test curves well, whereas once the specimen deformation exceeds the range learned by the model, the prediction curves deviate from the test curves. We attribute this deviation to the onset of damage, as shown by the blue dotted lines in Fig. 9(b-c). There is evidently a stress difference between the results predicted with and without consideration of damage evolution. This difference represents the weakening effect of damage evolution as the strain increases. The threshold strain \({\epsilon}_{th}\) marks the start of damage evolution and is an important material parameter for the strength design of steel fiber concrete with respect to damage evolution and fracture. After accounting for damage evolution, i.e. taking the time t as an inverse function of the damage D as an input, the predicted curves of the WOA-LSTM and IWOA-LSTM models agree well with the experimental curves over the whole strain range, as shown by the red lines in Fig. 9(b-c).

Model comparison

Comparison between WOA-LSTM and IWOA-LSTM

Figure 9 shows that both WOA-LSTM and IWOA-LSTM can learn the damage evolution law and predict the stress-strain curve of concrete materials. To compare the precision of the two algorithms, their prediction errors were compared for constitutive identification both with and without consideration of damage evolution. The prediction error is defined in Eq. (30), and Figure 10 presents the findings.

$$\:\begin{array}{c}E=\left|\stackrel{\sim}{\sigma\:}-\sigma\:\right|\end{array}$$
(30)

where \(\:E\) represents the prediction error, \(\:\stackrel{\sim}{\sigma\:}\) represents the predicted stress and \(\:\sigma\:\) represents the experimental stress.

Fig. 10 Results of error analysis.
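The pointwise error of Eq. (30) can be computed along the curve as follows. This is a small sketch; the function name `prediction_error` and the sample stresses are illustrative and are not values read from Fig. 10.

```python
import numpy as np

def prediction_error(sigma_pred, sigma_exp):
    """Pointwise absolute error E = |sigma_pred - sigma_exp| from Eq. (30)."""
    return np.abs(np.asarray(sigma_pred, float) - np.asarray(sigma_exp, float))

# Illustrative predicted and experimental stresses in MPa
E = prediction_error([10.2, 19.7, 30.0], [10.0, 20.0, 30.4])
max_error = float(E.max())   # the bound quoted for a model is the largest pointwise error
print(E, max_error)
```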

As shown in Fig. 10, the prediction error of IWOA-LSTM stays essentially within 0.4, whereas that of WOA-LSTM is within 1.2 and fluctuates strongly. In the SFRC constitutive identification both with and without damage, the prediction errors of the WOA-LSTM and IWOA-LSTM models are larger at the beginning and at the end of loading, mainly because data exist on only one side of these points. Both with and without consideration of damage evolution, the IWOA-LSTM model proposed in this study predicts the stress-strain behaviour of concrete after high temperature accurately and with minimal error. This indicates that the damage identified by IWOA-LSTM is the more reliable of the two.

Comparative analysis of model indicators

The prediction outcomes are compared to evaluate the effectiveness of the proposed LSTM, WOA-LSTM, and IWOA-LSTM models. Table 7 displays the average comparison results of ten runs of each model, whereas Fig. 11 displays the visualisation results.

Table 7 Comparison of indicators for models.

Fig. 11 Comparison and visualisation of model indicators.

The error metrics MSE and RMSE are used to evaluate a model's prediction accuracy. As shown in Table 7 and Fig. 11, the IWOA-LSTM model has the lowest error and the best prediction accuracy, whereas the LSTM model has the largest error. For the MSE, the most commonly used assessment metric, the IWOA-LSTM model improves on the WOA-LSTM model by 47.66% and on the LSTM model by 65.60%. \({R}^{2}\) measures the model's goodness of fit, i.e. the degree of agreement between the predicted and true values. The closer it is to 1, the better the model predicts the data. The IWOA-LSTM model has the highest \({R}^{2}\), exceeding 0.99. In summary, the IWOA-LSTM model performs best.
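Percentage improvements of this kind are usually computed as a relative reduction in the error metric. The sketch below assumes that reading; the function name `relative_reduction` and the MSE values passed to it are hypothetical. The last lines check that the MSE and RMSE reported for IWOA-LSTM in the Conclusion (0.4868 and 0.6977) are mutually consistent.

```python
import math

def relative_reduction(baseline_error, improved_error):
    """Percentage by which `improved_error` is lower than `baseline_error`."""
    return 100.0 * (baseline_error - improved_error) / baseline_error

# Hypothetical MSE values, chosen only to illustrate the arithmetic
improvement = relative_reduction(2.0, 1.0)
print(improvement)  # 50.0

# Consistency check on the reported IWOA-LSTM metrics: RMSE = sqrt(MSE)
rmse_from_mse = round(math.sqrt(0.4868), 4)
print(rmse_from_mse)  # 0.6977
```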

Validation of the IWOA-LSTM

Rather than relying on conventional mathematical formulas and the fundamental laws of mechanics, IWOA-LSTM-based constitutive identification of concrete relies on the inputs and outputs of the different modes. To further confirm the reliability of the IWOA-LSTM model, steel fiber concrete specimens with Vf = 0.5% (T = 200℃ and T = 520℃) and Vf = 1.5% (T = 200℃, T = 400℃ and T = 520℃) were chosen. The test data of these specimens were fed into the model, with 70% serving as the training set and 30% as the test set. The prediction results are displayed in Fig. 12.

Fig. 12 Recognition results of SFRC.

Fig. 12 shows that the predictions of the IWOA-LSTM model are very close to the experimental curves when damage is taken into consideration. The damage evolution process is affected by temperature as well as strain, and the two together determine the continuous damage. The evolution of the damage D with strain at different temperatures, shown in Fig. 13, is obtained by comparing the experimental curves with the curves predicted without taking damage into consideration. The damage variable changes as loading proceeds: at low stress levels there is no obvious damage, and as the external load increases the damage value rises progressively. As the temperature rises, the threshold strain \({\epsilon}_{th}\) becomes smaller.

Fig. 13 Damage evolution curve.

Conclusion

In this study, we present a concrete constitutive identification model based on a Long Short-Term Memory neural network optimised by an improved whale optimization algorithm (the IWOA-LSTM identification model). Using different input-output modes and relying solely on experimental data, without any prior assumptions about the constitutive relationship, the model determines the damage evolution law and constitutive response of steel fiber concrete at elevated temperatures. The following are the primary conclusions:

(1) The basic whale optimization algorithm converges slowly and is prone to falling into local optima. To address these shortcomings, broaden the population's diversity, and improve the balance between local exploitation and global search, four strategies are introduced: the Laplace cross operator strategy, the optimal neighbourhood perturbation strategy, the adaptive weighting strategy, and the variable helix position update strategy. These strategies extend the global search capability of the whale optimization algorithm, which improves its stability and solution quality in addition to increasing its convergence speed and solution accuracy.

(2) The five algorithms are tested on 15 benchmark test functions. The convergence curves show that the WOA easily falls into local optima, whereas the IWOA converges quickly and can escape local optima to reach a higher solution accuracy, which demonstrates the greater efficiency and accuracy of the improved algorithm. The box plots of the tests also show that, compared with the WOA, CPO, BOA and GWO algorithms, the IWOA performs more stably, with basically no outliers and with medians close to the target optimum.

(3) Three constitutive identification models, LSTM, WOA-LSTM and IWOA-LSTM, are developed to decouple the rheology from the damage by identifying the constitutive model of steel fiber concrete at high temperatures without making any assumptions. The results of the models on the training and test sets show that the IWOA-LSTM model outperforms the WOA-LSTM and LSTM models in predicting the stress-strain relationship of concrete after exposure to high temperatures. The coefficient of determination, mean square error and root mean square error of the IWOA-LSTM model are 0.9981, 0.4868 and 0.6977, respectively. In terms of mean square error, the IWOA-LSTM model improves on the WOA-LSTM model and the LSTM model by 47.66% and 65.60%, respectively.

(4) The IWOA-LSTM neural network model was applied to the constitutive identification of steel fiber concrete at various temperatures (200℃, 400℃, and 520℃), yielding damage evolution curves for SFRC. The findings indicate that no appreciable damage occurs at low stress levels, that damage values rise progressively with increasing external loads, and that \({\epsilon}_{th}\) falls with rising temperature.

(5) Determining the constitutive model of a material by machine learning does not rely on failure mechanisms that are not yet clear and does not require complex parameter analysis. The model learns from the given sample data and extracts the rules directly from them, which ensures accuracy while saving a great deal of time. The IWOA-LSTM model developed in this paper has high accuracy and generalisation ability, which provides a new idea and option for future research. This paper considers only the high temperature constitutive identification of concrete under quasi-static conditions; future work can improve the model and extend it to the high temperature constitutive identification of concrete under impact loading.