Introduction

Concrete is one of the most widely utilized construction materials globally, largely due to its availability and the ease of sourcing local materials1,2. With coarse and fine aggregates such as stone and sand, along with cement, water, and chemical admixtures, concrete offers a cost-effective and durable solution that requires limited skilled labor for its application. Nevertheless, this composition also makes concrete inherently heterogeneous, resulting in nonlinear behavior under compressive stress. Existing destructive experimental methods, while offering a reliable means of determining compressive strength, are time-consuming and labor-intensive3,4. Accordingly, developing models that accurately predict compressive strength enables efficient resource allocation and time-effective construction practices, whereas imprecise estimations can lead to structural failures or overdesign, with significant economic implications5,6.

Recently, artificial neural networks (ANNs) have emerged as promising tools for predicting material properties, including the compressive strength of concrete, owing to their ability to model complex nonlinear relationships within datasets7,8,9,10,11,12. A conventional ANN consists of interconnected layers organized in a mathematical structure13. The architecture starts with an input layer, which connects to one or more hidden layers, ultimately leading to an output layer. Each layer contains units called neurons (nodes), from which the term “neuron” in the context of artificial neural networks derives. These neurons are linked through weights, numerical values that define the strength of the connections between them14. During training, ANNs iteratively adjust these weights using feedforward and backward propagation to improve performance15. Activation functions such as step, linear, sigmoid, and rectified linear functions enable the network to handle nonlinear problems16, while bias terms help the network better capture real-world data patterns17,18.
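For reference, the activation functions mentioned above can be written compactly. The following is a minimal NumPy sketch for illustration only (it is not code from the study); the SoftPlus function used later for aggregation is included as well.

```python
import numpy as np

# Illustrative definitions of common activation functions (not from the study).
def step(x):                     # binary step
    return np.where(x >= 0.0, 1.0, 0.0)

def linear(x):                   # identity
    return x

def sigmoid(x):                  # squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):                     # rectified linear unit
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):   # small slope for negative inputs
    return np.where(x > 0.0, x, alpha * x)

def softplus(x):                 # smooth approximation of ReLU: log(1 + e^x)
    return np.log1p(np.exp(x))
```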

Despite their potential, conventional ANNs face limitations when applied to large datasets. These challenges include extended training times, susceptibility to noise, and potential information loss in deep networks, all of which compromise the accuracy and reliability of predictions19,20. Recent research has sought to use alternative machine learning techniques, such as bagging and boosting methods, to predict the strength and performance of concrete materials and structures21,22,23,24,25. Accordingly, a gap remains in developing architectures that improve the accuracy and robustness of concrete compressive strength predictions on large datasets.

In order to address these challenges, this study introduces a novel multi-lobar artificial neural network (MLANN) architecture inspired by the brain’s lobar processing of sensory information. This approach integrates multiple “lobes,” each with a distinct arrangement of neurons, designed to enhance data processing efficiency, reduce noise during training, and improve generalization capabilities. Unlike conventional ANNs, the MLANN framework provides a robust and adaptable solution for handling nonlinear datasets while maintaining a computationally efficient structure. The proposed MLANN is evaluated through comprehensive experiments to predict the compressive strength of concrete, utilizing a dataset with diverse mixture compositions. The methodology incorporates data normalization, shuffling, and unique weight initialization strategies to optimize training. The study employs the SoftPlus activation function to aggregate outputs from all lobes. Performance metrics such as root mean square error (RMSE), mean absolute error (MAE), and the A20 index are utilized to benchmark the MLANN against traditional multi-layer ANNs and ensemble learning neural networks (ELNN). By addressing the shortcomings of existing models and presenting a robust alternative, this study underscores the potential of MLANNs in advancing predictive modeling in civil engineering. The findings contribute to the broader application of brain-inspired computational frameworks in material science, paving the way for more reliable and cost-effective construction practices.

Research significance

Existing destructive testing methods, while reliable, are time-intensive procedures, creating a need for efficient, accurate, and cost-effective alternatives. Artificial neural networks (ANNs) have shown promise in this domain, but their application is often hindered by challenges such as extended training times, noise susceptibility, and accuracy limitations when dealing with large nonlinear datasets. In order to address these shortcomings, this research introduces a novel multi-lobar artificial neural network (MLANN) architecture inspired by the brain’s lobar processing of sensory information. The MLANN framework enhances data processing, reduces training noise, and improves generalization capabilities, resulting in a model that significantly outperforms conventional ANNs and ensemble learning neural networks. This advancement holds substantial implications for civil engineering, as accurate and efficient predictions of concrete compressive strength enable better design, analysis, and construction practices. The findings contribute to material science and also to the broader application of bio-inspired computational frameworks.

Proposed neural network architecture

The human brain processes sensory information through a complex network of neurons, where different regions, known as lobes, manage specific aspects of incoming nerve impulses26. This lobar processing allows the brain to handle complex tasks efficiently. Traditional ANNs (Fig. 1) typically rely on straightforward input-output pathways, which can be inadequate for handling large datasets and complex problems without significant computational overhead. These conventional ANNs often require deep networks to achieve satisfactory results, leading to lengthy training times and increased susceptibility to noise, which can compromise accuracy.

Fig. 1

Typical ANN architecture.

For a traditional ANN architecture, the process within a single layer \(\:l\) can be mathematically represented as:

$$a^{l}=f\left(W^{l}a^{l-1}+b^{l}\right)$$
(1)

where \(\:{a}^{l-1}\) is the input (or activations from the previous layer); \(\:{W}^{l}\) is the weight matrix; \(\:{b}^{l}\) is the bias vector; \(\:f\) is the activation function; \(\:{a}^{l}\) is the output of layer \(\:l\). For an ANN with \(\:L\) layers, the final output \(\:y\) is:

$$y=f^{L}\left(W^{L}\,f^{L-1}\left(\cdots f^{1}\left(W^{1}x+b^{1}\right)\cdots\right)+b^{L}\right)$$
(2)
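As a concrete illustration of Eqs. (1) and (2), the sketch below chains the single-layer operation through an arbitrary number of layers. It is a minimal NumPy sketch; the activation functions, layer sizes, and random weights are placeholder assumptions rather than the configuration used in the study.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0.0, x, alpha * x)

def softplus(x):
    return np.log1p(np.exp(x))

def layer_forward(a_prev, W, b, f):
    # Eq. (1): a^l = f(W^l a^(l-1) + b^l)
    return f(W @ a_prev + b)

def ann_forward(x, weights, biases):
    # Eq. (2): apply the layer operation through all L layers
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = layer_forward(a, W, b, leaky_relu)                   # hidden layers
    return layer_forward(a, weights[-1], biases[-1], softplus)   # output layer

# Placeholder example: 8 inputs, two hidden layers of 50 neurons, 1 output.
rng = np.random.default_rng(0)
sizes = [8, 50, 50, 1]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
y = ann_forward(rng.normal(size=8), weights, biases)
```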

In order to address these limitations, ensemble learning neural networks (ELNNs) have been previously developed, as shown in Fig. 2. ELNNs aggregate multiple models through an ensemble layer to improve predictive performance. The ELNN’s operation involves multiple sub-models \(i=1,2,\dots,M\), each producing an output \(y_{i}\):

$$y_{i}=f_{i}\left(W_{i}x+b_{i}\right)$$
(3)

The final output of the ELNN is obtained by combining the outputs of all sub-models using an aggregation function \(\:g\), such as a weighted sum:

$$y=\sum_{i=1}^{M}\alpha_{i}\,y_{i}$$
(4)

However, ELNNs can introduce interdependencies between models, potentially propagating noise and reducing training efficiency. Accordingly, this study develops an MLANN framework that replicates the brain’s biological mechanisms and functionalities for processing data structures in machine learning tasks.
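To make Eqs. (3) and (4) concrete, the following sketch aggregates several single-layer sub-models through a weighted sum. The sub-model definitions and the aggregation weights \(\alpha_{i}\) are illustrative assumptions, not the models compared in this study.

```python
import numpy as np

def sub_model(W, b):
    """Eq. (3): y_i = f_i(W_i x + b_i), here with a sigmoid activation."""
    return lambda x: 1.0 / (1.0 + np.exp(-(W @ x + b)))

def elnn_forward(x, models, alphas):
    """Eq. (4): weighted sum of the sub-model outputs."""
    outputs = np.array([m(x) for m in models])
    return np.tensordot(alphas, outputs, axes=1)

# Placeholder example with three sub-models over a 4-feature input.
rng = np.random.default_rng(1)
models = [sub_model(rng.normal(size=(1, 4)), rng.normal(size=1)) for _ in range(3)]
alphas = np.array([0.5, 0.3, 0.2])   # illustrative aggregation weights
y = elnn_forward(rng.normal(size=4), models, alphas)
```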

The proposed MLANN architecture, depicted in Fig. 3, is inspired by the brain’s lobar structure. In this regard, each lobe in the MLANN functions as an independent processing unit capable of addressing specific nonlinearities in diverse datasets. In the MLANN, the operation of each lobe \(k=1,2,\dots,N\) can be described as:

$$z_{k}=f_{k}\left(W_{k}x+b_{k}\right)$$
(5)

where \(\:{z}_{k}\) is the output of the \(\:k\)-th lobe; \(\:{W}_{k}\) and \(\:{b}_{k}\) are the weights and biases specific to that lobe; \(\:{f}_{k}\) is the activation function used within the lobe.

The outputs of all lobes are then aggregated using a function \(\:h\), such as the SoftPlus activation function:

$$y=\log\left(1+e^{\sum_{k=1}^{N}z_{k}}\right)$$
(6)

This design inherently promotes adaptability and scalability, enabling the incorporation of various layer arrangements and activation functions within a single model without relying on deep layering. In the MLANN, each lobe consists of its own set of layers and neurons, allowing it to process input data independently before the outputs are combined. This modular approach enhances flexibility and improves noise control during training compared to conventional methods. The independent lobes reduce the risk of noise propagation, as each lobe can specialize in learning different patterns or features within the data.
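A minimal TensorFlow/Keras sketch of this lobar structure is given below. The number of lobes, their depths, the layer width, and the seed handling are placeholders chosen for illustration (the configuration used in the comparison is described in the Materials and methods section). Each lobe is an independent stack of layers whose scalar outputs are summed and passed through SoftPlus, following Eqs. (5) and (6).

```python
import tensorflow as tf
from tensorflow.keras import layers, Model, initializers

def build_mlann(n_inputs, lobe_depths=(1, 2, 3, 4), width=50, seed=0):
    """Sketch of an MLANN: independent lobes aggregated with SoftPlus."""
    inputs = layers.Input(shape=(n_inputs,))
    lobe_outputs = []
    for k, depth in enumerate(lobe_depths):
        z = inputs
        for d in range(depth):                    # hidden layers of lobe k
            init = initializers.GlorotNormal(seed=seed + 10 * k + d)
            z = layers.Dense(width, kernel_initializer=init)(z)
            z = layers.LeakyReLU()(z)
        out_init = initializers.GlorotNormal(seed=seed + 10 * k + depth)
        # z_k, Eq. (5): scalar output of lobe k
        lobe_outputs.append(layers.Dense(1, kernel_initializer=out_init)(z))
    summed = layers.Add()(lobe_outputs)           # sum over all lobes
    y = layers.Activation("softplus")(summed)     # Eq. (6): log(1 + e^sum)
    return Model(inputs, y)
```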

The MLANN algorithm, as described in Algorithm 1, outlines the procedure for training a model for regression tasks. While it follows many conventional ANN practices, it includes specific steps for this framework. Initially, the data is shuffled randomly to eliminate any inherent order that could bias the training process. The shuffled data is then divided into training and testing sets.

Fig. 2

An example of the ELNN architecture.

Both sets are normalized to ensure consistent scaling, which can improve the MLANN’s convergence speed and overall performance. The weights and biases of the MLANN are initialized randomly, with each lobe starting from a different seed. Each neuron in the MLANN employs an activation function to introduce nonlinearity into the model. During forward propagation, the input data passes through the MLANN, progressing through each lobe and layer, to produce a regression output. This process involves multiplying the inputs by the weights, adding biases, and applying activation functions at each neuron.

Fig. 3

Architecture of the proposed MLANN.

In the output layer, the SoftPlus activation function is utilized to aggregate the effects of all lobes, combining their outputs into a single regression result. Optimization techniques are then applied to minimize the loss function with respect to the network’s weights and biases27,28. Finally, the optimized weights and biases are saved, and the model’s performance is evaluated using the testing dataset to assess its generalization capability.
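A hedged sketch of the training procedure described above and in Algorithm 1 follows: shuffle, split, min-max normalize, then minimize the loss with an optimizer. The mean squared error loss, batch size, and epoch count are placeholder assumptions, and `build_mlann` refers to the hypothetical model-construction sketch shown earlier.

```python
import numpy as np
import tensorflow as tf

def train_model(X, y, build_model, seed=0, epochs=1000, batch_size=32):
    """Sketch of Algorithm 1: shuffle, 70/30 split, normalize, optimize."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))                  # shuffle to remove ordering bias
    split = int(0.7 * len(X))
    tr, te = idx[:split], idx[split:]

    # Min-max normalization, fitted on the training portion only
    x_min, x_max = X[tr].min(axis=0), X[tr].max(axis=0)
    X_tr = (X[tr] - x_min) / (x_max - x_min)
    X_te = (X[te] - x_min) / (x_max - x_min)

    model = build_model(X.shape[1], seed=seed)     # e.g. the MLANN sketch above
    model.compile(optimizer=tf.keras.optimizers.Nadam(), loss="mse")
    model.fit(X_tr, y[tr], validation_data=(X_te, y[te]),
              epochs=epochs, batch_size=batch_size, verbose=0)
    return model, (X_te, y[te])
```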

Algorithm 1

Multi-lobar artificial neural network framework algorithm.

Materials and methods

Developed database

In order to assess the proposed approach’s capabilities in estimating the compressive strength of concrete, a dataset with a large sample size was utilized. This dataset has been adopted extensively in similar research in the past29,30,31,32,33. Table 1 depicts the descriptive statistics of the adopted dataset. It consists of 1005 mixtures with their compressive strengths at different ages. The input features for each of the investigated neural network models included the materials used in each concrete specimen, while the compressive strength of concrete was chosen as the prediction model’s output. Figure 4 shows a correlation analysis of the features in the dataset.
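For context, descriptive statistics and a feature correlation matrix of the kind shown in Table 1 and Fig. 4 can be produced in a few lines of pandas; the file name and column label below are hypothetical placeholders, not the study’s actual files.

```python
import pandas as pd

# Hypothetical file and column names for the 1005-mixture dataset.
df = pd.read_csv("concrete_mixtures.csv")

print(df.describe())                      # descriptive statistics (cf. Table 1)
corr = df.corr(method="pearson")          # feature correlation matrix (cf. Fig. 4)
print(corr["compressive_strength"].sort_values(ascending=False))
```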

Table 1 Descriptive statistics of the adopted dataset in this study.
Fig. 4

Correlation analysis of the dataset (the color scale ranges from light green, indicating high positive correlation coefficients, to magenta, indicating high negative correlation coefficients).

Investigated models

This study compares three different architectures, ANN, ELNN, and MLANN, to highlight the latter’s ability to enhance estimation accuracy for the compressive strength of concrete. During the training phase, the primary focus in these models was to minimize the loss function, which was taken as the normalized root mean squared error. All models were independently constructed using the TensorFlow library in Python. The Pandas library was used for dataset management, while the NumPy library was utilized to handle post-training mathematical calculations. To ensure a robust comparison, the following conditions were maintained:

  • The ANN, ELNN, and MLANN cases adopted the same activation functions: leaky rectified linear units (Leaky ReLU) in the hidden layers and SoftPlus in the output layer.

  • The parameters (weights and biases) for each model were initialized using the GlorotNormal function to avoid confounding effects from initialization. In order to address the variability in parameter initialization, each model underwent 1000 training cycles, each using a different randomization seed on normalized datasets. The best performance across these training cycles was considered for comparison.

  • The investigated models utilized backward propagation with the Nadam optimizer while maintaining uniform learning rates and batch sizes.

  • The ANN architecture was optimized with 10 hidden layers, each with 50 neurons. The MLANN architecture was composed of four lobes to produce a comparable case: the first lobe had one hidden layer, the second two, the third three, and the fourth four, with each hidden layer containing 50 neurons. As a result, both the ANN and the MLANN had an identical total number of neurons. Finally, the ELNN had the same structure as the MLANN, with the same number of hidden neurons and hidden layers, except for an ensemble layer before the final output that aggregates the sub-models directly.

  • The database was randomly split into a 70% training dataset and a 30% testing dataset. In order to avoid any overfitting due to data splitting, the training was repeated 1000 times, each with a different random seed, meaning that the training and testing datasets were randomly re-drawn 1000 times to ensure that the model accuracy is not affected by the data splitting or the selection of the random seed (a sketch of this repeated-seed procedure is given after this list).
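The repeated-seed procedure in the last item can be sketched as a simple loop. Here `train_model` and `build_mlann` are the hypothetical helpers from the earlier sketches, and `X`, `y` denote the feature matrix and strength vector; the selection criterion shown (lowest testing RMSE) is an assumption for illustration.

```python
import numpy as np

# Sketch: repeat training with 1000 different seeds, re-splitting the data each
# time, and keep the best-performing cycle (helpers are hypothetical, see above).
best_rmse, best_model = np.inf, None
for seed in range(1000):
    model, (X_te, y_te) = train_model(X, y, build_mlann, seed=seed)
    preds = model.predict(X_te, verbose=0).ravel()
    rmse = float(np.sqrt(np.mean((preds - y_te) ** 2)))
    if rmse < best_rmse:
        best_rmse, best_model = rmse, model
```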

As a further investigation, linear regression, support vector machine, k-nearest neighbor, decision tree, random forest, adaptive boosting, and gradient boosting models were developed on the same dataset to benchmark the proposed approach’s performance against alternative machine learning techniques.

Performance assessment metrics

In order to evaluate the performance of the developed models herein, four different metrics were utilized: the coefficient of determination (R2), Eq. (7); the root-mean-square error (RMSE), Eq. (8); the mean absolute error (MAE), Eq. (9); and the A20 index, Eq. (10), which measures the percentage of predictions that fall within ±20% of the actual values. The A20 index ranges from 0 to 100%, with 100% being the best scenario, in which all predictions fall within ±20% of the actual values, and 0% being the worst case.

$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(x_{i}-y_{i}\right)^{2}}{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}}$$
(7)
$$\text{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}\left(x_{i}-y_{i}\right)^{2}}{n}}$$
(8)
$$\text{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|x_{i}-y_{i}\right|$$
(9)
$$\text{A20}=\left(\frac{1}{n}\sum_{i=1}^{n}1\left(\frac{\left|x_{i}-y_{i}\right|}{x_{i}}\le 0.2\right)\right)\times 100$$
(10)

where \(x_{i}\) is the true value; \(\bar{x}\) denotes the average of the true values; \(y_{i}\) refers to the predicted value; \(n\) is the total count of data points; \(1\left(\cdot\right)\) is an indicator function that equals 1 if the condition inside is true, and 0 otherwise.
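The four metrics can be computed directly from Eqs. (7)–(10); the following NumPy sketch is provided for illustration only.

```python
import numpy as np

def assess(x_true, y_pred):
    """Compute R2, RMSE, MAE, and the A20 index per Eqs. (7)-(10)."""
    x, y = np.asarray(x_true, float), np.asarray(y_pred, float)
    r2 = 1.0 - np.sum((x - y) ** 2) / np.sum((x - x.mean()) ** 2)     # Eq. (7)
    rmse = np.sqrt(np.mean((x - y) ** 2))                             # Eq. (8)
    mae = np.mean(np.abs(x - y))                                      # Eq. (9)
    a20 = np.mean(np.abs(x - y) / x <= 0.2) * 100.0                   # Eq. (10)
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "A20": a20}
```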

Results and discussions

Figures 5, 6 and 7 illustrate the density charts depicting the training progress of the developed models for predicting compressive strength. These figures were produced using the Datashader and Matplotlib libraries in Python and then formatted in PowerPoint. Specifically, Fig. 5 presents the results for the ANN case, Fig. 6 depicts the results for the ELNN case, and Fig. 7 shows the results for the proposed MLANN case. As mentioned before, to ensure a rigorous comparison, each model underwent 1000 training cycles. This approach was chosen to produce mean training curves that mitigate potential biases arising from the initial parameter settings. Examination of these figures reveals that, after completing the iterative optimization of weights and biases, all models achieved a comparable reduction in the loss function. Despite this similarity, the MLANN demonstrated superior performance in terms of accuracy. It reached comparable levels of precision with fewer training iterations than both the ANN and the ELNN. One notable advantage of the MLANN was its enhanced robustness to the effects of parameter initialization. The MLANN consistently showed greater resilience, whereas the ANN and ELNN exhibited sensitivity to fluctuations during training, with the ELNN being affected to a lesser extent. This sensitivity can be attributed to the ANN’s larger number of weight vectors and the ELNN’s ensemble layer, which introduce additional complexity and noise during the initial data processing stages. As a result, the performance of the ANN and ELNN was more prone to variability due to the iterative optimization of a greater number of parameters. On the other hand, the MLANN, with its shallower architecture involving multiple lobes, managed to handle this noise more effectively, leading to a more stable training process and better accuracy. This difference underscores the potential advantages of the MLANN in scenarios where training stability and accuracy are critical.

Fig. 5

Density plot of loss function training cycles in a conventional neural network using 1000 seeds over 1000 iterations.

Fig. 6

Density plot of loss function training cycles in an ELNN using 1000 seeds over 1000 iterations.

Fig. 7

Density plot of loss function training cycles in an MLANN using 1000 seeds over 1000 iterations.

Although the investigated models achieved similar loss function magnitudes during the training process, their performance diverged significantly when evaluated on the testing data. Figures 8, 9 and 10 illustrate the predicted versus actual compressive strength values for the investigated models.

Fig. 8

Predicted versus actual compressive strength via ANN model.

Fig. 9

Predicted versus actual compressive strength via ELNN model.

The results show that the ANN, ELNN, and MLANN models fit the training data exceptionally well, with R2 values around 99%. However, a different picture emerges with the testing data, where the traditional cases underperformed compared to the MLANN. Specifically, the ANN achieved an R2 of approximately 90% on the testing data, the ELNN reached an R2 of about 88%, and the MLANN produced an R2 of about 94%. This discrepancy can be attributed to overfitting in the traditional models, likely caused by the noise introduced by the ANN’s 10 hidden layers and the ELNN’s ensemble layer. In contrast, the MLANN mitigates these issues, leading to better generalization and performance on the testing data.

Fig. 10

Predicted versus actual compressive strength via MLANN model.

The performance assessment of the models, as shown in Fig. 11, highlights the variation between the training and testing phases across all metrics and models. It can be seen that the MLANN demonstrates significantly better performance in the testing phase compared to the ANN and ELNN, with clear improvements in error reduction and in the R2 and A20 indices. In this regard, the MLANN reduces the RMSE in the testing case by 24.7% compared to the ANN (from 5.38 to 4.05 MPa) and by 32.9% compared to the ELNN (from 6.04 to 4.05 MPa). Similarly, the MLANN lowers the MAE in the testing case by 21.4% compared to the ANN (from 3.78 to 2.97 MPa) and by 25.9% compared to the ELNN (from 4.01 to 2.97 MPa). The R2 increases by 4.4% for the MLANN in the testing case over the ANN (from 0.90 to 0.94) and by 6.8% over the ELNN (from 0.88 to 0.94). Furthermore, the MLANN improves the A20 index by 17.9% over the ANN in the testing case (from 76.08 to 89.70%) and by 14.4% over the ELNN (from 78.41 to 89.70%), indicating better accuracy, controlled overfitting, and the superiority of the proposed technique over the traditional cases.

Fig. 11

Performance assessment of the investigated models.

Further comparisons with other alternative machine learning techniques for predicting the compressive strength of concrete are performed, and the results are reported in Table 2. It can be observed that the decision trees exhibit near-perfect training performance but suffer from overfitting, reflected in a drop in testing R2 and A20 index. Random forest and gradient boosting outperform the simpler models, demonstrating high testing R2 and superior generalization, with the random forest giving an A20 index of 79.47% on the testing data while gradient boosting gives 82.45%. Among the ANN-based models, the MLANN stands out as superior, achieving a higher testing R2 (0.94), a lower RMSE (4.05 MPa), and a significantly higher A20 index (89.70%) than all other models in this analysis, including random forest and gradient boosting. The ANN and ELNN, while performing reasonably well, show lower testing performance, reflecting marginally less reliability in error tolerance. In summary, while random forest and gradient boosting demonstrate strong performance among traditional methods, the MLANN’s enhanced architecture makes it more effective in both the training and testing phases, particularly in scenarios requiring precise and reliable predictions.

Table 2 Performance of the proposed approach against other alternative machine learning techniques.

Conclusion

This study aims to develop a brain-inspired MLANN architecture for accurately estimating concrete compressive strength while addressing the limitations of traditional neural networks. Current approaches, such as ANNs and ELNNs, often struggle with overfitting and are susceptible to noise, which limits their ability to generalize to new data. This research introduces the MLANN model to enhance prediction accuracy and control overfitting, providing a more reliable and efficient solution. The study compares the performance of MLANN against ANN and ELNN models, demonstrating the superior generalization capability of the MLANN. Based on the aforementioned statements, the following conclusions are drawn:

  • MLANN achieves a faster reduction in the loss function per training epoch than the ANN and ELNN.

  • MLANN improves R² by 4.4% over ANN and by 6.8% over ELNN in the testing phase, whereas it increases the A20 index by 17.9% compared to the ANN and by 14.4% compared to the ELNN, demonstrating improved prediction accuracy.

  • MLANN reduces the RMSE in the testing phase by 24.7% compared to the ANN and by 32.9% compared to ELNN, while it lowers the MAE by 21.4% compared to the ANN and by 25.9% compared to ELNN in the testing phase.

  • MLANN effectively controls overfitting and generalizes to unseen data better than traditional ANN and ELNN models.

  • Further comparison with alternative machine learning techniques, including bagging and boosting, shows that the MLANN approach yields the highest accuracy, represented by the highest A20 index and lowest RMSE and MAE values in both training and testing cases.

Finally, the study has certain limitations. The dataset used primarily focuses on compressive strength prediction, and the model’s performance needs to be validated across other datasets and different material properties to confirm its versatility. Future research should also explore the integration of different activation functions and hybrid approaches with other machine learning techniques to further enhance the MLANN’s adaptability and robustness.