Abstract
This paper investigates the flexural bearing behavior of reinforced concrete beams through experimental analysis and machine learning predictive models. The central problem is understanding how varying compositions of construction materials, particularly the inclusion of recycled aggregates and carbon fiber-reinforced polymer (CFRP), affect the structural performance of concrete beams. Eight beams, including those with natural aggregates, recycled aggregates, fly ash, and CFRP, were tested. The study employs three ensemble learning frameworks: Random Forest Regressor (RFR), XGBoost (XGB), and LightGBM (LGBM). Model development involved data acquisition from experiments, preprocessing of key input features (rebar area, cement portion, recycled and natural aggregate masses, silica fume, fly ash, compressive strength, and CFRP presence), model selection, and hyperparameter tuning using Pareto optimization. The models were then evaluated using Mean Squared Error (MSE), Mean Absolute Error (MAE), and the coefficient of determination (R²). Outputs focus on load-induced deflection and mid-span displacement. With a dataset of 4851 samples, the optimized models demonstrated excellent performance. The experimental results revealed enhancements in both compressive strength and load-bearing capacity in beams incorporating recycled aggregate and 10% silica fume: compressive strength increased by up to 53.03% (at 30% recycled aggregate), and load-bearing capacity increased by about 7% (at 70% recycled aggregate with CFRP strengthening) compared to the reference beam. By integrating experimental analysis with computational techniques, this study advances the understanding of eco-friendly construction materials and their performance, shedding light on the interactions between sustainable construction materials and the flexural bearing behavior of beams.
Introduction
The global construction industry confronts pressing environmental challenges, leading to a surge in the use of recycled aggregates (RA) derived from construction and demolition waste (CDW). Wang et al.1 examined historical evolution, inherent defects, and improvement methods of recycled aggregate concrete (RAC). Building on this foundation, the flexural behaviour of concrete beams incorporating treated wastewater, recycled concrete aggregates, and fly ash was investigated by Abushanab and Alnahhal2. The study meticulously evaluated the influence of RCA and fly ash on the flexural behaviour and ultimate capacity of reinforced concrete (RC) beams. In contrast, the present study integrates experimental analysis with machine learning models—Random Forest Regressor (RFR), XGBoost (XGB), and LightGBM (LGBM)—to predict beam deflection based on parameters like recycled aggregates, natural aggregates, silica fume, CFRP presence, and compressive strength. Pareto optimization is also employed to fine-tune the models, offering a predictive tool for sustainable construction and enhancing the understanding of eco-friendly materials in steel-reinforced concrete beams.
Several studies have significantly contributed to understanding the feasibility and structural performance of recycled aggregate concrete (RAC). Sato et al.3 conducted flexural loading tests on reinforced recycled concrete members, confirming the technical feasibility of incorporating RAC in RC beams. Momeni et al.4 focused on flexural strength, introducing intelligent models for predicting the ultimate flexural strength of recycled reinforced concrete (RRC) beams. González-Taboada et al.5 employed regression and genetic programming to predict key mechanical properties of structural recycled concrete, contributing valuable tools for sustainable construction. Ignjatović et al.6 investigated the flexural behavior of RAC beams, providing insights into the material properties of recycled aggregate concrete.
Velay-Lizancos et al.7 explored the impact of using recycled aggregate in precast beams, conducting real-scale tests with varying proportions. Arezoumandi et al.8 experimentally investigated the flexural strength of full-scale reinforced concrete beams with 100% RCA, revealing comparable ultimate flexural strength to conventional concrete (CC) beams but with approximately 13% higher deflection. Choi et al.9 emphasized the importance of utilizing recycled aggregates in RAC beams to address concrete waste challenges. Tošić et al.10 compiled a comprehensive database on the flexural and shear strength of reinforced RAC beams, comparing results to Eurocode 2 predictions. Collectively, these studies have provided valuable insights into the mechanical properties and structural behaviour of recycled aggregate concrete, offering a foundation for its sustainable application in construction.
In Ghoniem’s research11, basalt macro-fibres were combined with RCA, and advanced computational methods were employed to assess the shear capacity of fibrous RAC beams. The study highlighted the evolving role of numerical modelling and deep learning techniques in understanding and predicting the behaviour of sustainable structural elements. Meanwhile, Sojobi and Liew12 studied the flexural behaviour and efficiency of reinforced recycled concrete beams using CFRP laminate, showcasing the exploration of innovative materials to enhance structural performance.
Kar and Biswal13 addressed broader applications and conducted a comprehensive review of the use of fiber-reinforced polymer (FRP) composites for shear rehabilitation of RC flexural members. Their work drew from an extensive range of experimental studies, critically evaluating the impact of various parameters on the shear behaviour of FRP-strengthened RC beams. Recognizing the versatility of FRP, especially Carbon FRP (CFRP), in enhancing the structural performance of RC elements, the study underscored its significance beyond traditional applications, exploring its role in strengthening the shear tensile zone of structures. Seminal works by Abdelkarim et al.14, Karayannis and Golias15, and Ruan et al.16 contributed to the broader understanding of FRP’s influence on structural behaviour, covering aspects such as FRP content, behaviour of slender RC beams, and flexural characteristics of concrete beams reinforced with various FRP materials. Ghalehnovi et al.17 further extended this exploration by examining the impact of Steel Fiber-Reinforced Concrete (SFRC) jacketing on the flexural performance of Coarse Recycled Aggregate Reinforced Concrete (CRARC) beams. Their study provided valuable insights for optimizing the behaviour of SFRC-jacketed recycled aggregate concrete elements.
The conventional approach to assessing shear and flexural capacities in concrete is labor-intensive and time-consuming. To address these challenges, researchers are increasingly adopting advanced computational methods such as Machine Learning (ML), Artificial Neural Networks (ANNs), and Deep Learning (DL)18. For instance, Murad et al.19 used Gene Expression Programming (GEP) to develop a simplified model for predicting the flexural behavior of fiber-reinforced polymer (FRP) reinforced concrete beams. Their findings revealed that the GEP model, built using six main parameters, showed high accuracy with R-squared values close to those of the ACI-440-17 and CSA S806-12 guidelines, indicating its reliability in predicting the beams’ flexural strength. Similarly, Momeni et al.4 developed ANN models, enhanced by particle swarm optimization (PSO) and imperialist competitive algorithm (ICA), to predict the ultimate flexural strength of RRC beams. Their experimental results indicated a 10% reduction in flexural strength for RRC beams compared to conventional beams made with natural aggregates. The PSO-based ANN model exhibited high prediction accuracy, with correlation coefficients of 0.997 and 0.994 for the testing data, outperforming the ICA-based and conventional ANN models.
In another significant study, Yuan et al.20 used ensemble ML methods to predict the compressive and flexural strengths of RAC, with random forest outperforming gradient boosting. Tran et al.21 improved predictions of recycled concrete compressive strength using hybrid models combining ML and PSO, such as Gradient Boosting (GB)_PSO and Extreme Gradient Boosting (XGB)_PSO, with GB_PSO showing the highest accuracy. Rezaiee-Pajand et al.22 employed the ICA to predict the mechanical strengths of RCA concrete, achieving mean absolute errors of 0.54, 0.36, and 0.48 for compressive, flexural, and tensile strength predictions, respectively.
Recent studies further expand on the use of ML in sustainable construction. Al Martini et al.23 investigated concrete mixes with RCA and supplementary cementitious materials, developing ML-based models for predicting compressive and flexural strengths. Wakjira et al.24 proposed ML models for predicting the shear capacity of RC beams strengthened with inorganic composites, with the XGB model demonstrating superior prediction capability. In another study, Wakjira et al.25 developed a super-learner ML model for predicting the flexural capacity of FRP-RC beams, outperforming existing code and guideline equations. Additionally, Wakjira et al.26 used various ML models to predict the shear capacity of FRP-RC beams, with XGB providing the best performance. These studies highlight the potential of advanced ML techniques to enhance prediction accuracy for concrete properties, promoting sustainable construction practices by enabling the broader use of recycled materials and innovative reinforcement methods. In the quest for improved predictive capabilities, the changing landscape of machine learning drives the exploration of more advanced models. Notably, ensemble models like Random Forest Regression (RFR)27,28 and Light Gradient Boosting Machine (LGBM)29,30 emerge as promising options. Their ability to blend multiple predictors makes them particularly adept at capturing complex data patterns.
The performance of machine learning (ML) models has been enhanced by coupling them with robust optimization methods like Student Performance Based Optimization (SPBO)31, Three Modified Symbiotic Organisms Search algorithm (3mSOS)32, and Adaptive Elitist Differential Evolution (AEDE)33. To further advance predictive accuracy and reliability, the development of more robust hybrid ML models using various metaheuristic optimization techniques is necessary. Pareto Optimization stands out as a powerful multi-objective optimization method that identifies a set of optimal solutions, known as the Pareto front. Each solution on the Pareto front represents a trade-off where no objective can be improved without compromising another. This technique is widely used to address complex problems with conflicting objectives, offering a diverse range of solutions that balance different criteria34,35. This study selected Pareto Optimization for its ability to handle multiple objectives and provide comprehensive optimal solutions. By applying Pareto Optimization, the study aims to improve prediction accuracy and robustness in modeling the flexural behavior of beams with sustainable construction materials, ensuring well-balanced and effective models across various scenarios.
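The notion of a Pareto front described above can be illustrated with a minimal sketch. It assumes two objectives to be minimised, MSE and MAE, and uses five hypothetical candidate hyperparameter settings; the scores are made up for illustration and are not results from this study, and the function names are likewise illustrative.

```python
from typing import List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of the non-dominated points."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (MSE, MAE) scores for five hyperparameter settings.
scores = [(0.30, 0.40), (0.25, 0.45), (0.35, 0.35), (0.40, 0.50), (0.25, 0.50)]
front = pareto_front(scores)
print(front)  # indices of the settings that lie on the Pareto front
```

In this toy case the fourth and fifth settings are dominated (another setting is at least as good on both metrics), so only the first three remain as trade-off candidates.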
There exists a notable gap in the current studies regarding the significant impact of CFRP on the flexural capacity of beams incorporating RAC and fly ash. This study aims to address this gap by presenting a pioneering investigation to provide clarity and depth to a previously overlooked area. The research commences with a meticulously designed experimental program involving eight beams and a diverse range of materials. Subsequently, advanced ensemble models are developed based on the extensive dataset obtained from the experimental phase. By employing state-of-the-art ensemble learning rooted in comprehensive experimental investigations, this study not only seeks to fill the aforementioned research void but also aims to establish a new standard for research in the sustainable construction domain.
Experimental investigations
Materials and experimental scheme
This experiment investigates the behaviour of steel-reinforced concrete beams. The beams were designed with cross-sectional dimensions of 200 × 300 mm and a length of 1800 mm, as shown in Fig. 1. Each beam employed two Φ14 bars in the tension zone and two Φ12 bars in the compression zone, with Φ6 steel used as the stirrups at a spacing of 150 mm. To enhance the load-carrying capacity of the beams, they were strengthened with CFRP sheets. The CFRP sheets had a thickness of 0.167 mm, a width of 220 mm, a length of 1800 mm, a yield strength of 825 MPa, and an elastic modulus of 245,000 MPa.
Dimensions of the reinforced concrete beam specimen.
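As a quick check on the geometry quoted above, the reinforcement and CFRP cross-sectional areas can be computed from the stated bar diameters and sheet dimensions. This is a minimal sketch; the function and variable names are illustrative, and the bars are assumed to be plain circular sections.

```python
import math


def bar_area_mm2(diameter_mm: float, count: int = 1) -> float:
    """Total cross-sectional area of `count` circular bars, in mm^2."""
    return count * math.pi * diameter_mm ** 2 / 4.0


tension_area = bar_area_mm2(14, 2)      # two bars of 14 mm diameter
compression_area = bar_area_mm2(12, 2)  # two bars of 12 mm diameter
cfrp_area = 0.167 * 220                 # sheet thickness x width, in mm^2

print(f"tension steel:     {tension_area / 100:.4f} cm^2")
print(f"compression steel: {compression_area / 100:.4f} cm^2")
print(f"CFRP sheet:        {cfrp_area:.2f} mm^2")
```

The tension steel area evaluates to about 3.0788 cm2, which agrees with the mean rebar area (X1) reported for the dataset in the soft computing section.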
The CFRP sheets were bonded to the underside of the beams in accordance with ACI 440.2R-0836, following these steps:
- Surface Preparation: Cleaned and smoothed the concrete surface for proper adhesion.
- Primer Application: Applied primer to enhance bonding with the CFRP sheet.
- Epoxy Application: Coated the surface with a two-part epoxy adhesive.
- CFRP Sheet Placement: Carefully aligned and placed the CFRP sheet onto the epoxy.
- Pressing and Curing: Applied pressure and allowed curing for full-strength bonding.
This method ensured effective reinforcement, as shown in Fig. 2.
Reinforcement with CFRP sheet for the bottom face of the beam.
As shown in Table 1, the concrete mixture used in this research involved recycled aggregate concrete (RAC) as a replacement for natural aggregate concrete (NAC), with the addition of silica fume for beams M2, M3, and M4, and the addition of fly ash for beams D2, D3, and D4. The weight percentages of RAC used to create the concrete mixtures under study were 30%, 50%, and 70%. The percentage of silica fume used in the concrete mixtures was 10%, and the percentage of fly ash was 30%. To assess the compressive strength of the concrete mixtures listed in Table 1, cubic concrete samples with dimensions of 150 mm were cast and cured for 28 days before undergoing compressive strength testing. The properties of sand and natural coarse aggregate are analogous to those in the previous study by Thien et al.37.
As shown in Table 2, silica fume has a specific gravity of 2.15 g/cm3, and its specific weight varies between 0.5 and 0.7 g/cm3. The chemical characteristics of fly ash are presented in Table 3, while the mechanical properties of the CFRP sheet are shown in Table 4.
Testing procedures
Figure 3 shows two pictures of the three-point bending test (D3, see Table 1) for beams with a hydraulic loading system with a loading capacity of 500 kN. It should be noted that the four-point bending test creates a pure bending region, which accurately facilitates the assessment of a beam member’s bending behavior. Despite this distinction, the load-deflection characteristics observed in four-point bending tests are quite similar to those in three-point bending tests38.
The flexural test conducted on specimen D3: (a) Initial model setup featuring three LVDTs for precise measurements; (b) Post-testing view, illustrating changes in specimen state; (c) Schematization of the initial model configuration with the planned positions of LVDTs.
The system is connected to a mobile data acquisition unit with 30 channels used for data recording. The span between the two supports in this experiment is 1500 mm. Three displacement measurement devices (LVDTs), including LVDT1, LVDT2, and LVDT3, are used to measure the deflection at the bottom of the beams under the bending load. A displacement control at a rate of 1.0 mm/minute is used in all tests. During the testing process, load data is recorded by the data acquisition machine, while deflection data from LVDTs is simultaneously recorded. Subsequently, this data is used to plot the load-deflection response curves of the tested beams.
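The post-processing of the recorded data can be sketched as follows: given paired load and mid-span deflection records, the peak load and the deflection at which it occurs are extracted. The records below are made-up placeholder values, not measurements from the tested beams, and the function name is illustrative.

```python
from typing import Sequence, Tuple


def peak_point(loads_kn: Sequence[float], deflections_mm: Sequence[float]) -> Tuple[float, float]:
    """Return (peak load, deflection at peak load) from paired records."""
    if len(loads_kn) != len(deflections_mm) or len(loads_kn) == 0:
        raise ValueError("load and deflection records must be non-empty and paired")
    # Index of the first occurrence of the maximum load.
    i = max(range(len(loads_kn)), key=lambda k: loads_kn[k])
    return loads_kn[i], deflections_mm[i]


# Placeholder records (kN, mm) for illustration only.
loads = [0.0, 20.0, 45.0, 70.0, 90.0, 100.0, 95.0, 80.0]
deflections = [0.0, 0.8, 1.9, 3.5, 6.0, 9.5, 12.0, 15.0]

peak_load, deflection_at_peak = peak_point(loads, deflections)
print(peak_load, deflection_at_peak)
```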
Experimental results and discussion
Compressive strength
The compressive strength test was conducted in accordance with the Vietnam standard TCVN 3118:202239. The test began with the reference sample M1 and the compressive strength was measured at 16.70 MPa. It is to be noted that the introduction of RAC in combination with silica fume yielded notable improvements in compressive strength for samples M2, M3, and M4, which contained 30%, 50%, and 70% RAC, respectively, along with 10% silica fume. The compressive strengths of these test samples reached 25.5 MPa, 17.6 MPa, and 19.1 MPa, respectively. This reflected substantial increases of 53.03%, 5.63%, and 14.62% compared to the compressive strength of the reference sample M1. The enhancement in strength is primarily attributed to the impact of silica fume on the concrete’s compressive strength. The addition of silica fume to the concrete mixture fills voids between cement particles, leading to a denser and more tightly bonded concrete structure, ultimately contributing to improved compressive strength.
On another note, a different trend was observed for concrete mixtures utilizing 30%, 50%, and 70% RAC combined with 30% fly ash under similar curing conditions. As represented by samples D2, D3, and D4, the compressive strengths were found to be 16.5 MPa, 12.9 MPa, and 12.8 MPa, respectively. Surprisingly, these values indicated reductions of 1.44%, 23.01%, and 23.37%, compared to the compressive strength of the reference sample M1. This outcome suggests that the incorporation of fly ash in combination with higher proportions of recycled aggregate led to a decrease in compressive strength. The intricate interplay between the materials in the concrete mixture highlights the importance of carefully selecting and proportioning supplementary materials to achieve optimal strength characteristics. Further analyses and investigations into the underlying mechanisms are warranted to refine the understanding of these observed trends and guide the refinement of concrete mix designs for enhanced performance.
In Fig. 4, the destructive load tests on beams M1, M2, M3, and M4 provide insight into the structural performance of the various concrete mixtures and reinforcement strategies. Notably, beam M4, which incorporates 70% RAC and 10% silica fume in the concrete mix (Table 1) and is further reinforced with CFRP (Table 4), shows the highest load-bearing capacity. This configuration yields a load-bearing capacity of 115.59 kN, an improvement of about 7% over the reference beam M1 (108 kN). The increase in load-bearing capacity is attributed to the synergistic effects of utilizing recycled materials with silica fume, leading to enhanced densification and bonding within the concrete matrix.
Load-displacement of beams.
Flexural strength
The flexural strength test was conducted in accordance with ASTM C29340. The corresponding maximum displacements at the mid-span further elucidate the performance of each beam under the applied loads. In Fig. 4, Beam M3, despite its lower load-bearing capacity (81.87 kN), exhibits a notably smaller maximum displacement of 10.19 mm compared to the other beams, suggesting a more rigid response. This behavior can be attributed to the specific composition of its concrete mixture, which may enhance stiffness but limit its load capacity. In contrast, the higher load-bearing capacity of beam M4 is associated with a larger displacement of 18.65 mm, indicating that its concrete mixture and reinforcement provide a balance between strength and ductility. These observations highlight that an optimal mix of recycled materials and additives can result in beams that combine both rigidity and flexibility.
Turning attention to beams D2, D3, and D4, which incorporate 30%, 50%, and 70% RAC, respectively, with 30% fly ash, the load tests yield intriguing results. Despite the reduction in compressive strength observed in the earlier discussion, these beams exhibit competitive load-bearing capacities. Beam D4, with a load-bearing capacity of 105.94 kN, closely approximates the reference beam M1, demonstrating the potential of fly ash as an effective supplementary material. The corresponding maximum displacements at the mid-span further reflect the varying structural responses, with these beams showcasing a balance between load-bearing capacity and flexibility under applied loads.
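The relative changes in load-bearing capacity quoted in this section can be reproduced from the reported peak loads. The sketch below uses only the capacities stated in the text for M1, M3, M4, and D4; the helper name is illustrative.

```python
def percent_change(value: float, reference: float) -> float:
    """Relative change of `value` with respect to `reference`, in percent."""
    return (value - reference) / reference * 100.0


reference_m1 = 108.0  # kN, reference beam M1
capacities = {"M3": 81.87, "M4": 115.59, "D4": 105.94}  # kN, as reported

changes = {
    beam: round(percent_change(load, reference_m1), 2)
    for beam, load in capacities.items()
}
for beam, change in changes.items():
    print(f"{beam}: {change:+.2f}% relative to M1")
```

On these figures M4 gains about 7% over M1, while D4 falls short of M1 by roughly 2%, which is consistent with the description of D4 as closely approximating the reference beam.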
The experimental results in Fig. 4 highlight the complex interplay between material composition, reinforcement strategies, and structural performance. The substantial increase in load-bearing capacity observed in Beam M4, which utilizes a combination of recycled materials, silica fume, and CFRP sheets, underscores the potential for designing environmentally sustainable structures without sacrificing structural integrity. Additionally, the competitive performance of beams incorporating fly ash (D2, D3, and D4) indicates the versatility of supplementary materials in achieving desired structural characteristics, despite initial reductions in compressive strength.
In summary, the analysis of load-bearing capacity and displacement characteristics presented in Fig. 4 provides valuable insights into how different material combinations influence structural behavior. It is evident that stiffness, ductility, and toughness must be considered alongside load-bearing capacity when evaluating the effectiveness of concrete mixtures and reinforcement techniques. Engineers and researchers can leverage these insights to optimize concrete mix designs for specific applications, balancing performance and sustainability objectives. Further research and application of these principles can contribute to the ongoing evolution of resilient and environmentally conscious construction practices.
Crack and failure mechanisms of test beams
The development and expansion of cracks in beams significantly affect the load-bearing capacity of reinforced concrete beams, especially those using recycled concrete aggregates and additives reinforced with CFRP sheets. In the test beams, cracks begin to form in the regions with high bending moments. These cracks propagate towards the top of the beam and widen as the load increases. Figure 5 shows an increase in the number of cracks, which gradually appear away from the mid-span. Notably, some cracks do not originate from the bottom of the beam but develop from 1/3 to 1/4 of the beam’s height (see Fig. 5a). When the beam reaches its failure load, the cracks widen, and the failure mode is characterized by inclined shear-bending failure (see Fig. 5b). However, no delamination occurs between the concrete surface and the CFRP sheets. This indicates that the behavior of reinforced concrete beams using recycled concrete aggregates with additives reinforced by CFRP sheets corresponds well to beams using conventional natural aggregates. It is important to note that the failure mode emerges naturally based on the beam’s response to loading conditions, rather than being predetermined by the design. This natural development of failure mode highlights the adaptability and effectiveness of the recycled aggregate concrete when used with CFRP reinforcement, providing performance comparable to conventional materials.
Crack and failure behavior of test beams: (a) Crack expansion, (b) Failure pattern.
Soft computing modelling
A dataset comprising 4851 samples of load and displacement for various beams with distinct compositions has been generated based on the experimental scheme. In addition to the seven beams previously discussed, all strengthened with CFRP on the bottom face, there is one control beam designated as D0. The load-displacement data for this control beam is provided in the supplementary material. Notably, this beam was cast without CFRP, RAC, fly ash, or silica fume, and instead employed natural aggregate. The statistical properties of this dataset are detailed in Table 5.
The statistical properties presented in the table show the intricate relationships governing the load-bearing performance of steel-reinforced concrete beams. Notably, the cross-sectional area of reinforcement bars (X1) showcases an average of 3.0788 cm2, suggesting a moderate size, yet the skewness of 2.5712 indicates a rightward tail in the distribution, implying that some beams exhibit substantially larger reinforcement areas. In contrast, the mass of the cement portion (X2) demonstrates a relatively consistent usage pattern, with an average of 28.39 kg and a standard deviation of 4.1210 kg. This consistency in cement mass is crucial for maintaining uniformity in the mixtures, contributing to the reliability of the study.
The mass of recycled aggregate (X3), however, introduces a layer of complexity with a mean of 38.88 kg and a substantial standard deviation of 33.37 kg. The near-zero skewness suggests a symmetric distribution, but the wide dispersion indicates diverse compositions of recycled aggregate across different mixtures. This variability poses an interesting avenue for further investigation into the influence of recycled aggregate on load-bearing capacities. The compressive strength (X7) has an average of 16.7 MPa, a moderate standard deviation of 3.5684 MPa, and a positive skewness of 1.0568. This skewness indicates a right tail, that is, a small number of mixtures with notably higher compressive strengths, possibly attributable to specific combinations of input variables.
The binary variable CFRP (X8) adds an intriguing layer to the analysis. On average, CFRP reinforcement is present in 30.7% of the mixtures and the intermittent use of CFRP introduces a dichotomy in the dataset, allowing for a detailed exploration of how its presence influences load-bearing performance. The applied load (X9) demonstrates substantial variability, with a mean of 48.61 kN and a wide standard deviation of 34.61 kN, reflecting diverse loading conditions during testing. The positive skewness of 0.8864 implies that certain beams experienced significantly higher loads, warranting an investigation into the role of factors such as reinforcement and aggregate composition.
Finally, beam displacement (Y) exhibits a mean of 4.14 mm, yet the elevated standard deviation of 4.9085 mm and positive skewness of 1.6060 indicate a distribution skewed towards larger displacements under load. This asymmetry underscores the significance of understanding the factors contributing to beam deformation, with potential implications for structural design and resilience. In essence, a deeper exploration of these statistical properties unveils a complex interplay of variables, paving the way for a more nuanced comprehension of how input parameters influence the load-bearing behaviour of steel-reinforced concrete beams. Further analyses, such as correlation studies and multivariate regression, could unravel intricate patterns and provide invaluable insights for optimizing the design and performance of such structural elements.
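The descriptive statistics discussed above (mean, standard deviation, skewness) can be computed as in the following minimal sketch. It assumes population moments (dividing by n); the convention used for Table 5 is not stated, so values from other tools that apply sample corrections may differ slightly. The displacement values are placeholders, not data from the study.

```python
import math
from typing import Dict, Sequence


def describe(values: Sequence[float]) -> Dict[str, float]:
    """Mean, population standard deviation, and moment-based skewness."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"mean": mean, "std": math.sqrt(m2), "skewness": skewness}


# Placeholder displacements (mm): many small values and a few large ones.
displacements_mm = [0.5, 1.0, 1.5, 2.0, 3.0, 5.0, 9.0, 18.0]
stats = describe(displacements_mm)
print(stats)
```

A positive skewness, as obtained here, means the bulk of the values lies below the mean with a tail extending towards large displacements.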
The correlation matrix heatmap (see Fig. 6) reveals the relationships among the variables (X1 to X9), shedding light on the interplay of factors influencing the load-bearing performance of steel-reinforced concrete beams.
Correlation matrix.
Rebars area (X1)
Correlation with X4 (natural aggregate mass) is 0.47, indicating a moderate positive relationship. As the cross-sectional area of reinforcement bars increases, the mass of natural aggregate tends to increase. Negative correlation with X3 (recycled aggregate mass) of -0.47 suggests that as the rebars area increases, the mass of recycled aggregate tends to decrease. Positive correlation of 0.4 with X9 (load) suggests that beams with larger rebars areas tend to experience higher loads.
Mass of cement portion (X2)
Positive correlation with X7 (compressive strength) is 0.62, indicating that an increase in the mass of the cement portion is associated with higher compressive strength. Negative correlation with X8 (CFRP) of -0.26 suggests that the presence of CFRP tends to be associated with a lower mass of the cement portion.
Mass of recycled aggregate (X3)
Negative correlation with X1 (rebars area) of -0.47 implies that as the mass of recycled aggregate increases, the cross-sectional area of reinforcement bars tends to decrease. Negative correlation with X4 (natural aggregate mass) of -0.47 indicates that an increase in recycled aggregate mass is associated with lower natural aggregate mass.
Mass of natural aggregate (X4)
Positive correlation with X1 (rebars area) of 0.47 suggests that as the mass of natural aggregate increases, the cross-sectional area of reinforcement bars tends to increase. Negative correlation with X3 (recycled aggregate mass) of -0.47 indicates an inverse relationship between natural and recycled aggregate masses.
Mass of silica fume (X5)
Positive correlation with X7 (compressive strength) of 0.71 indicates a strong positive relationship between the mass of silica fume and compressive strength.
Mass of fly ash (X6)
Negative correlation with X8 (CFRP) of -1 indicates a perfect negative relationship in this dataset; when fly ash is present, CFRP tends to be absent, and vice versa.
Compressive strength (X7)
Positive correlation with X2 (mass of cement portion) and X5 (mass of silica fume) suggests that higher masses of these components are associated with increased compressive strength. Negative correlation with X6 (mass of fly ash) and X3 (recycled aggregate mass) indicates that higher masses of fly ash and recycled aggregate are associated with lower compressive strength.
CFRP (X8)
Positive correlation with X3 (recycled aggregate mass) of 0.47 implies that the presence of CFRP is associated with higher masses of recycled aggregate. Negative correlation with X2 (mass of cement portion) and X6 (mass of fly ash) suggests that when CFRP is present, the masses of cement and fly ash tend to be lower.
Load (X9)
Positive correlation with X1 (rebars area), X4 (natural aggregate mass), X7 (compressive strength), and X3 (recycled aggregate mass) suggests that higher loads are associated with larger rebars areas, higher natural and recycled aggregate masses, and greater compressive strength. Negative correlation with X6 (mass of fly ash) and X8 (CFRP) indicates that higher loads are associated with the absence of fly ash and CFRP.
In-depth analysis of these correlations provides valuable insights for optimizing concrete mixtures and structural designs to enhance load-bearing performance. For instance, understanding the relationships between mass components and compressive strength can guide the selection of materials for achieving desired structural characteristics. The correlations also highlight potential trade-offs, such as the inverse relationship between recycled and natural aggregate masses, informing decisions on sustainable material usage. Additionally, the presence of CFRP and its correlations with other components suggest a nuanced role in the structural behaviour of the concrete beams.
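The coefficients discussed above are Pearson correlation coefficients. The following minimal sketch shows how one such coefficient is computed and why a replacement relationship produces a negative value. It assumes, purely for illustration, that recycled aggregate replaces natural aggregate at a fixed total mass of 100 kg; these masses are hypothetical and are not taken from Table 1.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical mixes: recycled aggregate replaces natural aggregate
# at a fixed total of 100 kg (assumption for illustration).
recycled_kg = [0.0, 30.0, 50.0, 70.0]
natural_kg = [100.0 - r for r in recycled_kg]

r_value = pearson(recycled_kg, natural_kg)
print(round(r_value, 6))
```

Under exact replacement the coefficient reaches its lower bound of -1; in a dataset where other mix components also vary between beams, the magnitude is smaller but the sign remains negative.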
Machine learning models
Random Forest regressor
The Random Forest (RF) algorithm is known for its capacity to generate numerous Regression Trees (RTs) that are uncorrelated during the training process. Each tree is grown within a randomly split subset derived from the training set Sn. The amalgamation of these RTs is achieved through a bagging method, a crucial technique for improving prediction accuracy by reducing the variance associated with predictions41,42.
In this context, the algorithm starts by randomly selecting n samples from \(S_n\), each with a selection probability of 1/n. These samples collectively constitute a bootstrap sample \(S_n^{\Theta}\), where \(\Theta\) is an independently distributed vector. Using the bagging algorithm, q bootstrap samples \(\left(S_n^{\Theta_1}, S_n^{\Theta_2}, \ldots, S_n^{\Theta_q}\right)\) are selected, and subsequently, q regression trees are trained on these subsets: \(\hat{h}\left(X, S_n^{\Theta_1}\right), \hat{h}\left(X, S_n^{\Theta_2}\right), \ldots, \hat{h}\left(X, S_n^{\Theta_q}\right)\). The resulting q outputs are obtained by fitting q regression trees: \(\hat{Y}_1 = \hat{h}\left(X, S_n^{\Theta_1}\right), \hat{Y}_2 = \hat{h}\left(X, S_n^{\Theta_2}\right), \ldots, \hat{Y}_q = \hat{h}\left(X, S_n^{\Theta_q}\right)\). The ultimate output is determined by averaging the values of these q outputs. This ensemble approach enhances the robustness and predictive power of the Random Forest algorithm by capitalizing on the diversity and independence inherent in individual Regression Trees.
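The bagging procedure described above can be sketched in a few lines. For brevity, this illustration replaces full regression trees with depth-one trees (stumps) on a single feature and omits the random feature selection of a complete Random Forest; it is a simplified stand-in, not the model used in this study. The load and displacement values are placeholders.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-one regression tree: pick the threshold with the
    lowest squared error and predict the mean on each side."""
    best = None
    for t in sorted(set(xs))[:-1]:  # exclude the maximum so both sides are non-empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        mean_left, mean_right = mean(left), mean(right)
        sse = sum((y - mean_left) ** 2 for y in left) + sum((y - mean_right) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, mean_left, mean_right)
    if best is None:  # every x identical: fall back to a constant predictor
        constant = mean(ys)
        return lambda x: constant
    _, threshold, mean_left, mean_right = best
    return lambda x: mean_left if x <= threshold else mean_right


def bagged_stumps(xs, ys, q, seed=0):
    """Train q stumps on bootstrap samples and average their outputs."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(q):
        # n draws with replacement, each sample chosen with probability 1/n
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(tree(x) for tree in trees)


# Placeholder data: load (kN) against displacement (mm).
loads = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
displacements = [0.3, 0.7, 1.2, 1.8, 2.6, 3.6, 4.9, 6.5, 8.6, 11.2]

predict = bagged_stumps(loads, displacements, q=50)
print(predict(10), predict(55), predict(100))
```

Each individual stump is a crude step function, but averaging over many bootstrap samples smooths the prediction, which is the variance-reduction effect attributed to bagging in the text.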
XGB
XGB is a machine learning technique that uses gradient-boosted decision trees for predictive modelling43. The estimated output \({\hat {y}_i}\) of the XGB model is obtained by summing the prediction scores fk(xi) from all trees in the model31:
\({\hat {y}_i}=\sum\limits_{{k=1}}^{K} {{f_k}\left( {{x_i}} \right)} ,\quad {f_k} \in \phi\)
Here ϕ denotes the set of regression trees, each characterized by its tree structure parameters s, and k is the tree index, running up to the total number of trees K. For each sample i, xi represents the corresponding features. Each leaf node of a tree carries a prediction score, fk(xi), which is the regression value assigned to all samples at that node. Each leaf node j with j ∈ {1,2,…T} has a computed leaf weight (wj), where T is the total number of leaves within the tree.
The XGB algorithm employs a regularized objective function to regulate the learning process, terminating when further reduction in the objective function becomes negligible. The regularization term penalizes both the number of leaf nodes and the squared L2 norm of the leaf node weights (wj). The objective function is given as follows:
\(\Phi =\sum\limits_{{i=1}}^{n} {l\left( {{y_i},{{\hat {y}}_i}} \right)} +\sum\limits_{{k=1}}^{K} {\Omega \left( {{f_k}} \right)} ,\quad \Omega \left( {{f_k}} \right)=\gamma T+\frac{1}{2}\lambda {\left\| \omega \right\|^2}\)
where n represents the number of data samples, and \(\sum\limits_{{i=1}}^{n} {l\left( {{y_i},{{\hat {y}}_i}} \right)}\) is the training loss function describing the model’s fit to the training data; \(\gamma T+\frac{1}{2}\lambda {\left\| \omega \right\|^2}\) is the regularization term of each tree, in which γ is the complexity cost of introducing an additional leaf, λ is a regularization hyper-parameter, and \(\omega\) is the vector of leaf node weights wj, so that \({\left\| \omega \right\|^2}=\sum\nolimits_{{j=1}}^{T} {w_{j}^{2}}\) is its squared L2 norm.
The construction of trees occurs through a sequential, additive learning process. The objective function Φ(k) is redefined at each iteration step k and estimated with a second-order Taylor expansion. This results in the approximation of the objective function given by Eq. (6):
\({\Phi ^{(k)}} \approx \sum\limits_{{i=1}}^{n} {\left[ {l\left( {{y_i},\hat {y}_{i}^{{(k - 1)}}} \right)+{g_i}{f_k}\left( {{x_i}} \right)+\frac{1}{2}{h_i}f_{k}^{2}\left( {{x_i}} \right)} \right]} +\gamma T+\frac{1}{2}\lambda {\left\| \omega \right\|^2}\quad (6)\)
where \({g_i}={\partial _{\hat {y}_{i}^{{(k - 1)}}}}l\left( {{y_i},\hat {y}_{i}^{{(k - 1)}}} \right)\) and \({h_i}=\partial _{{\hat {y}_{i}^{{(k - 1)}}}}^{2}l\left( {{y_i},\hat {y}_{i}^{{(k - 1)}}} \right)\) are the first- and second-order gradient statistics of the loss function, respectively.
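To achieve an optimal solution, a tree is described by a leaf score vector and a mapping function q that associates each instance with a leaf j. Writing \({I_j}=\left\{ {i|q\left( {{x_i}} \right)=j} \right\}\) for the set of instances mapped to leaf j and dropping the constant term, Eq. (6) can be expressed as:
\({\tilde {\Phi }^{(k)}}=\sum\limits_{{j=1}}^{T} {\left[ {\left( {\sum\limits_{{i \in {I_j}}} {{g_i}} } \right){w_j}+\frac{1}{2}\left( {\sum\limits_{{i \in {I_j}}} {{h_i}} +\lambda } \right)w_{j}^{2}} \right]} +\gamma T\quad (7)\)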
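After determining the tree structure, the best scores for each leaf node are obtained by minimizing the resulting quadratic function of the leaf weights (Eq. (8) and Eq. (9)), where Eq. (8) is the function that scores the tree structure and Eq. (9) gives the corresponding optimal leaf weights:
\({\tilde {\Phi }^*}= - \frac{1}{2}\sum\limits_{{j=1}}^{T} {\frac{{G_{j}^{2}}}{{{H_j}+\lambda }}} +\gamma T\quad (8)\)
\(w_{j}^{*}= - \frac{{{G_j}}}{{{H_j}+\lambda }}\quad (9)\)
where Gj and Hj are the sums of gi and hi, respectively, over the instances assigned to leaf j.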
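A greedy algorithm is employed to search for the optimal tree structure by iteratively traversing all leaf nodes and selecting the split that yields the largest reduction in the objective function. The gain calculation function for the splitting point is defined as:
\(Gain=\frac{1}{2}\left[ {\frac{{G_{L}^{2}}}{{{H_L}+\lambda }}+\frac{{G_{R}^{2}}}{{{H_R}+\lambda }} - \frac{{{{\left( {{G_L}+{G_R}} \right)}^2}}}{{{H_L}+{H_R}+\lambda }}} \right] - \gamma \quad (10)\)
where \(\frac{{G_{L}^{2}}}{{{H_L}+\lambda }}\) and \(\frac{{G_{R}^{2}}}{{{H_R}+\lambda }}\) are the scores of the left and the right subtrees, respectively, \(\frac{{{{\left( {{G_L}+{G_R}} \right)}^2}}}{{{H_L}+{H_R}+\lambda }}\) is the score when there is no split, and G and H denote the sums of gi and hi over the instances in the corresponding subtree.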
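To make the split-gain calculation concrete, the following sketch evaluates the left, right, and no-split scores for a toy example. It assumes the standard XGBoost form of the gain, with a factor of 1/2 and a per-leaf cost γ, and a squared-error loss \(l=\frac{1}{2}{\left( {y - \hat {y}} \right)^2}\), for which gi is the current prediction minus the target and hi = 1. The function names and the data are hypothetical.

```python
def leaf_score(G, H, lam):
    """Structure score G^2 / (H + lambda) of a single leaf."""
    return G * G / (H + lam)


def optimal_leaf_weight(G, H, lam):
    """Leaf weight that minimizes the second-order approximation of the objective."""
    return -G / (H + lam)


def split_gain(g, h, left_idx, lam=1.0, gamma=0.0):
    """Gain of splitting a node into the instances in left_idx and the remainder."""
    left = set(left_idx)
    G_L = sum(g[i] for i in left)
    H_L = sum(h[i] for i in left)
    G_R = sum(g) - G_L
    H_R = sum(h) - H_L
    return 0.5 * (leaf_score(G_L, H_L, lam)
                  + leaf_score(G_R, H_R, lam)
                  - leaf_score(G_L + G_R, H_L + H_R, lam)) - gamma


# Toy node (hypothetical): targets [1, 1, 3, 3], current prediction 0 for every sample.
y = [1.0, 1.0, 3.0, 3.0]
g = [0.0 - yi for yi in y]   # first-order gradients for squared-error loss
h = [1.0] * len(y)           # second-order gradients for squared-error loss

print(split_gain(g, h, left_idx=[0, 1]))             # gain of separating the two groups
print(split_gain(g, h, left_idx=[0, 1], gamma=1.0))  # a larger gamma can make the split unprofitable
print(optimal_leaf_weight(sum(g[:2]), sum(h[:2]), 1.0))
```

A split is only worth making when the gain is positive, which is how γ acts as a pruning threshold.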
LGBM
LGBM is a state-of-the-art machine learning framework developed by Microsoft Research. It is built on the foundation of the boosting regression algorithm and is renowned for its efficiency and precision in model training. Unlike traditional methods, LGBM incorporates techniques that yield faster training with reduced memory usage. The key features are briefly discussed below44,45.
Gradient boosting algorithm: LGBM leverages the power of the gradient boosting algorithm, an ensemble learning technique that sequentially builds a series of weak learners to create a robust predictive model.
Histogram-based techniques: To efficiently handle continuous features, LGBM employs advanced histogram-based techniques. This not only accelerates the training process but also minimizes memory requirements, making it suitable for large-scale datasets.
Leaf-wise Tree Growth: LGBM adopts a leaf-wise tree growth strategy, deviating from the traditional level-wise approach. This technique expands the tree by selecting the leaf that provides the maximum reduction in loss, which typically yields a lower loss than level-wise growth for the same number of leaves. Because leaf-wise trees can grow deep, overfitting is controlled by limiting the number of leaves or the tree depth.
Efficiency and Precision: Known for its exceptional efficiency and precision, LGBM is a versatile tool applicable to a diverse array of machine learning tasks. Its optimization strategies, including histogram-based techniques and leaf-wise tree growth, contribute to high-performance models.
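The histogram-based idea listed above can be illustrated with a short sketch: feature values are bucketed into a small number of bins, gradient statistics are accumulated per bin, and only the bin boundaries are scanned as candidate split points. This is a simplified illustration and not LightGBM's actual implementation; equal-width bins, a squared-error loss (gradient equal to prediction minus target, second-order gradient equal to 1), and the function name `histogram_best_split` are all assumptions made for the example.

```python
def histogram_best_split(x, g, h, n_bins=8, lam=1.0):
    """Find the best split threshold for one feature using per-bin gradient sums."""
    x_min, x_max = min(x), max(x)
    width = (x_max - x_min) / n_bins or 1.0  # guard against a constant feature
    grad_hist = [0.0] * n_bins
    hess_hist = [0.0] * n_bins
    for xi, gi, hi in zip(x, g, h):
        b = min(int((xi - x_min) / width), n_bins - 1)  # bin index of this sample
        grad_hist[b] += gi
        hess_hist[b] += hi

    g_total, h_total = sum(grad_hist), sum(hess_hist)
    parent_score = g_total ** 2 / (h_total + lam)

    best_threshold, best_gain = None, 0.0
    g_left = h_left = 0.0
    for b in range(n_bins - 1):  # scan bin boundaries instead of every sample value
        g_left += grad_hist[b]
        h_left += hess_hist[b]
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = 0.5 * (g_left ** 2 / (h_left + lam)
                      + g_right ** 2 / (h_right + lam)
                      - parent_score)
        if gain > best_gain:
            best_threshold, best_gain = x_min + (b + 1) * width, gain
    return best_threshold, best_gain


# Toy data (hypothetical): the target jumps from 1 to 3 between x = 3 and x = 4.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
g = [0.0 - yi for yi in y]  # gradients at an initial prediction of 0
h = [1.0] * len(y)
print(histogram_best_split(x, g, h))
```

The cost of the scan depends on the number of bins rather than the number of samples, which is the source of the speed and memory savings attributed to this technique.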
Pareto optimization
Pareto optimization, also known as Pareto efficiency or Pareto optimality, is a concept derived from the work of the Italian economist Vilfredo Pareto. It refers to a state of allocation of resources in which it is impossible to make any one individual or preference criterion better off without making at least one individual or preference criterion worse off. In simpler terms, Pareto optimization is about achieving the best possible outcome for one or more criteria without negatively impacting any other criteria. This concept is often applied in various fields such as economics, engineering, and multi-objective optimization problems46,47,48,49. A brief theoretical concept is stated below.
Objective functions: In Pareto optimization, there are typically multiple objective functions, denoted as \({f_1}\left( x \right),{f_2}\left( x \right),{f_3}\left( x \right),\ldots,{f_k}\left( x \right)\), where x is the vector of decision variables.
Decision Variables: x represents the decision variable vector that you want to optimize. These variables might be subject to certain constraints.
Feasible region: Constraints define the feasible region in the decision variable space, denoted as \(x \in X\), where X is the set of feasible solutions.
Pareto Dominance: For a maximization problem, a solution x1 is said to dominate another solution x2 if \({f_i}\left( {{x_1}} \right) \geqslant {f_i}\left( {{x_2}} \right)\) for all i and \({f_j}\left( {{x_1}} \right)>{f_j}\left( {{x_2}} \right)\) for at least one j; for a minimization problem, the inequalities are reversed. This dominance relation is denoted as x1≤ x2.
Pareto Front: The Pareto front is the set of non-dominated solutions, i.e., solutions that are not dominated by any other solution. Mathematically, it is the set \(\left\{ {x \in \left. X \right|\nexists x^{\prime} \in X:x^{\prime} \leqslant x} \right\}\).
Pareto Set: The Pareto set is the set of corresponding decision variables for the solutions on the Pareto front.
Mathematical Formulation: The goal of Pareto optimization is to find solutions x that belong to the Pareto front by solving a multi-objective optimization problem. This problem can be formulated as finding x that minimizes or maximizes a vector objective function \(F\left( x \right)=\left[ {{f_1}\left( x \right),{f_2}\left( x \right),{f_3}\left( x \right),\ldots,{f_k}\left( x \right)} \right]\) subject to constraints.
Mathematically, one may write the multi-objective optimization problem as: Minimize (or Maximize) \(F\left( x \right)=\left[ {{f_1}\left( x \right),{f_2}\left( x \right),{f_3}\left( x \right),\ldots,{f_k}\left( x \right)} \right]\) subject to \(x \in X\)
Optimality conditions: A solution x is Pareto optimal if there is no other feasible solution that Pareto dominates it.
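The dominance test and the extraction of the non-dominated set can be sketched as follows. The example assumes that every objective is minimized, as is the case when the objectives are the training and testing MSE, so the inequalities are reversed relative to the maximization form; the candidate points are hypothetical and are not results from this study.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b when all objectives are minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (MSE train, MSE test) pairs, one per candidate hyperparameter setting.
candidates = [
    (0.30, 0.70),
    (0.35, 0.60),
    (0.45, 0.55),
    (0.40, 0.65),  # dominated by (0.35, 0.60)
    (0.45, 0.70),  # dominated by several points
]
print(pareto_front(candidates))
```

The pairwise comparison costs O(m^2) for m candidates, which is adequate for the few hundred configurations typical of a hyperparameter search.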
Flowchart of the proposed soft computing approach
The proposed soft computing approach for this study is presented as a flow chart in Fig. 7. The procedural steps are outlined as follows:
Data Acquisition: The initial step involves the execution of eight bending tests, generating a comprehensive dataset capturing the load-displacement characteristics of RC beams with diverse compositions.
Model selection and Pareto Optimization: The study focuses on identifying the most suitable predictive models for the acquired dataset. A rigorous evaluation is conducted, employing state-of-the-art machine learning frameworks, namely RFR, XGB, and LGBM. Subsequently, these selected models are further enhanced through Pareto optimization, to improve their performance.
Performance evaluation: The optimal models are systematically evaluated to assess their performance. This involves a detailed analysis of various metrics to determine the most effective model for accurately predicting the load-displacement behavior of RC beams under varying conditions.
Insight generation: The final step aims at gaining insights into the physical significance of the features influencing the flexural capacity of RC beams. Factors such as recycled aggregate, fly ash, CFRP, and silica fume are specifically examined to clarify the relationships between these variables and the structural performance of RC beams.
Flow chart for the proposed soft computing approach.
This approach is comprehensive as it integrates experimental data, advanced machine learning techniques, and Pareto optimization and provides a robust framework for understanding and predicting the behavior of RC beams under diverse compositional influences.
Comparative study of various ML models
Seven distinct machine learning methods are trained on the experimental dataset: RFR42, LGBM45, XGB31, ANN50, Decision Tree (DT), Multivariate Adaptive Regression Splines (MARS)51, and Support Vector Machine (SVM)24. These methods represent a diverse set of algorithms, each with its own strengths in predictive capability. Using this range of methods allows a thorough investigation of the dataset and accommodates the various patterns and structures inherent in the data. Each model is evaluated over multiple iterations and random train/test splits to confirm the stability of both the data and the model. The details of each model’s performance are given in Table 6.
The examination of the machine learning models, as delineated in Table 6, establishes a clear hierarchy of performance and reveals the strengths and potential shortcomings of each model across the metrics. RFR ranks first, with a Mean Squared Error (MSE) of 0.101 and a Mean Absolute Error (MAE) of 0.165 on the testing set. Its R-squared (R2) score of 0.98 shows that RFR captures nearly all of the variability in the target variable.
LGBM takes the second position. Its MSE of 0.441, MAE of 0.206, and consistent R2 scores of 0.98 on both training and testing sets attest to its robust predictive capabilities. XGB, positioned third, records the lowest MSE (0.041) and MAE (0.074) on the testing set, accompanied by very high R2 scores, which makes it a competitive choice for complex applications.
By contrast, although DT achieves the lowest MSE (0.029) on the training set, its R2 of 0.694 points to overfitting, which is not surprising for a single tree. ANN and MARS exhibit moderate performance with elevated MSE and MAE values, signalling room for refinement. SVM ranks lowest, with high MSE and MAE on the testing set. This evaluation underscores the complexity of model selection and the need to weigh accuracy against generalization within the specific application context.
The graphical comparison in Fig. 8 shows that three methods, namely RFR, LGBM, and XGB, clearly outperform the others. Consequently, these three models were selected for further enhancement through Pareto optimization, with the aim of refining their performance on the experimental dataset.
Performance evaluation of machine learning methods. Comparative analysis across key metrics, including (a) Mean Squared Error (MSE), (b) Mean Absolute Error (MAE), and (c) R-squared (R2).
The outcome of Pareto optimization for LGBM, RFR, and XGB machine learning models is presented in Fig. 9. In the case of LGBM, the outcome is visually depicted through a scatter plot graph, where the x-axis represents Mean Squared Error (MSE) in training (ranging from 0.3000 to 0.4500), and the y-axis signifies MSE in testing (ranging from 0.550 to 0.700). In Fig. 9a, lower values on both axes indicate superior performance. Feasible points, denoted by green dots, are scattered across the plot, with a notable concentration along and near the red Pareto front line (i.e., all Pareto efficient solutions). This Pareto front, represented by a line shown with stars, delineates optimal solutions, in which improving one objective necessitates a trade-off with another. The presence of green dots clustered along the Pareto front highlights solutions that strike a balance between training and testing MSE, showcasing the model’s generalization capabilities without overfitting.
Outcome of Pareto optimization for (a) LGBM; (b) RFR; (c) XGB.
Similarly, the Pareto optimization analysis for the RFR method is illustrated in Fig. 9b as a scatter plot with feasible points marked by green circles. The x-axis represents MSE train (ranging from 0.3000 to 0.4500), and the y-axis denotes MSE test (ranging from 0.550 to 0.700). A discernible concentration of green circles is observed along and close to the red Pareto front line. The Pareto front delineates optimal solutions where improving one objective comes at the expense of the other. The graph shows that MSE test initially rises with MSE train until it reaches a pivotal point, beyond which further increases in training error lead to a decrease in testing error. The blue star on the Pareto front marks the best solution, indicating an optimal balance between training and testing errors and affirming the model’s capacity to generalize effectively without overfitting.
In the Pareto optimization analysis for the XGB method, as shown in Fig. 9c, a scatter plot graph showcases feasible points (green circles) with the x-axis representing MSE train (ranging from 0 to 0.3) and the y-axis depicting MSE test (ranging from 0.450 to 0.625). The pronounced concentration of green circular dots along and near the red Pareto front line highlights optimal solutions where improvement in one objective involves a trade-off with another. The graph demonstrates that as MSE train increases, MSE test initially rises but then decreases after reaching a threshold, emphasizing an optimal balance. The strategically placed blue star on the Pareto front signifies the best solution, exemplifying an optimal equilibrium between training and testing errors. This reinforces the model’s effectiveness in generalization without succumbing to overfitting during training.
The results of the Pareto optimization are summarized in Table 7. The outcomes for the three selected machine learning methods, namely RFR, LGBM, and XGB, reflect a fine-tuning of hyperparameters to strengthen their predictive capabilities. For RFR, the optimal configuration comprises 563 trees, a maximum depth of 10, and minimum sample counts for splitting and for leaf nodes, reflecting a balanced complexity. The optimal settings of LGBM include 636 trees, a learning rate of 0.048, and tuned values for parameters such as colsample_bytree and num_leaves. For XGB, the tuned parameters include 373 trees, a learning rate of 0.023, and the regularization terms. This optimization contributes to enhanced performance and emphasizes the significance of hyperparameter tuning in extracting the full potential of these machine learning models for robust predictions.
Model performance with the optimal hyperparameters
The LGBM model demonstrates exceptional performance, as indicated by various key metrics in Fig. 10. In terms of the R2, the model exhibits high values of 0.98 for the training dataset and 0.989 for the testing dataset. These scores suggest that the LGBM model effectively explains the variance in the target variable, showcasing robust predictive capabilities in both training and testing scenarios. The RMSE values of 0.706 for training and 0.454 for testing further emphasize the model’s accuracy, with lower values indicating closer proximity of predicted values to the actual ones.
Performance evaluation of LGBM model on (a) training data and (b) testing data.
Additionally, the MAE values of 0.192 for training and 0.145 for testing underscore the LGBM model’s ability to provide predictions with low absolute errors on average. The A20 metric, measuring the percentage of predicted values within 20% of the true values, exhibits remarkable consistency with values of 72.84 for training and 72.727 for testing. This consistency indicates the model’s reliability in maintaining accuracy across different datasets.
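For reference, the metrics used in this section can be computed as in the sketch below. The A20 value follows the definition given in the text, namely the percentage of predictions that lie within 20% of the corresponding true value; the function name and the sample numbers are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, R2 and A20 (in percent) for paired true and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    # A20: share of predictions whose absolute error is at most 20% of the true value.
    within = sum(1 for t, p in zip(y_true, y_pred) if abs(p - t) <= 0.2 * abs(t))
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": r2,
        "A20": 100.0 * within / n,
    }


# Hypothetical displacements (true vs predicted), for illustration only.
y_true = [1.0, 2.0, 4.0, 5.0]
y_pred = [1.1, 2.0, 5.0, 4.5]
print(regression_metrics(y_true, y_pred))
```

Because A20 is a relative measure, it penalizes small absolute errors on small target values, which is why it can differ markedly between models whose RMSE and MAE are similar.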
In summary, the LGBM model’s performance is characterized by high R2 values, low RMSE and MAE values, and a consistently high A20 metric for both training and testing datasets. These metrics collectively highlight the model’s robustness, accuracy, and generalization ability, making it a strong candidate for predictive modelling in the given context.
As shown in Fig. 11, the XGB model performs well across the metrics, yet there are notable discrepancies between its performance on the training and testing datasets. The R2 values are exceptionally high, reaching 0.998 for the training set and 0.99 for the testing set. These scores indicate an almost perfect explanation of the variance in the target variable during training, and a slightly lower, though still substantial, level of explanatory power during testing.
Performance evaluation of XGB model on (a) training data and (b) testing data.
The RMSE and MAE metrics complete the picture. The RMSE values are low, at 0.213 for training and 0.433 for testing, suggesting that the model’s predictions are generally close to the true values, although the testing error is about twice the training error. Likewise, the MAE values of 0.079 for training and 0.134 for testing indicate a low average absolute error in predictions, further affirming the model’s accuracy.
The most notable difference lies in the A20 metric, representing the percentage of predicted values within 20% of the true values. While the training set boasts an impressive A20 of 95.062, indicating a high proportion of accurate predictions, the testing set exhibits a lower A20 of 71.429. This discrepancy suggests that the XGB model, while excelling in accuracy during training, may face challenges in generalizing its predictive capabilities to unseen data.
In summary, the XGB model’s outstanding R2, low RMSE, and MAE underscore its proficiency in capturing patterns in the training data. However, the decrease in A20 during testing hints at potential overfitting or challenges in generalization, emphasizing the importance of further fine-tuning and evaluation to enhance the model’s robustness across diverse datasets.
As presented in Fig. 12, the RFR model exhibits a strong and balanced performance across various evaluation metrics, demonstrating its effectiveness in predictive modelling. The R2 values are notably high, standing at 0.995 for the training set and 0.993 for the testing set. These scores indicate the model’s exceptional ability to explain the variance in the target variable, showcasing a high level of predictive accuracy for both datasets.
Performance evaluation of RFR model on (a) training data and (b) testing data.
In terms of predictive accuracy, the RMSE values are low, with 0.371 for training and 0.375 for testing. This suggests that the model’s predictions are closely aligned with the true values, reflecting its precision in capturing the underlying patterns in the data. Similarly, the MAE values of 0.111 for training and 0.123 for testing reinforce the model’s ability to make accurate predictions with low absolute errors on average.
The A20 metric, which measures the percentage of predicted values within 20% of the true values, further emphasizes the model’s robustness. RFR achieves a high A20 of 90 for the training set, indicating that a substantial portion of its predictions fall within the 20% margin of the true values. Despite the slightly lower A20 value of 87.5 for the testing set, the model still shows a strong generalization ability, suggesting that RFR maintains accurate predictions on unseen data.
In summary, the RFR model showcases exceptional accuracy and generalization capabilities, as evidenced by high R2 values, low RMSE and MAE values, and a consistently high A20 metric for both training and testing datasets. This performance underscores RFR’s reliability and efficacy in predictive modelling across diverse datasets.
The comparison among the three methods, namely RFR, XGB, and LGBM, reveals distinct performance characteristics. RFR performs best across multiple metrics. The model achieves a low Mean Squared Error (MSE) of roughly 0.14 to 0.15 for both the training and testing sets (Fig. 13a), in line with the RMSE values of 0.371 and 0.375 reported above, indicating that it captures the patterns within the dataset and predicts accurately. This is complemented by the model’s consistency, reflected in a Mean Absolute Error (MAE) of around 0.1 for both sets (Fig. 13b), which shows that RFR provides predictions closely aligned with the true values.
Performance evaluation of three models with optimized hyperparameters. Comparative analysis across key metrics: (a) MSE, (b) MAE, (c) A20, and (d) R2.
Further supporting RFR’s lead, the A20 metric remains high and stable, at 90 for training and 87.5 for testing (Fig. 13c), underscoring the model’s reliable performance on both seen and unseen data. The R2 values of 0.995 for training and 0.993 for testing (Fig. 13d) confirm its capacity to explain the variance in the target variable.
The optimization of the XGB model led to significant improvements, particularly in reducing the MSE, showing that thorough hyperparameter tuning effectively mitigates overfitting. However, the RFR model outperformed all others, achieving the lowest MSE and demonstrating superior predictive accuracy (ref. Table 8). Additionally, all models exhibited reduced overfitting, as indicated by increased training MSE and decreased testing MSE, suggesting better generalization to unseen data. These findings underscore the effectiveness of advanced ensemble models and optimization techniques in enhancing prediction accuracy and model robustness.
The residual analysis of the RFR model in Fig. 14 reveals a distribution that closely resembles a normal distribution on both the training and testing sets, with most residuals clustering around zero (±0.5). This pattern signifies a well-calibrated model, demonstrating minimal bias in its predictions. The normal distribution indicates that, on average, the model’s predictions align with the true values, showcasing balanced and unbiased predictive behaviour.
Performance evaluation of RFR model: (a) Residual distribution on training set, (b) Residuals on training set, (c) Residual distribution on testing set, and (d) Residuals on testing set.
The fact that most residuals are concentrated around zero emphasizes the model’s precision, indicating that it tends to make predictions close to the actual values with minimal systematic overestimation or underestimation. This observation supports the reliability of the RFR model, suggesting that it provides accurate and unbiased estimates across diverse datasets. While the absence of a noticeable skewness in residuals points towards a lack of significant bias, it’s crucial to consider these findings alongside other performance metrics for a comprehensive evaluation of the model’s predictive capabilities.
Sensitivity analysis
The feature importance plots in Fig. 15 show how much each input variable contributes to the predictions of the RFR model. The percentage contribution assigned to each variable indicates its relative impact on the model’s predictions.
Feature importance plots showing percentage (%).
Rebars area (X1) emerges as the most influential variable, contributing 48% to the model’s predictive power. This underscores the critical role that the area of rebars plays in determining the outcome, highlighting its significance in the structural analysis represented by the dataset. Applied Load (X9) closely follows with a 46% contribution, suggesting that the applied load is a pivotal factor affecting the predicted outcome. These two variables, i.e., Rebars area and Load, collectively account for a substantial portion of the model’s explanatory power.
Compressive strength (X7) contributes 2%, signifying a lesser but still relevant impact on predictions. The variables Mass of silica fume (X5), Mass of recycled aggregate (X3), and Mass of cement portion (X2) each contribute 1%, indicating that they influence the model’s output, albeit to a lesser extent.
Notably, the absence of specific percentage contributions for certain variables, such as Mass of natural aggregate, Mass of fly ash, and CFRP, implies that these inputs have negligible effects on the model’s predictions. While they may have some influence, their impact is overshadowed by the dominant contributions of Rebars area and Load. This study enables stakeholders to prioritize their resources on optimizing the most influential variables for enhanced predictive performance. The feature importance analysis thus serves as a valuable guide for decision-makers.
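One model-agnostic way to obtain percentage contributions of this kind is permutation importance, sketched below. This is offered as an illustrative alternative and is not necessarily the method behind Fig. 15, since tree ensembles are often summarized with impurity-based importances instead. The function name, the toy predictor, and the data are hypothetical.

```python
import random


def permutation_importance(predict, X, y, seed=0):
    """Percentage importance of each feature from the rise in MSE after shuffling it."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    increases = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        increases.append(max(mse(shuffled) - baseline, 0.0))
    total = sum(increases) or 1.0  # avoid division by zero when nothing matters
    return [100.0 * inc / total for inc in increases]


# Toy setup (hypothetical): the prediction depends on feature 0 only.
X = [[float(i), float((i * 7) % 10)] for i in range(10)]
y = [2.0 * row[0] for row in X]
model = lambda row: 2.0 * row[0]
print(permutation_importance(model, X, y))
```

A feature whose shuffling leaves the error unchanged receives a contribution of zero, which mirrors the negligible contributions reported for some inputs above.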
The Partial Dependence Plots (PDPs) in Fig. 16 reveal the relationships between the input variables and the predicted output, namely the beam’s displacement. Each variable is discussed in turn below.
Partial Dependence Plots (PDPs) illustrating the nuanced relationships between key input variables and the predicted output, specifically the beam’s displacement in the structural analysis model.
X1 (rebars area): A substantial increase from 1.54 to 2.01 induces a significant decrease in beam displacement, plummeting from above 6 to below 3. This emphasizes a robust negative relationship between X1 and beam displacement, highlighting the critical role of rebars area in influencing structural stability.
X2 (Mass of cement portion): A marginal increase from 20 to 28 results in a small rise in displacement, showcasing a subtle impact compared to other variables. The influence of X2 on beam displacement appears relatively modest.
X3 (Mass of recycled aggregate): The variation in X3 presents a fascinating pattern. An increase from 0 to 40 leads to a notable reduction in displacement, while a subsequent increase from 40 to 62 induces a rise in displacement. Beyond 62, displacement experiences a slight reduction, underscoring the complexity of the relationship.
X4 (Mass of natural aggregate): Surprisingly, X4 exhibits no discernible effect on beam displacement. The variations in X4 do not translate into changes in the model’s output, suggesting its limited influence on structural behavior.
X5 (Mass of silica fume): A moderate increase from 0 to 3 induces a noticeable decrease in displacement, underlining the impact of silica fume on enhancing structural integrity.
X6 (Mass of fly ash): X6 shows no apparent effect on the output, suggesting that changes in fly ash mass do not significantly alter beam displacement in the modeled scenario.
X7 (compressive strength): The increase in compressive strength from 12.8 to 18 results in a decrease in displacement, showcasing the importance of compressive strength in mitigating structural deformations. However, beyond 18, the effect becomes negligible, indicating a potential saturation point.
X8 (CFRP): X8 demonstrates minimal influence on beam displacement, suggesting that the presence of CFRP has a limited impact on the structural response modeled by the system.
X9 (load): The relationship between load and displacement is notable. Generally, an increase in load leads to an elevation in beam displacement. Interestingly, this relationship becomes more stable when the load is less than 75 kN. However, beyond this threshold, the displacement fluctuates, indicating potential structural challenges under higher loads.
To sum up, these intricate insights enable a deeper understanding of the structural dynamics, providing valuable guidance for structural optimization and informed decision-making in scenarios involving varying input conditions.
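The partial dependence curves discussed above can be computed with a simple averaging procedure: one feature is fixed at each grid value in turn for every sample, and the model predictions are averaged. The sketch below assumes a generic `predict` callable; the toy model and the data are hypothetical and do not reproduce the fitted RFR model.

```python
def partial_dependence(predict, X, feature, grid):
    """Average prediction over the dataset with one feature fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)       # copy so the dataset is left untouched
            modified[feature] = value  # fix the feature of interest
            total += predict(modified)
        curve.append(total / len(X))
    return curve


# Toy model (hypothetical): output falls with feature 0 and rises with feature 1.
toy_model = lambda row: 10.0 - 2.0 * row[0] + 0.5 * row[1]
X = [[1.5, 20.0], [2.0, 40.0], [1.8, 60.0]]

print(partial_dependence(toy_model, X, feature=0, grid=[1.5, 2.0]))
```

Because the remaining features keep their observed values, the curve shows the average effect of the chosen feature and can hide interactions between features.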
Conclusion
This study addressed the critical issue of predicting the flexural strength of concrete beams incorporating recycled aggregates, fly ash, silica fume, and CFRP. To tackle this, a comprehensive experimental program involving eight bending beam tests produced an extensive dataset of 4851 samples. This dataset formed the basis for applying advanced machine learning models, particularly ensemble methods fine-tuned via hyperparameter optimization using Pareto optimization.
Among the models evaluated, RFR emerged as the most effective, achieving the lowest MSE and demonstrating superior predictive accuracy. The optimization of the XGB model also resulted in significant improvements, highlighting the importance of thorough hyperparameter tuning in mitigating overfitting. All models showed improved generalization capabilities, as evidenced by increased training MSE and decreased testing MSE, indicating reduced overfitting and enhanced prediction accuracy on unseen data.
The sensitivity analysis revealed the contributions of the various input variables to the flexural response, offering insights into the influence of cement content, aggregate proportions, supplementary materials, compressive strength, and the presence of CFRP.
In summary, this study successfully combined meticulous experimental work with advanced machine learning techniques to address the problem of predicting flexural strength in concrete beams. The key findings underscore the effectiveness of ensemble models and optimization techniques in improving prediction accuracy and model robustness. These contributions not only advance our understanding of the factors influencing concrete beam performance but also pave the way for future advancements in sustainable construction practices and optimized structural design.
Limitations and future work
Despite the comprehensive experimental approach, the number of test beams in this study was limited by significant logistical and resource constraints. This limitation affects the statistical significance and generalizability of the findings. Although robust hybrid explainable machine learning models were developed to capture the load-displacement behavior of steel-reinforced concrete beams with recycled aggregates and CFRP, the precise effect of CFRP on the beams’ flexural strength remains insufficiently clarified.
In future work, the dataset will be substantially expanded to include a greater number of test beams, improving the robustness and reliability of the results. Future studies will also consider the effects of parameters such as loading rate, beam size, and aspect ratio [52, 53]. This will give a more comprehensive understanding of the factors influencing the flexural strength of beams and further refine the predictive capabilities of the machine learning models. Including these variables will contribute to more accurate and generalized models for practical applications in sustainable construction and structural design optimization.
Data availability
The data that support the findings of this study are available on request from the corresponding author.
References
Wang, B., Yan, L., Fu, Q. & Kasal, B. A comprehensive review on recycled aggregate and recycled aggregate concrete. Resour. Conserv. Recycl. 171, 105565. https://doi.org/10.1016/j.resconrec.2021.105565 (2021).
Abushanab, A. & Alnahhal, W. Flexural behavior of reinforced concrete beams prepared with treated wastewater, recycled concrete aggregates, and fly ash. Structures. 45, 2067–2079. https://doi.org/10.1016/j.istruc.2022.10.029 (2022).
Sato, R., Maruyama, I., Sogabe, T. & Sogo, M. Flexural behavior of reinforced recycled concrete beams. J. Adv. Concr. Technol. 5, 43–61. https://doi.org/10.3151/jact.5.43 (2007).
Momeni, E., Omidinasab, F., Dalvand, A., Goodarzimehr, V. & Eskandari, A. Flexural strength of concrete beams made of recycled aggregates: an experimental and soft computing-based study. Sustainability. 14, 11769. https://doi.org/10.3390/su141811769 (2022).
González-Taboada, I., González-Fonteboa, B., Martínez-Abella, F. & Pérez-Ordóñez, J. L. Prediction of the mechanical properties of structural recycled concrete using multivariable regression and genetic programming. Constr. Build. Mater. 106, 480–499. https://doi.org/10.1016/j.conbuildmat.2015.12.136 (2016).
Ignjatović, I. S., Marinković, S. B., Mišković, Z. M. & Savić, A. R. Flexural behavior of reinforced recycled aggregate concrete beams under short-term loading. Mater. Struct. 46, 1045–1059. https://doi.org/10.1617/s11527-012-9952-9 (2012).
Velay-Lizancos, M., Vazquez-Burgo, P., Restrepo, D. & Martinez-Lage, I. Effect of fine and coarse recycled concrete aggregate on the mechanical behavior of precast reinforced beams: comparison of FE simulations, theoretical, and experimental results on real scale beams. Constr. Build. Mater. 191, 1109–1119. https://doi.org/10.1016/j.conbuildmat.2018.10.075 (2018).
Arezoumandi, M., Smith, A., Volz, J. S. & Khayat, K. H. An experimental study on flexural strength of reinforced concrete beams with 100% recycled concrete aggregate. Eng. Struct. 88, 154–162. https://doi.org/10.1016/j.engstruct.2015.01.043 (2015).
Choi, W. C., Yun, H. D. & Kim, S. W. Flexural performance of reinforced recycled aggregate concrete beams. Magazine Concrete Res. 64, 837–848. https://doi.org/10.1680/macr.11.00018 (2012).
Tošić, N., Marinković, S. & Ignjatović, I. A database on flexural and shear strength of reinforced recycled aggregate concrete beams and comparison to Eurocode 2 predictions. Constr. Build. Mater. 127, 932–944. https://doi.org/10.1016/j.conbuildmat.2016.10.058 (2016).
Ghoniem, A. Deep learning shear capacity prediction of fibrous recycled aggregate concrete beams strengthened by side carbon fiber-reinforced polymer sheets. Compos. Struct. 300, 116137. https://doi.org/10.1016/j.compstruct.2022.116137 (2022).
Sojobi, A. O. & Liew, K. M. Multi-objective optimization of high performance bio-inspired prefabricated composites for sustainable and resilient construction. Compos. Struct. 279, 114732. https://doi.org/10.1016/j.compstruct.2021.114732 (2022).
Kar, S. & Biswal, K. C. Rehabilitation of RC flexural members in shear with externally bonded fiber-reinforced polymer composites: present status and future need. Archives Civil Mech. Eng. 21 https://doi.org/10.1007/s43452-021-00274-8 (2021).
Abdel-Karim, A. H., Khalil, G. I., Ewis, A. E. & Makhlouf, M. H. Impact of developed hybrid polypropylene fiber inclusion on the flexural performance of concrete beams reinforced with innovative hybrid bars. Constr. Build. Mater. 409, 134113. https://doi.org/10.1016/j.conbuildmat.2023.134113 (2023).
Karayannis, C. G. & Golias, E. Full-scale experimental testing of RC Beam-column joints strengthened using CFRP ropes as external reinforcement. Eng. Struct. 250, 113305. https://doi.org/10.1016/j.engstruct.2021.113305 (2022).
Ruan, X., Lu, C., Xu, K., Xuan, G. & Ni, M. Flexural behavior and serviceability of concrete beams hybrid-reinforced with GFRP bars and steel bars. Compos. Struct. 235, 111772. https://doi.org/10.1016/j.compstruct.2019.111772 (2020).
Ghalehnovi, M., Karimipour, A., Anvari, A. & de Brito, J. Flexural strength enhancement of recycled aggregate concrete beams with steel fibre-reinforced concrete jacket. Eng. Struct. 240, 112325. https://doi.org/10.1016/j.engstruct.2021.112325 (2021).
Nguyen, T., Truong, T. T., Nguyen-Thoi, T., Van Hong Bui, L. & Nguyen, T.-H. Evaluation of residual flexural strength of corroded reinforced concrete beams using convolutional long short-term memory neural networks. Structures 46, 899–912. https://doi.org/10.1016/j.istruc.2022.10.103 (2022).
Murad, Y. et al. Flexural strength prediction for concrete beams reinforced with FRP bars using gene expression programming. Structures. 33, 3163–3172. https://doi.org/10.1016/j.istruc.2021.06.045 (2021).
Yuan, X. et al. Machine learning prediction models to evaluate the strength of recycled aggregate concrete. Mater. (Basel). 15 https://doi.org/10.3390/ma15082823 (2022).
Quan Tran, V., Quoc Dang, V. & Si Ho, L. Evaluating compressive strength of concrete made with recycled concrete aggregates using machine learning approach. Constr. Build. Mater. 323, 126578. https://doi.org/10.1016/j.conbuildmat.2022.126578 (2022).
Rezaiee-Pajand, M., Mohebi Najm Abad, J., Karimipour, A. & Rezaiee-Pajand, A. Propose new implement models to determine the compressive, tensile and flexural strengths of recycled coarse aggregate concrete via imperialist competitive algorithm. J. Building Eng. 40, 102337. https://doi.org/10.1016/j.jobe.2021.102337 (2021).
Al Martini, S., Sabouni, R., Khartabil, A., Wakjira, T. G. & Shahria Alam, M. Development and strength prediction of sustainable concrete having binary and ternary cementitious blends and incorporating recycled aggregates from demolished UAE buildings: experimental and machine learning-based studies. Constr. Build. Mater. 380, 131278. https://doi.org/10.1016/j.conbuildmat.2023.131278 (2023).
Wakjira, T. G., Ebead, U. & Alam, M. S. Machine learning-based shear capacity prediction and reliability analysis of shear-critical RC beams strengthened with inorganic composites. Case Stud. Constr. Mater. 16, e01008. https://doi.org/10.1016/j.cscm.2022.e01008 (2022).
Wakjira, T. G., Abushanab, A., Ebead, U. & Alnahhal, W. FAI: fast, accurate, and intelligent approach and prediction tool for flexural capacity of FRP-RC beams based on super-learner machine learning model. Mater. Today Commun. 33, 104461. https://doi.org/10.1016/j.mtcomm.2022.104461 (2022).
Wakjira, T. G., Al-Hamrani, A., Ebead, U. & Alnahhal, W. Shear capacity prediction of FRP-RC beams using single and ensenble ExPlainable machine learning models. Compos. Struct. 287, 115381. https://doi.org/10.1016/j.compstruct.2022.115381 (2022).
Gong, H., Sun, Y., Shu, X. & Huang, B. Use of random forests regression for predicting IRI of asphalt pavements. Constr. Build. Mater. 189, 890–897. https://doi.org/10.1016/j.conbuildmat.2018.09.017 (2018).
Borup, D., Christensen, B. J., Mühlbach, N. S. & Nielsen, M. S. Targeting predictors in random forest regression. Int. J. Forecast. 39, 841–868. https://doi.org/10.1016/j.ijforecast.2022.02.010 (2023).
Deng, S., Su, J., Zhu, Y., Yu, Y. & Xiao, C. Forecasting carbon price trends based on an interpretable light gradient boosting machine and bayesian optimization. Expert Syst. Appl. 242, 122502. https://doi.org/10.1016/j.eswa.2023.122502 (2024).
Guo, J. et al. Prediction of heating and cooling loads based on light gradient boosting machine algorithms. Build. Environ. 236, 110252. https://doi.org/10.1016/j.buildenv.2023.110252 (2023).
Nguyen, T., Ly, D. K., Huynh, T. Q. & Nguyen, T. T. Soft computing for determining base resistance of super-long piles in soft soil: a coupled SPBO-XGBoost approach. Comput. Geotech. 162, 105707. https://doi.org/10.1016/j.compgeo.2023.105707 (2023).
Nguyen, T., Ly, D.-K., Shiau, J. & Nguyen-Dinh, P. Optimizing load-displacement prediction for bored piles with the 3mSOS algorithm and neural networks. Ocean Eng. 304, 117758. https://doi.org/10.1016/j.oceaneng.2024.117758 (2024).
Van Tran, M., Ly, D.-K., Nguyen, T. & Tran, N. Robust prediction of workability properties for 3D printing with steel slag aggregate using bayesian regularization and evolution algorithm. Constr. Build. Mater. 431, 136470. https://doi.org/10.1016/j.conbuildmat.2024.136470 (2024).
Jin, Y. & Sendhoff, B. Pareto-based multiobjective machine learning: an overview and case studies. IEEE Trans. Syst. Man. Cybernetics Part. C (Applications Reviews). 38, 397–415 (2008).
Asilian Bidgoli, A. et al. Machine learning-based framework to cover optimal pareto-front in many-objective optimization. Complex. Intell. Syst. 8, 5287–5308 (2022).
ACI 440.2R-08. Guide for the design and construction of externally bonded FRP systems for strengthening concrete structures. American Concrete Institute Farmington Hills, MI, USA (2008).
Thien, V. D. D., Thanh, H. N. & Dinh, H. N. Experimental study on the effect of concrete strength and corrosion level on bond between steel bar and concrete. Tạp chí Khoa học Giao thông vận tải. 72, 498–509 (2021).
Leutbecher, T. & Rebling, J. Predicting the postcracking strength of ultra-high performance fiber reinforced concrete by means of three‐point bending tests according to EN 14651. Struct. Concrete. 20, 2081–2095. https://doi.org/10.1002/suco.201900070 (2019).
TCVN 3118:2022. Hardened concrete - Test method for compressive strength. Vietnam Institute for Building Science and Technology, Ministry of Construction, Vietnam (2022).
ASTM C293. Standard Test Method for Flexural Strength of Concrete (Using Simple Beam With Center-Point Loading). ASTM International, West Conshohocken, PA (2008).
Schapire, R. E. The boosting approach to machine learning: an overview. in Nonlinear Estimation and Classification. Lecture Notes in Statistics 171, 149–171. https://doi.org/10.1007/978-0-387-21579-2_9 (2003).
Zhang, J., Huang, Y., Wang, Y. & Ma, G. Multi-objective optimization of concrete mixture proportions using machine learning and metaheuristic algorithms. Constr. Build. Mater. 253, 119208. https://doi.org/10.1016/j.conbuildmat.2020.119208 (2020).
Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. in Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (2016).
Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. in Advances in Neural Information Processing Systems 30 (NIPS 2017) (eds Guyon, I. et al.) (Neural Information Processing Systems Foundation, Inc., 2017).
Das, P. & Kashem, A. Hybrid machine learning approach to prediction of the compressive and flexural strengths of UHPC and parametric analysis with shapley additive explanations. Case Stud. Constr. Mater. 20, e02723. https://doi.org/10.1016/j.cscm.2023.e02723 (2024).
Younis, A. & Dong, Z. Adaptive surrogate assisted multi-objective optimization approach for highly nonlinear and complex engineering design problems. Appl. Soft Comput. 150, 111065. https://doi.org/10.1016/j.asoc.2023.111065 (2024).
Ly, K. D., Nguyen-Thoi, T., Truong, T. T. & Nguyen, S. N. Multi-objective optimization of the active constrained layer damping for smart damping treatment in magneto-electro-elastic plate structures. Int. J. Mech. Mater. Des. 18, 633–663. https://doi.org/10.1007/s10999-022-09596-8 (2022).
Ly, D. K., Truong, T. T. & Nguyen-Thoi, T. Multi-objective optimization of laminated functionally graded carbon nanotube-reinforced composite plates using deep feedforward neural networks-NSGAII algorithm. Int. J. Comput. Methods. 19 https://doi.org/10.1142/s0219876221500651 (2021).
Bui-Ngoc, T., Nguyen, T., Nguyen-Quang, M.-T. & Shiau, J. Predicting load–displacement of driven PHC pipe piles using stacking ensemble with Pareto optimization. Engineering Structures, 316, 118574. https://doi.org/10.1016/j.engstruct.2024.118574 (2024).
Nguyen, T. H., Nguyen, T., Truong, T. T., Doan, D. T. V. & Tran, D. H. Corrosion effect on bond behavior between rebar and concrete using bayesian regularized feed-forward neural network. Structures. 51, 1525–1538. https://doi.org/10.1016/j.istruc.2023.03.128 (2023).
Nguyen-Minh, T., Bui-Ngoc, T., Shiau, J., Nguyen, T. & Nguyen-Thoi, T. Undrained sinkhole stability of circular cavity: a comprehensive approach based on isogeometric analysis coupled with machine learning. Acta Geotech. https://doi.org/10.1007/s11440-024-02266-3 (2024).
Hao, X. K., Feng, Q. & Zheng, J. J. Closed-form solution of the ductility of short FRP-Wrapped rectangular concrete prisms under eccentric loads. J. Compos. Constr. 25 https://doi.org/10.1061/(asce)cc.1943-5614.0001153 (2021).
Hao, X. K., Feng, Q. & Zheng, J. J. A passive stress-strain model for concrete prisms reinforced by a combination of confinement reinforcement. Eng. Struct. 246, 112981. https://doi.org/10.1016/j.engstruct.2021.112981 (2021).
Acknowledgements
The authors sincerely appreciate the support from Ho Chi Minh City University of Technology and Education in conducting the experimental phase. We also extend our heartfelt thanks to the reviewers for their valuable feedback and constructive comments, which have significantly enhanced our manuscript. The authors gratefully acknowledge the handling editor(s) for their effort in overseeing the rigorous peer-review process.
Contributions
Tan Nguyen: Conceptualization, Methodology, Investigation, Validation, Visualization, Writing- Reviewing and Editing, Writing- Original draft preparation; Thanh-Hung Nguyen: Resources, Methodology, Investigation, Validation, Writing- Reviewing and Editing; Hoang-Thach Vuong: Methodology, Data curation, Writing- Original draft preparation; Jim Shiau: Writing- Reviewing and Editing; Trung Nguyen Thoi: Methodology, Writing- Reviewing and Editing; Dinh-Hung Nguyen: Data curation.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Nguyen, TH., Vuong, HT., Shiau, J. et al. Optimizing flexural strength of RC beams with recycled aggregates and CFRP using machine learning models. Sci Rep 14, 28621 (2024). https://doi.org/10.1038/s41598-024-79287-1
DOI: https://doi.org/10.1038/s41598-024-79287-1