Abstract
Carbonation treatment of recycled concrete aggregates (RCA) enhances aggregate performance while enabling CO₂ sequestration, thereby supporting sustainable construction practices. Despite the growing application of machine learning (ML) in concrete research, predictive modeling of concrete incorporating carbonated RCA remains insufficiently explored. This study investigates regression based ML approaches for predicting the compressive strength of concrete containing carbonated recycled coarse aggregates. A total of 108 experimentally obtained data points were compiled and statistically validated prior to model development. The input variables included water to cement ratio, cement to coarse aggregate ratio, cement to fine aggregate ratio, water absorption and crushing value of natural and recycled coarse aggregates, parent concrete strength, degree of carbonation, and replacement ratio of natural coarse aggregates with recycled aggregates. Composite indices were introduced to simplify predictive formulations and to evaluate the influence of multicollinearity. Six ML algorithms, namely multilinear regression (MLR), ridge regression, polynomial MLR, decision tree, random forest, and LightGBM, were developed and optimized using 5-fold cross-validation. Model performance was evaluated using mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²), supported by residual and prediction analyses. The decision tree and LightGBM models achieved the highest predictive accuracy on the test dataset, yielding an RMSE of 1.59 MPa, MAE of 1.17 MPa, and R² of 0.991. The random forest model closely followed, with an RMSE of 1.64 MPa, MAE of 1.25 MPa, and R² of 0.991. Polynomial MLR demonstrated moderate performance, with an RMSE of 2.23 MPa, MAE of 1.89 MPa, and R² of 0.983.
In contrast, conventional MLR and ridge regression exhibited lower accuracy, with RMSE values of 3.19 and 3.08 MPa, MAE of 2.65 and 2.56 MPa, and R² values of 0.965 and 0.967, respectively. The tree based ensemble models provided the most accurate and stable predictions, maintaining a maximum error range of approximately ± 13%. Although the decision tree and LightGBM models achieved marginally lower prediction errors, the random forest model was identified as the most reliable overall due to its ability to capture physically meaningful relationships, as evidenced by SHAP analysis, together with its high predictive accuracy and robustness. Composite indexing had a minimal effect on model performance. Sensitivity and SHAP analyses identified the mix proportioning index as the most influential parameter, followed by degree of carbonation and aggregate performance. Overall, the results demonstrate that ML based models, particularly tree based ensembles, effectively capture the experimental behavior of carbonated RCA concrete and enable efficient compressive strength prediction.
Introduction
Recycling of concrete and other construction materials is increasingly recognized as a vital strategy in sustainable construction, aimed at reducing the environmental burden of the construction industry while conserving natural materials1,2. RCAs, obtained from crushed and processed waste concrete, offer significant environmental and economic benefits, including reduced landfill disposal, lower extraction of natural resources, and cost savings1,3,4,5. However, RCA typically exhibits inferior quality due to the residual cement paste adhered to the aggregate surface. This often results in reduced mechanical performance and durability compared with concrete made from natural aggregates, limiting the broader application of RCA in structural projects6,7,8,9.
Carbonation treatment of RCA has emerged as an effective technique to address the above mentioned limitations. In the carbonation treatment process, carbon dioxide (CO2) is introduced under controlled conditions to react with calcium hydroxide (Ca(OH)2) and calcium silicate hydrate (C-S-H) phases, forming calcium carbonate (CaCO3) that fills pores, densifies the interfacial transition zones, and heals micro cracks. This chemical transformation improves aggregate quality by enhancing density, reducing water absorption, and increasing strength. Moreover, carbonation treatment provides the dual benefit of improving concrete performance while contributing to carbon sequestration and storage10,11,12,13.
While the benefits of carbonated recycled aggregate concrete (ARAC) are well established, traditional laboratory testing of its mechanical properties is labor intensive, costly, and time consuming14. These constraints hinder rapid mix optimization and large scale adoption. Predictive modeling offers a practical alternative by enabling accurate property estimation based on mix proportions, material properties, and other parameters, thereby reducing the need for exhaustive experimental testing15,16,17,18,19. Classical regression techniques, such as MLR, ridge regression and polynomial MLR, have been widely applied in predicting concrete strengths due to their interpretability and ability to quantify linear and simple nonlinear relationships20,21,22. However, these methods perform poorly when dealing with complex, nonlinear interactions.
Recent advances in tree and ensemble based ML models offer powerful tools for overcoming these limitations. Models such as random forest, decision tree, and LightGBM can capture nonlinearities, handle high dimensional data, and automatically identify important input features23,24,25. Procedures such as k-fold cross validation enable systematic hyperparameter tuning, while Shapley Additive Explanations (SHAP) provide model interpretability26.
Despite the growing body of research on predictive modeling for conventional concrete and RAC, there is a notable gap. To the best of the authors’ knowledge, no study to date has developed predictive models considering parameters such as carbonation treatment and parent concrete strength effects on the characteristics of RCAs and ARACs. Additionally, the effects of multicollinearity and correlation between data sets on the predictive capacity of regression based models have not been explored. Furthermore, systematic comparisons between classical regression based ML models and newer ensemble based ML approaches for carbonated RCA are lacking.
The present study aims to fill these gaps by developing and evaluating predictive models for the compressive strength of carbonated RCA concrete. Composite indices for correlated data are developed, and the ML outputs obtained with the individual dataset and the aggregated datasets are compared. Both classical regression based ML methods (MLR, ridge regression and polynomial MLR) and advanced ML techniques (random forest, decision tree, and LightGBM) are employed. Model performance is compared in terms of predictive accuracy, robustness, interpretability, and computational efficiency. The research further seeks to identify the most influential parameters governing carbonated RAC performance.
Methodology
Source of data
The data used for strength predictions in this study were obtained from an experimental program reported in a previous work by the present authors27. In that study, 108 concrete samples were tested for compressive strength. The datasets were categorized into five groups: (i) mix design proportioning of the target concrete, (ii) physical and mechanical properties of the coarse aggregates, (iii) parent concrete strength effects on RCA carbonation degree, (iv) quantified CO₂ stored by carbonation (expressed as percent by mass of mortar attached to RCAs), and (v) percentage replacement of natural aggregates with RCAs. The eight individual data sets from the five categories were: water to cement ratio (W/C), cement to coarse aggregate ratio (C/Ca), cement to fine aggregate ratio (C/Fa), water absorption (Wa) of natural and recycled coarse aggregates in percent, crushing value (Cv) of natural and recycled coarse aggregates in percent, parent concrete strength (Pcs) of recycled aggregates, degree of carbonation (Dc) of recycled aggregates, and replacement ratio (Rr) of natural coarse aggregates by recycled coarse aggregates. The descriptive statistics of these data sets and of the output variable, compressive strength (CS), are shown in Table 1.
Data validity and assumption checks prior to regression based machine learning modeling
Prior to modeling, a series of statistical validity checks were conducted to ensure that the predictor variables were accurately represented in the models and that the fundamental assumptions of multiple regression and data processing were reasonably met28,29. The initial step involved examining the pairwise correlations among the predictor variables, as shown in Fig. 1. It was observed that C/Ca, W/C, and C/Fa were highly intercorrelated. In addition, a strong correlation was found between Wa, Cv, Rr, and Dc, suggesting a high degree of overlap in the information provided by these predictors.
The presence of such high correlations raised a substantial concern regarding multicollinearity, which was further confirmed through Variance Inflation Factor (VIF) analysis, as shown in Table 230. Extremely large VIF values (Table 2) were observed in the data sets, indicating multicollinearity that may lead to unstable coefficient estimates, inflated standard errors, and difficulty in determining the relative importance of individual predictors.
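As an illustrative sketch (the data below are synthetic, not the study's dataset), the VIF of each predictor can be computed directly from its definition, VIF_j = 1/(1 − R_j²), where R_j² comes from regressing predictor j on all the others:

```python
import numpy as np
import pandas as pd

def vif(X: pd.DataFrame) -> pd.Series:
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j
    on all remaining columns (plus an intercept)."""
    out = {}
    for col in X.columns:
        y = X[col].to_numpy()
        Z = np.column_stack([np.ones(len(X)), X.drop(columns=col).to_numpy()])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - resid.var() / y.var()
        out[col] = 1.0 / (1.0 - r2)
    return pd.Series(out)

# Synthetic predictors: C_Ca is built to be nearly collinear with W_C,
# while Rr is independent -- so only the first two should exceed VIF = 5.
rng = np.random.default_rng(42)
wc = rng.uniform(0.35, 0.55, 108)
df = pd.DataFrame({
    "W_C": wc,
    "C_Ca": 0.9 * wc + rng.normal(0, 0.01, 108),
    "Rr": rng.uniform(0, 100, 108),
})
vifs = vif(df)
print(vifs.round(1))
```

With a common threshold of 5 (or sometimes 10), the two collinear columns flag as problematic while the independent one does not, mirroring the pattern reported in Table 2.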
To address this issue, a combined index approach was implemented as a countermeasure30,31. This involved aggregating highly correlated variables into composite indices, thereby retaining the essential information they contribute while reducing the interdependence between predictors. In this study, two composite predictors were developed to address multicollinearity among highly correlated variables, while retaining the essential engineering significance of each constituent parameter. The composites were formulated from essential physical and mechanical characteristics of the aggregates (performance index) and from the strength governing proportions of the mix (proportioning index) in a compact and statistically sound form, using multiplicative, divisive, and exponential relationships informed by domain knowledge and statistical optimization31. The composite indices were created for the proportioning variables (W/C, C/Ca and C/Fa) and the aggregate performance indicators (Wa, Cv and Pcs). After aggregation, the new predictor set became: replacement ratio (x1), degree of carbonation (x2), proportioning index (x3) and performance index (x4).
To perform the aggregation, domain knowledge was used to derive the proportioning and performance indices (see Eqs. 1 to 4). Statistical optimization was then applied to ensure that the correlation between predictors was reduced to below 0.8 and the VIF to below 5 (see Fig. 2; Table 3). This step was necessary to minimize instability and overfitting caused by multicollinearity. Based on these criteria, each variable was assigned an exponent a1 to a9. With these exponents as free parameters, the indices were fed into a machine learning model, which was used to learn the optimized exponent values:
where x1 is the replacement ratio, x2 is the degree of carbonation, x3 is the proportioning index, and x4 is the performance index of the aggregates.
After optimization using ML the values of the exponents were obtained as follows:
a1 = 1.911, a2 = 0.103, a3 = 1.165, a4 = 0.801, a5 = 1.481, a6 = 1.053, a7 = 1.752, a8 = 0.986, a9 = 0.659. All correlation values and VIFs fell below their respective limits, as shown in Table 3; Fig. 2.
The proportioning index (x3) reflects the balance between binder content, coarse and fine aggregate mass, and water dosage in the recycled aggregate concrete mix. The product of the cement to coarse aggregate and cement to fine aggregate ratios represents the overall binder richness of the mix relative to the aggregate mass. Dividing by the water to cement ratio adjusts the index to account for water content, a critical determinant of workability and strength development. Physically, a higher proportioning index value corresponds to mixes with greater binder content, generally associated with higher strength and reduced porosity. The index therefore encapsulates the combined influence of mix composition parameters in a single value, reducing collinearity between the original ratios while retaining their collective predictive power.
The performance index (x4) integrates three critical aggregate characteristics into a single measure of aggregate quality. Parent concrete strength, in the numerator, reflects the intrinsic load bearing capacity of the original aggregate source. The denominator combines water absorption, representing pore structure, and aggregate crushing value, representing resistance to mechanical breakdown. The physical interpretation of this composite is straightforward: higher parent strength increases the index, while higher porosity or lower mechanical durability reduces it. Thus, the index expresses a net measure of aggregate performance. Statistically, combining these collinear parameters into one variable mitigates redundancy in regression modelling, ensuring that VIF values remain within acceptable limits while preserving the behavioral influence of each individual property. The exponents in each predictor were determined through statistical optimization such that the data satisfy all the statistical checks (see Table 3; Fig. 2).

To prevent the composite index results from becoming zero or infinite, small insignificant values were substituted for zeros in the natural aggregate concrete. For the replacement ratio, a value of 1% was used instead of 0%, while for the degree of carbonation and parent concrete effects, a value of 0.1% was assigned. These substitutions were necessary because several terms in the composite indices or individual results involve these variables in multiplication or division; using zero would lead to multiplication or division by zero and consequently produce zero or infinite index values. The selected small values are sufficiently low to preserve the physical meaning of “no replacement” or “no measurable effect,” while providing numerical stability in the calculations. They act only as mathematical placeholders and do not influence the actual material behavior or alter the interpretation of natural aggregate concrete.
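The index construction and the zero substitution rule can be sketched as follows. The exact functional forms and optimized exponents are those of Eqs. (1) to (4); the code below uses unit exponents and hypothetical input values purely to illustrate the structure:

```python
# Placeholder values replacing exact zeros (percent), per the text above.
EPS_RR, EPS_DC = 1.0, 0.1

def safe(value, eps):
    """Replace an exact zero with a small placeholder for numerical stability."""
    return value if value > 0 else eps

def proportioning_index(w_c, c_ca, c_fa, a=(1.0, 1.0, 1.0)):
    # Binder richness (C/Ca * C/Fa) relative to water demand (W/C).
    # Exponents `a` stand in for the study's optimized a_i values.
    return (c_ca ** a[0]) * (c_fa ** a[1]) / (w_c ** a[2])

def performance_index(pcs, wa, cv, a=(1.0, 1.0, 1.0)):
    # Parent concrete strength over water absorption and crushing value.
    return (pcs ** a[0]) / ((wa ** a[1]) * (cv ** a[2]))

# Hypothetical natural-aggregate mix: zeros become small placeholders.
x1 = safe(0.0, EPS_RR)    # 0% replacement  -> 1% placeholder
x2 = safe(0.0, EPS_DC)    # 0% carbonation  -> 0.1% placeholder
x3 = proportioning_index(w_c=0.45, c_ca=0.35, c_fa=0.55)
x4 = performance_index(pcs=40.0, wa=3.2, cv=22.0)
print(x1, x2, round(x3, 3), round(x4, 3))
```

The monotonic behavior matches the physical interpretation: raising parent strength raises x4, while raising the water to cement ratio lowers x3.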
Following the construction of composite indices for highly correlated predictors, two separate modeling pathways were pursued. The aim of this stage was to determine whether the presence of intercorrelation among predictors influences the predictive performance of regression based ML models. In the first pathway, models were trained using the original dataset containing the individual predictors. In the second pathway, models were trained using the composite dataset, in which correlated predictors had been aggregated into indices. Both modeling pathways were implemented under identical conditions: the same data preprocessing steps, including training and testing partitioning, and the same algorithm settings were applied to ensure a fair comparison. Model performance was assessed using standard regression evaluation metrics. By comparing the predictive outcomes of the two pathways, it can be determined whether intercorrelation among predictors has a significant effect on model accuracy and stability.
Figure 3 shows the frequency distribution of the aggregated predictors and the target compressive strength values. The compressive strength exhibits a bimodal distribution, corresponding to normal strength concrete and high strength concrete.
Principles of the classical statistical models and ensemble based algorithms
In numerical modeling, selecting the right regression approach depends not only on the nature of the data but also on the underlying assumptions about how features influence the target32. This section outlines a spectrum of regression methods, from classical statistical models to advanced machine learning architectures, focusing on the internal logic of each technique, its typical use cases, and how it adapts to different data structures. Three classical regression based algorithms and three tree and ensemble learning based algorithms were used in this study: MLR, ridge regression, polynomial MLR, random forest, decision tree, and LightGBM. An Abrams-law baseline was used to show the value added by ML models beyond classical relations. Abrams’ law states that the strength of concrete is inversely related to, and depends primarily on, the water to cement ratio: as the water to cement ratio increases, the compressive strength decreases33.
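A minimal sketch of how such an Abrams-law baseline can be fitted, assuming the classical form CS = A / B^(w/c) and synthetic data (the constants and noise level below are illustrative, not the study's):

```python
import numpy as np

def fit_abrams(wc, cs):
    # log CS = log A - (w/c) * log B  ->  straight-line fit in log space
    slope, intercept = np.polyfit(wc, np.log(cs), 1)
    return np.exp(intercept), np.exp(-slope)

# Synthetic strengths generated from an Abrams relation plus noise.
rng = np.random.default_rng(0)
wc = rng.uniform(0.35, 0.6, 50)
cs = 96.5 / 7.0 ** wc * rng.normal(1.0, 0.03, 50)

A, B = fit_abrams(wc, cs)
print(round(A, 1), round(B, 2))
```

Because the baseline uses only one input (w/c), it cannot reflect aggregate quality or carbonation effects, which is why it trails the ML models in the results section.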
Classical MLR models
A regression model provides a mathematical framework for describing how one variable of interest changes in relation to one or more influencing factors. If the model incorporates only a single predictor, it is termed a simple regression; when it includes two or more predictors, it becomes a multiple regression model. Such models are widely used to quantify the degree and nature of association between explanatory variables and the response, with the aim of revealing patterns and dependencies that might otherwise remain hidden32.
In a simple regression scenario, the model seeks a single straight line that best represents the relationship between the dependent and independent variables across all observations. Multiple regression extends this principle to higher dimensions: instead of fitting a line, it estimates a hyperplane that best represents the combined effect of several predictors. While the inclusion of multiple variables increases analytical complexity, it also allows for richer interpretations, including the possibility that the influence of one predictor may vary depending on the level of another. These dependencies may manifest as linear trends, curvature, or conditional effects34.
The general structure of MLR model can be expressed as:
where b0 is the intercept and bj are the partial regression coefficients reflecting the marginal contribution of each predictor. The coefficients are typically estimated using the least squares method, which minimizes the sum of squared differences between observed and predicted values34. Ridge regression is a type of regularized linear regression that helps prevent overfitting by adding a penalty on the size of the model coefficients.
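A toy sketch of this shrinkage effect on collinear predictors, using scikit-learn (synthetic data, not the study's configuration): ordinary least squares splits the shared signal arbitrarily between two nearly duplicate columns, while the ridge penalty forces a stable, near-equal split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)
x1 = rng.normal(size=80)
x2 = x1 + rng.normal(scale=0.01, size=80)   # nearly duplicate predictor
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.1, size=80)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)          # L2 penalty on coefficient size
print(ols.coef_.round(2), ridge.coef_.round(2))
```

The sum of the two coefficients (the total effect of the shared direction) stays near 3 in both fits; only its division between the collinear columns is stabilized by the penalty, which is exactly the instability issue VIF analysis flags.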
Polynomial regression builds on the MLR framework by introducing higher order and interaction terms, enabling the model to capture curved relationships without discarding the linear estimation approach. This enhancement increases the model’s flexibility in fitting data that exhibit nonlinear patterns. However, the inclusion of many polynomial terms also increases the likelihood of overfitting, particularly when the number of predictors or polynomial degree is high. Nevertheless, for datasets with relatively smooth curvature, polynomial MLR can yield highly accurate and interpretable fits. The polynomial MLR equation of degree two is expressed as:
where bjj denotes the coefficients of the squared terms of each predictor, and bij the coefficients of the interaction terms between predictors.
Tree and ensemble based learning algorithms
(a) Decision tree
The algorithm works by recursively splitting the dataset based on the feature values that provide the best discriminatory power, resulting in a tree like structure35. Each internal node represents a decision based on an input feature, and the leaves represent the specific classification outcome or value of the tested attribute. The tree is constructed using a greedy approach, selecting the best feature to split the data at each step36. Decision trees partition the feature space into regions using recursive binary splits. For an input x, the prediction is the average of the target values in the corresponding region:
where Rn are the regions (leaf nodes), Cn is the mean response in Rn, and 1(x ∈ Rn) is the indicator function. Splits are chosen to minimize impurity measures.
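A minimal sketch of this leaf averaging behavior on synthetic step shaped data (not the study's dataset): the tree finds the split near 0.5 and predicts each side by its leaf mean.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(7)
X = rng.uniform(0, 1, size=(200, 1))
# Step function target: ~20 MPa below x = 0.5, ~40 MPa above, plus noise.
y = np.where(X[:, 0] < 0.5, 20.0, 40.0) + rng.normal(0, 0.5, 200)

tree = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)
lo = tree.predict([[0.25]])[0]   # falls in the low-strength leaf
hi = tree.predict([[0.75]])[0]   # falls in the high-strength leaf
print(round(lo, 1), round(hi, 1))
```

Each prediction is the mean of the training targets in the matching region Rn, exactly as the indicator-function formulation above states.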
(b) Random forest
Random forest, an ensemble learning technique, builds multiple decision trees and aggregates their results to improve predictive performance and reduce overfitting35,37. Each tree in the forest is trained on a random subset of the data, and the individual predictions are combined, by majority voting for classification or averaging for regression, to produce the final output. The random forest model was used to compare its performance to the decision tree, with expectations of improved accuracy and reduced variance. This method is often preferred in situations where robustness and generalization are important35. The random forest prediction is the average over many decision trees built on bootstrapped data samples and random feature subsets38,39:
where yt(x) is the prediction of the tth tree and T is the total number of trees. This reduces variance and overfitting compared to a single decision tree39,40.
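This averaging can be verified directly in scikit-learn, where the forest's regression output equals the mean of the individual tree predictions (toy data, illustrative settings):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(300, 2))
y = np.sin(4 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.05, 300)

rf = RandomForestRegressor(n_estimators=100, random_state=42).fit(X, y)

# Query one point through every fitted tree and average by hand.
x0 = X[:1]
per_tree = np.array([t.predict(x0)[0] for t in rf.estimators_])
print(np.isclose(per_tree.mean(), rf.predict(x0)[0]))
```

Because each tree sees a different bootstrap sample, the individual predictions disagree; the average smooths that disagreement away, which is the variance-reduction mechanism described above.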
(c) LightGBM
LightGBM is a gradient boosting framework that builds decision trees sequentially, each one correcting the errors of the previous. It uses advanced optimizations such as leaf wise tree growth, gradient based one side sampling, and exclusive feature bundling, making it highly efficient for large scale tabular data41. LightGBM achieves strong performance on many tabular tasks, handling complex nonlinear relationships with minimal manual feature engineering. Interpretability is moderate, with tools like SHAP often needed to explain predictions. Overfitting is possible but well controlled with parameters such as tree depth, learning rate, and regularization. It is highly scalable and computationally efficient, though essentially a black box model. The model result is found by summing the predictions of all the ensemble trees (Eq. 9), where fn(x) indicates the prediction from the nth individual tree40,41.
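The sequential error correcting principle can be sketched by hand with plain regression trees; LightGBM layers its leaf wise growth, sampling, and bundling optimizations on top of this basic additive scheme (synthetic data and an illustrative learning rate, not the study's settings):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(5)
X = rng.uniform(0, 1, size=(300, 2))
y = 30 + 10 * np.sin(6 * X[:, 0]) + 5 * X[:, 1]

lr, trees = 0.3, []
pred = np.full_like(y, y.mean())           # f_0: constant baseline
for _ in range(100):
    t = DecisionTreeRegressor(max_depth=3, random_state=0)
    t.fit(X, y - pred)                     # each tree fits current residuals
    pred += lr * t.predict(X)              # additive update (cf. Eq. 9)
    trees.append(t)

rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
print(round(rmse, 3))
```

After enough rounds, the summed tree outputs drive the training error far below the spread of the raw targets, illustrating why boosted ensembles fit nonlinear strength data so closely.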
Cross validation strategy and hyperparameter tuning
Model evaluation employed fivefold cross validation to ensure robust performance assessment and minimize bias due to data partitioning. Model hyperparameters were tuned using grid search with cross validation. To ensure full reproducibility, all procedures used fixed random seeds (split seed = 42, kfold seed = 42, search seed = 42) and search spaces as shown in Table 4.
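A sketch of this reproducible tuning setup, with the stated seeds and an illustrative (not the study's) parameter grid and model:

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold, GridSearchCV
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the 108-sample dataset.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(108, 4))
y = 20 + 30 * X[:, 0] + 10 * X[:, 1] * X[:, 2] + rng.normal(0, 1, 108)

# Fixed seeds for the split, the folds, and the estimator.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
cv = KFold(n_splits=5, shuffle=True, random_state=42)
grid = {"n_estimators": [100, 200], "max_depth": [None, 5]}  # illustrative

search = GridSearchCV(RandomForestRegressor(random_state=42), grid,
                      cv=cv, scoring="neg_root_mean_squared_error")
search.fit(X_tr, y_tr)
print(search.best_params_, round(search.score(X_te, y_te), 2))
```

With every random_state pinned, rerunning the script reproduces the same folds, the same winning hyperparameters, and the same held-out score.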
Quantitative performance metrics
To provide a comprehensive evaluation of predictive accuracy and robustness, multiple statistical metrics were computed for both training and testing sets:
R2 was applied to assess explanatory power, RMSE to measure prediction error in absolute terms, and MAE to determine the average magnitude of error. The expected value of R2 should ideally be 1, indicating good prediction, while RMSE and MAE should ideally be 0, indicating minimal prediction error. In general, a predictive model is ideal if its performance indices are close to or exactly at these values42. The equations for calculating these indices are presented in Eqs. (10–12)42,43.
where yi is the observed value, ŷi is the predicted value, and ȳ is the mean of the observed values.
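The three metrics of Eqs. (10) to (12) can be written out directly; the small worked example below uses hypothetical observed and predicted strengths, not values from the study:

```python
import numpy as np

def r2(y, yhat):
    # 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

y = np.array([30.0, 45.0, 60.0, 75.0])      # hypothetical observed CS (MPa)
yhat = np.array([31.0, 44.0, 62.0, 74.0])   # hypothetical predictions
print(round(r2(y, yhat), 4), round(rmse(y, yhat), 3), mae(y, yhat))
```

Note that RMSE weights large errors more heavily than MAE (it squares before averaging), which is why the two are reported together throughout the results.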
Visual performance metrics
When evaluating models, visual checks are extremely useful because they make complex statistics easier to understand at a glance42. To complement the numerical evaluation of model performance, visual diagnostic techniques were applied: scatter plots, residual plots, histograms of residuals, and quantile-quantile (Q-Q) plots. Scatter plots, in which predicted values were plotted against the observed data, provided an overview of how closely the model output followed the actual trend. Ideally, points cluster around the 45° line of equality, and noticeable departures from this line highlight areas of reduced reliability. Residual plots offered further insight by presenting the distribution of prediction errors. Systematic patterns, such as consistent underestimation or overestimation in certain ranges or increasing error spread with larger values, were examined to identify potential weaknesses in the models.
Histograms of the residuals were used to evaluate the overall error distribution. A distribution centered near zero with a roughly symmetric shape indicates unbiased predictions, while skewed or shifted distributions suggest bias. In addition, Q-Q plots were generated to compare the residuals with a theoretical normal distribution for the test dataset. Alignment of points along the diagonal indicated that the normality assumption was reasonable, whereas systematic deviations reflected non normal error structures.
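A numeric sketch of the Q-Q comparison using SciPy's probplot on synthetic residuals (not the models' actual errors): the standardized residuals are sorted and paired with theoretical normal quantiles, and the correlation of the resulting points is near 1 when the errors are normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
residuals = rng.normal(0, 1.5, 200)          # stand-in model residuals

# Standardize, then pair ordered residuals with normal quantiles.
z = (residuals - residuals.mean()) / residuals.std()
(theoretical, ordered), (slope, intercept, r) = stats.probplot(z, dist="norm")
print(round(r, 3))   # correlation of the Q-Q points
```

Plotting `ordered` against `theoretical` reproduces the Q-Q plots described above; heavy tails or skew would pull the points off the diagonal and lower `r`.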
Feature importance analysis
To investigate the sensitivity of machine learning models and to interpret their behavior at both broad and granular levels, the SHAP framework, grounded in cooperative game theory principles, was applied44. This method was used to evaluate the relative contribution of each input variable to the model’s predictions. As a sophisticated tool within explainable artificial intelligence, SHAP provides a transparent view of how input features interact and influence the output. By quantifying feature impacts, it highlights which variables play the most significant roles in shaping predictions and clarifies the direction and magnitude of their effects on the model’s results42,45.
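The study's attributions come from SHAP; as a lighter, model agnostic stand-in that needs only scikit-learn, the sketch below ranks features by permutation importance, which answers the same "which inputs matter most" question (the data and feature roles here are invented for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(9)
n = 300
X = rng.uniform(0, 1, size=(n, 3))
# Feature 0 dominates, feature 1 matters weakly, feature 2 is irrelevant.
y = 40 * X[:, 0] + 8 * X[:, 1] + rng.normal(0, 0.5, n)

rf = RandomForestRegressor(n_estimators=200, random_state=42).fit(X, y)
imp = permutation_importance(rf, X, y, n_repeats=10, random_state=42)

# Rank features by the mean drop in score when each is shuffled.
order = np.argsort(imp.importances_mean)[::-1]
print(order)
```

Shuffling an influential column degrades the score sharply, while shuffling an irrelevant one barely moves it; SHAP refines this global ranking with signed, per-sample attributions.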
Pipeline and scientific libraries
The experimental pipeline was implemented in Python 3.10 within the Spyder IDE (version 6), using Scikit-learn, LightGBM, and supporting scientific libraries.
Results and discussion
Data correlation effects on model performance
The outputs from the different machine learning modeling approaches showed some differences between the predictions obtained from the individual datasets and the composite indices for the MLR based models, whereas the tree based models showed no tangible differences. Figure 4 shows the effects of data type on the residual compressive strength outputs of the different ML models. The residuals scatter along the same line for the tree based models under both datasets, but some deviations are seen for MLR, ridge regression and poly MLR, for which the individual data performed better.
The quantitative metric comparisons between models trained on individual predictors and those using composite indices are shown in Tables 5 and 6. As a baseline model, Abrams’ law yielded R², MAE, and RMSE values of 0.668, 5.774, and 9.849, respectively. These results are considerably less accurate than those of the ML models. With the composite predictors, MLR, ridge regression and poly MLR achieved fair accuracy, with test R² of 0.965, 0.967 and 0.983, respectively. With the individual predictors they achieved test R² values of 0.975, 0.975 and 0.991, respectively. The decision tree, random forest and LightGBM models all achieved very high predictive performance (test R² > 0.991) under both datasets, with minimal train-test differences. Overall, the results indicate that predictor intercorrelation is primarily a concern for classical linear models, where it may contribute to overfitting and unstable parameter estimates. For more advanced models, particularly ensemble tree methods, multicollinearity had no effect, as they inherently manage redundancy among predictors. This suggests that while composite indexing can improve the stability of simpler models such as MLR, modern ML algorithms can reliably predict concrete compressive strength even in the presence of highly correlated predictors. Nonetheless, the composite indices demonstrate substantial value by greatly simplifying the model variables. Equations (15) and (16) reveal that prediction equations from classical models, especially polynomial MLR models, are often long. Using composite indices, as shown in Eqs. (13) and (14), reduces this complexity: the indices condense multiple variables into single, coherent metrics, making the models far more manageable and actionable.
where DT is decision tree, PMLR is polynomial MLR, LGBM is LightGBM, and RF is random forest.
Where MLR CSAgg., is the compressive strength prediction equation derived from multilinear regression based machine learning and aggregated composite predictors.
Where, poly MLR CSAgg., is the compressive strength prediction equation derived from polynomial multilinear regression based machine learning and aggregated composite predictors.
Where MLR CSIndv., is the compressive strength prediction equation derived from multilinear regression based machine learning with individual predictor data sets.
Where, poly MLR CSIndv, is the compressive strength prediction equation derived from polynomial multilinear regression based machine learning with individual predictor data sets.
Model performance comparisons
The series of plots comparing actual versus predicted compressive strength values for the different models is shown in Fig. 5, revealing distinct performance patterns across the methods. MLR, ridge and polynomial MLR (Fig. 5a, b and f) show reasonably good agreement between predicted and actual values. The points are largely clustered around the ideal 45° line, indicating that these models capture the general trend of the data. The polynomial MLR improves alignment for some values, suggesting that introducing nonlinear terms helps capture some curvature in the relationship. However, deviations at some values indicate residual bias that these classical regression models may not fully capture46,47,48,49.
Decision tree, random forest and LightGBM (Fig. 5c, d and e) demonstrated improved prediction accuracy over the classical regression models. Decision tree models already provide a better fit than MLR for nonlinear patterns, while random forest further enhances predictive performance by averaging multiple trees, reducing overfitting, and capturing complex interactions. The clustering of points near the ideal line in Fig. 5c, d and e confirms their robustness across the test datasets. These models effectively manage the nonlinearity and heteroscedasticity in the data48,49,50.
The figural representations of the quantitative performance metrics (R2 and MAE) and the visual performance metrics (Q-Q plots and residual histograms) of the regression based ML models are shown in Figs. 6, 7 and 8, and all quantitative performance metric scores of these models are given in Table 5 for the aggregated datasets and Table 6 for the individual data sets. These results provide valuable insights into the predictive performance of the various ML models on the given dataset. The evaluation of the quantitative performance metrics R2, MAE and RMSE for the training and testing datasets, together with an analysis of visual performance metrics such as the distribution of the models’ residuals, revealed a clear hierarchy of model effectiveness, with nonlinear, tree based models outperforming traditional linear and polynomial regression methods. As shown in the performance metrics, the decision tree, random forest and LightGBM models demonstrated superior predictive accuracy and stability. These models achieved the best quantitative metric values, indicating that they were most effective at capturing the underlying patterns in the data and minimizing prediction error. This performance is consistent with the nature of these models, which are well suited to handling complex, nonlinear relationships and intricate feature interactions that linear models cannot. Many studies have compared different models for better performance based on their data types18,51,52.
Further evidence of the models’ performance can be found by examining the distribution of their residuals using Q-Q plots and residual histograms. The histograms visually represent the frequency of different residual values, while the Q-Q plots compare the quantiles of the residuals to the quantiles of a theoretical normal distribution. For a well fitting model, both plots should demonstrate a bell shaped curve and a close alignment with the diagonal line, respectively53.
The residual histograms for the decision tree, random forest and LightGBM models show a distribution close to a bell shaped curve, with residuals centered around zero, a strong indication that these models capture the systematic relationships in the data. The corresponding Q-Q plots confirm that their residuals are close to normally distributed.
Conversely, the MLR and ridge regression models yielded the weakest performance, with the lowest test R2 (0.965) and the highest MAE (2.65 MPa). This indicates that the relationship between the predictors and the target variable is likely nonlinear, and that a simple linear model cannot fit the data as accurately as the tree based models. The residual histogram for the MLR model is noticeably less centered and shows a wider spread, indicating weaker predictive capability compared to the other models. The improved performance of the polynomial MLR model over the standard MLR further supports the conclusion that nonlinear relationships are a key characteristic of this dataset. However, its residual histogram, while better than that of the standard MLR, is still not as well formed as those of the tree based models, suggesting that even with polynomial terms a linear based model struggles to fully capture the data’s structure.
Analysis of the average residuals for each model also showed that the polynomial MLR, decision tree, random forest and LightGBM models all performed strongly on the testing data, outperforming the standard linear MLR models.
The simple multilinear models exhibited average test residuals of no less than 2.56 MPa on the individual data and 2.65 MPa on the composite data. The scatter plots confirm this performance: predictions from the MLR models deviate by up to 6.3 MPa, falling within a ± 30% maximum error range of the experimental data. In contrast, the polynomial MLR model, which adds complexity, achieved lower testing average residuals of 1.21 MPa on the individual data and 1.88 MPa on the composite data, showing a superior ability to generalize to new, unseen data. This is further supported by its scatter plot, which shows markedly more accurate predictions with only a ± 15% maximum error range.
Similarly, the decision tree, random forest and LightGBM models performed well on the test set, with testing average residuals of 1.17 MPa, 1.25 MPa and 1.17 MPa, respectively, on both the individual and composite datasets. Their scatter plots reinforce this accuracy, showing a tight error range of ± 13% from the experimental data.
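The maximum error envelopes quoted above (± 13%, ± 15%, ± 30%) can be computed from prediction pairs as the largest relative deviation. A minimal sketch with hypothetical strength values, not the study's data:

```python
def max_percent_error(actual, predicted):
    """Largest relative deviation of predictions from experimental values."""
    return max(abs(p - a) / a * 100.0 for a, p in zip(actual, predicted))

# Hypothetical strengths (MPa): predictions sitting in a tight band around
# the experimental values, in the spirit of the tree based models' envelope.
actual = [25.0, 42.0, 55.0, 68.0]
predicted = [26.5, 40.9, 56.2, 66.0]
print(round(max_percent_error(actual, predicted), 1))  # → 6.0
```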
Sensitivity analysis
This section examines the SHAP summary plots of the individual predictors (C/Ca, C/Fa, W/C, Wa, Cv, Dc, Pcs and Rr), shown in Fig. 9, and the composite index predictors (x1, x2, x3 and x4), shown in Fig. 10, to evaluate the influence of the predictor variables on the ML models for concrete compressive strength54. SHAP values on the horizontal axis indicate each feature’s contribution to the model predictions, where positive values increase the output and negative values reduce it, depending on the predictor’s behavior. The color scale represents feature magnitude, allowing identification of whether high or low feature values drive positive or negative contributions14.
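The additive meaning of SHAP values can be made concrete with the one case where they have a simple closed form: for a linear model, the SHAP value of feature j is its coefficient times the feature's deviation from the background mean, and the values sum to the gap between the prediction and the average prediction. The coefficients and samples below are hypothetical, chosen only to demonstrate this additivity (the study's models use the SHAP library on tree ensembles, where the values are computed algorithmically).

```python
import statistics

def linear_shap(coefs, x, background):
    """Exact SHAP values for a linear model f(x) = b0 + sum(b_j * x_j):
    phi_j = b_j * (x_j - mean of feature j over the background data)."""
    feature_means = [statistics.fmean(col) for col in zip(*background)]
    return [b * (xj - m) for b, xj, m in zip(coefs, x, feature_means)]

# Hypothetical two-feature linear model: strength = 10 + 30*f1 - 5*f2
coefs, intercept = [30.0, -5.0], 10.0
background = [[0.8, 2.0], [1.0, 2.5], [1.2, 3.0]]  # "training" samples
x = [1.2, 2.0]                                     # sample being explained

phi = linear_shap(coefs, x, background)
prediction = intercept + sum(b * v for b, v in zip(coefs, x))
baseline = statistics.fmean(intercept + sum(b * v for b, v in zip(coefs, row))
                            for row in background)
# Additivity: SHAP values sum to (prediction - average prediction).
print(round(sum(phi), 3), round(prediction - baseline, 3))  # → 8.5 8.5
```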
The analysis shows that all predictors impact the model, though to varying degrees. The most influential variables are the cement and water related ratios (W/C, C/Ca and C/Fa), forming a composite index x3. High C/Ca and C/Fa ratios combined with low W/C strongly increase predicted compressive strength, consistent with standard mix design principles.
The degree of carbonation ranks second in importance. Its effect is complex and nonlinear, with both high and low values producing substantial positive and negative contributions. This behavior arises from the incomplete carbonation of aggregates sourced from high strength parent concrete. Aggregates from low strength parent concrete contained a higher percentage of carbonation by-products than those from high strength parent concrete; nevertheless, concrete produced from the high strength parent aggregates achieved higher compressive strength despite the smaller amount of carbonation by-products. This contrast is what introduces the nonlinear behavior observed in the SHAP plot.
Composite index x4, encompassing Wa, Cv and Pcs, shows moderate influence. Individually these factors were impactful; collectively, however, they exert less effect than the mix proportioning index variables and carbonation. The replacement ratio of natural with recycled aggregates (Rr, composite index x1) has the least impact, with SHAP values clustered near zero. In summary, the proportioning index elements were the primary drivers of the model predictions, followed by carbonation, while recycled aggregate characteristics played a secondary role. This insight can guide optimized concrete mix design for enhanced performance. A study by Singh et al.55 on predicting the flexural strength of recycled aggregate based concrete, using an integrated approach combining machine learning models and global sensitivity analysis, demonstrated that ML techniques can effectively estimate key mechanical properties. Their findings indicate that ML models can also be used to predict the compressive strength of RAC, and that the sensitivity of each predictor can be evaluated in a manner similar to the analysis presented in this study.
Additionally, SHAP dependence plots were used to examine the marginal effects of the key input variables on the random forest predictions of compressive strength. Figure 11 shows the dependence plots between variables x1, x2 and x3. The variable x1 showed SHAP values clustered near zero across its entire range, indicating that it contributed minimally to the model’s predictions. In contrast, x2 exhibited both positive and negative SHAP values (approximately − 10 to + 9 MPa), demonstrating a meaningful and nonlinear influence on compressive strength. Higher values of x2 generally corresponded to positive SHAP contributions, while lower values were associated with reduced predicted strength. Among the evaluated predictors, x3 displayed the strongest and most consistent effect, with SHAP values ranging from approximately − 20 MPa at low x3 levels to + 20 MPa at higher levels. This indicates that increasing x3 substantially increases the predicted compressive strength.
Table 7 shows an example of converting the individual predictors to composite indices and then to predictions. The optimized indices predicted the outputs well.
The ML models developed in this study were trained and validated using datasets that fall within the parameter ranges summarized in Table 8. These ranges define the domain in which the models can make predictions. Inputs that fall outside these intervals may lead to extrapolation, which may reduce prediction accuracy.
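An applicability-domain check of the kind described above is straightforward to implement: flag any input feature whose value falls outside the training interval. The ranges below are hypothetical placeholders for illustration only; the actual training ranges are those reported in Table 8.

```python
# Hypothetical training ranges (placeholders -- the real ones are in Table 8).
TRAINING_RANGES = {
    "W/C": (0.40, 0.55),   # water to cement ratio
    "Dc": (0.0, 100.0),    # degree of carbonation, %
    "Rr": (0.0, 100.0),    # replacement ratio, %
}

def out_of_domain(sample):
    """Return the features whose values fall outside the training ranges,
    signalling that a prediction would require extrapolation."""
    return [name for name, value in sample.items()
            if name in TRAINING_RANGES
            and not (TRAINING_RANGES[name][0] <= value <= TRAINING_RANGES[name][1])]

print(out_of_domain({"W/C": 0.48, "Dc": 60.0, "Rr": 50.0}))   # → []
print(out_of_domain({"W/C": 0.70, "Dc": 60.0, "Rr": 120.0}))  # → ['W/C', 'Rr']
```

Running such a check before inference makes the extrapolation caveat operational rather than advisory.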
Table 9 shows the global SHAP values of each model. SHAP values quantify the contribution of each feature to the model’s prediction and are expressed in the same units as the model output. In this study the model predicts compressive strength (MPa), so a SHAP value represents how many MPa a feature increases or decreases the predicted strength. For example, a mean absolute SHAP value of approximately 17 for feature x3 (see Table 9) indicates that this variable changes the prediction by about 17 MPa on average, whereas a value of around 3 for x2 reflects a smaller influence of roughly 3 MPa. Importantly, polynomial regression models can exhibit very large SHAP magnitudes because of their squared and interaction terms, which increase the internal numerical scale of the model. These inflated SHAP values do not imply disproportionately higher importance; they reflect the model’s expanded feature space and its sensitivity to unscaled polynomial terms. Therefore, comparisons of absolute SHAP magnitudes should be made within the same model, while cross model comparisons should rely primarily on relative rankings rather than numerical size.
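The global importance figures of the kind reported in Table 9 are the mean absolute SHAP value per feature across all samples. A minimal sketch, using a hypothetical per-sample SHAP matrix rather than the study's actual values:

```python
import statistics

def global_shap_importance(shap_matrix, feature_names):
    """Mean absolute SHAP value per feature (here in MPa), ranked in
    descending order -- the usual global-importance summary."""
    scores = {name: statistics.fmean(abs(row[j]) for row in shap_matrix)
              for j, name in enumerate(feature_names)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical per-sample SHAP values (rows = samples, columns = x1, x2, x3),
# loosely echoing the ~17 MPa vs ~3 MPa contrast discussed above.
shap_matrix = [
    [0.2, -3.0, 17.0],
    [-0.1, 2.5, -16.0],
    [0.3, -3.5, 18.0],
]
print(global_shap_importance(shap_matrix, ["x1", "x2", "x3"]))  # x3 ranks first
```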
Conclusion and recommendation
This study demonstrated that carbonation treatment of RCAs can be effectively integrated with ML to support sustainable construction practices. Unlike conventional testing methods, which are expensive and time consuming, regression based ML models provide an efficient and accurate alternative. For the first time, predictive modeling was applied specifically to concretes containing carbonated RCAs, filling a gap in existing research. Using 108 experimental data points, later composited into indices representing mix proportioning, aggregate performance, carbonation degree and replacement percentage, six ML algorithms were evaluated for their predictive ability. Based on the outputs of the study, the authors conclude and recommend the following:
- ML models predicted the compressive strength of carbonated RAC with good accuracy.
- Intercorrelation and multicollinearity within the datasets had no noticeable effect on the performance of the ML models, particularly the tree based and ensemble based models.
- When classical regression based and tree based models were compared on the same dataset, the tree based models performed better.
- Numerical prediction equations developed using polynomial MLR performed well, ranking just behind the ensemble based models, and may be applied in practical compressive strength predictions.
- Although the decision tree and LightGBM models achieved marginally lower prediction errors, the random forest model was identified as the most reliable overall because of its ability to capture physically meaningful relationships, as evidenced by SHAP analysis, together with its high predictive accuracy and robustness.
- Variations in fine aggregate type and properties, variations in cement type, addition of superplasticizers and coarse aggregate size effects were not considered in this study. Future studies could therefore address compressive strength prediction of carbonated RCA concrete considering these properties, together with deep learning models and sufficient datasets.
- Modeling of the tensile strength, modulus of elasticity and other durability properties of carbonated RCA concrete is another area for future research.
- The datasets used in this study are limited and cover extreme strength ranges: low to normal strength (17 to 29 MPa) and normal to high strength (40 to 72 MPa). This limitation may have contributed to the relatively high R2 values produced by the models, which should therefore be interpreted with caution. Although the dataset includes 18 mix combinations, the limited variation in water to cement ratios, with only two values across all mixtures, likely reduced the overall variability, simplifying the prediction task and allowing the models to achieve high goodness of fit scores. Even when additional parameters such as carbonation degree, replacement ratio and parent concrete strength are considered, the restricted range of water to cement ratios may have constrained the dataset’s diversity, reflecting the narrow experimental conditions. Including additional strength values both within and beyond the current ranges, as well as a wider range of water to cement ratios, could improve model accuracy, reduce overfitting and enhance robustness by capturing a broader spectrum of concrete mix behavior.
Data availability
The data used and analyzed in this study are available from the corresponding author upon reasonable request.
References
Li, S., Wang, B., Gao, Y. & Yu, Y. A compressive strength prediction model for precast recycled aggregate concrete based on machine learning. AIP Adv. 15, 075133. https://doi.org/10.1063/5.0280234 (2025).
Sinkhonde, D., Bezabih, T., Mirindi, D., Mashava, D. & Mirindi, F. Ensemble machine learning algorithms for efficient prediction of compressive strength of concrete containing tyre rubber and brick powder. Clean. Waste Syst. 10 (2025), 100236. https://doi.org/10.1016/j.clwas.2025.100236 (2025).
Lin, L., Xu, J., Ying, W., Yu, Y. & Zhou, L. Post fire compressive mechanical behaviors of concrete incorporating coarse and fine recycled aggregates. Constr. Build. Mater. 461, 139948. https://doi.org/10.1016/j.conbuildmat.2025.139948 (2025).
Yu, Y., Xu, J., Su, J., Xu, L. & Luo, Y. Investigating specimen size and shape effects on compressive mechanical behaviors of recycled aggregate concrete using discrete element mesoscale Modeling, Constr. Build. Mater. 438, 137196. https://doi.org/10.1016/j.conbuildmat.2024.137196 (2024).
Wang, D. et al. Mechanical performance of recycled aggregate concrete in green civil engineering: review. Case Stud. Constr. Mater. 19, e02384. https://doi.org/10.1016/j.cscm.2023.e02384 (2023).
Arora, S., Singh, B. & Bhardwaj, B. Strength performance of recycled aggregate concretes containing mineral admixtures and their performance prediction through various modeling techniques. J. Build. Eng. 24, 100741. https://doi.org/10.1016/j.jobe.2019.100741 (2019).
Deng, F. et al. Compressive strength prediction of recycled concrete based on deep learning. Constr. Build. Mater. 175, 562–569. https://doi.org/10.1016/j.conbuildmat.2018.04.169 (2018).
Duan, J., Asteris, P. G., Nguyen, H., Bui, X. N. & Moayedi, H. A novel artificial intelligence technique to predict compressive strength of recycled aggregate concrete using ICA-XGBoost model. Eng. Comput. 37, 3329–3346. https://doi.org/10.1007/s00366-020-01003-0 (2021).
Zhang, J., Huang, Y., Aslani, F., Ma, G. & Nener, B. A hybrid intelligent system for designing optimal proportions of recycled aggregate concrete. J. Clean. Prod. 273, 122922. https://doi.org/10.1016/j.jclepro.2020.122922 (2020).
Peiris, D. et al. Impact of treatment methods on recycled concrete aggregate performance: a comprehensive review. Environ. Sci. Pollut Res. 32, 14405–14438. https://doi.org/10.1007/s11356-025-36497-y (2025).
Fortunato, L. R., Parsekian, G. A. & Junior, A. N. Developing an easy to build laboratory chamber for CO2 experiments. Matéria (Rio J) https://doi.org/10.1590/1517-7076-RMAT-2023-0078 (2023).
Bui, H., Delattre, F. & Levacher, D. Experimental methods to evaluate the carbonation degree in Concrete-State of the Art review. Appl. Sci. 13 (4), 2533. https://doi.org/10.3390/app13042533 (2023).
Liu, K. et al. Carbonation of recycled aggregate and its effect on properties of recycled aggregate concrete: A review. Mater. Express. 11 (9), 1439–1452. https://doi.org/10.1166/mex.2021.2045 (2021).
Kashem, A. et al. Hybrid data-driven approaches to predicting the compressive strength of ultra-high-performance concrete using SHAP and PDP analyses, case stud. Constr. Mater. 20, e02991. https://doi.org/10.1016/j.cscm.2024.e02991 (2024).
Alyami, M. et al. Estimating compressive strength of concrete containing rice husk Ash using interpretable machine learning-based models. Case Stud. Constr. Mater. 20, e02901. https://doi.org/10.1016/j.cscm.2024.e02901 (2024).
Philip, S. & Nidhi, M. Performance comparison of artificial neural network and random forest models for predicting the compressive strength of fibre – reinforced GGBS – based geopolymer concrete composites. Mater. Circ. Econ. 6 (34), 1–18. https://doi.org/10.1007/s42824-024-00128-7 (2024).
Tipu, R. K., Arora, R. & Kumar, K. Machine learning-based prediction of concrete strength properties with coconut shell as partial aggregate replacement: A sustainable approach in construction engineering. Asian J. Civ. Eng. 25 (3), 2979–2992. https://doi.org/10.1007/s42107-023-00957-y (2024).
Tipu, R. K., Batra, V., Suman, Panchal, V. R., Pandya, K. S. & Patel, G. A. Optimizing compressive strength in sustainable concrete: a machine learning approach with iron waste integration. Asian J. Civ. Eng. 25, 4487–4512. https://doi.org/10.1007/s42107-024-01061-5 (2024b).
Lamba, P. et al. Repurposing plastic waste: experimental study and predictive analysis using machine learning in bricks. J. Mol. Struct. 1317, 139158. https://doi.org/10.1016/j.molstruc.2024.139158 (2024).
Rusu, A., Bărbuță, M. & Sabina, S. Multiple linear regression model to predict compressive strength of concrete with silica fume and metallic fibers. In: Moldovan, L., Gligor, A. (eds) The 17th International Conference Interdisciplinarity in Engineering. Inter-ENG 2023. Lecture Notes in Networks and Systems, 926. Springer, Cham. (2024). https://doi.org/10.1007/978-3-031-54664-8_23
Chore, H. & Shelke, N. L. Prediction of compressive strength of concrete using multiple regression model. Struct. Eng. Mech. https://doi.org/10.12989/sem.2013.45.6.837 (2013).
Aamir, H., Aamir, K. & Javed, M. F. Linear and Non-Linear regression analysis on the prediction of compressive strength of sodium hydroxide Pre-Treated crumb rubber concrete. Eng. Proc. 44 (1), 5. https://doi.org/10.3390/engproc2023044005 (2023).
Yuan, Y., Yang, M., Shang, X., Xiong, Y. & Zhang, Y. Predicting the compressive strength of UHPC with coarse aggregates in the context of machine learning. Case Stud. Constr. Mater. 19, e02627. https://doi.org/10.1016/J.CSCM.2023.E02627 (2023).
Abdellatief, M., Murali, G. & Dixit, S. Leveraging machine learning to evaluate the effect of Raw materials on the compressive strength of ultra-high-performance concrete. Results Eng. 25, 104542. https://doi.org/10.1016/J.RINENG.2025.104542 (2025).
Abuodeh, O. R., Abdalla, J. A. & Hawileh, R. A. Assessment of compressive strength of Ultra-high performance concrete using deep machine learning techniques. Appl. Comput. 95, 106552. https://doi.org/10.1016/J.ASOC.2020.106552 (2020).
Salman, K. M. et al. Explainable AutoML models for predicting the strength of high-performance concrete using Optuna, SHAP and ensemble learning. Front. Mater. https://doi.org/10.3389/fmats.2025.1542655 (2025).
Gebremariam, H. G., Taye, S. & Tarekegn, A. G. CO₂ Uptake in recycled concrete aggregates: influence of parent concrete strength and preprocessing carbonation. unpublished manuscript. (2025).
Alahakoon, Y. et al. Prediction of alkali–silica reaction expansion of concrete using explainable machine learning methods. Discov Appl. Sci. 7, 407. https://doi.org/10.1007/s42452-025-06880-y (2025).
Meddage, D. P. P., Mohotti, D., Wijesooriya, K., Lee, C. K. & Kwok, K. C. S. Interpolating wind pressure time-histories around a tall building - a deep learning-based approach. J. Wind Eng. Ind. Aerodyn. 256, 105968. https://doi.org/10.1016/j.jweia.2024.105968 (2025).
O’brien, R. M. A caution regarding rules of thumb for variance inflation factors. Qual. Quant. 41, 673–690. https://doi.org/10.1007/s11135-006-9018-6 (2007).
Singh, S. G. & Kumar, S. V. Dealing with multicollinearity problem in analysis of side friction characteristics under urban heterogeneous traffic conditions. Arab. J. Sci. Eng. 46, 10739–10755. https://doi.org/10.1007/s13369-020-05213-y (2021).
Loureiro, A. A. B. & Stefani, R. Comparing the performance of machine learning models for predicting the compressive strength of concrete. Discov Civ. Eng. 1, 19. https://doi.org/10.1007/s44290-024-00022-w (2024).
Mondal, A. & Bhanja, S. Augmentation of Abrams law for fly Ash concrete. Mater. Today Proc. 65 (2), 644–650. https://doi.org/10.1016/j.matpr.2022.03.204 (2022).
Sah, A. K. & Hong, Y. M. Performance comparison of machine learning models for concrete compressive strength prediction. Mater. (Basel). 7 (9), 2075. https://doi.org/10.3390/ma17092075 (2024).
Gayathri, R., Rani, S. U., Čepová, L., Rajesh, M. & Kalita, K. A comparative analysis of machine learning models in prediction of mortar compressive strength. Processes 10 (7), 1387. https://doi.org/10.3390/pr10071387 (2022).
Biau, G. & Scornet, E. A random forest guided tour. TEST 25, 197–227. https://doi.org/10.1007/s11749-016-0481-7 (2016).
Padula, W. V. Introduction to decision tree modeling, in Bishai, D., Brenzel L., Padula, W. (eds), Handbook of Applied Health Economics in Vaccines, Handbooks in Health Economic Evaluation (Oxford Academic)., accessed 16 Aug. 2025. (2023). https://doi.org/10.1093/oso/9780192896087.003.0020
James, G., Witten, D., Hastie, T. & Tibshirani, R. An Introduction To Statistical Learning with Applications in R. Part of the Book Series: Springer Texts in Statistics (Springer, 2021). https://doi.org/10.1007/978-1-0716-1418-1
Breiman, L. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324 (2001).
Elshaarawy, M. K., Zeleňáková, M. & Armanuos, A. M. Hydraulic performance modeling of inclined double cutoff walls beneath hydraulic structures using optimized ensemble machine learning. Sci. Rep. 15, 27592. https://doi.org/10.1038/s41598-025-10990-3 (2025).
Hajihosseinlou, M., Maghsoudi, A. & Ghezelbash, R. A novel scheme for mapping of MVT-Type Pb-Zn prospectivity: LightGBM, a highly efficient gradient boosting decision tree machine learning algorithm. Nat. Resour. Res. 32, 2417–2438. https://doi.org/10.1007/s11053-023-10249-6 (2023).
Elshaarawy, M. K., Alsaadawi, M. M. & Hamed, A. K. Machine learning and interactive GUI for concrete compressive strength prediction. Sci. Rep. 14(1), 16694 (2024).
Harith, I. K. et al. Prediction of high-performance concrete strength using machine learning with hierarchical regression. Multiscale Multidiscip Model. Exp. Des. 7, 4911–4922. https://doi.org/10.1007/s41939-024-00467-7 (2024).
Das, P. & Kashem, A. Hybrid machine learning approach to prediction of the compressive and flexural strengths of UHPC and parametric analysis with Shapley additive explanations. Case Stud. Constr. Mater. 20, e02723. https://doi.org/10.1016/j.cscm.2023.e02723 (2024).
Elhishi, S., Elashry, A. M. & El-Metwally, S. Unboxing machine learning models for concrete strength prediction using XAI. Sci. Rep. 13, 19892. https://doi.org/10.1038/s41598-023-47169-7 (2023).
Arunvivek, G. K., Anandaraj, S., Kumar, P., Pratap, B. & Sembeta, R. Y. Compressive strength modelling of cenosphere and copper slag-based geopolymer concrete using deep learning model. Sci. Rep. 15, 27849 (2025).
Albostami, A. S., Al-Hamd, R. K. S. & Al-Matwari, A. A. Data-Driven predictive modeling of steel slag concrete strength for sustainable construction. Buildings 14 (8), 2476. https://doi.org/10.3390/buildings14082476 (2024).
Pratap, B., Kumar, P., Shubham, K. & Chaudhary, N. Soft computing-based investigation of mechanical properties of concrete using ready-mix concrete waste water as partial replacement of mixing portable water. Asian J. Civil Eng. 25 (2), 1255–1266. https://doi.org/10.1007/s42107-023-00841-9 (2024).
Kumar, P., Pratap, B., Sharma, S. & Kumar, I. Compressive strength prediction of fly Ash and blast furnace slag-based geopolymer concrete using convolutional neural network. Asian J. Civil Eng. 25 (2), 1561–1569. https://doi.org/10.1007/s42107-023-00861-5 (2024).
Gogineni, A., Panday, I. K., Kumar, P. & Paswan, R. K. Predicting compressive strength of concrete with fly Ash and admixture using xgboost: a comparative study of machine learning algorithms. Asian J. Civil Eng. 25 (1), 685–698. https://doi.org/10.1007/s42107-023-00804-0 (2024).
Anwar, M. K., Qurashi, M. A., Zhu, X., Shah, R., Siddiq, S. A. & M.U A comparative performance analysis of machine learning models for compressive strength prediction in fly ash-based geopolymers concrete using reference data, case stud. Constr. Mater. 22, e04207. https://doi.org/10.1016/j.cscm.2025.e04207 (2025).
Dhengare, S., Waghe, U., Yenurkar, G. & Shyamala, A. A comprehensive model for concrete strength prediction using advanced learning techniques. Discov Appl. Sci. 7, 551. https://doi.org/10.1007/s42452-025-07095-x (2025).
Cho, S. J., De Boeck, P., Naveiras, M. & Ervin, H. Level-specific residuals and diagnostic measures, plots, and tests for random effects selection in multilevel and mixed models. Behav. Res. 54, 2178–2220. https://doi.org/10.3758/s13428-021-01709-z (2022).
Haque, M. A., Chen, B., Kashem, A., Qureshi, T. & Ahmed, A. A. M. hybrid intelligence models for compressive strength prediction of MPC composites and parametric analysis with SHAP algorithm. Mater. Today Commun. 35, 105547 (2023).
Singh, R., Tipu, R. K., Mir, A. A. & Patel, M. Predictive modelling of flexural strength in recycled Aggregate-Based concrete: A comprehensive approach with machine learning and global sensitivity analysis. Iran. J. Sci. Technol. Trans. Civ. Eng. 49, 1089–1114. https://doi.org/10.1007/s40996-024-01502-w (2025).
Acknowledgements
The College of Technology and Built Environment (CTBE), Addis Ababa University (AAU), is gratefully acknowledged for infrastructural support during the study.
Author information
Authors and Affiliations
Contributions
Gebremariam H.G.: conceptualization, data analysis, writing of the initial draft; Taye S.: methodology development, visualization and supervision; Tarekegn A.G.: conceptualization, revision and supervision. All authors contributed to the discussion and reviewed and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Gebremariam, H.G., Taye, S. & Tarekegn, A.G. Compressive strength prediction of carbonated recycled aggregate concrete using regression based machine learning models. Sci Rep 16, 5825 (2026). https://doi.org/10.1038/s41598-026-36197-8