Abstract
Accurate prediction of soil stress–strain behavior remains a major challenge in geotechnical engineering because soil datasets are inherently heterogeneous, nonlinear, and sparse. Conventional laboratory and in-situ testing methods are often expensive, time-consuming, and sensitive to sampling disturbance, which limits their efficiency in large-scale engineering applications. To address these challenges, this study proposes an optimized stacking ensemble framework that integrates tree-based learning algorithms with metaheuristic optimization. The base learners, Light Gradient Boosting (LGB), Extreme Gradient Boosting (XGB), Random Forest (RFR), and Histogram-based Gradient Boosting (HGB), were chosen for their complementary strengths in capturing nonlinear interactions, handling high-dimensional inputs, and remaining robust under sparse and heterogeneous data conditions. These models are optimized using the Puma Optimization (PO) algorithm and combined through a stacking strategy to enhance predictive stability and generalization performance. A dataset comprising 1,410 samples was compiled from the literature, and K-fold cross-validation was employed to evaluate model robustness. The proposed stacking model, particularly the optimized XGBPO ensemble, achieved a coefficient of determination (\({R}^{2}\)) of 0.9914 in the testing phase, outperforming the individual and hybrid models. Interpretability and sensitivity analyses further identified dry density (\({\gamma }_{d}\)), void ratio (\({e}_{0}\)), and degree of saturation (\({S}_{r}\)) as the most influential factors governing soil compressibility behavior. The proposed framework provides a scalable, reliable, and computationally efficient alternative to traditional geotechnical testing methods, offering improved predictive accuracy and practical applicability for infrastructure design and decision-making under complex soil conditions.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
- \({E}_{s}\): Soil compression modulus
- XGB: Extreme gradient boosting
- HGB: Histogram gradient boosting
- ML: Machine learning
- \({D}_{U}\): Depth of the upper soil
- \({D}_{D}\): Depth of the lower soil
- \(\omega\): Water content
- \({\gamma }_{d}\): Dry density
- \({e}_{0}\): Void ratio
- \({S}_{r}\): Degree of saturation
- IQR: Interquartile range
- FAST: Fourier amplitude sensitivity test
- S1: Individual sensitivity
- ST: Total sensitivity
- Max: Maximum
- Min: Minimum
- GP: Genetic programming
- LSTM: Long short-term memory
- MCMC: Markov chain-based Monte Carlo
- GBRT: Gradient-boosted regression tree
- PO: Puma optimization algorithm
- LGB: Light gradient boosting
- RFR: Random forest
- \({R}^{2}\): Coefficient of determination
- RMSE: Root mean square error
- NRMSE: Normalized root mean square error
- MSLE: Mean squared logarithmic error
- RMSLE: Root mean squared logarithmic error
- MASE: Mean absolute scaled error
- U95: Under-95th percentile
- U: Theil’s inequality coefficient
- SI: Scatter index
- WI: Willmott’s index
- ICE: Individual conditional expectation
- PDP: Partial dependence plot
- RM: Resilient modulus
- ANN: Artificial neural networks
- SVM: Support vector machines
- EPM: Evolutionary polynomial regression
- SGSPG: Spatial–geological stratigraphic graph
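Several of the error metrics abbreviated above can be computed in a few lines. The sketch below uses common textbook definitions, which may differ from the exact variants used in the study: NRMSE is normalized by the observed range (an assumption; normalization by the mean is also common), U is Theil's U1 form, and the function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return common goodness-of-fit metrics for two equal-length sequences.

    Assumes observed values are not all identical and have a nonzero mean;
    the logarithmic metrics also require every value to be greater than -1.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    pairs = list(zip(observed, predicted))
    mean_obs = sum(observed) / n

    sq_err = sum((o - p) ** 2 for o, p in pairs)          # sum of squared errors
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    rmse = math.sqrt(sq_err / n)

    # Mean squared logarithmic error, using log(1 + x).
    msle = sum((math.log1p(o) - math.log1p(p)) ** 2 for o, p in pairs) / n

    # Denominator of Willmott's index of agreement.
    wi_den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in pairs)

    # Denominator of Theil's U1: sum of the root mean squares of both series.
    theil_den = math.sqrt(sum(p * p for p in predicted) / n) + math.sqrt(
        sum(o * o for o in observed) / n
    )

    return {
        "R2": 1.0 - sq_err / ss_tot,
        "RMSE": rmse,
        "NRMSE": rmse / (max(observed) - min(observed)),  # range-normalized (assumption)
        "MSLE": msle,
        "RMSLE": math.sqrt(msle),
        "SI": rmse / mean_obs,
        "WI": 1.0 - sq_err / wi_den,
        "U": rmse / theil_den,
    }


# Hypothetical observed and predicted values, for illustration only.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MASE and U95 are left out because their definitions depend on choices (a naive reference forecast and an uncertainty convention, respectively) that are not specified here.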
Funding
No funds, grants, or other support was received.
Author information
Contributions
Reza Sarkhani Benemaran: Conceptualization, Data curation, Formal analysis, Project administration, Supervision. Erfan Khajavi: Software, Validation, Visualization, Writing (original draft). Amir Reza Taghavi Khanghah: Visualization.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Sarkhani Benemaran, R., Khajavi, E. & Taghavi Khanghah, A.R. Construction of multi-parameter intelligent comprehensive estimation algorithms for soil compression modulus. Sci Rep (2026). https://doi.org/10.1038/s41598-026-43812-1


