Abstract
Cracks are a key indicator of structural integrity and in-service performance, so crack detection is essential for the condition assessment and preventive maintenance of bridges. To address the challenge of detecting cracks of various scales and shapes against low-contrast backgrounds in bridge inspection tasks, this paper proposes a robust detection method named CLGDS. It is based on YOLO11 with enhanced feature fusion and SIoU loss optimization, which improves the accuracy and robustness of crack identification. The proposed framework includes three key innovations. (1) A Cross Stage Partially Large Separable Kernel Attention (C2LSKA) module is integrated into the backbone network to enhance the representation of crack features under morphological diversity and complex background interference. (2) A Gathering and Distributing (GD) mechanism serves as the neck network, facilitating multi-scale feature fusion and improving detection performance for cracks of varying scales and with geometrically irregular edges. (3) A Scylla-IoU (SIoU) loss function is introduced to replace the commonly used Complete IoU (CIoU) loss. By explicitly incorporating directional sensitivity and multi-scale adaptability, SIoU mitigates angle-dependent misalignment during bounding box regression. Experimental results demonstrate that CLGDS achieves a mean average precision (mAP@50) of 93.5%, outperforming YOLOv5, YOLOv8, and YOLO11 by margins of +1.5%, +0.6%, and +1.2%, respectively. Furthermore, it attains a mAP@50-95 of 68.5%, higher than that of YOLOv5 (61.3%), YOLOv8 (62.6%), and YOLO11 (65.3%). These results validate the effectiveness of CLGDS for accurate bridge crack detection, providing a solid technical foundation for automated structural health monitoring and preventive maintenance.
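To make the role of the SIoU loss more concrete, the following is a minimal sketch of how such a loss can be computed for a single pair of axis-aligned boxes. It follows the formulation of the SIoU loss as it is commonly implemented (an angle cost that modulates a distance cost, plus a shape cost), and it is not the CLGDS training code. The function name `siou_loss`, the box format `(x1, y1, x2, y2)`, the shape exponent `theta=4.0`, and the stabilizer `eps` are assumptions made for illustration.

```python
import math


def siou_loss(box, gt, theta=4.0, eps=1e-9):
    """Illustrative SIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper: theta and eps are assumed values, not taken
    from the paper.
    """
    # Widths, heights, and centres of the predicted and target boxes.
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    cxg, cyg = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2

    # Plain IoU term.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    union = w * h + wg * hg - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])

    # Angle cost: sin(2 * alpha), where alpha is the angle between the
    # centre-to-centre line and the nearer coordinate axis. It is 0 when
    # the centres are axis-aligned and 1 at 45 degrees.
    dx, dy = cxg - cx, cyg - cy
    sigma = math.hypot(dx, dy) + eps
    sin_alpha = min(abs(dx), abs(dy)) / sigma
    angle_cost = math.sin(2.0 * math.asin(sin_alpha))

    # Distance cost: centre offsets normalised by the enclosing box,
    # weighted by gamma = 2 - angle_cost.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1.0 - math.exp(-gamma * rho_x)) + (1.0 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(w - wg) / (max(w, wg) + eps)
    omega_h = abs(h - hg) / (max(h, hg) + eps)
    shape_cost = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0


target = (0.0, 0.0, 10.0, 10.0)
loss_same = siou_loss(target, target)
loss_near = siou_loss((1.0, 0.0, 11.0, 10.0), target)
loss_far = siou_loss((5.0, 0.0, 15.0, 10.0), target)
```

In this sketch the loss is approximately zero for identical boxes and grows as the predicted box drifts away from the target, with the angle term changing how strongly the centre offset is penalized.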
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
The authors thank the team that provided the crack image dataset and the reviewers for their expert comments and suggestions.
Funding
The authors gratefully acknowledge the support from the Sichuan Provincial Science and Technology Program (Key Research and Development Project: 23ZDYF0110) and the Sichuan Provincial Education Science Planning Program (Major Project of Collaborative Research: SCJG24A003-1).
Author information
Contributions
Jiao Bao: Writing-review and editing, Methodology. Canjun Xiao: Funding acquisition, Software. Dong Guo: Conceptualization, Methodology. Chenyu Wang: Conducted the experiments. Mi Peng and Xinping Zhao: Writing-original draft. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jiao, B., Canjun, X., Dong, G. et al. CLGDS: robust bridge crack detection with YOLO enhanced feature fusion and SIoU optimization. Sci Rep (2026). https://doi.org/10.1038/s41598-026-42727-1
DOI: https://doi.org/10.1038/s41598-026-42727-1