CLGDS: robust bridge crack detection with YOLO enhanced feature fusion and SIoU optimization

Jiao, Bao; Canjun, Xiao; Dong, Guo; Chenyu, Wang; Mi, Peng; Xinping, Zhao

doi:10.1038/s41598-026-42727-1

Article
Open access
Published: 04 April 2026

Bao Jiao¹, Xiao Canjun¹, Guo Dong², Wang Chenyu¹, Peng Mi¹^na1 & Zhao Xinping¹^na1

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Cracks are a key indicator of structural integrity and in-service performance, so crack detection is essential for the condition assessment and preventive maintenance of bridges. To address the challenge of detecting cracks of various scales and shapes against low-contrast backgrounds in bridge inspection tasks, this paper proposes a robust detection method named CLGDS. It is based on YOLO11 with enhanced feature fusion and SIoU loss optimization, which improves the accuracy and robustness of crack identification. The proposed framework includes three key innovations. (1) A Cross Stage Partially Large Separable Kernel Attention (C2LSKA) module is integrated into the backbone network to enhance the representation of crack features when cracks are morphologically diverse and the background interference is complex. (2) A Gathering and Distributing (GD) mechanism serves as the neck network, facilitating multi-scale feature fusion and improving detection performance for cracks of varying scales and geometrically irregular edges. (3) A Scylla-IoU (SIoU) loss function replaces the commonly used Complete IoU (CIoU) loss. By explicitly incorporating directional sensitivity and multi-scale adaptability, SIoU mitigates angle-dependent misalignment during bounding box regression. Experimental results show that CLGDS achieves a mean average precision (mAP@50) of 93.5%, outperforming YOLOv5, YOLOv8, and YOLO11 by margins of +1.5%, +0.6%, and +1.2%, respectively. Furthermore, it attains a mAP@50-95 of 68.5%, which exceeds that of YOLOv5 (61.3%), YOLOv8 (62.6%), and YOLO11 (65.3%) by 7.2, 5.9, and 3.2 percentage points, respectively. These results support the effectiveness of CLGDS for accurate bridge crack detection and provide a technical foundation for automated structural health monitoring and preventive maintenance.
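
The SIoU loss named in item (3) can be illustrated with a minimal sketch. It assumes the formulation in Gevorgyan's SIoU preprint (arXiv:2205.12740), where the loss is 1 - IoU plus the average of a distance cost, whose strength depends on the angle between the box centers, and a shape cost. The function name `siou_loss`, the (x1, y1, x2, y2) box layout, and the exponent `theta=4.0` are illustrative assumptions, not details taken from the CLGDS implementation.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-9):
    """Sketch of the SIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper for illustration only; theta and eps are assumed values.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Offset between the two box centers.
    dx = (tx1 + tx2) / 2 - (px1 + px2) / 2
    dy = (ty1 + ty2) / 2 - (py1 + py2) / 2
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: 1 - 2*sin^2(alpha - pi/4), which equals sin(2*alpha).
    # It is 0 when the centers are axis-aligned and 1 at 45 degrees.
    sin_alpha = min(abs(dy) / sigma, 1.0)
    angle_cost = 1.0 - 2.0 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost: normalized center offsets, weighted by gamma = 2 - angle cost.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(pw - tw) / (max(pw, tw) + eps)
    omega_h = abs(ph - th) / (max(ph, th) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0


box = (0.0, 0.0, 2.0, 2.0)
same = siou_loss(box, box)                         # close to zero
shifted = siou_loss(box, (1.0, 0.0, 3.0, 2.0))     # horizontal offset
diagonal = siou_loss(box, (1.0, 1.0, 3.0, 3.0))    # diagonal offset, less overlap
```

For identical boxes the loss is approximately zero, and it grows as the predicted box moves away from the target or its width and height diverge from the target's.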

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

The authors acknowledge the team for providing the crack image dataset; we also sincerely thank the reviewers for their expert comments and suggestions.

Funding

The authors gratefully acknowledge the support from the Sichuan Provincial Science and Technology Program (Key Research and Development Project: 23ZDYF0110) and the Sichuan Provincial Education Science Planning Program (Major Project of Collaborative Research: SCJG24A003-1).

Author information

Peng Mi and Zhao Xinping contributed equally to this work.

Authors and Affiliations

Institute of Mine Intelligence, Chengdu Technological University, Chengdu, 610031, China
Bao Jiao, Xiao Canjun, Wang Chenyu, Peng Mi & Zhao Xinping
China Southwest Architectural Design and Research Institute Corp. Ltd, Chengdu, 610041, Sichuan, China
Guo Dong

Contributions

Jiao Bao: Writing-review and editing, Methodology. Canjun Xiao: Funding acquisition, Software. Dong Guo: Conceptualization, Methodology. Chenyu Wang: Conducted the experiments. Mi Peng and Xinping Zhao: Writing-original draft. All authors reviewed the manuscript.

Corresponding author

Correspondence to Xiao Canjun.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Jiao, B., Canjun, X., Dong, G. et al. CLGDS: robust bridge crack detection with YOLO enhanced feature fusion and SIoU optimization. Sci Rep (2026). https://doi.org/10.1038/s41598-026-42727-1

Received: 11 October 2025
Accepted: 27 February 2026
Published: 04 April 2026
DOI: https://doi.org/10.1038/s41598-026-42727-1