Abstract
The detection of small target defects on aluminum surfaces is critical in modern manufacturing, particularly in sectors such as aerospace, automotive, electronics, and high-end equipment manufacturing. These defects can severely compromise the safety, stability, and durability of products. However, due to the diversity of surface defect types, their small size, and the presence of complex background interference, traditional detection methods often struggle to achieve high precision under conditions of low contrast and significant noise. To address this challenge, this paper proposes an enhanced small target detection method for aluminum surfaces, leveraging the YOLOv11n framework. Specifically, we introduce a dilation-wise residual and dilated reparameterization block module to strengthen the model’s feature extraction capabilities, thereby improving the capture of fine details in small targets. In addition, the SimAM attention mechanism is integrated to optimize the model’s focus on critical feature regions, further enhancing its sensitivity and recognition performance for small defects. Moreover, we incorporate the CARAFE (Content-Aware ReAssembly of Features) upsampling operator, which effectively enlarges small target details and mitigates the information loss inherent in conventional upsampling techniques, thus significantly boosting detection accuracy. Experimental results show that the proposed model achieves a mean average precision (mAP@0.5) of 79.4% and a recall of 76.6%, reflecting improvements of 2.9% and 4.4% over the baseline model, respectively. Compared to existing methods, our approach demonstrates notable advantages in both detection accuracy and recognition ability, providing a promising foundation for future practical applications in industrial scenarios.
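As an illustration of the parameter-free attention named above, the following minimal sketch applies SimAM-style weighting to a single feature channel. It follows the published SimAM formulation (inverse energy passed through a sigmoid) and is not the authors' implementation. The function name `simam`, the regularizer default `lam=1e-4`, and the toy 3×3 input are assumptions made for this example.

```python
import math


def simam(channel, lam=1e-4):
    """Apply SimAM-style weighting to one 2D feature channel (a list of rows).

    Hypothetical helper for illustration; `lam` is the regularization
    constant added to the channel variance.
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" activations in the channel
    if n < 1:
        raise ValueError("channel must contain at least two activations")
    mean = sum(values) / len(values)
    # Squared deviation of every activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n
    out = []
    for row, dev_row in zip(channel, sq_dev):
        new_row = []
        for v, d in zip(row, dev_row):
            # Inverse energy grows for activations that stand out from the mean.
            inv_energy = d / (4.0 * (variance + lam)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            new_row.append(v * weight)
        out.append(new_row)
    return out


# Toy channel: one strong activation (a stand-in for a small defect)
# on a flat background.
toy = [
    [0.1, 0.1, 0.1],
    [0.1, 2.0, 0.1],
    [0.1, 0.1, 0.1],
]
weighted = simam(toy)
print(weighted[1][1] / toy[1][1], weighted[0][0] / toy[0][0])
```

In this toy case the isolated strong activation receives a larger weight than the uniform background, which is the behavior that makes this kind of attention attractive for small, low-contrast defects. Because the weights come from channel statistics alone, the module adds no learnable parameters.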
Data availability
The dataset generated during the current study is available from the corresponding author upon reasonable request.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
Rui Zhang designed the methodology, conducted the experiments, and wrote the original draft. Shanshan Cai conceived the study, supervised the project, and contributed to the original draft. Zhen He contributed to software implementation, experimental validation, and review and editing of the manuscript. Yingjie Zhao assisted with data preprocessing, visualization, and data curation.
Ethics declarations
Conflict of interest
The authors have no conflicts to disclose.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, R., Cai, S., He, Z. et al. Aerospace aluminum surface defect detection method based on Multi-Scale Convolution and attention mechanism. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37293-5
DOI: https://doi.org/10.1038/s41598-026-37293-5


