Abstract
Deploying deep learning–based object detectors on low-compute edge AI SoCs remains challenging, as real-world performance depends on factors beyond nominal TOPS ratings, including architectural design, memory bandwidth, and system-level contention. This study presents a comprehensive, reproducible benchmark of nine YOLO variants across three widely used Rockchip SoCs, covering multiple input resolutions, compute configurations, and operating conditions. Our results show that inference latency correlates more strongly with detection accuracy (mAP) than with FLOPs or parameter count, revealing the execution overhead introduced by recent architectural modules. Latency scaling with input size deviates from the quadratic behavior predicted in theory because of bandwidth limitations, and multi-core NPU scheduling yields only marginal gains owing to synchronization and shared-memory bottlenecks. Under multitasking stress, memory bandwidth emerges as the primary factor governing robustness, while energy-per-inference measurements highlight substantial efficiency differences across SoCs. These findings offer practical guidance for selecting and deploying object detection models on embedded platforms, emphasizing the need for hardware-aware model choices and memory-efficient optimizations in real-time edge AI applications.
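As a reading aid, the sketch below illustrates the kind of rank-correlation analysis that underlies the latency–accuracy finding. It is a minimal Python example with placeholder numbers, not the study's measured data or its actual analysis pipeline.

    import numpy as np
    from scipy.stats import spearmanr

    # Hypothetical per-model metrics; substitute measured on-device values.
    latency_ms = np.array([12.1, 18.4, 25.3, 31.0, 44.7])  # measured inference latency
    map_50_95 = np.array([0.28, 0.33, 0.37, 0.41, 0.46])   # COCO mAP@0.5:0.95
    gflops = np.array([4.5, 11.0, 7.2, 28.6, 16.8])        # reported model FLOPs

    # Spearman rank correlation: which predictor tracks measured latency more closely?
    rho_map, _ = spearmanr(latency_ms, map_50_95)
    rho_flops, _ = spearmanr(latency_ms, gflops)
    print(f"latency vs mAP: rho={rho_map:.2f}; latency vs FLOPs: rho={rho_flops:.2f}")

In the paper's setting, the abstract's claim corresponds to the latency–mAP rank correlation exceeding the latency–FLOPs rank correlation across the benchmarked YOLO variants.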
Data availability
The data that support the findings of this study are available from the corresponding author, Dr. Feng Li, upon reasonable request.
Acknowledgements
This work was supported by the Post-doctoral Later-stage Foundation Project of Shenzhen Polytechnic University (Grant No. 6023271039K), the Research Projects of the Department of Education of Guangdong Province (Grant No. 2024KTSCX052), and the Shenzhen Polytechnic University Research Fund (Grant Nos. 6023310030K and 6024310045K).
Funding
This work was funded by the Post-doctoral Later-stage Foundation Project of Shenzhen Polytechnic University (Grant No. 6023271039K), the Research Projects of the Department of Education of Guangdong Province (Grant No. 2024KTSCX052), and the Shenzhen Polytechnic University Research Fund (Grant Nos. 6023310030K and 6024310045K).
Author information
Authors and Affiliations
Contributions
C.K. conceived the study, designed the experiments, and wrote the initial draft of the manuscript. F.L. supervised the project, provided critical revisions, and approved the final version of the paper. X.Y. and J.Y. performed experiments and data collection. P.M. and Q.L. contributed to model deployment and hardware benchmarking. R.M. assisted with data analysis and visualization. All authors discussed the results and contributed to the final manuscript.
Corresponding author
Correspondence to Feng Li.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kong, C., Li, F., Yan, X. et al. Object detection on low-compute edge SoCs: a reproducible benchmark and deployment guidelines. Sci Rep (2026). https://doi.org/10.1038/s41598-026-36862-y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-026-36862-y