Introduction

Edge artificial intelligence (AI) has rapidly gained importance as computer vision systems are increasingly deployed in cost-sensitive and resource-constrained environments. Among various tasks, object detection is a central component in applications such as surveillance, robotics, automotive systems, and consumer electronics1,2,3,4. While cloud-based solutions can leverage high-end accelerators, practical deployment often requires running models directly on embedded devices, where low latency, limited memory bandwidth, and tight energy budgets are dominant constraints.

To address such requirements, system-on-chip (SoC) platforms that integrate CPUs, NPUs, and dedicated accelerators have become increasingly common5,6. However, most benchmarking efforts have targeted high-performance edge devices–such as server-grade NVIDIA GPUs, Google TPUs, or NVIDIA Jetson platforms–rather than the low-compute SoC accelerators that dominate embedded vision applications7,8,9,10. These lower-power SoCs, though ubiquitous in industrial cameras, IoT endpoints, and consumer devices, offer substantially constrained compute capacity and memory bandwidth. Their performance under realistic operating conditions, including resource contention and bandwidth saturation, remains insufficiently characterized.

In parallel, object detection models have advanced rapidly. The YOLO family, from early versions11,12,13,14 to more recent YOLOv5, YOLOv8, and YOLO1115,16,17,18,19, has become one of the most widely adopted frameworks for real-time detection, offering scalable variants that balance accuracy and efficiency. Recent contributions have also emphasized lightweight adaptations of YOLO for constrained environments, including LAI-YOLOv5s for UAV applications20, SOD-YOLO for small-object detection with reduced memory overhead21, MSGD-YOLO for low-cost edge intelligence22, FRYOLO for IoT embedded devices23, and PCPE-YOLO with dynamically reconfigurable backbones24. These studies illustrate the breadth of model-level innovations aimed at efficiency, yet how these models behave when deployed on entry-level SoCs–across different architectures, model sizes, and input resolutions–remains insufficiently studied. In particular, the interaction between algorithmic complexity and hardware constraints poses practical challenges for selecting the right combination of models and configurations.

To fill this gap, we present a comprehensive and reproducible benchmarking study of YOLO models on three representative Rockchip SoCs (RV1106, RK3568, RK3588). We evaluate Nano, Small, and Medium scales across multiple input resolutions, analyze compute scalability across SoCs, and examine the effects of multi-core scheduling and system-level contention. Our deployment pipeline (ONNX export, quantization, compilation, and execution) is described in detail to ensure reproducibility25. This article extends our earlier conference work26 by broadening the scope of evaluated SoCs and YOLO model families, incorporating detailed measurements of power consumption and energy per inference, and conducting deeper analysis of system-level constraints such as memory-bandwidth limitations, multi-core NPU scheduling, and resource contention.

Throughout this work, we use the term latency (or equivalently, inference time) to denote the end-to-end per-frame processing time measured at the NPU runtime. We define real-time following common practice in embedded vision: 25–30 FPS for surveillance and consumer video pipelines, and 50–60 FPS for high-speed perception tasks such as UAV obstacle avoidance. These thresholds are widely referenced in embedded robotics and vision literature.
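As a quick reference, the frame-rate thresholds above map to per-frame latency budgets as follows; this minimal helper simply inverts the frame rate, and the printed values match the thresholds used throughout this paper:

```python
def latency_budget_ms(fps: float) -> float:
    """Per-frame latency budget (ms) implied by a target frame rate."""
    return 1000.0 / fps

# The four real-time thresholds considered in this work.
for fps in (25, 30, 50, 60):
    print(f"{fps} FPS -> {latency_budget_ms(fps):.1f} ms/frame")
```

A model meets a given real-time target on a given SoC exactly when its mean per-frame latency stays below the corresponding budget.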

Our quantitative analysis reveals several key observations: (i) inference latency correlates more strongly with model accuracy (mAP) than with FLOPs or parameter count; (ii) resolution scaling deviates from quadratic theoretical predictions due to memory-bandwidth limitations; (iii) performance scaling with theoretical TOPS is consistently sublinear (typically 1.3–2.0\(\times\) observed speedup per 2\(\times\) increase in TOPS); (iv) multi-core NPU scheduling yields only marginal gains (generally \(<10\)%); (v) memory-bandwidth contention induces significant slowdowns on RV1106 and RK3568 (50–270% degradation), while RK3588 remains largely unaffected due to its higher sustained bandwidth; and (vi) although power draw is similar across SoCs, energy per inference varies substantially with both model scale and hardware capability, with higher-performance SoCs consistently achieving significantly higher energy efficiency. These results demonstrate the importance of hardware-aware deployment strategies and motivate further work on bandwidth- and energy-efficient model design for embedded AI.

Related work

Since the introduction of YOLOv111, the framework has undergone continuous improvements, making it one of the most widely adopted object detection architectures. The YOLO9000 work12 introduced YOLOv2, enhancing the original design with improvements drawn from prior work, and addressed the prediction of unlabeled object categories by jointly training on COCO and ImageNet, enabling real-time detection of over 9000 classes. YOLOv313 enhanced detection robustness through multi-scale feature fusion, while YOLOv414 introduced new training strategies to improve accuracy and efficiency. YOLOv515 further optimized the architecture for scalability, and subsequent versions such as YOLOv716 and YOLOv8 improved feature extraction and computational efficiency. YOLOv917 and YOLOv1018 further enhanced the framework by introducing advanced backbone architectures, improved feature aggregation strategies, and more efficient training techniques, leading to better accuracy-speed trade-offs and improved adaptability across diverse deployment scenarios. The latest YOLO1119 continues this trend, pushing the boundaries of detection accuracy and speed. Beyond object detection, the YOLO framework has been extended to image classification, instance segmentation, pose estimation, and oriented object detection (OBB), further broadening its applicability.

Beyond architectural advances in YOLO, a substantial body of research investigates how object detection models behave on different hardware platforms. Existing studies, however, primarily target high-compute accelerators or focus on designing new lightweight models rather than systematically benchmarking standard YOLO models on low-compute SoCs.

EdgeYOLO9 proposes a customized, anchor-free detector tailored for edge devices, achieving real-time performance on the NVIDIA Jetson AGX Xavier. While this work demonstrates the feasibility of designing lightweight architectures for edge inference, its focus lies in model innovation rather than in characterizing the system-level behavior of existing YOLO models across heterogeneous low-power hardware. Similarly, Zhu et al.8 evaluate YOLOv3 and PP-YOLO on Jetson Nano and Xavier NX, providing latency measurements and deployment recommendations for specific NVIDIA platforms. These studies offer valuable insights but are limited to GPU-based embedded devices and do not analyze the broader design space involving multiple SoCs, varying input resolutions, NPU architectures, or system resource contention.

At the datacenter scale, Jouppi et al.7 distill lessons learned from three generations of Google TPUv4i deployment, emphasizing compiler design, quantization strategies, workload evolution, and cost-effective large-scale inference. Although highly influential, this line of work focuses on industrial-grade AI accelerators rather than resource-constrained SoCs, and therefore does not address the deployment bottlenecks unique to low-bandwidth embedded NPUs.

More comprehensive benchmarking efforts such as AIBench27 cover a wide range of AI tasks–including object detection–and provide benchmark suites for datacenter, HPC, IoT, and edge computing environments. However, AIBench primarily evaluates algorithmic workloads and system scalability rather than conducting fine-grained, model-level performance analyses on commercial low-compute SoCs. In particular, it does not examine how modern YOLO variants scale with input resolution, how performance is affected by memory bandwidth limitations, or how multi-core NPU scheduling and system-level contention impact real-time deployment.

In summary, prior benchmarking studies either evaluate a limited set of models on mid-to-high-end accelerators or focus on new lightweight detector designs. To the best of our knowledge, no prior work provides a systematic, reproducible, and cross-platform benchmark of contemporary YOLO models on low-compute SoCs. Furthermore, key deployment factors–such as bandwidth-driven latency scaling, sublinear TOPS-to-latency relationships, multi-core NPU scheduling overheads, energy-per-inference behavior, and real-world CPU/memory contention–remain largely unexplored. Our work fills this gap by offering a comprehensive evaluation and practical deployment guidelines tailored for widely used Rockchip SoCs, integrating latency, bandwidth sensitivity, multi-core behavior, and power–energy measurements into a unified benchmarking study.

Methods

Hardware platforms

In order to assess the practicality of running YOLO detectors on lightweight edge processors, we investigated three Rockchip SoCs–RV1106, RK3568, and RK3588–covering a broad range of compute power and memory resources. These platforms are widely adopted in embedded AI applications, including smart surveillance cameras, industrial monitoring, and IoT systems. Their detailed specifications, including CPU type, NPU capacity, memory size, and measured memory bandwidth, are listed in Table 1. Notably, the bandwidth results were obtained using the mbw benchmark28, reflecting practical throughput rather than theoretical peak specifications.

Table 1 Summary of processor architectures and memory configurations for the selected Rockchip SoCs, reflecting their compute and bandwidth diversity.

In addition to raw specifications, all three SoCs incorporate Rockchip’s third-generation Neural Processing Unit (NPU) architecture. The design integrates on-chip buffering, weight decompression, and zero-skipping to improve computational efficiency. As illustrated in Fig. 1, the NPU consists of the Convolution Neural Accelerator (CNA), Data Processing Unit (DPU), and Pooling Processing Unit (PPU), supported by a shared MAC array. These components form the core execution pipeline for convolutional and activation operations, while large intermediate feature maps are stored in off-chip DRAM. Similar to other embedded accelerators, practical performance on Rockchip NPUs is shaped not only by compute capacity but also by memory throughput and scheduling efficiency5,6.

Fig. 1

The architecture of Rockchip’s third-generation NPU, which is adopted in RV1106, RK3568, and RK3588 SoCs.

YOLO models

We selected three families of YOLO models–YOLOv5, YOLOv8, and YOLO11–that represent widely adopted and recently developed object detection frameworks. For each series, we examined three configurations (Nano, Small, and Medium). Their architectural characteristics, including depth, width, number of parameters, GFLOPs, and mean Average Precision (mAP) on the COCO dataset, are listed in Table 2.

To aid interpretation, we briefly clarify the architectural terms used throughout this paper. C3 refers to the CSP-like bottleneck adopted in YOLOv5, composed of multiple convolutions and residual branches. C2f, introduced in YOLOv8, fuses intermediate features to enhance parallelism and reduce redundant operations, but requires more sophisticated backend optimizations. C3k2, integrated in YOLO11, modifies C3 with stacked kernels (\(k=2\)), which improves representational capacity but may increase activation memory usage. In addition, the Nano, Small, and Medium variants are obtained by scaling depth and width multipliers, directly affecting parameter count, activation size, and GFLOPs. These scaling factors play an important role in determining how efficiently a model can be executed on resource-limited NPUs.
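To make the Nano/Small/Medium scaling concrete, the sketch below applies the depth and width multipliers published in the YOLOv5 configuration files (0.33/0.25 for Nano, 0.33/0.50 for Small, 0.67/0.75 for Medium). The rounding conventions mirror the `make_divisible` helper in the YOLOv5 codebase; the base values (256 channels, 9 block repeats) are illustrative only, not taken from any specific layer:

```python
import math

def scale_channels(c: int, width_mult: float, divisor: int = 8) -> int:
    # Round scaled channel counts up to the nearest multiple of `divisor`,
    # mirroring the make_divisible convention used by YOLOv5.
    return math.ceil(c * width_mult / divisor) * divisor

def scale_depth(n: int, depth_mult: float) -> int:
    # Scale the number of repeated blocks, keeping at least one.
    return max(round(n * depth_mult), 1)

# Illustration with YOLOv5's published (depth, width) multipliers.
for name, d, w in [("Nano", 0.33, 0.25), ("Small", 0.33, 0.50), ("Medium", 0.67, 0.75)]:
    print(name, scale_channels(256, w), scale_depth(9, d))
```

The width multiplier shrinks channel counts (and thus parameters and activation size), while the depth multiplier shrinks the number of stacked blocks; together they account for the parameter and GFLOP differences listed in Table 2.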

Table 2 Architectural specifications of representative YOLO variants tested with 640\(\times\)640 input images. Parameter counts are reported in millions (M), and mAP denotes the COCO validation score averaged over IoU thresholds 0.50–0.95.

This selection covers a broad spectrum from lightweight, latency-oriented models to more accurate, compute-intensive variants. It allows us to systematically explore the trade-offs between accuracy, latency, and hardware constraints across different SoCs.

Deployment workflow

The deployment of YOLO models on Rockchip SoCs follows a multi-stage process to ensure compatibility with low-compute NPUs. The workflow is illustrated in Fig. 2, which outlines model conversion, quantization, compilation, and execution.

  • Model Conversion: Trained YOLO models were exported into the ONNX format to ensure interoperability across platforms.

  • Quantization: Models were quantized to INT8 using either Post-Training Quantization (PTQ) or Quantization-Aware Training (QAT)25. PTQ was used for most experiments, while QAT was applied to preserve accuracy for selected Medium models.

  • Compilation: Quantized models were compiled into Rockchip-specific offline binaries using the RKNN Toolkit, translating high-level neural operations into low-level NPU instructions.

  • Execution: Compiled models were deployed within the Rockchip runtime environment, with inference conducted directly on the NPUs of RV1106, RK3568, and RK3588.

This workflow mirrors typical industrial deployment pipelines on embedded hardware, and ensures reproducibility by aligning with Rockchip’s SDK conventions. Similar workflows are reported for Jetson and Edge TPU deployments8.

All models were quantized using PTQ with the RKNN Toolkit (version 2.3.0). A fixed subset of 64 images randomly sampled from the COCO 2017 validation set was used for calibration, and the same subset was applied across all YOLO variants and all SoCs to ensure consistent quantization behavior. All deployments used the original FP32 PyTorch weights; no model was retrained, fine-tuned, pruned, or structurally modified beyond quantization. We adopted the default RKNN PTQ configuration, which automatically selects symmetric or asymmetric quantization and applies channel-wise activation quantization as appropriate. The compiler’s standard graph fusion and optimization passes were enabled, and no additional vendor-specific optimizations were introduced. These unified settings ensure reproducibility and provide a fair comparison across hardware platforms.
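The fixed calibration subset described above can be reproduced with a short script. The function below is a sketch under the assumption that the RKNN Toolkit accepts a plain-text list with one image path per line as its calibration dataset; the directory argument is a placeholder for wherever the COCO 2017 validation images reside:

```python
import random
from pathlib import Path

def write_calibration_list(img_dir: str, out_file: str = "dataset.txt",
                           n_images: int = 64, seed: int = 0) -> list:
    """Sample a fixed calibration subset and write one image path per line.

    Sorting before sampling and fixing the RNG seed guarantees that the
    same 64 images are selected on every run, for every model and SoC.
    """
    images = sorted(str(p) for p in Path(img_dir).glob("*.jpg"))
    rng = random.Random(seed)        # fixed seed -> identical subset every run
    subset = rng.sample(images, n_images)
    Path(out_file).write_text("\n".join(subset))
    return subset
```

Reusing the resulting `dataset.txt` across all nine YOLO variants and all three SoCs keeps the quantization calibration identical, which is the property the comparison relies on.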

Fig. 2

Schematic diagram of inference workflow on resource-constrained AI processors.

Experimental setup and evaluation metrics

We evaluated nine YOLO variants (v5/v8/v11 in n/s/m scales) at input resolutions \(\{480,640,800,960,1120,1280\}^2\). Each model–SoC pair was tested over 100 inference runs, discarding the first 10 to eliminate initialization effects. Across all models, resolutions, and SoCs, the run-to-run latency variance remained extremely low. The standard deviation was consistently below 1% of the mean after warm-up. This stability is expected because the Rockchip NPUs operate with fixed-frequency, deterministic execution pipelines. Therefore, the mean latency is a reliable and representative metric for comparison.
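The latency aggregation protocol above (discard the first 10 runs, then report mean and standard deviation) can be sketched as:

```python
from statistics import mean, stdev

def summarize_latency(samples_ms: list, warmup: int = 10) -> dict:
    """Discard warm-up iterations, then report mean, std, and relative std."""
    steady = samples_ms[warmup:]
    mu, sigma = mean(steady), stdev(steady)
    return {"mean_ms": mu, "std_ms": sigma, "rel_std": sigma / mu}
```

A run is treated as stable when `rel_std` is below 0.01, i.e., the under-1% criterion stated above; under that condition the mean is a faithful single-number summary.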

Real-time feasibility was analyzed against thresholds of 25 FPS (40 ms), 30 FPS (33.3 ms), 50 FPS (20 ms), and 60 FPS (16.7 ms). System-level robustness was examined by introducing controlled CPU and memory loads using stress-ng29.

Our evaluation emphasizes not only model complexity (parameters, FLOPs, mAP) but also SoC scalability, multi-core scheduling behavior, and sensitivity to resource contention. This comprehensive approach expands upon prior embedded benchmarking studies10,27, providing detailed insights into low-compute SoC deployment.

Results

Inference latency and model complexity

To examine the relationship between model complexity and inference performance, we evaluated nine YOLO variants (YOLOv5n/s/m, YOLOv8n/s/m, YOLO11n/s/m) on the three Rockchip SoCs at a fixed input resolution of 640\(\times\)640. Latency was measured as the mean inference time over 100 runs, excluding warm-up iterations.

As shown in Fig. 3, latency varied significantly across SoCs. On the entry-level RV1106, Nano-scale models required 68–97 ms, while Medium-scale models exceeded 250 ms, making real-time inference unattainable. The mid-tier RK3568 achieved faster execution, reducing Nano models to 34–79 ms and Medium models to 158–312 ms. The high-end RK3588, operating in single-core NPU mode, achieved the best performance: Nano variants consistently ran in 21–30 ms, sufficient for 30 FPS, while Medium variants remained below 100 ms (e.g., YOLO11m at 92 ms).

Interestingly, inference latency did not scale directly with parameter count or GFLOPs. For example, YOLO11n, with fewer parameters than YOLOv8n, exhibited higher latency on all three SoCs. Similarly, YOLOv8m consistently showed slower performance than YOLOv5m despite comparable parameter counts. Instead, latency correlated more closely with model accuracy (mAP), suggesting that architectural innovations (e.g., C2f in YOLOv8, C3k2 in YOLO11) introduce additional computational overheads that do not map efficiently onto NPUs.

These results demonstrate that, while more powerful SoCs can substantially reduce inference time, the relationship between model complexity and latency is influenced not only by FLOPs and parameters but also by architectural design choices.

Fig. 3

Average inference time of YOLO series models evaluated on three Rockchip SoCs using 640\(\times\)640 input images.

Resolution scaling

We next investigated how inference latency scales with input resolution, which was varied from 480\(\times\)480 to 1280\(\times\)1280 across the three SoCs. Since computational complexity theoretically increases quadratically with resolution, a fourfold increase in FLOPs would be expected when doubling both height and width (e.g., from 640\(\times\)640 to 1280\(\times\)1280). However, measured inference latency deviated from this quadratic trend, reflecting hardware-specific bottlenecks.
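A convenient way to quantify these deviations is the empirical scaling exponent \(\alpha\) obtained from two (resolution, latency) measurements under the model \(t \propto r^{\alpha}\): \(\alpha \approx 2\) indicates quadratic scaling, larger values indicate super-quadratic (bandwidth-limited) behavior, and smaller values sub-quadratic behavior. The numeric values below are illustrative, not measurements from our tables:

```python
import math

def scaling_exponent(r1: float, t1: float, r2: float, t2: float) -> float:
    """Empirical exponent alpha assuming t ~ r**alpha between two points."""
    return math.log(t2 / t1) / math.log(r2 / r1)

# Hypothetical illustration: doubling resolution 640 -> 1280 with latency
# rising 4.9x gives alpha > 2, i.e., super-quadratic scaling.
alpha = scaling_exponent(640, 50.0, 1280, 245.0)
```

Computing \(\alpha\) per model and per SoC makes the qualitative "sub-quadratic vs. super-quadratic" comparison in this section directly quantitative.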

As illustrated in Fig. 4, latency growth was nonlinear and strongly dependent on the SoC. On the low-power RV1106, Nano models such as YOLOv5n increased from 68 ms at 640\(\times\)640 to nearly twice that at 1280\(\times\)1280, while Medium variants experienced even steeper slowdowns, far exceeding quadratic expectations. This behavior indicates that limited memory bandwidth (718 MiB/s) amplifies the effect of resolution scaling. On the mid-tier RK3568, latency grew more gradually but still displayed super-quadratic increases for larger models, with YOLO11m nearly doubling its inference time between 640\(\times\)640 and 1280\(\times\)1280. By contrast, the high-end RK3588 exhibited sub-quadratic growth: Nano models increased by less than 1.5\(\times\) and Medium variants by approximately 1.4\(\times\), demonstrating the stabilizing effect of higher bandwidth (8807 MiB/s) and more efficient parallel processing.

Model scale further influenced sensitivity to resolution changes. Nano variants remained relatively stable across resolutions, while Medium models consistently suffered from super-quadratic growth on bandwidth-limited platforms. This highlights the disproportionate impact of memory traffic and intermediate feature map size on larger architectures.

Overall, these results show that while theoretical complexity scales quadratically with input resolution, practical inference latency can be either sub-quadratic or super-quadratic, depending on the SoC’s memory subsystem and NPU scheduling. For low-compute devices, high-resolution inputs quickly become impractical, whereas higher-end SoCs retain usable performance even at 1280\(\times\)1280.

Fig. 4

Analysis of inference speed scaling with input size for YOLO-based detectors deployed on RV1106, RK3568, and RK3588 chips.

Sublinear compute scalability

We next analyzed how inference latency scales with the theoretical compute power of the three SoCs. The comparison was made between RV1106 (0.5 TOPS), RK3568 (1 TOPS), and RK3588 (2 TOPS, single-core mode), using the same set of nine YOLO models at 640\(\times\)640 resolution. Latency values were normalized to RV1106 to highlight relative improvements.

As summarized in Table 3, latency consistently decreased as compute capacity increased, but the improvements were generally sublinear relative to the expected ratios. For example, YOLOv5n achieved a 2.0\(\times\) speedup on RK3568 compared to RV1106, while YOLO11s improved by only 1.2\(\times\). On RK3588, the gains ranged from 2.9\(\times\) (YOLOv8n) to 4.9\(\times\) (YOLO11m), with most models falling short of the theoretical 4\(\times\) reduction expected when moving from 0.5 TOPS to 2 TOPS.

Scaling behavior also varied with model size. Larger models such as YOLO11m and YOLOv8m benefited more from increased compute resources, achieving speedups close to 5\(\times\) on RK3588. In contrast, smaller models such as YOLOv8n and YOLO11n showed more limited improvements, indicating that factors other than raw compute power constrain performance.

Overall, these results show that inference latency does not scale proportionally with theoretical compute capacity. Differences in memory bandwidth, scheduling efficiency, and hardware utilization introduce bottlenecks that prevent SoCs from achieving ideal scaling.
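This sublinearity can be expressed as a scaling-efficiency ratio: observed speedup divided by the ideal speedup implied by the TOPS ratio, where 1.0 means perfectly linear scaling with compute capacity. The numeric values below are illustrative, in the spirit of Table 3 rather than copied from it:

```python
def scaling_efficiency(t_base_ms: float, t_new_ms: float,
                       tops_base: float, tops_new: float) -> float:
    """Observed speedup divided by the ideal (TOPS-ratio) speedup.

    1.0 means latency improved linearly with theoretical compute;
    values below 1.0 quantify the sublinear scaling discussed above.
    """
    return (t_base_ms / t_new_ms) / (tops_new / tops_base)

# Hypothetical example: a 2.9x measured speedup against a 4x TOPS
# increase (0.5 -> 2 TOPS) corresponds to ~0.73 efficiency.
eff = scaling_efficiency(100.0, 100.0 / 2.9, 0.5, 2.0)
```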

Table 3 Relative inference time of YOLO family detectors on Rockchip SoCs, normalized to RV1106 baseline.

Multi-core scheduling efficiency

To assess the effect of workload distribution across multiple NPU cores, we benchmarked the nine YOLO models on RK3588, which integrates a three-core NPU. Experiments were conducted under different scheduling modes, including single-core execution, dual-core, three-core, and automatic scheduling.

The results, summarized in Table 4, show that multi-core scheduling did not lead to proportional reductions in latency. For small models such as YOLOv5n, the difference between single-core and multi-core execution was less than 5% (21.5 ms vs. 20.5 ms). For larger models, the gains were similarly modest or even negligible. For example, YOLO11m achieved 93.7 ms in single-core mode and 95.5 ms in three-core mode, indicating that overhead from synchronization and memory contention offset the benefits of parallel execution. Automatic scheduling also failed to outperform manual configurations and, in some cases, produced slightly higher latencies.

These findings suggest that while RK3588 provides hardware support for multi-core inference, the efficiency of current scheduling mechanisms remains limited. Performance improvements are constrained by inter-core communication, synchronization overhead, and shared memory bandwidth, preventing linear scaling with the number of cores.
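The scheduling overhead can be summarized with a per-core parallel-efficiency metric. Using the YOLO11m figures quoted above (93.7 ms single-core vs. 95.5 ms three-core), the efficiency is roughly 0.33, i.e., adding cores is counterproductive for this workload:

```python
def parallel_efficiency(t_single_ms: float, t_multi_ms: float, n_cores: int) -> float:
    """Speedup per core: 1.0 means ideal linear scaling across NPU cores."""
    return (t_single_ms / t_multi_ms) / n_cores

# YOLO11m on RK3588 (values from Table 4 as quoted in the text):
eff = parallel_efficiency(93.7, 95.5, 3)   # well below 1.0
```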

Table 4 Inference latency (ms) of YOLO models under different core scheduling strategies on RK3588.

Real-time feasibility

To evaluate whether the tested models can meet the requirements of real-time applications, we analyzed inference latency against common frame-rate thresholds: 25 FPS (40 ms), 30 FPS (33.3 ms), 50 FPS (20 ms), and 60 FPS (16.7 ms).

As shown in Fig. 5, real-time feasibility strongly depends on both the SoC and the model scale. On the entry-level RV1106, none of the models met the 25 FPS requirement, even at lower resolutions, confirming that this platform is unsuitable for real-time object detection with current YOLO variants. On the mid-tier RK3568, Nano models and some Small variants satisfied the 25–30 FPS thresholds at 640\(\times\)640 or below, but performance degraded quickly at higher resolutions. In contrast, the high-end RK3588 achieved real-time operation for all Nano and Small models, maintaining 30 FPS across most tested resolutions. However, Medium-scale models remained above 40 ms in many cases, and none of the models consistently achieved the stricter 50–60 FPS requirements.

These results demonstrate that while real-time inference is achievable on RK3588 for lightweight YOLO models, ultra-low-latency targets (50–60 FPS) remain challenging, particularly for larger architectures and higher input resolutions. On lower-end SoCs, only highly constrained use cases are feasible without further optimization.

Fig. 5

Real-time feasibility analysis of YOLO series models across edge AI SoCs.

System-level resource contention

To assess the robustness of inference under realistic multitasking scenarios, we introduced controlled CPU and memory loads while running the nine YOLO models at 640\(\times\)640 resolution. CPU utilization levels were set to 30%, 50%, 70%, and 90%, while memory loads were configured at approximately 25%, 50%, and 75% of each SoC’s measured bandwidth. CPU load was generated using the cpu class in stress-ng, while memory pressure was applied using the vm class, saturating DRAM bandwidth with streaming memory operations.

The results, summarized in Fig. 6, reveal that inference latency was more sensitive to memory contention than to CPU utilization across all platforms. On the entry-level RV1106, latency for YOLOv5n increased from 68 ms (no load) to 107 ms under high memory load, a 57% slowdown, whereas CPU stress at 90% utilization increased latency by only 13%. Similar trends were observed for larger models, with memory load consistently producing greater degradation. The mid-tier RK3568 showed even stronger sensitivity: for YOLOv5n, latency increased by more than 270% under heavy memory load, while CPU load effects remained below 30% for most cases. By contrast, the high-end RK3588 demonstrated resilience, with latency changes remaining below 5% even under heavy memory or CPU stress.

These findings highlight memory bandwidth as the dominant factor affecting inference robustness in resource-constrained SoCs. While CPU utilization introduces some variability, the ability of NPUs to offload compute limits its impact. In contrast, memory-intensive operations–especially for larger models–compete directly with other system tasks, leading to significant slowdowns on devices with limited bandwidth.

Fig. 6

Inference latency of YOLO models under varying CPU and memory loads on RV1106, RK3568, and RK3588 SoCs at 640\(\times\)640 resolution.

Power consumption and energy efficiency

Power consumption and energy-per-inference are critical metrics for evaluating the practicality of object detection models on low-compute edge SoCs, especially in embedded and battery-powered deployments. To characterize the energy efficiency of different YOLO variants across platforms, we measured the average power draw and computed the per-frame energy consumption for RV1106, RK3568, and RK3588 under continuous inference. Power was recorded using an external USB power meter under steady-state conditions. Energy per inference (mJ/frame) was obtained as: \(E = P \times t\), where P is the average power consumption (W) and t is the measured per-frame latency (s). This formulation directly reflects the effective energy cost of running a single model inference on each SoC.
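The conversion is conveniently unit-consistent: watts multiplied by milliseconds yields millijoules directly. A minimal helper, with illustrative values chosen within the ranges reported below:

```python
def energy_per_frame_mj(power_w: float, latency_ms: float) -> float:
    """E = P * t: power in watts times latency in milliseconds -> millijoules."""
    return power_w * latency_ms

# Illustrative values in the ranges reported in this section:
# ~4.2 W at ~92 ms per frame gives roughly 386 mJ/frame.
example = energy_per_frame_mj(4.2, 92.0)
```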

Fig. 7

Energy per inference across input resolutions on the three Rockchip SoCs. (a) RV1106, (b) RK3568, and (c) RK3588. The y-axis shows the energy per frame in millijoules (mJ/frame).

Figure 7 illustrates the energy-per-frame scaling trend across resolutions from 480 to 1280. All three SoCs exhibit monotonic growth in energy consumption with increasing input resolution; however, the rate of growth differs significantly across platforms. RV1106 shows the steepest scaling behavior, with energy increasing from tens of millijoules at low resolutions to several joules for Medium-sized models at 1280\(\times\)1280. RK3568 demonstrates a more moderate trend, benefiting from higher compute density and improved memory bandwidth. RK3588 achieves the best energy scaling: even at high resolutions, its energy-per-frame remains substantially lower due to its significantly reduced latency.

The differences across model families are also evident. Nano models maintain low energy consumption across all SoCs, ranging from roughly 20–80 mJ/frame on RK3588 and RK3568, and 100–150 mJ/frame on RV1106. In contrast, Medium-scale models impose a much higher energy cost: YOLO11m reaches 978 mJ/frame on RV1106, 768 mJ/frame on RK3568, and 385 mJ/frame on RK3588 at the 640\(\times\)640 resolution. These differences underscore the importance of selecting appropriate model sizes for constrained hardware.

Table 5 summarizes the detailed measurements at the commonly used input resolution of 640\(\times\)640. RV1106 generally operates around 2.1–2.3 W across all models, RK3568 ranges from 2.2 to 2.5 W, and RK3588 consumes between 3.4 and 4.2 W. Despite drawing more power, RK3588 is consistently the most energy-efficient device due to its significantly lower latency, achieving up to 3\(\times\) lower energy-per-frame than RV1106 for Medium-sized models. This illustrates a key conclusion: higher power consumption does not necessarily imply higher energy cost, and higher-performance SoCs may achieve superior efficiency per inference.

Overall, the power and energy measurements highlight clear design guidelines: (1) RV1106 is suitable only for Nano/Small models at moderate resolutions; (2) RK3568 supports Nano and Small models efficiently and can run Medium models when energy constraints are relaxed; and (3) RK3588 offers the best energy-per-inference performance, enabling deployment of Medium-scale YOLO variants while preserving favorable energy efficiency. These results provide a practical reference for selecting model–hardware combinations in resource-constrained embedded or battery-powered applications.

Table 5 Average power and energy per inference at 640\(\times\)640 input resolution for all evaluated YOLO models on the three Rockchip SoCs.

Discussion

Model-level insights: architecture vs. complexity

The results reveal that inference performance on low-compute SoCs is shaped more profoundly by architectural design choices than by conventional model complexity metrics such as parameter count or FLOPs. Across all platforms, models with higher mAP–typically incorporating more expressive modules such as C2f (YOLOv8) or C3k2 (YOLO11)–exhibited noticeably higher latency despite having similar or even smaller FLOP counts compared to earlier YOLO variants. These modules improve representational capacity but introduce additional feature-map transformations that do not map efficiently onto Rockchip NPUs. The strong correlation between latency and accuracy (mAP), and the weak correlation with FLOPs, indicate that FLOPs alone are insufficient predictors of real-world inference costs. This highlights the importance of benchmarking model variants directly on the target SoC rather than relying on theoretical indicators or GPU-based measurements.

Resolution scaling and memory-bandwidth interactions

Scaling the input resolution exposed fundamental limits imposed by memory subsystems. Although theoretical inference cost should grow quadratically with resolution, empirical scaling deviated substantially across SoCs due to bandwidth constraints. On RV1106 and RK3568, latency increased super-quadratically for Medium-scale models, driven by the sharp rise in intermediate feature-map sizes and associated DRAM traffic. These devices possess measured DRAM bandwidths of only 718 MiB/s and 2323 MiB/s, respectively, making them highly susceptible to bandwidth saturation at higher resolutions. In contrast, RK3588–with 8807 MiB/s bandwidth and more efficient parallelism–exhibited sub-quadratic scaling across all YOLO variants. The stress-ng experiments further support these findings: memory pressure inflates latency by 50–270% on RV1106 and RK3568, mirroring the effects observed with large input resolutions. These results show that resolution scaling is fundamentally bounded by available memory throughput rather than compute alone, making high-resolution inference feasible only on higher-end SoCs. As shown in Fig. 1, the Rockchip NPU relies on off-chip DRAM for large feature maps, which explains the bandwidth-sensitive behavior observed in our experiments.

Compute scalability and NPU scheduling efficiency

Comparative analysis across SoCs demonstrated a consistent gap between theoretical compute capacity (TOPS) and achievable speedups. While transitioning from RV1106 (0.5 TOPS) to RK3568 (1 TOPS) or RK3588 (2 TOPS in single-core mode) reduced latency, the gains were consistently sublinear. Smaller models, which are more memory-bound, benefited the least, while Medium-scale models achieved greater speedups owing to their more compute-bound behavior. Furthermore, multi-core NPU scheduling on RK3588 did not yield proportional latency reductions. For most models, three-core configurations offered marginal or even negative improvements compared to single-core mode. This is attributable to inter-core synchronization overhead and shared-memory contention, which dominate the execution pipeline when multiple cores attempt to access overlapping regions of activation memory. These findings imply that both compute and memory subsystems must be co-optimized for effective multi-core utilization; simply increasing available TOPS does not guarantee proportional latency reductions.
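The gap between TOPS ratios and achieved speedups can be expressed as a scaling-efficiency figure. A minimal sketch, with illustrative latencies standing in for the measured values:

```python
# Hypothetical single-model latencies (ms) on the three SoCs; the efficiency
# column compares achieved speedup against the ideal TOPS ratio (1.0 = linear).
platforms = {           # name: (TOPS, latency_ms) -- illustrative numbers only
    "RV1106": (0.5, 120.0),
    "RK3568": (1.0, 75.0),
    "RK3588": (2.0, 45.0),
}

base_tops, base_lat = platforms["RV1106"]
for name, (tops, lat) in platforms.items():
    ideal = tops / base_tops        # speedup if latency scaled with TOPS
    achieved = base_lat / lat       # measured speedup over the baseline SoC
    eff = achieved / ideal          # < 1.0 indicates sublinear scaling
    print(f"{name}: ideal {ideal:.1f}x, achieved {achieved:.2f}x, "
          f"efficiency {eff:.0%}")
```

With measured latencies substituted in, efficiencies well below 100% on the higher-TOPS parts make the memory-bound bottleneck immediately visible.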

System robustness and energy efficiency

System-level stress experiments revealed a substantial disparity in robustness across SoCs. CPU utilization had a relatively minor impact on inference latency, as the compute-intensive portion of YOLO execution is handled by the NPU. However, memory contention produced substantial slowdowns on RV1106 and RK3568, reaching 50–270% degradation depending on model scale. RK3588, with its significantly higher bandwidth, kept latency within 5% of baseline even under heavy memory pressure, underscoring bandwidth as the dominant constraint in low-compute environments.
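The memory-contention condition can be reproduced with stress-ng's virtual-memory stressors. The sketch below shows the general shape of such a run; the worker count, memory fraction, duration, and the benchmark step itself are placeholders to adapt to the actual RKNN inference command:

```shell
# Sketch of a memory-pressure run: background VM workers repeatedly dirty a
# fraction of available memory while the NPU benchmark executes.
# (Short timeout for illustration; real experiments use longer windows.)
if command -v stress-ng >/dev/null 2>&1; then
    stress-ng --vm 2 --vm-bytes 75% --timeout 5s &
    STRESS_PID=$!
    # Placeholder: launch the NPU inference benchmark here and collect
    # per-frame latencies while memory pressure is active.
    wait "$STRESS_PID"
    RESULT=ran
else
    RESULT=skipped   # stress-ng not installed on this machine
fi
echo "memory-pressure run: $RESULT"
```

Comparing per-frame latencies from this run against an idle baseline yields the degradation percentages reported above.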

Energy-per-inference analysis further highlights the practical differences among SoCs. Although power consumption remained relatively stable within each platform, the resulting energy-per-frame varied drastically due to latency differences. For Nano models at 640\(\times\)640, energy ranged from 20 to 80 mJ/frame on RK3588 and RK3568, compared to 100–150 mJ/frame on RV1106. Medium-scale models consumed 300–1000 mJ/frame, with RK3588 achieving 2–3\(\times\) greater energy efficiency than RV1106. These findings emphasize that higher absolute power draw does not necessarily imply higher energy cost, and that high-performance SoCs may be more suitable for battery-powered systems when per-inference energy is the primary constraint.
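Since 1 W sustained for 1 ms delivers exactly 1 mJ, energy per frame is simply average power times per-frame latency. The sketch below uses illustrative figures (not our measurements) to show why a higher-power SoC can still cost less energy per inference:

```python
# Energy per frame from average power draw and per-frame latency.
def energy_mj_per_frame(power_w: float, latency_ms: float) -> float:
    """mJ/frame = W * ms, since 1 W * 1 ms = 1 mJ."""
    return power_w * latency_ms

# Illustrative comparison: a low-power but slow SoC versus a faster,
# higher-power one. The faster part finishes sooner and wins on energy.
slow_soc = energy_mj_per_frame(power_w=1.2, latency_ms=110.0)
fast_soc = energy_mj_per_frame(power_w=4.0, latency_ms=18.0)
print(f"slow SoC: {slow_soc:.0f} mJ/frame, fast SoC: {fast_soc:.0f} mJ/frame")
```

This is the arithmetic behind the observation that absolute power draw alone is a poor guide for battery-powered deployments.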

Deployment implications and practical guidelines

The combined results have direct implications for real-world embedded applications. In surveillance scenarios operating at 25–30 FPS with 720p video, only Nano and Small models on RK3588 meet real-time constraints, while RK3568 supports limited configurations and RV1106 falls short across all tested models. UAVs and mobile robots require end-to-end latencies of 20–40 ms, which restrict feasible deployments to Nano variants on RK3588. IoT devices with relaxed FPS requirements may adopt Nano or Small variants on RK3568, while RV1106 is limited to low-resolution, low-FPS settings.
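The deployment decisions above reduce to a frame-budget check: inference latency plus pipeline overhead must fit within \(1000/\text{FPS}\) ms. A minimal sketch with hypothetical latency and overhead values:

```python
# Feasibility check against a real-time frame budget (illustrative values).
def fits_budget(inference_ms: float, overhead_ms: float, fps: float) -> bool:
    """True if per-frame latency fits within the 1000/fps ms budget."""
    return inference_ms + overhead_ms <= 1000.0 / fps

# Hypothetical Nano-model latency with capture/post-processing overhead
# against a 30 FPS (33.3 ms) surveillance budget:
print(fits_budget(inference_ms=22.0, overhead_ms=8.0, fps=30))  # fits: 30 ms
print(fits_budget(inference_ms=45.0, overhead_ms=8.0, fps=30))  # misses: 53 ms
```

The same check with a 20–40 ms end-to-end budget reproduces the UAV and mobile-robot constraints discussed above.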

The results also suggest several software optimization strategies to improve deployment efficiency on constrained SoCs. Operator fusion and layer reordering can reduce intermediate memory traffic; activation compression or lower-precision intermediate tensors can alleviate bandwidth stress; and memory-aware scheduling may improve multi-core utilization on future NPU architectures. Together, these observations motivate hardware–software co-design for optimizing inference on bandwidth-limited, low-compute edge processors.

Overall, this study provides a practical framework for selecting and deploying YOLO models on resource-constrained SoCs. The findings highlight the interplay between model architecture, resolution, compute capacity, memory bandwidth, and system-level conditions, and underline the importance of hardware-aware deployment strategies in edge AI applications.

Conclusion

This work presents the first comprehensive and reproducible benchmarking of contemporary YOLO object detectors on low-compute Rockchip SoCs, evaluating nine model variants across multiple resolutions, bandwidth conditions, and power regimes. The results demonstrate that inference performance on embedded NPUs is shaped jointly by model architecture, memory-bandwidth constraints, and system-level behavior rather than by theoretical compute (TOPS) or FLOPs alone.

Three overarching conclusions emerge from our analysis. First, inference latency correlates more strongly with mAP and architectural complexity than with parameter count or FLOPs, indicating that modern modules such as C2f and C3k2 impose additional overhead on constrained NPUs. Second, resolution scaling and stress-induced slowdowns reveal memory bandwidth as the dominant limiting factor on RV1106 and RK3568, while RK3588’s higher throughput enables sub-quadratic scaling and robust performance under contention. Third, both compute scalability and multi-core NPU scheduling exhibit sublinear behavior, highlighting that synchronization and shared-memory interactions hinder parallel efficiency. Energy-per-inference measurements further reinforce these trends: although power draw varies modestly across SoCs, the resulting energy cost per frame differs by up to 3\(\times\) due to latency disparities.

These findings underscore the importance of hardware-aware deployment strategies. Lightweight models and moderate resolutions are best suited for low-compute SoCs, whereas high-end platforms can support larger architectures with improved latency and energy efficiency. Memory-centric optimizations–such as operator fusion, activation compression, and bandwidth-aware scheduling–represent promising directions for further improving performance on embedded NPUs.

By providing a unified evaluation methodology, detailed measurements, and cross-platform comparisons, this work offers a practical reference framework for deploying object detection models in real-world edge scenarios. The insights presented here directly inform applications in smart surveillance, robotics, and IoT, where achieving the right balance of accuracy, latency, and energy efficiency is essential for reliable and scalable system design.