Deep learning-based visual algorithms for identity and action recognition in engineering practical courses

Ma, Jun; Wang, RuoYu; Lan, WenQi

doi:10.1038/s41598-026-45964-6

Article
Open access
Published: 31 March 2026

Scientific Reports, Article number: (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Engineering practice is an important component of engineering education. In this teaching scenario, students move around frequently, which makes identifying them and their actions with computer vision methods an ongoing research challenge, particularly for AI-based identity recognition algorithms. Some facial recognition and person re-identification algorithms have attempted to solve the identity recognition problem, but they struggle with varying viewing angles and suffer from low recognition accuracy. Some action recognition algorithms, such as those based on optical flow estimation, are still limited by a lack of practical teaching scenarios, a lack of action training sets, complex networks, and complex operations. This paper introduces an identity recognition algorithm that combines a facial recognition algorithm with a person re-identification algorithm and improves the accuracy and effectiveness of recognition by introducing dynamic feature caching. Action recognition is then achieved with a target classification algorithm based on torso and limb recognition. Finally, we validate the effectiveness and accuracy of the algorithm in practical engineering courses and conduct a comparative experimental analysis.
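
The abstract does not spell out how dynamic feature caching works, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that whenever facial recognition confirms an identity, the person's current body (re-identification) feature vector is cached, and that when no face is visible the body feature is matched against the cache by cosine similarity. The class name `DynamicFeatureCache`, the helper `identify`, the cache size, and the similarity threshold are all hypothetical.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class DynamicFeatureCache:
    """Hypothetical cache of recent body (re-ID) features per known identity."""

    def __init__(self, max_per_id=5, match_threshold=0.8):
        self.max_per_id = max_per_id            # assumed cache size per person
        self.match_threshold = match_threshold  # assumed similarity cut-off
        self._features = {}                     # identity -> deque of vectors

    def update(self, identity, body_feature):
        """Store a body feature for an identity confirmed by face recognition."""
        slot = self._features.setdefault(identity, deque(maxlen=self.max_per_id))
        slot.append(list(body_feature))  # oldest entry is dropped when full

    def lookup(self, body_feature):
        """Return the best-matching cached identity, or None below threshold."""
        best_id, best_score = None, self.match_threshold
        for identity, feats in self._features.items():
            score = max(cosine(body_feature, f) for f in feats)
            if score >= best_score:
                best_id, best_score = identity, score
        return best_id


def identify(cache, face_identity, body_feature):
    """Prefer the face result; fall back to the cache when no face is seen."""
    if face_identity is not None:
        cache.update(face_identity, body_feature)
        return face_identity
    return cache.lookup(body_feature)
```

In use, `identify` would be called once per detected person per frame, with `face_identity` set to `None` whenever the face is turned away or occluded. Bounding the per-person cache keeps the stored features recent, which is one way a cache could stay "dynamic" as students change posture or move between workstations.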

Data availability

The datasets generated and/or analysed during the current study are not publicly available because they contain identifiable information that cannot be fully anonymized without compromising their utility, but they are available from the corresponding author on reasonable request.

Code availability

The custom code used in this study is publicly available on GitHub at https://github.com/markchalse/DetectionTeachingScenarios.git and has been archived on Zenodo at https://doi.org/10.5281/zenodo.17186956.

Funding

The authors received no funding for this work.

Author information

Authors and Affiliations

Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, 999078, China
Jun Ma
School of Life Sciences, Shandong University, Jinan, 250000, China
RuoYu Wang
College of Transportation, Nanchang Jiaotong Institute, Nanchang, 330000, China
WenQi Lan

Contributions

J.M. and W.L. wrote the main manuscript text, W.L. prepared the figures, and R.W. prepared the tables. All authors reviewed the manuscript.

Corresponding author

Correspondence to WenQi Lan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

The images included in this manuscript feature only the authors of this paper. Informed consent was obtained from all authors for the publication of their images in an online open-access publication.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Ma, J., Wang, R. & Lan, W. Deep learning-based visual algorithms for identity and action recognition in engineering practical courses. Sci Rep (2026). https://doi.org/10.1038/s41598-026-45964-6

Received: 12 February 2025
Accepted: 23 March 2026
Published: 31 March 2026
DOI: https://doi.org/10.1038/s41598-026-45964-6