Scientific Reports
Object tracking algorithm based on deformable attention mechanism
  • Article
  • Open access
  • Published: 06 March 2026

  • Qiaoling Liu (1, 2)
  • Na Yu (1)
  • Jinfu Cheng (3)

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Engineering
  • Mathematics and computing

Abstract

Occlusion, sudden illumination changes, and rapid motion in complex scenes severely degrade the robustness of existing object tracking methods. To address this issue, this paper proposes an object tracking algorithm that integrates a deformable attention mechanism. First, a deformable attention module is embedded into the ResNet-18 feature extraction network to adaptively enhance key target features. Second, an improved Bidirectional Feature Pyramid Network serves as the feature fusion module, strengthening the representation of multi-scale features. Finally, a dynamic Kalman filtering prediction module improves the algorithm’s adaptability to changes in the target’s motion state and its ability to track continuously. Experimental results show that the improved feature extraction network achieves an average overlap rate of 61.5% and a success rate of 68.4% on the GOT-10k dataset, with a computational load of only 1.96 GFLOPs and an increase of only 0.23 M parameters. On the MOT20 dataset, the proposed tracking network achieves a Multiple Object Tracking Accuracy of 77.5% and an Identity F1 Score of 77.0%, with 54.6% of trajectories mostly tracked and 12.5% mostly lost, surpassing the compared object tracking algorithms. These results confirm the efficacy of the deformable attention mechanism and offer a robust solution for complex dynamic tracking scenarios.
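To help interpret the MOT20 figures reported above, the following is a minimal sketch of how Multiple Object Tracking Accuracy, Identity F1 Score, and the mostly-tracked and mostly-lost trajectory ratios are commonly computed. It assumes the standard multi-object tracking benchmark definitions (including the usual 80% and 20% coverage thresholds); the function names and the input counts are hypothetical illustrations, not the authors' evaluation code or data.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


def idf1(idtp, idfp, idfn):
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


def mt_ml_ratios(coverage, mt_thresh=0.8, ml_thresh=0.2):
    """Return (mostly tracked ratio, mostly lost ratio).

    coverage[i] is the fraction of ground-truth trajectory i's lifespan
    that is covered by a track hypothesis. The thresholds are the
    conventional 80% / 20% values (an assumption, not taken from the paper).
    """
    if not coverage:
        raise ValueError("coverage must not be empty")
    n = len(coverage)
    mostly_tracked = sum(1 for c in coverage if c >= mt_thresh) / n
    mostly_lost = sum(1 for c in coverage if c <= ml_thresh) / n
    return mostly_tracked, mostly_lost


if __name__ == "__main__":
    # Toy counts chosen for easy arithmetic; they are not results from the paper.
    print(mota(false_negatives=100, false_positives=60, id_switches=40, num_gt=1000))
    print(idf1(idtp=300, idfp=100, idfn=100))
    print(mt_ml_ratios([0.9, 0.85, 0.5, 0.1]))
```

Note that MOTA can be negative when the total number of errors exceeds the number of ground-truth objects, whereas IDF1 always lies between 0 and 1.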

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Funding

This research was supported by the Sichuan Province Science and Technology Plan Project "Research on Nonlinear System Model Identification and Optimal Control Method under Weak Continuous Incentive Conditions" (2025ZNSFSC1513).

Author information

Authors and Affiliations

  1. School of Electronic Information and Electrical Engineering, Chengdu University, Chengdu, 610106, China

    Qiaoling Liu & Na Yu

  2. Entrepreneurship College, Chengdu University, Chengdu, 610106, China

    Qiaoling Liu

  3. Department Engineering, Datong Vocational and Technical College of Coal, Datong, 037000, China

    Jinfu Cheng

Contributions

Q.L.L. performed the numerical attribute linear programming of communication big data and extracted the mutual information feature quantity of the numerical attributes using the cloud extended distributed feature fitting method. N.Y. carried out the statistical analysis of the numerical attribute feature information by combining fuzzy C-means clustering with linear regression analysis, and constructed the associated attribute sample set of the cloud grid distribution of communication big data numerical attributes. J.F.C. conducted the experiments, recorded the data, and prepared the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Qiaoling Liu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, Q., Yu, N. & Cheng, J. Object tracking algorithm based on deformable attention mechanism. Sci Rep (2026). https://doi.org/10.1038/s41598-026-43147-x

  • Received: 30 September 2025

  • Accepted: 02 March 2026

  • Published: 06 March 2026

  • DOI: https://doi.org/10.1038/s41598-026-43147-x

Keywords

  • Object tracking algorithm
  • Deformable attention mechanism
  • ResNet-18
  • Bidirectional feature pyramid network
  • Kalman filter
Scientific Reports (Sci Rep)

ISSN 2045-2322 (online)

Springer Nature

© 2026 Springer Nature Limited
