Scientific Reports
Attention-guided spatio-temporal feature fusion for robust video surveillance anomaly detection
  • Article
  • Open access
  • Published: 10 February 2026

  • S. Deepa Nivethika1,
  • Shreyash Joshi1,
  • Kshitij Verma1,
  • V. Aishwarya1,
  • Vimal Varshan Srinivasan2,
  • M. Senthil Pandian3 &
  • …
  • Prabhakaran Paulraj4 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Engineering
  • Mathematics and computing

Abstract

Dynamic object detection and tracking are essential components of intelligent video surveillance systems, enabling real-time monitoring and early identification of anomalous activities. Existing approaches often rely on either spatial appearance modeling or temporal sequence analysis, which limits robustness in crowded and dynamically evolving scenes. This study first evaluates representative spatial and temporal baseline models for theft detection, including an EfficientNetV2B0–HOG framework and a ConvLSTM-based temporal model, which achieve F1-scores of 0.86 and high recall but suffer from limited temporal consistency and sensitivity to data imbalance. To address these limitations, we propose an attention-guided spatio-temporal hybrid framework, referred to as HybridModel-1, which integrates object-level spatial detection with temporal motion modeling. The proposed model incorporates an Adaptive Feature Fusion Module (AFFM) to dynamically emphasize salient spatial features and a Temporal Confidence Reweighting Loss to suppress temporally inconsistent predictions. Evaluated on large-scale surveillance benchmarks including UCF-Crime, ShanghaiTech, and DCSASS, the proposed framework achieves an accuracy of 87.6%, a precision of 95.6%, a recall of 77.1%, and a ROC–AUC of 0.96, outperforming standalone spatial and temporal baselines. Ablation studies further confirm the effectiveness of the proposed fusion and temporal consistency mechanisms, demonstrating the model’s suitability for real-time surveillance applications.
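
The abstract names two mechanisms without giving their equations: a fusion module that emphasizes salient features, and a loss that penalizes temporally inconsistent predictions. The following is a minimal NumPy sketch of one way such ideas could be written down, not the authors' implementation. The function names (`gated_fusion`, `temporally_reweighted_bce`), the sigmoid gate, the weighting rule `1 + alpha * |p_t - p_{t-1}|`, and the parameter `alpha` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(spatial, temporal, w_gate, b_gate):
    """Fuse per-frame spatial and temporal features with a gate.

    spatial, temporal: arrays of shape (T, D)
    w_gate: (2 * D, D) gate weights; b_gate: (D,) gate bias
    Hypothetical stand-in for an attention-based fusion module.
    """
    joint = np.concatenate([spatial, temporal], axis=1)  # (T, 2D)
    gate = sigmoid(joint @ w_gate + b_gate)              # (T, D), values in (0, 1)
    return gate * spatial + (1.0 - gate) * temporal      # convex combination


def temporally_reweighted_bce(probs, labels, alpha=1.0, eps=1e-7):
    """Per-frame binary cross-entropy with a temporal-consistency weight.

    A frame's loss is scaled by 1 + alpha * |p_t - p_{t-1}| whenever its
    label equals the previous frame's label, so score flicker inside a
    stable segment costs more. Assumed weighting rule, for illustration.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)
    bce = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    jump = np.abs(np.diff(probs, prepend=probs[0]))      # first frame has jump 0
    same_label = np.diff(labels, prepend=labels[0]) == 0
    weights = 1.0 + alpha * jump * same_label
    return float(np.mean(weights * bce))


# Toy inputs: 6 frames, 4 feature channels, all frames labeled normal (0).
rng = np.random.default_rng(0)
T, D = 6, 4
spatial = rng.normal(size=(T, D))
temporal = rng.normal(size=(T, D))
fused = gated_fusion(spatial, temporal, np.zeros((2 * D, D)), np.zeros(D))

labels = np.zeros(T)
smooth = np.full(T, 0.1)
jittery = np.array([0.1, 0.9, 0.1, 0.9, 0.1, 0.9])
```

With an all-zero gate the fusion reduces to a plain average of the two feature streams; trained gate weights would shift that balance per frame and per channel. For the `smooth` sequence the weighted loss equals ordinary cross-entropy, while for the `jittery` sequence any `alpha > 0` raises it.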

Data availability

The DCSASS (Dynamic Crime and Security Anomaly Surveillance System) dataset used in this study is publicly available on Kaggle at: https://www.kaggle.com/datasets/mateohervas/dcsass-dataset. The UCF-Crime dataset is also publicly available for academic research at: https://www.crcv.ucf.edu/projects/real-world/. Both datasets are open-access and do not involve direct interaction with human subjects. All the data used in this research were obtained from publicly available repositories and are fully anonymized.

Funding

Open access funding provided by Vellore Institute of Technology.

Author information

Authors and Affiliations

  1. School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India

    S. Deepa Nivethika, Shreyash Joshi, Kshitij Verma & V. Aishwarya

  2. School of Mechanical Engineering, Vellore Institute of Technology, Chennai, India

    Vimal Varshan Srinivasan

  3. School of Civil Engineering, Vellore Institute of Technology, Chennai, India

    M. Senthil Pandian

  4. Department of ECE, St. Joseph University in Tanzania, Dar es Salaam, Tanzania

    Prabhakaran Paulraj

Contributions

S.D.N.: conceptualization, methodology design, supervision, manuscript writing, and corresponding-author responsibilities. S.J.: model development and dataset preparation. K.V.: framework implementation and result analysis. A.V.: model development and implementation. V.V.S.: experimental validation. M.S.P.: structural design of the research framework, technical validation, and proofreading. P.P.: implementation of the ConvLSTM model, data preprocessing, and performance evaluation.

Corresponding author

Correspondence to S. Deepa Nivethika.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval and informed consent

All methods were carried out in accordance with relevant guidelines and regulations. The datasets (DCSASS and UCF-Crime) used in this study consist entirely of publicly available, anonymized surveillance video footage that does not contain identifiable human subjects. Therefore, this research did not require ethical approval or informed consent, as no human participants or personal data were involved.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Nivethika, S.D., Joshi, S., Verma, K. et al. Attention-guided spatio-temporal feature fusion for robust video surveillance anomaly detection. Sci Rep (2026). https://doi.org/10.1038/s41598-026-36130-z

  • Received: 15 October 2025

  • Accepted: 09 January 2026

  • Published: 10 February 2026

  • DOI: https://doi.org/10.1038/s41598-026-36130-z

Keywords

  • Video surveillance
  • Anomaly detection
  • Spatio-temporal feature fusion
  • Attention mechanism
  • ConvLSTM
  • Temporal consistency modeling
  • Theft detection
  • UCF-Crime dataset
  • ShanghaiTech dataset
  • YOLO-v4