A boosting strategy based on feature mimicking with attention for visual anomaly detection

Zheng, Boyuan; Gan, Yi; Wang, Lianggang; Cong, Xunchao; Hu, Chao; Wang, Di

doi:10.1038/s41598-026-37667-9

Article
Open access
Published: 26 March 2026

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Anomaly detection (AD), the task of detecting anomalies in images or videos, is commonly treated as a one-class classification problem: the model is trained only on normal data and must identify abnormal data at inference time. A prominent category of existing work is the reconstruction-based method, in which a model is trained to reconstruct its inputs and the reconstruction error with respect to the target serves as the abnormality score. However, because they do not consider global information, these methods may fail owing to the generalization capability of the reconstruction model, which can allow anomalous inputs to be reconstructed well too. To tackle this problem, we propose a proxy task of feature mimicking that can be integrated into a wide range of anomaly detection frameworks and exploits their inherently discriminative hidden-layer features. We also present a novel attention module that takes as input the feature inconsistency matrix generated by the feature-mimicking task. This feature-inconsistency-guided attention module enables the reconstruction-based model to focus on the regions or patterns where the global, semantic feature inconsistency is higher. We integrate our method into several state-of-the-art (SOTA) methods for anomaly detection on images and videos. The empirical results show that our method improves these baselines and achieves new SOTA performance on MVTec AD, CUHK Avenue and ShanghaiTech.
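
The idea of letting feature inconsistency guide a reconstruction-based score can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the inconsistency matrix or the attention is computed, so the cosine-distance measure, the spatial softmax, the `temperature` parameter and all function names below are assumptions chosen for illustration. The sketch also assumes the reconstruction error map has already been resized to the spatial size of the feature maps.

```python
import numpy as np

def feature_inconsistency(target_feats, mimic_feats, eps=1e-8):
    """Per-location inconsistency between two (C, H, W) feature maps.

    Assumption: inconsistency is measured as 1 - cosine similarity
    along the channel axis, giving an (H, W) matrix in [0, 2].
    """
    dot = (target_feats * mimic_feats).sum(axis=0)
    norms = np.linalg.norm(target_feats, axis=0) * np.linalg.norm(mimic_feats, axis=0)
    return 1.0 - dot / (norms + eps)

def spatial_attention(inconsistency, temperature=1.0):
    """Turn an (H, W) inconsistency matrix into attention weights.

    Assumption: a softmax over all spatial positions, so the weights
    are positive and sum to 1.
    """
    logits = inconsistency / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def attended_score(recon_error, attention):
    """Attention-weighted sum of an (H, W) reconstruction error map."""
    return float((recon_error * attention).sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(4, 3, 3))   # hypothetical hidden-layer features
    mimic = target.copy()
    mimic[:, 1, 1] = -target[:, 1, 1]     # the mimic fails at one location

    inconsistency = feature_inconsistency(target, mimic)
    attention = spatial_attention(inconsistency)

    error = np.zeros((3, 3))
    error[1, 1] = 1.0                     # reconstruction error at the same location
    print("attended score:", attended_score(error, attention))
    print("uniform mean  :", float(error.mean()))
```

In this toy case the attended score exceeds the plain mean of the error map, because the weights concentrate on the location where the mimicked features disagree with the target features.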

Data availability

The data supporting this study’s findings are available from the corresponding author upon reasonable request.

Code availability

The main code is available at https://github.com/jtkullo/FMABS.

Author information

Authors and Affiliations

The 10th Research Institute, China Electronic Technology Group Corporation, Chengdu, 610036, China
Boyuan Zheng, Yi Gan, Lianggang Wang, Xunchao Cong, Chao Hu & Di Wang

Contributions

Boyuan Zheng and Yi Gan contributed to the novelty of the paper and conceived the architecture of the proposed method. Lianggang Wang implemented the code and carried out the ablation and comparison experiments. Xunchao Cong and Chao Hu offered suggestions and contributions on the modules of the proposed method. Di Wang provided data, computing resources, and suggestions for revising the manuscript.

Corresponding author

Correspondence to Boyuan Zheng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zheng, B., Gan, Y., Wang, L. et al. A boosting strategy based on feature mimicking with attention for visual anomaly detection. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37667-9

Received: 01 December 2024
Accepted: 23 January 2026
Published: 26 March 2026
DOI: https://doi.org/10.1038/s41598-026-37667-9