Abstract
Face anti-spoofing (FAS) has become a crucial component in securing face recognition systems against presentation attacks, such as printed photos, replay videos, and 3D masks. While recent advances have improved generalization to unseen spoofing attempts, many existing methods remain black-box models that provide binary decisions without interpretable reasoning. In this paper, we investigate explainable face anti-spoofing from a supervision-centric perspective, using a vision-language model (VLM) to analyze how natural language explanations influence model behavior. To enable this study under controlled conditions, we construct an explanation-augmented benchmark by enriching four standard FAS datasets—MSU-MFSD, CASIA-FASD, Replay-Attack, and OULU-NPU—with both vanilla and reasoning-structured captions generated via the GPT-4o API. We further adopt a dual-objective training strategy that combines spoof classification loss with explanation generation loss, allowing us to examine the effect of explanation-based supervision while keeping the backbone architecture fixed. Through extensive cross-dataset evaluations, we show that reasoning-style captions can enhance detection performance and domain generalization in many settings, while also introducing inductive biases that may degrade performance when emphasized cues are misaligned with unseen attack types. These findings suggest that explanations in FAS should be viewed not only as interpretable outputs, but also as controllable training signals that shape generalization behavior. To support reproducibility, we publicly release the explanation annotations and associated metadata—excluding all face images—via a Hugging Face repository at https://huggingface.co/datasets/DescriptiveFAS/MCIO_public.
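The dual-objective training strategy described above can be illustrated with a minimal sketch. The abstract does not give the exact form of either loss or how they are weighted, so everything here is an assumption made for illustration: the binary cross-entropy term, the mean token negative log-likelihood term, the weighted sum, and the names `classification_loss`, `explanation_loss`, `dual_objective_loss`, and `explanation_weight` are all hypothetical.

```python
import math
from typing import Sequence


def classification_loss(logit: float, is_spoof: bool) -> float:
    """Binary cross-entropy on one spoof/live logit (assumed form)."""
    y = 1.0 if is_spoof else 0.0
    # Numerically stable softplus(logit) = log(1 + exp(logit)).
    softplus = max(logit, 0.0) + math.log1p(math.exp(-abs(logit)))
    return softplus - y * logit


def explanation_loss(token_log_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood of the reference caption tokens (assumed form)."""
    if not token_log_probs:
        return 0.0
    return -sum(token_log_probs) / len(token_log_probs)


def dual_objective_loss(
    logit: float,
    is_spoof: bool,
    token_log_probs: Sequence[float],
    explanation_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Weighted sum of the classification and explanation terms."""
    return classification_loss(logit, is_spoof) + explanation_weight * explanation_loss(
        token_log_probs
    )


if __name__ == "__main__":
    # Toy sample: an uncertain classifier and a two-token caption.
    caption_log_probs = [math.log(0.5), math.log(0.25)]
    print(dual_objective_loss(0.0, True, caption_log_probs, explanation_weight=0.5))
```

Setting `explanation_weight` to zero in this sketch reduces it to plain spoof classification. That is one way to picture comparing supervision signals while the backbone stays fixed.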
Data availability
The four benchmark datasets used in this study (MSU-MFSD, CASIA-FASD, Replay-Attack, and OULU-NPU) are publicly available for research purposes. The official dataset download links are:
- MSU-MFSD: https://drive.google.com/drive/folders/1nJCPdJ7R67xOikIF1omkfz4yHeJwhQsz
- CASIA-FASD: http://www.cbsr.ia.ac.cn/english/FaceAntiSpoofDatabases.asp
- Replay-Attack: https://www.idiap.ch/en/scientific-research/data/replayattack
- OULU-NPU: https://sites.google.com/site/oulunpudatabase
- SiW-Mv2: https://cvlab.cse.msu.edu/siw-mv2-dataset.html
To support reproducibility, we publicly release only the additional metadata (captions) generated for training our model, excluding all images, at the following Hugging Face repository: https://huggingface.co/datasets/DescriptiveFAS/MCIO_public.
Funding
This research was supported by the research fund of Hanbat National University in 2024. It was also supported by the Regional Innovation System & Education (RISE) program through the Daejeon RISE Center, funded by the Ministry of Education (MOE) and the Daejeon Metropolitan City, Republic of Korea (2025-RISE-06-002).
Author information
Contributions
Conceptualization and methodology, J.M., K.L. and H.J.; software, data curation and visualization, J.M. and M.K.; validation, formal analysis and investigation, J.M., S.O. and D.K.; writing—original draft preparation, J.M.; writing—review and editing, E.K. and H.J.; supervision, project administration and funding acquisition, E.K. and H.J. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
This study was conducted using publicly available datasets and did not involve any new human participants or the collection of private human data; therefore, ethics committee or institutional review board approval was not required.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Min, J., Lim, K., Kim, M. et al. Analyzing the effect of reasoning-based supervision on face anti-spoofing. Sci Rep (2026). https://doi.org/10.1038/s41598-026-43800-5
DOI: https://doi.org/10.1038/s41598-026-43800-5


