Facial expression recognition via variational inference
  • Article
  • Open access
  • Published: 05 February 2026


  • Gang Lv1,
  • JunLing Zhang1 &
  • Chiki Tsoi2 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Mathematics and computing

Abstract

Facial expressions in the wild are rarely discrete; they often manifest as compound emotions or subtle variations that challenge the discriminative capabilities of conventional models. While psychological research suggests that expressions are often combinations of basic emotional units, most existing facial expression recognition (FER) methods rely on deterministic point estimation, failing to model the intrinsic uncertainty and continuous nature of emotions. To address this, we propose POSTER-Var, a framework integrating a Variational Inference-based Classification Head (VICH). Unlike standard classifiers, VICH maps facial features into a probabilistic latent space via the reparameterization trick, enabling the model to learn the underlying distribution of expression intensities. Furthermore, we enhance feature representation by introducing layer embeddings and nonlinear transformations into the feature pyramid, facilitating the fusion of hierarchical semantic information. Extensive experiments on RAF-DB, AffectNet, and FER+ demonstrate that our method effectively handles fine-grained expression recognition, achieving state-of-the-art performance. The code is open-sourced at https://github.com/lg2578/poster-var.
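To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch of a variational classification head built on the reparameterization trick, together with a pyramid-fusion step that adds learnable per-level embeddings and a nonlinear projection. This is an illustrative reading of the abstract, not the authors' released code: the class names (VariationalHead, PyramidFusion), all dimensions, and the KL weight are assumptions; see the linked repository for the actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalHead(nn.Module):
    """Sketch of a variational classification head: features are mapped to a
    Gaussian posterior q(z|x), and a reparameterized sample is classified."""
    def __init__(self, feat_dim=512, latent_dim=128, num_classes=7):
        super().__init__()
        self.mu = nn.Linear(feat_dim, latent_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(feat_dim, latent_dim)  # log-variance of q(z|x)
        self.classifier = nn.Linear(latent_dim, num_classes)

    def forward(self, feats):
        mu, logvar = self.mu(feats), self.logvar(feats)
        # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
        # so the sampling step stays differentiable w.r.t. mu and logvar.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        logits = self.classifier(z)
        # KL(q(z|x) || N(0, I)), averaged over batch and latent dimensions,
        # used as a regularizer alongside the classification loss.
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return logits, kl

class PyramidFusion(nn.Module):
    """Sketch of feature-pyramid fusion with a learnable embedding per level
    and a nonlinear projection before merging the hierarchical features."""
    def __init__(self, num_levels=3, dim=512):
        super().__init__()
        self.level_emb = nn.Parameter(torch.zeros(num_levels, dim))
        self.proj = nn.Sequential(nn.Linear(dim, dim), nn.GELU())

    def forward(self, level_feats):  # list of (batch, dim) tensors, one per level
        fused = [self.proj(f + self.level_emb[i]) for i, f in enumerate(level_feats)]
        return torch.stack(fused).mean(dim=0)

# Usage with dummy backbone features (7 basic-emotion classes, as in RAF-DB):
levels = [torch.randn(8, 512) for _ in range(3)]
logits, kl = VariationalHead()(PyramidFusion()(levels))
labels = torch.randint(0, 7, (8,))
loss = F.cross_entropy(logits, labels) + 0.1 * kl  # KL weight 0.1 is a guess
```

At inference one would typically classify the mean mu directly, or average several samples, rather than rely on a single stochastic draw; the appropriate choice depends on the authors' training setup.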

Data availability

The RAF-DB dataset is available from the original authors upon request for non-commercial research purposes. Researchers affiliated with academic institutions may request access by contacting the authors as described at http://whdeng.cn/RAF/model1.html. The FER+ dataset is available at https://github.com/microsoft/FERPlus. The AffectNet dataset can be requested from the original authors at https://mohammadmahoor.com/pages/databases/affectnet/ by eligible researchers (e.g., Principal Investigators) subject to a signed license agreement.


Funding

This work was supported by the Zhejiang Office of Philosophy and Social Sciences Planning Project (24NDJC04Z), the 3rd Batch of Scientific Research Innovation Teams of Zhejiang Open University, and the Jinhua Science and Technology Bureau (2025-4-178). The funders had no role in the design of the study, collection and analysis of data, writing of the manuscript, or decision to submit the manuscript for publication.

Author information

Authors and Affiliations

  1. Learning and Information Center, JinHua Open University, No. 18 Qingzhao Road, JinHua, 321000, Zhejiang, China

    Gang Lv & JunLing Zhang

  2. Advanced Institute of Information Technology, Peking University, No. 233 Yonghui Road, Xiaoshan District, Hangzhou, 311200, Zhejiang, China

    Chiki Tsoi


Contributions

Gang Lv: Conceptualization, Methodology, Writing (original draft), Investigation, Software, Validation. JunLing Zhang: Conceptualization, Writing (review and editing). Chiki Tsoi: Validation; provided valuable guidance, particularly on improving the figures.

Corresponding author

Correspondence to Gang Lv.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Lv, G., Zhang, J. & Tsoi, C. Facial expression recognition via variational inference. Sci Rep (2026). https://doi.org/10.1038/s41598-026-38734-x


  • Received: 10 September 2025

  • Accepted: 30 January 2026

  • Published: 05 February 2026

  • DOI: https://doi.org/10.1038/s41598-026-38734-x


Keywords

  • Facial expression recognition
  • Variational inference
  • Probabilistic model
  • Feature representation

Associated content

Collection

Deep learning for image analysis
