A multi-context fusion-aware graph modelling for group activity recognition using pose-conditioned spatial encoding and actor relations
Article | Open access | Published: 10 April 2026

  • M. R. Tejonidhi1,2,
  • K. R. Raghunandan1,
  • B. Uma2,
  • C. K. Madhu1,2 &
  • A. M. Vinod1,2

Scientific Reports (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Mathematics and computing

Abstract

Group activity recognition requires a holistic understanding of individual actions, their spatial relationships, and the surrounding environment. Traditional methods that focus solely on isolated movements often fail to capture the complex inter-player and scene-level dependencies inherent in sports and crowd scenarios. In this work, a model for group activity recognition is developed that combines multiple contextual features by integrating the poses of individual actors in the scene with pose-aligned spatial scene context for relational reasoning. Pose features of individual actors are extracted using mmPose, while the scene-level context is encoded through pose-conditioned spatial feature aggregation rather than explicit semantic segmentation. The extracted pose and scene context features are combined and used to construct Actor Relation Graphs (ARGs) using Zero-Normalized Cross-Correlation (ZNCC), which improves robustness to variations in appearance and illumination. Graph Convolutional Networks (GCNs) then model the relationships between individual actors in a scene and their group activities. In contrast to previous ARG-GCN approaches that rely mainly on appearance features, the proposed framework explicitly combines pose-level and scene-level contextual features into a single relational graph. The model is evaluated on two benchmark datasets, the Collective Activity dataset (CAD) and the Volleyball dataset (VD), achieving classification accuracies of 95.02% and 94.81%, respectively. On a TITAN-XP GPU, the average inference time for a 41-frame video clip is approximately 0.2 s. The results show that combining pose and scene context features enhances graph-based relational learning and improves recognition accuracy.
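As an illustration of the relational pipeline the abstract describes, the following sketch scores actor-feature pairs with ZNCC to form an ARG adjacency matrix and then applies one symmetrically normalized graph-convolution step. This is not the authors' implementation: the feature dimensions, the clamping of negative correlations to zero, and the random inputs are illustrative assumptions.

```python
import numpy as np

def zncc(a, b, eps=1e-8):
    """Zero-Normalized Cross-Correlation between two feature vectors.

    Mean subtraction and variance normalization make the score invariant
    to additive and multiplicative (illumination-like) changes; the
    result lies in [-1, 1].
    """
    a = (a - a.mean()) / (a.std() + eps)
    b = (b - b.mean()) / (b.std() + eps)
    return float(np.dot(a, b) / a.size)

def build_arg(features):
    """Build a symmetric Actor Relation Graph adjacency matrix.

    Each actor pair is scored with ZNCC; negative correlations are
    clamped to 0 (an assumption here), and self-loops are kept at 1.
    """
    n = features.shape[0]
    adj = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            w = max(zncc(features[i], features[j]), 0.0)
            adj[i, j] = adj[j, i] = w
    return adj

def gcn_layer(adj, h, w):
    """One GCN propagation step: D^{-1/2} A D^{-1/2} H W, then ReLU."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    a_norm = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h @ w, 0.0)

# Illustrative usage: 6 actors with 32-dim fused pose + scene features.
rng = np.random.default_rng(0)
actors = rng.normal(size=(6, 32))
group_features = gcn_layer(build_arg(actors), actors, rng.normal(size=(32, 16)))
print(group_features.shape)  # (6, 16)
```

In a full model, the per-actor outputs of such layers would be pooled and fed to a classifier for the group-activity label; the normalization in `zncc` is what gives the graph construction its robustness to appearance and illumination variation.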


Data availability

The Volleyball dataset is publicly available in the “mostafa-saad/deep-activity-rec” repository at https://github.com/mostafa-saad/deep-activity-rec?tab=readme-ov-file#dataset, and the Collective Activity Dataset is publicly available on the Computational Vision and Geometry Lab (CVGL) website at Stanford University, https://cvgl.stanford.edu/projects/collective/collectiveActivity.html. The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.


Author information

Authors and Affiliations

  1. Nitte (Deemed to be University), NMAM Institute of Technology (NMAMIT), Department of Computer Science and Engineering, Nitte, 574110, Karnataka, India

    M. R. Tejonidhi, K. R. Raghunandan, C. K. Madhu & A. M. Vinod

  2. Department of Computer Science and Engineering, Malnad College of Engineering, Hassan, 573201, Karnataka, India

    M. R. Tejonidhi, B. Uma, C. K. Madhu & A. M. Vinod


Contributions

Tejonidhi M R contributed to the literature survey, problem identification, design, implementation, and preparation of the article. Raghunandan K R guided the work throughout this process, and Uma B co-guided it. Madhu C K and Vinod A M contributed to drafting the article and to reviewing and fine-tuning the content of the research.

Corresponding authors

Correspondence to M. R. Tejonidhi or K. R. Raghunandan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Tejonidhi, M.R., Raghunandan, K.R., Uma, B. et al. A multi-context fusion-aware graph modelling for group activity recognition using pose-conditioned spatial encoding and actor relations. Sci Rep (2026). https://doi.org/10.1038/s41598-026-46296-1

  • Received: 19 February 2026

  • Accepted: 25 March 2026

  • Published: 10 April 2026

  • DOI: https://doi.org/10.1038/s41598-026-46296-1

Scientific Reports (Sci Rep)

ISSN 2045-2322 (online)
