Athlete action quality assessment based on transfer neural network quality score decoupling in complex sports scenarios
  • Article
  • Open access
  • Published: 02 April 2026

  • Lei Gao1,
  • Yuhong Ma2,
  • Sijuan Bi3 &
  • Shuangjun Li1 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Engineering
  • Mathematics and computing

Abstract

Action quality assessment (AQA) is an important and challenging computer-vision task that has attracted wide attention across many fields, particularly sports video analysis. To address the uneven distribution of quality scores in long sports videos, this paper proposes an AQA model based on transfer neural network quality score decoupling. The model consists of three main components: a dual-stream structure combining a dynamic and a static stream, a quality score decoupling module, and a pairwise ranking prediction module. Inspired by action alignment in video understanding, the quality score decoupling module is built on a Transformer decoder: it decouples the input visual features into high- and low-quality score features, while temporally average-pooled features serve as the average quality score representation. The overall skill level of a long video is assessed by attending, in pairwise order, to the skill-relevant parts of the video, and the assessment is completed by aligning the resulting scores. In addition, the algorithm adopts a twin (Siamese) neural network structure to compare paired input samples, and the dual-stream structure extracts video motion information and frame-specific information separately, so the model attends to dynamic temporal information and moment-specific action information simultaneously, giving the feature extraction network a richer representation. This explicit separation of motion-centric and posture-centric representations avoids early entanglement of heterogeneous quality cues; the two streams are fused only after quality-aware feature disentanglement, following a late-fusion strategy that is well suited to long, untrimmed videos.
Finally, comparative experiments and visual validation against existing methods on several public datasets demonstrate the effectiveness and superiority of the proposed algorithm.
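The decoupling idea described in the abstract can be illustrated with a minimal sketch: learned "high-quality" and "low-quality" query vectors cross-attend over the temporal feature sequence, while temporal average pooling supplies the mean-quality representation. This is not the authors' implementation; the function names (`cross_attend`, `decouple`) and the single-head, unprojected attention are simplifying assumptions made here for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(query, feats):
    """One cross-attention step: a learned quality query attends over the
    temporal feature sequence `feats` (T vectors of dimension d) and
    returns the attention-weighted pooled feature."""
    d = len(query)
    weights = softmax([dot(query, f) / math.sqrt(d) for f in feats])
    return [sum(w * f[i] for w, f in zip(weights, feats)) for i in range(d)]

def decouple(feats, q_high, q_low):
    """Decouple clip features into high-/low-quality representations plus a
    temporally average-pooled mean-quality representation."""
    f_high = cross_attend(q_high, feats)
    f_low = cross_attend(q_low, feats)
    t, d = len(feats), len(feats[0])
    f_mean = [sum(f[i] for f in feats) / t for i in range(d)]
    return f_high, f_low, f_mean
```

In the paper's full model, the queries would be trained end to end inside a Transformer decoder; here they are simply fixed vectors to show the data flow.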
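The pairwise ranking with a twin network can likewise be sketched: the same scoring head (shared weights) scores both videos of a pair, and a hinge-style margin loss pushes the higher-skill video's score above the other's. The linear `score` head and the `margin` default are illustrative assumptions, not the paper's actual loss.

```python
def score(features, weights):
    """Shared scoring head: the same weights score both videos of a pair,
    which is what the twin (Siamese) structure amounts to."""
    return sum(w * f for w, f in zip(weights, features))

def pairwise_margin_loss(feat_better, feat_worse, weights, margin=1.0):
    """Hinge-style ranking loss: zero once the higher-skill video's score
    exceeds the lower-skill video's score by at least `margin`."""
    gap = score(feat_better, weights) - score(feat_worse, weights)
    return max(0.0, margin - gap)
```

Training on such pairs teaches a relative ordering of skill, which is then aligned to absolute quality scores.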
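Finally, the dual-stream late-fusion strategy can be shown schematically: a static stream summarizes per-frame appearance, a dynamic stream summarizes motion (here crudely approximated by consecutive-frame differences rather than a learned motion encoder), and the two outputs are concatenated only after each stream has finished, i.e. late fusion. All function names here are hypothetical.

```python
def static_stream(frames):
    """Static stream: pool per-frame appearance features (simple average)."""
    t, d = len(frames), len(frames[0])
    return [sum(f[i] for f in frames) / t for i in range(d)]

def dynamic_stream(frames):
    """Dynamic stream: consecutive-frame differences as a crude motion proxy."""
    diffs = [[b[i] - a[i] for i in range(len(a))]
             for a, b in zip(frames, frames[1:])]
    t, d = len(diffs), len(diffs[0])
    return [sum(f[i] for f in diffs) / t for i in range(d)]

def late_fuse(frames):
    """Late fusion: concatenate the stream outputs only after each stream
    has produced its own representation, keeping the cues disentangled."""
    return static_stream(frames) + dynamic_stream(frames)
```

On a perfectly still clip the dynamic half of the fused vector is all zeros, which makes the separation of motion and posture cues explicit.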

Data availability

All data used and generated during the current study are available from the corresponding author upon reasonable request.


Author information

Authors and Affiliations

  1. College of Physical Education, Shandong Sport University, Jinan, 276826, China

    Lei Gao & Shuangjun Li

  2. Tai’an Institute of Technology Secondary Vocational School, Xintai, 271221, China

    Yuhong Ma

  3. Xinwen Middle School in Xin Tai, Xintai, 271219, China

    Sijuan Bi


Contributions

Lei Gao: conceptualization, data curation, formal analysis, investigation, methodology, supervision, validation, writing-review and editing, and writing-original draft. Yuhong Ma: formal analysis, investigation, data curation, writing-review and editing. Sijuan Bi: formal analysis, investigation, data curation, writing-review and editing. Shuangjun Li: conceptualization, funding acquisition, investigation, validation, visualization, resources, software, writing-review and editing.

Corresponding author

Correspondence to Shuangjun Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article


Cite this article

Gao, L., Ma, Y., Bi, S. et al. Athlete action quality assessment based on transfer neural network quality score decoupling in complex sports scenarios. Sci Rep (2026). https://doi.org/10.1038/s41598-026-43987-7


  • Received: 02 October 2025

  • Accepted: 09 March 2026

  • Published: 02 April 2026

  • DOI: https://doi.org/10.1038/s41598-026-43987-7


Keywords

  • Action quality assessment
  • Transformer network
  • Feature decoupling
  • Twin neural network
  • Dual-stream structure