Abstract
Deep-learning-based approaches have recently been widely applied to sEMG-based gesture recognition. Although some existing methods integrate time-domain and frequency-domain features, many still fail to fully exploit the complementary information from both domains, often because they cannot capture multi-scale temporal dependencies or because they rely on sub-optimal fusion strategies. To address these limitations and further improve accuracy and robustness, we propose the Multi-Scale Dual-Stream Fusion Network (MSDS-FusionNet), a novel deep learning framework that integrates temporal and frequency-domain features for more accurate gesture classification. MSDS-FusionNet introduces two key innovations. The Multi-Scale Mamba (MSM) modules extract multi-scale temporal features through parallel convolutions with varying kernel sizes combined with linear-time sequence modeling based on selective state spaces, capturing temporal patterns at multiple scales and modeling both short-term and long-term dependencies. The Bi-directional Attention Fusion Module (BAFM) uses bi-directional attention to dynamically fuse the complementary temporal and frequency-domain features and improve recognition accuracy. Extensive experiments on the NinaPro databases demonstrate that MSDS-FusionNet outperforms state-of-the-art methods, improving accuracy by up to 2.41%, 2.46%, and 1.38% on DB2, DB3, and DB4, respectively, with final accuracies of 90.15%, 72.32%, and 87.10%. This study presents a robust and flexible solution for sEMG-based gesture recognition that addresses the complexity of recognizing intricate gestures and offers significant potential for applications in prosthetics, virtual reality, and assistive technologies. The code for this study is available at https://github.com/hdy6438/MSDS-FusionNet.
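To make the two abstract-level ideas concrete, the following is a minimal, illustrative PyTorch sketch, not the authors' implementation (which is in the linked repository): parallel 1-D convolutions with different kernel sizes approximate the multi-scale temporal extraction in the MSM modules, and two cross-attention layers approximate the bi-directional fusion in the BAFM. The module names, tensor shapes, and the GRU used as a stand-in for the selective state-space (Mamba) layer are all assumptions for illustration.

```python
# Illustrative sketch only; names, shapes, and the GRU stand-in for Mamba are assumptions.
import torch
import torch.nn as nn


class MultiScaleTemporalBlock(nn.Module):
    """Parallel Conv1d branches with different kernel sizes, concatenated channel-wise,
    followed by a GRU as a lightweight stand-in for the selective state-space layer."""

    def __init__(self, in_ch: int, branch_ch: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, branch_ch, k, padding=k // 2) for k in kernel_sizes
        )
        hidden = branch_ch * len(kernel_sizes)
        self.seq_model = nn.GRU(hidden, hidden, batch_first=True)  # Mamba stand-in

    def forward(self, x):  # x: (batch, channels, time)
        multi_scale = torch.cat([b(x) for b in self.branches], dim=1)  # (B, hidden, T)
        out, _ = self.seq_model(multi_scale.transpose(1, 2))           # (B, T, hidden)
        return out


class BidirectionalAttentionFusion(nn.Module):
    """Cross-attention in both directions (temporal->frequency and frequency->temporal);
    the two attended streams are concatenated for downstream classification."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.t2f = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.f2t = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, temporal, frequency):  # both: (B, T, dim)
        t_attended, _ = self.t2f(temporal, frequency, frequency)
        f_attended, _ = self.f2t(frequency, temporal, temporal)
        return torch.cat([t_attended, f_attended], dim=-1)             # (B, T, 2*dim)


if __name__ == "__main__":
    B, C, T = 2, 12, 200  # hypothetical 12-channel sEMG window of 200 samples
    temporal_stream = MultiScaleTemporalBlock(C, 16)(torch.randn(B, C, T))
    frequency_stream = MultiScaleTemporalBlock(C, 16)(torch.randn(B, C, T))  # e.g. a spectrogram-derived stream
    fused = BidirectionalAttentionFusion(dim=48)(temporal_stream, frequency_stream)
    print(fused.shape)  # torch.Size([2, 200, 96])
```

In this sketch the fused representation would feed a standard classification head; the real MSM and BAFM designs differ in detail and should be taken from the repository above.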
Data availability
The NinaPro database used in this study is publicly available at https://ninapro.hevs.ch/. The code for the proposed MSDS-FusionNet is available at https://github.com/hdy6438/MSDS-FusionNet.
Acknowledgements
This work is funded by the Scientific Research Foundation of Chongqing University of Technology (2023ZDZ019), the Scientific and Technological Research Program of the Chongqing Education Commission (KJZD-K202303103, KJQN202501104), the Natural Science Foundation of Chongqing (CSTB2025NSCQ-GPX0794), the Chongqing University of Technology Research and Innovation Team Cultivation Program (2023TDZ012), the Chongqing Municipal Key Project for Technology Innovation and Application Development (CSTB2024TIAD-KPX0042), the National Natural Science Foundation of P.R. China (61173184), and the National Key R&D Plan “Intelligent Robots” Key Project of P.R. China (2018YFB1308602).
Author information
Contributions
Conceptualization, D.H. and B.J.; methodology, D.H.; software, D.H.; validation, D.H. and B.J.; formal analysis, D.H. and B.J.; resources, W.L.; writing—original draft preparation, D.H.; writing—review and editing, D.H., W.L., Z.Y. and B.J.; visualization, D.H.; supervision, W.L., H.Y. and B.J.; project administration, W.L.; funding acquisition, W.L., H.Y. and B.J. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
He, D., Liu, W., Yan, H. et al. A multi-scale dual-stream fusion network for high-accuracy sEMG-based gesture classification. Sci Rep (2026). https://doi.org/10.1038/s41598-025-34909-0