Abstract
The PhysioMio dataset presented in this paper provides longitudinal high-density surface electromyography (HD-sEMG) recordings from both the healthy and impaired forearms of 48 stroke patients with arm paresis, captured during the performance of 16 distinct hand gestures. Patients were recorded at regular intervals during their individual inpatient rehabilitation stay, resulting in an average of three recording sessions at different stages of post-stroke rehabilitation per patient. The HD-sEMG signals were collected using a dry 64-electrode array positioned around the forearm, enabling observation of muscle activation patterns. The healthy arm was recorded during the first week to serve as a reference, while subsequent recordings focused solely on the impaired arm. This data can offer insights into neuromuscular deficits related to stroke and allow for comparative analysis between healthy and impaired arms. This dataset serves as a valuable resource for studying motor impairment and recovery potential in stroke-induced arm paresis, supporting advancements in personalized rehabilitation and assistive technologies.
Background & Summary
With a prevalence of approximately 65% among stroke survivors, hemiparesis represents the most common motor impairment and constitutes a primary focus of treatment and rehabilitation in post-acute care1. In particular, the therapeutic management of arm paresis following stroke is considered a key element of stroke rehabilitation2. Impairments of the upper extremities, specifically affecting the movement and coordination of the arms, hands, and fingers, often result in significant difficulties in performing activities of daily living such as eating, dressing, and washing3. Consequently, the treatment of arm paresis has been identified by stroke survivors, caregivers and healthcare professionals as one of the top ten research priorities for life after stroke4.
Given the clinical relevance of upper arm impairments in stroke survivors, surface electromyography (sEMG) represents a promising tool to enhance rehabilitation strategies through objective, non-invasive, quantifiable insights into muscle activation patterns. To fully exploit this potential, comprehensive sEMG datasets from stroke patients are essential. Datasets capturing hand and finger movements are of critical importance, as fine motor skills are crucial for the performance of activities of daily living and are often severely affected after stroke. Such datasets enable the systematic analysis of neuromuscular function and facilitate the development of data-driven models for motor function assessment.
While several publicly available sEMG datasets focusing on hand and finger movements exist, the majority of these datasets have been collected from healthy individuals. An overview and comparison of the most relevant existing (HD-)sEMG datasets, including the few datasets that involve stroke survivors, is provided in Table 1.
To address the lack of large-scale, high-quality HD-sEMG datasets in the context of stroke rehabilitation, we conducted a clinical study involving 48 stroke survivors. To the best of our knowledge, the resulting PhysioMio dataset5 represents the largest collection of HD-sEMG recordings focused on hand and finger movements in stroke patients to date. HD-sEMG was employed to capture detailed neuromuscular activity across 64 electrodes, enabling a high spatial resolution of muscle activation patterns. In contrast to many existing datasets that are limited to isolated snapshots, our recordings encompass continuous four-second segments of muscle activity for each gesture, thereby facilitating time-resolved analyses of neuromuscular function. In addition, patients were recorded throughout their individual inpatient rehabilitation stay to monitor potential longitudinal changes in muscle activity during the course of recovery. Furthermore, bilateral recordings from both the paretic and non-paretic sides were acquired systematically, allowing direct intra-subject comparisons of neuromuscular function for each gesture.
This dataset offers considerable potential for reuse across a range of research applications, including, but not limited to, the development and validation of machine learning models for movement classification, the investigation of compensatory motor strategies following stroke and the advancement of personalized rehabilitation approaches.
Methods
Study participants
Forty-eight stroke patients with hemiparesis (20 female, 28 male) participated in the study. Of these, 44 were right-handed and 4 were left-handed. In 26 patients, the left arm was paretic and in 22 the right, while the dominant arm was affected in 22 patients and the non-dominant arm in 26. The median age of the participants was 69 years (±13.8), ranging from 25 to 90 years. The median body weight was 75.5 kg (±12.3), with a range of 58 to 101 kg. The median height was 170.5 cm (±8.2), ranging from 157 to 190 cm. The median time post-stroke at the time of testing was 35.5 days, ranging from 13 to 2,308 days. One patient was classified as having chronic hemiparesis (more than six months post-stroke) and represented an outlier in the time since stroke, with a duration of 2,308 days. Each subject participated in a median of six recording sessions (range: three to fourteen), conducted throughout their inpatient rehabilitation stay. A total of 92 assessments were conducted on the healthy arm and 237 on the impaired arm of the stroke patients. Table 2 gives an overview of assessments included in the dataset.
The study was approved by the ethics committee of the Technical University of Munich (2023-273-S-KH) in accordance with the Declaration of Helsinki. The study was registered at the German Clinical Trials Register (DRKS) with the ID DRKS00032380. Before engaging in the study, each participant was briefed by a medical doctor about the procedure and potential risks. Each participant received a patient information sheet and a declaration of consent prior to participation, in which they consented that the acquired health data could be used and published for medical research. Patients were excluded from the study if they had severe physical or neurological impairments that prevented measurement, were unable to understand the instructions, or were unable to give consent. The HD-sEMG data and metadata were stored locally on an encrypted hard drive in pseudonymized form. Before transferring the data to the hard drive, each patient was assigned a pseudonym in the form of a randomly generated Universally Unique Identifier (UUID), which links the HD-sEMG data and metadata to the patient. The pseudonym–real name assignment was kept offline on a separate encrypted USB stick and was deleted after the end of the study.
Instrumentation and materials
HD-sEMG signals were acquired in monopolar acquisition mode using the OT Bioelettronica Quattrocento (OT Bioelettronica, Turin, Italy). The Quattrocento is a commercial, Medical Device Regulation (MDR) 2017/745-certified EMG amplifier that was used with a sampling frequency of 2048 Hz for all HD-sEMG signal acquisitions in this study. A 64-channel, custom-designed, hybrid rigid-flexible electrode array with gold-plated, dry electrodes provided by OT Bioelettronica (see Fig. 1a) was used for acquiring the HD-sEMG signals on the skin. This electrode array consists of 16 columns, each with four electrodes equidistantly spaced on a rigid Printed Circuit Board (PCB) strip. These strips are connected by flexible PCBs to ensure easy use on measurement areas and subjects of differing sizes. To attach the electrodes to the subjects’ forearms, two elastic bands were fixed to each end of the rigid columns under the connection to the flexible PCB (see Fig. 1b). Snap buttons were used as a fixing mechanism. This ensured a tight, proper fit of the HD-sEMG electrodes on patients with differing circumferences of the lower arm.
To facilitate a clear interpretation and assignment of the recorded signals, the electrodes of the array were systematically numbered from 01 (row 1 and column 1) to 64 (row 4 and column 16), as shown in Fig. 2.
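The numbering scheme can be expressed as a small helper. The sketch below assumes row-major numbering (electrode number increases along a row across the 16 columns), which is consistent with the two corner electrodes stated above; Fig. 2 remains the authoritative layout. The function names are illustrative and not part of the dataset tooling.

```python
N_ROWS, N_COLS = 4, 16  # four electrodes per rigid strip, 16 strips (columns)


def electrode_number(row: int, col: int) -> int:
    """Return the 1-based electrode number for a 1-based (row, col) position.

    Assumes row-major numbering: 01 at (1, 1) and 64 at (4, 16).
    """
    if not (1 <= row <= N_ROWS and 1 <= col <= N_COLS):
        raise ValueError(f"position out of range: row={row}, col={col}")
    return (row - 1) * N_COLS + col


def electrode_position(number: int) -> tuple[int, int]:
    """Inverse mapping: 1-based electrode number -> 1-based (row, col)."""
    if not 1 <= number <= N_ROWS * N_COLS:
        raise ValueError(f"electrode number out of range: {number}")
    row, col = divmod(number - 1, N_COLS)
    return row + 1, col + 1


def column_electrodes(col: int) -> list[int]:
    """All electrode numbers that sit on one rigid column."""
    return [electrode_number(row, col) for row in range(1, N_ROWS + 1)]


def channel_name(number: int) -> str:
    """Zero-padded channel label, e.g. 7 -> 'channel_07'."""
    return f"channel_{number:02d}"
```

For example, `column_electrodes(15)` returns the four electrodes of column 15, which is useful when selecting the channels that lie over a specific anatomical landmark.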
OT BioLab+ (OT Bioelettronica, Turin, Italy) was used as the software for acquiring the HD-sEMG data. The software saves the data in a proprietary file format, .otb+, which can be exported to formats commonly used in data analysis, such as Comma-Separated Values (CSV). Furthermore, it provides a live visualization of the data to check for interference and other noise.
In addition to the acquisition of HD-sEMG data, the Quattrocento system was also configured to collect a trigger signal to label the start and the end of the data acquisition for each movement. To facilitate this, a foot trigger mechanism was implemented between a USB plug and a BNC connector. Both connectors were manually soldered together, allowing for the transfer of a 5 V signal from the USB to the BNC. All trials were recorded on video so that the execution of each movement could be checked during data validation. The videos are not included in the PhysioMio dataset5 to ensure the subjects’ privacy.
The data was acquired on a Microsoft Surface Pro 8 (Core i7-1185G7, 16GB RAM, 256GB SSD) (Microsoft Corporation, 2021) running Windows 11 Professional using a custom Python module, controlled via a command line interface. This module is part of a custom package ensuring standardized, uniform data acquisition and storage of the HD-sEMG data. It ensured synchronous acquisition of video and HD-sEMG data and stored the data on an encrypted, external hard drive. During data acquisition, the power plug was disconnected from the computer to mitigate powerline interference. Instead, a laptop power bank was used, which supplied the laptop with direct current (DC) directly. This way, no alternating current (AC) was introduced by the charging system. Figure 3 shows the experimental setup of the study.
Acquisition protocol
Data was collected over a period of eight months at two stroke rehabilitation clinics. Each patient was recorded longitudinally throughout their individual inpatient rehabilitation stay. Since the length of stay varied across patients, the number of recording sessions differs from patient to patient. Prior to each recording session, all systems were tested according to the manufacturer’s instructions to ensure proper functionality.
The subject’s skin on the lower arm and the electrode array were disinfected with alcohol. The electrode array was positioned 2 cm from the crook of the patient’s arm and tightly wrapped around the lower arm to ensure proper contact between the electrodes and the skin. The array was aligned such that, during recordings on the left forearm, electrode column 15 (electrodes 15, 31, 47 and 63) was positioned over the ulna, with the connection cable oriented towards the subject’s hand. For recordings on the right forearm, electrode column 2 (electrodes 2, 18, 34 and 50) was positioned over the ulna, with the cable likewise oriented towards the hand (see Fig. 4).
The first session of a subject consisted of up to four assessments in total: one to two assessments were dedicated to recording baseline reference data from the healthy arm, and one to two assessments were recorded from the impaired arm. In all subsequent sessions, only the impaired arm was recorded, with one to two assessments per session. Variations in the number of assessments were primarily due to patient fatigue, which is an expected factor in clinical studies.
In each assessment, the subject had to perform 16 gestures. The set of gestures included one resting pose to capture baseline signals with little muscle activity, along with 15 other distinct gestures, as illustrated in Fig. 5. Some gestures required additional items and interaction with the principal investigator (PI). The patients were motivated to execute or interact as strongly as possible throughout all performed gestures. Following the definition of the Fugl-Meyer Assessment (FMA), the executed movements were rated by the PI on a three-point ordinal scale ranging from 0 to 2. A score of 0 indicated that the movement could not be performed, 1 that the movement could be partially performed or the limb could not maintain position against resistance, and 2 that the movement could be fully performed or sustained against resistance. The recording of each assessment took approximately 10–15 minutes and was followed by a short break to allow patients to recover.
Prior to each gesture, the PI explained and demonstrated the gesture to the subject. The subject repeated the gesture and as soon as the subject reached the correct pose, the PI started counting for seven seconds. The PI pressed a foot trigger after the first second and released it after six seconds to ensure sufficiently long timing and to prevent premature release by the participant. If the subject was unable to fully execute the required gesture, the counting was initiated from the maximum position the subject was able to attain.
Throughout the procedure, the PI monitored the execution and recording of each gesture. If necessary, data collection of certain gestures was repeated to ensure accuracy. The PI assessed each movement via the FMA scale. All attempts were included in the PhysioMio dataset5. If a patient was unable to perform the gesture, the trial was retained and assigned an FMA score of 0. After completion of all recorded assessments, the electrodes were removed from the subject and the subject’s forearm as well as all objects used during the session were disinfected.
Data processing
The HD-sEMG data for each recording were stored as continuous recordings lasting approximately 10 to 15 minutes, comprising signals from all 64 electrodes during the performance of 16 distinct hand gestures as well as the intervening transition periods.
During signal acquisition, a digital high-pass filter at 10 Hz and a digital low-pass filter at 500 Hz, configured via the OT BioLab+ (OT Bioelettronica, Turin, Italy) software, were applied to the recorded data. Notably, no notch filter was applied to suppress potential power line interference. Users intending to apply a notch filter should consider that the recordings were made in Germany, where the mains frequency is 50 Hz.
A trigger signal marking the onset and offset of each gesture was synchronously recorded alongside the HD-sEMG signals and was digitally debounced during post-processing.
The recordings and trigger signals were processed using OT BioLab+ and initially saved in the proprietary .otb+ file format. To ensure compatibility with a broader range of analysis tools, the data were exported to comma-separated values (CSV) format and then converted to Apache Parquet format using Python, enabling more efficient storage and faster data processing.
Gesture segmentation was performed based on synchronized trigger signals. Since the initiation and termination of each gesture were manually controlled by the PI, slight variations in duration occurred. To standardize the PhysioMio dataset5, the central four seconds of each labeled gesture were extracted, ensuring uniform segment lengths across all recordings. Transition periods between gestures were excluded from the final PhysioMio dataset5.
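The segmentation step can be illustrated with a minimal sketch. It assumes an already debounced trigger trace sampled at 2048 Hz, a 2.5 V detection threshold for the 5 V trigger, and that the trigger is stored in volts; these details and the function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

FS = 2048  # sampling frequency in Hz, as stated for the Quattrocento setup


def trigger_intervals(trigger, threshold=2.5):
    """Return (start, stop) sample indices of every trigger-high interval.

    `stop` is exclusive. The threshold of 2.5 V is an assumed midpoint
    of the 5 V trigger signal.
    """
    high = np.asarray(trigger) > threshold
    # Pad with False so intervals touching either end are still detected.
    padded = np.concatenate(([False], high, [False]))
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)   # first high sample of each interval
    stops = np.flatnonzero(edges == -1)   # first low sample after each interval
    return list(zip(starts.tolist(), stops.tolist()))


def central_window(start, stop, seconds=4.0, fs=FS):
    """Return the central `seconds`-long window of [start, stop), or None
    if the labeled interval is shorter than the requested window."""
    n = int(round(seconds * fs))
    length = stop - start
    if length < n:
        return None
    offset = (length - n) // 2
    return start + offset, start + offset + n
```

With a trigger that is held for five seconds, `central_window` discards half a second on each side, so every retained segment has exactly 8192 samples.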
Each assessment was saved as a separate Parquet file named XX.parquet, where XX represents the respective recording number. Each file contains 66 columns. The “time” column represents the timestamp in seconds relative to the start of the individual gesture. Columns “channel_01” to “channel_64” contain the raw HD-sEMG signals from the 64 electrodes, expressed in mV and numbered as shown in Fig. 2. The column “fma” shows the evaluation of the respective gesture. For rest poses the “fma” field is empty. The “movement_type” column indicates the classified gesture, corresponding to a four-second segment for each gesture.
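As an illustration of this file layout, the following sketch loads one assessment with pandas and splits it into gesture segments. It relies only on the column names given above and assumes that the "time" column restarts at the beginning of every gesture segment, so that a decrease in time marks a segment boundary; this also keeps repeated attempts of the same gesture separate. The helper names are hypothetical.

```python
import pandas as pd

# Column names as described in the text: channel_01 ... channel_64.
CHANNELS = [f"channel_{i:02d}" for i in range(1, 65)]


def load_assessment(path):
    """Read one assessment file (requires a Parquet engine such as pyarrow)."""
    return pd.read_parquet(path)


def split_segments(df):
    """Split an assessment into gesture segments.

    Returns a list of (movement_type, fma, samples) tuples, where `samples`
    is an array of shape (n_samples, 64). A new segment is assumed to start
    wherever the per-gesture time stamp decreases.
    """
    segment_id = (df["time"].diff() < 0).cumsum()
    segments = []
    for _, seg in df.groupby(segment_id, sort=True):
        label = seg["movement_type"].iloc[0]
        fma = seg["fma"].iloc[0]  # empty (missing) for rest poses
        segments.append((label, fma, seg[CHANNELS].to_numpy()))
    return segments
```

Each returned array can be passed directly to channel-wise analyses such as RMS or spectral estimation.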
Data Records
The PhysioMio dataset collected in the study can be downloaded from Hugging Face5. The data is organized according to the file structure illustrated in Fig. 6.
Fig. 6: Structure of the PhysioMio dataset5 available at Hugging Face.
For each participant, a dedicated folder named sequentially from patient1 to patient48 is provided, containing all corresponding data. Within each participant folder, two subfolders are available: one for the recordings from the healthy arm and one for those from the impaired arm.
The healthy_arm folder contains up to two Parquet files, each representing one assessment of the healthy arm acquired during the initial recording session.
The impaired_arm folder contains up to twelve Parquet files, with the number corresponding to the number of recorded assessments conducted for the impaired arm.
A metadata file in CSV format accompanies the PhysioMio dataset5 to provide participant information and contextualize the recordings. All metadata is fully anonymized and does not contain any personally identifiable information. Each row corresponds to one recording and includes the following columns:
- id: A unique participant identifier, ranging from patient01 to patient48.
- arm_type: Arm type for each recording, recorded as healthy_arm or impaired_arm.
- recording_index: The number of the respective recorded assessments per patient and arm type, starting with 1.
- file_path: The file path of the recording relative to the data folder.
- sha256sum: Checksum of the file.
- age_in_years: Participant’s age in years at the time of the first recording.
- gender: Participant’s gender, recorded as m for male or f for female.
- height_in_cm: Participant’s height in centimeters at the time of the first recording.
- weight_in_kg: Participant’s weight in kilograms at the time of the first recording.
- impaired_arm: Participant’s side of the body affected by the stroke, recorded as l for left and r for right.
- dominant_arm: Participant’s dominant arm prior to the stroke, recorded as l for left and r for right.
- days_after_stroke: Number of days after the stroke event at which the recording was performed.
Standard participant information, including gender, weight, and height, is provided in the metadata file. Additionally, the information on the number of days after stroke for each recording allows users to reconstruct the longitudinal course of data acquisition for each participant. The PhysioMio dataset5 contains 329 files and has a size of 4.42 GB.
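The metadata columns lend themselves to two routine checks after download: verifying file integrity and reconstructing each participant's recording timeline. The sketch below uses only the Python standard library. It assumes a comma-delimited file with a header row, hexadecimal SHA-256 digests, and numeric days_after_stroke values; the function names and the metadata file name are hypothetical.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(metadata_csv, data_root):
    """Return the file_path entries that are missing or whose checksum
    differs from the sha256sum column."""
    failed = []
    with open(metadata_csv, newline="") as fh:
        for row in csv.DictReader(fh):
            target = Path(data_root) / row["file_path"]
            expected = row["sha256sum"].strip().lower()
            if not target.is_file() or sha256_of(target) != expected:
                failed.append(row["file_path"])
    return failed


def impaired_arm_timeline(metadata_csv):
    """Map each participant id to the impaired-arm recordings, sorted by
    days after stroke, as (days_after_stroke, file_path) tuples."""
    timeline = {}
    with open(metadata_csv, newline="") as fh:
        for row in csv.DictReader(fh):
            if row["arm_type"] == "impaired_arm":
                days = int(float(row["days_after_stroke"]))
                timeline.setdefault(row["id"], []).append((days, row["file_path"]))
    return {pid: sorted(entries) for pid, entries in timeline.items()}
```

An empty list from `verify_dataset` indicates that every listed file is present and matches its recorded checksum.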
Technical Validation
The EMG system was applied and used according to the manufacturer’s specifications to ensure accurate and reproducible signal acquisition. Prior to each recording session, a test contraction was performed to verify signal quality. In cases where elevated baseline noise was detected, an error protocol was followed to identify and address potential sources of interference (e.g., poor skin contact, cable motion artifacts).
After processing the data, we conducted a secondary filtering step by manually reviewing every recording through an intuitive visualization interface, allowing rapid identification and removal of recordings with strong residual artifacts.
To further validate the final PhysioMio dataset5, we applied quantitative signal quality metrics, including signal-to-noise ratio (SNR), coefficient of correlation to a normal distribution (CCN), power spectral density (PSD), and a basic classifier to separate healthy and impaired patients.
Secondary filtering
We conducted an additional manual quality-control pass to ensure the high quality of the final PhysioMio dataset5. For each recording, we generated a compact visual summary that displayed the time-series traces of all channels alongside per-channel statistics of the SNRs across all movements. SNRs were computed in decibels (dB), treating the Rest gesture as noise \(N\) and each of the 15 remaining movements \({m}_{i}\) as a separate signal.
For each recording, we generated an overall SNR statistic (SNR and standard deviation) by pooling the SNRs for all movements for all 64 channels. Recordings exhibiting uniformly low SNRs (<2.5 dB), SNRs with high standard deviation, or unusually large inter-movement SNR were flagged, and their traces were inspected visually and excluded from the final PhysioMio dataset5. If more than 10% of the 64 channels for a respective movement did not record an HD-sEMG signal due to missing skin contact, the entire recording was excluded. The two panels below illustrate this procedure: Fig. 7 shows a recording retained in the PhysioMio dataset5, whereas Fig. 8 with its conspicuous channel failure was discarded, resulting in the removal of the associated patient from the final PhysioMio dataset5.
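The SNR screening can be sketched as follows. The exact SNR formula is not reproduced in this text, so the sketch assumes the common mean-power ratio in decibels, 10·log10(P_movement / P_rest), computed per channel; the authors' definition may differ in detail. The function names are illustrative, and only the 2.5 dB threshold is taken from the description above.

```python
import numpy as np


def snr_db(movement, rest):
    """Per-channel SNR in dB for one movement.

    Both inputs have shape (n_samples, n_channels). Power is taken as the
    mean squared amplitude (an assumption). A channel with zero rest power
    yields an infinite value and should be handled by the caller.
    """
    p_signal = np.mean(np.square(movement), axis=0)
    p_noise = np.mean(np.square(rest), axis=0)
    return 10.0 * np.log10(p_signal / p_noise)


def recording_snr_summary(movements, rest):
    """Pool the SNRs of all movements and channels of one recording and
    return (mean, standard deviation) in dB."""
    pooled = np.concatenate([snr_db(m, rest) for m in movements])
    return float(pooled.mean()), float(pooled.std())


def flag_low_snr(mean_snr_db, threshold_db=2.5):
    """Flag a recording for visual inspection if its pooled SNR is low."""
    return bool(mean_snr_db < threshold_db)
```

A movement whose amplitude is ten times that of the rest segment gives 20 dB under this definition, since power scales with the square of the amplitude.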
This screening step resulted in the exclusion of six patients and 45 individual recordings from the final PhysioMio dataset5. These were removed prior to the statistical summary presented at the beginning of this paper (as shown in Table 2), which reflects the final, quality-controlled PhysioMio dataset5. A small number of recordings still exhibit sporadic channel dropouts or elevated noise levels. Such artefacts are inevitable when using a 64-electrode array, especially during active tasks performed with a paretic arm.
Healthy vs. Impaired arm
Recordings from neurologically intact arms differ markedly from those obtained from paretic arms, especially in cases of pronounced paresis. In Fig. 9, for example, the channel amplitudes are substantially attenuated in comparison to Fig. 7, producing a markedly lower signal-to-noise ratio (SNR). Moreover, compensatory movements needed to perform the required gesture can in some cases introduce additional motion artefacts, which can affect the SNR.
Signal-to-noise ratio (SNR)
For the curated PhysioMio dataset5, we computed patient-level SNR statistics (SNR and standard deviation) by pooling SNR for each movement for all channels for each recording separately for healthy and impaired arms. As an overall statistic we provide the means of SNR and standard deviation over all patients separately for healthy and impaired arms. All results are summarized in Table 3.
The results presented above clearly demonstrate the presence of a meaningful signal during gesture execution compared to baseline resting conditions. Applying the same methodology to the initial subset of the Ninapro dataset (DB1) yields an average SNR of 17.90 ± 6.84 dB. It should be noted, however, that these values correspond to young, healthy subjects, and thus serve primarily as a reference for optimal physiological conditions.
Correlation coefficient of Normality (CCN)
A normally distributed signal amplitude is a sign of a good EMG signal, whereas a signal amplitude with a non-normal distribution would be considered contaminated6. We quantify how closely our signal amplitudes match a normal distribution by calculating the Correlation Coefficient of Normality (CCN), which is defined as the Pearson correlation between the empirical amplitude histogram and an ideal Gaussian distribution having identical mean and variance.
A CCN value approaching 1 denotes a near-Gaussian amplitude distribution, which is indicative of a clean EMG trace. We aggregated CCN values across all retained recordings and report pooled summary statistics separately for healthy and paretic arms (see Table 4).
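A minimal sketch of the CCN as described above: the amplitude histogram is correlated with a Gaussian density of equal mean and variance, evaluated at the bin centers. The number of histogram bins (100) is an assumption, since the text does not specify it, and the function name is illustrative.

```python
import numpy as np


def ccn(signal, bins=100):
    """Correlation Coefficient of Normality of a 1-D amplitude signal.

    Pearson correlation between the normalized amplitude histogram and a
    Gaussian density with the same mean and standard deviation.
    """
    x = np.asarray(signal, dtype=float).ravel()
    density, edges = np.histogram(x, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mu, sigma = x.mean(), x.std()
    gaussian = np.exp(-0.5 * ((centers - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return float(np.corrcoef(density, gaussian)[0, 1])
```

Values close to 1 indicate a near-Gaussian amplitude distribution, while strongly non-Gaussian amplitudes, such as a uniform distribution, give much lower values.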
The PhysioMio dataset5 achieves a robust CCN of 0.896 ± 0.087 for the healthy arm, which is comparable to the CCN of the widely used Ninapro dataset (0.848 ± 0.075)6.
Spectral analysis
To characterize the frequency content of the HD-sEMG, we estimated the power-spectral density (PSD). Each channel was first high-pass filtered with a fourth-order Butterworth filter (cut-off = 20 Hz) to suppress motion artefacts and then filtered at 50 Hz with a second-order notch filter to remove mains interference. PSDs were obtained via Welch’s method using Hann windows of 1024 samples with 50% overlap.
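The described pipeline maps onto standard SciPy routines. The sketch below follows the stated parameters (fourth-order Butterworth high-pass at 20 Hz, second-order notch at 50 Hz, Welch with 1024-sample Hann windows and 50% overlap). The notch quality factor (Q = 30) and the use of causal, single-pass filtering are assumptions, as the text does not specify them.

```python
import numpy as np
from scipy import signal

FS = 2048  # sampling frequency in Hz


def channel_psd(x, fs=FS, notch_q=30.0):
    """Estimate the PSD of one HD-sEMG channel.

    Returns (frequencies in Hz, power spectral density).
    """
    # Fourth-order Butterworth high-pass at 20 Hz against motion artefacts.
    sos = signal.butter(4, 20.0, btype="highpass", fs=fs, output="sos")
    x = signal.sosfilt(sos, x)
    # Second-order IIR notch at the 50 Hz mains frequency (Q is assumed).
    b, a = signal.iirnotch(50.0, notch_q, fs=fs)
    x = signal.lfilter(b, a, x)
    # Welch estimate: Hann windows of 1024 samples with 50% overlap.
    return signal.welch(x, fs=fs, window="hann", nperseg=1024, noverlap=512)
```

With 1024-sample windows at 2048 Hz, the frequency resolution is 2 Hz, so a four-second gesture segment yields 15 overlapping windows per channel.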
The resulting spectra in Fig. 10 display the canonical EMG power band (≈20–500 Hz). Although healthy and paretic arms share peak frequencies, the impaired recordings consistently exhibit a lower overall spectral magnitude, consistent with the reduced motor-unit recruitment expected in severe paresis.
Fig. 10: Power-spectral density analysis of the PhysioMio dataset5. The blue line represents the mean PSD for the recordings of the healthy arm, while the lower and upper borders of the shaded area represent the 25th and 75th percentiles, respectively. The red line represents the mean PSD for the recordings of the impaired arm, with the shaded area again representing the 25th and 75th percentiles.
Basic classifier to separate healthy and impaired arms
To complement the signal-quality analyses, we assessed whether a basic classifier could discriminate between recordings from the healthy and paretic arms. Given the significant differences observed in these features (see Table 3 and Fig. 10), the per-recording PSD (20 frequency bins) together with the SNR mean and standard deviation were merged into a feature matrix. Each feature was normalized per patient using min–max normalization to mitigate inter-subject variability. Figure 10 suggests that the main difference between healthy and impaired PSD curves is a linear factor, so we simplified the feature space further by selecting only two normalized PSD frequency bands (68–92 Hz, 236–260 Hz), resulting in four final features. A Random Forest classifier (with fixed hyperparameters) was then trained using patient-based data splits to prevent subject leakage.
In a held-out evaluation with patients entirely unseen during training, the model achieved an accuracy of 0.878, a balanced accuracy of 0.877, a macro-F1 of 0.865, an AUROC of 0.890, and an AUPRC of 0.913. A 5-fold patient-based cross-validation yielded a mean accuracy of 0.855 (per-fold accuracy: 0.840, 0.924, 0.955, 0.677 and 0.878) and a mean balanced accuracy of 0.859 (per-fold balanced accuracy: 0.855, 0.949, 0.963, 0.641 and 0.888). A stricter leave-one-patient-out (LOPO) evaluation gave 0.851 micro-accuracy and 0.874 macro-accuracy (averaged across per-patient results), with 0.856 micro-balanced accuracy and 0.870 macro-balanced accuracy.
These consistent results demonstrate that even a simple, out-of-the-box model can separate healthy from impaired recordings, corroborating the PhysioMio dataset’s5 validity and the systematic signal differences between arms observed in our SNR and spectral analyses.
The PhysioMio dataset5 on its own is inherently imbalanced for this task, because the healthy arm was recorded only once or twice per patient during the initial session, while the impaired arm was recorded longitudinally throughout rehabilitation (≈1:3.5 ratio). This was addressed in two ways. Firstly, all splits were patient-wise, eliminating data leakage and ensuring balanced contribution of each subject. Secondly, the Random Forest was trained with class_weight = 'balanced', ensuring equal effective weighting of both classes. The close agreement between balanced accuracy and overall accuracy, together with the high AUPRC, indicates that the classifier did not exploit the imbalance spuriously and that both classes were learned effectively.
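The evaluation scheme described above (per-patient min–max normalization, patient-wise splits, and a class-weighted Random Forest) can be sketched with scikit-learn. This is an illustrative outline under stated assumptions, not the authors' code: the number of trees, the random seed, and the function names are assumptions, and the feature matrix is expected to be precomputed (for example, two PSD band values plus SNR mean and standard deviation per recording).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GroupKFold


def minmax_per_patient(features, patient_ids):
    """Scale every feature to [0, 1] within each patient's own recordings."""
    scaled = np.zeros_like(features, dtype=float)
    for pid in np.unique(patient_ids):
        rows = patient_ids == pid
        low = features[rows].min(axis=0)
        span = features[rows].max(axis=0) - low
        span[span == 0] = 1.0  # constant features map to 0 instead of NaN
        scaled[rows] = (features[rows] - low) / span
    return scaled


def patient_wise_cv(features, labels, patient_ids, n_splits=5, seed=0):
    """Balanced accuracy per fold; no patient appears in both train and test."""
    scores = []
    splitter = GroupKFold(n_splits=n_splits)
    for train, test in splitter.split(features, labels, groups=patient_ids):
        model = RandomForestClassifier(
            n_estimators=100,         # assumed; hyperparameters are not given
            class_weight="balanced",  # compensates the healthy/impaired imbalance
            random_state=seed,
        )
        model.fit(features[train], labels[train])
        predictions = model.predict(features[test])
        scores.append(balanced_accuracy_score(labels[test], predictions))
    return scores
```

Grouping the splits by patient identifier is what prevents recordings of the same person from appearing on both sides of a split; replacing `GroupKFold` with `LeaveOneGroupOut` would give a leave-one-patient-out variant.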
This classification experiment serves only as a baseline validation; deeper investigation into features and model architecture, as well as combination with other datasets, are interesting topics for future research based on the PhysioMio dataset5.
Usage Notes
The PhysioMio dataset5 is hosted in a public Hugging Face repository. The authors of this paper consented to the sharing of the data.
Users are advised to preprocess the PhysioMio dataset5 before analysis. This includes band-pass filtering to the desired EMG power band (e.g., 20–500 Hz) and applying a notch filter at 50 Hz to suppress potential power line interference. Example code is available in the repository listed under Code availability.
Data availability
The presented PhysioMio dataset can be downloaded from Hugging Face (https://doi.org/10.57967/hf/6783)5.
Code availability
The code required to preprocess the PhysioMio dataset5 and to reproduce the analyses above is available at the PhysioMio GitHub repository https://github.com/formove-ai/physiomio. The scripts were developed and written in Python 3.11 with standard open-source software packages such as matplotlib, numpy, and scikit-learn.
References
1. Cauraugh, J. H. & Kim, S. B. Chronic stroke motor recovery: duration of active neuromuscular stimulation. Journal of the neurological sciences 215, 13–19, https://doi.org/10.1016/s0022-510x(03)00169-2 (2003).
2. Lang, C. E., Bland, M. D., Bailey, R. R., Schaefer, S. Y. & Birkenmeier, R. L. Assessment of upper extremity impairment, function, and activity after stroke: foundations for clinical decision making. Journal of hand therapy: official journal of the American Society of Hand Therapists 26, 104–14;quiz 115, https://doi.org/10.1016/j.jht.2012.06.005 (2013).
3. Pollock, A. et al. Interventions for improving upper limb function after stroke. The Cochrane database of systematic reviews 2014, CD010820, https://doi.org/10.1002/14651858.cd010820.pub2 (2014).
4. Pollock, A., St George, B., Fenton, M. & Firkins, L. Top ten research priorities relating to life after stroke. The Lancet. Neurology 11, 209, https://doi.org/10.1016/s1474-4422(12)70029-7 (2012).
5. Ilg, J. et al. PhysioMio: Bilateral and Longitudinal HD-sEMG Dataset of 16 Hand Gestures from 48 Stroke Patients; Hugging Face https://doi.org/10.57967/hf/6783 (2025).
6. Pradhan, A., He, J. & Jiang, N. Multi-day dataset of forearm and wrist electromyogram for hand gesture recognition and biometrics. Scientific data 9, 733, https://doi.org/10.1038/s41597-022-01836-y (2022).
7. Atzori, M. et al. Electromyography data for non-invasive naturally controlled robotic hand prostheses. Scientific data 1, 14053, https://doi.org/10.1038/sdata.2014.53 (2014).
8. Di Domenico, D. et al. Reach&Grasp: a multimodal dataset of the whole upper-limb during simple and complex movements. Scientific data 12, 233, https://doi.org/10.1038/s41597-025-04552-5 (2025).
9. Du, Y., Jin, W., Wei, W., Hu, Y. & Geng, W. Surface EMG-Based Inter-Session Gesture Recognition Enhanced by Deep Domain Adaptation. Sensors (Basel, Switzerland) 17, https://doi.org/10.3390/s17030458 (2017).
10. Gomez-Correa, M., Ballesteros, M., Salgado, I. & Cruz-Ortiz, D. Forearm sEMG data from young healthy humans during the execution of hand movements. Scientific data 10, 310, https://doi.org/10.1038/s41597-023-02223-x (2023).
11. Gowda, H. T. et al. A database of upper limb surface electromyogram signals from demographically diverse individuals. Scientific data 12, 517, https://doi.org/10.1038/s41597-025-04825-z (2025).
12. Guo, W. et al. Hand kinematics, high-density sEMG comprising forearm and far-field potentials for motion intent recognition. Scientific data 12, 445, https://doi.org/10.1038/s41597-025-04749-8 (2025).
13. Israely, S., Leisman, G., Machluf, C. C. & Carmeli, E. Muscle Synergies Control during Hand-Reaching Tasks in Multiple Directions Post-stroke. Frontiers in computational neuroscience 12, 10, https://doi.org/10.3389/fncom.2018.00010 (2018).
14. Jarque-Bou, N. J., Vergara, M., Sancho-Bru, J. L., Gracia-Ibáñez, V. & Roda-Sales, A. A calibrated database of kinematics and EMG of the forearm and hand during activities of daily living. Scientific data 6, 270, https://doi.org/10.1038/s41597-019-0285-1 (2019).
15. Jiang, X. et al. Open Access Dataset, Toolbox and Benchmark Processing Results of High-Density Surface Electromyogram Recordings. IEEE transactions on neural systems and rehabilitation engineering: a publication of the IEEE Engineering in Medicine and Biology Society 29, 1035–1046, https://doi.org/10.1109/tnsre.2021.3082551 (2021).
16. Kyranou, I., Szymaniak, K. & Nazarpour, K. EMG Dataset for Gesture Recognition with Arm Translation. Scientific data 12, 100, https://doi.org/10.1038/s41597-024-04296-8 (2025).
17. Malešević, N. et al. A database of high-density surface electromyogram signals comprising 65 isometric hand gestures. Scientific data 8, 63, https://doi.org/10.1038/s41597-021-00843-9 (2021).
18. Matran-Fernandez, A., Rodríguez Martínez, I. J., Poli, R., Cipriani, C. & Citi, L. SEEDS, simultaneous recordings of high-density EMG and finger joint angles during multiple hand movements. Scientific data 6, 186, https://doi.org/10.1038/s41597-019-0200-9 (2019).
19. Zhao, K., Wen, H., Zhang, Z., He, C. & Wu, J. Fractal characteristics-based motor dyskinesia assessment. Biomedical Signal Processing and Control 68, 102707, https://doi.org/10.1016/j.bspc.2021.102707 (2021).
Acknowledgements
We thank all the participants who generously participated in this research. We thank all the clinical and research staff involved in participant recruitment, data collection and data processing. The authors gratefully acknowledge the received funding from the German Federal Ministry of Education and Research for project 16SV9068 and from the German Federal Ministry for Economic Affairs and Energy for project 03EFBY0337, in which the presented study played a central role.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Contributions
J. Ilg conceived the study, created the protocol, collected and analyzed the data and drafted the manuscript. A. Oldemeier and L. Deuschel developed the data acquisition software and performed the data analysis. M. Fieweger performed the data collection, processing and analysis. P. Rieckmann, P. Young, and S. Krause were responsible for participant recruitment and provided medical oversight for the study, while T. C. Lueth supervised the study from a technical standpoint. All authors reviewed and edited the manuscript and approved the submitted version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ilg, J., Oldemeier, A.C.R., Fieweger, M. et al. PhysioMio: bilateral and longitudinal HD-sEMG dataset of 16 hand gestures from 48 stroke patients. Sci Data 13, 19 (2026). https://doi.org/10.1038/s41597-026-06557-0
DOI: https://doi.org/10.1038/s41597-026-06557-0