Machine learning classification of early-stage Parkinson’s disease using sit-to-walk biomechanical features

Kim, Minsoo; Youm, Changhong; Park, Hwayoung; Kim, Bohyeon; Choi, Hyejin; Hwang, Juseon; Cheon, Sang-Myung

doi:10.1038/s41598-026-45122-y

Article
Open access
Published: 30 March 2026

Minsoo Kim¹,
Changhong Youm^1,2,
Hwayoung Park²,
Bohyeon Kim²,
Hyejin Choi²,
Juseon Hwang¹ &
…
Sang-Myung Cheon³

Scientific Reports volume 16, Article number: 10559 (2026)

Abstract

Early-stage Parkinson’s disease (PD) impairs motor control during complex tasks requiring coordinated postural adjustment and locomotion. The sit-to-walk (STW) task integrates standing and gait initiation (GI), placing balance demands on people with PD; however, its potential for early PD detection remains underexplored. We aimed to identify STW-related biomechanical biomarkers of early-stage PD and evaluate the performance of a machine-learning-based classification model. We enrolled 106 participants (63 with early-stage PD and 43 age-matched healthy controls). Three-dimensional motion capture, force plates, and surface electromyography assessed participants’ STW task performed at a self-selected speed. We extracted 200 kinematic, kinetic, and neuromuscular variables across three task phases, with Phase 2 (P2) corresponding to the GI phase encompassing the first stepping cycle, during which dynamic balance control is challenged. Weighted feature importance and stepwise binary logistic regression identified three variables: mean center of mass (COM) speed during the entire task; anteroposterior center of pressure-COM displacement during P2; and forward thoracic range of motion during P2, indicating trunk flexion associated with postural adjustment during GI. A random forest classifier incorporating these variables achieved 84.9% accuracy. These biomarkers may be associated with compensatory movement strategies related to postural stability and support objective early screening of PD.

Introduction

Parkinson’s disease (PD) is a progressive neurodegenerative disorder characterized by gradual loss of dopamine-producing neurons in the substantia nigra pars compacta, leading to impaired motor control and gait dysfunction^1,2. PD affects approximately 12 million people worldwide, and approximately 140,000 individuals in South Korea^3,4, and its incidence is increasing with population aging⁵. PD is irreversible and frequently underdiagnosed in its early stages, when mild symptoms are often misattributed to aging. By the time PD is clinically diagnosed, motor symptoms are typically pronounced, and over 50% of the dopaminergic neurons have already degenerated^2,6. Therefore, detecting PD before overt motor symptoms is crucial to maximize therapeutic opportunities and enable personalized treatment strategies that may improve quality of life^2,7.

Notably, several studies on biomarkers for early diagnosis and prediction of PD report that non-motor symptoms often precede motor symptoms^2,8. Of these, reduced olfactory function is observed in about 90% of people with PD⁹, and sleep disturbances are reported in approximately 46% of cases¹⁰. However, these non-motor symptoms are also common in normal aging and in early Alzheimer’s disease, limiting their diagnostic accuracy and objectivity for identifying people with early-stage PD¹¹. Early-stage motor symptoms include unilateral rigidity or tremor, bradykinesia, shuffling gait, reduced arm swing, stooped posture, increased gait variability, asymmetry in limb movement, and reduced range of motion (ROM) of the ankle, knee, and hip joints, reflecting movement-related, task-specific kinematic limitations observed during gait and postural transition tasks^2,12. Nevertheless, diagnostic accuracy based on motor symptoms alone remains variable (27.8–88.3%)^2,13,14. To address these limitations, studies using artificial-intelligence (AI)–based machine learning (ML) and deep learning (DL) have emerged¹⁵, reporting accuracies of 84.6% and 67.3% during forward-walking tasks^16,17, 75.8% on the Timed-Up-and-Go (TUG) test¹⁸, and 83.5% on the 6-min walk test, which includes turning and forward walking¹⁹. These ML/DL approaches could detect subtle gait changes²⁰ and are increasingly applied to improve early diagnosis and predict symptom progression in PD.

Notably, many people with PD experience difficulties with complex tasks requiring movement initiation and postural transitions in daily life, largely due to gait instability, muscle weakness, and increased rigidity²¹. Of these, the sit-to-walk (STW) task is a critical daily activity that integrates sit-to-stand (STS) and gait initiation (GI) into a single transition and requires both locomotion and postural control²². The ability to perform STW declines with age, with reported reductions in center-of-pressure (COP)–center-of-mass (COM) distance, step length, and walking speed²³, alongside a temporal separation between rising from a chair and initiating walking²⁴. People with PD exhibit similar characteristics during STW, including separation of standing up and walking initiation, longer task duration, reduced vertical COM velocity, and decreased step length and velocity²². STW has been proposed as a useful assessment tool for PD; however, research remains limited, particularly regarding AI-based approaches to detect PD or classify symptom severity.

Therefore, we aimed to identify key biomechanical biomarkers, derived from kinematic, kinetic, and neuromuscular variables during the STW task, that could classify people with early-stage PD using ML algorithms and to evaluate the classification performance of the developed models. In addition, we aimed to verify whether an ML-based algorithm using the selected biomarkers could accurately classify people with early-stage PD. Furthermore, we hypothesized that biomechanical variables obtained from the STW task could serve as predictive biomarkers for early-stage PD classification.

Results

Feature selection for early-stage PD group classification

Variable selection for classifying early-stage PD was performed using Random Forest (RF) and Extreme Gradient Boosting (XGBoost) based on weighted importance analysis. From the 200 STW variables, 26 with a weighted importance ≥ 0.01 were selected (see Supplementary Table S1 online, Fig. 1c). The variables with the greatest importance were, in descending order: mediolateral COP–COM displacement in Phase 1 (P1_MedLat COP–COM), the average COM speed across the entire STW task (TP_COM Speed), the integrated surface electromyography (sEMG) value of the left biceps femoris (BF) in Phase 1 (P1_Left_BF_iEMG), the peak COM speed in Phase 2 (P2_COM Pspeed), and the COM sway area in Phase 1 (P1_COM_Sway_Area).

A stepwise binary logistic regression analysis was performed to classify early-stage PD and healthy controls (HC) based on the 26 selected variables. Three variables were retained in the final model: TP_COM Speed (area under the curve [AUC] = 0.865, odds ratio [OR] 50.562, 95% confidence interval [CI]: 8.008–319.257, p < 0.001), P2_anteroposterior (AP) COP–COM (AUC = 0.701, OR 3.412, 95% CI: 1.515–7.684, p = 0.003), and P2_T10 For_ROM, the forward ROM at the 10th thoracic vertebral level (T10) during Phase 2 (AUC = 0.535, OR 0.369, 95% CI: 0.159–0.860, p = 0.021). The model’s total explanatory power was 71.8% (Table 1, Fig. 2).
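As a worked check on the reported odds ratios, a logistic regression coefficient β and its standard error (SE) relate to the OR and its 95% CI through OR = exp(β) and CI = exp(β ± 1.96·SE). The sketch below assumes Wald-type intervals, which the source does not state explicitly, and back-calculates β and SE for TP_COM Speed from the values reported above. The function name is illustrative only.

```python
import math

def beta_and_se_from_or(odds_ratio, ci_low, ci_high, z=1.96):
    """Back-calculate a logistic coefficient and its standard error.

    Assumes a Wald-type 95% CI, i.e. CI = exp(beta +/- z * SE).
    This is an assumption; the source does not state the CI method.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

# Reported values for TP_COM Speed (OR 50.562, 95% CI 8.008 to 319.257).
beta, se = beta_and_se_from_or(50.562, 8.008, 319.257)

# Rebuild the interval from beta and SE to confirm it is symmetric on the log scale.
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)
print(f"beta = {beta:.3f}, SE = {se:.3f}, 95% CI = {ci_low:.3f} to {ci_high:.3f}")
```

Under this assumption the coefficient is roughly 3.92 with an SE of roughly 0.94, and the rebuilt interval matches the reported one closely. The wide CI therefore reflects a large SE on the log-odds scale rather than an asymmetric estimate.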

Table 1 Stepwise binary logistic regression results for the 26 weighted-importance STW features.

Classification accuracy and performance metrics for groups of early-stage PD

We analyzed three datasets to evaluate the classification accuracy between early-stage PD and HC: (1) all 200 variables collected during the STW task, (2) the top 26 variables selected through weighted importance analysis using the RF and XGBoost algorithms, and (3) the final three variables selected from the 26 variables using stepwise binary logistic regression. When using all 200 variables, the RF algorithm achieved an accuracy of 87.3 ± 3.2%. When using the top 26 variables, the accuracy increased to 92.1 ± 2.8%, whereas using only the final three variables yielded an accuracy of 84.9 ± 5.3% (see Supplementary Table S2 online, Fig. 3a). Performance metrics from the RF algorithm showed precision (88.4 ± 3.4%), recall (87.3 ± 3.2%), and F1 score (87.2 ± 3.2%) when using all 200 variables; precision (92.4 ± 2.9%), recall (92.1 ± 2.8%), and F1 score (92.0 ± 2.8%) when using 26 variables; and precision (85.8 ± 6.1%), recall (84.9 ± 5.3%), and F1 score (84.8 ± 5.2%) when using three variables (see Supplementary Table S2 online, Fig. 3b).
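In all three settings the reported recall equals the reported accuracy (87.3%, 92.1%, and 84.9%). Class-weighted averaging of per-class recall produces exactly this identity, so it is one plausible reading of how the metrics were aggregated; the source does not confirm it. The sketch below illustrates the identity with hypothetical confusion-matrix counts that are not taken from the study.

```python
def weighted_metrics(tp, fn, fp, tn):
    """Class-weighted precision, recall, and F1 for a binary problem.

    tp/fn count the positive class (e.g. PD); fp/tn count the negative class (e.g. HC).
    Each per-class metric is weighted by that class's number of true samples.
    """
    n_pos, n_neg = tp + fn, fp + tn
    total = n_pos + n_neg

    # Per-class precision and recall, treating each class in turn as "positive".
    prec_pos, rec_pos = tp / (tp + fp), tp / n_pos
    prec_neg, rec_neg = tn / (tn + fn), tn / n_neg

    def f1(p, r):
        return 2 * p * r / (p + r)

    def weight(pos_value, neg_value):
        return (n_pos * pos_value + n_neg * neg_value) / total

    return {
        "accuracy": (tp + tn) / total,
        "precision": weight(prec_pos, prec_neg),
        "recall": weight(rec_pos, rec_neg),
        "f1": weight(f1(prec_pos, rec_pos), f1(prec_neg, rec_neg)),
    }

# Hypothetical counts for 63 PD and 43 HC participants (not the study's data).
metrics = weighted_metrics(tp=55, fn=8, fp=7, tn=36)
print(metrics)
```

Because each class's recall is weighted by its own sample count, the weighted recall reduces to (TP + TN) / N, which is the accuracy. Weighted precision and F1 carry no such identity and can differ from accuracy, as they do in the reported results.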

Discussion

PD is often difficult to distinguish from age-related motor decline because initial motor symptoms are subtle and task-dependent, which can lead to delayed or missed diagnosis. People with PD experience difficulties during movement transitions, and previous studies have demonstrated a dissociation between the STS and GI components during the STW task. Despite the functional relevance of STW and its sensitivity to transitional motor deficits, quantitative biomechanical investigations of STW specifically targeting people with early-stage PD remain limited. In this study, we sought to identify key biomechanical biomarkers capable of classifying people with early-stage PD using features derived from the STW task, including kinematic, kinetic, and muscle activation measures, and to evaluate the classification accuracy of an ML-based algorithm using these biomarkers. Three STW-based variables were selected as the most relevant indicators for distinguishing people with early-stage PD from HC: (1) the mean COM speed across the entire STW task (AUC = 0.865), (2) the AP displacement between the COP and COM during Phase 2, from right heel-off to left heel-strike (AUC = 0.701), and (3) forward ROM at T10 during Phase 2 (AUC = 0.535) (Table 1, Fig. 2). The explanatory power of the model was 71.8%. In addition, the RF algorithm demonstrated the greatest classification accuracy. When all 200 variables were used, accuracy reached 87.3%; when the top 26 weighted-importance variables were used, accuracy increased to 92.1%; and when only the final three selected biomarkers were used, accuracy remained robust at 84.9%.

Our findings demonstrate that people with early-stage PD show a slower mean COM speed across the entire STW task, a shorter AP displacement between the COP and COM during Phase 2, and a reduced forward ROM at T10, compared with HC. Notably, the shortened COP–COM displacement observed during the STW task may reflect a deviation from the typical forward progression pattern from sitting to walking and may serve as a compensatory strategy to preserve postural stability during movement execution^22,24,25. Previous research has reported that compensatory strategies used to maintain dynamic stability are characterized by reduced AP linear momentum and velocity^22,25, longer task execution time, greater braking impulse, shortened COP–COM displacement, and increased hip and knee ROM and extension torque²⁴. Consistent with these findings, people with early-stage PD in the present study exhibited a slower mean COM speed across the entire STW task and a shorter AP COP–COM displacement during Phase 1 (from the onset of the 7th cervical vertebra (C7) movement to right heel-off), relative to the HC. These results suggest that people with early-stage PD may first stabilize posture before initiating forward progression, rather than transitioning immediately into gait after standing.

These compensatory strategies are thought to arise from dysfunction within the basal ganglia circuitry resulting from reduced dopaminergic neurotransmission, which leads to impaired movement planning and a diminished capacity to maintain automatic motor patterns during gait, postural transitions, and balance control^26,27. Accordingly, the present findings suggest that neurophysiological alterations in people with early-stage PD directly contribute to impaired trunk stability and deficits in movement-transition control.

Gan et al.²⁸ reported that balance and postural instability commonly observed in PD are associated with reduced trunk ROM during gait, which may represent a compensatory strategy in the early stages of the disease. Palmisano et al.²⁹ further emphasized the importance of assessing spinal kinematics under dynamic conditions and demonstrated altered trunk movement patterns in people with PD who experience freezing of gait, characterized by increased thoracic forward tilt and reduced overall trunk ROM. Similar observations have also been reported during the STW task. Sveinbjornsdottir³⁰ described marked asymmetry and axial impairment in people with early-stage PD, likely attributable to unilateral motor symptoms. In addition, Palmerini et al.³¹ linked rigidity and bradykinesia to reduced trunk sway in both the AP and vertical directions. Consistent with these findings, the present study demonstrated that people with early-stage PD exhibited significantly reduced forward ROM at the T10 segment during Phase 2 of the STW task compared with HC.

These findings are consistent with those of Zampieri et al.³², who reported reduced trunk mobility during walking and STS tasks in people with PD. This reduction has been interpreted as a compensatory strategy in which increased trunk muscle tension and axial rigidity constrain trunk ROM to preserve postural stability. Accordingly, the reduced trunk mobility observed in people with early-stage PD may represent a biomechanical adaptive response for maintaining balance control and postural stability.

Previous studies including people with PD across mild to severe disease stages have reported prolonged STW task duration, reduced vertical COM velocity, and decreased step length and step velocity²². In addition, Palmisano et al.³³ reported that people with PD and a history of falls exhibited longer STW completion times and shorter COP–COM displacement than those without a history of falls.

These findings support the relevance of the three variables identified in the present study—TP_COM_Speed, P2_AP_COP-COM, and P2_T10_For_ROM—as potential indicators for detecting people with early-stage PD. Specifically, these markers appear to capture subtle impairments in postural stability and gait-transition control that may not be evident during routine clinical assessment. Moreover, the reduced COM speed, shortened AP COP–COM displacement, and limited T10 ROM observed in this study suggest that people with early-stage PD may already be exposed to risk of falls associated with early biomechanical deterioration. Accordingly, small but measurable alterations in trunk kinematics and COM dynamics during the STW task may serve as clinically meaningful indicators for early PD screening and fall risk prediction.

In this study, the dominant side of all participants with early-stage PD was standardized to the right side. Examination of the clinically affected side revealed that 49.2%, 36.5%, and 14.3% presented with left-sided, right-sided, and bilateral involvement, respectively. In addition, people with early-stage PD demonstrated a significantly greater root mean square (RMS) asymmetry index (ASI) in the tibialis anterior (TA) muscle during Phase 2 (P2_RMS_ASI_TA) compared to the HC. This finding is consistent with the results reported by Keloth et al.³⁴, who observed increased bilateral asymmetry in the TA and medial gastrocnemius (GAS) muscles during gait in people with early-stage PD, particularly during the swing and single-leg support phases. These results indicate that left–right asymmetry in muscle activation is not limited to straight-line walking but is also evident during transitional tasks such as STW. This pattern of asymmetry reflects the unilateral onset of PD³⁵ and may be associated with reduced lower limb muscle activation resulting from side-specific neurodegeneration³⁶.

The RF algorithm demonstrated the greatest classification accuracy among the models evaluated in this study. This level of performance is comparable to, and in some cases exceeds, that reported in previous studies using gait-based ML approaches for PD classification. For example, Hwang et al.¹⁷ reported an RF accuracy of 71.3% when classifying people with early-stage PD across three gait-speed conditions, whereas Ferreira et al.¹⁶ achieved an accuracy of 84.6% using a Naïve Bayes (NB) model during preferred-speed walking. Similarly, Choi et al.¹⁹ reported an accuracy of 83.5% using a convolutional neural network based on data from the 6-min walk test in older adults.

In comparison, the 84.9% accuracy achieved in the present study using only three variables suggests that the selected STW-based indicators may serve as efficient and clinically applicable markers for screening people with early-stage PD, even when the dimensionality of the dataset is substantially reduced. In this study, we quantitatively identified biomechanical characteristics and neuromuscular control alterations occurring during the STW task in people with early-stage PD, providing novel evidence relevant to early detection and clinical assessment. Compared with HC, people with early-stage PD exhibited reduced trunk stability and impaired movement-transition ability, as reflected by slower mean COM speed, shorter AP COP–COM displacement, and reduced T10 ROM. These alterations likely represent compensatory movement strategies aimed at maintaining postural stability and may be associated with functional impairment of basal ganglia circuitry resulting from reduced dopaminergic transmission. Furthermore, asymmetry in muscle activation—particularly in the TA muscle—was significantly greater in the PD group, supporting the characteristic unilateral neurodegeneration of early-stage PD and suggesting asymmetric deficits in neuromuscular coordination and postural control.

The 84.9% classification accuracy achieved using only three selected variables (TP_COM_Speed, P2_AP_COP-COM, and P2_T10_For_ROM) suggests that subtle alterations in trunk kinematics and COM dynamics during the STW task may serve as sensitive biomechanical indicators of early-stage PD. Collectively, these findings highlight the potential clinical value of STW-based measures as noninvasive and time-efficient markers for early PD screening and fall risk prediction.

This study had some limitations. First, people with PD were assessed only during the “ON” medication state, which may not fully capture gait and postural impairments that may be more pronounced in the “OFF” state. Second, differences in sample size and sex distribution between groups were controlled using covariate-based statistical adjustments; however, residual imbalance may still have influenced the results. Third, the STW task required all participants to initiate gait with the right foot, which may have constrained the expression of natural movement strategies. Fourth, the study included only people with early-stage PD (Hoehn and Yahr stages 1–2), thereby limiting the generalizability of the findings to individuals in more advanced stages. Fifth, although tree-based algorithms such as RF and XGBoost are relatively robust to multicollinearity, the presence of correlated variables may still affect the stability and interpretability of feature-importance rankings. To examine this issue, pairwise Spearman correlation coefficients among all extracted variables were calculated, and the results are provided in Supplementary Table S3. However, correlated variables were not explicitly excluded before the feature-importance analysis, which should be considered when interpreting the ranked predictors. Future studies should include assessments conducted during both “ON” and “OFF” medication states, apply improved approaches to address sample imbalance, allow natural step initiation, and recruit participants across a broader range of disease severity.

Conclusion

In this study, we evaluated the performance of ML algorithms in distinguishing people with early-stage PD from HC using biomechanical variables derived from the STW task, including kinematic, kinetic, and neuromuscular parameters. The following three key variables were identified: (1) the mean speed of the COM across the entire task, (2) the AP displacement between the COP and COM during Phase 2, and (3) forward ROM at T10 during Phase 2. Using these variables, the RF algorithm achieved a classification accuracy of 84.9%.

These findings indicate that subtle alterations in trunk kinematics and COM dynamics during the STW task may serve as sensitive early markers of neuromuscular impairment in PD. Accordingly, these variables may offer a non-invasive and efficient screening approach for early detection and fall risk assessment in clinical settings. Future studies should validate the predictive performance and clinical applicability of these markers in larger and more diverse older populations.

Methods

Participants

People with PD were recruited from the outpatient neurology clinic of a medical center. We included those with idiopathic PD diagnosed by a neurologist according to the Movement Disorder Society (MDS) diagnostic criteria³⁷. The inclusion criteria were as follows: (a) adults aged ≥ 50 years with right-hand dominance, (b) a score of ≥ 24 on the Korean-Mini Mental State Examination³⁸, (c) Hoehn and Yahr stage 1–2, (d) the ability to walk independently, and (e) a stable response to antiparkinsonian medication. We excluded people with PD and a history of cardiovascular, musculoskeletal, vestibular, or other neurological disorders; those who required assistive devices or caregiver support during movement; and those with uncontrolled dyskinesia despite medication. HC were recruited from the local community and comprised age- and sex-matched individuals with no history of musculoskeletal, cardiovascular, vestibular, or neurological disorders affecting gait or cognition within the past 6 months, and no history of orthopedic surgery.

Based on these criteria, 63 people with early-stage PD and 43 HC were included in the final analysis (Fig. 4). Participants’ demographic and physical characteristics are presented in Table 2. The Institutional Review Board (IRB) of Dong-A University Hospital approved the study protocol and consent procedures (IRB number: DAUHIRB-22-089). All participants provided written informed consent before participation. The study was registered with the Clinical Research Information Service in the Republic of Korea (KCT0009353).

Table 2 Physical and clinical characteristics of the participants.

Instrumentation

The STW task was recorded using nine infrared cameras (Vicon MX-T10, Oxford Metrics, UK), two force plates (OR6-7, AMTI, USA), and eight wireless sEMG sensors (Delsys, Natick, MA, USA). All equipment was arranged to fully capture the STW movement. The nine cameras were positioned around the measurement space, and the two force plates were embedded in the floor at the locations where participants’ feet rested while they were seated on the chair.

The global coordinate system for the STW task was defined with the left edge of the walkway set as the origin (0, 0, 0). The X-axis represented the participant’s mediolateral direction, the Y-axis corresponded to the anterior–posterior direction of motion, and the Z-axis indicated the vertical direction (Fig. 5a). All reflective markers and surface EMG electrodes were applied by a single researcher with over 5 years of experience in clinical gait and sEMG analysis, to minimize inter-rater variability and ensure anatomical consistency (see Supplementary Table S4 online, Fig. 5b). A full-body biomechanical model was constructed using the Plug-in Gait model based on the modified Helen Hayes marker set, with 39 spherical reflective markers (14 mm in diameter). Previous studies have demonstrated that marker placement by highly experienced evaluators substantially reduces placement error and kinematic variability compared with less experienced raters³⁹.

Muscle activation was recorded by placing eight wireless sEMG sensors on the rectus femoris, the short head of the BF, the TA, and the medial GAS muscles of both lower limbs (Fig. 5c). All sEMG electrodes were applied by the same researcher who placed the reflective markers, thereby avoiding inter-rater variability. Electrode placement followed the Surface EMG for Non-Invasive Assessment of Muscles (SENIAM) recommendations, with sensors positioned over the maximal bulging region of each muscle belly between the origin and insertion points, aligned with muscle fiber orientation⁴⁰. Before sensor attachment, the skin was shaved and cleaned with alcohol to minimize noise. The sensors were secured using double-sided adhesive tape, with additional fixation tape applied to prevent detachment during movement⁴⁰.

Test procedures

All measurements were conducted in the “ON” medication state, with antiparkinsonian drugs taken approximately 2 h before testing. The experimental procedure consisted of two stages.

First, participants completed informed consent and underwent evaluations of clinical status, physical characteristics, and anthropometric measurements required for joint kinematic calculations during the STW task. Clinical status evaluation included the collection of disease-related information, including disease duration, antiparkinsonian medication dosage, and treatment duration, obtained by a clinician from medical records and participant interviews. Anthropometric measurements required for subject-specific biomechanical modeling were obtained to enable accurate joint kinematic calculations using the Plug-in Gait model. These measurements included leg length (measured from the anterior superior iliac spine to the medial malleolus); shoulder, elbow, wrist, and knee width; and ankle and hand thickness, measured using a tape measure and calipers. Physical characteristics included body height and body mass measured using a stadiometer and a calibrated digital scale, respectively.

Second, participants changed into spandex shirts and shorts, after which 39 spherical 14 mm reflective markers (Fig. 5b) and eight wireless sEMG electrodes (Fig. 5c) were attached. Before data collection, participants performed a 5-min warm-up, primarily consisting of walking, followed by 2–3 practice trials of the STW task. In addition, brief low-intensity movements were included, such as light upper- and lower-limb stretching (comparable to gentle self-initiated limb extension) and seated heel raises. These movements were limited in amplitude and duration and were implemented solely to promote comfort, ensure participant safety, and facilitate task familiarization, not to increase joint ROM. Then, participants were allowed to rest in a seated position for approximately 1 min to minimize learning effects. For the STW measurement, all participants sat on a chair with a backrest, and the chair height was adjusted to maintain the knee joint angle at 90°. The position of the feet was adjusted so that the lower legs were approximately perpendicular to the ground, with each foot placed separately on one of the two force plates. To eliminate upper limb involvement, participants were instructed to maintain a standing at-attention posture with both arms naturally lowered. The first step after seat-off was standardized so that all participants initiated gait with the right foot. The STW task was assessed using the TUG protocol. After a static calibration, participants were instructed to stand from the chair, walk forward at a self-selected speed to a 20 cm marker cone positioned 3 m ahead, turn around it, and return to the chair. Similarly, a 1-min seated rest was provided between trials to reduce learning effects, and the task was performed three times.

Data preprocessing

Motion analysis, ground reaction force (GRF), and sEMG data collected during the STW task were processed using Nexus software (version 2.10.3, Vicon, UK) and MATLAB R2024b (MathWorks, Natick, MA, USA). Spatiotemporal, kinematic, and kinetic variables were computed using the standard Plug-in Gait pipeline implemented in Vicon Nexus. Joint angles were calculated using Cardan (Euler) angle rotations. GRF and inverse dynamics–based joint moments and powers were calculated within Nexus and subsequently exported for phase-specific variable extraction and additional computations in MATLAB. The kinematic modeling framework and joint angle calculations followed established marker-based gait analysis conventions⁴¹. GRF data were exported in Newtons (N) and normalized to body weight in MATLAB. Joint moment and power outputs exported from Nexus were normalized to body mass and reported in N·mm/kg and W/kg, respectively, in accordance with standard biomechanical practice⁴². No additional normalization was applied in MATLAB. The sampling frequencies were set to 100 Hz, 1000 Hz, and 2000 Hz for motion analysis, GRF, and sEMG signals, respectively. Motion capture and GRF data were filtered using a fourth-order low-pass Butterworth filter with cut-off frequencies of 6 Hz and 25 Hz, respectively. sEMG signals were processed separately using a 40–400 Hz band-pass Butterworth filter and a 60 Hz notch filter to remove power-line interference⁴³.
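The filtering steps described above can be sketched in Python with SciPy, although the study itself used Nexus and MATLAB. Several details are assumptions because the source does not state them: zero-phase (forward-backward) filtering, which doubles the effective attenuation of the nominal filter order; the band-pass order of 4; and the notch quality factor of 30. The signals are synthetic.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, sosfiltfilt

def lowpass(signal, cutoff_hz, fs_hz, order=4):
    """Low-pass Butterworth filter applied forward and backward (zero-phase is assumed)."""
    sos = butter(order, cutoff_hz, btype="low", fs=fs_hz, output="sos")
    return sosfiltfilt(sos, signal)

def clean_semg(signal, fs_hz=2000.0, band_hz=(40.0, 400.0), notch_hz=60.0, q=30.0):
    """Band-pass then notch an sEMG trace; the order (4) and Q (30) are assumed values."""
    sos = butter(4, band_hz, btype="band", fs=fs_hz, output="sos")
    band_passed = sosfiltfilt(sos, signal)
    b_notch, a_notch = iirnotch(notch_hz, q, fs=fs_hz)
    return filtfilt(b_notch, a_notch, band_passed)

# Synthetic marker trace sampled at 100 Hz: 1 Hz movement plus 30 Hz noise.
t = np.arange(0, 5, 1 / 100.0)
movement = np.sin(2 * np.pi * 1.0 * t)
noisy = movement + 0.5 * np.sin(2 * np.pi * 30.0 * t)
smoothed = lowpass(noisy, cutoff_hz=6.0, fs_hz=100.0)

# Synthetic sEMG trace at 2000 Hz: broadband noise plus 60 Hz interference.
rng = np.random.default_rng(0)
t_emg = np.arange(0, 2, 1 / 2000.0)
raw_emg = rng.standard_normal(t_emg.size) + 2.0 * np.sin(2 * np.pi * 60.0 * t_emg)
emg = clean_semg(raw_emg)
```

GRF data would follow the same `lowpass` call with `cutoff_hz=25.0` and `fs_hz=1000.0`. Second-order sections are used for the Butterworth filters because they are numerically more stable than transfer-function coefficients at low normalized cut-off frequencies.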

During the STW task, motion analysis was divided into five events and three phases. Event 1 (E1) was defined as the time point at which the trunk forward inclination—calculated using the C7 marker—exceeded 5° in the anterior direction relative to the seated baseline⁴⁴. Event 2 (E2) was defined as the seat-off moment, when the buttocks lifted off the chair and the magnitude of the anteroposterior GRF reached its maximum (peak of |AP GRF|)^22,45. Event 3 (E3) corresponded to the heel-off of the first stepping limb. Event 4 (E4) was the initial heel-strike of the first step, marking ground contact of the leading limb. Event 5 (E5) was defined as the heel-strike of the contralateral limb during the second step⁴⁶. Heel-off (E3) was defined as the time point when the heel marker of the first stepping limb showed a vertical rise > 1.5 cm, indicating foot lift-off^22,44. Heel-strike events (E4 and E5) were identified as the frames at which the heel marker reached its minimum vertical position, corresponding to ground contact. The analysis phases were defined as follows: the STS phase from E1 to E2, Phase 1 (P1) from E1 to E3, and Phase 2 (P2) from E3 to E5 (see Supplementary Fig. S1 online).
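The threshold rules for E1 and E3 and the minimum-height rule for heel-strike can be sketched as follows. This is a minimal illustration on synthetic traces, not the authors' implementation: the baseline is taken as the first sample, and the search window passed to `heel_strike` is a hypothetical choice, since the source does not specify how the window was bounded.

```python
def first_crossing(values, baseline, threshold):
    """Index of the first sample exceeding baseline by more than threshold, or None."""
    for i, v in enumerate(values):
        if v - baseline > threshold:
            return i
    return None

def heel_strike(heel_z, start, stop):
    """Index of the minimum heel-marker height within the frames [start, stop)."""
    window = heel_z[start:stop]
    return start + window.index(min(window))

# Synthetic traces (hypothetical values, one entry per motion-capture frame).
trunk_angle_deg = [0.0, 0.5, 1.0, 3.0, 5.5, 9.0, 14.0, 20.0]    # C7-based forward inclination
heel_z_cm = [3.0, 3.0, 3.2, 4.0, 4.8, 9.0, 6.0, 3.1, 2.9, 3.0]  # heel marker height

# E1: trunk inclination exceeds 5 degrees relative to the seated baseline.
e1 = first_crossing(trunk_angle_deg, baseline=trunk_angle_deg[0], threshold=5.0)
# E3: heel marker rises more than 1.5 cm above its resting height.
e3 = first_crossing(heel_z_cm, baseline=heel_z_cm[0], threshold=1.5)
# E4: heel marker reaches its minimum height after heel-off.
e4 = heel_strike(heel_z_cm, start=e3, stop=len(heel_z_cm))
print(e1, e3, e4)
```

In practice the traces would be the filtered marker signals, and E2 would additionally require the anteroposterior GRF to locate its peak magnitude.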

In total, 200 biomechanical variables were extracted from the STW task, comprising 14, 32, and 114 spatiotemporal, kinematic, and musculokinetic variables, respectively, calculated for each phase of the task (see Supplementary Table S5 online). Among these, the muscle ASI between the left and right limbs was calculated using Eq. (1)⁴⁷ based on the unilateral RMS values of the sEMG signals. In this equation, Leg1 represents the greater RMS value between the left and right limbs, whereas Leg2 represents the smaller RMS value of the opposite limb. A value of 0 indicates perfect symmetry, while larger values indicate greater left–right asymmetry in muscle activation.

$$ASI=100-\left(\frac{Leg2}{Leg1}\times 100\right)$$

(1)
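Eq. (1) can be sketched in a few lines of Python. The helper names are illustrative, and returning 0 when both channels are silent is an assumption made here to avoid division by zero; the source does not address that case.

```python
def rms(samples):
    """Root mean square of a sequence of sEMG samples."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def asymmetry_index(rms_left, rms_right):
    """ASI = 100 - (Leg2 / Leg1) * 100, with Leg1 the larger RMS value."""
    leg1 = max(rms_left, rms_right)
    leg2 = min(rms_left, rms_right)
    if leg1 == 0:
        return 0.0  # assumption: two silent channels are treated as symmetric
    return 100.0 - (leg2 / leg1) * 100.0

# Hypothetical RMS amplitudes for the left and right TA (arbitrary units).
asi_ta = asymmetry_index(8.0, 10.0)
print(asi_ta)
```

Because Leg1 is always the larger value, the index is bounded between 0 and 100 and does not indicate which side is more active.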

The co-contraction index (CCI), which quantifies the degree of simultaneous activation between the agonist and antagonist muscles, was calculated using Eq. (2). The peak dynamic method, in which the sEMG amplitude of each channel during the task was divided by its maximum absolute value, was applied to normalize the sEMG signals⁴⁸. In Eq. (2), EMGago and EMGant represent the normalized values of the agonist and antagonist muscles at each discrete time point t, and n denotes the total number of samples within the analysis phase. In this study, the agonist–antagonist muscle pairs consisted of the rectus femoris and BF, and the TA and GAS muscles, with the CCI calculated separately for the left and right lower limbs in each phase. Greater CCI values indicate increased simultaneous activation of agonist–antagonist muscle pairs, reflecting a joint-stabilizing strategy, whereas smaller values indicate more efficient reciprocal muscle activation and selective motor control.

$$CCI=\frac{\sum_{t=1}^{n}\left|EMG_{ago}\left(t\right)\right|\cdot \left|EMG_{ant}\left(t\right)\right|}{\sum_{t=1}^{n}\left(\left|EMG_{ago}\left(t\right)\right|+\left|EMG_{ant}\left(t\right)\right|\right)}$$

(2)
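
Equation (2), combined with the peak dynamic normalization, can be sketched as follows. The function names and sample values are hypothetical, and the denominator is read here as the sum over all samples of both normalized amplitudes, which is an interpretation of the printed formula.

```python
from typing import List, Sequence


def peak_normalize(emg: Sequence[float]) -> List[float]:
    """Peak dynamic normalization: divide each sample by the channel's maximum absolute value."""
    peak = max(abs(v) for v in emg)
    if peak == 0:
        raise ValueError("signal has zero amplitude")
    return [v / peak for v in emg]


def co_contraction_index(agonist: Sequence[float], antagonist: Sequence[float]) -> float:
    """CCI per Eq. (2): sum of |ago|*|ant| divided by sum of (|ago| + |ant|)."""
    if len(agonist) != len(antagonist):
        raise ValueError("signals must have the same number of samples")
    ago = [abs(v) for v in peak_normalize(agonist)]
    ant = [abs(v) for v in peak_normalize(antagonist)]
    numerator = sum(a * b for a, b in zip(ago, ant))
    denominator = sum(a + b for a, b in zip(ago, ant))
    return numerator / denominator


# Hypothetical sEMG envelopes within one analysis phase (arbitrary units).
cci_example = co_contraction_index([1.0, 2.0, 4.0], [2.0, 2.0, 2.0])
cci_full = co_contraction_index([3.0, 3.0], [5.0, 5.0])  # both channels constantly at peak
```

Under this reading, two channels that both stay at their normalized maximum give a CCI of 0.5, which is the upper bound of the index.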

Whole-body COM trajectories were obtained using the standard Plug-in Gait full-body model, which is based on a conventional marker-based gait analysis framework⁴¹. In this model, COM is calculated as the mass-weighted average of all body segments based on subject-specific anthropometric data⁴². The COM Sway Area, which represents postural stability by quantifying the outer boundary of the COM trajectory during the STW task, was calculated using Eq. (3). This variable was defined as the sway area obtained using the Convex Hull method⁴⁹. The convex hull was constructed from the COM coordinates (x_com, y_com), and its enclosed area was computed. Here, (x_com,i, y_com,i) denotes the COM coordinates at time step i. Larger COM sway areas indicate reduced whole-body postural stability and increased body sway during the STW task, whereas smaller values reflect more stable and efficient control of the body mass.

$$\text{COM Sway Area}=\text{Area}\left(\text{ConvexHull}\left({\left\{\left({x}_{COM,i},{y}_{COM,i}\right)\right\}}_{i=1}^{N}\right)\right)$$

(3)
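
The convex hull area in Eq. (3) can be illustrated with a self-contained sketch. The source does not state which hull algorithm was used; Andrew's monotone chain followed by the shoelace formula is chosen here as an assumption, and the function names and coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1] - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def com_sway_area(x_com: Sequence[float], y_com: Sequence[float]) -> float:
    """Area enclosed by the convex hull of the COM trajectory, as in Eq. (3)."""
    return polygon_area(convex_hull(list(zip(x_com, y_com))))


# Hypothetical COM trajectory: corners of a 2 x 2 square plus one interior point.
area = com_sway_area([0.0, 2.0, 2.0, 0.0, 1.0], [0.0, 0.0, 2.0, 2.0, 1.0])
```

Interior points do not change the result, so the measure depends only on the outermost excursions of the COM.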

The displacement between the COP and the COM, used as an indicator of dynamic stability, was calculated using Eq. (4). This variable was defined as the mean absolute distance between the AP and mediolateral coordinates of the COP and COM⁵⁰. Here, COP(i) and COM(i) denote the AP and mediolateral coordinates of the COP and COM at time i. Greater COP–COM displacement indicates less efficient dynamic balance control and a larger separation between the control action (COP) and body mass (COM), whereas smaller values indicate more efficient weight transfer and stable dynamic control.

$$\text{COP–COM displacement}=\frac{1}{N}\sum_{i=1}^{N}\left|COP\left(i\right)-COM\left(i\right)\right|$$

(4)
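
A minimal sketch of Eq. (4) follows. It assumes that the equation is applied to one axis at a time (for example, the AP coordinates of the COP and COM); the function name and sample values are hypothetical.

```python
from typing import Sequence


def cop_com_displacement(cop: Sequence[float], com: Sequence[float]) -> float:
    """Mean absolute COP-COM distance along one axis, per Eq. (4)."""
    if len(cop) != len(com) or len(cop) == 0:
        raise ValueError("COP and COM series must be non-empty and of equal length")
    return sum(abs(p - m) for p, m in zip(cop, com)) / len(cop)


# Hypothetical AP coordinates (cm) sampled at four time points.
cop_ap = [1.0, 2.0, 4.0, 5.0]
com_ap = [0.0, 3.0, 2.0, 5.0]
ap_displacement = cop_com_displacement(cop_ap, com_ap)
```

The same function could be called again with mediolateral coordinates to obtain the corresponding mediolateral value.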

Entropy, used to quantify the complexity and irregularity of sEMG signals, was calculated as spectral entropy derived from the normalized power spectral density of the EMG signal, as defined in Eq. (5), following a frequency-domain entropy framework previously applied in EMG analysis⁵¹. In this formula, P_j denotes the power of the j-th frequency component in the full power spectrum, whereas p_i represents the probability of the i-th frequency component, obtained by normalizing its power P_i to the total power. Greater entropy values indicate increased complexity and irregularity of neuromuscular activation patterns, whereas smaller values indicate more regular, stereotyped, or constrained muscle activation.

$$\text{Entropy}=-\sum_{i=1}^{N}{p}_{i}\log\left({p}_{i}\right),\quad \text{where}\ {p}_{i}=\frac{{P}_{i}}{\sum_{j=1}^{N}{P}_{j}}$$

(5)
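
Equation (5) can be sketched as a function of an already computed power spectrum. The estimation of the power spectral density itself is not shown, the natural logarithm is an assumption (the source does not state the base), and the function name and spectra are hypothetical.

```python
import math
from typing import Sequence


def spectral_entropy(power: Sequence[float]) -> float:
    """Spectral entropy per Eq. (5), given the power of each frequency component.

    Assumption: natural logarithm; components with zero power contribute zero.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("total power must be positive")
    probabilities = [p / total for p in power]
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Hypothetical spectra: power spread evenly over four bins vs. concentrated in one bin.
entropy_flat = spectral_entropy([2.0, 2.0, 2.0, 2.0])
entropy_peaked = spectral_entropy([5.0, 0.0, 0.0, 0.0])
```

A flat spectrum over N bins gives the maximum value, log(N), whereas a spectrum concentrated in a single bin gives zero.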

Statistical analysis

Statistical analyses were performed using IBM SPSS Statistics version 21.0 (IBM Corp., Armonk, NY, USA). Descriptive statistics were calculated for all variables, including means, standard deviations, and 95% confidence intervals. Data normality was assessed using the Shapiro–Wilk test. For between-group comparisons of physical characteristics, an independent t-test was applied when normality was satisfied, whereas the Mann–Whitney U test was used when normality was violated. The significance level was set at p < 0.05.

ML-based feature selection was performed to evaluate the relative importance of STW task-based features and identify key predictors that distinguish people with early-stage PD from age-matched HC. Weighted feature importance analysis was conducted using the RF⁵² and XGBoost⁵³ algorithms. The weighting procedure incorporated each model’s classification accuracy to derive relative algorithm weights for variable ranking. This procedure was not used for direct optimization of classification performance. This approach was adopted to integrate importance estimates from two tree-based models that capture nonlinear relationships among the 200 STW-derived variables (Fig. 1a,b).

The relative weights (Wₖ) were derived from the five-fold cross-validation accuracy of each algorithm (see Supplementary Table S6 online) using Eq. (6)⁵⁴. This cross-validation procedure was applied solely to estimate algorithm-level performance for weight derivation and was not used for feature selection at the fold level. In this equation, K is the number of algorithms used, and Accuracyₖ denotes the classification accuracy of algorithm k (RF or XGBoost). Each weight was calculated as the ratio of an individual model’s accuracy to the sum of accuracies across models, so that the weights sum to 1. Models with greater cross-validation accuracy therefore receive proportionally greater weights in the final importance calculation.

$${W}_{k}=\frac{{\text{Accuracy}}_{k}}{\sum_{m=1}^{K}{\text{Accuracy}}_{m}},\quad \sum_{k=1}^{K}{W}_{k}=1$$

(6)

The resulting weights were applied to each algorithm’s variable importance values via Eq. (7) to obtain the weighted importance. In this equation, FI_k,i denotes the importance score of variable i for algorithm k, and Wₖ is as defined in Eq. (6). The final weighted importance for each variable (WFI_i) was computed by summing the weighted importance values from both models.

$${WFI}_{i}=\sum_{k=1}^{K}{W}_{k}\cdot {FI}_{k,i}$$

(7)

Through this process, an integrated variable importance score reflecting the relative performance of the two algorithms was obtained. Based on the final weighted importance scores derived from the full dataset, variables with scores ≥ 0.01 were retained as predictors with significant contributions, resulting in 26 selected variables⁵⁵. To further identify the core predictors among these 26 variables, binary logistic regression analysis was performed with age, sex, height, and body mass index included as covariates. Subsequent classification analyses were conducted independently of the feature-ranking procedure to evaluate the discriminative performance of the selected variables.
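
The weighting in Eqs. (6) and (7) and the 0.01 retention threshold can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the accuracies, the feature names, and the importance scores are all hypothetical and are not results from this study.

```python
from typing import Dict, List


def algorithm_weights(accuracies: Dict[str, float]) -> Dict[str, float]:
    """Eq. (6): each algorithm's accuracy divided by the sum of all accuracies."""
    total = sum(accuracies.values())
    return {name: acc / total for name, acc in accuracies.items()}


def weighted_feature_importance(
    weights: Dict[str, float], importances: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Eq. (7): sum over algorithms of weight times per-algorithm importance."""
    features = next(iter(importances.values())).keys()
    return {
        feature: sum(weights[algo] * importances[algo][feature] for algo in weights)
        for feature in features
    }


def retain_features(wfi: Dict[str, float], threshold: float = 0.01) -> List[str]:
    """Keep variables whose weighted importance is at or above the threshold."""
    kept = [feature for feature, score in wfi.items() if score >= threshold]
    return sorted(kept, key=lambda feature: wfi[feature], reverse=True)


# Hypothetical cross-validation accuracies and importance scores.
accuracies = {"RF": 0.84, "XGBoost": 0.76}
importances = {
    "RF": {"feature_a": 0.30, "feature_b": 0.004},
    "XGBoost": {"feature_a": 0.20, "feature_b": 0.006},
}

weights = algorithm_weights(accuracies)
wfi = weighted_feature_importance(weights, importances)
selected = retain_features(wfi)
```

With accuracies this close, the weights stay near 0.5 each, so the combined score behaves much like a simple average of the two importance rankings.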

To evaluate the classification performance between people with early-stage PD and age-matched HC, seven ML classifiers were applied: Logistic Regression⁵⁶, K-Nearest Neighbor⁵⁷, Naïve Bayes (NB)⁵⁸, Linear Discriminant Analysis⁵⁹, Quadratic Discriminant Analysis⁵⁹, Support Vector Machine⁶⁰, and RF. Cut-off values for the gait-based predictors selected for group classification were determined using receiver operating characteristic (ROC) curves, and the area under the ROC curve (AUC) was calculated to assess the discriminative ability of each variable. Hyperparameters for all models were optimized using Grid Search (see Supplementary Table S7 online). Model performance—accuracy, precision, recall, and F1 score—was evaluated using five-fold cross-validation, in which the dataset was randomly partitioned into five subsets; four were used for training, and one for testing, and this procedure was repeated five times (Fig. 6). Each metric was computed based on the number of true positives, true negatives (TN), false positives (FP), and false negatives (FN), defined as follows:

True positives: correctly predicted positive cases (PD correctly classified)
TN: correctly predicted negative cases (healthy controls correctly classified)
FP: negative cases incorrectly predicted as positive
FN: positive cases incorrectly predicted as negative

The performance metrics were calculated using the following equations:

$$\text{Accuracy}=\frac{True\,positives + TN}{True\,positives + TN + FP + FN}$$

(8)

$$\text{Precision}=\frac{True\,positives}{True\,positives + FP}$$

(9)

$$\text{Recall}=\frac{True\,positives}{True\,positives + FN}$$

(10)

$$\text{F1 score}=2\times \frac{\text{Precision}\times \text{Recall}}{\text{Precision}+\text{Recall}}$$

(11)

Accuracy quantified the overall proportion of correct predictions across both classes. Precision reflected the proportion of true positive predictions among all positive predictions made by the model, whereas recall reflected the proportion of true positives among all actual positive cases. The F1 score, the harmonic mean of precision and recall, was particularly informative in handling class imbalance, as it penalized extreme disparities between the two⁶¹. The final performance metrics were reported as the mean values across all five folds. All ML analyses were performed using Python (version 3.10; Python Software Foundation).
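
Equations (8) to (11) and the fold-wise averaging can be sketched as follows. The function names and confusion-matrix counts are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention that the source does not specify.

```python
from typing import Dict, Sequence, Tuple


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts (Eqs. 8-11)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Assumption: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def mean_over_folds(fold_counts: Sequence[Tuple[int, int, int, int]]) -> Dict[str, float]:
    """Average each metric across folds; each tuple is (tp, tn, fp, fn) for one test fold."""
    per_fold = [classification_metrics(*counts) for counts in fold_counts]
    return {key: sum(m[key] for m in per_fold) / len(per_fold) for key in per_fold[0]}


# Hypothetical counts for two test folds (not study results).
folds = [(8, 8, 2, 2), (9, 7, 3, 1)]
mean_metrics = mean_over_folds(folds)
```

Averaging per-fold metrics, as described above, can differ slightly from computing the metrics once on counts pooled over all folds.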

Data availability

The datasets that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

The source code used in this study is not publicly available. All machine learning analyses were performed in Python 3.10 using scikit-learn 1.5.2. Motion capture and EMG preprocessing, including event detection and biomechanical variable computation, were conducted using MATLAB R2023b. The code used for completing the analysis may be requested by contacting the corresponding author.

Abbreviations

PD: Parkinson’s disease
ROM: Range of motion
AI: Artificial intelligence
ML: Machine learning
DL: Deep learning
TUG: Timed up and go
STW: Sit-to-walk
STS: Sit-to-stand
GI: Gait initiation
COP: Center of pressure
COM: Center of mass
RF: Random forest
XGBoost: Extreme gradient boosting
sEMG: Surface electromyography
BF: Biceps femoris
HC: Healthy controls
AUC: Area under the ROC curve
OR: Odds ratio
CI: Confidence interval
AP: Anterior–posterior
T10: The 10th thoracic vertebra
C7: The 7th cervical vertebra
RMS: Root mean square
ASI: Asymmetry index
TA: Tibialis anterior
GAS: Gastrocnemius
NB: Naïve Bayes
MDS: Movement Disorder Society
GRF: Ground reaction force
E1–5: Event 1–5
P1–2: Phase 1–2
CCI: Co-contraction index
LR: Logistic regression
ROC: Receiver operating characteristic
MedLat: Mediolateral
TP: Total phase
iEMG: Integrated surface electromyography value

References

1. Kalia, L. V. & Lang, A. E. Parkinson’s disease. Lancet 386, 896–912 (2015).
2. Tolosa, E., Garrido, A., Scholz, S. W. & Poewe, W. Challenges in the diagnosis of Parkinson’s disease. Lancet Neurol. 20, 385–397 (2021).
3. Health Insurance Review & Assessment Service. Status of medical care by disease (OLAP). Open Healthcare Big Data System https://opendata.hira.or.kr/op/opc/olapMfrnIntrsIlnsInfoTab1.do (2023).
4. Parkinson’s Foundation. Statistics. https://www.parkinson.org/understanding-parkinsons/statistics (2022).
5. Zhu, J. et al. Temporal trends in the prevalence of Parkinson’s disease from 1980 to 2023: a systematic review and meta-analysis. Lancet Healthy Longev. 5, e464–e479 (2024).
6. Lee, J. Y. et al. Multimodal brain and retinal imaging of dopaminergic degeneration in Parkinson disease. Nat. Rev. Neurol. 18, 203–220 (2022).
7. Li, H., Zecca, M. & Huang, J. Evaluating the utility of wearable sensors for the early diagnosis of Parkinson disease: Systematic review. J. Med. Internet Res. 27, e69422 (2025).
8. Heinzel, S. et al. Update of the MDS research criteria for prodromal Parkinson’s disease. Mov. Disord. 34, 1464–1470 (2019).
9. Haehner, A., Hummel, T. & Reichmann, H. A clinical approach towards smell loss in Parkinson’s disease. J. Parkinsons Dis. 4, 189–195 (2014).
10. Maggi, G., Vitale, C., Cerciello, F. & Santangelo, G. Sleep and wakefulness disturbances in Parkinson’s disease: A meta-analysis on prevalence and clinical aspects of REM sleep behavior disorder, excessive daytime sleepiness and insomnia. Sleep Med. Rev. 68, 101759 (2023).
11. Moscovich, M. et al. How specific are non-motor symptoms in the prodrome of Parkinson’s disease compared to other movement disorders? Parkinsonism Relat. Disord. 81, 213–218 (2020).
12. Guo, Y., Yang, J., Liu, Y., Chen, X. & Yang, G. Z. Detection and assessment of Parkinson’s disease based on gait analysis: A survey. Front. Aging Neurosci. 14, 916971 (2022).
13. Adler, C. H. et al. Clinical diagnostic accuracy of early/advanced Parkinson disease: an updated clinicopathologic study. Neurol. Clin. Pract. 11, e414–e421 (2021).
14. Beach, T. G. & Adler, C. H. Importance of low diagnostic accuracy for early Parkinson’s disease. Mov. Disord. 33, 1551–1554 (2018).
15. Valliani, A. A. A., Ranti, D. & Oermann, E. K. Deep learning and neurology: A systematic review. Neurol. Ther. 8, 351–365 (2019).
16. Ferreira, M. I. A. S. N., Barbieri, F. A., Moreno, V. C., Penedo, T. & Tavares, J. M. R. Machine learning models for Parkinson’s disease detection and stage classification based on spatial-temporal gait parameters. Gait Posture 98, 49–55 (2022).
17. Hwang, J. et al. Machine learning for early detection and severity classification in people with Parkinson’s disease. Sci. Rep. 15, 234 (2025).
18. Lin, S. et al. Wearable sensor-based gait analysis to discriminate early Parkinson’s disease from essential tremor. J. Neurol. 270, 2283–2301 (2023).
19. Choi, H. et al. Convolutional neural network based detection of early stage Parkinson’s disease using the six minute walk test. Sci. Rep. 14, 22648 (2024).
20. Yang, N. et al. Motor symptoms of Parkinson’s disease: Critical markers for early AI-assisted diagnosis. Front. Aging Neurosci. 17, 1602426 (2025).
21. Horak, F. B. & Mancini, M. Objective biomarkers of balance and gait for Parkinson’s disease using body‐worn sensors. Mov. Disord. 28, 1544–1551 (2013).
22. Buckley, T. A., Pitsikoulis, C. & Hass, C. J. Dynamic postural stability during sit‐to‐walk transitions in Parkinson disease patients. Mov. Disord. 23, 1274–1280 (2008).
23. Buckley, T., Pitsikoulis, C., Barthelemy, E. & Hass, C. J. Age impairs sit-to-walk motor performance. J. Biomech. 42, 2318–2322 (2009).
24. Perera, C. K., Gopalai, A. A., Gouwanda, D., Ahmad, S. A. & Salim, M. S. B. Sit-to-walk strategy classification in healthy adults using hip and knee joint angles at gait initiation. Sci. Rep. 13, 16640 (2023).
25. Magnan, A., McFadyen, B. J. & St-Vincent, G. Modification of the sit-to-stand task with the addition of gait initiation. Gait Posture 4, 232–241 (1996).
26. Hernandez, L. F., Obeso, I., Costa, R. M., Redgrave, P. & Obeso, J. A. Dopaminergic vulnerability in Parkinson disease: The cost of humans’ habitual performance. Trends Neurosci. 42, 375–383 (2019).
27. Redgrave, P. et al. Goal-directed and habitual control in the basal ganglia: implications for Parkinson’s disease. Nat. Rev. Neurosci. 11, 760–772 (2010).
28. Gan, J. et al. Evolution characteristics of dynamic balance disorder over the course of PD and relationship with dopamine depletion. Front. Aging Neurosci. 14, 1075572 (2023).
29. Palmisano, C. et al. Dynamic evaluation of spine kinematics in individuals with Parkinson’s disease and freezing of gait. Gait Posture 108, 199–207 (2024).
30. Sveinbjornsdottir, S. The clinical symptoms of Parkinson’s disease. J. Neurochem. 139(Suppl 1), 318–324 (2016).
31. Palmerini, L., Mellone, S., Avanzolini, G., Valzania, F. & Chiari, L. Quantification of motor impairment in Parkinson’s disease using an instrumented timed up and go test. IEEE Trans. Neural Syst. Rehabil. Eng. 21, 664–673 (2013).
32. Zampieri, C. et al. The instrumented timed up and go test: potential outcome measure for disease modifying therapies in Parkinson’s disease. J. Neurol. Neurosurg. Psychiatry 81, 171–176 (2010).
33. Palmisano, C. et al. Sit-to-walk performance in Parkinson’s disease: A comparison between faller and non-faller patients. Clin. Biomech. (Bristol) 63, 140–146 (2019).
34. Keloth, S. M., Arjunan, S. P., Raghav, S. & Kumar, D. K. Muscle activation strategies of people with early-stage Parkinson’s during walking. J. Neuroeng. Rehabil. 18, 133 (2021).
35. Barrett, M. J., Wylie, S. A., Harrison, M. B. & Wooten, G. F. Handedness and motor symptom asymmetry in Parkinson’s disease. J. Neurol. Neurosurg. Psychiatry 82, 1122–1124 (2011).
36. Modestino, E. J., Amenechi, C., Reinhofer, A. & O’Toole, P. Side‐of‐onset of Parkinson’s disease in relation to neuropsychological measures. Brain Behav. 7, e00590 (2017).
37. Hughes, A. J., Daniel, S. E. & Lees, A. J. Improved accuracy of clinical diagnosis of Lewy body Parkinson’s disease. Neurology 57, 1497–1499 (2001).
38. Folstein, M. F., Folstein, S. E. & McHugh, P. R. “Mini-Mental State”: A practical method for grading the cognitive state of patients for the clinician. J. Psychiatr. Res. 12, 189–198 (1975).
39. Fonseca, M. et al. Evaluation of lower limb and pelvic marker placement precision among different evaluators and its impact on gait kinematics computed with the Conventional Gait Model. Gait Posture 104, 22–30 (2023).
40. Hermens, H. J., Freriks, B., Disselhorst-Klug, C. & Rau, G. Development of recommendations for SEMG sensors and sensor placement procedures. J. Electromyogr. Kinesiol. 10, 361–374 (2000).
41. Davis, R. B. III., Ounpuu, S., Tyburski, D. & Gage, J. R. A gait analysis data collection and reduction technique. Hum. Mov. Sci. 10, 575–587 (1991).
42. Winter, D. A. Biomechanics and Motor Control of Human Movement (Wiley, New York, 2009).
43. Pradon, D., Tong, L., Chalitsios, C. & Roche, N. Development of surface EMG for gait analysis and rehabilitation of hemiparetic patients. Sensors (Basel, Switzerland) 24, 5954 (2024).
44. Youm, C. et al. Impact of trunk resistance and stretching exercise on fall-related factors in patients with parkinson’s disease: A randomized controlled pilot study. Sensors (Basel, Switzerland) 20, 4106 (2020).
45. Bishop, M., Brunt, D., Pathare, N., Ko, M. & Marjama-Lyons, J. Changes in distal muscle timing may contribute to slowness during sit to stand in Parkinsons disease. Clin. Biomech. (Bristol) 20, 112–117 (2005).
46. Park, H., Youm, C., Son, M., Lee, M. & Kim, J. Effects of freezing of gait on spatiotemporal variables, ground reaction forces, and joint moments during sit-to-walk task in Parkinson’s disease. Korean J. Appl. Biomech. 28, 19–27 (2018).
47. Bailey, C. A. et al. Electromyographical gait characteristics in Parkinson’s disease: Effects of combined physical therapy and rhythmic auditory stimulation. Front. Neurol. 9, 211 (2018).
48. Knarr, B. A., Zeni, J. A. Jr. & Higginson, J. S. Comparison of electromyography and joint moment as indicators of co-contraction. J. Electromyogr. Kinesiol. 22, 607–611 (2012).
49. Wollseifen, T. Different methods of calculating body sway area. Pharm. Program 4, 91–106 (2011).
50. Hof, A. L., Gazendam, M. G. J. & Sinke, W. E. The condition for dynamic stability. J. Biomech. 38, 1–8 (2005).
51. Trybek, P., Nowakowski, M., Salowka, J. & Machura, L. The distribution of information for sEMG signals in the rectal cancer treatment process. Biosystems 176, 1316 (2019).
52. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
53. Chen, T. & Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (ACM, 2016).
54. Fisher, A., Rudin, C. & Dominici, F. All models are wrong, but many are useful: Learning a variable’s importance by studying an entire class of prediction models simultaneously. J. Mach. Learn. Res. 20, 177 (2019).
55. Du, Z., Wang, D. & Li, Y. Comprehensive evaluation and comparison of machine learning methods in QSAR modeling of antioxidant tripeptides. ACS Omega 7, 25760–25771 (2022).
56. Fan, R. E., Chang, K. W., Hsieh, C. J., Wang, X. R. & Lin, C. J. LIBLINEAR: A library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008).
57. Fix, E. & Hodges, J. L. Discriminatory analysis. Nonparametric discrimination: consistency properties. Int. Stat. Rev./Revue Internationale de Statistique 57, 238 (1989).
58. Zhang, H. Exploring conditions for the optimality of naive Bayes. Int. J. Pattern Recogn. Artif. Intell. 19, 183–198 (2005).
59. Hastie, T., Tibshirani, R. & Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction Vol. 2, 1–758 (Springer, 2009).
60. Bishop, C. M. & Nasrabadi, N. M. Pattern Recognition and Machine Learning (Vol. 4, No. 4, p. 738) (Springer, 2006).
61. Saito, T. & Rehmsmeier, M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 10, e0118432 (2015).

Acknowledgements

The authors thank all participants in this study. The authors also thank Editage (www.editage.co.kr) for English language editing.

Funding

This work was supported by the National Research Foundation of Korea (NRF) funded by the Korean government (MSIT) (grant No. 2022R1A2C100933711; Changhong Youm), the Basic Science Research Program through the NRF funded by the Ministry of Education (grant No. 2022R1A6A3A0108756411; Hwayoung Park), and the Ministry of Education of the Republic of Korea and NRF (grant No. 2024S1A5B5A16021673; Hwayoung Park). The funders had no role in the study design, data collection and analysis, decision to publish, or manuscript preparation.

Author information

Authors and Affiliations

Department of Health Sciences, The Graduate School of Dong-A University, 37 Nakdong-Daero 550 beon-gil, Saha-gu, Busan, 49315, Republic of Korea
Minsoo Kim, Changhong Youm & Juseon Hwang
Biomechanics Laboratory, Dong-A University, Saha-gu, Busan, Republic of Korea
Changhong Youm, Hwayoung Park, Bohyeon Kim & Hyejin Choi
Department of Neurology, School of Medicine, Dong-A University, Seo-gu, Busan, Republic of Korea
Sang-Myung Cheon

Contributions

MK, CY, and HP conceived and designed the study. MK, HP, BK, HC, JH, and SC recruited the participants. MK, CY, HP, BK, HC, JH, and SC performed the data acquisition. MK, CY, and HP analyzed and interpreted the data. MK and CY drafted the article. All authors read and approved the final version of the manuscript submitted.

Corresponding author

Correspondence to Changhong Youm.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval

All procedures involving human participants performed in this study were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The study and its supplementary information files were approved by the Institutional Review Board of Dong-A University Hospital (IRB number: DAUHIRB-22-089) (see ethics approval letter). The study was also registered with the Clinical Research Information Service of the Republic of Korea (KCT0009353).

Consent to participate

All patients provided written informed consent before data collection.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (DOCX)

Supplementary Material 2 (XLSX)

Supplementary Material 3 (XLSX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kim, M., Youm, C., Park, H. et al. Machine learning classification of early-stage Parkinson’s disease using sit-to-walk biomechanical features. Sci Rep 16, 10559 (2026). https://doi.org/10.1038/s41598-026-45122-y

Received: 26 December 2025
Accepted: 17 March 2026
Published: 30 March 2026
Version of record: 31 March 2026
DOI: https://doi.org/10.1038/s41598-026-45122-y