Introduction

Parkinson’s disease (PD) is a progressive neurodegenerative disorder characterized by gradual loss of dopamine-producing neurons in the substantia nigra pars compacta, leading to impaired motor control and gait dysfunction1,2. PD affects approximately 12 million people worldwide, and approximately 140,000 individuals in South Korea3,4, and its incidence is increasing with population aging5. PD is irreversible and frequently underdiagnosed in its early stages, when mild symptoms are often misattributed to aging. By the time PD is clinically diagnosed, motor symptoms are typically pronounced, and over 50% of the dopaminergic neurons have already degenerated2,6. Therefore, detecting PD before overt motor symptoms is crucial to maximize therapeutic opportunities and enable personalized treatment strategies that may improve quality of life2,7.

Notably, several studies on biomarkers for early diagnosis and prediction of PD report that non-motor symptoms often precede motor symptoms2,8. Of these, reduced olfactory function is observed in about 90% of people with PD9, and sleep disturbances are reported in approximately 46% of cases10. However, these non-motor symptoms are also common in normal aging and in early Alzheimer’s disease, limiting their diagnostic accuracy and objectivity for identifying people with early-stage PD11. Early-stage motor symptoms include unilateral rigidity or tremor, bradykinesia, shuffling gait, reduced arm swing, stooped posture, increased gait variability, asymmetry in limb movement, and reduced range of motion (ROM) of the ankle, knee, and hip joints, reflecting movement-related, task-specific kinematic limitations observed during gait and postural transition tasks2,12. Nevertheless, diagnostic accuracy based on motor symptoms alone remains variable (27.8–88.3%)2,13,14. To address these limitations, studies using artificial-intelligence (AI)–based machine learning (ML) and deep learning (DL) have emerged15, reporting accuracies of 84.6% and 67.3% during forward-walking tasks16,17, 75.8% on the Timed-Up-and-Go (TUG) test18, and 83.5% on the 6-min walk test, which includes turning and forward walking19. These ML/DL approaches could detect subtle gait changes20 and are increasingly applied to improve early diagnosis and predict symptom progression in PD.

Notably, many people with PD experience difficulties with complex tasks requiring movement initiation and postural transitions in daily life, largely due to gait instability, muscle weakness, and increased rigidity21. Of these, the sit-to-walk (STW) task is a critical daily activity that integrates sit-to-stand (STS) and gait initiation (GI) into a single transition and requires both locomotion and postural control22. The ability to perform STW declines with age, with reported reductions in center-of-pressure (COP)–center-of-mass (COM) distance, step length, and walking speed23, alongside a temporal separation between rising from a chair and initiating walking24. People with PD exhibit similar characteristics during STW, including separation of standing up and walking initiation, longer task duration, reduced vertical COM velocity, and decreased step length and velocity22. STW has been proposed as a useful assessment tool for PD; however, research remains limited, particularly regarding AI-based approaches to detect PD or classify symptom severity.

Therefore, we aimed to identify key biomechanical biomarkers, derived from kinematic, kinetic, and neuromuscular variables during the STW task, that could classify people with early-stage PD using ML algorithms and to evaluate the classification performance of the developed models. In addition, we aimed to verify whether an ML-based algorithm using the selected biomarkers could accurately classify people with early-stage PD. Furthermore, we hypothesized that biomechanical variables obtained from the STW task could serve as predictive biomarkers for early-stage PD classification.

Results

Feature selection for early-stage PD group classification

Variable selection for classifying early-stage PD was performed using Random Forest (RF) and Extreme Gradient Boosting (XGBoost) based on weighted importance analysis. From the 200 STW variables, 26 with a weighted importance ≥ 0.01 were selected (see Supplementary Table S1 online, Fig. 1c). The variables with the greatest importance were, in descending order: mediolateral COP–COM displacement in Phase 1 (P1_MedLat COP–COM), the average COM speed across the entire STW phase (TP_COM Speed), the integrated surface electromyography (sEMG) value of the left biceps femoris (BF) in Phase 1 (P1_Left_BF_iEMG), the peak COM speed in Phase 2 (P2_COM Pspeed), and the COM Sway Area in Phase 1 (P1_COM_Sway_Area).
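The thresholding step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the text does not report how the RF and XGBoost importances were weighted, so the equal model weights, the normalization of each model's importances to sum to 1, the function names, and the numeric values below are all assumptions made for illustration. Only the 0.01 cut-off comes from the text.

```python
def normalize(importances):
    """Scale a {feature: importance} mapping so its values sum to 1."""
    total = sum(importances.values())
    return {name: value / total for name, value in importances.items()}


def weighted_importance(rf_imp, xgb_imp, rf_weight=0.5, xgb_weight=0.5):
    """Combine two importance mappings into one weighted score per feature.

    The 0.5/0.5 weights are placeholders (assumption); the source does not
    report the weighting scheme.
    """
    rf_norm, xgb_norm = normalize(rf_imp), normalize(xgb_imp)
    features = set(rf_norm) | set(xgb_norm)
    return {
        f: rf_weight * rf_norm.get(f, 0.0) + xgb_weight * xgb_norm.get(f, 0.0)
        for f in features
    }


def select_features(scores, threshold=0.01):
    """Return (feature, score) pairs meeting the threshold, highest first."""
    kept = [(f, s) for f, s in scores.items() if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up importances for illustration only (not the study's values).
rf = {"TP_COM_Speed": 0.30, "P1_MedLat_COP_COM": 0.50,
      "noise_var": 0.004, "other": 0.196}
xgb = {"TP_COM_Speed": 0.40, "P1_MedLat_COP_COM": 0.45,
       "noise_var": 0.006, "other": 0.144}
selected = select_features(weighted_importance(rf, xgb))
```

In this toy input, the hypothetical `noise_var` feature falls below the 0.01 threshold and is dropped, while the remaining features are returned in descending order of combined score.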

Fig. 1

Feature importance and weighted feature importance for early-stage Parkinson’s disease classification. (a) RF: Random forest; TP: Total phase; COM: Center of mass; P1: Phase 1; MedLat: Mediolateral; COP: Center of pressure; P2: Phase 2; Pspeed: Peak speed; BF: Biceps femoris; iEMG: Integrated electromyography; TCH: Toe clearance height; AP: Anterior–posterior; disp: Displacement; RMS: Root mean square; PGRF: Peak ground reaction force; STS_P: Sit-to-stand phase; TA: Tibialis anterior; PAmp: Peak amplitude; GAS: Gastrocnemius; (b) XGBoost: Extreme Gradient Boosting; PAmp: Peak amplitude; ROM: Range of motion; POW: Power; ASI: Asymmetry index; Freq: Frequency; T10: the 10th thoracic vertebra; For: Forward; SL: Step length; ACC: Acceleration.

A stepwise binary logistic regression analysis was performed to classify early-stage PD and healthy controls (HC) based on the 26 selected variables. Three variables were retained in the final model: TP_COM Speed (area under curve [AUC] = 0.865, odds ratio [OR] 50.562, 95% confidence interval [CI]: 8.008–319.257, p < 0.001), P2_anteroposterior (AP) COP–COM (AUC = 0.701, OR 3.412, 95% CI: 1.515–7.684, p = 0.003), and P2_10th vertebral level (T10) For_ROM (AUC = 0.535, OR 0.369, 95% CI: 0.159–0.860, p = 0.021). The model’s total explanatory power was 71.8% (Table 1, Fig. 2).
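The relationship between a logistic regression coefficient, its odds ratio, and the confidence interval can be sketched as follows. This assumes a Wald-type interval, OR = exp(β) with limits exp(β ± 1.96·SE), which the text does not state explicitly. The reported interval for TP_COM Speed is symmetric around the OR on the log scale to within rounding, which is consistent with that assumption. The function names are illustrative.

```python
import math


def beta_and_se_from_or(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the coefficient and its standard error from an OR and its CI.

    Assumes a Wald interval: CI = exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def or_with_ci(beta, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for a coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Values reported for TP_COM Speed in the text.
beta, se = beta_and_se_from_or(50.562, 8.008, 319.257)
odds_ratio, low, high = or_with_ci(beta, se)
```

Under this assumption the standard error of the TP_COM Speed coefficient is roughly 0.94, which explains the wide interval: a modest uncertainty on the log scale becomes a large range once exponentiated.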

Table 1 Stepwise binary logistic regression results for the 26 weighted-importance STW features.
Fig. 2

Receiver operating characteristic curves for the three key features. TP_COM Speed: total phase center of mass speed; P2_AP COP-COM: Phase 2 anteroposterior center of pressure center of mass; P2_T10_For_ROM: Phase 2 forward range of motion of the 10th thoracic vertebra.

Classification accuracy and performance metrics for groups of early-stage PD

We analyzed three datasets to evaluate the classification accuracy between early-stage PD and HC: (1) all 200 variables collected during the STW task, (2) the top 26 variables selected through weighted importance analysis using the RF and XGBoost algorithms, and (3) the final three variables selected from the 26 variables using stepwise binary logistic regression. When using all 200 variables, the RF algorithm achieved an accuracy of 87.3 ± 3.2%. When using the top 26 variables, the accuracy increased to 92.1 ± 2.8%, whereas using only the final three variables yielded an accuracy of 84.9 ± 5.3% (see Supplementary Table S2 online, Fig. 3a). Performance metrics from the RF algorithm showed precision (88.4 ± 3.4%), recall (87.3 ± 3.2%), and F1 score (87.2 ± 3.2%) when using all 200 variables; precision (92.4 ± 2.9%), recall (92.1 ± 2.8%), and F1 score (92.0 ± 2.8%) when using 26 variables; and precision (85.8 ± 6.1%), recall (84.9 ± 5.3%), and F1 score (84.8 ± 5.2%) when using three variables (see Supplementary Table S2 online, Fig. 3b).
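For readers reproducing these metrics, the sketch below computes accuracy together with support-weighted precision, recall, and F1 from a confusion matrix. The weighted averaging is an assumption: the text does not name the averaging scheme, but the reported recall equals the reported accuracy in all three feature sets, and support-weighted recall is mathematically identical to accuracy. The confusion matrix is hypothetical and does not come from the study.

```python
def weighted_metrics(confusion):
    """Compute accuracy and support-weighted precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i that were
    predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n_classes)) / total
    precision = recall = f1 = 0.0
    for k in range(n_classes):
        tp = confusion[k][k]
        support = sum(confusion[k])                      # true members of class k
        predicted = sum(confusion[i][k] for i in range(n_classes))
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                         # class share of the data
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test-fold confusion matrix (rows: true PD, true HC).
example = [[17, 2],
           [1, 12]]
metrics = weighted_metrics(example)
```

Because each class's recall is multiplied by its share of the data, the weighted recall reduces to the number of correct predictions divided by the total, that is, the accuracy.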

Fig. 3

Accuracies of the seven classifiers and confusion matrices for the three feature sets. PD: Parkinson’s disease; HC: Healthy controls; LR: Logistic regression; KNN: K-nearest neighbors; NB: Naïve Bayes; LDA: Linear discriminant analysis; QDA: Quadratic discriminant analysis; SVM: Support vector machine; RF: Random Forest.

Discussion

PD is often difficult to distinguish from age-related motor decline as initial motor symptoms are subtle and task-dependent, possibly leading to delayed or missed diagnosis. People with PD experience difficulties during movement transitions, and previous studies have demonstrated a dissociation between the STS and GI components during the STW task. Despite the functional relevance of STW and its sensitivity to transitional motor deficits, quantitative biomechanical investigations of STW specifically targeting people with early-stage PD remain limited. In this study, we investigated key biomechanical biomarkers capable of classifying people with early-stage PD using features derived from the STW task, including kinematic, kinetic, and muscle activation measures, and evaluated the classification accuracy of an ML-based algorithm using these biomarkers. Three STW-based variables were selected as the most relevant indicators for distinguishing people with early-stage PD from HC: (1) the mean COM speed across the entire STW task (AUC = 0.865), (2) the AP displacement between the COP and COM during Phase 2, from right heel-off to left heel-strike (AUC = 0.701), and (3) forward ROM at T10 during Phase 2 (AUC = 0.535) (Table 1, Fig. 2). The explanatory power of the model was 71.8%. In addition, the RF algorithm demonstrated the greatest classification accuracy. When all 200 variables were used, accuracy reached 87.3%; when the top 26 weighted importance variables were used, accuracy increased to 92.1%; and when only the final three selected biomarkers were used, accuracy remained robust at 84.9%.

Our findings demonstrate that people with early-stage PD show a slower mean COM speed across the entire STW task, a shorter AP displacement between the COP and COM during Phase 2, and a reduced forward ROM at T10, compared with HC. Notably, the shortened COP–COM displacement observed during the STW task may reflect a deviation from the typical forward progression pattern from sitting to walking and may serve as a compensatory strategy to preserve postural stability during movement execution22,24,25. Previous research has reported that compensatory strategies used to maintain dynamic stability are characterized by reduced AP linear momentum and velocity22,25, longer task execution time, greater braking impulse, shortened COP–COM displacement, and increased hip and knee ROM and extension torque24. Consistent with these findings, people with early-stage PD in the present study exhibited a slower mean COM speed across the entire STW task and a shorter AP COP–COM displacement during Phase 1 (from the onset of the 7th cervical vertebra (C7) movement to right heel-off), relative to the HC. These results suggest that people with early-stage PD tend to stabilize posture first before initiating forward progression, rather than transitioning immediately into gait after standing.

These compensatory strategies are thought to arise from dysfunction within the basal ganglia circuitry resulting from reduced dopaminergic neurotransmission, which leads to impaired movement planning and a diminished capacity to maintain automatic motor patterns during gait, postural transitions, and balance control26,27. Accordingly, the present findings suggest that neurophysiological alterations in people with early-stage PD directly contribute to impaired trunk stability and deficits in movement-transition control.

Gan et al.28 reported that balance and postural instability commonly observed in PD are associated with reduced trunk ROM during gait, which may represent a compensatory strategy in the early stages of the disease. Palmisano et al.29 further emphasized the importance of assessing spinal kinematics under dynamic conditions and demonstrated altered trunk movement patterns in people with PD who experience freezing of gait, characterized by increased thoracic forward tilt and reduced overall trunk ROM. Similar observations have also been reported during the STW task. Sveinbjornsdottir30 described marked asymmetry and axial impairment in people with early-stage PD, likely attributable to unilateral motor symptoms. In addition, Palmerini et al.31 linked rigidity and bradykinesia to reduced trunk sway in both the AP and vertical directions. Consistent with these findings, the present study demonstrated that people with early-stage PD exhibited significantly reduced forward ROM at the T10 segment during Phase 2 of the STW task compared with HC.

These findings are consistent with those of Zampieri et al.32, who reported reduced trunk mobility during walking and STS tasks in people with PD. This reduction has been interpreted as a compensatory strategy in which increased trunk muscle tension and axial rigidity constrain trunk ROM to preserve postural stability. Accordingly, the reduced trunk mobility observed in people with early-stage PD may represent a biomechanical adaptive response for maintaining balance control and postural stability.

Previous studies including people with PD across mild to severe disease stages have reported prolonged STW task duration, reduced vertical COM velocity, and decreased step length and step velocity22. In addition, Palmisano et al.33 reported that people with PD and a history of falls exhibited longer STW completion times and shorter COP–COM displacement than those without a history of falls.

These findings support the relevance of the three variables identified in the present study—TP_COM_Speed, P2_AP_COP-COM, and P2_T10_For_ROM—as potential indicators for detecting people with early-stage PD. Specifically, these markers appear to capture subtle impairments in postural stability and gait-transition control that may not be evident during routine clinical assessment. Moreover, the reduced COM speed, shortened AP COP–COM displacement, and limited T10 ROM observed in this study suggest that people with early-stage PD may already be exposed to risk of falls associated with early biomechanical deterioration. Accordingly, small but measurable alterations in trunk kinematics and COM dynamics during the STW task may serve as clinically meaningful indicators for early PD screening and fall risk prediction.

In this study, the dominant side of all participants with early-stage PD was standardized to the right side. Examination of the clinically affected side revealed that 49.2%, 36.5%, and 14.3% presented with left-sided, right-sided, and bilateral involvement, respectively. In addition, people with early-stage PD demonstrated a significantly greater root mean square (RMS) asymmetry index (ASI) in the tibialis anterior (TA) muscle during Phase 2 (P2_RMS_ASI_TA) compared to the HC. This finding is consistent with the results reported by Keloth et al.34, who observed increased bilateral asymmetry in the TA and medial gastrocnemius (GAS) muscles during gait in people with early-stage PD, particularly during the swing and single-leg support phases. These results indicate that left–right asymmetry in muscle activation is not limited to straight-line walking but is also evident during transitional tasks such as STW. This pattern of asymmetry reflects the unilateral onset of PD35 and may be associated with reduced lower limb muscle activation resulting from side-specific neurodegeneration36.

The RF algorithm demonstrated the greatest classification accuracy among the models evaluated in this study. This level of performance is comparable to, and in some cases exceeds, that reported in previous studies using gait-based ML approaches for PD classification. For example, Hwang et al.17 reported an RF accuracy of 71.3% when classifying people with early-stage PD across three gait-speed conditions, whereas Ferreira et al.16 achieved an accuracy of 84.6% using a Naïve Bayes (NB) model during preferred-speed walking. Similarly, Choi et al.19 reported an accuracy of 83.5% using a convolutional neural network based on data from the 6-min walk test in older adults.

In comparison, the 84.9% accuracy achieved in the present study using only three variables suggests that the selected STW-based indicators may serve as efficient and clinically applicable markers for screening people with early-stage PD, even when the dimensionality of the dataset is substantially reduced. In this study, we quantitatively identified biomechanical characteristics and neuromuscular control alterations occurring during the STW task in people with early-stage PD, providing novel evidence relevant to early detection and clinical assessment. Compared with HC, people with early-stage PD exhibited reduced trunk stability and impaired movement-transition ability, as reflected by slower mean COM speed, shorter AP COP–COM displacement, and reduced T10 ROM. These alterations likely represent compensatory movement strategies aimed at maintaining postural stability and may be associated with functional impairment of basal ganglia circuitry resulting from reduced dopaminergic transmission. Furthermore, asymmetry in muscle activation—particularly in the TA muscle—was significantly greater in the PD group, supporting the characteristic unilateral neurodegeneration of early-stage PD and suggesting asymmetric deficits in neuromuscular coordination and postural control.

The 84.9% classification accuracy achieved using only three selected variables (TP_COM_Speed, P2_AP_COP-COM, and P2_T10_For_ROM) suggests that subtle alterations in trunk kinematics and COM dynamics during the STW task may serve as sensitive biomechanical indicators of early-stage PD. Collectively, these findings highlight the potential clinical value of STW-based measures as noninvasive and time-efficient markers for early PD screening and fall risk prediction.

This study had some limitations. First, people with PD were assessed only during the “ON” medication state, which may not fully capture gait and postural impairments that may be more pronounced in the “OFF” state. Second, differences in sample size and sex distribution between groups were controlled using covariate-based statistical adjustments; however, residual imbalance may still have influenced the results. Third, the STW task required all participants to initiate gait with the right foot, which may have constrained the expression of natural movement strategies. Fourth, the study included only people with early-stage PD (Hoehn and Yahr stages 1–2), thereby limiting the generalizability of the findings to individuals in more advanced stages. In addition, tree-based algorithms such as RF and XGBoost are relatively robust to multicollinearity; however, the presence of correlated variables may still affect the stability and interpretability of feature-importance rankings. To address this issue, pairwise Spearman correlation coefficients among all extracted variables were calculated, and the results are provided in Supplementary Table S3. However, correlated variables were not explicitly excluded before the feature-importance analysis, which should be considered when interpreting the ranked predictors. Future studies should include assessments conducted during both “ON” and “OFF” medication states, apply improved approaches to address sample imbalance, allow natural step initiation, and recruit participants across a broader range of disease severity.

Conclusion

In this study, we evaluated the classification performance of ML algorithms in distinguishing people with early-stage PD from HC using biomechanical variables derived from the STW task, including kinematic, kinetic, and neuromuscular parameters. The following three key variables were identified: (1) the mean speed of the COM across the entire task, (2) the AP displacement between the COP and COM during Phase 2, and (3) forward ROM at T10 during Phase 2. Using these variables, the RF algorithm achieved a classification accuracy of 84.9%.

These findings indicate that subtle alterations in trunk kinematics and COM dynamics during the STW task may serve as sensitive early markers of neuromuscular impairment in PD. Accordingly, these variables may offer a non-invasive and efficient screening approach for early detection and fall risk assessment in clinical settings. Future studies should validate the predictive performance and clinical applicability of these markers in larger and more diverse older populations.

Methods

Participants

People with PD were recruited from the outpatient neurology clinic of a medical center. We included those with idiopathic PD diagnosed by a neurologist according to the Movement Disorder Society (MDS) diagnostic criteria37. The criteria were as follows: (a) adults aged ≥ 50 years with right-hand dominance, (b) a score of ≥ 24 on the Korean-Mini Mental State Examination38, (c) Hoehn and Yahr stage 1–2, (d) the ability to walk and ambulate independently, and (e) a stable response to antiparkinsonian medication. We excluded people with PD and a history of cardiovascular, musculoskeletal, vestibular, or other neurological disorders; those who required assistive devices or caregiver support during movement; and those with uncontrolled dyskinesia despite medication. HC were recruited from the local community and comprised age- and sex-matched individuals with no history of musculoskeletal, cardiovascular, vestibular, or neurological disorders affecting gait or cognition within the past 6 months, and no history of orthopedic surgery.

Based on these criteria, 63 people with early-stage PD and 43 HC were included in the final analysis (Fig. 4). Participants’ demographic and physical characteristics are presented in Table 2. The Institutional Review Board (IRB) of Dong-A University Hospital approved the study protocol and consent procedures (IRB number: DAUHIRB-22-089). All participants provided written informed consent before participation. The study was registered with the Clinical Research Information Service in the Republic of Korea (KCT0009353).

Fig. 4

Flow diagram of participant recruitment. PD: Parkinson’s disease; HC: Healthy controls.

Table 2 Physical and clinical characteristics of the participants.

Instrumentation

The STW task was recorded using nine infrared cameras (Vicon MX-T10, Oxford Metrics, UK), two force plates (OR6-7, AMTI, USA), and eight wireless sEMG sensors (Delsys, Natick, MA, USA). All equipment was arranged to fully capture the STW movement. The nine cameras were positioned around the measurement space, and the two force plates were embedded in the floor at locations where participants’ feet were placed while seated on a chair.

The global coordinate system for the STW task was defined with the left edge of the walkway set as the origin (0, 0, 0). The X-axis represented the participant’s mediolateral direction, the Y-axis corresponded to the anterior–posterior direction of motion, and the Z-axis indicated the vertical direction (Fig. 5a). All reflective markers and surface EMG electrodes were applied by a single experienced researcher with over 5 years of experience in clinical gait and sEMG analysis. All markers were placed by the same evaluator to minimize inter-rater variability and ensure anatomical consistency (see Supplementary Table S4 online, Fig. 5b). A full-body biomechanical model was constructed using the Plug-in Gait model based on the modified Helen Hayes marker set, with 39 spherical reflective markers (14 mm in diameter). Previous studies have demonstrated that marker placement by highly experienced evaluators substantially reduces placement error and kinematic variability compared with less experienced raters39.

Fig. 5

Equipment setup and marker placement for the Plug-in Gait Full Body Model. (a) Overview of the experimental setup and equipment layout; (b) Reflective markers were placed bilaterally, unless otherwise specified, at the following anatomical landmarks (Vicon label codes in parentheses): head-front/back, left and right (LFHD, RFHD, LBHD, and RBHD); the 7th cervical vertebra (C7); the 10th thoracic vertebra (T10); the clavicle (CLAV); the sternum (STRN); the medial border of the right scapula (RBAK); the shoulders (LSHO and RSHO); the upper lateral 1/3 surface of the arm (LUPA and RUPA); the lateral humeral epicondyles (LELB and RELB); the lower lateral 1/3 surface of the forearm (LFRM and RFRM); the mediolateral styloid processes of the wrist (LWRA, RWRA, LWRB, and RWRB); the third metacarpal heads (LFIN and RFIN); the anterior and posterior superior iliac spines on both sides (LASI, RASI, LPSI, and RPSI); the lower lateral 1/3 surface of the thigh (LTHI and RTHI); the lateral femoral epicondyle (LKNE and RKNE); the lower 1/3 surface of the shank (LTIB and RTIB); the lateral malleoli (LANK and RANK); the calcaneus (LHEE and RHEE); and the second metatarsal heads (LTOE and RTOE); (c) sEMG sensors are positioned over the muscle bellies of the quadriceps femoris, biceps femoris (short head), tibialis anterior, and gastrocnemius muscles bilaterally on lower limbs.

Muscle activation was recorded by placing eight wireless sEMG sensors on the rectus femoris, and the short head of the BF, TA, and medial GAS muscles on both lower limbs (Fig. 5c). All sEMG electrodes were applied by the same researcher who placed the reflective markers, thereby preventing inter-rater variability. Electrode placement followed the Surface EMG for Non-Invasive Assessment of Muscles recommendations, with sensors positioned over the maximal bulging region of each muscle belly between the origin and insertion points, aligned with muscle fiber orientation40. Before sensor attachment, the skin was shaved and cleaned with alcohol to minimize noise. The sensors were secured using double-sided adhesive tape, with additional fixation tape applied to prevent detachment during movement40.

Test procedures

All measurements were conducted in the “ON” medication state, with antiparkinsonian drugs taken approximately 2 h before testing. The experimental procedure consisted of two stages.

First, participants completed informed consent and underwent evaluations of clinical status, physical characteristics, and anthropometric measurements required for joint kinematic calculations during the STW task. Clinical status evaluation included the collection of disease-related information, including disease duration, antiparkinsonian medication dosage, and treatment duration, obtained by a clinician from medical records and participant interviews. Anthropometric measurements required for subject-specific biomechanical modeling were obtained to enable accurate joint kinematic calculations using the Plug-in Gait model. These measurements included leg length (measured from the anterior superior iliac spine to the medial malleolus); shoulder, elbow, wrist, and knee width; and ankle and hand thickness, measured using a tape measure and calipers. Physical characteristics included body height and body mass measured using a stadiometer and a calibrated digital scale, respectively.

Second, participants changed into spandex shirts and shorts, after which 39 spherical 14 mm reflective markers (Fig. 5b) and eight wireless sEMG electrodes (Fig. 5c) were attached. Before data collection, participants performed a 5-min warm-up, primarily consisting of walking, followed by 2–3 practice trials of the STW task. In addition, brief low-intensity movements were included, such as light upper- and lower-limb stretching (comparable to gentle self-initiated limb extension) and seated heel raises. These movements were limited in amplitude and duration and were implemented solely to promote comfort, ensure participant safety, and facilitate task familiarization, not to increase joint ROM. Then, participants were allowed to rest in a seated position for approximately 1 min to minimize learning effects. For the STW measurement, all participants sat on a chair with a backrest, and the chair height was adjusted to maintain the knee joint angle at 90°. The position of the feet was adjusted so that the lower legs were approximately perpendicular to the ground, with each foot placed separately on one of the two force plates. To eliminate upper limb involvement, participants were instructed to maintain a standing at-attention posture with both arms naturally lowered. The first step after seat-off was standardized so that all participants initiated gait with the right foot. The STW task was assessed using the TUG protocol. After a static calibration, participants were instructed to stand from the chair, walk forward at a self-selected speed to a 20 cm marker cone positioned 3 m ahead, turn around it, and return to the chair. Similarly, a 1-min seated rest was provided between trials to reduce learning effects, and the task was performed three times.

Data preprocessing

Motion analysis, ground reaction force (GRF), and sEMG data collected during the STW task were processed using Nexus software (version 2.10.3, Vicon, UK) and MATLAB R2024b (MathWorks, Natick, MA, USA). Spatiotemporal, kinematic, and kinetic variables were computed using the standard Plug-in Gait pipeline implemented in Vicon Nexus. Joint angles were calculated using Cardan (Euler) angle rotations. GRF and inverse dynamics–based joint moments and powers were calculated within Nexus and subsequently exported for phase-specific variable extraction and additional computations in MATLAB. The kinematic modeling framework and joint angle calculations followed established marker-based gait analysis conventions41. GRF data were exported in Newtons (N) and normalized to body weight in MATLAB. Joint moment and power outputs exported from Nexus were normalized to body mass and reported in N·mm/kg and W/kg, respectively, in accordance with standard biomechanical practice42. No additional normalization was applied to these outputs in MATLAB. The sampling frequencies were set to 100 Hz, 1000 Hz, and 2000 Hz for motion analysis, GRF, and sEMG signals, respectively. Motion capture and GRF data were filtered using a fourth-order low-pass Butterworth filter with cut-off frequencies of 6 Hz and 25 Hz, respectively. sEMG signals were processed separately using a 40–400 Hz band-pass Butterworth filter and a 60 Hz notch filter to remove power-line interference43.

During the STW task, motion analysis was divided into five events and three phases. Event 1 (E1) was defined as the time point at which the trunk forward inclination—calculated using the C7 marker—exceeded 5° in the anterior direction relative to the seated baseline44. Event 2 (E2) was defined as the seat-off moment, when the buttocks lifted off the chair and the magnitude of the anteroposterior GRF reached its maximum (peak of |AP GRF|)22,45. Event 3 (E3) corresponded to the heel-off of the first stepping limb. Event 4 (E4) was the initial heel-strike of the first step, marking ground contact of the leading limb. Event 5 (E5) was defined as the heel-strike of the contralateral limb during the second step46. Heel-off (E3) was defined as the time point when the heel marker of the first stepping limb showed a vertical rise > 1.5 cm, indicating foot lift-off22,44. Heel-strike events (E4 and E5) were identified as the frames at which the heel marker reached its minimum vertical position, corresponding to ground contact. The analysis phases were defined as follows: the STS phase from E1 to E2, Phase 1 (P1) from E1 to E3, and Phase 2 (P2) from E3 to E5 (see Supplementary Fig. S1 online).
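The threshold-based event definitions above can be sketched in Python as follows. This is an illustrative reading, not the authors' MATLAB code. It assumes that all signals have already been resampled to a common frame rate, that the seated baseline is the mean of the first few frames (`baseline_frames` is a hypothetical parameter), and that the trunk angle has been precomputed from the C7 marker. Heel-strike detection (E4 and E5) is omitted, and the sample data are synthetic.

```python
def detect_events(trunk_angle_deg, ap_grf_n, heel_z_cm, baseline_frames=5):
    """Return frame indices for E1, E2, and E3 of the sit-to-walk task.

    E1: trunk forward inclination exceeds 5 degrees relative to baseline.
    E2: seat-off, taken as the frame of peak |AP GRF|.
    E3: heel-off, heel marker rises more than 1.5 cm above baseline.
    A value of None means the threshold was never crossed.
    """
    trunk_base = sum(trunk_angle_deg[:baseline_frames]) / baseline_frames
    heel_base = sum(heel_z_cm[:baseline_frames]) / baseline_frames

    e1 = next((i for i, angle in enumerate(trunk_angle_deg)
               if angle - trunk_base > 5.0), None)
    e2 = max(range(len(ap_grf_n)), key=lambda i: abs(ap_grf_n[i]))
    e3 = next((i for i, z in enumerate(heel_z_cm)
               if z - heel_base > 1.5), None)
    return {"E1": e1, "E2": e2, "E3": e3}


# Synthetic 15-frame signals for illustration only.
trunk = [0, 0, 0, 0, 0, 1, 3, 6, 10, 15, 20, 22, 20, 15, 10]
grf = [0, 0, 0, 0, 0, -5, -20, -60, -120, -80, -30, 10, 20, 10, 0]
heel = [3.0] * 11 + [3.5, 4.8, 6.0, 7.0]
events = detect_events(trunk, grf, heel)

# Phase boundaries as (start_frame, end_frame) pairs.
sts_phase = (events["E1"], events["E2"])   # STS phase: E1 to E2
phase_1 = (events["E1"], events["E3"])     # Phase 1: E1 to E3
```

In practice, measurement noise would likely call for requiring the threshold to be exceeded for several consecutive frames, a refinement left out here for brevity.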

In total, 200 biomechanical variables were extracted from the STW task, comprising 14, 32, and 114 spatiotemporal, kinematic, and musculokinetic variables, respectively, calculated for each phase of the task (see Supplementary Table S5 online). Of these, the muscle ASI between the left and right limbs was calculated using Eq. (1)47 based on the unilateral RMS values of the sEMG signals. In this equation, Leg1 represents the greater RMS value between the left and right limbs, whereas Leg2 represents the smaller RMS value of the opposite limb. A value of 0 indicates perfect symmetry, and larger values indicate greater left–right asymmetry in muscle activation.

$$\text{ASI}=100-\left(\frac{\text{Leg2}}{\text{Leg1}}\times 100\right)$$
(1)
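As a worked illustration of Eq. (1), the following Python sketch computes the RMS of each limb's sEMG signal and the resulting ASI. The function names are hypothetical, and raising an error for an all-zero signal is an assumption, since Eq. (1) is undefined when Leg1 is zero.

```python
import math
from typing import Sequence


def rms(signal: Sequence[float]) -> float:
    """Root mean square of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def asymmetry_index(rms_left: float, rms_right: float) -> float:
    """ASI = 100 - (Leg2 / Leg1) * 100, where Leg1 is the larger RMS value."""
    leg1 = max(rms_left, rms_right)
    leg2 = min(rms_left, rms_right)
    if leg1 <= 0:
        # Assumption: both signals silent, so the ratio is undefined
        raise ValueError("Leg1 must be positive")
    return 100.0 - (leg2 / leg1) * 100.0
```

For example, RMS values of 2.0 and 1.0 (in any common unit) give an ASI of 50, whereas equal RMS values give 0.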

The co-contraction index (CCI), which quantifies the degree of simultaneous activation between the agonist and antagonist muscles, was calculated using Eq. (2). The peak dynamic method, in which the sEMG amplitude of each channel during the task was divided by its maximum absolute value, was applied to normalize the sEMG signals48. In Eq. (2), EMGago and EMGant represent the normalized values of the agonist and antagonist muscles at each discrete time point t, and n denotes the total number of samples within the analysis phase. In this study, the agonist–antagonist muscle pairs consisted of the rectus femoris and BF, and the TA and GAS muscles, and the CCI was calculated separately for the left and right lower limbs in each phase. Greater CCI values indicate increased simultaneous activation of agonist–antagonist muscle pairs, reflecting a joint-stabilizing strategy, whereas smaller values indicate more efficient reciprocal muscle activation and selective motor control.

$$\text{CCI}=\frac{\sum_{t=1}^{n}\left|\text{EMG}_{ago}(t)\right|\cdot \left|\text{EMG}_{ant}(t)\right|}{\sum_{t=1}^{n}\left(\left|\text{EMG}_{ago}(t)\right|+\left|\text{EMG}_{ant}(t)\right|\right)}$$
(2)
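Eq. (2) and the peak dynamic normalization can be sketched in Python as follows. The function names are hypothetical, and the denominator is read as the sum over all samples of both muscles' absolute amplitudes, which is an interpretation of Eq. (2) rather than a confirmed detail of the authors' implementation.

```python
from typing import List, Sequence


def peak_normalize(signal: Sequence[float]) -> List[float]:
    """Peak dynamic method: divide each sample by the maximum absolute value."""
    peak = max(abs(v) for v in signal)
    if peak == 0:
        raise ValueError("signal has zero amplitude")
    return [v / peak for v in signal]


def co_contraction_index(ago: Sequence[float], ant: Sequence[float]) -> float:
    """CCI = sum(|ago| * |ant|) / sum(|ago| + |ant|) over one analysis phase."""
    if len(ago) != len(ant):
        raise ValueError("signals must have the same number of samples")
    numerator = sum(abs(a) * abs(b) for a, b in zip(ago, ant))
    denominator = sum(abs(a) + abs(b) for a, b in zip(ago, ant))
    if denominator == 0:
        raise ValueError("both signals are silent")
    return numerator / denominator
```

Under this reading, two fully and simultaneously active normalized signals give a CCI of 0.5, and strictly alternating (reciprocal) activation gives 0.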

Whole-body COM trajectories were obtained using the standard Plug-in Gait full-body model, which is based on a conventional marker-based gait analysis framework41. In this model, COM is calculated as the mass-weighted average of all body segments based on subject-specific anthropometric data42. The COM Sway Area, which represents postural stability by quantifying the outer boundary of the COM trajectory during the STW task, was calculated using Eq. (3) as the area enclosed by the convex hull of the COM coordinates (Convex Hull method49). Here, (x_COM,i, y_COM,i) denotes the COM coordinates at time step i, and N is the number of time steps. Larger COM sway areas indicate reduced whole-body postural stability and increased body sway during the STW task, whereas smaller values reflect more stable and efficient control of the body mass.

$$\text{COM Sway Area}=\text{Area}\left(\text{ConvexHull}\left({\left\{\left({x}_{COM,i},{y}_{COM,i}\right)\right\}}_{i=1}^{N}\right)\right)$$
(3)
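One way to evaluate Eq. (3) is sketched below in Python, using Andrew's monotone chain algorithm for the convex hull and the shoelace formula for its area. The choice of algorithm and the function names are assumptions, as the source specifies only that a convex hull method was used.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Convex hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Shoelace formula; returns 0.0 for fewer than three vertices."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def com_sway_area(x_com: Sequence[float], y_com: Sequence[float]) -> float:
    """Area of the convex hull enclosing the COM trajectory."""
    return polygon_area(convex_hull(list(zip(x_com, y_com))))
```

The resulting area carries the squared unit of the input coordinates (for example, mm² for coordinates in mm).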

The displacement between the COP and the COM, used as an indicator of dynamic stability, was calculated using Eq. (4). This variable was defined as the mean absolute distance between the AP and mediolateral coordinates of the COP and COM50. Here, COP(i) and COM(i) denote the AP and mediolateral coordinates of the COP and COM at time i. Greater COP–COM displacement indicates less efficient dynamic balance control and a larger separation between the control action (COP) and body mass (COM), whereas smaller values indicate more efficient weight transfer and stable dynamic control.

$$\text{COP–COM displacement}=\frac{1}{N}\sum_{i=1}^{N}\left|\text{COP}\left(i\right)-\text{COM}\left(i\right)\right|$$
(4)
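Eq. (4) can be illustrated with the short Python sketch below. It assumes that |COP(i) − COM(i)| is the Euclidean distance in the horizontal (AP, mediolateral) plane and that the COP and COM series have already been resampled to the same frame rate. Both points are assumptions, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (AP, mediolateral) coordinates


def cop_com_displacement(cop: Sequence[Point], com: Sequence[Point]) -> float:
    """Mean horizontal-plane distance between COP and COM over N frames."""
    if len(cop) != len(com) or len(cop) == 0:
        raise ValueError("COP and COM must be non-empty and equally long")
    total = sum(math.hypot(cx - mx, cy - my)
                for (cx, cy), (mx, my) in zip(cop, com))
    return total / len(cop)
```

If the two axes were instead analyzed separately, the same function could be applied to one-dimensional data by setting the unused coordinate to zero.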

Entropy, used to quantify the complexity and irregularity of sEMG signals, was calculated as spectral entropy derived from the normalized power spectral density of the EMG signal, as defined in Eq. (5), following a frequency-domain entropy framework previously applied in EMG analysis51. In this formula, Pj denotes the power of the j-th frequency component in the full power spectrum, N is the number of frequency components, and pi represents the probability of the i-th frequency component, obtained by normalizing its power Pi to the total power. Greater entropy values indicate increased complexity and irregularity of neuromuscular activation patterns, whereas smaller values indicate more regular, stereotyped, or constrained muscle activation.

$$\text{Entropy}=-\sum_{i=1}^{N}{p}_{i}\log\left({p}_{i}\right),\quad \text{where}\quad {p}_{i}=\frac{{P}_{i}}{{\sum }_{j=1}^{N}{P}_{j}}$$
(5)
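A minimal Python sketch of Eq. (5) is given below. The power spectrum is obtained with a naive discrete Fourier transform purely to keep the example self-contained. The spectral estimator actually used, the natural logarithm, and the function names are assumptions not stated in the source.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(signal: Sequence[float]) -> List[float]:
    """One-sided power spectrum via a naive O(n^2) DFT (illustration only)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(power: Sequence[float]) -> float:
    """Entropy = -sum(p_i * log(p_i)), with p_i = P_i / sum_j(P_j)."""
    total = sum(power)
    if total <= 0:
        raise ValueError("total power must be positive")
    probabilities = [p / total for p in power]
    # Components with zero power contribute nothing (limit of p*log(p) is 0)
    return -sum(p * math.log(p) for p in probabilities if p > 0)
```

A spectrum with equal power in all N components yields the maximum value log(N), whereas a spectrum concentrated in a single component yields an entropy of approximately 0.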

Statistical analysis

Statistical analyses were performed using IBM SPSS Statistics version 21.0 (IBM Corp., Armonk, NY, USA). Descriptive statistics were calculated for all variables, including means, standard deviations, and 95% confidence intervals. Data normality was assessed using the Shapiro–Wilk test. For between-group comparisons of physical characteristics, an independent t-test was applied when normality was satisfied, whereas the Mann–Whitney U test was used when normality was violated. The significance level was set at p < 0.05.

ML-based feature selection was performed to evaluate the relative importance of STW task-based features and identify key predictors that distinguish people with early-stage PD from age-matched HC. Weighted feature importance analysis was conducted using the RF52 and XGBoost53 algorithms. The weighting procedure incorporated each model’s classification accuracy to derive relative algorithm weights for variable ranking. This procedure was not used for direct optimization of classification performance. This approach was adopted to integrate importance estimates from two tree-based models that capture nonlinear relationships among the 200 STW-derived variables (Fig. 1a,b).

The relative weights (Wₖ) were derived from the five-fold cross-validation accuracy of each algorithm (see Supplementary Table S6 online) using Eq. (6)54. This cross-validation procedure was applied solely to estimate algorithm-level performance for weight derivation and was not used for feature selection at the fold level. In this equation, K is the number of algorithms used, and Accuracyₖ denotes the classification accuracy of algorithm k (RF or XGBoost). Each weight was calculated as the ratio of an individual model’s accuracy to the sum of accuracies across models, ensuring that the weights sum to 1. Models with greater cross-validation accuracy therefore receive proportionally greater weights in the final importance calculation.

$${W}_{k}=\frac{{\text{Accuracy}}_{k}}{{\sum }_{m=1}^{K}{\text{Accuracy}}_{m}},\quad \sum_{k=1}^{K}{W}_{k}=1$$
(6)

The resulting weights were applied to each algorithm’s variable importance values via Eq. (7) to obtain the weighted importance. In this equation, FIₖ,ᵢ denotes the importance score of variable i for algorithm k, and Wₖ is as defined in Eq. (6). The final weighted importance for each variable (WFIᵢ) was computed by summing the weighted importance values from both models.

$${\text{WFI}}_{i}=\sum_{k=1}^{K}{W}_{k}\times {\text{FI}}_{k,i}$$
(7)
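Eqs. (6) and (7) can be illustrated together with the Python sketch below. The function names, the dictionary layout, and any numeric values passed to these functions are hypothetical. The accuracies reported in Supplementary Table S6 are not reproduced here.

```python
from typing import Dict


def accuracy_weights(accuracies: Dict[str, float]) -> Dict[str, float]:
    """Eq. (6): W_k = Accuracy_k / sum_m(Accuracy_m), so the weights sum to 1."""
    total = sum(accuracies.values())
    if total <= 0:
        raise ValueError("accuracies must sum to a positive value")
    return {name: acc / total for name, acc in accuracies.items()}


def weighted_feature_importance(
    weights: Dict[str, float],
    importances: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Eq. (7): WFI_i = sum_k(W_k * FI_{k,i}).

    importances maps algorithm name -> {variable name -> importance score}.
    """
    first_model = next(iter(importances))
    variables = importances[first_model].keys()
    return {
        var: sum(weights[model] * importances[model][var] for model in weights)
        for var in variables
    }
```

A retention rule of the kind described in the text could then be written as `[v for v, s in wfi.items() if s >= 0.01]`, where `wfi` is the dictionary returned by `weighted_feature_importance`.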

Through this process, an integrated variable importance score reflecting the relative performance of the two algorithms was obtained. Based on the final weighted importance scores derived from the full dataset, variables with scores ≥ 0.01 were retained as predictors with significant contributions, resulting in 26 selected variables55. To further identify the core predictors among these 26 variables, binary logistic regression analysis was performed with age, sex, height, and body mass index included as covariates. Subsequent classification analyses were conducted independently of the feature-ranking procedure to evaluate the discriminative performance of the selected variables.

To evaluate the classification performance between people with early-stage PD and age-matched HC, seven ML classifiers were applied: Logistic Regression56, K-Nearest Neighbor57, Naïve Bayes (NB)58, Linear Discriminant Analysis59, Quadratic Discriminant Analysis59, Support Vector Machine60, and RF. Cut-off values for the gait-based predictors selected for group classification were determined using receiver operating characteristic (ROC) curves, and the area under the ROC curve (AUC) was calculated to assess the discriminative ability of each variable. Hyperparameters for all models were optimized using Grid Search (see Supplementary Table S7 online). Model performance (accuracy, precision, recall, and F1 score) was evaluated using five-fold cross-validation, in which the dataset was randomly partitioned into five subsets; in each iteration, four subsets were used for training and one for testing, and the procedure was repeated five times so that each subset served once as the test set (Fig. 6). Each metric was computed based on the number of true positives, true negatives (TN), false positives (FP), and false negatives (FN), defined as follows:

  • True positives: correctly predicted positive cases (PD correctly classified)

  • TN: correctly predicted negative cases (healthy controls correctly classified)

  • FP: negative cases incorrectly predicted as positive

  • FN: positive cases incorrectly predicted as negative

Fig. 6

Workflow for data processing and machine learning analysis. GRF: Ground reaction force; sEMG: Surface electromyography; WFI: Weighted feature importance; RF: Random Forest; XGBoost: Extreme Gradient Boosting; HC: Healthy controls; PD: Parkinson’s disease; LR: Logistic regression; KNN: K-nearest neighbors; NB: Naïve Bayes; LDA: Linear discriminant analysis; QDA: Quadratic discriminant analysis; SVM: Support vector machine.

The performance metrics were calculated using the following equations:

$$\text{Accuracy}=\frac{\text{True positives}+\text{TN}}{\text{True positives}+\text{TN}+\text{FP}+\text{FN}}$$
(8)
$$\text{Precision}=\frac{\text{True positives}}{\text{True positives}+\text{FP}}$$
(9)
$$\text{Recall}=\frac{\text{True positives}}{\text{True positives}+\text{FN}}$$
(10)
$$\text{F1 score}=2\times \frac{\text{Precision}\times \text{Recall}}{\text{Precision}+\text{Recall}}$$
(11)
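Eqs. (8) to (11) can be evaluated from the four confusion-matrix counts as in the following Python sketch. The function name is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention rather than something specified in the source.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts.

    tp: PD correctly classified; tn: HC correctly classified;
    fp: HC predicted as PD; fn: PD predicted as HC.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    accuracy = (tp + tn) / total
    # Assumed convention: undefined ratios are reported as 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For instance, hypothetical counts of 8 true positives, 7 TN, 2 FP, and 3 FN give an accuracy of 0.75, a precision of 0.80, a recall of 8/11, and an F1 score of 16/21.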

Accuracy quantified the overall proportion of correct predictions across both classes. Precision reflected the proportion of true-positive predictions among all positive predictions made by the model, whereas recall reflected the proportion of true positives among all actual positive cases. The F1 score, the harmonic mean of precision and recall, was particularly informative in handling class imbalance, as it penalized extreme disparities between the two61. The final performance metrics were reported as the mean values across all five folds. All ML analyses were performed using Python (version 3.10; Python Software Foundation).
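The random five-fold partitioning and the fold-wise averaging of metrics can be sketched as follows. This is a plain (non-stratified) split written with the standard library only. The function names, the fixed seed, and the absence of stratification are assumptions, and the library routines actually used by the authors are not stated in the source.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Randomly partition sample indices into k folds.

    Returns k (train_indices, test_indices) pairs; each sample appears in
    exactly one test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical fixed seed
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        splits.append((train, sorted(test)))
    return splits


def mean_over_folds(fold_metrics: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric across folds, as done for the reported values."""
    keys = fold_metrics[0].keys()
    return {key: mean(m[key] for m in fold_metrics) for key in keys}
```

Each classifier would be fitted on the training indices of a split and evaluated on its test indices, after which the per-fold metric dictionaries would be passed to `mean_over_folds`.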