Introduction

Parkinson’s disease (PD), Huntington’s disease (HD), and amyotrophic lateral sclerosis (ALS) are common neurodegenerative disorders (NDD) that arise from neuronal death, which causes structural and functional loss of neurons. These disorders diminish the ability to perform both mobility and cognitive tasks and affect nearly 15% of the global population1. A further concern is the progressive nature of these diseases, whose burden is expected to increase two-fold over the next two decades2. Among these NDD, PD is the second most common disorder; it mainly affects the elderly population, ~8–18 persons per 100,000 per year3, and is more prevalent in males than in females. ALS is another chronic motor neuron disease, with an estimated incidence of 1.9 persons per 100,000 per year4. HD, in contrast, is a genetic disorder that occurs at around 30–50 years of age5 and leads to jerky movements and abnormal posturing. Other body functions affected by NDD include speech, respiration, heart rate, cognition, muscle stiffness, balance, and gait. Many of these symptoms become apparent only in later disease stages, which complicates intervention. Some motor complications, however, such as trembling, slowness of movement, rigidity, and difficulty in walking, are visible in early disease stages6. Understanding mobility issues is therefore a vital direction for the diagnosis of NDD. Clinically, the Hoehn & Yahr (H&Y) scale7,8, the Unified Huntington Disease Rating Scale-Total Functional Capacity (UHDRS-TFC)8, and the time elapsed since diagnosis are the prominent methods for grading PD, HD, and ALS, respectively. These conditions manifest as abnormal involuntary and slow movements in PD, chorea in HD, and, in ALS, difficulties in lateral leg coordination and other related movements9.

Hence, gait analysis, the study of rhythmic movements, provides a quantifiable way to study walking deviations10. With technological advances, various gait measurement systems have evolved, ranging from laboratory-based gold-standard equipment such as force plates and motion capture systems to portable, wearable inertial systems capable of tracking human motion in outdoor environments11,12,13,14. Such systems can record gait trajectories, and with signal processing techniques15,16,17 a wide range of objective gait parameters can be derived; however, data dimensionality and interpretation remain significant challenges. Integrating predictive techniques can therefore enable automated classification of NDD, which reduces subjectivity, bias, and inter-rater variability13. Many researchers are working to develop solutions for automated classification of neurodegenerative disorders that would reduce clinicians’ time and effort and support decision making18,19. Existing studies indicate that gait abnormalities are disease-specific and that gait analysis has great potential to differentiate between disorders and aid diagnosis. However, quantifying disease severity, measuring gait variability, and determining the significant features of a specific disorder remain the main issues of concern. This research work therefore contributes to the classification of NDD by developing a three-level hybrid model, which has the potential to support prognostic assessments related to disease severity grading. The work uses a Decision Tree (DT)20,21 with a possibilistic approach based on Hanman-Anirban entropy, designed to overcome several critical limitations of traditional probabilistic DT criteria such as the Gini Index (GI) and Information Gain (IG). 
Probabilistic models rely on clearly defined, mutually exclusive class boundaries and are designed to perform optimally with separable and balanced data. Gait data from NDDs, however, are noisy, complex, high dimensional, non-linear in nature, and in many instances, heavily overlapping in terms of spatial-temporal features between disease types. Such overlapping makes probabilistic binary certainty assumptions difficult and raises the concern of overfitting or misclassification, particularly in early-stage NDDs. Moreover, these metrics struggle to effectively represent transitional states common in clinical data, where symptoms do not always conform to rigid class labels.
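
To make the conventional probabilistic criteria concrete, the following minimal sketch computes the Gini index and Shannon-entropy information gain for a candidate split. The function names and the toy labels are illustrative assumptions, not taken from the study; the example only shows that a split on heavily overlapping classes yields no gain under these criteria.

```python
import math
from collections import Counter

def class_probabilities(labels):
    """Relative frequency of each class label in a node."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]

def gini_index(labels):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))

def shannon_entropy(labels):
    """Shannon entropy in bits: -sum(p_i * log2(p_i))."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(c) / n * shannon_entropy(c) for c in children)
    return shannon_entropy(parent) - weighted

# Hypothetical node in which PD and HD samples overlap completely
parent = ["PD", "PD", "HD", "HD"]
split = [["PD", "HD"], ["PD", "HD"]]     # a split that separates nothing
print(gini_index(parent))                # 0.5
print(information_gain(parent, split))   # 0.0
```

Both criteria depend only on class frequency counts within a node, which is the property the possibilistic criterion is intended to relax.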

In contrast, the Hanman-Anirban entropy is rooted in possibility theory, which is better suited to handling imprecise information. It evaluates uncertainty not through frequency counts but through the degree of possibility, which allows the model to handle regions of uncertainty when dealing with physiological signals influenced by intra-subject variability, walking speed, fatigue, and environmental conditions. Unlike GI and IG, Hanman-Anirban entropy is sensitive to local feature distributions and is thus well suited to learning non-linear decision boundaries in real-world gait datasets. In addition, the method reduces sensitivity to outliers and noise by focusing on the most likely hypothesis instead of aggregating all probabilistic outcomes. Consequently, it yields more robust, interpretable, and clinically meaningful decisions, particularly in low-data, high-uncertainty scenarios such as rare NDD classification and severity grading. These methodological advantages are supported both by theoretical grounding in uncertainty modeling and by empirical improvements demonstrated across all three levels of our hierarchical classification framework. The detailed analysis of the proposed model, with its mathematical formulation, is included in Methods and Materials.

Additionally, objective gait features selected on the basis of domain expertise carry clinical meaning and significance. To reduce data dispersion, the data are normalized using a non-linear normalization approach20. Since gait data are complex and non-linear, it is hypothesized that a normalized feature set helps to determine distinct gait traits and improves severity assessment. The main contributions of this research work are:

  • Inclusion of normalization approach to correct individual differences in extracted high-level gait features.

  • Detection of NDD severity using the proposed three-level hybrid model. The first level detects gait abnormality (HC vs. NDD), the second level determines the type of gait abnormality (PD vs. ALS vs. HD), and the third level quantifies severity (disease grading).

  • The proposed hybrid decision tree (DT) model is based on possibility approach rather than probability statistics used in conventional DT classifiers. In addition, the statistical significance of the model is validated using Wilcoxon signed-rank test.

  • Identification of a significant gait feature set for quantifying NDD severity (PD, HD, and ALS) to build clinical insights.

  • Comparative analysis of proposed model with existing conventional state-of-the-art (SOTA) DT classifier using distinct performance metrics.
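
The three-level routing described in the contributions above can be sketched as follows. This is only a structural illustration: the class name, the callables, and the toy thresholds are hypothetical stand-ins for the trained possibility-based trees, whose formulation is not reproduced here.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]

class ThreeLevelCascade:
    """Routes one feature vector through detection, typing, and grading."""

    def __init__(self,
                 detect: Callable[[Features], str],
                 identify: Callable[[Features], str],
                 graders: Dict[str, Callable[[Features], float]]):
        self.detect = detect        # level 1: returns "HC" or "NDD"
        self.identify = identify    # level 2: returns "PD", "ALS" or "HD"
        self.graders = graders      # level 3: one severity regressor per disease

    def predict(self, x: Features) -> dict:
        # Healthy controls stop at level 1; no disease type or grade is assigned.
        if self.detect(x) == "HC":
            return {"status": "HC", "disease": None, "severity": None}
        disease = self.identify(x)
        return {"status": "NDD", "disease": disease,
                "severity": self.graders[disease](x)}

# Toy stand-ins for the trained models (thresholds and grades are invented)
model = ThreeLevelCascade(
    detect=lambda x: "NDD" if x[0] > 1.0 else "HC",
    identify=lambda x: "PD" if x[1] < 0.5 else "HD",
    graders={"PD": lambda x: 2.0, "HD": lambda x: 7.0, "ALS": lambda x: 12.0},
)
print(model.predict([0.8, 0.3]))
print(model.predict([1.4, 0.3]))
```

Keeping one grader per disease reflects that each disorder is graded on its own clinical scale (H&Y, UHDRS-TFC, or disease duration).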

Literature review

Recent years have witnessed the use of gait measurement devices such as pressure-sensitive switches, motion camera systems, and inertial sensors to measure gait abnormality in NDD patients. To address the problem of disease classification, predictive techniques such as Machine Learning (ML) and Deep Learning (DL) have been used over the past few decades. Table 1 provides an overview of existing studies with their outcomes and limitations. The earlier literature mostly targets detection of NDD by classifying normal vs. pathological gait (binary classification). Xia et al.15 used statistical features extracted from gait rhythm and reported an accuracy of 96.83% for classifying NDD patients and healthy controls (HC). The study further reported the mean value, Teager energy, and fuzzy entropy as the most distinguishing features. Pham et al.21 showed that texture analysis, namely the fuzzy recurrence plot, provides considerable insight through visualization of the gait dynamics of NDD patients. This approach, which transforms gait time series into images, also showed significant potential for binary classification (HC vs. PD, HC vs. ALS, and HC vs. HD). The work in22 combined the qualitative approach of fuzzy networks with the adaptability of neural networks to develop an adaptive neuro-fuzzy inference system (ANFIS), with particle swarm optimization (PSO) used to learn the network parameters. Although the developed model reached 90% accuracy, a more rigorous evaluation is required before it can be used clinically to monitor the rate of disease progression. Additionally, when a rule-based decision tree is used for NDD classification, the accuracy for detecting the disorder class from HC decreases by ~3%17. Another work by Yan et al.16 discussed the use of topological motion analysis to study gait fluctuations and achieved a higher area under the curve (AUC). 
Also, the study showed higher potential for understanding characteristic gait pattern of NDD patients which would be effective to support early intervention.

The above-mentioned studies focus on extracting gait features from physiological signals. Raw gait signals and DL methodologies, however, were used less often in the early literature. The recent work by Lin et al.23 classified NDD patients and HC using a convolutional neural network (CNN) applied to gait signals with a segmentation size of 10 s and achieved an accuracy of more than 96% for multiclass classification. Additional work in24 discussed gait dynamics recognition in NDD patients using DL methodologies. Using DL algorithms, the researchers in25 developed a quick response code to evaluate NDD patients against HC based on disease severity grading. DL methodologies have proved more effective than traditional ML algorithms and pattern recognition techniques for both binary and multi-class classification. However, the main challenges6 to adopting such systems in clinical scenarios include the need for high-performance computing facilities and resources. The algorithms require training on large amounts of data, and the lack of sufficiently comprehensive medical data is considered the bottleneck. The process is also iterative, involving the tuning of multiple hyperparameters, which adds model complexity and computational time. It would therefore be valuable to design automated diagnostic systems that operate under resource constraints and still assist with clinical needs. A simple computational approach that characterizes human walking using gait biomarkers has the potential to assist in estimating disease severity and warrants further investigation.

Table 1 Literature review of existing studies with their outcomes and limitations.

Therefore, the present research work proposes a three-level hybrid framework that detects a wide range of NDD using gait biomarkers. Clinically relevant normalized spatiotemporal gait features are used to detect gait abnormality and its type in NDD patients. Thereafter, the model grades disease severity and provides prognostic outcomes, thereby supporting early disease diagnosis.

Experimental results

Statistical analysis of gait characteristics

The Kolmogorov-Smirnov (KS) test shows that the considered high-level gait characteristics are non-normally distributed. To confirm the statistical significance of these parameters at the 95% confidence level (p < 0.05), a Kruskal-Wallis (KW) analysis is conducted, with Bonferroni correction applied to the multiple pairwise comparisons between classes. Table 2 reports the results, with non-significant class pairs marked as superscripts. A few gait features do not differ significantly between classes, mainly within the NDD classes. Stride time and nm_swing time are similar in the PD and HD groups, and swing time does not differ significantly between HD and HC. The CoV_DS_time feature of ALS patients is similar to that of HC. Patients in the PD and ALS classes also have similar walking speeds. Nevertheless, the majority of gait characteristics (~66.6%) show distinct patterns between the considered patients and healthy individuals and are therefore useful for discriminating individual gait patterns and for automated diagnostics.
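
The group comparison above can be illustrated with a minimal standard-library sketch of the Kruskal-Wallis H statistic and the Bonferroni-adjusted threshold for all pairwise class comparisons. The sample values are invented, the H statistic omits the tie-correction factor, and the helper names are assumptions; in practice a statistics package would normally be used.

```python
from itertools import combinations

def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic (no tie correction) for a list of independent samples."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

def bonferroni_alpha(class_names, alpha=0.05):
    """Per-comparison threshold when every pair of classes is compared."""
    pairs = list(combinations(class_names, 2))
    return pairs, alpha / len(pairs)

# Invented stride-time-like samples for three groups
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3))                      # 7.2, above the chi-square critical value 5.991 (df = 2)
pairs, alpha_adj = bonferroni_alpha(["PD", "HD", "ALS", "HC"])
print(len(pairs), round(alpha_adj, 5))  # 6 0.00833
```

With four classes there are six pairwise comparisons, so each post-hoc test is judged against 0.05/6 rather than 0.05.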

Table 2 Statistical analysis results with non-significant values (p > 0.05) (PD-a, HD-b, ALS-c, & HC-d).

Classification results of proposed three-level hybrid model

The proposed three-level hybrid methodology, which uses a DT classifier with the possibility approach, outperformed conventional DT classifiers based on probability approaches (Gini index and Information Gain). Experimental analysis (Table A1, Appendix) shows that normalized gait features reduce anthropometric variability, enhance inter-subject comparability, and improve model performance across diverse populations. Based on both evidence from the literature and this empirical validation, dimensionless normalization is used for all further analysis, and the results reported here use the normalized NDD dataset with stratified k-fold validation (k = 10). The dataset consists of gait data collected from unique individuals and is divided into two independent subsets. The training dataset includes gait cycles from 37 subjects (PD = 7, HD = 10, ALS = 10, HC = 10), 9,615 gait cycles in total. The testing dataset comprises gait cycles from a distinct set of 22 subjects not included in the training dataset (PD = 3, HD = 10, ALS = 3, HC = 6), contributing 5,545 gait cycles. This initial segregation avoids mixing of data samples between the two datasets. Here, “gait cycles” refer to individual time-series samples, whereas “subjects” are the unique participants from whom multiple gait cycles were recorded. The approach also mirrors real-life settings, in which a model is applied to newly recruited patients to provide automated disease diagnostics in support of clinical needs.
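
A subject-wise split of this kind can be sketched as below. The subject IDs, feature vectors, and function name are hypothetical; the point is only that the split is made on subjects, not on individual gait cycles, so that no participant contributes to both subsets.

```python
def split_by_subject(cycles, test_subjects):
    """Split gait cycles so that no subject contributes to both subsets.

    cycles: iterable of (subject_id, feature_vector) tuples.
    test_subjects: set of subject IDs reserved for testing.
    """
    train, test = [], []
    for subject_id, features in cycles:
        target = test if subject_id in test_subjects else train
        target.append((subject_id, features))
    # Guard against leakage: the two subsets must share no subject.
    train_ids = {s for s, _ in train}
    test_ids = {s for s, _ in test}
    assert train_ids.isdisjoint(test_ids), "subject leakage between subsets"
    return train, test

# Hypothetical subject IDs and one-feature vectors
cycles = [("pd01", [1.10]), ("pd01", [1.12]), ("hc03", [1.01]), ("als02", [1.30])]
train, test = split_by_subject(cycles, test_subjects={"hc03"})
print(len(train), len(test))  # 3 1
```

Splitting at the gait-cycle level instead would place cycles from the same person in both subsets and inflate the measured accuracy.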

The classification results of the hybrid model using the proposed possibility approach are tabulated in Tables 3, 4 and 5, alongside those of conventional DT classifiers (probability approaches) for comparison. Level 1 of the proposed model (Table 3) classifies NDD patients from HC with an accuracy of (99.78 ± 0.03)%, which is ~14–15% higher than that of the probability approaches. Sensitivity, specificity, precision, and F1-score are higher by 11.36–13.7%, 19–22.5%, ~8%, and ~10–11%, respectively. These outcomes show the high performance of the proposed model, with high positive predictive values and a lower false discovery rate for HC and NDD patients. The proposed model achieved a high AUC of ~0.99, suggesting strong class separability between HC and NDD patients. Although a high AUC indicates that the model is effective at ranking NDD instances above HC instances across various thresholds, further validation on larger independent datasets is necessary to confirm its generalizability. Compared to traditional DT classifiers (AUC ~0.83), the proposed method demonstrates a notable improvement in discriminative capability.
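
For reference, the level 1 metrics named above follow from the binary confusion matrix as sketched here, with NDD taken as the positive class. The counts are invented for illustration and are not the study’s results.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics with NDD as the positive class."""
    sensitivity = tp / (tp + fn)              # recall / true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    precision = tp / (tp + fp)                # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "false_discovery_rate": 1.0 - precision,
    }

# Invented counts: 100 NDD cycles and 100 HC cycles
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The false discovery rate is the complement of precision, which is why a high positive predictive value and a low false discovery rate are reported together.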

Table 3 Classification results of level 1 (HC vs. NDD) using different models.

Thereafter, to determine the type of NDD, the dataset is evaluated by level 2 of the proposed model (Table 4). The model achieves an accuracy ~29.8% higher than that of the conventional DT approach. Additional performance metrics indicate comparable diagnostic performance across the considered disease types. At Level 2 classification, the proposed model achieved a high AUC of 0.998, indicating strong class separability among NDD subtypes (HD, ALS, and PD). However, given the relatively small test sample, further validation on large cohorts is essential to verify the model’s generalizability. In contrast, the conventional DT model exhibited lower discriminative capability, with higher specificity than sensitivity. Notably, its misclassifications are more frequent between HD and PD, likely due to overlapping gait characteristics. Approximately 25–28% of PD patients were incorrectly classified as HD, contributing to a higher false discovery rate. These findings suggest that the conventional DT model lacks the capacity to distinguish subtle differences in gait patterns among closely related NDD subtypes.

Table 4 Classification results of level 2 (HD vs. ALS vs. PD) using different models.

Further, to determine disease severity, level 3 of the proposed model shows higher accuracy with minimal errors (Table 5) for the considered disease types (HD, ALS, and PD). The reported RMSE is lower than that of the conventional DT model by 6.4–38.2%, 9–25.43%, and 12.8–34.6% for HD, ALS, and PD, respectively. The high R² and low MAE further support the proposed methodology. The three-level hybrid approach is also computationally efficient, with an average model training time of 3 min and 34 s. For predicting disease severity (Level 3), all models require comparable training time. For classifying NDD subjects by disease type, the training time of the proposed model is ~75–78% higher than that of GI, but the model lowers the misclassification rate, reducing bias and error. The training time is almost equal to that of the IG model (within ~1%), with the added advantage of higher accuracy, precision, F1-score, and recall. This trade-off makes the proposed framework a viable choice for classification of disease severity. All computations are performed on a PC with 16 GB RAM and an Intel(R) Xeon(R) W-1255 CPU @ 3.30 GHz.
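
The level 3 error measures, and the relative RMSE reduction used to compare models, can be computed as in this minimal sketch. The severity scores are invented and the function names are assumptions.

```python
import math

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, and coefficient of determination (R^2)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {"mae": mae, "rmse": rmse, "r2": 1.0 - ss_res / ss_tot}

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Invented severity grades and predictions
scores = regression_metrics([1, 2, 3, 4], [1, 2, 3, 5])
print(scores)                         # {'mae': 0.25, 'rmse': 0.5, 'r2': 0.8}
print(relative_reduction(2.0, 1.5))   # 25.0
```

A statement such as "RMSE lower by 25%" corresponds to `relative_reduction(baseline_rmse, proposed_rmse)` evaluated on the same test data.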

Table 5 Regression results of level 3 (Disease severity Prediction) using different models.

Thus, the results show that the proposed three-level hybrid model recognizes the gait patterns of the subjects well and is able to provide automated disease diagnostics. Because the approach uses high-level gait features and data normalization, the embedded domain knowledge contributes substantially to these outcomes. In addition, segregating the data by subject at the outset leads to a more general model, as required for real-life medical applications.

In addition, a feature relevance analysis, namely recursive feature analysis27, revealed disease-specific gait biomarkers across neurodegenerative disorders (Table 6) that are relevant to clinical needs. For Huntington’s disease (HD), gait variability had the highest predictive weight (~82%), along with reduced gait speed (~68%), consistent with findings that HD patients exhibit significantly greater gait variability28 than patients with other neurodegenerative conditions, likely due to choreiform movements and impaired motor coordination. A previous study by Browning et al.29 also confirmed elevated stride-to-stride variability, with CoV significantly higher than in Parkinson’s disease (PD) or controls. In PD, shorter strides (~69%) and increased swing-phase asymmetry (~81%) are the key discriminants. These findings align with the well-established shuffling gait and increased bilateral asymmetry in PD patients, which are attributed to bradykinesia and rigidity30. Existing studies have also highlighted substantial changes in gait timing in PD patients31. For ALS, longer stride length and prolonged double support (DS) time (~77%) are identified as the primary discriminative gait features, in line with previous findings. Gait analysis by McArdle et al.32 demonstrated that reduced walking velocity, due to lower limb weakness, is accompanied by shorter stride length, reduced cadence, and increased double support time, which serve as compensatory mechanisms for maintaining balance33. Additionally, stride-to-stride rhythm analysis revealed that ALS gait is markedly less steady and temporally disorganized, underscoring elevated gait variability consistent with neuromuscular deterioration.

Table 6 Most influential gait features for each NDD type.

Comparative analysis of proposed three-level hybrid model with state-of-art classifiers

The statistical significance of the proposed hybrid model is tested against the state-of-the-art DT classifier using the Wilcoxon signed-rank test. The null hypothesis \(H_0\) states that there is no statistical difference between the classification accuracy of the proposed model and that of existing state-of-the-art DT classifiers. A p-value below 0.05 allows the null hypothesis to be rejected. If \(H_0\) is rejected, the alternative hypothesis \(H_a\) is accepted, which signifies a statistically significant difference between the two models.
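
The test described above can be illustrated with a small exact implementation for paired fold accuracies. This is a sketch under stated assumptions: the fold accuracies are invented, the function assumes no zero differences and no tied absolute differences, and the exact p-value is obtained by enumerating all sign patterns, which is only practical for small samples such as k = 10 folds.

```python
from itertools import product

def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.

    Assumes no zero differences and no tied absolute differences.
    Returns (w_plus, p_value).
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    total_rank = n * (n + 1) // 2
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total_rank - w_plus)
    # Under H0 each of the 2^n sign patterns is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if min(w, total_rank - w) <= observed:
            extreme += 1
    return w_plus, extreme / 2 ** n

# Invented 10-fold accuracies (%) for two models
proposed = [99.8, 99.7, 99.8, 99.7, 99.8, 99.8, 99.7, 99.8, 99.8, 99.7]
baseline = [85.0, 84.2, 86.1, 83.5, 85.7, 84.9, 86.4, 83.9, 85.2, 84.6]
w_plus, p_value = wilcoxon_signed_rank(proposed, baseline)
print(w_plus, p_value)  # 55 0.001953125
```

When one model wins on every fold, only the two most extreme sign patterns are as extreme as the observation, so the two-sided p-value is 2/2^10, well below 0.05.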

Table 7 Statistical significance between proposed model and other SOTA classifiers using Wilcoxon signed rank test.

Table 7 shows that the difference between the possibility approach (proposed model) and the probability approaches (conventional DT classifier) is statistically significant (p < 0.05). The null hypothesis is therefore rejected, indicating that the proposed model performs better while retaining low complexity and architectural simplicity. The model demonstrated its capability to extract informative parameters from gait patterns to automate the diagnostic process for NDD patients.

Discussion

The main focus of this work is to develop an automated framework for assessing the severity of NDDs, providing clinical support using a possibility-based approach and a normalized high-level gait feature set. Compared to traditional ML and DL models, our proposed framework offers notable advantages in both inference quality and performance stability.

Comparison with traditional ML approaches

Earlier studies have demonstrated that conventional ML models, while effective in binary classification tasks, struggle with consistency and sensitivity when handling more complex or multi-class scenarios. For instance, Fraiwan et al.6 employed statistical features from vertical ground reaction force (VGRF) signals and an ensemble AdaBoost classifier to distinguish ALS, PD, and HD from HC, reporting high class-specific accuracies (98–99%). However, their reliance on single-phase features (stance, swing, stride) led to a significant drop in overall accuracy (80%). Similarly, Begum et al.34 utilized Fast Hadamard Transform and recurrence analysis with a random forest (RF) classifier, yielding 91.4% accuracy but only 82.5% sensitivity which poses a major risk in analyzing patients’ health status. However, in an attempt to improve classification, entropy measures were added to the VGRF signal, and a k-nearest neighbors (kNN) classifier was used35, which slightly improved the performance but still lacked robustness across diverse NDD types.

Other research efforts focused on characterizing gait fluctuations. For instance, non-parametric Parzen window estimation was used to calculate the probability density function (pdf) of stride, stance, and swing time, followed by classification using least squares support vector machines (LS-SVM). This method classified ALS and PD patients from HC with accuracies of 82.8% and 90.32%, respectively36,37, indicating that performance varies with the disorder and the selected features. Deterministic learning and radial basis function neural networks were also explored to extract prior knowledge from gait fluctuations38, aiding pattern recognition. Further enhancements were observed when neural network capabilities were combined with fuzzy logic and particle swarm optimization (PSO), achieving accuracies between 90 and 94%22. Yet these models remained constrained and were typically limited to binary classification tasks.

In contrast, the proposed possibility-based framework, leveraging a normalized high-level gait feature set, overcomes several of these challenges. It enhances sensitivity, supports multiclass classification, and provides better inferential depth, especially for disease severity grading. Unlike the aforementioned studies that relied heavily on handcrafted and often phase-specific features, our method incorporates higher-level abstractions from the gait cycle, which leads to improved generalizability and robustness across diverse patient categories.

Comparison with deep learning-based models

Recent DL models, particularly those based on convolutional neural networks (CNNs), have improved classification accuracy but often at the cost of interpretability and computational efficiency. Lin et al.23 transformed short-duration (10 s) VGRF signals into image representations and achieved binary classification accuracies of ~95–100% and multiclass classification accuracy of up to 98%. Despite these strong performance metrics, DL models introduce significant computational overhead. Similarly, twin-layered CNNs and wavelet coherence spectrograms39,40 offered slight improvements (2–3%) over traditional ML, but their complexity restricts real-time or resource-limited clinical deployment. Moreover, most DL-based frameworks have been developed for binary classification (i.e., HC vs. NDD), with limited exploration of multiclass settings or disease severity grading. The performance benefits, while statistically significant, come at the cost of model transparency and deployment feasibility in clinical workflows.

Conversely, our model achieves equivalent or better performance without the computational cost of deep networks. By integrating a possibility approach and high-level gait descriptors, the model remains efficient, interpretable, and robust in both binary and multi-class classification tasks.

Addressing previously reported limitations

Most existing literature has concentrated on binary classification (HC vs. NDD), with very few studies addressing multiclass classification or severity grading. Notably, only one prior study41 attempted disease severity grading using gait dynamics with traditional ML/DL methods, which suffered from data insufficiency and weak inference capabilities. Our work directly addresses these gaps by introducing a possibility-based framework that incorporates high-level normalized gait features. This approach enables robust multi-class classification (PD vs. HD vs. ALS) and provides a reliable grading mechanism for disease severity.

Proposed model evaluation and significance

Figure 1a, b compares our approach with41 on various performance measures. The comparison indicates clear improvements: higher classification accuracy, a larger coefficient of determination (R²), and lower mean absolute error (MAE) and root mean square error (RMSE). These findings establish the model’s effectiveness and robustness in NDD classification and severity estimation, paving the way for its use in clinical decision support.

Fig. 1

(a) Comparative analysis of our proposed model with existing studies for classification of NDD vs. HC. (b) Comparative analysis of our proposed model with existing studies for evaluation of disease severity (PD, HD, and ALS).

Conclusion

Accurate and reliable assessment of disease severity is crucial for dose management and for determining the rate of disease progression. In this research work, an automated decision support framework is proposed not only to classify NDD but also to determine disease severity, based on clinically developed scales, for PD, HD, and ALS patients using gait trajectories. Using the proposed DT-based possibility approach, neurological disease is detected with an accuracy of ~99%, and disease severity is evaluated with R² of 0.89–0.90 and low MAE (0.189–3.33), which shows that detection and severity evaluation are more accurate than with traditional ML approaches. An added advantage of the developed approach is that the entropy used to reach an output class is computed from the information carried by the source values within each gait feature or attribute, which is particularly significant for model training. This yields stable outputs with little variation across iterations. However, the hyperparameter tuning needed to determine accurate information at each node depends on the dataset and its distribution. Therefore, in future, an ensemble learning classifier (ELC) using the possibility approach, or a combined probability-possibility strategy with distinct hyperparameters, can be studied and analyzed. Although the proposed method assesses the severity of NDD with high accuracy, some limitations remain that can be addressed in future studies. The number of subjects was limited; future studies should consider data augmentation to increase the sample size, and data balancing and Generative AI techniques42 can be explored to observe their impact on model robustness and generalizability. 
In addition, to enhance the clinical relevance of the study, it is recommended to carry out longitudinal assessment for monitoring the disease progression rate and evaluation of therapeutic intervention outcomes.

Methods and materials

Dataset description

The flow diagram of the methodological framework for evaluating NDD severity is shown in Fig. 2. The open-access, publicly available “Gait in Neurodegenerative Disease” database43 is used in the current study. It includes 11,084 gait pattern recordings (Fig. 3) from a total of 48 patients suffering from three distinct disease classes (PD: 3688 gait recordings; HD: 4846 gait recordings; ALS: 2550 gait recordings). In addition, the database includes gait step recordings from age-matched HC (4076 gait recordings).

Fig. 2

Methodological Framework for evaluation of neurodegenerative disease severity (PD, HD, and ALS).

Fig. 3

Sample distribution of gait recordings among considered subjects.

During data acquisition, force-sensitive switches that measure the raw vertical ground reaction force (VGRF) signal during locomotion are placed in the subjects’ insoles. A small, portable, lightweight data logger unit (5.5 × 2 × 9 cm; 0.1 kg) is worn on the ankle in an ankle wallet. The data, digitized at 12-bit resolution by on-board analog-to-digital conversion (ADC), are used for analysis. The data collection involves each subject walking normally along a 77 m hallway for 5 min, with a turn at each end33. Table 8 summarizes the characteristics and demographic details of the involved subjects.

Table 8 Demographic details (µ ± σ) of considered subjects.

Pre-processing and feature extraction

To study the gait dynamics, the initial 20 s of data, which represent gait initiation, are removed before further analysis. Outliers, which usually arise during the turning phase, are corrected using a median filter. The pre-processed gait signal is then segmented into gait cycles, from which clinically relevant high-level gait features are extracted following the guidelines of the Biomathics and Canadian consortium44. The formulation and definition of the included primary and secondary gait features are provided in Table 9. The walking pattern of a subject varies with disease progression and can be assessed using gait dynamics. Studying gait dynamics through high-level gait features thus helps to understand disease staging and acts as a significant gait biomarker. To establish the statistical significance of gait characteristics among PD, HD, ALS, and HC, the Kruskal-Wallis (KW) test (p < 0.05) is used after checking data normality with the KS test.
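
The two pre-processing steps above can be sketched as follows. The window length, the edge handling (replicating the first and last values), and the sample data are assumptions made for illustration; the study does not state the filter settings in this section.

```python
from statistics import median

def drop_gait_initiation(times, values, skip_seconds=20.0):
    """Discard samples recorded during the first `skip_seconds` of the walk."""
    start = times[0] + skip_seconds
    kept = [(t, v) for t, v in zip(times, values) if t >= start]
    return [t for t, _ in kept], [v for _, v in kept]

def median_filter(values, window=3):
    """Sliding median with an odd window; edges are padded by replication."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(values))]

# Invented stride-time series (s) with one turn-related outlier at t = 30 s
times = [0.0, 10.0, 20.0, 30.0, 40.0]
stride_times = [1.0, 1.1, 1.0, 3.0, 1.1]
kept_times, kept_values = drop_gait_initiation(times, stride_times)
print(kept_times)                  # [20.0, 30.0, 40.0]
print(median_filter(kept_values))  # [1.0, 1.1, 1.1]
```

The median replaces the isolated spike with a neighbouring value while leaving regular strides essentially unchanged, which is why it suits turn-related outliers.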

Table 9 Extracted gait features with formulation and definition.

To compute the primary and secondary gait features, consider a gait cycle that starts at the heel-strike of the right foot, RHS(n), where n denotes the present stride and n + 1 the next stride. Similarly, RTO(n) denotes the present toe-off of the right foot. For the left foot, these events are denoted by LHS(n) and LTO(n), respectively.
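As an illustration of how such events translate into temporal features, the sketch below derives stride, stance, and swing times from right-foot events. The definitions used here are common conventions and are an assumption; they are not necessarily the exact formulations of Table 9. The function name and the event times are hypothetical.

```python
def right_foot_features(rhs, rto):
    """Per-stride temporal features from right heel-strike and toe-off times (s).

    Assumes rhs[n] < rto[n] < rhs[n + 1] for every stride n.
    """
    features = []
    for n in range(len(rhs) - 1):
        stride = rhs[n + 1] - rhs[n]   # RHS(n) to RHS(n + 1)
        stance = rto[n] - rhs[n]       # RHS(n) to RTO(n): foot on the ground
        swing = rhs[n + 1] - rto[n]    # RTO(n) to RHS(n + 1): foot in the air
        features.append({
            "stride_time": stride,
            "stance_time": stance,
            "swing_time": swing,
            "stance_pct": 100.0 * stance / stride,
        })
    return features

# Hypothetical event times in seconds.
rhs = [0.00, 1.10, 2.22]
rto = [0.68, 1.79]
feats = right_foot_features(rhs, rto)
```

By construction, stance and swing times add up to the stride time for each cycle.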

The distribution of the primary extracted gait features among the considered subjects is presented in Fig. 4. The included features are strongly associated with the disease characteristics that describe each disease and its severity index.

Fig. 4

Representation of extracted gait features among considered subjects.

Data normalization

The extracted features have the potential to evaluate lower-limb functionality and health outcomes. However, spatiotemporal gait characteristics are affected by subjects’ demography and physical dimensions, which can bias classification outcomes45,46. To overcome this, it is desirable to reduce data dispersion, which can improve prediction performance by reducing error bias. This work normalizes the gait features using dimensionless equations because of their effectiveness20. The temporal parameters \((t_N)\) do not scale linearly with body height \((h)\); at the preferred walking pace they follow a pendulum model and are therefore related to the square root of body height, as given in Eq. (1).

$$t_N = \frac{T_R}{\sqrt{\frac{h_i}{g}}}$$
(1)

Other parameters, such as gait speed \((v_N)\), are inversely related to their temporal counterparts and are therefore transformed using Eq. (2).

$$v_N = \frac{v_R}{\sqrt{\frac{g}{h_i}}}$$
(2)

Here, \(g = 9.8\ \text{m/s}^2\) is the acceleration due to gravity and \(h_i\) is the individual’s height. After normalization of the gait features, the model’s capability is established using the proposed methodological framework.
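The two transformations can be sketched directly from Eqs. (1) and (2), following them exactly as written. The function names and the example subject (height 1.70 m, stride time 1.10 s, speed 1.20 m/s) are hypothetical and serve only as an illustration.

```python
import math

G = 9.8  # acceleration due to gravity (m/s^2), as stated in the text

def normalize_time(t_raw, height_m):
    """Eq. (1): t_N = T_R / sqrt(h_i / g)."""
    return t_raw / math.sqrt(height_m / G)

def normalize_speed(v_raw, height_m):
    """Eq. (2) as written: v_N = v_R / sqrt(g / h_i)."""
    return v_raw / math.sqrt(G / height_m)

# Hypothetical subject: 1.70 m tall, stride time 1.10 s, gait speed 1.20 m/s.
t_n = normalize_time(1.10, 1.70)
v_n = normalize_speed(1.20, 1.70)
```

Because the scaling factor depends only on height, two subjects with the same raw stride time but different heights receive different normalized values, which is the intended correction for body dimensions.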

Proposed prediction model

Automated clinical diagnostic models built using ML techniques retrieve information from underlying data patterns. This work uses supervised tree-based structures, built iteratively from a decision-based root node down to leaf nodes according to a given set of rules. An added advantage of these structures is that they handle class complexity in both linear and non-linear datasets. Here, a possibilistic approach is used for disease prediction. To define disease severity levels, this study uses well-established clinical scales specific to each NDD. For PD, the Hoehn and Yahr (H&Y) staging scale7,8, which classifies motor symptom progression from stage 1 (unilateral involvement) to stage 5 (wheelchair-bound or bedridden), is used to assess disease severity. For HD, the Unified Huntington’s Disease Rating Scale-Total Functional Capacity (UHDRS-TFC)8 is employed, providing a functional staging framework ranging from 13 (normal function) to 0 (complete dependence). For ALS, owing to its heterogeneity and the lack of a universally applied staging system, the time since clinical diagnosis is used as a marker of disease progression, as is common in longitudinal ALS studies. These criteria are used to stratify participants into severity categories for developing and evaluating the downstream model.

In addition, 10-fold cross-validation (CV) is used for validation: the whole training dataset, comprising 9615 gait cycles distributed almost equally among the four classes (PD, HD, ALS, and HC), is divided into ten approximately equal parts. Each part in turn (962 gait cycles) is used for validation, while the remaining nine parts (8653 gait cycles) are used to train the model. The model’s performance is rated by the score averaged over the folds, which reduces the bias and variance of the estimate on unseen samples. The classification performance is evaluated using accuracy (Acc.), sensitivity/recall, specificity, precision, F1-score, and error rate. For disease severity, the metrics are mean square error (MSE), root mean square error (RMSE), training time, and coefficient of determination (\(R^2\))31. The proposed approach is also compared with the conventional probabilistic approach.
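The fold sizes and the evaluation metrics named above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the folds are contiguous (shuffling and class stratification are omitted), and the metric formulas are the standard textbook definitions. Since 9615 is not divisible by ten, the folds contain either 961 or 962 gait cycles.

```python
import math

def kfold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from binary confusion counts (one class versus the rest)."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    sens = tp / (tp + fn)           # sensitivity / recall
    spec = tn / (tn + fp)           # specificity
    prec = tp / (tp + fp)           # precision
    f1 = 2 * prec * sens / (prec + sens)
    return {"accuracy": acc, "sensitivity": sens, "specificity": spec,
            "precision": prec, "f1": f1, "error_rate": 1.0 - acc}

def regression_metrics(y_true, y_pred):
    """MSE, RMSE and R^2 for severity predictions."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    return mse, math.sqrt(mse), r2

folds = kfold_indices(9615, k=10)
sizes = [len(f) for f in folds]
```

For a validation fold of 962 gait cycles, the nine remaining folds contain 9615 - 962 = 8653 gait cycles, matching the split described above.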

Consider an NDD dataset of size \(N \times M\), where N denotes the total number of samples and M the number of considered attributes (features), given by \((Z_1, Z_2, Z_3, \dots, Z_M)\). The algorithm starts by randomly selecting one attribute from \((Z_1, Z_2, Z_3, \dots, Z_M)\) and splitting it into child nodes so as to obtain high information gain. The process continues iteratively with the remaining attributes, and the attribute with the best splitting criterion over the dataset is selected as the root node. The root node is split into child nodes at a predetermined threshold; each child node then becomes a parent node, and the process recurs until all given samples have been considered. Multiple stopping criteria are used to terminate the process during the validation phase, such as reaching the maximum depth, a node containing fewer samples than the minimum sample split, or a node containing only one class. Finally, the target class is predicted as the classification output at the leaf node.
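The recursive procedure and its stopping criteria can be sketched as below. This is a simplified illustration, not the model developed in this work: it uses Shannon entropy with an exhaustive threshold search, visits attributes in index order rather than random order (which yields the same best split), and all names, default values, and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (gain, feature index, threshold) of the best binary split, or None."""
    parent, n, best = entropy(labels), len(labels), None
    for j in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so that both child nodes are always non-empty.
        for thr in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= thr]
            right = [y for r, y in zip(rows, labels) if r[j] > thr]
            gain = parent - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
            if best is None or gain > best[0]:
                best = (gain, j, thr)
    return best

def build_tree(rows, labels, depth=0, max_depth=3, min_samples_split=2):
    majority = Counter(labels).most_common(1)[0][0]
    # Stopping criteria: pure node, maximum depth reached, or too few samples.
    if len(set(labels)) == 1 or depth >= max_depth or len(labels) < min_samples_split:
        return {"leaf": majority}
    split = best_split(rows, labels)
    if split is None or split[0] <= 0:
        return {"leaf": majority}
    _, j, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > thr]
    return {
        "feature": j, "threshold": thr,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth, min_samples_split),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth, min_samples_split),
    }

def predict(tree, row):
    """Follow the decision rules from the root to a leaf and return its class."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]

# Toy data: two hypothetical features per gait cycle, two classes.
rows = [[1.0, 60.0], [1.1, 62.0], [1.4, 70.0], [1.5, 71.0]]
labels = ["HC", "HC", "PD", "PD"]
tree = build_tree(rows, labels)
```

On this toy set a single split on the first feature separates the two classes, so both child nodes are pure and become leaves.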

The formulation of decision trees47,48 is based on recursive binary splitting following a top-down greedy approach. The values of each attribute are partitioned into subsets recursively until the final prediction is attained at a leaf node. The decision rules used to build the tree are acquired through model training. As the depth of the model increases, the associated uncertainty decreases (entropy decreases) and the information about the target prediction increases. The Gini Index (GI) of an attribute undergoing a node split is determined using Eq. (3). It is a probabilistic measure of the likelihood that a randomly chosen instance is misclassified; therefore, the lower the GI, the lower the likelihood of misclassification.

$$GI = 1 - \sum_{i=1}^{k} P(i)^2$$
(3)

where \(k\) is the number of classes in the considered dataset and \(P(i)\) is the probability of class \(i\).
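As a worked example of Eq. (3), a node in which the four classes are equally likely has \(GI = 1 - 4 \times 0.25^2 = 0.75\), whereas a node containing a single class has \(GI = 0\). The short sketch below reproduces this arithmetic; the function name and the label lists are illustrative.

```python
from collections import Counter

def gini_index(labels):
    """Eq. (3): GI = 1 - sum_i P(i)^2 over the classes present in `labels`."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

balanced = gini_index(["PD", "HD", "ALS", "HC"])  # four equally likely classes
pure = gini_index(["PD", "PD", "PD"])             # a single class
```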

Another probability-based criterion widely used for node splitting in decision trees is Information Gain (IG), determined as given in Eq. (4).

$$IG = H_{root\_node} - \text{weighted avg}\left(H_{leaf\_node}\right)$$
(4)

where H represents the entropy, and the weighted average combines the entropies of the left and right nodes produced by the split, each weighted by its share of samples; M denotes the number of samples in the left or right node and N is the total number of samples. The formulations of the entropy and of the weighted average over the leaf nodes are given in Eqs. (5) and (6).

$$H(X) = -\sum_{i=1}^{k} P_i \log P_i$$
(5)

$$\text{weighted avg}\left(H_{leaf\_node}\right) = \frac{M_{leaf\_node(left)}}{N} H_{leaf\_node(left)} + \frac{M_{leaf\_node(right)}}{N} H_{leaf\_node(right)}$$
(6)

where \(H_{root\_node} = H(X)\) and \(H_{leaf\_node} = H(X)\), each evaluated on the samples of the respective node.
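Eqs. (4) to (6) can be checked numerically with the sketch below. The base-2 logarithm is an assumption, since Eq. (5) does not fix the base, and the function names and label lists are illustrative.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Eq. (5), using a base-2 logarithm (the base is an assumption)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Eq. (4): parent entropy minus the weighted child entropy of Eq. (6)."""
    n = len(parent)
    weighted = (len(left) / n) * shannon_entropy(left) \
        + (len(right) / n) * shannon_entropy(right)
    return shannon_entropy(parent) - weighted

parent = ["PD", "PD", "HC", "HC"]
perfect = information_gain(parent, ["PD", "PD"], ["HC", "HC"])  # classes fully separated
useless = information_gain(parent, ["PD", "HC"], ["PD", "HC"])  # children mirror the parent
```

A split that separates the two classes completely recovers the full parent entropy of 1 bit, whereas a split whose children have the same class mix as the parent yields no gain.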

Apart from the probabilistic approaches, another entropy criterion widely used over the past decade is the possibilistic Hanman-Anirban entropy. Its generalized form is given in Eq. (7).

$$H_Z = \sum_i I_Z(z_i)\, e^{-\left(a_Z \left(I_Z(z_i)\right)^3 + b_Z \left(I_Z(z_i)\right)^2 + c_Z I_Z(z_i) + d_Z\right)^{\alpha_Z}}$$
(7)

Here, the exponential term \(e^{-\left(a_Z \left(I_Z(z_i)\right)^3 + b_Z \left(I_Z(z_i)\right)^2 + c_Z I_Z(z_i) + d_Z\right)^{\alpha_Z}}\) is the information gain corresponding to \(I_Z(z_i)\), and \(a_Z\), \(b_Z\), \(c_Z\), and \(d_Z\) are real-valued parameters. These parameters can be chosen to capture the distribution of the information source values through the mean and standard deviation, which derive from the first- and second-order statistical moments. Setting \(a_Z = 0\), \(b_Z = 0\), \(c_Z = \frac{1}{\sigma_Z}\), and \(d_Z = -\frac{I_{mean}}{\sigma_Z}\), the information gain becomes \(e^{-\left(\frac{I_Z(z_i) - I_{mean}}{\sigma_Z}\right)^{\alpha_Z}}\), where \(\alpha_Z = 0.5, 1, \dots, 5\) denotes the degree of membership. To obtain a Gaussian distribution, \(\alpha_Z = 2\) is taken. Moreover, the entropy associated with normalized non-linear information source values is given by Eq. (8).

$$H_{Z,NL} = \frac{1}{\left|N\right|} \sum_i I_Z(z_i)\, e^{-\left(a_Z \left(I_Z(z_i)\right)^3 + b_Z \left(I_Z(z_i)\right)^2 + c_Z I_Z(z_i) + d_Z\right)^{\alpha_Z}}$$
(8)
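Eqs. (7) and (8) can be evaluated numerically as sketched below. This is a minimal sketch under stated assumptions rather than the implementation used in this work: the function names are hypothetical, the information source values are assumed to be already normalized, \(\sigma_Z\) is taken as the population standard deviation, and the power is computed directly, so non-integer \(\alpha_Z\) is only supported where the polynomial term is non-negative.

```python
import math
from statistics import mean, pstdev

def hanman_anirban_entropy(values, a=0.0, b=0.0, c=1.0, d=0.0, alpha=1, normalize=False):
    """Eq. (7); with normalize=True the sum is divided by the number of values, as in Eq. (8).

    Assumes the polynomial term is non-negative whenever alpha is not an integer.
    """
    total = 0.0
    for v in values:
        poly = a * v ** 3 + b * v ** 2 + c * v + d
        total += v * math.exp(-(poly ** alpha))   # value times its information gain
    return total / len(values) if normalize else total

def gaussian_possibilistic_entropy(values, alpha=2):
    """Eq. (8) with a = b = 0, c = 1/sigma and d = -mean/sigma (Gaussian gain for alpha = 2).

    Uses the population standard deviation (an assumption); sigma must be non-zero.
    """
    mu, sigma = mean(values), pstdev(values)
    return hanman_anirban_entropy(values, c=1.0 / sigma, d=-mu / sigma,
                                  alpha=alpha, normalize=True)

# Hypothetical normalized information source values for one attribute.
values = [0.2, 0.5, 0.8]
h_gauss = gaussian_possibilistic_entropy(values)
```

With this parameter choice the polynomial term reduces to \(\left(I_Z(z_i) - I_{mean}\right)/\sigma_Z\), so values close to the mean receive a gain near 1 and values far from the mean are down-weighted.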

Furthermore, the proposed methodology hypothesizes that the possibilistic approach provides more significant prediction outcomes for determining disease severity. Hence, it can be used more effectively and efficiently in place of conventional DT classifiers built using the probabilistic approach.