Table 2 Comparison of gait impairment severity predictions between the model and three clinical experts

Examiner	Precision	Recall	Specificity	F1 score
Expert 1	0.926	0.892	0.961	0.904
Expert 2	0.761	0.692	0.833	0.668
Expert 3	0.897	0.886	0.939	0.885
Expert-Average	0.861	0.823	0.911	0.819
AI Model	0.804	0.811	0.898	0.806

Performance was evaluated on an independent test dataset of 25 participants using macro F1 score, precision, recall, and specificity. The consensus of three clinical specialists on the UPDRS scores of participants’ gaits served as the gold standard (Table 1). Expert-Average is the mean of each performance metric across the three experts. The F1 score, the harmonic mean of precision and recall, summarizes overall prediction performance.
Bold values indicate the performance of our model.
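
As an illustration of how the metrics named in this caption can be computed for a multi-class severity rating, the following is a minimal sketch, not the authors' implementation. It assumes that severity scores are integer class labels, that every metric is macro-averaged (computed one-vs-rest per class, then averaged with equal weight), and that a metric with a zero denominator is counted as 0. The function name `macro_metrics` and the toy labels are hypothetical and are not data from the study.

```python
from typing import Dict, List, Sequence


def macro_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], labels: Sequence[int]
) -> Dict[str, float]:
    """Macro-averaged precision, recall, specificity, and F1 (one-vs-rest)."""
    n = len(y_true)
    per_class: Dict[str, List[float]] = {
        "precision": [], "recall": [], "specificity": [], "f1": []
    }
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        tn = n - tp - fp - fn

        # Assumption: an undefined ratio (zero denominator) is scored as 0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )

        per_class["precision"].append(precision)
        per_class["recall"].append(recall)
        per_class["specificity"].append(specificity)
        per_class["f1"].append(f1)

    # Macro average: every class contributes equally, regardless of its size.
    return {name: sum(vals) / len(vals) for name, vals in per_class.items()}


# Toy example with three hypothetical severity classes (0, 1, 2).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
toy_result = macro_metrics(toy_true, toy_pred, labels=[0, 1, 2])
```

Because the macro F1 score is the mean of per-class F1 values, it can fall below both the macro precision and the macro recall, as in the Expert 2 row of the table.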
