Table 6 Ablation study of major components on the AUTSL dataset. Each module incrementally improves accuracy and F1-score, demonstrating their complementary contributions.
From: A deep learning-based method combines manual and non-manual features for sign language recognition
| Configuration | Accuracy (%) | F1-score | Inference Time (s) |
|---|---|---|---|
| Baseline (2D, no HPR, no MSA) | 85.6 | 0.86 | 1.4 |
| + Head Pose Rectification (HPR) | 87.8 | 0.88 | 1.3 |
| + Normalized 3D Skeleton (3D-Norm) | 88.9 | 0.89 | 1.3 |
| + Multi-Scale Attention (MSA) | 89.7 | 0.90 | 1.2 |
| Full Model (HPR + 3D-Norm + MSA) | 90.5 | 0.91 | 1.1 |
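
To make the size of each improvement explicit, the following minimal sketch computes the accuracy gain (in percentage points) and the change in inference time of every configuration relative to the baseline row. The numbers are copied from Table 6; the names `ABLATION` and `gains_over_baseline` are illustrative and not from the paper. The table does not state whether the "+" rows are cumulative, so the sketch only compares each row against the baseline.

```python
# Values copied from Table 6: (configuration, accuracy %, F1-score, inference time s).
# The tuple layout and all names here are illustrative assumptions.
ABLATION = [
    ("Baseline (2D, no HPR, no MSA)", 85.6, 0.86, 1.4),
    ("+ Head Pose Rectification (HPR)", 87.8, 0.88, 1.3),
    ("+ Normalized 3D Skeleton (3D-Norm)", 88.9, 0.89, 1.3),
    ("+ Multi-Scale Attention (MSA)", 89.7, 0.90, 1.2),
    ("Full Model (HPR + 3D-Norm + MSA)", 90.5, 0.91, 1.1),
]


def gains_over_baseline(rows):
    """Return (name, accuracy gain in points, inference time change in s)
    for every row after the first, measured against the first (baseline) row."""
    _, base_acc, _, base_time = rows[0]
    return [
        (name, round(acc - base_acc, 1), round(time_s - base_time, 1))
        for name, acc, _, time_s in rows[1:]
    ]


for name, acc_gain, time_delta in gains_over_baseline(ABLATION):
    print(f"{name}: {acc_gain:+.1f} points accuracy, {time_delta:+.1f} s inference time")
```

Rounding to one decimal place matches the precision reported in the table and avoids floating-point noise in the differences.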