Table 4. Extended ablation study on the UTD-MHAD dataset, showing the impact of removing or isolating key components of the XTinyHAR model across multiple evaluation metrics.

From: A tiny inertial transformer for human activity recognition via multimodal knowledge distillation and explainable AI

| Model variant | Test accuracy (%) | F1-score (%) | Precision (%) | Recall (%) | Cohen’s Kappa | FLOPs (M) |
|---|---|---|---|---|---|---|
| Full model (KD + PE + AR) | 98.71 | 98.71 | 98.72 | 98.71 | 0.985 | 11.3 |
| Without knowledge distillation (KD) | 96.84 | 96.80 | 96.87 | 96.84 | 0.962 | 11.3 |
| Without positional embedding | 97.42 | 97.38 | 97.40 | 97.42 | 0.973 | 11.3 |
| Without attention rollout | 98.11 | 98.08 | 98.12 | 98.11 | 0.978 | 11.3 |
| Only knowledge distillation (KD only) | 96.41 | 96.40 | 96.45 | 96.41 | 0.960 | 11.3 |
| Only positional embedding (PE only) | 95.78 | 95.73 | 95.70 | 95.78 | 0.954 | 11.3 |
| Only attention rollout (AR only) | 95.01 | 94.94 | 94.99 | 95.01 | 0.944 | 11.3 |
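
To make the per-component impact easier to read, the test-accuracy column can be restated as a drop in percentage points relative to the full model. The following is a minimal sketch of that arithmetic only. The names `ACCURACY`, `BASELINE`, and `accuracy_drops` are illustrative and do not come from the paper, and the values are copied from the test-accuracy column of Table 4.

```python
# Test accuracy (%) per model variant, copied from Table 4.
ACCURACY = {
    "Full model (KD + PE + AR)": 98.71,
    "Without knowledge distillation (KD)": 96.84,
    "Without positional embedding": 97.42,
    "Without attention rollout": 98.11,
    "Only knowledge distillation (KD only)": 96.41,
    "Only positional embedding (PE only)": 95.78,
    "Only attention rollout (AR only)": 95.01,
}

# Hypothetical name for the reference row; every drop is measured against it.
BASELINE = "Full model (KD + PE + AR)"


def accuracy_drops(accuracy, baseline=BASELINE):
    """Return each variant's accuracy drop (percentage points) vs. the baseline,
    sorted from the largest drop to the smallest."""
    ref = accuracy[baseline]
    drops = {
        name: round(ref - acc, 2)
        for name, acc in accuracy.items()
        if name != baseline
    }
    return dict(sorted(drops.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    for name, drop in accuracy_drops(ACCURACY).items():
        print(f"{name:<40} -{drop:.2f} pp")
```

The same function could be applied to the F1-score, precision, or recall columns by swapping in the corresponding values. FLOPs are identical (11.3 M) for every variant in the table, so no cost comparison is needed.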