Table 9 Ablation study showing the contribution of Swin Transformer and Mish activation.

From: Advanced gesture recognition in Indian sign language using a synergistic combination of YOLOv10 with Swin Transformer model

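| Model Variant | Swin Transformer | Activation | mAP (%) | F1-score (%) | FPS |
| --- | --- | --- | --- | --- | --- |
| Baseline (YOLOv10) | No | LeakyReLU | \(94.87 \pm 0.05\) | \(93.60 \pm 0.09\) | 44.6 |
| YOLOv10 + Mish | No | Mish | \(95.23 \pm 0.06\) | \(94.20 \pm 0.07\) | 43.8 |
| YOLOv10 + Swin | Yes | LeakyReLU | \(96.41 \pm 0.04\) | \(95.10 \pm 0.05\) | 46.2 |
| YOLOv10-ST (Full) | Yes | Mish | \(\mathbf{97.62} \pm \mathbf{0.04}\) | \(\mathbf{96.58} \pm \mathbf{0.03}\) | 48.7 |

The Swin Transformer column entries did not survive extraction; the Yes/No values shown are inferred from the variant names. Bold marks the best result in each column.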
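
To make each component's contribution easier to read, the gains over the baseline can be computed directly from the reported means. The following is a minimal sketch, not part of the original study: the dictionary layout and the helper name `gains_over_baseline` are illustrative assumptions, and the \(\pm\) terms are ignored.

```python
# Mean values transcribed from Table 9; the ± terms are omitted.
# The dictionary layout and function name are illustrative assumptions.
ablation = {
    "Baseline (YOLOv10)": {"map": 94.87, "f1": 93.60, "fps": 44.6},
    "YOLOv10 + Mish": {"map": 95.23, "f1": 94.20, "fps": 43.8},
    "YOLOv10 + Swin": {"map": 96.41, "f1": 95.10, "fps": 46.2},
    "YOLOv10-ST (Full)": {"map": 97.62, "f1": 96.58, "fps": 48.7},
}


def gains_over_baseline(results, baseline="Baseline (YOLOv10)"):
    """Return each variant's per-metric difference from the baseline row."""
    base = results[baseline]
    return {
        name: {metric: round(vals[metric] - base[metric], 2) for metric in vals}
        for name, vals in results.items()
        if name != baseline
    }


gains = gains_over_baseline(ablation)
for name, delta in gains.items():
    print(
        f"{name}: mAP {delta['map']:+.2f}, "
        f"F1 {delta['f1']:+.2f}, FPS {delta['fps']:+.1f}"
    )
```

On these means, the full model's mAP gain over the baseline (2.75 points) is larger than the sum of the gains from adding Mish alone (0.36) and Swin alone (1.54). The table gives no uncertainty for FPS, so the throughput differences should be read with more caution.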