Table 8. Performance comparison of models on the noise datasets at different SNR levels on the v2 dataset.

From: Dynamic convolution models for cross-frontend keyword spotting

| Noise | SNR (dB) | DynTempConvNet + DML | DynTempConvNet | TENet6 | TCNet14 |
|-------|----------|----------------------|----------------|--------|---------|
| Urban | 20 | \(95.98 \pm 0.12\) | \(95.83 \pm 0.24\) | \(95.77 \pm 0.15\) | \(95.31 \pm 0.22\) |
| Urban | 15 | \(95.13 \pm 0.21\) | \(94.69 \pm 0.21\) | \(94.58 \pm 0.09\) | \(94.08 \pm 0.26\) |
| Urban | 10 | \(93.63 \pm 0.14\) | \(93.19 \pm 0.19\) | \(92.71 \pm 0.23\) | \(91.87 \pm 0.11\) |
| Urban | 5 | \(90.29 \pm 0.18\) | \(89.42 \pm 0.16\) | \(88.33 \pm 0.24\) | \(87.77 \pm 0.13\) |
| Urban | 0 | \(84.17 \pm 0.10\) | \(82.63 \pm 0.27\) | \(81.13 \pm 0.12\) | \(80.17 \pm 0.25\) |
| WHAM | 20 | \(96.06 \pm 0.15\) | \(96.02 \pm 0.14\) | \(94.96 \pm 0.11\) | \(95.17 \pm 0.22\) |
| WHAM | 15 | \(94.81 \pm 0.18\) | \(94.71 \pm 0.25\) | \(93.65 \pm 0.09\) | \(93.60 \pm 0.27\) |
| WHAM | 10 | \(93.35 \pm 0.13\) | \(92.54 \pm 0.21\) | \(91.15 \pm 0.24\) | \(90.96 \pm 0.12\) |
| WHAM | 5 | \(87.75 \pm 0.21\) | \(87.25 \pm 0.15\) | \(85.96 \pm 0.23\) | \(85.19 \pm 0.10\) |
| WHAM | 0 | \(78.67 \pm 0.08\) | \(78.12 \pm 0.19\) | \(76.29 \pm 0.14\) | \(74.96 \pm 0.29\) |