Table 4 Overall evaluation of voice anonymization methods using ASR and DNSMOS.
From: Privacy-aware speaker trait and multimodal features relationship analysis in job interviews
| System | Metric | Original | MAC | PVT | NAC |
|---|---|---|---|---|---|
| ASR | WER (%) (\(\downarrow\)) | 0.00 | 5.64 ± 0.11 | 8.24 ± 4.24 | 18.85 ± 11.57 |
| DNSMOS | SIG (\(\uparrow\)) | 3.29 ± 0.28 | 2.92 ± 0.51 | 3.13 ± 0.28 | 3.17 ± 0.40 |
| DNSMOS | BAK (\(\uparrow\)) | 3.54 ± 0.40 | 3.17 ± 0.56 | 3.38 ± 0.40 | 3.70 ± 0.47 |
| DNSMOS | OVR (\(\uparrow\)) | 2.81 ± 0.32 | 2.43 ± 0.44 | 2.61 ± 0.30 | 2.79 ± 0.41 |