Scientific Reports

Table 4 Performance comparison of the speaker embeddings with other pre-trained embeddings.

From: Manifestation of depression in speech overlaps with characteristics used to represent and recognize speaker identity

Model	DAIC-WOZ				Vocal Mind			
	\(F_1(D)\)	\(F_1(H)\)	BAc.	RMSE	\(F_1(D)\)	\(F_1(H)\)	BAc.	RMSE
Mockingjay	0.27	0.70	0.49	7.09	0.27	0.70	0.48	7.58
vq-wav2vec	0.32	0.71	0.52	6.95	0.25	0.73	0.49	7.12
wav2vec-2.0	0.38	0.74	0.55	6.77	0.32	0.74	0.52	7.03
TRILL	0.36	0.77	0.56	6.46	0.34	0.76	0.55	6.80
ECAPA (Proposed)	0.46	0.79	0.63	6.31	0.34	0.81	0.57	6.62
