Table 4 Performance comparison of the speaker embeddings with other pre-trained embeddings.

From: Manifestation of depression in speech overlaps with characteristics used to represent and recognize speaker identity

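| Model | DAIC-WOZ \(F_1(D)\) | DAIC-WOZ \(F_1(H)\) | DAIC-WOZ BAc. | DAIC-WOZ RMSE | Vocal Mind \(F_1(D)\) | Vocal Mind \(F_1(H)\) | Vocal Mind BAc. | Vocal Mind RMSE |
|---|---|---|---|---|---|---|---|---|
| Mockingjay | 0.27 | 0.70 | 0.49 | 7.09 | 0.27 | 0.70 | 0.48 | 7.58 |
| vq-wav2vec | 0.32 | 0.71 | 0.52 | 6.95 | 0.25 | 0.73 | 0.49 | 7.12 |
| wav2vec-2.0 | 0.38 | 0.74 | 0.55 | 6.77 | 0.32 | 0.74 | 0.52 | 7.03 |
| TRILL | 0.36 | 0.77 | 0.56 | 6.46 | 0.34 | 0.76 | 0.55 | 6.80 |
| ECAPA (Proposed) | 0.46 | 0.79 | 0.63 | 6.31 | 0.34 | 0.81 | 0.57 | 6.62 |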
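
The table does not define its metrics, so the following is only a minimal sketch of how such columns could be computed. It assumes that \(F_1(D)\) and \(F_1(H)\) are per-class F1 scores for the depressed and healthy classes, that BAc. stands for balanced accuracy (the mean of per-class recall), and that RMSE is measured on a predicted severity score. The function names, the labels `"D"` and `"H"`, and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


def rmse(scores_true, scores_pred):
    """Root-mean-square error between two equal-length score lists."""
    n = len(scores_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(scores_true, scores_pred)) / n)


# Toy data (illustrative only): "D" = depressed, "H" = healthy.
y_true = ["D", "D", "H", "H", "H", "H"]
y_pred = ["D", "H", "H", "H", "H", "D"]

f1_d = f1_for_class(y_true, y_pred, "D")
f1_h = f1_for_class(y_true, y_pred, "H")
bac = balanced_accuracy(y_true, y_pred)
err = rmse([10, 4, 7, 7], [13, 8, 7, 7])
```

Under these assumptions, higher is better for the two F1 columns and for BAc., while lower is better for RMSE.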