Fig. 5
From: Depression detection methods based on multimodal fusion of voice and text

The text data corresponds to the audio data, and the figure shows the attention characteristics of the different tokens.