Table 7. Ablation results for multimodal vs. unimodal performance.

From: MULTICAUSENET temporal attention for multimodal emotion cause pair extraction

| Model | IEMOCAP (WF1) | MELD (WF1) |
| --- | --- | --- |
| Text-only | 62.45 | 43.34 |
| Audio-only | 61.43 | 42.11 |
| Video-only | 60.11 | 40.54 |
| Full model (T+V+A) | 73.02 | 53.67 |

  1. Significant values are in bold.
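
To make the ablation easier to read, the following minimal sketch computes how far the full model sits above the strongest single-modality variant on each dataset. The scores are copied from Table 7; the dictionary layout and the helper name `gain_over_best_unimodal` are illustrative assumptions and not part of the original work. WF1 is taken here to mean weighted F1, reported in percent.

```python
# WF1 scores copied from Table 7 (assumed to be weighted F1, in percent).
scores = {
    "Text-only": {"IEMOCAP": 62.45, "MELD": 43.34},
    "Audio-only": {"IEMOCAP": 61.43, "MELD": 42.11},
    "Video-only": {"IEMOCAP": 60.11, "MELD": 40.54},
    "Full model (T+V+A)": {"IEMOCAP": 73.02, "MELD": 53.67},
}

FULL = "Full model (T+V+A)"


def gain_over_best_unimodal(dataset):
    """Hypothetical helper: return the best unimodal variant and the
    absolute gain of the full model over it, in WF1 points."""
    unimodal = {m: s[dataset] for m, s in scores.items() if m != FULL}
    best = max(unimodal, key=unimodal.get)
    return best, round(scores[FULL][dataset] - unimodal[best], 2)


for ds in ("IEMOCAP", "MELD"):
    best, gain = gain_over_best_unimodal(ds)
    print(f"{ds}: best unimodal = {best}, full-model gain = {gain:+.2f} WF1 points")
```

On both datasets the text-only variant is the strongest unimodal baseline, and the full model exceeds it by roughly 10 WF1 points.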