Table 1 Comparative analysis of performance of various models.

From: Hierarchical cross-modal attention and dual audio pathways for enhanced multimodal sentiment analysis

Model name      Text modality             Image modality            Audio modality
                A     P     R     F1      A     P     R     F1      A     P     R     F1
MISA            0.50  0.25  0.50  0.33    0.86  0.86  0.86  0.86    0.57  0.67  0.57  0.51
MSFNet          0.51  0.51  0.51  0.47    0.84  0.85  0.84  0.84    0.51  0.51  0.51  0.47
CLIP            0.80  0.80  0.80  0.80    0.92  0.92  0.92  0.92    0.79  0.80  0.80  0.80
CLIP+BERT       0.79  0.79  0.79  0.79    0.92  0.92  0.92  0.92    0.55  0.64  0.55  0.47
Proposed Model  0.84  0.84  0.84  0.84    0.94  0.94  0.94  0.94    0.85  0.85  0.85  0.85