Table 2 Training summary of the proposed model.

From: Hierarchical cross-modal attention and dual audio pathways for enhanced multimodal sentiment analysis

Metric                               Value
Total Training Time                  455.48 min
CPU Memory Usage                     10797.07 MB
CPU Utilization Change               3.70%
GPU Memory Usage                     2751.00 MB
GPU Utilization Change               7.00%
Final Training Loss                  0.3321
Final Unified Training Accuracy      84.25%
Final Text Training Accuracy         84.41%
Final Image Training Accuracy        94.25%
Final Audio Training Accuracy        84.30%
Final Validation Loss                0.3576
Final Unified Validation Accuracy    83.50%
Final Text Validation Accuracy       88.81%
Final Image Validation Accuracy      93.62%
Final Audio Validation Accuracy      83.44%
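
The reported figures can be compared directly. The following is a minimal sketch, not part of the original paper, that derives the train/validation gaps and the training time in hours from the values in Table 2. The dictionary layout and the names `TRAIN_ACC`, `VAL_ACC`, and `accuracy_gaps` are illustrative assumptions; only the numbers come from the table.

```python
# Values transcribed from Table 2; names and structure are illustrative only.
TRAIN_ACC = {"unified": 84.25, "text": 84.41, "image": 94.25, "audio": 84.30}
VAL_ACC = {"unified": 83.50, "text": 88.81, "image": 93.62, "audio": 83.44}
TRAIN_LOSS, VAL_LOSS = 0.3321, 0.3576
TRAINING_MINUTES = 455.48


def accuracy_gaps(train, val):
    """Return training minus validation accuracy, in percentage points, per pathway."""
    return {name: round(train[name] - val[name], 2) for name in train}


gaps = accuracy_gaps(TRAIN_ACC, VAL_ACC)
loss_gap = round(VAL_LOSS - TRAIN_LOSS, 4)        # validation minus training loss
training_hours = round(TRAINING_MINUTES / 60, 2)  # minutes converted to hours

print(gaps)
print(loss_gap, training_hours)
```

Under these figures, the unified, image, and audio pathways each show a training/validation gap below one percentage point, while the text pathway is the only one whose reported validation accuracy exceeds its training accuracy.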