Table 11 Model performance comparison (proposed: 89.5% accuracy, 88.7% F1-score).

From: A deep learning framework for gender sensitive speech emotion recognition based on MFCC feature selection and SHAP analysis

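| Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- |
| Proposed Model | 89.5 | 90.4 | 87.3 | 88.7 |
| VGGNet | 85.4 | 84.1 | 83.7 | 83.9 |
| InceptionV3 | 84.7 | 83.8 | 83.1 | 83.4 |
| ResNet-50 | 82.9 | 81.9 | 80.5 | 81.2 |
| Transformer | 83.6 | 82.5 | 81.0 | 81.7 |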
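As a quick consistency check on the reported figures, the sketch below recomputes F1 as the harmonic mean of each row's precision and recall and compares it with the F1-score in the table. The table does not state how the metrics were averaged across emotion classes, so treating F1 as the harmonic mean of the reported precision and recall is an assumption; with macro-averaging, small gaps are expected. The names `TABLE_11`, `harmonic_f1`, and `f1_gaps` are illustrative and do not come from the paper.

```python
# Values copied from Table 11, in percent:
# (accuracy, precision, recall, reported F1-score)
TABLE_11 = {
    "Proposed Model": (89.5, 90.4, 87.3, 88.7),
    "VGGNet": (85.4, 84.1, 83.7, 83.9),
    "InceptionV3": (84.7, 83.8, 83.1, 83.4),
    "ResNet-50": (82.9, 81.9, 80.5, 81.2),
    "Transformer": (83.6, 82.5, 81.0, 81.7),
}


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (same units as inputs)."""
    return 2 * precision * recall / (precision + recall)


def f1_gaps(table: dict) -> dict:
    """Recomputed F1 minus reported F1 for each model, in percentage points.

    Assumption: the reported F1 is comparable to the harmonic mean of the
    reported precision and recall; the paper's averaging scheme is not
    given in the table.
    """
    return {
        name: harmonic_f1(precision, recall) - reported_f1
        for name, (_, precision, recall, reported_f1) in table.items()
    }


if __name__ == "__main__":
    for name, gap in f1_gaps(TABLE_11).items():
        print(f"{name:15s} recomputed - reported F1 = {gap:+.2f} points")
```

Under this assumption every recomputed value lands within 0.2 percentage points of the reported F1-score, so the rows are internally consistent up to rounding and averaging effects.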