Table 4 Classification accuracy of the deep learning models on the RAVDESS dataset.

From: Stacked convolutional neural network for emotion recognition using multi feature speech analysis

| Feature Sets | CNN + LSTM | CNN | LSTM | RNN + LSTM |
|---|---|---|---|---|
| Combined (196) | 82.21 | 88.69 | 87.72 | 86.31 |
| MFCC (40) | 80.36 | 81.02 | 90.48 | 90.84 |
| LPC (13) | 77.68 | 73.36 | 60.79 | 59.67 |
| Mel Spectrogram (128) | 81.10 | 81.02 | 79.09 | 79.02 |
| Chroma & Others (15) | 33.63 | 32.07 | 30.88 | 29.91 |
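
To make the comparison in Table 4 easier to read, the sketch below transcribes the reported accuracies and picks the best-scoring model for each feature set. It is only an illustrative summary of the table, not code from the paper. The names `MODELS`, `ACCURACY`, `FEATURE_COUNTS`, and `best_model_per_feature_set` are hypothetical. Two assumptions go beyond what the table states: that the values are percentages, and that the numbers in parentheses are feature counts.

```python
# Values transcribed from Table 4 (assumed to be accuracy in percent).
# Column order follows the table header.
MODELS = ["CNN + LSTM", "CNN", "LSTM", "RNN + LSTM"]

ACCURACY = {
    "Combined (196)":        [82.21, 88.69, 87.72, 86.31],
    "MFCC (40)":             [80.36, 81.02, 90.48, 90.84],
    "LPC (13)":              [77.68, 73.36, 60.79, 59.67],
    "Mel Spectrogram (128)": [81.10, 81.02, 79.09, 79.02],
    "Chroma & Others (15)":  [33.63, 32.07, 30.88, 29.91],
}

# Numbers in parentheses from the table, read here as feature counts (assumption).
FEATURE_COUNTS = {"MFCC": 40, "LPC": 13, "Mel Spectrogram": 128, "Chroma & Others": 15}


def best_model_per_feature_set(table, models):
    """Return {feature_set: (model_name, accuracy)} for the top score in each row."""
    best = {}
    for feature_set, scores in table.items():
        # Index of the highest accuracy in this row.
        idx = max(range(len(scores)), key=scores.__getitem__)
        best[feature_set] = (models[idx], scores[idx])
    return best


if __name__ == "__main__":
    for feature_set, (model, acc) in best_model_per_feature_set(ACCURACY, MODELS).items():
        print(f"{feature_set:<24} {model:<12} {acc:.2f}")
```

Under that reading, the four individual counts sum to the 196 shown for the combined set, and the highest single value in the table is 90.84 for RNN + LSTM on MFCC.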