Table 6 Comparison with existing work.

From: Speech emotion recognition with light weight deep neural ensemble model using hand crafted features

| Author | Technique | Features | Dataset | Accuracy (%) |
| --- | --- | --- | --- | --- |
| Akinpelu et al. [57] | VGGNet | MFCC | RAVDESS | 86.25 |
| | | | TESS | 100 |
| | | | EmoDB | 96 |
| Ottoni et al. [58] | Meta-Learning | MFCC, RMSE, ZCR | RAVDESS | 97.01 |
| | | | SAVEE | 90.62 |
| | | | TESS | 100.00 |
| | | | CREMA-D | 83.28 |
| Jothimani et al. [59] | CNN1D | MFCC, RMSE, ZCR | RAVDESS | 92.60 |
| | | | SAVEE | 84.90 |
| | | | TESS | 99.60 |
| | | | CREMA-D | 89.90 |
| Jiang et al. [60] | Parallelized CRNN | Log Mel Spectrogram, Frame Level Features | EmoDB | 84.53 |
| | | | SAVEE | 59.40 |
| Mustaqeem et al. [69] | Bi-LSTM | Spatial Features | EmoDB | 85.57 |
| | | | RAVDESS | 77.02 |
| Wen et al. [64] | Transfer Learning | Log Mel Spectrogram | EmoDB | 84.14 |
| | | | SAVEE | 52.09 |
| Guizzo et al. [65] | Quaternion CNN | Real-valued spectrograms | EmoDB | 73.00 |
| | | | RAVDESS | 55.15 |
| | | | TESS | 99.76 |
| Meng et al. [66] | Bi-LSTM | 3-D Log-Mel spectrums | EmoDB | 84.99 |
| Kwon [67] | CNN | Spatial Features | EmoDB | 90.01 |
| Krishnan et al. [68] | LDA | Entropy Feature | TESS | 93.30 |
| Proposed method | Averaging ensemble | MFCC, RMSE, ZCR, Chroma | RAVDESS | 97.57 |
| | | | SAVEE | 98.43 |
| | | | TESS | 100 |
| | | | CREMA-D | 98.66 |
| | | | EmoDB | 98.60 |

Significant values are in bold.