Table 4. Accuracy of deep learning models for audio evaluation on the GTZAN dataset. All models are trained and tested under the same hardware conditions, without using pre-trained models.

From: Dense dynamic convolutional network for Bel canto vocal technique assessment

| Models | GTZAN Top-1 Acc (%) | Parameters (M) | FLOPs (G) |
|---|---|---|---|
| CRNN [8] | 63.21 | 4.93 | 1.10 |
| MobileNet V2 [36] | 60.40 | 4.08 | 2.87 |
| CAM++ [12] | 55.77 | 7.18 | 1.72 |
| AST [16] | 69.89 | 86.86 | 48.61 |
| PETL-AST [15] | 69.81 | 87.32 | 49.73 |
| ResNet [8] | 63.64 | 11.30 | 18.20 |
| GhostNet [35] | 73.33 | 5.18 | 5.24 |
| Ours-M | 70.25 | 6.89 | 2.80 |
| Ours-L | 71.28 (+1.03) | 6.15 | 4.61 |
| Ours-XL | 73.95 (+3.70) | 12.74 | 12.33 |
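The bracketed gains for Ours-L and Ours-XL can be checked with a few lines of Python. This is a minimal sketch, not part of the original paper: the dictionary name `TABLE_4`, the helper `accuracy_gain`, and the choice of Ours-M as the reference are assumptions made for illustration. The reference choice is consistent with the arithmetic (71.28 - 70.25 = 1.03 and 73.95 - 70.25 = 3.70), but the table itself does not state it.

```python
# Rows transcribed from Table 4:
# model -> (top-1 accuracy in %, parameters in millions, FLOPs in billions)
TABLE_4 = {
    "CRNN": (63.21, 4.93, 1.10),
    "MobileNet V2": (60.40, 4.08, 2.87),
    "CAM++": (55.77, 7.18, 1.72),
    "AST": (69.89, 86.86, 48.61),
    "PETL-AST": (69.81, 87.32, 49.73),
    "ResNet": (63.64, 11.30, 18.20),
    "GhostNet": (73.33, 5.18, 5.24),
    "Ours-M": (70.25, 6.89, 2.80),
    "Ours-L": (71.28, 6.15, 4.61),
    "Ours-XL": (73.95, 12.74, 12.33),
}

# Assumption: the bracketed gains are measured against the Ours-M variant.
BASELINE = "Ours-M"


def accuracy_gain(model, baseline=BASELINE):
    """Top-1 accuracy difference in percentage points, rounded to 2 decimals."""
    return round(TABLE_4[model][0] - TABLE_4[baseline][0], 2)


# Gains for the two larger variants relative to the assumed baseline.
gains = {model: accuracy_gain(model) for model in ("Ours-L", "Ours-XL")}

# Model with the highest top-1 accuracy in the table.
best_model = max(TABLE_4, key=lambda model: TABLE_4[model][0])

print(gains)
print(best_model)
```

Keeping the rows in one mapping makes it easy to add other derived comparisons, such as accuracy per GFLOP, without retyping the numbers.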