Table 3. Accuracy of deep learning models for audio evaluation on the UrbanSound8K dataset. All models are trained and tested under the same hardware conditions, without using pretrained models.

From: Dense dynamic convolutional network for Bel canto vocal technique assessment

Dataset: UrbanSound8K

| Models | Top-1 Acc (%) | Parameters (M) | FLOPs (G) |
|---|---|---|---|
| CRNN [8] | 86.24 | 4.93 | 1.10 |
| MobileNet V2 [36] | 87.14 | 4.08 | 2.87 |
| CAM++ [12] | 84.53 | 7.18 | 1.72 |
| AST [16] | 86.78 | 86.86 | 48.61 |
| PETL-AST [15] | 87.92 | 87.32 | 49.73 |
| ResNet [8] | 86.88 | 11.30 | 18.20 |
| GhostNet [35] | 86.02 | 5.18 | 5.24 |
| Ours-M | 86.62 | 6.89 | 2.80 |
| Ours-L | 87.71 (+1.09) | 6.15 | 4.61 |
| Ours-XL | 89.31 (+2.69) | 12.74 | 12.33 |
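
The values in parentheses for Ours-L and Ours-XL are numerically consistent with a Top-1 accuracy gain over Ours-M, in percentage points. The caption does not state this baseline, so treating Ours-M as the reference is an assumption inferred from the numbers. A minimal Python sketch of that arithmetic, using only values from the table (the names `top1_acc` and `gain_over` are hypothetical, chosen for illustration):

```python
# Top-1 accuracy (%) copied from Table 3 for the three proposed variants.
top1_acc = {
    "Ours-M": 86.62,
    "Ours-L": 87.71,
    "Ours-XL": 89.31,
}


def gain_over(baseline: str, model: str, table: dict) -> float:
    """Accuracy difference in percentage points, rounded to two decimals."""
    return round(table[model] - table[baseline], 2)


# Assumption: Ours-M is the reference for the parenthesised gains.
gains = {
    name: gain_over("Ours-M", name, top1_acc)
    for name in ("Ours-L", "Ours-XL")
}

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
```

Under this assumption the computed differences agree with the (+1.09) and (+2.69) annotations in the table.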