Scientific Reports

Table 3 Accuracy of deep learning models in audio evaluation on the Urbansound8k dataset. All models are trained and tested under the same hardware conditions, without using pre-trained models.

From: Dense dynamic convolutional network for Bel canto vocal technique assessment

Models	Urbansound8k
Models	Top-1 Acc (%)	Parameters (M)	FLOPs (G)
CRNN⁸	86.24	4.93	1.10
MobileNet V2³⁶	87.14	4.08	2.87
CAM++¹²	84.53	7.18	1.72
AST¹⁶	86.78	86.86	48.61
PETL-AST¹⁵	87.92	87.32	49.73
ResNet⁸	86.88	11.30	18.20
GhostNet³⁵	86.02	5.18	5.24
Ours-M	86.62	6.89	2.80
Ours-L	87.71(+1.09)	6.15	4.61
Ours-XL	89.31(+2.69)	12.74	12.33
