Scientific Reports

Table 4 Accuracy of deep learning models in audio evaluation on the GTZAN dataset. All models are trained and tested under the same conditions (hardware) without using pre-trained models.

From: Dense dynamic convolutional network for Bel canto vocal technique assessment

Models	GTZAN
Models	Top-1 Acc (%)	Parameters (M)	FLOPs (G)
CRNN⁸	63.21	4.93	1.10
MobileNet V2³⁶	60.40	4.08	2.87
CAM++¹²	55.77	7.18	1.72
AST¹⁶	69.89	86.86	48.61
PETL-AST¹⁵	69.81	87.32	49.73
ResNet⁸	63.64	11.30	18.20
GhostNet³⁵	73.33	5.18	5.24
Ours-M	70.25	6.89	2.80
Ours-L	71.28(+1.03)	6.15	4.61
Ours-XL	73.95(+3.70)	12.74	12.33
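
The parenthesized gains in the Ours-L and Ours-XL rows match the difference from the Ours-M accuracy, so Ours-M appears to be the baseline; the table does not state this explicitly. A minimal sketch of that arithmetic, using the accuracies from the table (the `gain_over` helper is a hypothetical name introduced here for illustration):

```python
# Top-1 accuracy (%) on GTZAN, copied from Table 4.
top1_acc = {
    "Ours-M": 70.25,
    "Ours-L": 71.28,
    "Ours-XL": 73.95,
}


def gain_over(baseline, acc):
    """Return each model's accuracy gain over the baseline, in percentage points.

    Assumption: the baseline is Ours-M, inferred from the numbers rather than
    stated in the table.
    """
    base = acc[baseline]
    return {
        name: round(value - base, 2)
        for name, value in acc.items()
        if name != baseline
    }


gains = gain_over("Ours-M", top1_acc)
print(gains)
```

The same helper can be pointed at any other row by adding it to `top1_acc` and changing the baseline name.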
