Scientific Reports

Table 6 Experimental results for different Transformer models.

From: Visual feature-based multi-scale hybrid attention network for fine-grained Hawthorn varieties identification

Model Name	Top-1 ACC (%)	Params (M)	FLOPs	FPS
ViT⁴⁰	70.3	86.57	16.86	17.17
FocalNet⁴¹	78.47	27.67	4.41	14.69
Swin Transformer²⁹	85.67	28.29	4.36	24.29
CMT⁴²	73.22	9.51	0.62	20.56
CvT⁴³	73.76	20.23	4.53	22.83
PVT⁴⁴	67.25	24.52	3.81	21.37
MaxViT⁵¹	75.42	30.92	5.48	14.73
EfficientViT⁵²	83.52	2.3	79	49.62
SwinFG⁵³	87.69	28.62	4.42	22.78
Ours	90.96	15.92	4.76	15.8
