Table 2 Performance comparison of HUNet with other efficient CNNs and ViTs on the MTACCR Test_1 set
From: HUNet: hierarchical universal network for multi-type ancient Chinese character recognition
Model | Top-1 (%) | Params (M) | Feature_dim | Total depths | GPU throughput (images/s) | CPU throughput (images/s) | ONNX throughput (images/s) |
---|---|---|---|---|---|---|---|
ShiftViT_32(Base) [23] | 91.09 | 12.81 (2.43) | 1280 | 12 (2,2,6,2) | 18,022 | 47.1 | 42.2 |
FasterNet_32(Base) [21] | 86.37 | 12.95 (2.56) | 1280 | 12 (2,2,6,2) | 20,268 | 51.6 | 141.8 |
| | 89.78 | 11.92 (1.54) | 1280 | 12 (2,2,6,2) | 3124 | 20.7 | — |
MobileNet_V2_1.0× [34] | 89.63 | 12.61 (2.22) | 1280 | 11 | 4893 | 30 | 93.7 |
| | 93.61 | 13.05 (2.67) | 1280 | 11 | 2638 | 21.8 | — |
MobileNet_V3_small [28] | 86.83 | 12.05 (1.67) | 1280 | 11 | 16,904 | 111.8 | 322.6 |
MobileNet_V3_large [28] | 92.07 | 14.59 (4.20) | 1280 | 15 | 9361 | 46.5 | 90.8 |
GhostNet_V2_1.0× [35] | 90.17 | 15.26 (4.88) | 1280 | 16 | 4279 | 27.2 | 80.1 |
ShuffleNet_v2_x1.0 [36] | 87.46 | 11.75 (1.38) | 1280 | 16 | 4434 | 63.7 | 249 |
ShuffleNet_v2_x1.5 [36] | 89.78 | 13.04 (2.66) | 1280 | 16 | 2917 | 55 | 158.5 |
EfficientViT_M0 [37] | 88.61 | 3.72 (2.16) | 192 | 6 (1,2,3) | 9471 | 49.2 | 330.3 |
HUNet_24_M0(Ours) | 88.34 | 2.75 (1.19) | 192(24 × 8) | 12 (2,2,6,2) | 17,318 | 64.9 | 117.7 |
HUNet_24(Ours) | 92.59 | 11.82 (1.43) | 1280 | 12 (2,2,6,2) | 17,103 | 50.8 | 108.1 |
HUNet_32(Ours) | 93.01 | 12.82 (2.43) | 1280 | 12 (2,2,6,2) | 17,777 | 39.1 | 81.4 |
HUNet_36(Ours) | 93.22 | 13.42 (3.03) | 1280 | 12 (2,2,6,2) | 17,192 | 40.2 | 73.6 |
HUNet_48(Ours) | 94.23 | 15.60 (5.22) | 1280 | 12 (2,2,6,2) | 14,806 | 32.7 | 53.9 |
Multi_HUNet_24(Ours) | 93.28 | 13.89 (3.51) | 1280 | 12 (2,2,6,2) | 14,395 | 43.6 | — |