Table 2 Performance comparison of HUNet with other efficient CNNs and ViTs on the MTACCR Test_1 set

From: HUNet: hierarchical universal network for multi-type ancient Chinese character recognition

| Model | Top-1 (%) | Params (M) | Feature_dim | Total depths | GPU throughput (images/s) | CPU throughput (images/s) | ONNX throughput (images/s) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ShiftViT_32(Base) [23] | 91.09 | 12.81 (2.43) | 1280 | 12 (2,2,6,2) | 18,022 | 47.1 | 42.2 |
| FasterNet_32(Base) [21] | 86.37 | 12.95 (2.56) | 1280 | 12 (2,2,6,2) | 20,268 | 51.6 | 141.8 |
| FasterNet_24_CA [21, 29] | 89.78 | 11.92 (1.54) | 1280 | 12 (2,2,6,2) | 3124 | 20.7 | — |
| MobileNet_V2_1.0× [34] | 89.63 | 12.61 (2.22) | 1280 | 11 | 4893 | 30 | 93.7 |
| MobileNet_V2_1.0x_CA [29, 34] | 93.61 | 13.05 (2.67) | 1280 | 11 | 2638 | 21.8 | — |
| MobileNet_V3_small [28] | 86.83 | 12.05 (1.67) | 1280 | 11 | 16,904 | 111.8 | 322.6 |
| MobileNet_V3_large [28] | 92.07 | 14.59 (4.20) | 1280 | 15 | 9361 | 46.5 | 90.8 |
| GhostNet_V2_1.0× [35] | 90.17 | 15.26 (4.88) | 1280 | 16 | 4279 | 27.2 | 80.1 |
| ShuffleNet_v2_x1.0 [36] | 87.46 | 11.75 (1.38) | 1280 | 16 | 4434 | 63.7 | 249 |
| ShuffleNet_v2_x1.5 [36] | 89.78 | 13.04 (2.66) | 1280 | 16 | 2917 | 55 | 158.5 |
| EfficientViT_M0 [37] | 88.61 | 3.72 (2.16) | 192 | 6 (1,2,3) | 9471 | 49.2 | 330.3 |
| HUNet_24_M0(Ours) | 88.34 | 2.75 (1.19) | 192 (24 × 8) | 12 (2,2,6,2) | 17,318 | 64.9 | 117.7 |
| HUNet_24(Ours) | 92.59 | 11.82 (1.43) | 1280 | 12 (2,2,6,2) | 17,103 | 50.8 | 108.1 |
| HUNet_32(Ours) | 93.01 | 12.82 (2.43) | 1280 | 12 (2,2,6,2) | 17,777 | 39.1 | 81.4 |
| HUNet_36(Ours) | 93.22 | 13.42 (3.03) | 1280 | 12 (2,2,6,2) | 17,192 | 40.2 | 73.6 |
| HUNet_48(Ours) | 94.23 | 15.60 (5.22) | 1280 | 12 (2,2,6,2) | 14,806 | 32.7 | 53.9 |
| Multi_HUNet_24(Ours) | 93.28 | 13.89 (3.51) | 1280 | 12 (2,2,6,2) | 14,395 | 43.6 | — |

  1. GPU Throughput and CPU Throughput are measured on an Nvidia RTX 3090 GPU and an Intel(R) Core(TM) i9-10900K CPU @ 3.70 GHz, respectively. Higher throughput indicates faster inference. The values in parentheses in the Params column give the parameter count of the backbone network, and the number in each model name is the number of channels C in the initial embedded feature maps.
  2. Boldface indicates the best performance in the test set, while underlined text denotes the second-best results.
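
The Params column reports two numbers per model: the total and, in parentheses, the backbone alone. As a minimal sketch of how to read that column, the Python snippet below subtracts the backbone count from the total for a few rows copied from the table. The function name `non_backbone_params` and the `rows` dictionary are illustrative, not from the paper, and treating the remainder as "everything outside the backbone" (for example, a classification head) is an assumption rather than something the table states.

```python
def non_backbone_params(total_m: float, backbone_m: float) -> float:
    """Parameters (in millions) outside the backbone: total minus backbone."""
    return round(total_m - backbone_m, 2)


# (total, backbone) pairs copied from the Params (M) column of Table 2.
rows = {
    "ShiftViT_32(Base)": (12.81, 2.43),
    "EfficientViT_M0": (3.72, 2.16),
    "HUNet_24_M0(Ours)": (2.75, 1.19),
    "HUNet_24(Ours)": (11.82, 1.43),
}

# Assumption: the difference is the part of the model outside the backbone.
remainder = {name: non_backbone_params(t, b) for name, (t, b) in rows.items()}

for name, value in remainder.items():
    print(f"{name}: {value:.2f} M parameters outside the backbone")
```

For these rows, the remainder is about 10.4 M for the models with Feature_dim 1280 and about 1.6 M for the two models with Feature_dim 192, which suggests that the non-backbone parameter count tracks the feature dimension rather than the backbone design.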