Table 9 Top-1 and Top-5 accuracy on the bronze inscriptions dataset: baseline models (Base), the proposed method with only the data homogenization block (DH only), and CA
From: Multi-modal ancient scripts recognition via deep learning with data homogenization and augmentation
Model | Top-1 Acc (Base) | Top-1 Acc (DH only) | Top-1 Acc (CA) | Top-5 Acc (Base) | Top-5 Acc (DH only) | Top-5 Acc (CA) |
|---|---|---|---|---|---|---|
AlexNet | 0.631 | 0.669 | 0.763 | 0.842 | 0.877 | 0.942 |
VGG19 | 0.588 | 0.665 | 0.727 | 0.831 | 0.858 | 0.911 |
ResNet50 | 0.615 | 0.619 | 0.803 | 0.850 | 0.850 | 0.960 |
ConvNext | 0.573 | 0.638 | 0.778 | 0.792 | 0.850 | 0.942 |
EfficientNet | 0.623 | 0.654 | 0.744 | 0.850 | 0.858 | 0.921 |
ShuffleNet | 0.531 | 0.558 | 0.754 | 0.762 | 0.785 | 0.931 |
ViT | 0.285 | 0.292 | 0.441 | 0.569 | 0.581 | 0.742 |
SwinTransformer | 0.427 | 0.577 | 0.718 | 0.742 | 0.842 | 0.911 |