Table 11 Accuracy comparison on the bronze inscriptions dataset: baseline models (Base), the proposed method with only the data augmentation block (DA only), and CA
From: Multi-modal ancient scripts recognition via deep learning with data homogenization and augmentation
Model | Top-1 Acc (Base) | Top-1 Acc (DA only) | Top-1 Acc (CA) | Top-5 Acc (Base) | Top-5 Acc (DA only) | Top-5 Acc (CA) |
|---|---|---|---|---|---|---|
AlexNet | 0.631 | 0.723 | 0.763 | 0.842 | 0.921 | 0.942 |
VGG19 | 0.588 | 0.672 | 0.727 | 0.831 | 0.889 | 0.911 |
ResNet50 | 0.615 | 0.758 | 0.803 | 0.850 | 0.929 | 0.960 |
ConvNext | 0.573 | 0.705 | 0.778 | 0.792 | 0.903 | 0.942 |
EfficientNet | 0.623 | 0.743 | 0.744 | 0.850 | 0.912 | 0.921 |
ShuffleNet | 0.531 | 0.703 | 0.754 | 0.762 | 0.900 | 0.931 |
ViT | 0.285 | 0.362 | 0.441 | 0.569 | 0.682 | 0.742 |
SwinTransformer | 0.427 | 0.606 | 0.718 | 0.742 | 0.866 | 0.911 |
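
The table can be read more easily as gains over the baseline. The following is a minimal sketch, not part of the original article, that computes the absolute Top-1 improvement of the DA only and CA columns over Base for each model. The names `TOP1` and `top1_gains` are hypothetical, chosen for illustration; the numbers are copied from the Top-1 columns above.

```python
# Top-1 accuracies copied from the table, in the order (Base, DA only, CA).
# TOP1 and top1_gains are illustrative names, not taken from the article.
TOP1 = {
    "AlexNet": (0.631, 0.723, 0.763),
    "VGG19": (0.588, 0.672, 0.727),
    "ResNet50": (0.615, 0.758, 0.803),
    "ConvNext": (0.573, 0.705, 0.778),
    "EfficientNet": (0.623, 0.743, 0.744),
    "ShuffleNet": (0.531, 0.703, 0.754),
    "ViT": (0.285, 0.362, 0.441),
    "SwinTransformer": (0.427, 0.606, 0.718),
}


def top1_gains(scores):
    """Return the absolute Top-1 gain over Base for each model."""
    gains = {}
    for model, (base, da_only, ca) in scores.items():
        gains[model] = {
            # Round to three decimals to match the table's precision.
            "da_vs_base": round(da_only - base, 3),
            "ca_vs_base": round(ca - base, 3),
        }
    return gains


if __name__ == "__main__":
    for model, g in top1_gains(TOP1).items():
        print(f"{model:16s} DA only: +{g['da_vs_base']:.3f}  CA: +{g['ca_vs_base']:.3f}")
```

On these figures, ResNet50 reaches the highest Top-1 accuracy under CA (0.803), while SwinTransformer shows the largest absolute Top-1 gain over its baseline (0.291).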