Table 3. Summary of key implementation details and hyperparameters.
From: Enhancing artistic style classification through a novel ArtFusionNet framework
Parameter | Value |
|---|---|
Patch size (Transformer) | 16 × 16 (196 tokens) |
Input image size | 224 × 224 |
Embedding dimension | 768 |
Learning rate schedule | Cosine annealing with warm restarts |
Initial learning rate | 1 × 10⁻⁴ |
Optimizer | AdamW |
Batch size | 64 |
Dropout rate (Transformer layers) | 0.1 |
Dropout rate (CNN layers) | 0.3 |
Weight decay | 1 × 10⁻⁵ |
Training epochs | 200 |
Knowledge distillation temperature | 4.0 |
Contrastive learning temperature (τ) | 0.07 |
Distillation loss weight (α) | 0.5 |
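
As an illustration of how the values in Table 3 fit together, the following minimal Python sketch collects them in a configuration object, checks that a 224 × 224 input split into 16 × 16 patches yields 196 tokens, and evaluates a cosine-annealing schedule with warm restarts. It is not the authors' implementation: the names `TrainConfig`, `num_patch_tokens`, and `cosine_warm_restart_lr` are hypothetical, and the restart period and minimum learning rate are assumptions, since Table 3 does not report them.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from Table 3.
    image_size: int = 224
    patch_size: int = 16
    embed_dim: int = 768
    initial_lr: float = 1e-4
    weight_decay: float = 1e-5
    batch_size: int = 64
    epochs: int = 200
    dropout_transformer: float = 0.1
    dropout_cnn: float = 0.3
    kd_temperature: float = 4.0
    contrastive_temperature: float = 0.07  # tau
    kd_loss_weight: float = 0.5            # alpha
    # Assumptions for illustration only, NOT reported in Table 3.
    restart_period: int = 50  # hypothetical number of epochs per cosine cycle
    min_lr: float = 0.0       # hypothetical learning-rate floor


def num_patch_tokens(cfg: TrainConfig) -> int:
    """Number of non-overlapping square patches in a square input image."""
    per_side = cfg.image_size // cfg.patch_size
    return per_side * per_side


def cosine_warm_restart_lr(cfg: TrainConfig, epoch: int) -> float:
    """Cosine annealing with fixed-length warm restarts.

    The rate decays from initial_lr toward min_lr over one cycle,
    then jumps back to initial_lr at the start of the next cycle.
    """
    t_cur = epoch % cfg.restart_period
    cos_term = 1.0 + math.cos(math.pi * t_cur / cfg.restart_period)
    return cfg.min_lr + 0.5 * (cfg.initial_lr - cfg.min_lr) * cos_term


cfg = TrainConfig()
print(num_patch_tokens(cfg))  # 196, matching the token count in Table 3
for epoch in (0, 25, 49, 50):
    print(epoch, cosine_warm_restart_lr(cfg, epoch))
```

With the assumed 50-epoch cycle, the 200 training epochs would cover four restarts. A different cycle length, or cycles that grow after each restart, would change the curve but not the other table values.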