Table 3 Summary of key implementation details and hyperparameters.

From: Enhancing artistic style classification through a novel ArtFusionNet framework

| Parameter | Value |
| --- | --- |
| Patch size (Transformer) | 16 × 16 (196 tokens) |
| Input image size | 224 × 224 |
| Embedding dimension | 768 |
| Learning rate schedule | Cosine annealing with warm restarts |
| Initial learning rate | 1 × 10⁻⁴ |
| Optimizer | AdamW |
| Batch size | 64 |
| Dropout rate (Transformer layers) | 0.1 |
| Dropout rate (CNN layers) | 0.3 |
| Weight decay | 1 × 10⁻⁵ |
| Training epochs | 200 |
| Knowledge distillation temperature | 4.0 |
| Contrastive learning temperature (τ) | 0.07 |
| Distillation loss weight (α) | 0.5 |
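
The settings in Table 3 can be gathered into a single configuration object, which also makes a few of the implied quantities easy to check. The following is a minimal sketch under stated assumptions, not the authors' implementation. The names `TrainConfig`, `num_patch_tokens`, `lr_at_epoch`, and `softened_probs` are hypothetical. The initial learning rate is read as 1 × 10⁻⁴. The restart period, period multiplier, and minimum learning rate are not given in the table and are placeholders chosen for illustration. The temperature-scaled softmax is the standard way a distillation temperature is applied and is assumed here rather than confirmed by the table.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values taken from Table 3 (initial_lr is read as 1e-4).
    image_size: int = 224
    patch_size: int = 16
    embed_dim: int = 768
    initial_lr: float = 1e-4
    batch_size: int = 64
    dropout_transformer: float = 0.1
    dropout_cnn: float = 0.3
    weight_decay: float = 1e-5
    epochs: int = 200
    kd_temperature: float = 4.0
    contrastive_temperature: float = 0.07
    kd_loss_weight: float = 0.5
    # ASSUMPTIONS, not in Table 3: restart period, period multiplier, LR floor.
    restart_period: int = 50
    period_mult: int = 1
    min_lr: float = 0.0


def num_patch_tokens(cfg: TrainConfig) -> int:
    """Number of non-overlapping patches per image: (image_size / patch_size)^2."""
    per_side = cfg.image_size // cfg.patch_size
    return per_side * per_side


def lr_at_epoch(cfg: TrainConfig, epoch: int) -> float:
    """Cosine annealing with warm restarts, evaluated at an integer epoch."""
    period, start = cfg.restart_period, 0
    # Advance to the cycle that contains `epoch`.
    while epoch >= start + period:
        start += period
        period *= cfg.period_mult
    progress = (epoch - start) / period
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return cfg.min_lr + (cfg.initial_lr - cfg.min_lr) * cosine


def softened_probs(logits: list[float], temperature: float) -> list[float]:
    """Temperature-scaled softmax; higher temperatures flatten the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    cfg = TrainConfig()
    print("patch tokens:", num_patch_tokens(cfg))
    for epoch in (0, 25, 49, 50):
        print(f"epoch {epoch:3d}: lr = {lr_at_epoch(cfg, epoch):.2e}")
    print("softened:", softened_probs([2.0, 1.0, 0.0], cfg.kd_temperature))
```

A 224 × 224 input split into 16 × 16 patches gives 14 patches per side, which matches the 196 tokens listed in the table. With the assumed restart period of 50 epochs, the schedule starts at the initial learning rate, decays along a cosine curve, and resets to the initial rate at epoch 50.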