Table 1 Main hyperparameter settings.
| Hyperparameter | CIFAR-10/CIFAR-100 | ImageNet |
|---|---|---|
| Optimizer | SGD | SGD |
| Initial Learning Rate | 0.1 | 0.1 |
| Learning Rate Schedule | Cosine Annealing | Cosine Annealing |
| Batch Size | 64 | 128 |
| Weight Decay | 5 × 10⁻⁴ | 1 × 10⁻⁴ |
| Momentum | 0.9 | 0.9 |
| Epochs | 200 | 100 |
| r | 16 | 16 |
| K | {3, 5, 7} | {3, 5, 7} |
| τ | 1.0 | 1.0 |
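
For readers who want to reproduce these settings, the values in Table 1 can be written as a small configuration object. The sketch below is illustrative only: the names `TrainConfig`, `CIFAR`, `IMAGENET`, and `cosine_lr` are hypothetical, and the schedule assumes the standard cosine annealing formula with per-epoch updates, no warmup, and a minimum learning rate of 0, none of which is stated in the table.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from Table 1; field names are illustrative."""
    optimizer: str
    initial_lr: float
    batch_size: int
    weight_decay: float
    momentum: float
    epochs: int
    r: int
    K: tuple
    tau: float


# Values taken directly from the two columns of Table 1.
CIFAR = TrainConfig("SGD", 0.1, 64, 5e-4, 0.9, 200, 16, (3, 5, 7), 1.0)
IMAGENET = TrainConfig("SGD", 0.1, 128, 1e-4, 0.9, 100, 16, (3, 5, 7), 1.0)


def cosine_lr(cfg, epoch, min_lr=0.0):
    """Cosine-annealed learning rate at the start of `epoch` (0-indexed).

    Assumption: per-epoch updates, no warmup, and a floor of `min_lr`.
    Table 1 only names the schedule and the initial learning rate.
    """
    progress = epoch / cfg.epochs
    return min_lr + 0.5 * (cfg.initial_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the learning rate at the start, midpoint, and end of each run.
    for name, cfg in (("CIFAR", CIFAR), ("ImageNet", IMAGENET)):
        for epoch in (0, cfg.epochs // 2, cfg.epochs):
            print(f"{name} epoch {epoch}: lr = {cosine_lr(cfg, epoch):.4f}")
```

Under these assumptions the learning rate starts at 0.1, passes through half that value at the midpoint of training, and decays to the floor at the final epoch. A framework scheduler may differ in details such as per-iteration stepping or a nonzero minimum.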