Table 2 Key parameters of the training process.
| Parameter | Value | Rationale |
|---|---|---|
| Batch size | 16 | GPU memory constraints |
| Initial learning rate | 0.01 | Model convergence stability |
| Learning rate decay | ×0.1 every 10 epochs | Avoid local minima |
| Optimizer | SGD + Momentum | Saddle point avoidance |
| Momentum | 0.9 | Gradient stabilization |
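
The schedule implied by these settings can be illustrated with a minimal sketch. It assumes that the learning rate decay entry denotes a step schedule (the rate is multiplied by 0.1 after every 10 epochs) and that momentum is applied in the common form v ← μv + g, w ← w − ηv; neither detail is stated in the table. The function names `step_decay_lr` and `sgd_momentum_step` are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: the step-decay interpretation and the momentum
# formulation below are assumptions, not details confirmed by the table.

def step_decay_lr(epoch, initial_lr=0.01, decay_factor=0.1, step_size=10):
    """Learning rate at a given 0-based epoch under step decay."""
    return initial_lr * decay_factor ** (epoch // step_size)


def sgd_momentum_step(weight, grad, velocity, lr, momentum=0.9):
    """One SGD-with-momentum update for a single scalar weight.

    Uses the formulation v <- momentum * v + grad, w <- w - lr * v.
    Returns the updated (weight, velocity) pair.
    """
    velocity = momentum * velocity + grad
    weight = weight - lr * velocity
    return weight, velocity


if __name__ == "__main__":
    # Print the learning rate at the start of each 10-epoch stage.
    for epoch in (0, 10, 20):
        print(epoch, step_decay_lr(epoch))
```

Under these assumptions the learning rate stays at 0.01 for epochs 0 to 9, then drops by a factor of 10 at epoch 10 and again at epoch 20.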