Table 3. Hyperparameter settings of the proposed model.
| Parameter | Value |
|---|---|
| Batch Size | 32 |
| Number of Layers | 32 |
| Hidden Layer Units | 1024 (L) |
| Optimizer | AdamW |
| Loss Function | Cross-Entropy |
| Epsilon | 1e-6 |
| Dropout Rate | 0.1 |
| Epochs | 100 (fine-tuning) |
| Learning Rate | 0.0001 |
| Beta 1 & Beta 2 | 0.9 & 0.999 |
| Weight Decay | 0.05–0.2 |
| Gradient Clipping | 1.0 |
| Dropout in FC Layers | 0.1 |
| L2 Regularization | 0.0001 |
| Early Stopping | Patience = 5 epochs |
| Brightness Range | 0.2–0.4 |
| Number of Parameters | 300 M (L) |
| Hidden Layer Activation | GELU |
| Final Layer Activation | Softmax |
| Gradient Descent Type | Mini-Batch Adaptive Gradient Descent |
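For concreteness, the listing below sketches how these settings map onto a PyTorch fine-tuning loop. It is a minimal sketch, not the authors' implementation: the model, data loaders, and output dimension are placeholders, and the weight-decay value of 0.05 is one illustrative choice from the 0.05–0.2 range reported in Table 3.

```python
import torch
import torch.nn as nn

# Placeholder model and data; substitute the actual architecture and dataset.
# GELU hidden activation and 0.1 dropout in FC layers follow Table 3.
model = nn.Sequential(
    nn.Linear(1024, 1024), nn.GELU(), nn.Dropout(0.1), nn.Linear(1024, 10)
)
train_loader = [(torch.randn(32, 1024), torch.randint(0, 10, (32,))) for _ in range(10)]
val_loader = [(torch.randn(32, 1024), torch.randint(0, 10, (32,))) for _ in range(2)]

# AdamW with lr=1e-4, betas=(0.9, 0.999), eps=1e-6; weight_decay=0.05 is an
# illustrative value from the paper's 0.05-0.2 range.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-4, betas=(0.9, 0.999), eps=1e-6, weight_decay=0.05
)
criterion = nn.CrossEntropyLoss()  # applies log-softmax internally (final-layer Softmax)

best_val_loss = float("inf")
patience, bad_epochs = 5, 0  # early stopping: patience = 5 epochs

for epoch in range(100):  # 100 fine-tuning epochs
    model.train()
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        # Clip the global gradient norm at 1.0, per Table 3.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()

    # Validation pass to drive early stopping.
    model.eval()
    with torch.no_grad():
        val_loss = sum(criterion(model(x), y).item() for x, y in val_loader) / len(val_loader)

    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # stop after 5 epochs without improvement
```

Note that `CrossEntropyLoss` already folds the softmax into the loss, so the final layer in the sketch emits raw logits; the brightness-range augmentation and any L2 penalty applied separately from AdamW's decoupled weight decay are omitted here for brevity.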