Table 4 Optimal hyperparameter values found for each method in this study.
| Hyperparameter | PFO (Best) | APFO (Best) | GD (Best) |
|---|---|---|---|
| Learning Rate (\(\alpha\)) | 0.0012 | 0.0016 | 0.0014 |
| Batch Size (\(N\)) | 128 | 256 | 192 |
| Number of Epochs | 50 | 60 | 55 |
| Weight Decay (\(\lambda_{3}\)) | 0.0005 | 0.0003 | 0.0004 |
| Momentum (\(\beta\)) | 0.9 | 0.92 | 0.94 |
| Dropout Rate (\(p\)) | 0.2 | 0.25 | 0.22 |
| Data Augmentation | Random cropping, horizontal flipping | Random rotation, color jitter | Random cropping, vertical flipping |
| Initialization Scheme | He initialization | Xavier initialization | He initialization |
| Cost Function Value (\(L_{\text{total}}\)) | 0.607 | 0.511 | 0.617 |
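
For readers who want to reuse these settings, the values in Table 4 can be written as a small configuration structure. The following is only a minimal sketch: the class name `TunedConfig`, the field names, and the helper `best_method` are hypothetical and do not come from the study, and the comparison assumes that a lower \(L_{\text{total}}\) indicates a better configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunedConfig:
    """One column of Table 4 (field names are illustrative, not from the study)."""
    learning_rate: float      # alpha
    batch_size: int           # N
    epochs: int
    weight_decay: float       # lambda_3
    momentum: float           # beta
    dropout: float            # p
    augmentation: tuple       # augmentation operations listed in the table
    init_scheme: str
    total_cost: float         # reported L_total


# Values copied from Table 4.
TABLE_4 = {
    "PFO": TunedConfig(0.0012, 128, 50, 0.0005, 0.90, 0.20,
                       ("random cropping", "horizontal flipping"), "He", 0.607),
    "APFO": TunedConfig(0.0016, 256, 60, 0.0003, 0.92, 0.25,
                        ("random rotation", "color jitter"), "Xavier", 0.511),
    "GD": TunedConfig(0.0014, 192, 55, 0.0004, 0.94, 0.22,
                      ("random cropping", "vertical flipping"), "He", 0.617),
}


def best_method(table):
    """Return the method name with the smallest reported cost (assumes lower is better)."""
    return min(table, key=lambda name: table[name].total_cost)


if __name__ == "__main__":
    name = best_method(TABLE_4)
    print(name, TABLE_4[name].total_cost)
```

Under that assumption, the sketch selects the APFO column, whose reported cost of 0.511 is the smallest of the three.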