Table 2 Hyperparameter configurations.
Parameter | Value | Parameter | Value |
---|---|---|---|
Discount factor (γ) | 0.99 | Batch size | 32 |
Learning rate (α) | 0.001 | Training rounds | 150 |
Initial ε | 1.0 | State space dimension | 100 |
ε_min | 0.01 | Action space dimension | 4 |
ε_decay | 0.995 | Target network update frequency | 50 |
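
As a minimal sketch, the values in Table 2 can be collected into a single configuration object, with the exploration rate derived from the three ε parameters. The class name `HyperParams`, its field names, and the helper `epsilon_schedule` are hypothetical. The table does not say how ε is decayed, so the sketch assumes the common multiplicative rule, applied once per training round and floored at the minimum value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """Values copied from Table 2; the field names are illustrative only."""
    gamma: float = 0.99            # discount factor (γ)
    learning_rate: float = 0.001   # learning rate (α)
    epsilon_initial: float = 1.0   # initial exploration rate
    epsilon_min: float = 0.01      # floor for the exploration rate
    epsilon_decay: float = 0.995   # decay factor for the exploration rate
    batch_size: int = 32
    training_rounds: int = 150
    state_dim: int = 100
    action_dim: int = 4
    target_update_freq: int = 50   # target network update frequency


def epsilon_schedule(cfg: HyperParams) -> list[float]:
    """Return the exploration rate used in each training round.

    Assumption (not stated in the table): epsilon is multiplied by
    epsilon_decay once per round and never drops below epsilon_min.
    """
    eps = cfg.epsilon_initial
    schedule = []
    for _ in range(cfg.training_rounds):
        schedule.append(eps)
        eps = max(cfg.epsilon_min, eps * cfg.epsilon_decay)
    return schedule


cfg = HyperParams()
schedule = epsilon_schedule(cfg)
```

Under this per-round assumption, ε is still well above its floor of 0.01 after 150 rounds, because reaching the floor takes roughly 920 multiplicative updates at a factor of 0.995. If the decay is instead applied at every environment step or every gradient update, the floor would be reached much earlier; the table alone does not settle which applies.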