Table 2 Hyperparameter configurations.

Parameter	Value	Parameter	Value
Discount factor (γ)	0.99	Batch size	32
Learning rate (α)	0.001	Training rounds	150
ε (Initial)	1.0	State space dimension	100
ε_min	0.01	Action space dimension	4
ε_decay	0.995	Target network update frequency	50
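
The values in Table 2 can be gathered into a single configuration object, as in the minimal sketch below. The field names, the `Hyperparameters` class, and the `epsilon_after` helper are hypothetical and do not come from the original work. The table does not say how ε_decay is applied, so the schedule here (one multiplicative decay per update, floored at ε_min) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    """Values copied from Table 2; the field names are illustrative only."""
    gamma: float = 0.99                 # discount factor
    learning_rate: float = 0.001        # alpha
    epsilon_initial: float = 1.0        # starting exploration rate
    epsilon_min: float = 0.01           # lower bound on exploration rate
    epsilon_decay: float = 0.995        # multiplicative decay factor
    batch_size: int = 32
    training_rounds: int = 150
    state_dim: int = 100                # state space dimension
    action_dim: int = 4                 # action space dimension
    target_update_frequency: int = 50   # target network update frequency


def epsilon_after(n_updates: int, hp: Hyperparameters) -> float:
    """Exploration rate after n_updates decay steps.

    Assumed schedule (not stated in the table): multiply by epsilon_decay
    once per update and never drop below epsilon_min.
    """
    decayed = hp.epsilon_initial * hp.epsilon_decay ** n_updates
    return max(hp.epsilon_min, decayed)


hp = Hyperparameters()
eps_end = epsilon_after(hp.training_rounds, hp)
```

If the decay were applied once per training round, this schedule would leave ε at roughly 0.47 after 150 rounds, well above ε_min. A per-step decay would reach the floor much sooner, so which one is meant matters when reproducing the setup.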
