Table 1 Parameters of experiment.
From: A scalable approach to optimize traffic signal control with federated reinforcement learning
| Parameter | Value |
|---|---|
| Learning rate | 0.0001 |
| \(\varepsilon\)-greedy action selection | 0.9 |
| Reward discount of RL \(\gamma\) | 0.9 |
| Target network update frequency | 30 episodes |
| Memory capacity | 1000 |
| Discount weight of waiting time in reward \(\sigma\) | 0.1 |
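
For readers who want to reuse these settings, the values in Table 1 could be collected into a single configuration object. The following is a minimal sketch, not the authors' code: the class name `ExperimentParams`, its field names, and the helper `select_action` are hypothetical. The table does not say whether 0.9 is the probability of the greedy action or of a random one; the sketch assumes the former.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentParams:
    """Values copied from Table 1; the field names are illustrative only."""
    learning_rate: float = 1e-4
    epsilon: float = 0.9              # epsilon-greedy action selection
    gamma: float = 0.9                # reward discount of RL
    target_update_episodes: int = 30  # target network update frequency
    memory_capacity: int = 1000       # replay memory capacity
    sigma: float = 0.1                # discount weight of waiting time in reward


def select_action(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q-values.

    Assumption: epsilon is the probability of choosing the greedy action;
    otherwise an action is drawn uniformly at random.
    """
    if rng.random() < epsilon:
        return max(range(len(q_values)), key=lambda a: q_values[a])
    return rng.randrange(len(q_values))


params = ExperimentParams()
action = select_action([0.1, 0.5, 0.2], params.epsilon)
```

If the paper instead uses 0.9 as the exploration probability, the comparison in `select_action` would need to be inverted.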