Table 1 Parameters of the experiment.

From: A scalable approach to optimize traffic signal control with federated reinforcement learning

| Parameter | Value |
| --- | --- |
| Learning rate | 0.0001 |
| \(\varepsilon\)-greedy action selection | 0.9 |
| Reward discount of RL \(\gamma\) | 0.9 |
| Target network update frequency | 30 episodes |
| Memory capacity | 1000 |
| Discount weight of waiting time in reward \(\sigma\) | 0.1 |
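
As a minimal sketch, the values in Table 1 could be collected in a single configuration object like the one below. The class name `ExperimentParams`, its field names, and the helpers `select_action` and `should_update_target` are hypothetical and do not come from the paper. The table does not say whether 0.9 is the probability of taking the greedy action or of exploring; the sketch assumes the former.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentParams:
    """Hyperparameters from Table 1 (field names are illustrative)."""
    learning_rate: float = 1e-4       # Learning rate
    epsilon: float = 0.9              # epsilon-greedy action selection
    gamma: float = 0.9                # Reward discount of RL
    target_update_episodes: int = 30  # Target network update frequency
    memory_capacity: int = 1000       # Replay memory capacity
    sigma: float = 0.1                # Discount weight of waiting time in reward


def select_action(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q-values.

    Assumption: epsilon is the probability of choosing the greedy action;
    otherwise a uniformly random action is returned.
    """
    if rng.random() < epsilon:
        return max(range(len(q_values)), key=lambda a: q_values[a])
    return rng.randrange(len(q_values))


def should_update_target(episode, params):
    """Return True when the target network is due for a sync.

    Assumption: episodes are counted from 1 and the sync happens
    every `target_update_episodes` episodes.
    """
    return episode > 0 and episode % params.target_update_episodes == 0
```

Keeping the values in one immutable object makes it harder for separate agents in a federated setup to drift apart in their settings by accident, although how the original experiment stores its parameters is not stated in the source.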