Table 3 Hyperparameters of the delay shift agent and baseline methods after validation.

From: Decentralized queue control with delay shifting in edge-IoT using reinforcement learning

| Agent | Learning rate | Discount factor (γ) | \(\epsilon\)-decay | L2 regularisation | Action selection strategy |
|---|---|---|---|---|---|
| Delay Shift | 0.001 | 0.95 | 0.995 | 0.01 | \(\epsilon\)-greedy policy based on Q-function |
| Rule-Based | – | – | – | – | Δ = +2 s (offset applied to the estimated task delay) |
| Entropy-Based | 0.01 | 0.9 | – | 0.001 | Softmax over Q-estimates (T = 0.5) |
| Heuristic QoS | – | – | – | – | p = 0.7 (probabilistic admission for tasks with delay ≤ 100 ms) |

Dashes (–) mark hyperparameters that do not apply: the Rule-Based and Heuristic QoS baselines are not learned, and the Entropy-Based agent uses softmax rather than \(\epsilon\)-greedy exploration, so no \(\epsilon\)-decay is needed.
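As an illustrative sketch (not the paper's implementation), the two learned action-selection strategies in Table 3 could look as follows; the function names and the per-step multiplicative \(\epsilon\)-decay schedule are assumptions, while the numeric values (decay 0.995, temperature 0.5) come from the table:

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action (explore);
    otherwise pick the action with the highest Q-estimate (exploit)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def softmax_action(q_values, temperature, rng):
    """Boltzmann (softmax) selection over Q-estimates; lower temperature
    concentrates probability mass on the greedy action."""
    logits = np.asarray(q_values, dtype=float) / temperature
    logits -= logits.max()          # subtract max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(q_values), p=probs))

rng = np.random.default_rng(0)
q = [0.1, 0.4, 0.2]                 # example Q-estimates for 3 actions

epsilon = 1.0                       # assumed initial epsilon
for step in range(3):
    action = epsilon_greedy(q, epsilon, rng)
    epsilon *= 0.995                # epsilon-decay from Table 3 (Delay Shift)

action = softmax_action(q, temperature=0.5, rng=rng)  # Entropy-Based (T = 0.5)
```

With \(\epsilon = 0\) the \(\epsilon\)-greedy policy is purely greedy, and as the temperature approaches zero the softmax policy converges to the same greedy choice.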