Table 3 Hyperparameters of the delay-shift agent and the baseline methods, selected after validation.
| Agent | Learning rate | Discount factor (\(\gamma\)) | \(\epsilon\)-decay | L2 regularisation | Action-selection strategy |
|---|---|---|---|---|---|
| Delay Shift | 0.001 | 0.95 | 0.995 | 0.01 | \(\epsilon\)-greedy over the learned Q-function |
| Rule-Based | – | – | – | – | Fixed offset Δ = +2 s applied to the estimated task delay |
| Entropy-Based | 0.01 | 0.9 | – | 0.001 | Softmax over Q-estimates (temperature T = 0.5) |
| Heuristic QoS | – | – | – | – | Probabilistic admission with p = 0.7 for tasks with delay ≤ 100 ms |
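To make the action-selection column concrete, below is a minimal Python sketch of the four strategies. Only the numeric hyperparameters come from Table 3; the function names, the four-action Q-vector, and the per-step decay schedule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Values taken from Table 3; everything else in this sketch is assumed.
LR, GAMMA, EPS_DECAY, L2 = 0.001, 0.95, 0.995, 0.01  # Delay Shift agent
SOFTMAX_T = 0.5                                       # Entropy-Based baseline
RULE_DELTA_S = 2.0                                    # Rule-Based: fixed +2 s offset
QOS_P, QOS_DELAY_MS = 0.7, 100.0                      # Heuristic QoS admission

def epsilon_greedy(q_values, epsilon):
    """Delay Shift agent: explore uniformly with prob. epsilon, else exploit argmax Q."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def softmax_action(q_values, temperature=SOFTMAX_T):
    """Entropy-Based baseline: sample from a Boltzmann distribution over Q-estimates."""
    logits = np.asarray(q_values) / temperature
    probs = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return int(rng.choice(len(q_values), p=probs))

def rule_based_delay(estimated_delay_s):
    """Rule-Based baseline: apply the fixed +2 s offset to the estimated task delay."""
    return estimated_delay_s + RULE_DELTA_S

def heuristic_qos_admit(task_delay_ms):
    """Heuristic QoS baseline: admit tasks with delay <= 100 ms with probability 0.7."""
    return task_delay_ms <= QOS_DELAY_MS and rng.random() < QOS_P

# Example: a few decision steps with a hypothetical 4-action Q-vector,
# decaying epsilon once per step (the decay schedule is an assumption).
q = np.array([0.2, 0.8, 0.5, 0.1])
epsilon = 1.0
for step in range(3):
    a = epsilon_greedy(q, epsilon)
    epsilon *= EPS_DECAY
    print(f"step {step}: action {a}, epsilon -> {epsilon:.4f}")
```

Note that the learning rate, discount factor, and L2 coefficient are listed here only to mirror the table; they would enter the Q-update rule rather than the action-selection step shown above.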