Table 4 Statistical Results: Learned Policy over 20 Trials (Fully Unknown Dynamics).
From: Reinforcement learning-based optimal control for stochastic opinion dynamics
| Metric | Gain \(K_1\) | Gain \(K_2\) | Mean Discounted Cost \(J_{\text{mean}}\) |
|---|---|---|---|
| Mean | 0.549062 | 0.723203 | 0.948447 |
| Standard Deviation (SD) | 0.008509 | 0.008733 | 0.004300 |
| Coefficient of Variation (CV = SD/Mean) | 1.55% | 1.21% | 0.45% |
| 95% Confidence Interval (95% CI) | [0.5457, 0.5524] | [0.7199, 0.7265] | [0.9466, 0.9503] |
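
The CV row follows directly from the Mean and SD rows through the stated definition CV = SD/Mean. As a minimal sketch of that arithmetic, the snippet below recomputes the percentages from the reported values. The names `reported`, `coefficient_of_variation`, and `cv_percent` are illustrative assumptions and do not come from the paper.

```python
# Reported (mean, standard deviation) pairs copied from Table 4.
# The dictionary keys are hypothetical labels chosen for this sketch.
reported = {
    "K1": (0.549062, 0.008509),
    "K2": (0.723203, 0.008733),
    "J_mean": (0.948447, 0.004300),
}


def coefficient_of_variation(mean, sd):
    """Return CV = SD / Mean, expressed as a percentage."""
    return 100.0 * sd / mean


# Round to two decimals to match the precision used in the table.
cv_percent = {
    name: round(coefficient_of_variation(mean, sd), 2)
    for name, (mean, sd) in reported.items()
}

for name, cv in cv_percent.items():
    print(f"{name}: CV = {cv:.2f}%")
```

Rounded to two decimals, the recomputed percentages agree with the CV row of the table.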