Table 4. Statistical Results: Learned Policy over 20 Trials (Fully Unknown Dynamics).

From: Reinforcement learning-based optimal control for stochastic opinion dynamics

| Metric | Gain \(K_1\) | Gain \(K_2\) | Mean Discounted Cost \(J_{\text{mean}}\) |
| --- | --- | --- | --- |
| Mean | 0.549062 | 0.723203 | 0.948447 |
| Standard Deviation (SD) | 0.008509 | 0.008733 | 0.004300 |
| Coefficient of Variation (CV = SD/Mean) | 1.55% | 1.21% | 0.45% |
| 95% Confidence Interval (95% CI) | [0.5457, 0.5524] | [0.7199, 0.7265] | [0.9466, 0.9503] |
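
The coefficient of variation in the table is defined as CV = SD/Mean, so it can be checked directly from the reported means and standard deviations. The following is a minimal sketch of that arithmetic; the names `reported`, `coefficient_of_variation`, and `cv_percent` are illustrative and do not come from the paper. It does not use the 20 per-trial values, which are not listed in the table.

```python
# Reported (mean, SD) pairs copied from Table 4.
# The dictionary keys are illustrative labels, not identifiers from the paper.
reported = {
    "K1": (0.549062, 0.008509),
    "K2": (0.723203, 0.008733),
    "J_mean": (0.948447, 0.004300),
}


def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return CV = SD / Mean, expressed as a percentage."""
    return 100.0 * sd / mean


# Round to two decimals to match the precision used in the table.
cv_percent = {
    name: round(coefficient_of_variation(mean, sd), 2)
    for name, (mean, sd) in reported.items()
}

for name, cv in cv_percent.items():
    print(f"{name}: CV = {cv:.2f}%")
```

Under these inputs the sketch gives 1.55%, 1.21%, and 0.45%, matching the CV row of the table.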