Fig. 9

Mean and standard deviation of total reward across three independent DRL trainings. The solid line represents the mean reward over episodes, while the shaded region denotes the standard deviation.
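
As a minimal sketch of how such a curve could be computed, the snippet below takes the per-episode total rewards of several training runs and returns the mean line with a band of one standard deviation on either side. The function name `reward_band`, the sample reward values, and the use of the population standard deviation are assumptions made for illustration; the caption does not say which estimator was used.

```python
from statistics import mean, pstdev


def reward_band(runs):
    """Return per-episode (means, lowers, uppers) across training runs.

    `runs` is a list of equal-length reward sequences, one per training.
    Assumption: the band is mean +/- one population standard deviation.
    """
    means, lowers, uppers = [], [], []
    # zip(*runs) groups the rewards of all runs episode by episode.
    for rewards_at_episode in zip(*runs):
        m = mean(rewards_at_episode)
        s = pstdev(rewards_at_episode)
        means.append(m)
        lowers.append(m - s)
        uppers.append(m + s)
    return means, lowers, uppers


# Hypothetical total rewards: three runs, four episodes each.
runs = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
]
means, lowers, uppers = reward_band(runs)
```

Here `means` would be drawn as the solid line, and the region between `lowers` and `uppers` would be shaded. With only three runs, the sample standard deviation (`statistics.stdev`) gives a noticeably wider band than the population form, so the choice is worth stating alongside the figure.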
