Figure 4 | Scientific Reports

Figure 4

From: Exploring optimal control of epidemic spread using reinforcement learning

Heatmaps of the reward function. The horizontal axis shows the percentage of active cases, and the vertical axis shows the cumulative death percentage. From left to right, the three heatmaps show the reward distribution under level-0, level-1, and level-2 movement restrictions, respectively. At restriction levels 0, 1, and 2, the value of \(E_t\) is expected to be approximately 1, 0.75, and 0.25, respectively.
