Figure 22
From: Exploring optimal control of epidemic spread using reinforcement learning

The graph compares the agent's policy with the traditional n-work-m-lockdown policy. The comparison is performed on a population of 10,000 with a density of 0.02. A 7-work-7-lockdown policy alone cannot halt the rapid spread of the virus, and a total of 34.5% of the population becomes infected. If the 7-work-7-lockdown policy is instead applied after a full lockdown of 40 days, the overall infection falls to 11.5%. By contrast, the agent-generated policy mostly flattens the curve.
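For readers unfamiliar with the n-work-m-lockdown policy, the following is a minimal sketch of how such a cyclic schedule might be expressed. The function name `is_lockdown` and its parameters are hypothetical and do not come from the paper. The sketch also assumes that each cycle begins with the work phase and that, when an initial full lockdown is used, the cycle starts on the first day after it; the caption does not state either detail.

```python
def is_lockdown(day, work_days=7, lockdown_days=7, initial_lockdown=0):
    """Return True if the given 0-indexed day falls under lockdown.

    Hypothetical helper: assumes each cycle starts with the work phase
    and that the cycle begins right after any initial full lockdown.
    """
    # Optional full lockdown before the cyclic policy starts.
    if day < initial_lockdown:
        return True
    # Position within the repeating work/lockdown cycle.
    cycle_day = (day - initial_lockdown) % (work_days + lockdown_days)
    # The first `work_days` days of each cycle are work days.
    return cycle_day >= work_days


# 7-work-7-lockdown on its own, over four weeks.
plain_schedule = [is_lockdown(d) for d in range(28)]

# 7-work-7-lockdown preceded by a 40-day full lockdown, over 68 days.
delayed_schedule = [is_lockdown(d, initial_lockdown=40) for d in range(68)]
```

Under these assumptions, `plain_schedule` alternates seven work days with seven lockdown days, while `delayed_schedule` stays in lockdown for the first 40 days before the same cycle begins.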