Table 2 Overall comparison of the memory-based agents, expressed as ranks.

From: Exploring optimal control of epidemic spread using reinforcement learning

| | M7 | M15 | M30 | M45 | M60 |
|---|---|---|---|---|---|
| Points (per day) | 5 | 2 | **1** | 3 | 4 |
| Economy (per day) | 5 | 4 | 2 | 3 | **1** |
| Infected | 3 | 2 | **1** | 4 | 5 |

  1. For each comparison criterion, the agents are ranked from 1 to 5; lower ranks are better. Agent M30 performs best overall: it ranks first on points and on infections, and second on economy.
  2. The minimum (best) rank in each row is marked in bold.
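
One way to check the overall claim in note 1 is to add up each agent's ranks across the three criteria. This is a minimal sketch: the source does not say how the overall judgment was made, so the equal-weight rank sum and the names `ranks` and `rank_sums` are assumptions made for illustration. The rank values themselves are copied from the table.

```python
# Ranks copied from Table 2 (1 = best, 5 = worst).
ranks = {
    "Points (per day)": {"M7": 5, "M15": 2, "M30": 1, "M45": 3, "M60": 4},
    "Economy (per day)": {"M7": 5, "M15": 4, "M30": 2, "M45": 3, "M60": 1},
    "Infected": {"M7": 3, "M15": 2, "M30": 1, "M45": 4, "M60": 5},
}


def rank_sums(table):
    """Sum each agent's rank over all criteria (assumed equal weighting)."""
    totals = {}
    for row in table.values():
        for agent, rank in row.items():
            totals[agent] = totals.get(agent, 0) + rank
    return totals


totals = rank_sums(ranks)
# The agent with the lowest rank sum is the best under this assumed aggregation.
best = min(totals, key=totals.get)
print(totals)
print(best)
```

Under this assumed aggregation, M30 has the lowest rank sum (4), followed by M15 (8), which agrees with the statement that M30 performs better than the other agents. A different weighting of the criteria could change the ordering of the remaining agents.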