Table 4 Cross-validation of batch size and update frequency.
| Batch size | Update frequency | Average reward | Training time (hours) |
|---|---|---|---|
| 16 | 25 | 79.3 | 7.2 |
| 16 | 50 | 80.1 | 7.8 |
| 16 | 100 | 77.6 | 6.9 |
| 32 | 25 | 81.2 | 8.1 |
| 32 | 50 | 82.8 | 8.5 |
| 32 | 100 | 80.9 | 8.3 |
| 64 | 25 | 75.4 | 9.2 |
| 64 | 50 | 76.8 | 9.7 |
| 64 | 100 | 74.1 | 9.0 |