Figure 6 | Scientific Reports

Figure 6

From: Quantum deep reinforcement learning for clinical decision support in oncology: application to adaptive radiotherapy

Figure 6

Reward function for reinforcement learning. Contour plot of reward function as a function of the probability of local control (PLC) and radiation induced pneumonitis of grade 2 or higher (PRP2). Area enclosed by the blue line corresponds to the clinically desirable outcome, i.e., \(P_{LC} > 70{\%}\) and \({P_{RP2}} <17.2{\%}\) . Similarly, the area enclosed by the green lines corresponds to the computationally desirable outcome, i.e., \(P_{LC} > 50{\%}\)  and \({P_{RP2}} <50{\%}\). Along with \(P_{LC} \times (1-P_{RP2})\) the AI agent receives + 10 reward for achieving clinically desirable outcome, + 5 for achieving computationally desirable outcome, and -1 when unable to achieve a desirable outcome.

Back to article page