Figure 4

Training and validation results. (a) Similarity scores and (b) self-evaluation results for the training and validation data sets. Five numerical experiments were carried out for each of three models: deep reinforcement learning (DRL), quantum DRL (qDRL) trained on a simulator, and qDRL trained on an IBMQ quantum processor. The DRL model used the Double DQN algorithm. All results correspond to the average model. The results of the Self-Evaluation I scheme are expressed as a percentage of the total number of patients. The similarity score is the root mean square error (RMSE) between the retrospective clinical dose decision and the AI recommendation, whereas the self-evaluation value is calculated from the clinical outcome and the clinically established relationship between radiation dose and RT outcome.
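
For concreteness, the similarity score described in the caption can be written in the standard RMSE form. This is a sketch under assumed notation that does not appear in the caption: $N$ is the number of dose decisions compared, $d_i^{\mathrm{clin}}$ is the retrospective clinical dose decision for case $i$, and $d_i^{\mathrm{AI}}$ is the corresponding AI recommendation.

```latex
\[
  \mathrm{RMSE}
  = \sqrt{\frac{1}{N}\sum_{i=1}^{N}
      \left(d_i^{\mathrm{clin}} - d_i^{\mathrm{AI}}\right)^{2}}
\]
```

Under this form, a lower score means closer agreement with the clinical decisions, and the score carries the same units as the dose.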