Figure 2 | Scientific Reports

Figure 2

From: Computational medication regimen for Parkinson’s disease using reinforcement learning

Comparison of estimated penalty distributions (i.e., the estimated cumulative sum of future total UPDRS III scores). The penalty score of an individual patient is \(V=\sum_{t=1}^{T}\gamma^{t-1}r_{t}\), where \(r_{t}\) is the total UPDRS III score at visit \(t\) and \(\gamma =0.3\) is the discount factor. The estimate of \(V\) was computed by importance sampling. We computed the distribution of penalty scores across 500 independent bootstrap iterations, each with resampled training (80%) and test (20%) sets. The final policy was chosen by majority vote across the 500 bootstraps. (a) Comparison of the penalty scores from four different strategies: the clinicians' policy, the AI's policy, zero drug (i.e., no drugs are given in any state), and random drug (i.e., randomly chosen drugs are given). Each box extends from the lower to the upper quartile. The horizontal line in each box is the median. (b) Pairwise comparison of the penalty scores between the AI and the clinicians.
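As a minimal sketch of the discounted sum in the caption, the penalty score \(V\) for one patient could be computed as follows. The function name `discounted_penalty` and the example scores are hypothetical and are not taken from the article; the importance-sampling estimate and the bootstrap procedure are not reproduced here.

```python
def discounted_penalty(scores, gamma=0.3):
    """Return the sum over visits t = 1..T of gamma**(t-1) * r_t.

    `scores` holds the total UPDRS III score r_t at each visit, in visit order.
    """
    # enumerate starts at 0, so the exponent k equals t - 1 for visit t.
    return sum(gamma ** k * r for k, r in enumerate(scores))


# Hypothetical total UPDRS III scores at three consecutive visits
# (illustrative values only, not data from the study).
example_scores = [20, 30, 40]
penalty = discounted_penalty(example_scores)
```

With these illustrative values the sum is 20 + 0.3 × 30 + 0.09 × 40 = 32.6, up to floating-point rounding. Because \(\gamma = 0.3\) is small, the score at the current visit dominates and later visits contribute progressively less.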