Figure 5 | Scientific Reports

Figure 5

From: Robust diagnostic classification via Q-learning

An example of how a policy updates across all possible responses to an inquiry. The top row captures the initial “empty” state of the policy, while the branches represent all of the possible state updates that could occur, depending on the observation made following the action taken. The column vector represents the state of the policy, that is, the items that the policy has information about so far. The horizontal bar chart captures the relative Q-value of each action (an action is either querying an item or making a prediction). As ADI_45 has the highest Q-value, it is the first item queried by the policy. The arrows capture the possible responses, or observations, that the policy can receive, which in turn are used to update the state. The vertical bar chart captures the current state’s predicted probabilities of ADHD and ASD, respectively (Belief).
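
The query-and-branch step described in the caption can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: apart from ADI_45, the item and action names are hypothetical, the Q-values are invented, and the response coding (0, 1, 2) is assumed for the example.

```python
from typing import Dict, Iterable, Optional

# State: maps each item to its observed response, or None if not yet queried.
State = Dict[str, Optional[int]]


def greedy_action(q_values: Dict[str, float]) -> str:
    """Return the action (query or prediction) with the highest Q-value."""
    return max(q_values, key=q_values.get)


def update_state(state: State, item: str, response: int) -> State:
    """Return a copy of the state with the response to `item` recorded."""
    new_state = dict(state)
    new_state[item] = response
    return new_state


def branch(state: State, item: str, responses: Iterable[int]) -> Dict[int, State]:
    """Enumerate the successor state for every possible response to `item`."""
    return {r: update_state(state, item, r) for r in responses}


# Initial "empty" state: nothing has been observed yet.
# ITEM_B is a hypothetical second item used only for illustration.
empty_state: State = {"ADI_45": None, "ITEM_B": None}

# Invented Q-values; actions are item queries or diagnostic predictions.
q_values = {"ADI_45": 0.9, "ITEM_B": 0.4, "predict_ADHD": 0.1, "predict_ASD": 0.2}

first_action = greedy_action(q_values)

# Assumed response coding (0, 1, 2); one branch per possible observation.
children = branch(empty_state, first_action, [0, 1, 2])
```

Here `first_action` is the query with the largest Q-value, and `children` holds one updated state per possible observation, which corresponds to the branches drawn beneath the top row of the figure. In the actual policy, the Q-values and belief would be recomputed for each child state.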
