Fig. 8: The POMDP policy can be implemented by a drift diffusion model (DDM) with collapsing bounds.

From: Bayesian inference with incomplete knowledge explains perceptual confidence and its deviations from accuracy

a (Left panel) In the standard DDM, the decision variable (DV) is the accumulated sum of observations over time. The process stops when the DV reaches one of the static decision bounds (+B or −B). (Right panel) Graphical model for a POMDP. For each time step (indicated by the subscripts 0, 1, ..., t − 1, t), r is the reward received for taking action a in hidden state s, and z is the observation emitted in hidden state s. The POMDP model infers a posterior probability distribution over hidden states at each time step based on past observations and actions. In the motion discrimination task, the actions are committing to a choice or making another observation. The model commits to a choice when the expected increase in the probability of a correct response is not worth the cost of an extra observation. b The time-varying bounds on μt in the POMDP policy map (e.g., solid white lines in Fig. 6b) have equivalent time-varying bounds on the DV in the DDM (Eq. (10); white lines in this panel). Similarly, the low- and high-confidence regions (blue and red regions, respectively) of the POMDP policy map in Fig. 6b have equivalents in the DDM, as shown here.
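For intuition, the following is a minimal simulation sketch of a DDM with collapsing bounds in the spirit of panel b: Gaussian momentary evidence is accumulated until the DV crosses a time-varying bound. The exponential bound shape and all parameter values (B0, tau, sigma, dt) are illustrative assumptions for this sketch, not the POMDP-to-DDM mapping defined by Eq. (10) of the paper.

```python
# A minimal sketch of a DDM with collapsing decision bounds.
# Assumptions (not from the paper): Gaussian momentary evidence and an
# exponentially collapsing bound B(t) = B0 * exp(-t / tau).
import numpy as np

def simulate_ddm_collapsing(drift, sigma=1.0, dt=1e-3, B0=1.0,
                            tau=0.5, t_max=3.0, rng=None):
    """Run one trial; return (choice, decision_time, dv_trace)."""
    rng = np.random.default_rng() if rng is None else rng
    n_steps = int(t_max / dt)
    dv = 0.0
    trace = []
    for i in range(n_steps):
        t = i * dt
        bound = B0 * np.exp(-t / tau)            # collapsing bound B(t)
        # Accumulate drift plus diffusion noise for this time step.
        dv += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        trace.append(dv)
        if dv >= bound:
            return +1, t, np.array(trace)        # commit to one choice
        if dv <= -bound:
            return -1, t, np.array(trace)        # commit to the other
    # No bound crossing before t_max: choose by the sign of the DV.
    return int(np.sign(dv)) or +1, t_max, np.array(trace)

# Example usage: a weak positive drift, as in a low-coherence motion trial.
choice, dtime, _ = simulate_ddm_collapsing(drift=0.8)
print(f"choice={choice:+d}, decision time={dtime:.3f}s")
```

Because the bound shrinks over time, later commitments occur at smaller |DV|, which is one way a confidence readout based on the DV and elapsed time can dissociate from accuracy.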
