Fig. 4: Results replicate in a probabilistic learning task.

From: A habit and working memory model as an alternative account of human reward-based learning

a, Model comparison showing the results from a family of models manipulating the subjective outcome value of outcome 0, r0, for RL, WM or both, with r0 a free parameter unless its fixed value is given in the label. r0 = 0 corresponds to standard RL or WM computations; r0 = 1 corresponds to an H agent that treats both outcomes alike. Highlighted in pink are agents that can be interpreted as WMH, and in brown those that correspond to RL mixtures. The winning model fixes r0 = 1 for RL and r0 = 0 for WM and is thus a WMH agent, replicating the findings from the deterministic version of the task. I further verified that the winning model was better than the best single-process model, WMf (Methods). Data are plotted as individual AIC values (dots) and the group mean AIC (± standard error), baselined to the group mean of the best model; the right plot shows the proportion of participants best fit by each model. b, A set-size effect was also observed in a probabilistic version of the task; the winning model (third from the left) captures the learning-curve pattern better than the competing models. The error bars indicate the standard error of the mean across n = 34 individual participants (dots in a).
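The r0 manipulation in panel a amounts to replacing the raw outcome 0 with a subjective value r0 before each module's update. The sketch below is a minimal illustration of that idea, assuming a delta-rule RL update and a one-shot WM update; the function and variable names (subjective_outcome, rl_update, wm_update, alpha) are illustrative and not taken from the paper's code.

```python
import numpy as np

def subjective_outcome(r, r0):
    """Map a binary outcome r in {0, 1} to its subjective value: outcome 1
    keeps value 1, outcome 0 is revalued to r0 (0 = standard, 1 = habit-like)."""
    return 1.0 if r == 1 else r0

def rl_update(q_values, stimulus, action, r, alpha, r0_rl):
    """Assumed delta-rule RL update applied to the subjective outcome."""
    target = subjective_outcome(r, r0_rl)
    q_values[stimulus, action] += alpha * (target - q_values[stimulus, action])

def wm_update(wm_values, stimulus, action, r, r0_wm):
    """Assumed one-shot working-memory update applied to the subjective outcome."""
    wm_values[stimulus, action] = subjective_outcome(r, r0_wm)

# Winning model in panel a: r0 = 1 for the RL module (habit-like, both
# outcomes treated alike) and r0 = 0 for the WM module (standard).
q = np.zeros((3, 3))
wm = np.zeros((3, 3))
rl_update(q, stimulus=0, action=1, r=0, alpha=0.1, r0_rl=1.0)
wm_update(wm, stimulus=0, action=1, r=0, r0_wm=0.0)
```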