Figure 1: Restless bandit task and re-analysis.
From: Reminders of past choices bias decisions for reward in humans

(a) Four-armed bandit task from Daw et al. (2006). Participants chose between four slot machines to receive points. (b) Payoffs. The mean number of points paid out by each machine varied slowly over the course of the experiment. (c) Model comparison. Log Bayes factors favouring the sampling model over the temporal-difference (TD) model.
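
To make the payoff structure in panel (b) concrete, the following is a minimal sketch of how slowly varying mean payoffs could be simulated. It assumes each machine's mean follows a decaying Gaussian random walk, which is one common way to build a restless bandit. The function names, the parameter values (decay, center, drift_sd, payoff_sd), the starting range, and the 1 to 100 point range are illustrative assumptions, not values taken from the figure or the original task.

```python
import random


def simulate_mean_payoffs(n_trials=300, n_arms=4, decay=0.98,
                          center=50.0, drift_sd=3.0, seed=0):
    """Return a list of per-trial mean payoffs, one list of n_arms values per trial.

    Assumption: each mean drifts as a decaying Gaussian random walk that is
    pulled gently back towards `center`. All defaults are illustrative.
    """
    rng = random.Random(seed)
    # Hypothetical starting means, spread out so the arms differ initially.
    means = [[rng.uniform(20.0, 80.0) for _ in range(n_arms)]]
    for _ in range(1, n_trials):
        previous = means[-1]
        means.append([
            decay * m + (1.0 - decay) * center + rng.gauss(0.0, drift_sd)
            for m in previous
        ])
    return means


def draw_payoff(mean, payoff_sd=4.0, rng=random):
    """Draw one noisy integer payoff around `mean`, clipped to 1..100 (assumed range)."""
    points = round(rng.gauss(mean, payoff_sd))
    return max(1, min(100, points))


if __name__ == "__main__":
    means = simulate_mean_payoffs()
    rng = random.Random(1)
    # Payoff a participant might see when choosing machine 2 on the first trial.
    print(draw_payoff(means[0][2], rng=rng))
```

Because the means drift only a little from trial to trial, the best machine changes over the course of the session, which is what makes continued exploration worthwhile in this kind of task.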