Fig. 3: Quantifying behavior models of a value-based choice task.

a Schematic of trial structure showing mice initiating trials via sustained (500 ms) center port entry, followed by a left/right choice within 3 s. Reward is subsequently delivered from the center port. b Five behavior models using choice and reward history to predict the animal's current choice. c Akaike information criterion (AIC) comparison (mean ± SEM) of the five behavior models of choice behavior (WS-LSh, WinStay-LoseShift; LogReg, Logistic Regression; Q-learning, standard Q-learning model; rLogReg, recursive Logistic Regression; Q+forget, Q-learning model with forgetting for the unchosen choice). d Relationship between the probability of right choice and the ∆Q value obtained from the Q+forget model. Mean (thick gray line) and individual animal replicates (thin gray lines). e Trial-by-trial choices (blue bar, right; green bar, left), outcomes (long bar, positive; short bar, negative) and predicted ∆Q values derived from the Q+forget model.
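For readers unfamiliar with the Q+forget model, the following is a minimal sketch of how trial-by-trial ∆Q values (panels d, e) and an AIC score (panel c) could be computed. The caption does not give the update equations, so everything here is an assumption: a delta-rule update for the chosen option, multiplicative decay of the unchosen option, ∆Q defined as Q(right) minus Q(left), a logistic choice rule, and the hypothetical parameter names `alpha`, `forget` and `beta` with arbitrary illustrative values.

```python
import math


def q_forget_delta_q(choices, rewards, alpha=0.5, forget=0.2, beta=3.0):
    """Sketch of Q-learning with forgetting (assumed form, not the authors' code).

    choices: sequence of 1 (right) or 0 (left)
    rewards: sequence of 1 (rewarded) or 0 (unrewarded)
    Returns per-trial delta-Q, predicted P(right), and the total log-likelihood.
    """
    q_left, q_right = 0.0, 0.0
    delta_q, p_right, log_lik = [], [], 0.0
    for c, r in zip(choices, rewards):
        # Prediction is made before the outcome of the trial is known.
        dq = q_right - q_left
        p = 1.0 / (1.0 + math.exp(-beta * dq))  # assumed logistic choice rule
        delta_q.append(dq)
        p_right.append(p)
        log_lik += math.log(p if c == 1 else 1.0 - p)
        if c == 1:
            q_right += alpha * (r - q_right)  # delta-rule update, chosen side
            q_left *= 1.0 - forget            # assumed decay, unchosen side
        else:
            q_left += alpha * (r - q_left)
            q_right *= 1.0 - forget
    return delta_q, p_right, log_lik


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_lik


# Toy usage with made-up data (not from the paper).
dq, p, ll = q_forget_delta_q([1, 1, 0], [1, 0, 1])
score = aic(ll, n_params=3)
```

In practice the parameters would be fitted per animal by maximizing the log-likelihood, and the resulting AIC values compared across the five models.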