Fig. 3: Latent state bias and computational learning models.
From: Unconscious reinforcement learning of hidden brain states supported by confidence

a Example state time-courses, for each session, from two subjects. R and L denote the two possible states (decoder outputs). Each line represents trial-by-trial decoded outputs smoothed with a moving average filter (span = 5 trials). b Individual traces of the extent of latent RL state bias throughout each session. Latent state bias was defined as the unsigned difference between the number of occurrences of each state, normalized by the number of trials. Time-courses were computed with a moving window (span = 30 trials), and then smoothed with a moving average filter (span = 5 trials). The black dotted lines indicate the group mean, the shaded areas the 95% confidence intervals, and the coloured lines individual subjects. c Individual and group average of the degree of absolute latent state bias, as output by the decoder, for each session. d Mean latent state bias plotted vs. mean p(opt action), averaged over sessions 1–2. Pearson correlation (n = 18), two-sided p value. e Example time-courses of actions selected by two subjects (blue lines) vs. actions selected by the two RL algorithms that were fitted to the data. Top, black lines: state-free RL model. Bottom, grey lines: noisy state-dependent RL model. f The Akaike information criterion (AIC; ref. 31) was computed for each subject, session and model. Lower AIC indicates a better fit. Left: bars show the difference in AIC between the two models, ΔAIC = AICsd − AICsf (sd, noisy state-dependent RL; sf, state-free RL). ΔAIC < 0 in all sessions, indicating lower AIC for the noisy state-dependent RL. Right: subject-level and median ΔAIC for each session. Within-session statistical test against 0: full linear model, with the intercept as difference from 0 (two-sided p values, FDR corrected). Between-session statistical test of difference (sessions 1 vs. 2): sign test (one-sided p value, uncorrected). In c, d, f, coloured dots represent individual subjects; in c, f, thick horizontal lines represent the mean and the median, respectively, and error bars the SEM.
The experiment was conducted once (n = 18 biologically independent samples). f n.s. P = 0.42, **P = 0.0085, ***P = 0.0014.
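The quantities defined in b and f can be illustrated with a minimal sketch. It is not the authors' analysis code, and several details are assumptions that the caption does not specify: states are coded as the strings "R" and "L", the 30-trial window slides one trial at a time, the moving average is centred with a window that shrinks at the edges, and AIC takes its standard form 2k − 2 ln L. All function names are hypothetical.

```python
import numpy as np

def moving_average(x, span=5):
    """Centred moving average; the window shrinks near the edges (assumed edge handling)."""
    x = np.asarray(x, dtype=float)
    half = span // 2
    return np.array([x[max(0, i - half): i + half + 1].mean() for i in range(len(x))])

def latent_state_bias(states):
    """Unsigned difference between the counts of the two states, divided by the number of trials."""
    states = np.asarray(states)
    n_r = int(np.sum(states == "R"))
    n_l = int(np.sum(states == "L"))
    return abs(n_r - n_l) / len(states)

def bias_time_course(states, window=30, smooth=5):
    """Bias in a sliding window (span = 30 trials), then smoothed (span = 5 trials)."""
    states = np.asarray(states)
    raw = [latent_state_bias(states[i:i + window])
           for i in range(len(states) - window + 1)]
    return moving_average(raw, smooth)

def aic(log_likelihood, n_params):
    """Standard Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def delta_aic(ll_sd, k_sd, ll_sf, k_sf):
    """AIC of the state-dependent model minus AIC of the state-free model."""
    return aic(ll_sd, k_sd) - aic(ll_sf, k_sf)
```

With this definition, a session in which the decoder outputs only one state has a bias of 1 and a perfectly balanced session has a bias of 0; a negative `delta_aic` favours the noisy state-dependent model, matching the sign convention in f.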