Fig. 2: Learning to choose optimal actions. | Nature Communications

From: Unconscious reinforcement learning of hidden brain states supported by confidence

a Subjects’ probabilities of selecting the optimal action (the one that was more likely to be rewarded on a given trial) in each session. In the violin plots, the shaded areas represent the population spread and variance, the white dot the median, the thicker line the interquartile range, and coloured dots individual subjects. Within-session statistical test against chance: full linear model, with the intercept as the difference from chance (two-sided p values, FDR corrected). Between-session statistical test of difference (sessions 1 vs. 2): sign test (one-sided p value, uncorrected).

b For each subject, a control p(opt action) was computed according to a win-stay lose-switch heuristic: under this strategy, an agent repeats the same action if it was rewarded on the previous trial and switches otherwise. Grey histograms show the probability density function (PDF) of p(opt action) under the win-stay lose-switch strategy, while coloured histograms show the PDF of subjects’ actual p(opt action) rates. Within-session statistical test of difference: sign test (two-sided p values, FDR corrected). The experiment was conducted once (n = 18 biologically independent samples). a n.s. P = 0.15, **P = 0.003, ***P = 2.9 × 10−8; b n.s. P = 0.48, *P = 0.019, ***P = 0.0004.
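The win-stay lose-switch control described in panel b can be illustrated with a minimal simulation. The sketch below is an assumption-laden toy model, not the study's actual task: the reward probabilities, trial count, and function name `wsls_p_opt` are all illustrative choices, used only to show how an agent following this heuristic yields a p(opt action) rate in a two-armed bandit.

```python
import numpy as np

def wsls_p_opt(p_reward_opt=0.7, p_reward_subopt=0.3, n_trials=200, seed=0):
    """Fraction of trials on which a win-stay lose-switch agent picks the
    optimal action in a two-armed bandit.

    All parameter values are illustrative assumptions, not the study's
    actual task parameters.
    """
    rng = np.random.default_rng(seed)
    action = int(rng.integers(2))      # 0 = optimal arm, 1 = suboptimal arm
    n_opt = 0
    for _ in range(n_trials):
        n_opt += (action == 0)
        p = p_reward_opt if action == 0 else p_reward_subopt
        rewarded = rng.random() < p
        if not rewarded:               # lose-switch; on a win, keep the action
            action = 1 - action
    return n_opt / n_trials
```

Running `wsls_p_opt()` for each simulated agent (or replaying each subject's reward sequence through the rule) gives the grey control distribution against which the subjects' actual p(opt action) rates are compared.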