Fig. 2: Individual differences in mesolimbic dopamine signals correlate with learned behavioural policy.
From: Mesolimbic dopamine adapts the rate of learning from action

a, Schematic for fibre photometry with optional simultaneous optogenetic stimulation of midbrain dopamine neurons. b, Fibre paths and virus expression from an example experiment. A–P, anterior–posterior. c, Left and middle: NAc–DA, licking, body movements, whisking probability and pupil diameter measurements for the mean of animals with lowest (blue, n = 4) and highest (pink, n = 4) NAc–DA reward signals over the initial 100 trials, shown for trials 1–100 (left) compared to trials 600–800 (middle). Right: means of responses (resp.) in the analysis windows indicated at left (dashed grey boxes) across training. d, Illustration of fibre locations for each mouse (n = 9), colour-coded according to the size of their initial NAc–DA reward signals. DLS, dorsolateral striatum; NAsh, nucleus accumbens shell. e, Initial (Init.) NAc–DA reward responses (trials 1–100) were correlated with final latency to collect reward (bottom, Pearson’s r = 0.81, P = 0.008), but not with final cued NAc–DA response (top, Pearson’s P = 0.47) (n = 9 mice). f, ACTR simulations with low (small pink dots, n = 6) or high (large blue dots, n = 6) initial reward-related sensory input exhibited a significant correlation between initial (trials 1–100) predicted mDA reward response and final reward collection latency. a.u., arbitrary units. All error bars denote ±s.e.m.