Fig. 4: Reinforcement learning model and emergent strategies. | Nature

Fig. 4: Reinforcement learning model and emergent strategies.

From: Dopaminergic mechanisms of dynamical social specialization

a, Reinforcement learning model with one e-mouse, four states and two actions: lever press (L) and eat at food dispenser (D). Q-values represent agent learning (e-mice). Example with β = 2. b, Simulated #LP versus %CS for 200 e-mice with β uniformly distributed across [0.75, 1.25]. c, Experimental #LP versus %CS for 62 mice, including both males and females. d, Pie charts show mean proportions of Achievers and Storers from archetypal decomposition in the (#LP, %CS) space (4 sessions each) in simulated male (n = 1,000, β ∈ [1, 1.25]) and female (n = 1,000, β ∈ [0.75, 1]) mice. Bottom, behavioural values at archetype vertices, shown as percentiles across the dataset. e, Same reinforcement learning model as in a, with three interacting e-mice. Right, simulated %LP versus %CS, colour-coded by archetype. f, Experimental data colour-coded by behavioural profile. g, Archetypal composition of simulated male triads (n = 1,000, β ∈ [2.25, 2.5]) and female triads (n = 1,000, β ∈ [0, 0.25]), showing distributions of Workers, Scroungers and Storers in the 4-dimensional feature space (%LP, %CS, gain, loss). Bottom left, behavioural values at archetype vertices, shown as percentiles across the dataset. h, Reduced two-agent model (e-mice 1 and 2) with two states (L, lever; D, food dispenser). Bifurcation diagram of the Q-value at D as a function of β. A pitchfork bifurcation emerges at β_bifurc, with a lower limit β_no food (Methods). i, A pair of low-β e-mice results in a single stable fixed point (uniform profiles) in the velocity landscape. Black curve shows an example of simulated dynamics. j, A pair of high-β e-mice (for example, two male e-mice) yields two stable fixed points corresponding to a Worker and a Scrounger. k, A pair comprising one high-β male e-mouse and one low-β female e-mouse yields a stable fixed point with the male as a Scrounger.
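
The caption does not state how β enters the model or which update rule the e-mice use; those details are in the Methods. As a rough illustration only, the sketch below assumes that β acts as a softmax inverse temperature over the Q-values of the two actions (L and D) and that Q-values follow a standard temporal-difference update. The function names, the learning rate alpha, the discount gamma and the example Q-values are all hypothetical and are not taken from the paper.

```python
import math


def softmax_policy(q_values, beta):
    """Action probabilities proportional to exp(beta * Q).

    ASSUMPTION: beta is a softmax inverse temperature; the caption
    only names beta, it does not define it.
    """
    q_max = max(q_values.values())  # subtract the max for numerical stability
    weights = {a: math.exp(beta * (q - q_max)) for a, q in q_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def q_update(q_old, reward, q_next_max, alpha=0.1, gamma=0.9):
    """Standard Q-learning update (ASSUMED rule; alpha and gamma are illustrative)."""
    return q_old + alpha * (reward + gamma * q_next_max - q_old)


# Hypothetical Q-values in one state: eating at the dispenser (D) is worth
# more than pressing the lever (L).
q_example = {"L": 0.0, "D": 1.0}

# Beta values chosen to fall inside the ranges quoted in the caption.
for beta in (0.0, 0.25, 1.0, 2.5):
    probs = softmax_policy(q_example, beta)
    print(f"beta={beta:<4} P(L)={probs['L']:.3f} P(D)={probs['D']:.3f}")
```

Under these assumptions, β = 0 gives equal probabilities for L and D whatever the Q-values, while larger β concentrates choice on the higher-valued action. This is one plausible reading of why low-β and high-β e-mice settle into different behavioural profiles in the simulations.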