Main

Social animals interact and coordinate their behaviours to form social organizations—structured patterns of relationships and interactions that stabilize into collective forms (for example, division of labour and norms) and specialized roles3,4,5,6,7. A key challenge in behavioural science is deciphering how individual differences in behaviour emerge, how they contribute to collective actions and how they are linked to neural activity. While the broad effects of social organization on individual behaviour and specialization are well documented8,9,10,11,12, the underlying cognitive and neurophysiological mechanisms remain largely unexplored, mainly because of the difficulty of studying such mechanisms in controlled, artificial social settings that also allow neurobiological investigation.

In social groups, the production of and access to shared resources drive strategies such as competition or cooperation2,8,13. The producer–scrounger game illustrates this dynamic: some individuals produce resources, whereas others exploit them1,2,14,15. Thus, foraging strategies range from independent food acquisition to exploiting others’ efforts, balancing effort, risk and reward under ecological and social constraints13. Evolutionary game theory16 predicts stable equilibria between such strategies, but typically assumes that behaviours are predetermined and stable17, overlooking the internal mechanisms that guide an individual’s behaviour, including the neural and cognitive processes that enable adaptive, learning-based, flexible strategies17,18,19. Central to this adaptability is dopaminergic signalling, which reinforces rewarded actions by signalling when reward is larger than expected20,21,22,23, including social reward24,25. A second key process is the exploration–exploitation trade-off26,27,28,29, whereby individuals must choose between exploiting known options and exploring alternatives, a process in which dopamine has also been implicated30,31,32,33,34,35.

Here we hypothesize that social foraging strategies, including producer–scrounger dynamics, emerge flexibly through socially constrained reinforcement learning. In particular, we suggest that dopaminergic signalling shapes variability in learning and decision policies, thereby contributing to behavioural specialization. As social organization forms, individuals adjust actions on the basis of expected payoffs, which in turn modify neural activity and learning rules, ultimately stabilizing social roles. Recent advances in continuous animal tracking and behaviour quantification in semi-naturalistic settings36,37,38,39,40 enable detailed measurements of how individuals interact and adapt their strategies over time. Using these approaches, we examined how foraging specialization emerges in small groups (n = 3) of isogenic mice housed in a controlled semi-natural environment.

Two foraging strategies in lone mice

We first assessed mouse foraging behaviour and its underlying neural mechanisms in a lone context. Female or male mice (n = 62) were placed alone in a 50 cm × 50 cm multi-compartment environment and tracked continuously for 5 days and 4 nights using the Live Mouse Tracker system36. This setup includes a lever on one side and a food dispenser with a beam on the opposite side, enabling mice to obtain pellets by pressing the lever and consume them at any time (Fig. 1a). Mice displayed principally nocturnal foraging, with lever presses and nose pokes in the food dispenser occurring primarily during the dark cycle, as expected (Fig. 1b). Over time, the number of lever presses (#LP) increased and the number of nose pokes decreased, indicating that mice had learned the association between lever press and food retrieval (Fig. 1c, left, and Extended Data Fig. 1a). Accordingly, trajectory durations between lever press and food dispenser showed a skewed distribution, with 54% having a duration of less than 6 s, which we defined as a complete sequence (CS; Fig. 1c, right). The percentage of complete sequences (%CS = 100 × number of complete sequences divided by #LP) increased over sessions, stabilizing at 35%, suggesting an instrumental use of lever presses (Fig. 1c and Extended Data Fig. 1a).
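The %CS definition above can be illustrated with a minimal sketch. It assumes that each lever press is paired with the first dispenser visit that follows it and that both event streams are available as timestamps in seconds; the function and argument names are illustrative and do not come from the study's analysis code.

```python
from bisect import bisect_right


def percent_complete_sequences(lever_times, dispenser_times, max_delay=6.0):
    """Return (#LP, %CS) for one session.

    Assumption: a lever press is a complete sequence when the first
    dispenser visit after it occurs less than max_delay seconds later.
    """
    visits = sorted(dispenser_times)
    n_lp = len(lever_times)
    n_cs = 0
    for t in lever_times:
        i = bisect_right(visits, t)  # index of first visit strictly after the press
        if i < len(visits) and visits[i] - t < max_delay:
            n_cs += 1
    # %CS = 100 x number of complete sequences / #LP
    pct = 100.0 * n_cs / n_lp if n_lp else float("nan")
    return n_lp, pct
```

For example, with presses at 0, 10, 20 and 30 s and dispenser visits at 3, 18 and 22.5 s, two of the four presses are followed by a visit within 6 s, giving a %CS of 50.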

Fig. 1: Different reward-seeking strategies emerge in a non-social semi-natural environment.

a, Mice were housed individually for 5 days in habitats in which food was delivered via a lever press opposite the food dispenser. b, Polar histogram of #LP over 24 h. c, Left, time course of #LP and number of beam breaks (#BB) over 4 nights. One-way repeated-measures ANOVA. Right, distribution of trajectory durations between lever press and dispenser; durations of less than 6 s define complete sequences. Right, inset, %CS over time. One-way repeated-measures ANOVA. d, Top, viral injection and fibre photometry implantation. Bottom, VTA dopaminergic (DA) activity before and after learning (after day 3), separated by sequence completion, with activation at food delivery. Responses were quantified as post-event versus pre-event windows and compared using paired t-tests or Wilcoxon signed-rank tests. e, Archetypal analysis identifies Achievers and Storers, defined by distinct behavioural values (#LP and %CS from days 1 to 4) at archetype vertices, shown as percentiles across the dataset. f, Top, distribution of individuals in the two-dimensional behavioural space. Bottom, example of post-lever press trajectories (6-s windows) from one Storer and one Achiever. g, %CS over time by archetype. Two-way repeated-measures ANOVA. h, Distance to the Storer archetype by sex. Wilcoxon rank-sum test. i, Strategy distribution by sex. Pearson’s χ² test with Yates correction. j, Body mass at day 4 by sex and archetype. Two-way ANOVA. k, Left, in vivo electrophysiology setup. Right, VTA dopaminergic activity in anaesthetized male mice at the end of experiment. Mean values of the firing rate are indicated for Achievers and for Storers. Wilcoxon rank-sum test. WT, wild type. Data are presented as mean ± s.e.m. Details (test statistics, degrees of freedom and exact P values) are reported in Supplementary Table 1. All tests were two-sided. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. No adjustment for multiple comparisons unless stated.

To explore the neural bases of this association, we used in vivo fibre photometry in DAT-Cre mice expressing GCaMP7c in dopaminergic neurons of the ventral tegmental area (VTA) (Fig. 1d). A phasic increase in VTA dopaminergic neuron activity was observed at the time of food delivery, regardless of the sequence completeness or of learning, consistent with reward signalling (Fig. 1d). By contrast, we found a specific phasic increase in VTA dopaminergic neuron activity at the time of lever press, but only after learning and for complete sequences, consistent with the prediction of food reward by lever press20. No such increase at the lever press time was observed for incomplete sequences, regardless of whether the mouse moved afterwards. This indicates that the increase characterizing complete sequences was not related to the movement of the mouse (Extended Data Fig. 1b,c). Additionally, we found a correlation between VTA dopaminergic activity at lever press and the time taken to reach the food magazine (Extended Data Fig. 1c, bottom). These findings further suggest that lever presses associated with complete sequences acquired an instrumental status during learning, reflected in reward prediction-like VTA dopaminergic activity.

We next analysed individual decision-making strategies using archetypal analysis32,38,40—that is, expressing individuals as a weighted combination of archetypal strategies—a dimensionality reduction method rooted in the concept of optimality with respect to multiple evolutionary constraints41 (Methods). Archetypes based on #LP and %CS from day 1 to day 4 identified two distinct profiles: Achievers and Storers (n = 31 each; Fig. 1e,f and Extended Data Fig. 1d,e). The two groups pressed the lever equally often, but differed in how they collected rewards. Storers frequently left pellets unretrieved, whereas Achievers completed a higher proportion of sequences (Fig. 1e–g and Extended Data Fig. 1f). Thus, both engaged the task, but differed in the immediacy of reward retrieval. Female mice were about twice as likely as male mice to be classified as Storers and clustered closer to this archetype in the behavioural space (Fig. 1h,i). Post-experiment body mass differed between sexes but not between archetypes (Fig. 1j). Finally, spontaneous VTA dopaminergic neuron activity, recorded post-experiment in anaesthetized males, showed higher firing frequency in Achievers than in Storers (Fig. 1k and Extended Data Fig. 1g), suggesting a link between VTA dopaminergic activity and foraging strategies32,37,38.
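To make the idea of "a weighted combination of archetypal strategies" concrete, the sketch below computes the weights of one individual for the two-archetype case. It assumes that the archetype coordinates are already known (in the study they are fitted from the data, see Methods) and that the behavioural features have been put on comparable scales; all names and values are hypothetical.

```python
def two_archetype_weights(x, a, b):
    """Weights (w_a, w_b) such that w_a * a + w_b * b best approximates x.

    Least-squares projection of x onto the segment [b, a], with the
    weights constrained to be non-negative and to sum to 1.
    """
    diff = [ai - bi for ai, bi in zip(a, b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two archetypes must be distinct")
    w = sum((xi - bi) * d for xi, bi, d in zip(x, b, diff)) / denom
    w = min(1.0, max(0.0, w))  # clip to the simplex
    return w, 1.0 - w


# Hypothetical archetype coordinates in a (#LP, %CS) percentile space.
achiever = (100.0, 80.0)
storer = (100.0, 20.0)
weights = two_archetype_weights((100.0, 50.0), achiever, storer)
```

An individual can then be labelled by its largest weight, and one minus that weight gives a simple notion of distance to the corresponding archetype. With three archetypes, as in the triads, the same constrained least-squares problem is solved over three weights.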

Sex-dependent differences between triads

We next explored the foraging strategies of mice (n = 195) in a social context, by housing triads of either all-male or all-female task-naive mice for 8 days and 7 nights (Fig. 2a,b). Lever pressing was predominantly nocturnal (Fig. 2b). Male and female mice performed a similar number of lever presses, but females achieved significantly fewer complete sequences (Fig. 2c and Extended Data Fig. 2a), reflecting higher latencies between lever press and food retrieval, and more consecutive lever presses before consumption (Extended Data Fig. 2b). Archetypal analysis of eight behavioural features (Fig. 2d)—%LP intra-cage (#LP per mouse divided by the total #LP in the cage on that day) and %CS over the past 3 days, plus the number of food pellet gains and losses due to conspecifics (Methods)—revealed a triangular behavioural space with Storer, Worker and Scrounger archetypes (Fig. 2d and Extended Data Fig. 2c,d). Storers pressed frequently but completed few sequences, leaving pellets temporarily in the food dispenser (thus delaying reward retrieval), a pattern consistent with the Storer archetype in the lone context. Despite leaving pellets available to others, Storers rarely engaged in competition, resulting in low numbers of food losses and gains (Fig. 2e). By contrast, Workers performed frequent complete sequences and produced food that was often consumed by others, whereas Scroungers rarely pressed the lever and primarily consumed food earned by cage mates (Fig. 2e,f). Workers and Scroungers showed clear signs of competition, with opposite food gain and loss patterns (Fig. 2e) and lever press strategies (Fig. 2f and Extended Data Fig. 2e). Spatial analyses revealed that the lever press is a salient event for Workers and especially Scroungers, who oriented towards the feeder zone following conspecific lever presses, whereas Storers showed no such modulation (Extended Data Fig. 3a,b). 
These roles emerged rapidly, with clear distinctions already on day 1 that remained stable thereafter (Extended Data Fig. 3c–e). Indeed, within hours on day 1, Workers showed frequent complete sequences, Scroungers pressed little, and Storers pressed often but rarely completed sequences (Extended Data Fig. 3c).

Fig. 2: Distinct sex-dependent strategies emerge within groups.

a, Triads of same-sex mice housed as in Fig. 1, forming microsocieties. b, Left, task timeline. Right, polar histogram of lever presses over 24 h. c, Mean #LP (Welch’s two-sample t-test) and %CS by sex. Wilcoxon rank-sum test with continuity correction. d, Archetypal analysis (n = 195 mice) based on intra-triad %LP and %CS (days 5 to 7), and food pellet gain (G) and loss (L) (Methods). Top left, plot of individual data. Right, ternary plot with proximity to Worker, Scrounger and Storer archetypes. Bottom left, behavioural values at archetype vertices, shown as percentiles across the dataset. e, %LP, %CS and pellet gain and loss by archetype. One-way ANOVA, followed by Tukey’s multiple comparisons post hoc test. W, worker; S, scrounger; St, storer. f, Representative 6-s trajectories post-lever press by archetype (trajectories shown for Scroungers after Worker lever presses). g, Distance to Storer and Scrounger vertices by sex. Wilcoxon rank-sum tests with continuity correction. h, Archetypal repartition by sex. Pearson’s χ² test. i, Triad composition for female (left) and male (right) mice, compared to random expectations based on experimental archetypal repartition by sex. Simulation-based random sampling (Methods). j, Final body mass by archetype. One-way ANOVA followed by pairwise Wilcoxon rank-sum tests with continuity correction and Holm adjustment for multiple comparisons. Data are presented as mean ± s.e.m. Details (test statistics, degrees of freedom and exact P values) are reported in Supplementary Table 1. All tests were two-sided.

Distance to archetypes differed between sexes, with females being closer to the Storer archetype and males being closer to the Scrounger (Fig. 2g) and Worker archetypes. This sex bias was evident both at the individual and group levels: 86% of females were classified as Storers, whereas 87% of males belonged to the Worker or Scrounger archetypes (Fig. 2h). Correspondingly, triad composition deviated from chance, with 75% of female triads composed solely of Storers and 76% of male triads combining Workers and Scroungers (Methods; Fig. 2i), suggesting that intrinsic mechanisms and social interactions constrain admissible group configurations: for instance, triads of three Scroungers were very rare. Workers and Scroungers did not differ in final body mass, and the lower final mass in Storers (Fig. 2j) simply reflects the overrepresentation of females, which are lighter overall, in this group. Tube tests, performed before and after the microsociety experiment (Extended Data Fig. 4a), revealed that dominance hierarchies and foraging roles evolved independently, with no consistent relationship between dominance rank and archetypal distance in either sex (Extended Data Fig. 4b). Thus, although hierarchies and foraging roles evolve in groups, they appear to reflect independent dimensions of social organization.
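The comparison of triad composition with chance can be illustrated with a small simulation. This is only a sketch of one plausible null model, in which the three members of a triad are drawn independently from the archetype proportions of their sex; the procedure actually used is described in Methods, and the function name and the "Other" label are illustrative.

```python
import random
from collections import Counter


def expected_triad_compositions(proportions, n_sim=10000, seed=0):
    """Estimate chance frequencies of triad compositions.

    proportions: dict mapping archetype label -> proportion in the population.
    Assumption: the three members are drawn independently of each other.
    """
    rng = random.Random(seed)
    labels = list(proportions)
    weights = [proportions[k] for k in labels]
    counts = Counter()
    for _ in range(n_sim):
        # Sort so that the order of the three members is ignored.
        triad = tuple(sorted(rng.choices(labels, weights=weights, k=3)))
        counts[triad] += 1
    return {composition: c / n_sim for composition, c in counts.items()}


# 86% of females were Storers; the remainder is pooled under a hypothetical label.
female_null = expected_triad_compositions({"Storer": 0.86, "Other": 0.14})
```

Under this independence assumption, the expected share of all-Storer female triads is 0.86³ ≈ 0.64, which can be set against the 75% observed.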

Dopaminergic correlates of social roles

To investigate the neural underpinnings of social roles, we recorded VTA dopaminergic activity during the task with fibre photometry in DAT-Cre mice (n = 19; Fig. 3a,b). All mice showed dopamine transients at food delivery, but the responses at lever presses differed with sex and profile (Fig. 3c,d). Analysis of the results on day 1 of the experiment showed that dopamine transients were already detectable at food delivery in both sexes (Extended Data Fig. 5a), whereas responses related to lever press were significant only in males (Extended Data Fig. 5b).

Fig. 3: Dopaminergic correlates of social roles.

a, Viral injection and fibre photometry implantation targeting VTA dopaminergic neurons in DAT-Cre mice. Bottom, example of GCaMP7c expression. b, Example of ΔF/F traces. Bars indicate food consumption, own lever press (LP) and conspecific lever press (CLP). c, Mean z-scored VTA dopaminergic activity aligned to food, own lever presses and conspecific lever presses, by sex. Two-way repeated-measures ANOVA, sex analysed separately. d, Same as c, by male archetype (Worker or Scrounger). Two-way repeated-measures ANOVA, archetype analysed separately. e, Mean z-scored VTA dopaminergic activity in Scroungers aligned to conspecific lever press, split into focused versus unfocused states. Paired t-test (Methods). f,g, Peak activity at own lever presses (f) and conspecific lever presses (g) versus distance to archetypes with linear regression fits. Estimated peak activity at zero distance from archetype (mean ± 95% confidence interval) from linear models with archetype × distance interaction, with post hoc contrasts performed using emmeans with Tukey adjustment. h, Top, electrophysiology setup and timeline of in vivo anaesthetized recording of VTA DA neurons (Rec). Bottom, VTA dopamine firing rates in naive and post-task mice, by sex, per neuron (left) and per mouse (right). Numbers in columns represent the number of neurons. Wilcoxon rank-sum tests; Holm-adjusted P values. i, Mean firing rate per mouse versus distance to archetype, with linear regression fits. Estimated firing rates at zero distance (mean ± 95% confidence interval), from linear models with archetype × distance interaction and post hoc contrasts using emmeans with Tukey adjustment. Data are mean ± s.e.m.; model-based estimates are mean ± 95% confidence interval; shaded areas indicate 95% confidence interval. Event-evoked responses were quantified as the mean z-scored signal in a post-event window (0 to 1.5 s) relative to a pre-event window (−10 to −5 s) (Methods).
Details (test statistics, degrees of freedom and exact P values) are reported in Supplementary Table 1. All tests were two-sided.
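The event-evoked quantification described in the legend can be sketched as follows. The sketch assumes that "relative to" means the difference between the two window means and that the signal is already z-scored and sampled at known times; the exact procedure is given in Methods, and the function and argument names are illustrative.

```python
from statistics import mean


def evoked_response(times, z_signal, event_time, post=(0.0, 1.5), pre=(-10.0, -5.0)):
    """Mean z-scored signal in the post-event window minus the pre-event window.

    times and z_signal are equal-length sequences (seconds, z-scores);
    window bounds are in seconds relative to event_time.
    """
    def window_mean(lo, hi):
        values = [z for t, z in zip(times, z_signal) if lo <= t - event_time <= hi]
        if not values:
            raise ValueError("no samples fall inside the window")
        return mean(values)

    return window_mean(*post) - window_mean(*pre)


# Synthetic trace sampled every 0.5 s with a 2 s.d. transient after an event at 12 s.
times = [i * 0.5 for i in range(41)]
trace = [2.0 if 12.0 <= t <= 13.5 else 0.0 for t in times]
response = evoked_response(times, trace, event_time=12.0)
```

Placing the pre-event window well before the event (−10 to −5 s) keeps the reference period clear of any anticipatory activity close to the lever press or food delivery.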

In females (Fig. 3c), dopaminergic activity increased at food delivery and at their own lever presses. In males (Fig. 3c–d and Extended Data Fig. 5c), Workers responded to their own lever presses, consistent with reward prediction, whereas Scroungers responded to conspecific lever presses, suggesting that for Scroungers, the actions of others became predictive of reward. Notably, the responses of Scroungers were state-dependent: dopamine transients occurred only when mice were oriented towards the lever or dispenser (that is, focused), but not when engaged elsewhere (unfocused) (Fig. 3e and Extended Data Fig. 3a,b). To relate dopaminergic responses to the behavioural space, we analysed dopaminergic peak activity in each mouse as a function of its distance to the three archetypes. Dopaminergic responses to own lever presses were stronger near the Worker and Storer vertices than near the Scrounger one (Fig. 3f and Extended Data Fig. 5d–f), whereas responses to conspecific lever presses peaked near the Scrounger vertex (Fig. 3g and Extended Data Fig. 5e–g). These role-dependent activity patterns cross-validate behavioural archetypes and show that VTA dopaminergic dynamics reflect individual roles within the group. Post-task electrophysiological recordings of VTA dopaminergic neurons in anaesthetized mice (Extended Data Fig. 5h) revealed further sex- and role-dependent effects in firing rate (Fig. 3h,i) and bursting activity (Extended Data Fig. 5i,j). Although baseline dopaminergic activity (that is, before the microsociety task) was comparable between sexes, group housing increased firing in males but not in females, leading to a post-task difference (Fig. 3h and Extended Data Fig. 5h). At the individual level, dopaminergic tone was associated with behavioural specialization: mice closer to the Storer archetype showed lower firing rates than those aligned with Worker or Scrounger profiles (Fig. 3i and Extended Data Fig. 5j). 
These findings suggest that social experience induces sex-dependent plastic changes in dopaminergic activity, and that dopaminergic tone is associated with the roles adopted by individuals within the microsociety.

Explore–exploit trade-off shapes roles

To unravel the mechanisms of specialization, we developed a Q-learning model (Methods) in which ‘e-mice’ learned to find food in an environment (Fig. 4a) with six spatial states, including a lever press and a food dispenser. Over time, state transition Q-values formed a spatial gradient towards the food dispenser. Specialization could arise from the interaction of intra- or inter-sex individual parameter variability, social constraints (lever press accessibility or food availability) or contingency (individual random choices). We first assessed how behavioural parameters accounted for specialization in the lone (one e-mouse) and social (three e-mice) conditions. Learning rate (α) and temporal discount factor (γ) parameters had moderate effects, whereas the inverse temperature (β), controlling the exploitation–exploration trade-off (high-valued options versus random decisions), was critical (Extended Data Fig. 6a–e). In the lone context, varying β values (Fig. 4a,b) reproduced behavioural patterns observed in mice (Fig. 4c). A lower β favoured exploration (with shallower gradients (Extended Data Fig. 6a) and erratic trajectories, resulting in a lower %CS) (Extended Data Fig. 6b,c), resembling Storers, whereas higher β promoted exploitation (Extended Data Fig. 6b,c), as in Achievers. Thus, introducing sex-dependent β variability, with males exhibiting higher β values, replicated sex differences observed in isolated mice (Fig. 4d). When moving from individuals to groups, even with identical β values within a group, e-triads with high β values (β ≥ 1) favouring exploitation displayed distinct gradients and specialization into Workers and Scroungers, with individual roles determined by contingent dynamics (Fig. 4e and Extended Data Fig. 7a,b), whereas e-triads with low β values (β ≤ 1) favouring exploration displayed no specialization (Extended Data Fig. 6d). 
Consequently, sex-dependent β distributions closely matched both the archetypal distributions and triad compositions observed experimentally (Fig. 4f,g). In mixed groups (two low β and one high β), all profiles emerged: high-β individuals mainly became Scroungers, whereas low-β individuals became Workers or Storers (Extended Data Fig. 7c–f). This shows that role adoption depends not only on individual β values but also on group composition. For example, low-β females alone do not produce Workers (Extended Data Fig. 7a), yet the same β values in females produce Workers when two low-β females are combined with a high-β male (Extended Data Fig. 7e).
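The roles of α, γ and β can be illustrated with a generic tabular Q-learning step combined with softmax action selection. This is a minimal sketch of the standard algorithm, not the model specified in Methods: the state and action layout, the reward value and the function names are assumptions made for illustration.

```python
import math
import random


def softmax_probs(q_values, beta):
    """Action probabilities under softmax with inverse temperature beta.

    beta = 0 gives uniform random choices (exploration); larger beta
    concentrates choices on the highest-valued action (exploitation).
    """
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_action(q_values, beta, rng):
    """Sample an action index according to the softmax probabilities."""
    weights = softmax_probs(q_values, beta)
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One Q-learning update with learning rate alpha and discount factor gamma."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical two-state, two-action table and a single rewarded transition.
q_table = [[0.0, 0.0], [0.0, 0.0]]
q_update(q_table, state=0, action=1, reward=1.0, next_state=1, alpha=0.5, gamma=0.9)
action = choose_action(q_table[0], beta=2.0, rng=random.Random(0))
```

With Q-values of 1 and 0, a β of 0.5 selects the better action with probability of about 0.62, whereas a β of 2 raises this to about 0.88, which is the sense in which a higher β promotes exploitation.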

Fig. 4: Reinforcement learning model and emergent strategies.

a, Reinforcement learning model with one e-mouse, four states and two actions: lever press (L) and eat at food dispenser (D). Q-values represent agent learning (e-mice). Example with β = 2. b, Simulated #LP versus %CS for 200 e-mice with β uniformly distributed across [0.75,1.25]. c, Experimental #LP versus %CS for 62 mice, including both males and females. d, Pie charts show mean proportions of Achievers and Storers from archetypal decomposition in the (#LP, %CS) space (4 sessions each) in simulated male (n = 1,000, β [1, 1.25]) and female (n = 1,000, β [0.75, 1]) mice. Bottom, behavioural values at archetype vertices, shown as percentiles across the dataset. e, Same reinforcement learning model as in a, with three interacting e-mice. Right, simulated %LP versus %CS, colour-coded by archetype. f, Experimental data colour-coded by behavioural profile. g, Archetypal composition of simulated male triads (n = 1,000, β [2.25, 2.5]) and female triads (n = 1,000, β [0, 0.25]), showing distributions of Workers, Scroungers and Storers in the 4-dimensional feature space (%LP, %CS, gain, loss). Bottom left, behavioural values at archetype vertices, shown as percentiles across the dataset. h, Reduced two-agent model (e-mice 1 and 2) with two states (L, lever; D, food dispenser). Bifurcation diagram of the Q-value at D as a function of β. A pitchfork bifurcation emerges at βbifurc, with a lower limit βno food (Methods). i, A pair of low-β e-mice results in a single stable fixed point (uniform profiles) in the velocity landscape. Black curve shows an example of simulated dynamics. j, A pair of high-β e-mice (for example, two male e-mice) yields two stable fixed points corresponding to a Worker and a Scrounger. k, A pair comprising one high-β male e-mouse and one low-β female e-mouse yields a stable fixed point with the male as a Scrounger.

Competition and contingency shape roles

We assessed the necessary conditions, in terms of decision parameters, for social behavioural specialization to occur in a mathematically tractable reduced model version (Methods) with two e-mice and two locations (lever and food dispenser). Differentiation arose through symmetry breaking at a supercritical pitchfork bifurcation at β = βbifurc (Fig. 4h). Below βbifurc (female e-mice), the single fixed point for Q-values towards D (qD) (and other probabilities; Extended Data Fig. 8a–c) corresponded to Storers. Above βbifurc (male e-mice), two stable fixed-point branches emerged, corresponding to Workers and Scroungers (bistability; Fig. 4h and Extended Data Fig. 8a–c).
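For readers unfamiliar with the term, the textbook normal form of a supercritical pitchfork bifurcation is given below. It is a generic illustration rather than the reduced model itself (which is derived in Methods): here x is a hypothetical asymmetry variable, such as the difference in qD between the two e-mice, and the control parameter μ is assumed to grow with β − βbifurc close to the bifurcation.

```latex
\[
  \dot{x} = \mu x - x^{3}, \qquad \mu \propto \beta - \beta_{\mathrm{bifurc}}
\]
\[
  x^{*} = 0 \quad (\text{stable for } \mu < 0), \qquad
  x^{*} = \pm\sqrt{\mu} \quad (\text{stable for } \mu > 0,\ \text{where } x^{*} = 0 \text{ becomes unstable})
\]
```

In this picture, the symmetric state x* = 0 corresponds to uniform behaviour, and the two branches ±√μ correspond to the two mirror-image assignments of the Worker and Scrounger roles.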

Qualitative analysis42 revealed the causal mechanism underlying symmetry breaking and behavioural bistability in male e-mice (Fig. 4i–k and Extended Data Fig. 8d,e). Initially, contingency (individual random choices) determined which e-mouse first accessed the dispenser. Positive feedback between reinforcement learning and action selection (softmax) processes led one e-mouse (Scrounger) to predominantly occupy the dispenser area, reducing food access for the other e-mouse and increasing its lever press probability (Worker). Competition for shared resources thus drives specialization in groups of male e-mice, with higher β values intensifying this effect by promoting greater resource exploitation. By contrast, female e-mice maintained uniform behaviours, owing to lower levels of exploitation and competition. Thus, β emerged as a critical sex-dependent control parameter accounting for female uniformity versus male specialization, where specialization arose mainly from task contingency.

These theoretical results allowed us to make three predictions. First, introducing a high-β mouse (male) into a group of low-β mice (females) should disrupt female uniformity, leading some to adopt distinct roles, and increase the propensity of the single male to become a Scrounger (Extended Data Fig. 7d–f). Second, placing a naive individual into a group with a pre-established Scrounger should lead to specialization and adoption of complementary roles through competitive dynamics. Third, manipulating the β value biologically via dopaminergic activity should shift behavioural strategies30,31,32,34: decreasing β should reduce competition and promote Storers, whereas increasing β should increase competition and promote specialization.

Triad composition and history shape roles

To test whether interacting with a high-β individual drives differentiation in low-β mice, we formed mixed-sex triads composed of one male (high β) and two females (low β) (Fig. 5a). The model predicted that this would eliminate uniform Storer strategies and promote specialization, with the male adopting a Scrounger role (Fig. 4k and Extended Data Fig. 7c–f). In such triads, the male performed fewer lever presses than females, but unlike in sex-matched triads, there was no sex difference in %CS (Extended Data Fig. 9b). Although both sexes were equally distant from the Storer archetype, males were significantly closer to the Scrounger archetype (Extended Data Fig. 9c). The overall distribution was characterized by a marked reduction of Storers (Fig. 5a) and showed Worker–Scrounger combinations in most triads (Extended Data Fig. 9d), supporting the hypothesis that behavioural specialization emerges from the set of β values within the group, beyond individual values.

Fig. 5: Experimental manipulation.

a,b, Experimental testing of model predictions: mixed-sex triads and behavioural flexibility. a, Mixed-sex triads composed of one male and two females. Archetypal analysis shows distributions across the three archetypes and archetypal repartition by sex. Pearson’s χ² test. b, Reconfiguration experiment: a male previously identified as Scrounger (week 1) was rehoused with two naive males for a second week (week 2). Right, distance to the Scrounger archetype for retained individuals, compared between week 1 and week 2. Paired t-test, mean ± s.e.m. c–f, Bidirectional manipulation of VTA dopaminergic activity modulates behavioural specialization. c, Top, viral injection and fibre implantation for optogenetic activation of VTA dopaminergic neurons (ChR2) in male DAT-Cre mice. Bottom, experimental timeline: photostimulation (stim) was delivered 24 h and 30 min before the microsociety task. Experimental male mouse triads: (1) three YFP mice (YFP, control); (2) one ChR2-stimulated mouse with two YFP mice (1ChR2); and (3) three ChR2-stimulated mice (3ChR2). d, Archetypal profile distributions by condition (YFP, 1ChR2 and 3ChR2) (Pearson’s χ² test), with planned comparisons of Storer proportions using one-sided two-sample proportion tests with Holm adjustment for multiple comparisons. e, Viral injection for chemogenetic inhibition of VTA dopaminergic neurons (hM4Di). Experimental timeline: CNO was administered 24 h before and on day 1 of the microsociety task. Experimental female mouse triads: (1) three mCherry mice (mCh, control); (2) one hM4Di-inhibited mouse with two mCh mice (1hM4Di); and (3) three hM4Di-inhibited mice (3hM4Di). f, Archetypal profile distributions by condition (mCh, 1hM4Di and 3hM4Di) (Pearson’s χ² test), with planned comparisons of Scrounger proportions using one-sided two-sample proportion tests with Holm adjustment. Data are presented as mean ± s.e.m.
Details (test statistics, degrees of freedom and exact P values) are reported in Supplementary Table 1. All tests were two-sided unless otherwise stated.
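The Holm adjustment mentioned in the legends is a standard step-down correction for multiple comparisons. The sketch below shows the generic procedure applied to a list of raw P values; the function name is illustrative and the example P values are made up, not taken from Supplementary Table 1.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted P values, returned in the input order.

    The smallest P value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Made-up raw P values for three planned comparisons.
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

For these made-up values the adjusted P values are approximately 0.03, 0.06 and 0.06: the largest raw value (0.04) inherits 0.06 from the comparison ranked before it because of the monotonicity step.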

We next assessed how stable a Scrounger role is in dynamic social contexts by retaining one male Scrounger from a male triad and housing it with two naive males during a second week of group living (Fig. 5b). By the end of the second week, the previously specialized Scrounger had significantly increased its #LP (Extended Data Fig. 9e) and its position in archetypal space had shifted away from the Scrounger pole in 85% of cases (11 out of 13; Extended Data Fig. 9g), adopting either a Worker (7 out of 13) or a Storer (4 out of 13) profile. This suggests that behavioural specialization is neither fixed nor intrinsic, but emerges dynamically with a β-like decision parameter setting predispositions that are resolved through social contingency and interaction history.

Pre-task dopaminergic modulation shifts roles

VTA dopaminergic activity has been linked to exploration–exploitation parameters30,31,32,33,34, suggesting that dopaminergic tone could bias behavioural specialization. At the end of the microsociety experiment, males showed higher spontaneous VTA dopaminergic activity than females (Fig. 3h,i); however, this difference is likely to reflect social experience during group living rather than a pre-existing signature. We therefore reasoned that altering dopaminergic tone could reshape group strategies by controlling the tendency towards social uniformity or specialization. We tested this by increasing baseline VTA dopaminergic activity in males or lowering it in females.

We used a pre-task stimulation protocol43,44 to test whether the dopaminergic tone at the onset of social interactions biases subsequent role adoption through its effect on the exploration–exploitation balance. In male DAT-Cre mice43, we selectively increased basal VTA dopaminergic activity by using excitatory channelrhodopsin (ChR2) (Fig. 5c and Extended Data Fig. 10h,i) before the start of the microsociety experiment, rather than during the experiment, to avoid the confounding effects of altering dopaminergic activity during the lever presses. Electrophysiological recordings confirmed that this stimulation reliably increased VTA dopaminergic neuron excitability for at least 6 h post-stimulation (Extended Data Fig. 10a–d). Three conditions were tested: (1) YFP-only mice (control); (2) triads with one ChR2-stimulated mouse (1ChR2); and (3) triads with three ChR2-stimulated mice (3ChR2) (Fig. 5c). Archetypal analysis revealed a redistribution of behavioural profiles, with the emergence of Storers specifically in 3ChR2 triads (Fig. 5d). The distance to the Storer archetype was significantly reduced in the 3ChR2 groups compared to controls (Extended Data Fig. 9j), and triad composition shifted to more Storers (Extended Data Fig. 9k). In 1ChR2 triads, this redistribution of archetypal profiles was less pronounced, but it extended to non-stimulated members, suggesting that altering dopaminergic tone in a single mouse can also reshape group dynamics.

The complementary prediction in females was tested by reducing VTA dopaminergic activity with inhibitory DREADDs (hM4Di) activated by clozapine N-oxide (CNO) 24 h before the microsociety experiment and during the first day of the experiment (Fig. 5e and Extended Data Fig. 9l,m). We compared female triads composed of: (1) three mCherry mice (control); (2) one hM4Di-expressing mouse (1hM4Di); or (3) three hM4Di-expressing mice (3hM4Di). Compared with control triads, 3hM4Di females showed a marked shift in behavioural space, with male-typical Scrounger strategies emerging (Fig. 5f). Distance to the Scrounger archetype decreased (Extended Data Fig. 9n), and several triads developed Worker–Scrounger–Scrounger configurations dominated by Scroungers (Extended Data Fig. 9o). By contrast, the 1hM4Di condition had no detectable behavioural effect.

Together, these results confirm that manipulating dopaminergic tone biases the emergence of specific behavioural strategies in a sex-dependent manner. This highlights the complex interplay between VTA dopaminergic neurophysiology and the social environment in shaping individual strategies, providing a neural basis for the adaptive specialization in mouse microsocieties.

Discussion

This study highlights the importance of exploring social organization through the lens of division of labour to gain deeper insights into behavioural specialization in animal microsocieties. Evolutionary approaches conceptualize behavioural specialization as the outcome of stable strategies16,45, leading to fixed, genetically determined social roles. By contrast, our findings reveal a dynamic, context-dependent process, in which roles emerge through learning and repeated social interactions rather than reflecting intrinsic traits. As a result, behavioural roles can be reshaped when group composition changes17,19. Moreover, social norms, conceptualized here as implicit rules governing group behaviour, emerge rapidly from individual history and the underlying structure of dynamical interaction, guiding both individual roles and collective dynamics.

A key contribution of this mouse microsociety paradigm is the demonstration that an individual cognitive parameter, β (which sets the exploration–exploitation trade-off), operating near a bifurcation, is sufficient to constrain and shape emergent social organization. Our models and experiments suggest that β reflects dopamine-dependent control of decision variability, thereby linking VTA activity, learning dynamics and social organization. Social competition increases tonic dopaminergic activity of VTA neurons in males and promotes Worker–Scrounger specialization, whereas optogenetic stimulation of these dopaminergic neurons leads to the opposite outcome, favouring Storer-like behaviours. These findings indicate that the β value is not simply determined by the overall firing rate of these dopaminergic neurons, but by interactions between distinct dopaminergic components: tonic, non-specific activity (as during optogenetic activation) and phasic, event-locked responses amplified by competition. The β value may also depend on the salience or signal-to-noise ratio46 of phasic dopaminergic signals, explaining why competition may enhance dopamine salience and behavioural differentiation, whereas optogenetic activation may blur this distinction. Consistent with this possibility, early (day 1) photometry recordings revealed robust food-related dopamine transients in both sexes, but a clearer lever press-related phasic response in males than in females (Extended Data Fig. 5a,b), suggesting that differences in dopamine salience at the time of lever press may already exist before role differentiation unfolds. Increasing VTA dopaminergic tonic baseline activity with optogenetic stimulation may reduce the salience of phasic, task-related responses, lowering β and favouring less differentiated behaviour, and consequently shift males towards female-like Storer profiles, whereas decreasing activity restores Worker–Scrounger specialization.
Although this mechanism explains our findings, it is likely to be only one among several processes through which dopaminergic tone can influence role adoption. Indeed, changes in tonic activity of VTA dopaminergic neurons could also affect arousal, stress reactivity, motivation or social drive—additional routes through which β, as an effective behavioural parameter, may be shifted.

Consistent with this, neural recordings showed that Workers display VTA responses to their own lever presses, whereas Scroungers respond to conspecific presses, indicating that phasic dopaminergic activity is event-specific and role-dependent. These responses scaled with the position in the archetypal space, supporting their relevance to behavioural specialization. In addition, by manipulating tonic dopaminergic activity before social engagement, we further demonstrated that baseline neuromodulatory state biases group composition. Although our manipulations targeted VTA neurons globally and dopamine can also influence arousal or motivation, the effects were reproducible and consistent between groups. Fibre photometry and electrophysiology together show that both phasic signals and long-lasting plasticity contribute to the emergence and stabilization of specialization. Together, our results indicate that behavioural specialization arises not only as a response to social constraint, but also through a feedback loop in which social context modulates dopaminergic activity, which in turn consolidates and stabilizes social roles.

Many studies have documented sex effects on social behaviours47,48,49,50,51. A notable aspect of our findings is the pronounced sex-dependent divergence in social organization between all-male and all-female mouse triads, despite only minor sex differences when mice were tested alone. Male triads exhibited distinct and stable divisions of labour, with roles such as Workers and Scroungers, whereas female groups displayed remarkable uniformity, predominantly adopting Storer strategies. These differences highlight how sex-dependent cognitive52 and neurophysiological differences interact with social environments to shape collective dynamics. In males, high β values promote competition and role specialization, whereas lower β values in females favour uniformity. Nonetheless, β differences alone cannot explain why entire groups of female mice adopt the same strategy. This uniformity points to group-level mechanisms—reduced competition and reinforcement of homogeneity—that collectively stabilize Storer profiles. Consistent with recent findings49, social competition amplifies behavioural divergence in males but has weaker effects in females, highlighting sex-dependent adaptations to social dynamics. Although dominance is associated with greater reward motivation and risk taking53, and dopamine signalling contributes to such social hierarchy54,55,56,57, we show that specialized roles in resource acquisition can emerge independently of social rank, pointing to partly orthogonal dopaminergic mechanisms underlying social hierarchy and behavioural specialization.

Early life contingency has a crucial role in shaping behavioural and social outcomes. Small initial differences, whether due to chance or experience, can be amplified through feedback loops, producing divergent trajectories12,49,58,59,60. In competitive social environments, such as those often encountered by male mice, these feedback loops amplify disparities and promote the emergence of different roles. By contrast, female mice may experience reduced competitive pressure, allowing more-equitable outcomes within their groups. In our model, the distributions of β values required to reproduce experimental behaviours were compatible with the hypothesis that the presence of other males or other females in the unisex triads increases or decreases exploitation, respectively, possibly owing to social effects on the internal state of the individual4,37,51,56. This contingency-based framework emphasizes the importance of considering both contingent aspects (such as random group composition of initial parameters and/or initial random reward discovery) and structural factors (for example, one animal becoming a Scrounger favours others to become Workers) in shaping sex-dependent behaviours and social structures.

In conclusion, this work shows how individual and social factors interact to shape behavioural specialization in animal microsocieties. By bridging neurophysiology, behavioural science and computational modelling, our findings offer a nuanced understanding of the dynamical interplay between individual cognition and collective behaviour, with broad implications for the study of social systems across taxa.

Methods

Animals

Male and female C57BL/6J mice (8 to 12 weeks old; Janvier Labs, France) or DAT-iCre mice on a C57BL/6J background were group-housed under standard conditions (12 h:12 h light:dark, ~22 °C, ~50% humidity) and maintained in triads prior to testing (see Supplementary Methods for details). All experiments and procedures were performed in accordance with European Commission directives 219/1990, 220/1990 and 2010/63, and approved by the ESPCI and the ethical committee no. 059 under APAFIS #34335-2021121318085835.

Lone condition and microsociety experiment

Mice were placed either individually or in triads in a 50 × 50 cm large environment and continuously tracked using the Live Mouse Tracker system36, allowing monitoring of identity and behaviour of individual mice over extended periods. Before the experiment, all mice were implanted with a radio frequency identification (RFID) microchip (Biomark APT-12 PIT Tag, Biomark) under the shoulder skin.

Triads (3 males, 3 females, or 2 females and 1 male) stayed for 8 days and 7 nights in the environment, whereas lone mice were housed for 5 days and 4 nights.

The cage was composed of different zones that were freely accessible to all the mice at any time. The arena contained a lever on one side and a food dispenser/magazine on the opposite side. Each lever press delivered a 20 mg pellet (TestDiet 5TUL/1811142 purified, Bio-Concept), after which the lever became inactive for 5 s. A nose poke in the magazine triggered a beam break, which served as a proxy for consumption of the pellet by the mouse. Outside the 5 s delay after each lever press, any mouse could press the lever at any moment and consume the pellet at any time following its release. Mice were weighed and health-checked daily. No mouse was removed from the task for either weight loss or aggression.

Social replacement experiment

To test role stability, we performed a reconfiguration experiment in which a male previously identified as a Scrounger during the first week of group housing was rehoused with two task-naive males (Scrounger–naive–naive condition). Mice were kept in the same semi-natural environment as described above, and behaviour was recorded continuously for an additional 7 days. We retained only triads in which a Scrounger was reliably identified at the end of the first week (13 out of 15 triads). To track individual trajectories across time, the retained Scrounger from week 1 was linked to its corresponding behavioural position in week 2, allowing comparison of lever press counts, distances to archetypes, and profile classification before and after reconfiguration.

Live Mouse Tracker system

Behaviour was monitored continuously (24 h daily, 7 days a week) using a dual acquisition pipeline combining time-stamped, animal-identified operant events (lever, magazine/dispenser TTLs (transistor–transistor logic); MedAssociates) with continuous video tracking to quantify locomotion and social/spatial organization (Live Mouse Tracker)36. Individual lever presses, nose pokes and complete sequences were extracted from the database by matching TTLs to mouse identity within predefined lever/magazine zones. A complete sequence was defined as a lever press followed by the same mouse reaching the magazine within 6 s; otherwise, the sequence was not counted (typically because another mouse was already at the magazine waiting for the food) (see Supplementary Methods for details). Gain was defined as retrieving a pellet within 6 s after a conspecific pressed the lever, whereas loss was defined as a pellet retrieved by a conspecific within 6 s after the subject’s own press.
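The event definitions above can be illustrated with a minimal Python sketch. The event tuples, the function name classify_presses and the rule that the first magazine entry inside the window decides the outcome are illustrative assumptions; the actual pipeline matches TTLs to tracked identities in a database and is described in Supplementary Methods.

```python
WINDOW_S = 6.0  # window after a lever press, as stated in the text

def classify_presses(presses, pokes, window=WINDOW_S):
    """presses, pokes: lists of (time_s, mouse_id) tuples sorted by time.

    Returns one (presser, outcome, retriever) tuple per lever press.
    Assumption: the first magazine entry inside the window decides the outcome.
    """
    results = []
    for t_press, presser in presses:
        first = next(
            ((t, m) for t, m in pokes if t_press < t <= t_press + window),
            None,
        )
        if first is None:
            results.append((presser, "unretrieved", None))
        elif first[1] == presser:
            # Same mouse pressed and reached the magazine: complete sequence.
            results.append((presser, "complete", presser))
        else:
            # A conspecific retrieved the pellet: loss for the presser,
            # gain for the retriever.
            results.append((presser, "loss", first[1]))
    return results

# Hypothetical events for three mice A, B and C.
presses = [(10.0, "A"), (30.0, "A"), (50.0, "B")]
pokes = [(12.5, "A"), (33.0, "C"), (70.0, "B")]
outcomes = classify_presses(presses, pokes)
```

In this toy example the first press by A is a complete sequence, the second is a loss for A (and a gain for C), and the press by B is not followed by any magazine entry within 6 s.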

Tube test

We assessed social hierarchy in a subset of mice using a tube test conducted before and after the microsociety task. Three days before testing, mice were habituated to traverse a 30-cm tube (2.5-cm diameter) connecting two cages; habituation was reached after ≥10 successful crossings. On the following day, tube-test trials were performed by placing two mice at opposite ends of the tube; in triads, each mouse faced the two others once per trial. A win/loss was scored when one mouse retreated such that its hind paws touched the cage floor. Five trials were run before the microsociety task, and five additional trials after 7 days of group housing; the apparatus was cleaned between sessions. For Extended Data Fig. 4a, we quantified each mouse’s rank across the five pre-task trials and compared the final pre-task rank (trial 5) to the post-task rank to visualize rank stability and shifts.

Behavioural analyses

Behavioural data were extracted from the database (PyCharm/MySQL) and analysed in MATLAB and R. Lever press counts (#LP) were computed per day for each mouse. To account for inter-cage variability in overall activity, #LP was normalized within each cage by dividing the #LP per mouse by the total #LP in the cage on that day (%LP). This normalization was applied prior to all archetypal analyses involving social triads. The percentage of complete sequences (%CS) was calculated individually for each mouse, as the number of complete sequences divided by the total #LP on the same day. For Extended Data Fig. 3a,b, polar plots were generated from xy coordinates of representative male (n = 42) and female (n = 42) mice, using a coordinate system centred and aligned so that 0° corresponded to the food dispenser. This standardization allowed us to quantify the time each mouse spent oriented towards the food dispenser during particular events—specifically, in response to a conspecific mouse lever press. The arena was partitioned into four zones (water, food, lever-left area and lever-right area), and mouse orientation was quantified at and 1 s after conspecific lever presses.
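As a hedged illustration of the two normalizations described above, the following Python sketch computes %LP and %CS for one cage on one day. The function name, the dictionary layout and the handling of zero counts are assumptions made for the example, and the counts are hypothetical.

```python
def normalise_cage_day(lever_presses, complete_sequences):
    """Compute %LP and %CS for one cage on one day.

    lever_presses, complete_sequences: dicts mapping mouse_id -> daily count.
    %LP = presses by the mouse / total presses in the cage (as a percentage).
    %CS = complete sequences by the mouse / its own presses (as a percentage).
    Assumption: a zero denominator yields 0.0 rather than an error.
    """
    total = sum(lever_presses.values())
    pct_lp = {
        m: (100.0 * n / total if total else 0.0)
        for m, n in lever_presses.items()
    }
    pct_cs = {
        m: (100.0 * complete_sequences[m] / n if n else 0.0)
        for m, n in lever_presses.items()
    }
    return pct_lp, pct_cs

# Hypothetical counts for one triad on one day.
lp = {"m1": 120, "m2": 60, "m3": 20}
cs = {"m1": 30, "m2": 45, "m3": 20}
pct_lp, pct_cs = normalise_cage_day(lp, cs)
```

Here m1 accounts for most of the presses in the cage but completes only a quarter of its own sequences, whereas m3 presses rarely but always retrieves its own pellet.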

Archetypal analysis

Archetypal analyses and visualizations were performed in R using the archetypes package (version 2.2-0.1). Archetypal analysis identifies k idealized behavioural profiles (archetypes) spanning the boundaries of variability in a multivariate dataset and yielding individual α-coefficients and distances to each archetype. The number of archetypes (k) was selected using residual sum of squares as a function of k (elbow criterion) and inspecting solution interpretability in lone and social datasets (see Extended Data Figs. 1d and 2c). For Fig. 1, archetypes were computed in a 2D space (#LP, %CS), yielding two archetypes (Achievers and Storers). In Fig. 2, the three-archetype reference space (Workers, Scroungers and Storers) was built from the full dataset (n = 195) using eight features: %LP and %CS (days 5–7) and food pellets gained from/lost to conspecifics. New cohorts (for example, mixed-sex or dopamine-manipulated triads; Fig. 5) were projected into this fixed space to derive α-coefficients, assign behavioural roles, compute distances to archetypes and compare group compositions. The same archetypal framework was applied to simulated triads (e-triads): archetypes were defined from a reference simulation with separated β values (high-β ‘males’ and low-β ‘females’), enabling robust identification of emergent strategies (Fig. 4g). All other simulated conditions (including Extended Data Fig. 7) were projected into this space. Distances to archetypes derived from α-coefficients were related to dopaminergic signals and firing rates using linear models; expected values at each archetype were estimated from the intercept at zero distance with 95% confidence intervals from prediction errors (Fig. 3). Cage-level compositions were compared to sex-specific null distributions generated by random sampling (10,000 iterations) preserving empirical archetype proportions (Fig. 2i). See also Supplementary Methods for details.
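The null distribution of cage compositions can be sketched as follows in Python. This is only an illustration under stated assumptions: triads are drawn with replacement from the pooled role labels (which preserves their empirical proportions), the role counts are hypothetical, and the function name null_compositions is not from the original analysis, which was run in R.

```python
import random
from collections import Counter

def null_compositions(roles, n_iter=10_000, triad_size=3, seed=0):
    """Estimate how often each triad composition arises by chance.

    roles: pooled list of role labels for one sex.
    Assumption: members are sampled independently, with replacement,
    so that the empirical archetype proportions are preserved.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        triad = tuple(sorted(rng.choices(roles, k=triad_size)))
        counts[triad] += 1
    return {comp: n / n_iter for comp, n in counts.items()}

# Hypothetical role labels, not the counts reported in the study.
roles = ["Worker"] * 14 + ["Scrounger"] * 16 + ["Storer"] * 12
null = null_compositions(roles)
```

The observed frequency of a composition (for example, Worker–Scrounger–Scrounger) can then be compared with its frequency in this null distribution.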

In vivo electrophysiology

Mice were deeply anaesthetized with isoflurane (3% induction, 1–2% maintenance) and extracellular single-unit recordings were performed in the VTA using glass micropipette electrodes (6–9 MΩ, 0.5% NaCl). Signals were amplified and digitized at 25 kHz (Spike2) while sampling the central VTA (anterior–posterior (AP) −3.1 to −4.0 mm, medial–lateral (ML) 0.3–0.7 mm, dorsal–ventral (DV) 4.0–4.8 mm), with electrode tracks spaced by ≥0.1 mm. Spontaneously active dopamine neurons were identified using established electrophysiological criteria (Supplementary Methods). Activity and bursting (% spikes within bursts, %SWB) were quantified in 60-s windows shifted every 15 s (Supplementary Methods).
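The sliding-window quantification can be sketched in Python as below. The burst definition used here (onset at an inter-spike interval below 80 ms, termination at an interval above 160 ms) is a commonly used convention and is an assumption of this example; the criteria actually applied are given in Supplementary Methods. Function names and the synthetic spike train are likewise illustrative.

```python
def burst_flags(spikes, onset=0.080, offset=0.160):
    """Flag spikes that belong to bursts.

    Assumption: a burst starts with an ISI < onset (s) and ends
    when an ISI exceeds offset (s).
    """
    flags = [False] * len(spikes)
    in_burst = False
    for i in range(1, len(spikes)):
        isi = spikes[i] - spikes[i - 1]
        if not in_burst and isi < onset:
            in_burst = True
            flags[i - 1] = True
            flags[i] = True
        elif in_burst and isi <= offset:
            flags[i] = True
        else:
            in_burst = False
    return flags

def sliding_activity(spikes, duration, win=60.0, step=15.0):
    """Firing rate (Hz) and %SWB in windows of `win` s shifted by `step` s."""
    flags = burst_flags(spikes)
    out = []
    start = 0.0
    while start + win <= duration:
        idx = [i for i, t in enumerate(spikes) if start <= t < start + win]
        rate = len(idx) / win
        swb = 100.0 * sum(flags[i] for i in idx) / len(idx) if idx else 0.0
        out.append((start, rate, swb))
        start += step
    return out

# Synthetic regular 4 Hz spike train lasting 90 s (no bursts).
spikes = [i * 0.25 for i in range(360)]
windows = sliding_activity(spikes, duration=90.0)
```

For a 90-s recording this produces three overlapping windows starting at 0, 15 and 30 s.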

Stereotaxic surgeries

Stereotaxic surgeries were performed in 6- to 8-week-old DAT-Cre mice under isoflurane anaesthesia. For fibre photometry, AAV1-Syn-FLEX-GCaMP7c was injected unilaterally into the VTA (300 nl, 100 nl min−1; AP −3.20 mm, ML ±0.5 mm, DV −4.20 mm), followed 2 to 3 weeks later by unilateral optic fibre implantation above VTA and fixation with dental acrylic; buprenorphine was given post-operatively. For optogenetics (DAT-Cre males), AAV5-DIO-ChR2-EYFP (or EYFP control) was injected bilaterally in VTA (300 nl per side) and an optic fibre was implanted unilaterally above VTA at a 10° angle (AP −3.20 mm, ML ±0.9 mm, DV −3.95 mm). For chemogenetics (DAT-Cre females), AAV5-DIO-hM4Di-mCherry (or mCherry control) was injected bilaterally in VTA (300 nl per side). Mice recovered in a heated cage and were monitored daily; behavioural testing began at least one week after surgery, and injection or implant sites were systematically verified post hoc by immunohistochemistry (Supplementary Methods).

Immunohistochemistry

After euthanasia, brains were extracted and fixed in 4% paraformaldehyde for at least 3 days at 4 °C, and 60-µm-thick sections were taken through the midbrain on a vibratome. Free-floating sections were blocked (PBS, 3% BSA, 0.2% Triton X-100) and incubated overnight at 4 °C with a mouse anti-tyrosine hydroxylase primary antibody (Sigma T1299, 1:500). Sections were then rinsed and incubated for 3 h at room temperature with a Cy3-conjugated goat anti-mouse secondary antibody (Jackson, 1:500), mounted with ProLong Gold plus DAPI, and imaged on a Zeiss epifluorescence microscope (ZEN); grayscale images were acquired and false-coloured in ImageJ for visualization (Supplementary Methods).

Fibre photometry

DAT-Cre mice injected with AAV1-Syn-FLEX-GCaMP7c and implanted in the VTA underwent fibre photometry experiments in the lone or social context. A Doric Lenses fibre photometry system was employed to record fluorescence signals reflecting dopaminergic neuron activity in the VTA. Fluorescence was excited with a 465-nm LED driven in lock-in mode (220.537 Hz) and routed through a Mini Cube (FMC4_AE(405)_E(460–490)_F(500–550)_S) to the implanted fibre via a patch cord and zirconia sleeve. Emitted light was collected through the same fibre and converted to an electrical signal by a photoreceiver (AC low setting) connected to the Mini Cube through a second optic patch cord and a dedicated fibre optic adaptor. Signals were acquired in Doric Neuroscience Studio at 12 kHz and low-pass filtered at 12 Hz. For the lone condition, the mice were recorded for between one and two hours at the beginning of the dark cycle, when they were active, during the first day and the last two days of the experiment. For the social condition, the mice were also recorded at the beginning of the dark cycle, one after the other, for between one and two hours each.

Analysis of fibre photometry recordings

Fluorescence signals were first detrended (biexponential fit) to correct for photobleaching, then re-centred by adding back the pre-detrend mean; ΔF/F was computed relative to a baseline fluorescence signal. Dopamine-related activity was quantified using peri-event time histograms time-locked to behaviourally defined TTL events (lever press, nose poke) with 100-ms bins, converted to event-wise z-scored ΔF/F using the 5-s pre-event baseline, and then smoothed by Gaussian convolution (MATLAB, gausswin; 100-bin window). For each event type, peristimulus time histograms were computed over a −10 s to +10 s window, and response magnitude was quantified as the mean z-scored ΔF/F in a post-event window (0 to +1.5 s) relative to a pre-event window (−10 to −5 s). Statistical significance of event-evoked responses was assessed by paired comparisons of pre-event versus post-event window values across mice, using paired t-tests or Wilcoxon signed-rank tests as appropriate (described in figure legends, Supplementary Table 1 and Supplementary Methods). Focused versus unfocused conspecific lever press events in Scroungers (Fig. 3) were scored visually on the basis of orientation to the lever/dispenser and proximity to the dispenser (within 5 cm; Supplementary Methods).
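The event-wise z-scoring and window comparison described above can be illustrated with a short Python sketch. It assumes that ΔF/F has already been detrended and binned at 100 ms, omits the Gaussian smoothing step, and uses hypothetical function names and a synthetic trace; it is not the MATLAB code used for the analysis.

```python
from statistics import mean, pstdev

BIN_S = 0.1  # 100-ms bins, as in the text

def zscored_peth(dff, event_bin, pre_s=10.0, post_s=10.0, base_s=5.0):
    """Return the -pre_s..+post_s trace around one event, z-scored with
    the mean and s.d. of the base_s seconds preceding the event.
    Assumes the event is far enough from both ends of the recording."""
    n_pre = round(pre_s / BIN_S)
    n_post = round(post_s / BIN_S)
    n_base = round(base_s / BIN_S)
    trace = dff[event_bin - n_pre:event_bin + n_post]
    baseline = dff[event_bin - n_base:event_bin]
    mu, sd = mean(baseline), pstdev(baseline)
    return [(x - mu) / sd for x in trace]

def response_window_means(z, pre_s=10.0):
    """Mean z-score in the pre-event (-10 to -5 s) and post-event
    (0 to +1.5 s) windows, for paired comparison across mice."""
    n_pre = round(pre_s / BIN_S)
    pre = z[:round(5.0 / BIN_S)]
    post = z[n_pre:n_pre + round(1.5 / BIN_S)]
    return mean(pre), mean(post)

# Synthetic trace: alternating 0/1 noise with a transient after the event.
dff = [float(i % 2) for i in range(300)]
event_bin = 150
for i in range(event_bin, event_bin + 15):
    dff[i] += 3.0
z = zscored_peth(dff, event_bin)
pre_mean, post_mean = response_window_means(z)
```

Averaging the per-event traces for each mouse, and then comparing pre-event and post-event means across mice, corresponds to the paired tests mentioned in the text.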

Optogenetic experiments

In male DAT-Cre mice injected with AAV5-Ef1α-DIO-ChR2(H134R)-EYFP or AAV5-Ef1α-DIO-EYFP, optical stimulations were performed with an ultra-high-power LED (470 nm, Prizmatix) coupled to a patch cord (500 μm core, NA = 0.5, Prizmatix) with an output intensity of 5–10 mW. We applied a 20 Hz optogenetic stimulation protocol (5 ms light pulse) for 15 min, delivered twice: 24 h and 1 h before the start of the microsociety task (in the social environment). No significant changes were observed in the behaviours of the mice after the stimulation.

Chemogenetic experiments

In female DAT-Cre mice injected with AAV5-hSyn-DIO-hM4Di-mCherry or AAV5-hSyn-DIO-mCherry, CNO (water soluble, Hellobio) was administered through a water bottle. The CNO solution was introduced 24 h before the microsociety experiment and remained available during the first day (day 1), and was then replaced by normal water. The concentration of CNO was determined on the basis of a dosage of 5 mg kg−1 assuming a daily water consumption of 5 ml per mouse. A 200 µl solution of CNO at a concentration of 10 mg ml−1 was prepared and administered in 100 ml of water in each bottle.
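The dosing arithmetic can be checked with a few lines of Python. The body mass of 20 g is an assumption introduced for this check (it is not stated in the text), and the small volume added by the stock solution is neglected.

```python
# Values taken from the text
stock_mg_per_ml = 10.0    # CNO stock concentration
stock_volume_ml = 0.2     # 200 µl of stock per bottle
bottle_volume_ml = 100.0  # water per bottle
daily_intake_ml = 5.0     # assumed daily water consumption per mouse

# Assumption for illustration only: a 20-g mouse
body_mass_kg = 0.020

cno_in_bottle_mg = stock_mg_per_ml * stock_volume_ml       # about 2 mg
bottle_conc_mg_per_ml = cno_in_bottle_mg / bottle_volume_ml  # about 0.02 mg/ml
daily_cno_mg = bottle_conc_mg_per_ml * daily_intake_ml       # about 0.1 mg
dose_mg_per_kg = daily_cno_mg / body_mass_kg                 # about 5 mg/kg
```

Under these assumptions the bottle concentration corresponds to the stated 5 mg kg−1 per day; heavier mice or lower water intake would give a proportionally lower dose.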

Modelling

Building a behavioural model of e-mouse behaviours in lone and social conditions

The experimental environment was modelled as six states (rooms 1–4, plus the lever and dispenser positions; Fig. 4a,e). The number and sex of agents (e-mice) present in the environment were varied, with: (1) one (male or female) e-mouse in the lone experiment; (2) three male or female e-mice in social experiments; or (3) 1 male and 2 females in the mixed-box experiment. State transition occurred at each time step, with probabilities of transitions determined by a softmax based on Q-values of all accessible states from the current state. In the softmax, the inverse temperature parameter β controlled the exploitation–exploration trade-off, with lower β values producing more stochastic exploration and higher β values promoting exploitation of higher-valued transitions. Q learning occurred after each transition from a departure state to an accessible arriving state. After each transition, the value of the selected move was updated from a prediction error combining the obtained reward and the best expected future value from the arrival state. Updates were scaled by a learning rate (α) and a discount factor (γ) controlled the weight of future outcomes. Furthermore, e-mice encountered satiety, which scaled action probabilities, the learning rate and fatigue, which affected both pressing and eating. Satiety and fatigue were used to scale an action pace in the simulation that was consistent with the experimental measures. The lever could not be pressed for 3 time steps after each press (that is, to mimic the 5 s lever unavailability in experiments). Complete modelling information regarding the full model and observables is given in Supplementary Methods.
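The softmax choice rule and the value update described above can be sketched in Python as follows. This is a generic Q-learning illustration, not the full e-mouse model: the two-state environment, the parameter values (α = 0.1, γ = 0.9, β = 3), the unit reward and the function names are all assumptions, and satiety, fatigue and the lever refractory period are omitted.

```python
import math
import random

def softmax_probs(q_values, beta):
    """Transition probabilities from Q-values with inverse temperature beta.
    Low beta gives near-uniform exploration; high beta favours the best option."""
    m = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def q_update(Q, state, next_state, reward, accessible, alpha=0.1, gamma=0.9):
    """Update the value of the move state -> next_state.
    Prediction error = reward + gamma * best value from next_state - current value."""
    best_next = max(Q[(next_state, s2)] for s2 in accessible[next_state])
    delta = reward + gamma * best_next - Q[(state, next_state)]
    Q[(state, next_state)] += alpha * delta
    return delta

# Toy two-state environment (hypothetical, for illustration only).
accessible = {"lever": ["lever", "dispenser"], "dispenser": ["lever", "dispenser"]}
Q = {(s, s2): 0.0 for s in accessible for s2 in accessible[s]}
rng = random.Random(0)

state = "lever"
probs = softmax_probs([Q[(state, s2)] for s2 in accessible[state]], beta=3.0)
next_state = rng.choices(accessible[state], weights=probs)[0]
delta = q_update(Q, state, next_state, reward=1.0, accessible=accessible)
```

With all Q-values initially equal, the first choice is uniform whatever the value of β; differences between high-β and low-β agents appear only once learning has separated the Q-values.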

Reduced model of social interactions

We built a reduced theoretical model (the ‘reduced model’) to assess, within a mathematically tractable framework, the causal mechanisms whereby specialized behaviours emerge under social interactions. To do so, we derived the reduced model from the reinforcement learning one, based on a continuous-time version of Q dynamics, that is, ordinary differential equations (ODEs). In the ODE system, learning and behavioural dynamics operated at a slower time scale than that of individual choices in the full model, so that actions (that is, state transitions, lever presses and eating) were described probabilistically. In this framework, we performed qualitative analysis to determine the number and stability of fixed points of learning and behavioural (state) variables, as a function of parameters (with a focus on β, essential in setting social interactions in the full model). Moreover, to reduce dimensionality for better tractability, we considered a simpler setup in which the environment contains only two positions for e-mice (the lever L and the food dispenser D) and only two e-mice (which allowed us to assess social interactions), and we did not consider fatigue or satiety. The reduced model ODEs could be expressed in a tractable form (Supplementary Methods). The main results of the qualitative analysis of the system are recapitulated in Supplementary Table 2 (Supplementary Information). Full information on the reduced model derivation and analysis is given in Supplementary Methods.

Statistics

A priori power analyses were not used to predetermine sample sizes. Animals were randomized to groups at the time of viral infection or behavioural testing. Statistical analyses were performed using MATLAB and R. Normality was assessed with the Shapiro–Wilk test; normally distributed data were analysed with independent or paired t-tests, and non-normal data with Mann–Whitney or Wilcoxon signed-rank tests. Repeated-measures ANOVA (one-way or two-way, as appropriate) was used for designs with multiple factors, with Bonferroni–Holm post hoc correction. Chi-square (χ²) tests were used to compare proportions or categorical distributions between groups.
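For readers unfamiliar with the Bonferroni–Holm procedure mentioned above, the following Python sketch computes Holm-adjusted P values. It is a generic implementation of the step-down method with hypothetical input values, not the code used for the reported analyses (which were run in MATLAB and R).

```python
def holm_adjust(p_values):
    """Holm step-down adjustment.

    The smallest P value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing
    in the order of the raw P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw P values from three post hoc comparisons.
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)  # approximately [0.03, 0.06, 0.06]
```

A comparison is declared significant when its adjusted P value falls below the chosen threshold.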

For the archetypal analyses, linear regression models (lm function in R) were used to assess the relationship between behavioural or dopaminergic responses and distance to each archetype. Models included both main effects and archetype × distance interactions, allowing estimation of archetype-specific slopes and intercepts. To compare predicted responses at zero distance (intercepts), marginal means were extracted with the emmeans package and pairwise contrasts performed with Tukey correction. Confidence intervals for predicted values (95% confidence interval, shown in figures) were obtained via standard error propagation (predict(…, se.fit = TRUE)), and model significance was assessed using type II ANOVA (car::Anova) and adjusted R². These procedures were applied identically to behavioural, photometry and neurophysiological datasets.

Unless otherwise specified, all statistical tests were two-tailed. In cases in which specific hypotheses were tested regarding the directionality of the effect—such as expected increases or decreases in dopaminergic firing following ChR2 stimulation or hM4Di inhibition—one-tailed tests were used and explicitly reported.

Data presentation (mean ± s.e.m. or mean ± 95% confidence interval), significance thresholds and the statistical tests used are specified in each figure legend; full statistical details (test statistics, degrees of freedom, exact P values and multiple-comparison procedures) are provided in Supplementary Table 1.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.