Natural behaviour is learned through dopamine-mediated reinforcement

Kasdin, Jonathan; Duffy, Alison; Nadler, Nathan; Raha, Arnav; Fairhall, Adrienne L.; Stachenfeld, Kimberly L.; Gadagkar, Vikram

doi:10.1038/s41586-025-08729-1

Published: 12 March 2025

Nature volume 641, pages 699–706 (2025)

An Author Correction to this article was published on 20 February 2026

This article has been updated

Abstract

Many natural motor skills, such as speaking or locomotion, are acquired through a process of trial-and-error learning over the course of development. It has long been hypothesized, motivated by observations in artificial learning experiments, that dopamine has a crucial role in this process. Dopamine in the basal ganglia is thought to guide reward-based trial-and-error learning by encoding reward prediction errors¹, decreasing after worse-than-predicted reward outcomes and increasing after better-than-predicted ones. Our previous work in adult zebra finches, in which we changed the perceived song quality with distorted auditory feedback, showed that dopamine in Area X, the singing-related basal ganglia, encodes performance prediction error: dopamine is suppressed after worse-than-predicted (distorted syllables) and activated after better-than-predicted (undistorted syllables) performance². However, it remains unknown whether the learning of natural behaviours, such as developmental vocal learning, occurs through dopamine-based reinforcement. Here we tracked song learning trajectories in juvenile zebra finches and used fibre photometry³ to monitor concurrent dopamine activity in Area X. We found that dopamine was activated after syllable renditions that were closer to the eventual adult version of the song, compared with recent renditions, and suppressed after renditions that were further away. Furthermore, the relationship between dopamine and song fluctuations revealed that dopamine predicted the future evolution of song, suggesting that dopamine drives behaviour. Finally, dopamine activity was explained by the contrast between the quality of the current rendition and the recent history of renditions, consistent with dopamine’s hypothesized role in encoding prediction errors in an actor–critic reinforcement-learning model⁴,⁵. Reinforcement-learning algorithms⁶ have emerged as a powerful class of model to explain learning in reward-based laboratory tasks, as well as for driving autonomous learning in artificial intelligence⁷. Our results suggest that complex natural behaviours in biological systems can also be acquired through dopamine-mediated reinforcement learning.
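
The contrast between the current rendition and the recent history of renditions can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes that song quality is a single number per rendition (higher is better) and that the critic's prediction is an exponentially weighted running average. The function name and the `alpha` update rate are hypothetical.

```python
def performance_prediction_errors(qualities, alpha=0.2):
    """Return one prediction error per rendition.

    qualities: per-rendition song quality (higher is better); an assumed
               scalar summary, for example negative distance to the adult song.
    alpha:     assumed update rate of the critic's running prediction.
    """
    if not qualities:
        return []
    prediction = qualities[0]  # start the critic at the first observed quality
    errors = []
    for q in qualities:
        error = q - prediction       # better than predicted gives a positive error
        errors.append(error)
        prediction += alpha * error  # move the prediction toward the outcome
    return errors
```

Under these assumptions, a rendition that is better than the recent average yields a positive error and a worse one yields a negative error, which mirrors the direction of the dopamine changes described in the abstract.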

**Fig. 1: Synchronous recording of developmental song learning and dopamine.**

**Fig. 2: Dopamine encodes relative syllable quality during song learning.**

**Fig. 3: Dopamine predicts future song evolution during development.**

**Fig. 4: Dopamine tracks performance history to generate prediction errors.**

Data availability

Further data are available upon request. Source data are provided with this paper.

Code availability

The code written for this study is available upon request.

Change history

20 February 2026
A Correction to this paper has been published: https://doi.org/10.1038/s41586-026-10199-y

References

1. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).
2. Gadagkar, V. et al. Dopamine neurons encode performance error in singing birds. Science 354, 1278–1282 (2016).
3. Zhuo, Y. et al. Improved green and red GRAB sensors for monitoring dopaminergic activity in vivo. Nat. Methods 21, 680–691 (2024).
4. Chen, R. & Goldberg, J. H. Actor–critic reinforcement learning in the songbird. Curr. Opin. Neurobiol. 65, 1–9 (2020).
5. Joel, D., Niv, Y. & Ruppin, E. Actor–critic models of the basal ganglia: new anatomical and computational perspectives. Neural Netw. 15, 535–547 (2002).
6. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 1998).
7. Botvinick, M. et al. Reinforcement learning, fast and slow. Trends Cogn. Sci. 23, 408–422 (2019).
8. Wickens, J. R., Reynolds, J. N. & Hyland, B. I. Neural mechanisms of reward-related motor learning. Curr. Opin. Neurobiol. 13, 685–690 (2003).
9. Costa, R. M. Plastic corticostriatal circuits for action learning: what’s dopamine got to do with it? Ann. N. Y. Acad. Sci. 1104, 172–191 (2007).
10. Jarvis, E. Vocal learning and spoken language. Science 366, 50–54 (2019).
11. Davenport, M. H. & Jarvis, E. D. Birdsong neuroscience and the evolutionary substrates of learned vocalization. Trends Neurosci. 46, 97–99 (2023).
12. Konopka, G. & Roberts, T. F. Insights into the neural and genetic basis of vocal communication. Cell 164, 1269–1276 (2016).
13. Prather, J., Okanoya, K. & Bolhuis, J. J. Brains for birds and babies: neural parallels between birdsong and speech acquisition. Neurosci. Biobehav. Rev. 81, 225–237 (2017).
14. Doupe, A. J. & Kuhl, P. K. Birdsong and human speech: common themes and mechanisms. Annu. Rev. Neurosci. 22, 567–631 (1999).
15. Brainard, M. S. & Doupe, A. J. Translating birdsong: songbirds as a model for basic and applied medical research. Annu. Rev. Neurosci. 36, 489–517 (2013).
16. Burke, J. E. & Schmidt, M. F. Neural control of birdsong. eLS 1, 345–355 (2020).
17. Person, A. L., Gale, S. D., Farries, M. A. & Perkel, D. J. Organization of the songbird basal ganglia, including area X. J. Comp. Neurol. 508, 840–866 (2008).
18. Lovell, P. V. et al. ZEBrA: Zebra finch Expression Brain Atlas—a resource for comparative molecular neuroanatomy and brain evolution studies. J. Comp. Neurol. 528, 2099–2131 (2020).
19. Tumer, E. C. & Brainard, M. S. Performance variability enables adaptive plasticity of ‘crystallized’ adult birdsong. Nature 450, 1240–1244 (2007).
20. Andalman, A. S. & Fee, M. S. A basal ganglia–forebrain circuit in the songbird biases motor output to avoid vocal errors. Proc. Natl Acad. Sci. USA 106, 12518–12523 (2009).
21. Duffy, A., Latimer, K. W., Goldberg, J. H., Fairhall, A. L. & Gadagkar, V. Dopamine neurons evaluate natural fluctuations in performance quality. Cell Rep. 38, 110574 (2022).
22. Roeser, A. et al. Dopaminergic error signals retune to social feedback during courtship. Nature 623, 375–380 (2023).
23. Hisey, E., Kearney, M. G. & Mooney, R. A common neural circuit mechanism for internally guided and externally reinforced forms of motor learning. Nat. Neurosci. 21, 589–597 (2018).
24. Xiao, L. et al. A basal ganglia circuit sufficient to guide birdsong learning. Neuron 98, 208–221 (2018).
25. Hoffmann, L. A., Saravanan, V., Wood, A. N., He, L. & Sober, S. J. Dopaminergic contributions to vocal learning. J. Neurosci. 36, 2176–2189 (2016).
26. Fee, M. S. & Goldberg, J. H. A hypothesis for basal ganglia-dependent reinforcement learning in the songbird. Neuroscience 198, 152–170 (2011).
27. Mackevicius, E. L. & Fee, M. S. Building a state space for song learning. Curr. Opin. Neurobiol. 49, 59–68 (2018).
28. Doya, K. & Sejnowski, T. A novel reinforcement model of birdsong vocalization learning. In Adv. Neural Information Processing Systems 7 (NIPS 7) (eds Tesauro, G. et al.) 101–108 (MIT Press, 1995).
29. Tchernichovski, O., Mitra, P. P., Lints, T. & Nottebohm, F. Dynamics of the vocal imitation process: how a zebra finch learns its song. Science 291, 2564–2569 (2001).
30. Kollmorgen, S., Hahnloser, R. H. R. & Mante, V. Nearest neighbours reveal fast and slow components of motor learning. Nature 577, 526–530 (2020).
31. Funabiki, Y. & Konishi, M. Long memory in song learning by zebra finches. J. Neurosci. 23, 6928–6935 (2003).
32. Steinfath, E., Palacios-Munoz, A., Rottschafer, J. R., Yuezak, D. & Clemens, J. Fast and accurate annotation of acoustic signals with deep neural networks. eLife 10, e68837 (2021).
33. Lerner, T. N., Holloway, A. L. & Seiler, J. L. Dopamine, updated: reward prediction error and beyond. Curr. Opin. Neurobiol. 67, 123–130 (2021).
34. Toutounji, H., Zai, A. T., Tchernichovski, O., Hahnloser, R. H. R. & Lipkind, D. Learning the sound inventory of a complex vocal skill via an intrinsic reward. Sci. Adv. 10, eadj3824 (2024).
35. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005).
36. Adam, I. et al. Daily vocal exercise is necessary for peak performance singing in a songbird. Nat. Commun. 14, 7787 (2023).
37. Fiete, I. R., Fee, M. S. & Seung, H. S. Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances. J. Neurophysiol. 98, 2038–2057 (2007).
38. Ikeda, M. Z., Trusel, M. & Roberts, T. F. Memory circuits for vocal imitation. Curr. Opin. Neurobiol. 60, 37–46 (2019).
39. Louder, M. I. M. et al. Transient sensorimotor projections in the developmental song learning period. Cell Rep. 43, 114196 (2024).
40. Tian, J. et al. Distributed and mixed information in monosynaptic inputs to dopamine neurons. Neuron 91, 1374–1389 (2016).
41. Watabe-Uchida, M., Eshel, N. & Uchida, N. Neural circuitry of reward prediction error. Annu. Rev. Neurosci. 40, 373–394 (2017).
42. Chen, R. et al. Songbird ventral pallidum sends diverse performance error signals to dopaminergic midbrain. Neuron 103, 266–276 (2019).
43. Kearney, M. G., Warren, T. L., Hisey, E., Qi, J. & Mooney, R. Discrete evaluative and premotor circuits enable vocal learning in songbirds. Neuron 104, 559–575 (2019).
44. Bottjer, S. W., Brady, J. D. & Cribbs, B. Connections of a motor cortical region in zebra finches: relation to pathways for vocal learning. J. Comp. Neurol. 420, 244–260 (2000).
45. Mandelblat-Cerf, Y., Las, L., Denisenko, N. & Fee, M. S. A role for descending auditory cortical projections in songbird vocal learning. eLife 3, e02152 (2014).
46. Schrittwieser, J. et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588, 604–609 (2020).
47. Fawzi, A. et al. Discovering faster matrix multiplication algorithms with reinforcement learning. Nature 610, 47–53 (2022).
48. Markowitz, J. E. et al. Spontaneous behaviour is structured by reinforcement without explicit reward. Nature 614, 108–117 (2023).
49. Colquitt, B. M., Merullo, D. P., Konopka, G., Roberts, T. F. & Brainard, M. S. Cellular transcriptomics reveals evolutionary identities of songbird vocal circuits. Science 371, eabd9704 (2021).
50. Pfenning, A. R. et al. Convergent transcriptional specializations in the brains of humans and song-learning birds. Science 346, 1256846 (2014).
51. Tchernichovski, O., Nottebohm, F., Ho, C. E., Pesaran, B. & Mitra, P. P. A procedure for an automated measurement of song similarity. Anim. Behav. 59, 1167–1176 (2000).
52. Immelman, K. in Bird Vocalizations (ed. Hinde, R. A.) 64–74 (Cambridge Univ. Press, 1969).
53. Krzanowski, W. J. Principles of Multivariate Analysis: A User’s Perspective (Oxford Univ. Press, 1988).
54. Seber, G. A. F. Multivariate Observations (Wiley, 1984).
55. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723 (1974).
56. Ljung, L. System Identification: Theory for the User (Prentice Hall, 1999).
57. Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979).

Acknowledgements

We thank L. Abbott, D. Aronov, M. Churchland, A. Litwin-Kumar, N. Sawtell, S. Siegelbaum and members of the V.G. laboratory for comments and suggestions; M. Eswaran and A. Sahilu for technical assistance; and K. J. Miller for analysis suggestions and feedback on the manuscript. Imaging was performed with support from the Zuckerman Institute’s Cellular Imaging platform for instrument use and technical advice. We thank the Zebra Finch Expression Brain Atlas website (http://www.zebrafinchatlas.org) for histological reference images. Funding support was provided to A.L.F. and A.D. by the Simons Collaboration on the Global Brain, and to V.G. by the NIH (R00NS102520 and DP2AT012347) and the Searle, Klingenstein–Simons and McKnight scholars programs.

Author information

These authors contributed equally: Jonathan Kasdin, Alison Duffy

Authors and Affiliations

Department of Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
Jonathan Kasdin, Nathan Nadler, Arnav Raha, Kimberly L. Stachenfeld & Vikram Gadagkar
Department of Neurobiology and Biophysics and Computational Neuroscience Center, University of Washington, Seattle, WA, USA
Alison Duffy & Adrienne L. Fairhall
Google DeepMind, New York, NY, USA
Kimberly L. Stachenfeld

Contributions

J.K. and V.G. conceived the study, and designed and performed the experiments. J.K., A.D., N.N., A.R., A.L.F., K.L.S. and V.G. performed data analysis. A.D., A.R., A.L.F. and K.L.S. performed data modelling. J.K., A.D., N.N., A.R. and V.G. wrote the original draft of the manuscript. J.K., A.D., N.N., A.R., A.L.F., K.L.S. and V.G. edited and reviewed the final manuscript. A.L.F. and V.G. acquired funding. A.L.F., K.L.S. and V.G. supervised the project.

Corresponding author

Correspondence to Vikram Gadagkar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Ofer Tchernichovski and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Histological verification of fibre placement in Area X.

a, Schematic from the zebra finch atlas. Adapted from ref. ¹⁸ (Wiley). Orange dashed lines indicate the intended placement of the cannula in Area X. b, Example brain slice showing fibre placement (purple dashed circle, Area X; orange dashed lines, fibre placement).

Extended Data Fig. 2 Singing-related dopamine activity.

a, Dopamine responses aligned to singing bout boundaries (see Methods) for an example bird across days 61–100 of development. Example spectrograms and dopamine responses aligned to bout onsets (n = 1912, top) and bout offsets (n = 7745, bottom), plotted above average ΔF/F signals (purple, bout onsets; orange, bout offsets). b, ΔF/F signals plotted similarly to a but each row represents the average across one of n = 6 birds (shading, ±s.e.m). c,d, Mean subtraction of singing-related dopamine activity to isolate the error response (see Methods). Top, ΔF/F signals for an example syllable (c) and averaged across all (n = 25) syllables (d) for the closest 10% relative distance (blue), furthest 10% relative distance (red) and mean (black) across all renditions (shading, ±s.e.m). Bottom, ΔF/F signals for the furthest and closest relative distance plotted as on top but after mean subtraction. e,f, ΔF/F signals plotted as in c,d with the dopamine response to the middle 10% of trials with 0 average relative distance (purple, equidistant - neither closer nor further). g, Scatter plot of averaged ΔF/F signals for all syllables (Methods) for closer (blue), further (red), and equidistant (purple) renditions (*P < 0.05, n.s. P > 0.05, not significantly different from zero, one sample t-test; black bars, mean ± s.d.).
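
Panels c and d describe subtracting the mean singing-related signal to isolate the error response, then comparing the closest and furthest 10% of renditions. A minimal sketch of that idea follows. It assumes each rendition has one ΔF/F trace of equal length and one relative-distance value, with smaller values meaning closer to the adult song. The function name and the `fraction` parameter are hypothetical, and the procedure in the paper's Methods may differ in detail.

```python
def decile_error_responses(traces, relative_distance, fraction=0.1):
    """Mean-subtract the traces, then average the closest and furthest renditions.

    traces:            list of equal-length dF/F traces, one per rendition.
    relative_distance: one value per rendition; smaller is assumed to mean closer.
    Returns (closest_average, furthest_average), each a trace of residuals.
    """
    n = len(traces)
    n_samples = len(traces[0])
    # Mean trace across all renditions, taken here as the singing-related component.
    mean_trace = [sum(tr[s] for tr in traces) / n for s in range(n_samples)]
    residuals = [[x - m for x, m in zip(tr, mean_trace)] for tr in traces]
    # Rank renditions from closest to furthest.
    order = sorted(range(n), key=lambda i: relative_distance[i])
    k = max(1, int(round(fraction * n)))

    def average(indices):
        return [sum(residuals[i][s] for i in indices) / len(indices)
                for s in range(n_samples)]

    return average(order[:k]), average(order[-k:])
```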

Extended Data Fig. 3 Rendition-to-rendition variability in dopamine responses.

a, Spectrogram of an example eight-second section of song plotted above the corresponding ΔF/F signal. b, Single rendition dopamine responses for an example syllable on a single day (day 67) with the closest 10% (top) and furthest 10% (bottom) relative distance plotted above average ΔF/F signals (blue, closer renditions; red, further renditions; shading, ±s.d.). c, Dopamine responses (see Methods) for the same data shown in b (bar and error bar, mean ± s.d.; *P < 0.05, two-sided t-test). d, Dopamine responses plotted similarly to c but averaged across development (day 61−100) for all n = 25 syllables, ordered by the difference between closer and further response (*P < 0.05, n.s. P > 0.05, not significant, 2-sided t-test with Holm–Bonferroni correction⁵⁷).
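
The Holm–Bonferroni correction⁵⁷ used in panel d is a standard step-down procedure. The sketch below shows the general method in plain Python; it is not the authors' analysis code, and the function name is illustrative.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test is retained, all larger p-values are retained too
    return reject
```

For example, with p-values 0.01, 0.04, 0.03 and 0.005 at alpha = 0.05, the two smallest pass their thresholds (0.0125 and about 0.0167), while 0.03 exceeds 0.025, so testing stops there.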

Extended Data Fig. 4 Variability in dopamine responses across syllables.

a, Syllable-averaged dopamine responses (reproduced from Fig. 2e, see Methods) across development (day 61−100) with the closest 10% relative distance (top) and furthest 10% relative distance (bottom) plotted above average ΔF/F signals (blue, closer renditions; red, further renditions; shading, ±s.e.m) sorted from weakest to strongest dopamine response. b, Example spectrograms from early in development (day ≤ 76) and during adulthood (day > 90) for the three syllables with the strongest dopamine response (top) and the three syllables with the weakest dopamine response (bottom). Number indicates ranking (1, weakest; 25, strongest). c, Average over all syllables (n = 25) of the mean distance from all renditions on a given day to the adult (age > 90 days) median (grey shading, ±s.e.m.). d, Correlation between magnitude of dopamine response and syllable duration (black dots, 25 syllables; black line, best fit; n.s. P > 0.05, no significant correlation, Pearson linear correlation coefficient). e–g, Plotted as in d but for average s.d. per day of distance to adult version across development (e), per hour change in distance averaged across each day (f) and mean change in distance per day across development (g).

Extended Data Fig. 5 Circadian oscillations have a subtle influence on the magnitude of the dopamine response.

a, Distance to adult median averaged across all syllables (n = 25) and over development (day 61−100) decreases over the course of a day (line and shading, mean across syllables ± s.e.m.). b, Average dopamine response across syllable renditions (see Methods), plotted similarly to a. c, The percentage of renditions with closest 10% (blue) and furthest 10% (red) relative distance during a day that were sung in each hour of the day (line and shading, mean across syllables ± s.e.m.). d, Dopamine responses to the closest 10% (blue) and furthest 10% (red) relative distance renditions sung in the morning (first 3 hours of the day, solid lines, n = 51560 total renditions across birds) and afternoon (latter 9 hours of the day, dotted lines, n = 51465 total renditions across birds) averaged across n = 25 syllables across development (day 61−100). e, Scatter plot of averaged ΔF/F signals for the closest 10% relative distance sung in the morning (left) and afternoon (right) for all syllables (n.s. P > 0.05, not significant, paired t-test; black bars, mean ± s.d.). f, Plotted similarly to e but for furthest 10% relative distance (*P < 0.05, significant, paired t-test).

Extended Data Fig. 6 Variation in syllable amplitude does not account for the dopamine response.

a, Mean amplitude across all renditions on each day of development (day 61−100) for an example syllable, normalized to the average adult (day > 90) amplitude of that syllable. b, Plotted similarly to a but averaged across all (n = 25) syllables (shading, ±s.e.m.). c, Correlation of the ΔF/F signal with syllable amplitude (see Methods) across development (day 61−100) for an example syllable (black line, best fit line; n.s., p > 0.05, not significant, Pearson linear correlation coefficient). d, Pearson linear correlation coefficient between syllable amplitude and the ΔF/F signal for all syllables (open circles, p > 0.05, not significant; filled circles, p < 0.05, Pearson linear correlation coefficient with Holm–Bonferroni correction⁵⁷). e,f, Syllable-averaged (n = 25) ΔF/F signals for 10% closest (blue) and 10% furthest (red) relative distance before (e, reproduced from Fig. 2e) and after (f) normalization (see Methods) by syllable amplitude (shading, ± s.e.m). g,h, Scatter plot of averaged ΔF/F signals for all n = 25 syllables for closer (g) and further (h) relative distance comparing not normalized (left) and amplitude-normalized (right) renditions (n.s. P > 0.05, not significant, paired t-test; black bars, mean ± s.d.).

Extended Data Fig. 7 Dopamine predicts future song evolution.

a, Movement along the dopamine vector is not sensitive to song-block size (see Methods) and the effect is consistently significant at a block size of 10 or more renditions. Replication of analysis shown in Fig. 3f over a range of song-block sizes. Movement along the dopamine vector averaged across all n = 25 syllables (blue trace and shading, mean ± s.e.m.; black line and grey shading, mean ± 1.96 s.d. of shuffled data). b, Six example syllables plotted as in Fig. 3d. Average across all dopamine vectors for the example syllable across all focal groups of 150 syllable renditions. c, Extension of analysis shown in Fig. 3f (see Methods) for both forward (blue) and backward (red) in rendition steps relative to the focal song block, plotted similarly to a (black bar and star, P < 0.05, paired t-test, region in which the difference in absolute magnitudes between the movement along the dopamine vector is significantly greater in the future than in the past).

Extended Data Fig. 8 Dopamine correlates more with the future than with the past at points of abrupt change in song.

a, Schematic of analysis. Three consecutive, non-overlapping song blocks are selected: past, current, and future. Past and future vectors are computed from the difference in median song locations in each song block. Song renditions in the current song block are projected onto the past and future vectors and the correlation with dopamine in the current song block is computed along these dimensions. This is repeated at steps of 25 renditions across all song development (see Methods). Local song-block triplets are divided into two categories depending on the local geometry of the song trajectory: ‘hinge’ (θ < π/2) trajectories and ‘arrow’ (θ > π/2) trajectories (θ is defined between 0 and π). b, Distribution of angles between the future vector and the negative of the past vector across all sampled triplets of song blocks along song development across n = 25 syllables (block size, 150 renditions). c, Average correlation coefficient between current song renditions projected onto the future and past song vectors and current dopamine transients (points and error bars, mean ± s.e.m over syllables; blue/pink stars, p < 0.05, two-sided t-test, correlations between dopamine and past/future song vectors are significantly different from zero; black circle, p < 0.05, paired t-test, correlations between dopamine and past/future song vectors are significantly different). d,e, Plotted as in c, but across multiple song-block sizes for ‘hinge’ (d) and ‘arrow’ (e) trajectories. f, Example syllable spectrograms showing the emergence of an element present in the tutor song but not in early renditions during development. Left, syllable renditions produced on a single day during development (day = 61) were divided into ‘Immature’ versions (without the tutor element, purple box) and ‘Mature’ versions (with the tutor element). Right, example spectrograms of the adult version (day = 100) and corresponding tutor syllable. g, Dopamine responses for the example syllable shown in f on a single day during development (day = 61) for the ‘Mature’ (n = 423, top) and ‘Immature’ (n = 402, bottom) versions plotted above average ΔF/F signals (all plots aligned to syllable onset; blue, ‘Mature’ renditions; red, ‘Immature’ renditions; P < 0.05, two-sided t-test, significant difference between dopamine responses for ‘Immature’ and ‘Mature’ renditions). h,i, A second example syllable plotted as in f,g using day = 63 during development (P < 0.05, two-sided t-test, significant difference between dopamine responses for ‘Immature’ (n = 201) and ‘Mature’ (n = 137) renditions). j,k, A third example syllable plotted as in f,g using day = 61 during development (P < 0.05, two-sided t-test, significant difference between dopamine responses for ‘Immature’ (n = 240) and ‘Mature’ (n = 310) renditions).
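
The triplet analysis in panels a to c can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: each rendition is taken to be a feature vector, the past vector is assumed to point from the past median to the current median, and the future vector from the current median to the future median. All function names are hypothetical.

```python
import math
from statistics import median

def block_median(block):
    """Coordinate-wise median of a block of renditions (each a feature vector)."""
    return [median(dim) for dim in zip(*block)]

def subtract(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    return dot(dx, dy) / math.sqrt(dot(dx, dx) * dot(dy, dy))

def analyse_triplet(past, current, future, dopamine):
    """Classify the local trajectory and correlate dopamine with each direction."""
    m_past, m_cur, m_fut = map(block_median, (past, current, future))
    past_vec = subtract(m_cur, m_past)   # assumed sign: points from past to current
    future_vec = subtract(m_fut, m_cur)  # points from current to future
    neg_past = [-a for a in past_vec]
    cos_theta = dot(future_vec, neg_past) / math.sqrt(
        dot(future_vec, future_vec) * dot(neg_past, neg_past))
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # angle in [0, pi]
    kind = "hinge" if theta < math.pi / 2 else "arrow"
    # Position of each current rendition along each direction. The vectors are
    # left unnormalised because the correlation does not depend on their scale.
    along_past = [dot(r, past_vec) for r in current]
    along_future = [dot(r, future_vec) for r in current]
    return (kind, theta,
            pearson(along_past, dopamine),
            pearson(along_future, dopamine))
```

With this sign convention, a trajectory that keeps moving in the same direction gives θ near π (an ‘arrow’), whereas a sharp change of direction gives a small θ (a ‘hinge’).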

Extended Data Fig. 9 Dopamine tracks performance history in individual syllables.

a, Magnified view of the regression coefficients over all development for the current rendition (0) and the previous two renditions (−1, −2) for all n = 25 syllables plotted as in Fig. 4d (error bars, mean ± s.e.m; *P < 0.001, two-sided t-test; see Methods). b, Fraction of performance history coefficients in linear regression fit to dopamine in Fig. 4b that are less than 0 compared to the shuffled distribution (see Methods). Coefficients are consistently negative above chance (P < 0.05, computed from shuffled distribution, see Methods). c,d, Linear regression fit of performance to dopamine for early (c) and late (d) development, plotted as in Fig. 4d. e, Plotted as in Fig. 4c. Coefficients of a linear regression fit of the song history to the dopamine signal for example syllables fit across development (blue trace, average over days; error bars, ±s.e.m.).
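
The regression of dopamine onto the current rendition and its recent history can be illustrated with an ordinary least-squares fit. The sketch below assumes a scalar performance measure per rendition and uses a small hand-written solver so that it runs without third-party libraries. The function names and the `n_history` parameter are hypothetical, and the paper's regression (see its Methods) may be set up differently.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_history_regression(performance, dopamine, n_history=2):
    """Fit dopamine[t] ~ b0 + sum_k w[k] * performance[t - k], k = 0..n_history.

    Returns [b0, w[0], w[1], ...], where w[0] weights the current rendition
    and w[k] weights the rendition k steps in the past.
    """
    rows, targets = [], []
    for t in range(n_history, len(performance)):
        rows.append([1.0] + [performance[t - k] for k in range(n_history + 1)])
        targets.append(dopamine[t])
    p = n_history + 2  # intercept plus (n_history + 1) weights
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)
```

If the performance measure is a quality score, a prediction-error-like signal would be expected to show a positive weight on the current rendition and negative weights on past renditions; if the measure is a distance, the signs flip.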

Extended Data Fig. 10 Dopamine transients represent prediction errors better than mismatch errors.

a, Top, comparison of the PPE and ME models with different numbers of performance history terms (see Methods). Each trace is a single syllable model comparison (blue dot, the minimum, negative Akaike information criterion, ∆AIC, the PPE model’s best fit relative to the ME model). The magnitude and sign of ∆AIC indicate the relative superiority of one model over another. 25/25 syllables were included in the ME/PPE comparison because all syllables had at least one ME or PPE linear regression onto dopamine with a significant R² value (computed using the fitlm function in MATLAB). 22/25 syllable traces have a negative minimum (blue dot excluded from the three syllable traces which are always above 0; that is, the ME model always outperforms the PPE model). Blue line indicates the average best number of history terms across syllables (including the three syllables in which no. history terms = 0). Bottom, summary plot of number of history terms in best selected model from the top traces. 3/25 models had no PPE models which improved fit to dopamine over the ME model (shown with open circle). b, Top, as in a, but applying ∆MSE as a secondary model comparison metric (see Methods). As in a, the sign and magnitude of ∆MSE indicates the relative superiority of the ME versus PPE model. Negative values indicate that the PPE model outperforms the ME model. The ∆AIC and ∆MSE metrics found the average best number of history terms across syllables to be n = 11 and n = 12, respectively. Bottom, plotted similarly to a for ∆MSE.
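
The ∆AIC comparison can be made concrete with one common form of the Akaike information criterion⁵⁵ for least-squares fits with Gaussian errors, AIC = n ln(RSS/n) + 2k, where RSS is the residual sum of squares and k is the number of fitted parameters. The sketch below assumes this form and assumes both models are fitted to the same n data points; the exact computation used in the paper is described in its Methods, and the function names here are hypothetical.

```python
import math

def aic_least_squares(residuals, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params

def delta_aic(residuals_ppe, k_ppe, residuals_me, k_me):
    """AIC(PPE) minus AIC(ME); negative values favour the PPE model."""
    return (aic_least_squares(residuals_ppe, k_ppe)
            - aic_least_squares(residuals_me, k_me))
```

The 2k term penalises extra history terms, so a PPE model with more parameters yields a negative ∆AIC only if it reduces the residual error enough to offset that penalty.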

Supplementary information

Reporting Summary

Peer Review file

Source data

Source Data Fig. 1

Source Data Fig. 2

Source Data Fig. 3

Source Data Fig. 4

Source Data Extended Data Fig. 8

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kasdin, J., Duffy, A., Nadler, N. et al. Natural behaviour is learned through dopamine-mediated reinforcement. Nature 641, 699–706 (2025). https://doi.org/10.1038/s41586-025-08729-1

Received: 17 June 2024
Accepted: 03 February 2025
Published: 12 March 2025
Version of record: 12 March 2025
Issue date: 15 May 2025
DOI: https://doi.org/10.1038/s41586-025-08729-1
