Natural behaviour is learned through dopamine-mediated reinforcement

An Author Correction to this article was published on 20 February 2026

This article has been updated

Abstract

Many natural motor skills, such as speaking or locomotion, are acquired through a process of trial-and-error learning over the course of development. It has long been hypothesized, motivated by observations in artificial learning experiments, that dopamine has a crucial role in this process. Dopamine in the basal ganglia is thought to guide reward-based trial-and-error learning by encoding reward prediction errors [1], decreasing after worse-than-predicted reward outcomes and increasing after better-than-predicted ones. Our previous work in adult zebra finches, in which we changed the perceived song quality with distorted auditory feedback, showed that dopamine in Area X, the singing-related basal ganglia, encodes performance prediction error: dopamine is suppressed after worse-than-predicted performance (distorted syllables) and activated after better-than-predicted performance (undistorted syllables) [2]. However, it remains unknown whether the learning of natural behaviours, such as developmental vocal learning, occurs through dopamine-based reinforcement. Here we tracked song learning trajectories in juvenile zebra finches and used fibre photometry [3] to monitor concurrent dopamine activity in Area X. We found that dopamine was activated after syllable renditions that were closer to the eventual adult version of the song, compared with recent renditions, and suppressed after renditions that were further away. Furthermore, the relationship between dopamine and song fluctuations revealed that dopamine predicted the future evolution of song, suggesting that dopamine drives behaviour. Finally, dopamine activity was explained by the contrast between the quality of the current rendition and the recent history of renditions, consistent with dopamine’s hypothesized role in encoding prediction errors in an actor–critic reinforcement-learning model [4,5]. Reinforcement-learning algorithms [6] have emerged as a powerful class of model to explain learning in reward-based laboratory tasks, as well as for driving autonomous learning in artificial intelligence [7]. Our results suggest that complex natural behaviours in biological systems can also be acquired through dopamine-mediated reinforcement learning.
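
The contrast described in the abstract, between the quality of the current rendition and the recent history of renditions, can be illustrated with a minimal Python sketch. The function name, the window length and the unweighted mean over the history are illustrative assumptions, not the authors' fitted model; `quality` stands for any scalar score in which larger values mean a rendition closer to the adult song.

```python
from statistics import mean

def performance_prediction_error(quality, window=3):
    """Current quality minus the mean quality of the preceding `window` renditions.

    Hypothetical helper for illustration only. Renditions that do not yet
    have a full history of `window` renditions yield None.
    """
    errors = []
    for i, q in enumerate(quality):
        if i < window:
            errors.append(None)  # not enough history to form a prediction
        else:
            predicted = mean(quality[i - window:i])  # assumed unweighted history
            errors.append(q - predicted)
    return errors

# Made-up scores: a better-than-recent rendition gives a positive error,
# a worse-than-recent rendition gives a negative one.
scores = [1.0, 2.0, 3.0, 5.0, 1.0]
print(performance_prediction_error(scores, window=3))
```

A positive value corresponds to a better-than-predicted rendition and a negative value to a worse-than-predicted one, which is the sign convention the abstract attributes to dopamine activation and suppression.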

Fig. 1: Synchronous recording of developmental song learning and dopamine.
Fig. 2: Dopamine encodes relative syllable quality during song learning.
Fig. 3: Dopamine predicts future song evolution during development.
Fig. 4: Dopamine tracks performance history to generate prediction errors.

Data availability

Further data are available upon request. Source data are provided with this paper.

Code availability

The code written for this study is available upon request.

References

  1. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).

  2. Gadagkar, V. et al. Dopamine neurons encode performance error in singing birds. Science 354, 1278–1282 (2016).

  3. Zhuo, Y. et al. Improved green and red GRAB sensors for monitoring dopaminergic activity in vivo. Nat. Methods 21, 680–691 (2024).

  4. Chen, R. & Goldberg, J. H. Actor–critic reinforcement learning in the songbird. Curr. Opin. Neurobiol. 65, 1–9 (2020).

  5. Joel, D., Niv, Y. & Ruppin, E. Actor–critic models of the basal ganglia: new anatomical and computational perspectives. Neural Netw. 15, 535–547 (2002).

  6. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 1998).

  7. Botvinick, M. et al. Reinforcement learning, fast and slow. Trends Cogn. Sci. 23, 408–422 (2019).

  8. Wickens, J. R., Reynolds, J. N. & Hyland, B. I. Neural mechanisms of reward-related motor learning. Curr. Opin. Neurobiol. 13, 685–690 (2003).

  9. Costa, R. M. Plastic corticostriatal circuits for action learning: what’s dopamine got to do with it? Ann. N. Y. Acad. Sci. 1104, 172–191 (2007).

  10. Jarvis, E. Vocal learning and spoken language. Science 366, 50–54 (2019).

  11. Davenport, M. H. & Jarvis, E. D. Birdsong neuroscience and the evolutionary substrates of learned vocalization. Trends Neurosci. 46, 97–99 (2023).

  12. Konopka, G. & Roberts, T. F. Insights into the neural and genetic basis of vocal communication. Cell 164, 1269–1276 (2016).

  13. Prather, J., Okanoya, K. & Bolhuis, J. J. Brains for birds and babies: neural parallels between birdsong and speech acquisition. Neurosci. Biobehav. Rev. 81, 225–237 (2017).

  14. Doupe, A. J. & Kuhl, P. K. Birdsong and human speech: common themes and mechanisms. Annu. Rev. Neurosci. 22, 567–631 (1999).

  15. Brainard, M. S. & Doupe, A. J. Translating birdsong: songbirds as a model for basic and applied medical research. Annu. Rev. Neurosci. 36, 489–517 (2013).

  16. Burke, J. E. & Schmidt, M. F. Neural control of birdsong. eLS 1, 345–355 (2020).

  17. Person, A. L., Gale, S. D., Farries, M. A. & Perkel, D. J. Organization of the songbird basal ganglia, including area X. J. Comp. Neurol. 508, 840–866 (2008).

  18. Lovell, P. V. et al. ZEBrA: Zebra finch Expression Brain Atlas—a resource for comparative molecular neuroanatomy and brain evolution studies. J. Comp. Neurol. 528, 2099–2131 (2020).

  19. Tumer, E. C. & Brainard, M. S. Performance variability enables adaptive plasticity of ‘crystallized’ adult birdsong. Nature 450, 1240–1244 (2007).

  20. Andalman, A. S. & Fee, M. S. A basal ganglia–forebrain circuit in the songbird biases motor output to avoid vocal errors. Proc. Natl Acad. Sci. USA 106, 12518–12523 (2009).

  21. Duffy, A., Latimer, K. W., Goldberg, J. H., Fairhall, A. L. & Gadagkar, V. Dopamine neurons evaluate natural fluctuations in performance quality. Cell Rep. 38, 110574 (2022).

  22. Roeser, A. et al. Dopaminergic error signals retune to social feedback during courtship. Nature 623, 375–380 (2023).

  23. Hisey, E., Kearney, M. G. & Mooney, R. A common neural circuit mechanism for internally guided and externally reinforced forms of motor learning. Nat. Neurosci. 21, 589–597 (2018).

  24. Xiao, L. et al. A basal ganglia circuit sufficient to guide birdsong learning. Neuron 98, 208–221 (2018).

  25. Hoffmann, L. A., Saravanan, V., Wood, A. N., He, L. & Sober, S. J. Dopaminergic contributions to vocal learning. J. Neurosci. 36, 2176–2189 (2016).

  26. Fee, M. S. & Goldberg, J. H. A hypothesis for basal ganglia-dependent reinforcement learning in the songbird. Neuroscience 198, 152–170 (2011).

  27. Mackevicius, E. L. & Fee, M. S. Building a state space for song learning. Curr. Opin. Neurobiol. 49, 59–68 (2018).

  28. Doya, K. & Sejnowski, T. A novel reinforcement model of birdsong vocalization learning. In Adv. Neural Information Processing Systems 7 (NIPS 7) (eds Tesauro, G. et al.) 101–108 (MIT Press, 1995).

  29. Tchernichovski, O., Mitra, P. P., Lints, T. & Nottebohm, F. Dynamics of the vocal imitation process: how a zebra finch learns its song. Science 291, 2564–2569 (2001).

  30. Kollmorgen, S., Hahnloser, R. H. R. & Mante, V. Nearest neighbours reveal fast and slow components of motor learning. Nature 577, 526–530 (2020).

  31. Funabiki, Y. & Konishi, M. Long memory in song learning by zebra finches. J. Neurosci. 23, 6928–6935 (2003).

  32. Steinfath, E., Palacios-Munoz, A., Rottschafer, J. R., Yuezak, D. & Clemens, J. Fast and accurate annotation of acoustic signals with deep neural networks. eLife 10, e68837 (2021).

  33. Lerner, T. N., Holloway, A. L. & Seiler, J. L. Dopamine, updated: reward prediction error and beyond. Curr. Opin. Neurobiol. 67, 123–130 (2021).

  34. Toutounji, H., Zai, A. T., Tchernichovski, O., Hahnloser, R. H. R. & Lipkind, D. Learning the sound inventory of a complex vocal skill via an intrinsic reward. Sci. Adv. 10, eadj3824 (2024).

  35. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005).

  36. Adam, I. et al. Daily vocal exercise is necessary for peak performance singing in a songbird. Nat. Commun. 14, 7787 (2023).

  37. Fiete, I. R., Fee, M. S. & Seung, H. S. Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances. J. Neurophysiol. 98, 2038–2057 (2007).

  38. Ikeda, M. Z., Trusel, M. & Roberts, T. F. Memory circuits for vocal imitation. Curr. Opin. Neurobiol. 60, 37–46 (2019).

  39. Louder, M. I. M. et al. Transient sensorimotor projections in the developmental song learning period. Cell Rep. 43, 114196 (2024).

  40. Tian, J. et al. Distributed and mixed information in monosynaptic inputs to dopamine neurons. Neuron 91, 1374–1389 (2016).

  41. Watabe-Uchida, M., Eshel, N. & Uchida, N. Neural circuitry of reward prediction error. Annu. Rev. Neurosci. 40, 373–394 (2017).

  42. Chen, R. et al. Songbird ventral pallidum sends diverse performance error signals to dopaminergic midbrain. Neuron 103, 266–276 (2019).

  43. Kearney, M. G., Warren, T. L., Hisey, E., Qi, J. & Mooney, R. Discrete evaluative and premotor circuits enable vocal learning in songbirds. Neuron 104, 559–575 (2019).

  44. Bottjer, S. W., Brady, J. D. & Cribbs, B. Connections of a motor cortical region in zebra finches: relation to pathways for vocal learning. J. Comp. Neurol. 420, 244–260 (2000).

  45. Mandelblat-Cerf, Y., Las, L., Denisenko, N. & Fee, M. S. A role for descending auditory cortical projections in songbird vocal learning. eLife 3, e02152 (2014).

  46. Schrittwieser, J. et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588, 604–609 (2020).

  47. Fawzi, A. et al. Discovering faster matrix multiplication algorithms with reinforcement learning. Nature 610, 47–53 (2022).

  48. Markowitz, J. E. et al. Spontaneous behaviour is structured by reinforcement without explicit reward. Nature 614, 108–117 (2023).

  49. Colquitt, B. M., Merullo, D. P., Konopka, G., Roberts, T. F. & Brainard, M. S. Cellular transcriptomics reveals evolutionary identities of songbird vocal circuits. Science 371, eabd9704 (2021).

  50. Pfenning, A. R. et al. Convergent transcriptional specializations in the brains of humans and song-learning birds. Science 346, 1256846 (2014).

  51. Tchernichovski, O., Nottebohm, F., Ho, C. E., Pesaran, B. & Mitra, P. P. A procedure for an automated measurement of song similarity. Anim. Behav. 59, 1167–1176 (2000).

  52. Immelman, K. in Bird Vocalizations (ed. Hinde, R. A.) 64–74 (Cambridge Univ. Press, 1969).

  53. Krzanowski, W. J. Principles of Multivariate Analysis: A User’s Perspective (Oxford Univ. Press, 1988).

  54. Seber, G. A. F. Multivariate Observations (Wiley, 1984).

  55. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723 (1974).

  56. Ljung, L. System Identification: Theory for the User (Prentice Hall, 1999).

  57. Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979).

Acknowledgements

We thank L. Abbott, D. Aronov, M. Churchland, A. Litwin-Kumar, N. Sawtell, S. Siegelbaum and members of the V.G. laboratory for comments and suggestions; M. Eswaran and A. Sahilu for technical assistance; and K. J. Miller for analysis suggestions and feedback on the manuscript. Imaging was performed with support from the Zuckerman Institute’s Cellular Imaging platform for instrument use and technical advice. We thank the Zebra Finch Expression Brain Atlas website (http://www.zebrafinchatlas.org) for histological reference images. Funding support was provided to A.L.F. and A.D. by the Simons Collaboration on the Global Brain, and to V.G. by the NIH (R00NS102520 and DP2AT012347) and the Searle, Klingenstein–Simons and McKnight scholars programs.

Author information

Contributions

J.K. and V.G. conceived the study, and designed and performed the experiments. J.K., A.D., N.N., A.R., A.L.F., K.L.S. and V.G. performed data analysis. A.D., A.R., A.L.F. and K.L.S. performed data modelling. J.K., A.D., N.N., A.R. and V.G. wrote the original draft of the manuscript. J.K., A.D., N.N., A.R., A.L.F., K.L.S. and V.G. edited and reviewed the final manuscript. A.L.F. and V.G. acquired funding. A.L.F., K.L.S. and V.G. supervised the project.

Corresponding author

Correspondence to Vikram Gadagkar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Ofer Tchernichovski and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Histological verification of fibre placement in Area X.

a, Schematic from the zebra finch atlas. Adapted from ref. 18 (Wiley). Orange dashed lines indicate the intended placement of the cannula in Area X. b, Example brain slice showing fibre placement (purple dashed circle, Area X; orange dashed lines, fibre placement).

Extended Data Fig. 2 Singing-related dopamine activity.

a, Dopamine responses aligned to singing bout boundaries (see Methods) for an example bird across days 61–100 of development. Example spectrograms and dopamine responses aligned to bout onsets (n = 1912, top) and bout offsets (n = 7745, bottom), plotted above average ΔF/F signals (purple, bout onsets; orange, bout offsets). b, ΔF/F signals plotted similarly to a but each row represents the average across one of n = 6 birds (shading, ±s.e.m.). c,d, Mean subtraction of singing-related dopamine activity to isolate the error response (see Methods). Top, ΔF/F signals for an example syllable (c) and averaged across all (n = 25) syllables (d) for the closest 10% relative distance (blue), furthest 10% relative distance (red) and mean (black) across all renditions (shading, ±s.e.m.). Bottom, ΔF/F signals for the furthest and closest relative distance plotted as in the top panels but after mean subtraction. e,f, ΔF/F signals plotted as in c,d with the dopamine response to the middle 10% of trials with 0 average relative distance (purple, equidistant: neither closer nor further). g, Scatter plot of averaged ΔF/F signals for all syllables (Methods) for closer (blue), further (red) and equidistant (purple) renditions (*P < 0.05, n.s. P > 0.05, not significantly different from zero, one-sample t-test; black bars, mean ± s.d.).

Extended Data Fig. 3 Rendition-to-rendition variability in dopamine responses.

a, Spectrogram of an example eight-second section of song plotted above the corresponding ΔF/F signal. b, Single rendition dopamine responses for an example syllable on a single day (day 67) with the closest 10% (top) and furthest 10% (bottom) relative distance plotted above average ΔF/F signals (blue, closer renditions; red, further renditions; shading, ±s.d.). c, Dopamine responses (see Methods) for the same data shown in b (bar and error bar, mean ± s.d.; *P < 0.05, two-sided t-test). d, Dopamine responses plotted similarly to c but averaged across development (days 61–100) for all n = 25 syllables, ordered by the difference between closer and further response (*P < 0.05, n.s. P > 0.05, not significant, two-sided t-test with Holm–Bonferroni correction [57]).
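
The Holm–Bonferroni correction cited in panel d (ref. 57) is a sequentially rejective procedure. A minimal Python sketch of the standard step-down rule follows; the function name and the example P values are made up for illustration and are not taken from the paper's data.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Standard Holm step-down rule: sort the P values in ascending order and
    compare the k-th smallest (k = 1..m) with alpha / (m - k + 1), stopping
    at the first comparison that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank = k - 1
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger P value is retained as well
    return reject

# Hypothetical P values for four syllables.
print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

Because the threshold relaxes as hypotheses are rejected, this procedure is never less powerful than a plain Bonferroni correction at the same family-wise error rate.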

Extended Data Fig. 4 Variability in dopamine responses across syllables.

a, Syllable-averaged dopamine responses (reproduced from Fig. 2e, see Methods) across development (days 61–100) with the closest 10% relative distance (top) and furthest 10% relative distance (bottom) plotted above average ΔF/F signals (blue, closer renditions; red, further renditions; shading, ±s.e.m.) sorted from weakest to strongest dopamine response. b, Example spectrograms for the three syllables with the strongest dopamine response (top) and the three syllables with the weakest dopamine response (bottom), from early in development (day ≤ 76) and during adulthood (day > 90). Number indicates ranking (1, weakest; 25, strongest). c, Average over all syllables (n = 25) of the mean distance from all renditions on a given day to the adult (age > 90 days) median (grey shading, ±s.e.m.). d, Correlation between magnitude of dopamine response and syllable duration (black dots, 25 syllables; black line, best fit; n.s. P > 0.05, no significant correlation, Pearson linear correlation coefficient). e–g, Plotted as in d but for average s.d. per day of distance to adult version across development (e), per hour change in distance averaged across each day (f) and mean change in distance per day across development (g).

Extended Data Fig. 5 Circadian oscillations have a subtle influence on the magnitude of the dopamine response.

a, Distance to adult median averaged across all syllables (n = 25) and over development (days 61–100) decreases over the course of a day (line and shading, mean across syllables ± s.e.m.). b, Average dopamine response across syllable renditions (see Methods), plotted similarly to a. c, The percentage of renditions with closest 10% (blue) and furthest 10% (red) relative distance during a day that were sung in each hour of the day (line and shading, mean across syllables ± s.e.m.). d, Dopamine responses to the closest 10% (blue) and furthest 10% (red) relative distance renditions sung in the morning (first 3 hours of the day, solid lines, n = 51560 total renditions across birds) and afternoon (latter 9 hours of the day, dotted lines, n = 51465 total renditions across birds) averaged across n = 25 syllables across development (days 61–100). e, Scatter plot of averaged ΔF/F signals for the closest 10% relative distance sung in the morning (left) and afternoon (right) for all syllables (n.s. P > 0.05, not significant, paired t-test; black bars, mean ± s.d.). f, Plotted similarly to e but for furthest 10% relative distance (*P < 0.05, significant, paired t-test).

Extended Data Fig. 6 Variation in syllable amplitude does not account for the dopamine response.

a, Mean amplitude across all renditions on each day of development (days 61–100) for an example syllable, normalized to the average adult (day > 90) amplitude of that syllable. b, Plotted similarly to a but averaged across all (n = 25) syllables (shading, ±s.e.m.). c, Correlation of the ΔF/F signal with syllable amplitude (see Methods) across development (days 61–100) for an example syllable (black line, best fit; n.s., P > 0.05, not significant, Pearson linear correlation coefficient). d, Pearson linear correlation coefficient between syllable amplitude and the ΔF/F signal for all syllables (open circles, P > 0.05, not significant; filled circles, P < 0.05, Pearson linear correlation coefficient with Holm–Bonferroni correction [57]). e,f, Syllable-averaged (n = 25) ΔF/F signals for 10% closest (blue) and 10% furthest (red) relative distance before (e, reproduced from Fig. 2e) and after (f) normalization (see Methods) by syllable amplitude (shading, ±s.e.m.). g,h, Scatter plot of averaged ΔF/F signals for all n = 25 syllables for closer (g) and further (h) relative distance comparing not normalized (left) and amplitude-normalized (right) renditions (n.s. P > 0.05, not significant, paired t-test; black bars, mean ± s.d.).

Extended Data Fig. 7 Dopamine predicts future song evolution.

a, Movement along the dopamine vector is not sensitive to song-block size (see Methods) and the effect is consistently significant at a block size of 10 or more renditions. Replication of analysis shown in Fig. 3f over a range of song-block sizes. Movement along the dopamine vector averaged across all n = 25 syllables (blue trace and shading, mean ± s.e.m.; black line and grey shading, mean ± 1.96 s.d. of shuffled data). b, Six example syllables plotted as in Fig. 3d. Average across all dopamine vectors for the example syllable across all focal groups of 150 syllable renditions. c, Extension of analysis shown in Fig. 3f (see Methods) for both forward (blue) and backward (red) in rendition steps relative to the focal song block, plotted similarly to a (black bar and star, P < 0.05, paired t-test, region in which the difference in absolute magnitudes between the movement along the dopamine vector is significantly greater in the future than in the past).

Extended Data Fig. 8 Dopamine correlates more with the future than with the past at points of abrupt change in song.

a, Schematic of analysis. Three consecutive, non-overlapping song blocks are selected: past, current and future. Past and future vectors are computed from the difference in median song locations in each song block. Song renditions in the current song block are projected onto the past and future vectors, and the correlation with dopamine in the current song block is computed along these dimensions. This is repeated at steps of 25 renditions across all song development (see Methods). Local song-block triplets are divided into two categories depending on the local geometry of the song trajectory: ‘hinge’ (θ < π/2) trajectories and ‘arrow’ (θ > π/2) trajectories (θ is defined between 0 and π). b, Distribution of angles between the future vector and the negative of the past vector across all sampled triplets of song blocks along song development across n = 25 syllables (block size, 150 renditions). c, Average correlation coefficient between current song renditions projected onto the future and past song vectors and current dopamine transients (points and error bars, mean ± s.e.m. over syllables; blue/pink stars, P < 0.05, two-sided t-test, correlations between dopamine and past/future song vectors are significantly different from zero; black circle, P < 0.05, paired t-test, correlations between dopamine and past/future song vectors are significantly different). d,e, Plotted as in c, but across multiple song-block sizes for ‘hinge’ (d) and ‘arrow’ (e) trajectories. f, Example syllable spectrograms showing the emergence of an element present in the tutor song but not in early renditions during development. Left, syllable renditions produced on a single day during development (day = 61) were divided into ‘Immature’ versions (without the tutor element, purple box) and ‘Mature’ versions (with the tutor element). Right, example spectrograms of the adult version (day = 100) and corresponding tutor syllable. g, Dopamine responses for the example syllable shown in f on a single day during development (day = 61) for the ‘Mature’ (n = 423, top) and ‘Immature’ (n = 402, bottom) versions plotted above average ΔF/F signals (all plots aligned to syllable onset; blue, ‘Mature’ renditions; red, ‘Immature’ renditions; P < 0.05, two-sided t-test, significant difference between dopamine responses for ‘Immature’ and ‘Mature’ renditions). h,i, A second example syllable plotted as in f,g using day = 63 during development (P < 0.05, two-sided t-test, significant difference between dopamine responses for ‘Immature’ (n = 201) and ‘Mature’ (n = 137) renditions). j,k, A third example syllable plotted as in f,g using day = 61 during development (P < 0.05, two-sided t-test, significant difference between dopamine responses for ‘Immature’ (n = 240) and ‘Mature’ (n = 310) renditions).
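
The triplet analysis in panel a can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, song renditions are taken to be points in a low-dimensional feature space, the past vector is assumed to run from the past-block median to the current-block median, and the future vector from the current-block median to the future-block median. The angle θ is measured between the future vector and the negative of the past vector, as in panel b.

```python
import math
from statistics import median

def block_median(block):
    """Coordinate-wise median of a block of renditions (equal-length tuples)."""
    return [median(coords) for coords in zip(*block)]

def subtract(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def analyse_triplet(past, current, future, dopamine):
    """Return (theta, geometry, r_past, r_future) for one song-block triplet."""
    m_past, m_cur, m_fut = (block_median(b) for b in (past, current, future))
    past_vec = subtract(m_cur, m_past)    # assumed direction: past -> current
    future_vec = subtract(m_fut, m_cur)   # assumed direction: current -> future
    neg_past = [-v for v in past_vec]
    cos_theta = dot(future_vec, neg_past) / (norm(future_vec) * norm(neg_past))
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp rounding error
    geometry = "hinge" if theta < math.pi / 2 else "arrow"
    # Project each current rendition (relative to the block median) onto each axis.
    centred = [subtract(r, m_cur) for r in current]
    proj_past = [dot(c, past_vec) / norm(past_vec) for c in centred]
    proj_future = [dot(c, future_vec) / norm(future_vec) for c in centred]
    return theta, geometry, pearson(proj_past, dopamine), pearson(proj_future, dopamine)

# Toy two-dimensional example with made-up values: the song keeps moving in
# the same direction, so theta is close to pi and the triplet is an 'arrow'.
past = [(0.0, 0.0)] * 3
current = [(1.0, 0.0), (1.5, 0.0), (0.5, 0.0)]
future = [(2.0, 0.0)] * 3
dopamine = [0.0, 1.0, -1.0]
print(analyse_triplet(past, current, future, dopamine))
```

Under these conventions a straight trajectory gives θ near π (‘arrow’), whereas a sharp change in direction gives a small θ (‘hinge’), which is the case in which the past and future axes can be distinguished.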

Extended Data Fig. 9 Dopamine tracks performance history in individual syllables.

a, Magnified view of the regression coefficients over all development for the current rendition (0) and the previous two renditions (−1, −2) for all n = 25 syllables plotted as in Fig. 4d (error bars, mean ± s.e.m.; *P < 0.001, two-sided t-test; see Methods). b, Fraction of performance history coefficients in linear regression fit to dopamine in Fig. 4b that are less than 0 compared to the shuffled distribution (see Methods). Coefficients are consistently negative above chance (P < 0.05, computed from shuffled distribution, see Methods). c,d, Linear regression fit of performance to dopamine for early (c) and late (d) development, plotted as in Fig. 4d. e, Plotted as in Fig. 4c. Coefficients of a linear regression fit of the song history to the dopamine signal for example syllables fit across development (blue trace, average over days; error bars, ±s.e.m.).
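
A linear regression of dopamine onto the current rendition and a short performance history, as summarized in this figure, can be sketched in plain Python. The function names, the ordinary least-squares fit via normal equations and the toy data are assumptions made for illustration; the paper's exact regressors and fitting procedure are described in its Methods and are not reproduced here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_history_regression(quality, dopamine, n_history):
    """Least-squares fit of dopamine[t] on quality[t], quality[t-1], ...

    Returns [intercept, w_0, w_1, ..., w_n_history], where w_k multiplies
    the quality of the rendition k steps in the past (k = 0 is the current one).
    """
    rows, targets = [], []
    for t in range(n_history, len(quality)):
        rows.append([1.0] + [quality[t - k] for k in range(n_history + 1)])
        targets.append(dopamine[t])
    p = n_history + 2
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Toy data: dopamine is built as current quality minus half the previous
# quality, so the fit should recover a positive weight on the current
# rendition and a negative weight on the history term.
quality = [0.3, 1.2, 0.7, 2.1, 1.5, 0.2, 1.9, 0.8, 1.1, 2.4]
dopamine = [0.0] + [quality[t] - 0.5 * quality[t - 1] for t in range(1, len(quality))]
print(fit_history_regression(quality, dopamine, n_history=1))
```

A positive coefficient on the current rendition combined with negative coefficients on past renditions is the pattern expected if the signal reflects a contrast between current and recent performance.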

Extended Data Fig. 10 Dopamine transients represent prediction errors better than mismatch errors.

a, Top, comparison of the performance prediction error (PPE) and mismatch error (ME) models with different numbers of performance history terms (see Methods). Each trace is a single syllable model comparison (blue dot, the minimum, negative difference in Akaike information criterion, ∆AIC, at which the PPE model fits best relative to the ME model). The magnitude and sign of ∆AIC indicate the relative superiority of one model over another. 25/25 syllables were included in the ME/PPE comparison because all syllables had at least one ME or PPE linear regression onto dopamine with a significant R² value (computed using the fitlm function in MATLAB). 22/25 syllable traces have a negative minimum (blue dot excluded from the three syllable traces which are always above 0; that is, the ME model always outperforms the PPE model). Blue line indicates the average best number of history terms across syllables (including the three syllables in which no. history terms = 0). Bottom, summary plot of number of history terms in best selected model from the top traces. For 3/25 syllables, no PPE model improved the fit to dopamine over the ME model (shown with open circle). b, Top, as in a, but applying ∆MSE as a secondary model comparison metric (see Methods). As in a, the sign and magnitude of ∆MSE indicate the relative superiority of the ME versus PPE model. Negative values indicate that the PPE model outperforms the ME model. The ∆AIC and ∆MSE metrics found the average best number of history terms across syllables to be n = 11 and n = 12, respectively. Bottom, plotted similarly to a for ∆MSE.
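
The ∆AIC comparison in panel a can be illustrated with a short Python sketch. It assumes the common least-squares form of the Akaike information criterion, AIC = n ln(RSS/n) + 2k, which holds up to an additive constant for Gaussian residuals; whether the authors used exactly this form is not stated in this section. The function names, the parameter counts and the residual sums of squares are hypothetical.

```python
import math

def aic_least_squares(rss, n_obs, n_params):
    """AIC for a least-squares fit with Gaussian residuals, up to a constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

def best_history_length(rss_ppe_by_terms, rss_me, n_obs):
    """Compare PPE fits with different history lengths against one ME fit.

    rss_ppe_by_terms maps the number of history terms to the residual sum of
    squares of that PPE fit. Assumed parameter counts: intercept plus current
    rendition for ME (2), plus one weight per history term for PPE.
    Returns (best number of history terms, dict of delta-AIC values), where
    delta-AIC = AIC(PPE) - AIC(ME), so negative values favour the PPE model.
    """
    aic_me = aic_least_squares(rss_me, n_obs, 2)
    delta = {
        k: aic_least_squares(rss, n_obs, k + 2) - aic_me
        for k, rss in rss_ppe_by_terms.items()
    }
    best_k = min(delta, key=delta.get)
    return best_k, delta

# Made-up residuals for one syllable with 100 renditions.
best_k, delta = best_history_length({1: 48.0, 5: 40.0, 10: 39.5}, rss_me=50.0, n_obs=100)
print(best_k, delta)
```

The penalty term 2k is what keeps the comparison from always favouring longer histories: adding terms lowers the residual error, but the minimum of ∆AIC marks the point beyond which the improvement no longer justifies the extra parameters.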

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kasdin, J., Duffy, A., Nadler, N. et al. Natural behaviour is learned through dopamine-mediated reinforcement. Nature 641, 699–706 (2025). https://doi.org/10.1038/s41586-025-08729-1

  • DOI: https://doi.org/10.1038/s41586-025-08729-1
