
  • Article

Compositional pretraining improves computational efficiency and matches animal behaviour on complex tasks

A preprint version of the article is available at bioRxiv.

Abstract

Recurrent neural networks (RNNs) are ubiquitously used in neuroscience to capture both neural dynamics and behaviours of living systems. However, when it comes to complex cognitive tasks, training RNNs with traditional methods can prove difficult and fall short of capturing crucial aspects of animal behaviour. Here we propose a principled approach for identifying and incorporating compositional tasks as part of RNN training. Taking as the target a temporal wagering task previously studied in rats, we design a pretraining curriculum of simpler cognitive tasks that reflect relevant subcomputations, which we term ‘kindergarten curriculum learning’. We show that this pretraining substantially improves learning efficacy and is critical for RNNs to adopt strategies similar to those of rats, including long-timescale inference of latent states, which conventional pretraining approaches fail to capture. Mechanistically, our pretraining supports the development of slow dynamical systems features needed for implementing both inference and value-based decision making. Overall, our approach helps endow RNNs with relevant inductive biases, which is important when modelling complex behaviours that rely on multiple cognitive functions.
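The core idea of kindergarten curriculum learning — training to criterion on simple component tasks before the full task, so that earlier skills accelerate later learning — can be illustrated with a toy sketch. The task names, criterion, and agent below are hypothetical stand-ins for illustration only, not the paper's actual meta-RL setup (LSTM agents trained on the temporal wagering task):

```python
# Toy sketch of a "kindergarten" curriculum loop: master each simple task
# before advancing to the complex target task. All names are hypothetical.

# Hypothetical subtasks ordered from simple to complex, ending with the
# full target task.
CURRICULUM = ["remember_reward", "compare_values", "wait_for_delay",
              "temporal_wagering"]

class ToyAgent:
    """Stand-in for an RNN agent: per-task skill improves with training."""
    def __init__(self):
        self.skill = {task: 0.0 for task in CURRICULUM}

    def train_step(self, task):
        # Learning is faster when earlier (simpler) tasks are already
        # mastered -- a cartoon of the transfer effect the curriculum
        # is meant to exploit.
        prior = [self.skill[t] for t in CURRICULUM[:CURRICULUM.index(task)]]
        boost = 1.0 + sum(prior) / max(len(prior), 1)
        self.skill[task] = min(1.0, self.skill[task] + 0.05 * boost)

def run_curriculum(agent, criterion=0.9, max_steps=1000):
    """Train each task to criterion before advancing; return steps per task."""
    steps = {}
    for task in CURRICULUM:
        n = 0
        while agent.skill[task] < criterion and n < max_steps:
            agent.train_step(task)
            n += 1
        steps[task] = n
    return steps

steps = run_curriculum(ToyAgent())
print(steps)  # earlier tasks take the most steps; later ones benefit
```

In this cartoon, the first task takes the most training steps and each later task converges faster because mastery of the earlier tasks boosts its learning rate — the qualitative pattern, not the mechanism, that compositional pretraining is intended to produce.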

Fig. 1: Modelling the animal’s learning experience.
Fig. 2: Rodent behaviour in temporal wagering task.
Fig. 3: Deep meta-RL agents can learn the temporal wagering task.
Fig. 4: Performance of pretraining methods on the temporal wagering task.
Fig. 5: Variations of kCL.
Fig. 6: Dynamical systems features of kCL-trained networks.

Data availability

The rat behavioural data and the statistical model of behaviour are detailed in ref. 19 and are available via Zenodo at https://doi.org/10.5281/zenodo.10031483 (ref. 45). The data used to generate figures in this manuscript, as well as a repository of pretrained RNN model files for different curriculum learning strategies, are available via Zenodo at https://doi.org/10.5281/zenodo.14907819 (ref. 46).

Code availability

The data were analysed with code written in Python (v3.9.5, PyTorch v1.8.0) and MATLAB (R2023b). The code used to train RNNs, analyse data and generate figures is available on GitHub at https://github.com/Savin-Lab-Code/kind_cl (ref. 47) and as a Code Ocean capsule (ref. 48).

References

  1. Yang, G. R. & Wang, X.-J. Artificial neural networks for neuroscientists: a primer. Neuron 107, 1048–1070 (2020).

  2. Krueger, K. A. & Dayan, P. Flexible shaping: How learning in small steps helps. Cognition 110, 380–394 (2009).

  3. Narvekar, S. et al. Curriculum learning for reinforcement learning domains: a framework and survey. J. Mach. Learn. Res. 21, 181 (2020).

  4. Bengio, Y., Louradour, J., Collobert, R. & Weston, J. Curriculum learning. In Proc. 26th Annual Int. Conference on Machine Learning 41–48 (Association for Computing Machinery, 2009).

  5. Finn, C., Abbeel, P. & Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Int. Conference on Machine Learning 1126–1135 (PMLR, 2017).

  6. Thrun, S. & Pratt, L. in Learning to Learn (eds Thrun, S. & Pratt, L.) 3–17 (Springer, 1998).

  7. Harlow, H. F. The formation of learning sets. Psychol. Rev. 56, 51–65 (1949).

  8. Skinner, B. F. How to teach animals. Sci. Am. 185, 26–29 (1951).

  9. Savin, C. & Triesch, J. Emergence of task-dependent representations in working memory circuits. Front. Comput. Neurosci. 8, 57 (2014).

  10. McAndrew, R. & Helms Tillery, S. I. Laboratory primates: their lives in and after research. Temperature https://doi.org/10.1080/23328940.2016.1229161 (2016).

  11. Chowdhury, S. A. & DeAngelis, G. C. Fine discrimination training alters the causal contribution of macaque area MT to depth perception. Neuron 60, 367–377 (2008).

  12. Arlt, C. et al. Cognitive experience alters cortical involvement in goal-directed navigation. eLife 11, e76051 (2022).

  13. Wang, J. X. Meta-learning in natural and artificial intelligence. Curr. Opin. Behav. Sci. 38, 90–95 (2021).

  14. Vanderschuren, L. J., Achterberg, E. M. & Trezza, V. The neurobiology of social play and its rewarding value in rats. Neurosci. Biobehav. Rev. 70, 86–105 (2016).

  15. Vanderschuren, L. J. & Trezza, V. What the laboratory rat has taught us about social play behavior: role in behavioral development and neural mechanisms. In The Neurobiology of Childhood (eds Andersen, S. L. & Pine, D. S.) 189–212 (Springer, 2014).

  16. Baarendse, P. J., Limpens, J. H. & Vanderschuren, L. J. Disrupted social development enhances the motivation for cocaine in rats. Psychopharmacology 231, 1695–1704 (2014).

  17. Einon, D. F. & Morgan, M. A critical period for social isolation in the rat. Dev. Psychobiol. J. Int. Soc. Dev. Psychobiol. 10, 123–132 (1977).

  18. Zador, A. M. A critique of pure learning and what artificial neural networks can learn from animal brains. Nat. Commun. 10, 3770 (2019).

  19. Mah, A., Schiereck, S. S., Bossio, V. & Constantinople, C. M. Distinct value computations support rapid sequential decisions. Nat. Commun. 14, 7573 (2023).

  20. Schiereck, S. S. et al. Neural dynamics in the orbitofrontal cortex reveal cognitive strategies. Preprint at bioRxiv https://doi.org/10.1101/2024.10.29.620879 (2024).

  21. Constantino, S. M. & Daw, N. D. Learning the opportunity cost of time in a patch-foraging task. Cogn. Affect. Behav. Neurosci. 15, 837–853 (2015).

  22. Charnov, E. L. Optimal foraging, the marginal value theorem. Theor. Popul. Biol. 9, 129–136 (1976).

  23. McNamara, J. M. & Houston, A. I. Optimal foraging and learning. J. Theor. Biol. 117, 231–249 (1985).

  24. Wang, J. X. et al. Prefrontal cortex as a meta-reinforcement learning system. Nat. Neurosci. 21, 860–868 (2018).

  25. Wilson, R. C., Takahashi, Y. K., Schoenbaum, G. & Niv, Y. Orbitofrontal cortex as a cognitive map of task space. Neuron 81, 267–279 (2014).

  26. Schuck, N. W., Cai, M. B., Wilson, R. C. & Niv, Y. Human orbitofrontal cortex represents a cognitive map of state space. Neuron 91, 1402–1412 (2016).

  27. Bartolo, R. & Averbeck, B. B. Inference as a fundamental process in behavior. Curr. Opin. Behav. Sci. 38, 8–13 (2021).

  28. Averbeck, B. & O’Doherty, J. P. Reinforcement-learning in fronto-striatal circuits. Neuropsychopharmacology 47, 147–162 (2022).

  29. Miyashita, Y. & Chang, H. S. Neuronal correlate of pictorial short-term memory in the primate temporal cortex. Nature 331, 68–70 (1988).

  30. Dubreuil, A., Valente, A., Beiran, M., Mastrogiuseppe, F. & Ostojic, S. The role of population structure in computations through neural dynamics. Nat. Neurosci. 25, 783–794 (2022).

  31. Khona, M. & Fiete, I. R. Attractor and integrator networks in the brain. Nat. Rev. Neurosci. 23, 744–766 (2022).

  32. Marschall, O. & Savin, C. Probing learning through the lens of changes in circuit dynamics. Preprint at bioRxiv https://doi.org/10.1101/2023.09.13.557585 (2023).

  33. Sussillo, D. & Barak, O. Opening the black box: low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Comput. 25, 626–649 (2013).

  34. Elman, J. L. Learning and development in neural networks: the importance of starting small. Cognition 48, 71–99 (1993).

  35. Kepple, D., Engelken, R. & Rajan, K. Curriculum learning as a tool to uncover learning principles in the brain. In Int. Conference on Learning Representations (2022).

  36. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).

  37. Jensen, K. T., Hennequin, G. & Mattar, M. G. A recurrent network model of planning explains hippocampal replay and human behavior. Nat. Neurosci. 27, 1340–1348 (2024).

  38. Braun, D. A., Mehring, C. & Wolpert, D. M. Structure learning in action. Behav. Brain Res. 206, 157–165 (2010).

  39. Makino, H. Arithmetic value representation for hierarchical behavior composition. Nat. Neurosci. 26, 140–149 (2023).

  40. Driscoll, L. N., Shenoy, K. & Sussillo, D. Flexible multitask computation in recurrent networks utilizes shared dynamical motifs. Nat. Neurosci. 27, 1349–1363 (2024).

  41. Gupta, D., DePasquale, B., Kopec, C. D. & Brody, C. D. Trial-history biases in evidence accumulation can give rise to apparent lapses in decision-making. Nat. Commun. 15, 662 (2024).

  42. Richards, B. A. et al. A deep learning framework for neuroscience. Nat. Neurosci. 22, 1761–1770 (2019).

  43. Ma, W. J. & Peters, B. A neural network walks into a lab: towards using deep nets as models for human behavior. Preprint at https://arxiv.org/abs/2005.02181 (2020).

  44. Goldman, M. S. Memory without feedback in a neural network. Neuron 61, 621–634 (2009).

  45. Mah, A., Schiereck, S., Bossio, V. & Constantinople, C. Distinct value computations support rapid sequential decisions (version v1). Zenodo https://doi.org/10.5281/zenodo.10031483 (2023).

  46. Hocker, D., Constantinople, C. M. & Savin, C. Composition of simple computational tasks captures the inductive biases of animals in network models (version v1). Zenodo https://doi.org/10.5281/zenodo.14907819 (2025).

  47. Hocker, D., Constantinople, C. M. & Savin, C. Savin-Lab-Code/kind_cl: Nature Machine Intelligence code (v1.0.0). Zenodo https://doi.org/10.5281/zenodo.14907734 (2025).

  48. Hocker, D., Constantinople, C. M. & Savin, C. Compositional pretraining improves computational efficiency and matches animal behavior on complex tasks. Code Ocean Capsule https://doi.org/10.24433/CO.3440797.v1 (2025).

  49. Arnold, T. B. & Emerson, J. W. Nonparametric goodness-of-fit tests for discrete null distributions. R J. 3, 34–39 (2011).

Acknowledgements

We thank L. Driscoll, P. Glimcher, V. Goudar, O. Marschall, K. Miller and J. Wang for helpful discussions and comments on the manuscript. We also thank M. Chittireddy for fixed-point characterization of shaping-trained RNNs. This work was supported by the NIMH (grant nos. 1K01MH132043-01A1 to D.H. and 1R01MH125571-01 to C.M.C. and C.S.). This work was supported in part through the NYU IT High Performance Computing resources, services and staff expertise. We gratefully acknowledge use of the research computing resources of the Empire AI Consortium, Inc., with support from the State of New York, the Simons Foundation and the Secunda Family Foundation.

Author information

Authors and Affiliations

Authors

Contributions

C.M.C. designed the behavioural task. D.H., C.M.C. and C.S. designed the curriculum training protocol. D.H. created the RNN model. D.H., C.M.C. and C.S. analysed the data. D.H. prepared the figures. D.H., C.M.C. and C.S. wrote the manuscript. C.M.C. and C.S. supervised the project.

Corresponding author

Correspondence to Cristina Savin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Maria Eckstein, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–7 and sections: hyperparameters, LSTM derivatives for linearized dynamics and additional MDP analyses.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hocker, D., Constantinople, C.M. & Savin, C. Compositional pretraining improves computational efficiency and matches animal behaviour on complex tasks. Nat Mach Intell 7, 689–702 (2025). https://doi.org/10.1038/s42256-025-01029-3
