Abstract
In reward foraging tasks, prefrontal neurons track reward history, yet animals also show persistent choice-history biases. How these histories are represented in prefrontal circuits and guide animals’ decisions remains unknown. We asked whether past rewards and choices are incorporated by leaky integration or carried as discrete, history-specific codes, and how these codes are recruited under different task demands. We recorded medial prefrontal cortex (mPFC) activity while mice performed a probabilistic reward foraging task and fit a reinforcement-learning model whose decision variable, combining reward and choice histories, captured behavior. Neurons represented history-specific rewards and choices while integrating them consistent with their behavioral impact. We then altered reward contingencies and inter-choice intervals and transiently inactivated mPFC. Neural representations adapted to changing task demands, yet the behavioral impact of inactivation was sensitive to inter-choice interval and reward contingencies. We conclude that mPFC hosts redundant computations whose influence is gated by timing and task structure.
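As an illustration of what "leaky integration" of reward and choice histories into a single decision variable can look like, the following is a minimal Python sketch. It is not the model fitted in this study: the function names, the decay and weight parameters, their default values, and the logistic choice rule are all assumptions chosen for illustration.

```python
import math


def leaky_update(trace, event, decay):
    """Decay the previous trace and mix in the newest event (assumed form)."""
    return decay * trace + (1.0 - decay) * event


def decision_variable(choices, rewards, decay_r=0.7, decay_c=0.5, w_r=3.0, w_c=1.0):
    """Return P(choose right) on the next trial from leaky history traces.

    choices: sequence of +1 (right) or -1 (left)
    rewards: sequence of 1 (rewarded) or 0 (unrewarded)
    All parameter values are hypothetical, not estimates from the paper.
    """
    reward_trace = 0.0  # signed reward history: positive favors right
    choice_trace = 0.0  # signed choice history: positive favors right
    for c, r in zip(choices, rewards):
        reward_trace = leaky_update(reward_trace, c * r, decay_r)
        choice_trace = leaky_update(choice_trace, c, decay_c)
    # Weighted sum of the two traces, passed through a logistic choice rule.
    dv = w_r * reward_trace + w_c * choice_trace
    return 1.0 / (1.0 + math.exp(-dv))


if __name__ == "__main__":
    # Three rewarded right choices bias the next choice toward right.
    print(decision_variable([1, 1, 1], [1, 1, 1]))
```

In this sketch each past trial contributes with an exponentially decaying weight, so individual trials are not separately recoverable from the trace. A discrete, history-specific code would instead keep one variable per past trial (for example, the reward one, two, or three trials back).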
Data availability
All behavioral and spike-sorted data are available from the public Code Ocean repository at https://codeocean.com/capsule/6312901/tree, https://doi.org/10.24433/CO.6312901.v1. Source data are provided with this paper.
Code availability
All code is available from the public Code Ocean repository at https://codeocean.com/capsule/6312901/tree, https://doi.org/10.24433/CO.6312901.v1.
Acknowledgements
We express our gratitude to the members of the Kvitsiani lab, including Sophie Seidenbecher, Madeny Belkhiri, and Jesper Hagelskaer, for their valuable feedback on both the analysis and the writing of the manuscript. We thank Joseph Cheatwood for technical support with epifluorescence microscopy and assistance in imaging brain slices. We thank Ashok Litwin Kumar and Larry Abbott for their assistance with the neural data analysis. We also appreciate Naoshige Uchida for providing critical feedback on the manuscript. This study was supported by the Lundbeck Foundation grant: DANDRITE-R248-2016-2518 https://lundbeckfonden.com and startup funds from Southern Illinois University at Carbondale.
Author information
Contributions
D.K. conceived and designed the project. A.B., J.M., M.M., T.-F.W., and E.D. performed the experiments. J.S.L.-Y., A.B., J.M., and D.K. analyzed the data. J.S.L.-Y. developed the computational modeling of behavior. O.H. contributed to modeling and interpretation. J.S.L.-Y., A.B., O.H., and D.K. wrote the manuscript with input from all authors. E.D. and D.K. revised the manuscript and addressed reviewers’ comments.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Michael Halassa and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Lopez-Yepez, J.S., Barta, A., Martin, J. et al. Time-dependent deployment of medial prefrontal cortical representations in male mice. Nat Commun (2026). https://doi.org/10.1038/s41467-025-68215-0
DOI: https://doi.org/10.1038/s41467-025-68215-0


