A generative spike prediction model using behavioral reinforcement for re-establishing neural functional connectivity

A preprint version of the article is available at bioRxiv.

Abstract

Prediction models that generate neuronal spikes from upstream neural activities offer a promising way to re-establish neural functional connectivity. Traditional methods train these models by supervised learning, which requires downstream recordings as ground truth. However, functional downstream activity cannot be recorded when neurological disorders exist. Here we introduce a reinforcement learning (RL)-based point process framework to generate spike trains that directly maximize behavior-level rewards, thus bypassing downstream recordings. This yields a generative spike model that directly transforms upstream activity into spike patterns modulated to desired behavior. We show that these RL-based generative models produce movement-modulated spike patterns akin to downstream recordings from healthy subjects, providing a biomimetic spike encoding framework. This RL framework outperforms existing methods and demonstrates a strong adaptation capability across different decoder settings, highlighting its potential for neural prostheses in restoring transregional communication with biomimetic cortical stimulation.
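
The core idea in the abstract, training a spike generator from behavior-level reward alone, can be illustrated with a minimal sketch. This is not the authors' RLPP implementation (see the Code availability section for that): it swaps in a plain REINFORCE policy-gradient update on a Bernoulli point process, and the toy task, the firing rates in `UP_RATES`, the fixed `decode` rule and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy task: two movements, each with its own upstream firing rates.
UP_RATES = np.array([[0.1] * 4 + [0.9] * 4,   # movement 0
                     [0.9] * 4 + [0.1] * 4])  # movement 1

def sample_upstream(rng, movement):
    """Draw one time bin of binary upstream spikes for the given movement."""
    return (rng.random(UP_RATES.shape[1]) < UP_RATES[movement]).astype(float)

def decode(spikes):
    """Hypothetical fixed decoder: compare the two halves of the generated population."""
    half = spikes.size // 2
    return int(spikes[:half].sum() > spikes[half:].sum())

def generate(rng, W, b, x):
    """Bernoulli point-process step: firing probabilities, then sampled spikes."""
    p = sigmoid(W @ x + b)
    spikes = (rng.random(p.size) < p).astype(float)
    return p, spikes

def train(n_down=4, n_steps=3000, lr=0.1, seed=0):
    """REINFORCE on the generator; no downstream recordings are used as targets."""
    rng = np.random.default_rng(seed)
    W = np.zeros((n_down, UP_RATES.shape[1]))
    b = np.zeros(n_down)
    baseline = 0.5  # running estimate of the mean reward (variance reduction)
    for _ in range(n_steps):
        movement = int(rng.integers(2))
        x = sample_upstream(rng, movement)
        p, spikes = generate(rng, W, b, x)
        reward = float(decode(spikes) == movement)  # behavior-level reward only
        advantage = reward - baseline
        # Gradient of the Bernoulli log-likelihood w.r.t. the logits is (spikes - p).
        grad_logits = advantage * (spikes - p)
        W += lr * np.outer(grad_logits, x)
        b += lr * grad_logits
        baseline += 0.05 * (reward - baseline)
    return W, b

def mean_reward(W, b, n_trials=2000, seed=1):
    """Average decoding success of spikes sampled from the generator."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_trials):
        movement = int(rng.integers(2))
        x = sample_upstream(rng, movement)
        _, spikes = generate(rng, W, b, x)
        total += float(decode(spikes) == movement)
    return total / n_trials
```

Calling `W, b = train()` and comparing `mean_reward(W, b)` with the value obtained from all-zero weights shows the generator improving purely from the reward signal. A practical system would likely need a more stable policy-gradient method and a firing-rate constraint, both of which this sketch omits.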

Fig. 1: The general structure and information flow of RL-based spike generation.
Fig. 2: Neural modulations of model predictions with respect to movements.
Fig. 3: Temporal patterns and statistical performance of generated neural activity.
Fig. 4: Adaptation and transfer learning among different decoder settings under the RLPP framework.
Fig. 5: Information analysis for spike prediction models.

Data availability

The datasets that support the findings of this study include both publicly available and proprietary data. The public datasets used in this work are available via refs. 19–21. The proprietary dataset was collected from Sprague-Dawley rats for a previously published study (ref. 60) and is available for research purposes from the corresponding author. A minimal subset of the proprietary dataset, sufficient to interpret and verify the research, is available via GitHub at https://github.com/WuShenghui97/RLPP and via Zenodo at https://doi.org/10.5281/zenodo.17221566 (ref. 64). Source data are provided with this manuscript.

Code availability

The RLPP framework code is available via GitHub at https://github.com/WuShenghui97/RLPP and via Zenodo at https://doi.org/10.5281/zenodo.17221566 (ref. 64).

References

  1. Rao, R. P. Towards neural co-processors for the brain: combining decoding and encoding in brain–computer interfaces. Curr. Opin. Neurobiol. 55, 142–151 (2019).

  2. Belkacem, A. N., Jamil, N., Khalid, S. & Alnajjar, F. On closed-loop brain stimulation systems for improving the quality of life of patients with neurological disorders. Front. Hum. Neurosci. 17, 1085173 (2023).

  3. Bouton, C. E. et al. Restoring cortical control of functional movement in a human with quadriplegia. Nature 533, 247–250 (2016).

  4. Ajiboye, A. B. et al. Restoration of reaching and grasping movements through brain-controlled muscle stimulation in a person with tetraplegia: a proof-of-concept demonstration. Lancet 389, 1821–1830 (2017).

  5. Capogrosso, M. et al. A brain–spine interface alleviating gait deficits after spinal cord injury in primates. Nature 539, 284–288 (2016).

  6. Bryan, M. J., Jiang, L. P. & Rao, R. P. N. Neural co-processors for restoring brain function: results from a cortical model of grasping. J. Neural Eng. 20, 036004 (2023).

  7. Deadwyler, S. A. et al. A cognitive prosthesis for memory facilitation by closed-loop functional ensemble stimulation of hippocampal neurons in primate brain. Exp. Neurol. 287, 452–460 (2017).

  8. Hampson, R. E. et al. Developing a hippocampal neural prosthetic to facilitate human memory encoding and recall. J. Neural Eng. 15, 036014 (2018).

  9. Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P. & Brown, E. N. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. J. Neurophysiol. 93, 1074–1089 (2005).

  10. Song, D. et al. Nonlinear dynamical modeling of human hippocampal CA3-CA1 functional connectivity for memory prostheses. In Proc. 2015 7th International IEEE/EMBS Conference on Neural Engineering (NER) 316–319 (IEEE, 2015).

  11. Qian, C. et al. Binless kernel machine: modeling spike train transformation for cognitive neural prostheses. Neural Comput. 32, 1863–1900 (2020).

  12. Choi, J. S. et al. Eliciting naturalistic cortical responses with a sensory prosthesis via optimized microstimulation. J. Neural Eng. 13, 056007 (2016).

  13. Upadhyay, U., De, A. & Gomez-Rodriguez, M. Deep reinforcement learning of marked temporal point processes. In Advances in Neural Information Processing Systems Vol. 31 (eds Bengio, S. et al.) 3172–3182 (Curran Associates Inc., 2018).

  14. Li, S. et al. Learning temporal point processes via reinforcement learning. In Advances in Neural Information Processing Systems Vol. 31 (eds Bengio, S. et al.) 10804–10814 (Curran Associates Inc., 2018).

  15. Zhu, S., Li, S., Peng, Z. & Xie, Y. Imitation learning of neural spatio-temporal point processes. IEEE Trans. Knowl. Data Eng. 34, 5391–5402 (2022).

  16. DiGiovanna, J., Mahmoudi, B., Fortes, J., Principe, J. C. & Sanchez, J. C. Coadaptive brain–machine interface via reinforcement learning. IEEE Trans. Biomed. Eng. 56, 54–64 (2009).

  17. Marsh, B. T., Tarigoppula, V. S. A., Chen, C. & Francis, J. T. Toward an autonomous brain machine interface: integrating sensorimotor reward modulation and reinforcement learning. J. Neurosci. 35, 7374–7387 (2015).

  18. Shen, X., Zhang, X., Huang, Y., Chen, S. & Wang, Y. Task learning over multi-day recording via internally rewarded reinforcement learning based brain machine interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 28, 3089–3099 (2020).

  19. International Brain Laboratory et al. A brain-wide map of neural activity during complex behaviour. Nature 645, 177–191 (2025).

  20. Steinmetz, N., Zatka-Haas, P., Carandini, M. & Harris, K. Main dataset from Steinmetz et al. 2019. figshare https://doi.org/10.6084/M9.FIGSHARE.9598406.V2 (2019).

  21. Steinmetz, N. A., Zatka-Haas, P., Carandini, M. & Harris, K. D. Distributed coding of choice, action and engagement across the mouse brain. Nature 576, 266–273 (2019).

  22. Narayanan, N. S. & Laubach, M. Top-down control of motor cortex ensembles by dorsomedial prefrontal cortex. Neuron 52, 921–931 (2006).

  23. Li, W. et al. The neural mechanism exploration of adaptive motor control: dynamical economic cell allocation in the primary motor cortex. IEEE Trans. Neural Syst. Rehabil. Eng. 25, 492–501 (2016).

  24. Haan, R. D. et al. Neural representation of motor output, context and behavioral adaptation in rat medial prefrontal cortex during learned behavior. Front. Neural Circuits 12, 75 (2018).

  25. van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).

  26. Seidler, R. D., Kwak, Y., Fling, B. W. & Bernard, J. A. in Progress in Motor Control (eds Richardson, M. J. et al.) Vol. 782, 39–60 (Springer, 2013).

  27. Cross, L., Cockburn, J., Yue, Y. & O’Doherty, J. P. Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments. Neuron 109, 724–738.e7 (2020).

  28. Domenech, P., Rheims, S. & Koechlin, E. Neural mechanisms resolving exploitation-exploration dilemmas in the medial prefrontal cortex. Science 369, eabb0184 (2020).

  29. Sugawara, M. & Katahira, K. Dissociation between asymmetric value updating and perseverance in human reinforcement learning. Sci. Rep. 11, 3574 (2021).

  30. Bermudez-Contreras, E. Deep reinforcement learning to study spatial navigation, learning and memory in artificial and biological agents. Biol. Cybern. 115, 131–134 (2021).

  31. Lubianiker, N., Paret, C., Dayan, P. & Hendler, T. Neurofeedback through the lens of reinforcement learning. Trends Neurosci. 45, 579–593 (2022).

  32. Carmena, J. M., Ganguly, K., Dimitrov, D. F. & Wallis, J. D. Reversible large-scale modification of cortical networks during neuroprosthetic control. Nat. Neurosci. 14, 662–667 (2011).

  33. Orsborn, A. L. et al. Closed-loop decoder adaptation shapes neural plasticity for skillful neuroprosthetic control. Neuron 82, 1380–1393 (2014).

  34. Zhao, Y., Hessburg, J. P., Kumar, J. N. A. & Francis, J. T. Paradigm shift in sensorimotor control research and brain machine interface control: the influence of context on sensorimotor representations. Front. Neurosci. 12, 579 (2018).

  35. Sakellaridi, S. et al. Intrinsic variable learning for brain–machine interface control by human anterior intraparietal cortex. Neuron 102, 694–705.e3 (2019).

  36. Rowald, A. et al. Activity-dependent spinal cord neuromodulation rapidly restores trunk and leg motor functions after complete paralysis. Nat. Med. 28, 260–271 (2022).

  37. Bonizzato, M. et al. Autonomous optimization of neuroprosthetic stimulation parameters that drive the motor cortex and spinal cord outputs in rats and monkeys. Cell Rep. Med. 4, 101008 (2023).

  38. Nieves-Vazquez, H. A., Kim, E. & Ueda, J. Closed-loop estimation of individualized inter-stimulus interval window for transient neuromodulation via paired mechanical and brain stimulation. IEEE Trans. Med. Robot. Bionics 5, 110–119 (2023).

  39. Golub, M. D. et al. Learning by neural reassociation. Nat. Neurosci. 21, 607–616 (2018).

  40. Mahmoudi, B. & Sanchez, J. C. A symbiotic brain–machine interface through value-based decision making. PLoS ONE 6, e14760 (2011).

  41. Fidêncio, A. X., Klaes, C. & Iossifidis, I. Error-related potentials in reinforcement learning-based brain-machine interfaces. Front. Hum. Neurosci. 16, 806517 (2022).

  42. Tan, J., Zhang, X., Wu, S., Song, Z. & Wang, Y. Hidden brain state-based internal evaluation using kernel inverse reinforcement learning in brain-machine interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 32, 4219–4229 (2024).

  43. Valle, G. et al. Biomimetic computer-to-brain communication enhancing naturalistic touch sensations via peripheral nerve stimulation. Nat. Commun. 15, 1151 (2024).

  44. Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).

  45. Schulman, J., Moritz, P., Levine, S., Jordan, M. & Abbeel, P. High-dimensional continuous control using generalized advantage estimation. Preprint at https://arxiv.org/abs/1506.02438 (2018).

  46. Mei, H. & Eisner, J. M. The neural Hawkes process: a neurally self-modulating multivariate point process. In Advances in Neural Information Processing Systems Vol. 30 (eds Guyon, I. et al.) (Curran Associates, Inc., 2017).

  47. Cunningham, J. P. & Yu, B. M. Dimensionality reduction for large-scale neural recordings. Nat. Neurosci. 17, 1500–1509 (2014).

  48. Sadtler, P. T. et al. Neural constraints on learning. Nature 512, 423–426 (2014).

  49. Gallego, J. A., Perich, M. G., Miller, L. E. & Solla, S. A. Neural manifolds for the control of movement. Neuron 94, 978–984 (2017).

  50. Wärnberg, E. & Kumar, A. Perturbing low dimensional activity manifolds in spiking neuronal networks. PLoS Comput. Biol. 15, e1007074 (2019).

  51. Gallego, J. A., Perich, M. G., Chowdhury, R. H., Solla, S. A. & Miller, L. E. Long-term stability of cortical population dynamics underlying consistent behavior. Nat. Neurosci. 23, 260–270 (2020).

  52. Perich, M. G., Narain, D. & Gallego, J. A. A neural manifold view of the brain. Nat. Neurosci. 28, 1582–1597 (2025).

  53. Churchland, M. M. et al. Neural population dynamics during reaching. Nature 487, 51–56 (2012).

  54. Pandarinath, C. et al. Inferring single-trial neural population dynamics using sequential auto-encoders. Nat. Methods 15, 805–815 (2018).

  55. Abbaspourazad, H., Choudhury, M., Wong, Y. T., Pesaran, B. & Shanechi, M. M. Multiscale low-dimensional motor cortical state dynamics predict naturalistic reach-and-grasp behavior. Nat. Commun. 12, 607 (2021).

  56. Safaie, M. et al. Preserved neural dynamics across animals performing similar behaviour. Nature 623, 765–771 (2023).

  57. Wu, S., Zhang, X. & Wang, Y. Neural manifold constraint for spike prediction models under behavioral reinforcement. IEEE Trans. Neural Syst. Rehabil. Eng. 32, 2772–2781 (2024).

  58. Fetz, E. E., Jackson, A. & Mavoori, J. Long-term motor cortex plasticity induced by an electronic neural implant. Nature 444, 56–60 (2006).

  59. Guggenmos, D. J. et al. Restoration of function after brain damage using a neural prosthesis. Proc. Natl Acad. Sci. USA 110, 21177–21182 (2013).

  60. Wu, S. et al. Spike prediction on primary motor cortex from medial prefrontal cortex during task learning. J. Neural Eng. 19, 046025 (2022).

  61. Hawkes, A. G. Spectra of some self-exciting and mutually exciting point processes. Biometrika 58, 83–90 (1971).

  62. Møller, M. F. A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw. 6, 525–533 (1993).

  63. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 2018).

  64. Wu, S. et al. A generative spike prediction model using behavioral reinforcement for re-establishing neural functional connectivity. Zenodo https://doi.org/10.5281/ZENODO.17221566 (2025).

Acknowledgements

This work was supported by STI 2030-Major Projects under grant no. 2021ZD0200403, the Research Grants Council of the Hong Kong Special Administrative Region, China (project no. HKUST C6049-24G), Special Research Support from the Chau Hoi Shuen Foundation under grant no. R9051, the Seed Fund of the Big Data for Bio-Intelligence Laboratory from HKUST under grant no. Z0428 and the Innovation and Technology Commission under grant no. ITCPD/17-9.

Author information

Contributions

S.W. and Y.W. conceived the study, developed the methodology and performed data analyses. S.W., Z.S. and Z.W. developed the code. S.W. and Z.S. performed the analyses for the video demonstration. J.T. and M.L. conducted implementations and analyzed results on open datasets. X.Z., Y.H., S.C. and X.S. performed the rat experiments and collected the dataset. Y.C. and K.L. contributed to histology imaging and electrophysiology studies. S.W. wrote the paper. D.F., J.C.P. and Y.W. reviewed and revised the paper. Y.W. supervised the work. All authors helped to prepare or edit the manuscript.

Corresponding author

Correspondence to Yiwen Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Daniel N. Zdeblick and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Ananya Rastogi, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figs. 1–6 and Note 1.

Reporting Summary

Supplementary Video 1 | Evaluation of RLPP-generated spike trains in online decoding. The video shows the behavioral experimental setup and the online decoding results while the rat performed the two-lever discrimination task. Spike trains recorded from M1 neurons were decoded in real time into trajectories in a 2D space by a Kalman filter (KF), serving as online prosthetic control. Spike trains generated by the RLPP model were classified by the movement decoder and then decoded into 2D trajectories by a KF. During model training, a constraint limited excessive firing in the generated spike trains. The video presents three consecutive example trials of behavioral decoding, showing the rat’s behavior alongside the recorded spiking activity. The screen cursor position decoded from the experimental recordings accurately matched the rat’s actual movement. By contrast, randomly sampled spike trains failed to control the cursor, reflecting conditions in which M1 recordings are unavailable because of neural pathway damage. Notably, the RL-trained model generated spike trains from online mPFC recordings; these spike trains showed modulation patterns similar to those of the experimental data and produced accurate decoded trajectories that aligned with the rat’s behavior.
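
The Kalman filter step mentioned in the video description can be sketched as follows. This is a generic linear-Gaussian Kalman filter, not the decoder used in the study: the function name `kalman_decode`, the choice of state (for example, a 2D cursor position) and every model matrix passed in are assumptions for illustration, and in practice the observations would be binned spike counts or firing rates with matrices fitted to recorded data.

```python
import numpy as np

def kalman_decode(Y, A, Q, C, R, x0, P0):
    """Filter observations Y (T x n_obs) into state estimates (T x n_state).

    Assumed model (hypothetical parameters supplied by the caller):
        x_t = A x_{t-1} + w_t,  w_t ~ N(0, Q)   # state, e.g. 2D position
        y_t = C x_t     + v_t,  v_t ~ N(0, R)   # observation, e.g. binned rates
    """
    x, P = x0.copy(), P0.copy()
    identity = np.eye(x0.size)
    estimates = np.zeros((Y.shape[0], x0.size))
    for t, y in enumerate(Y):
        # Predict the next state and its uncertainty.
        x = A @ x
        P = A @ P @ A.T + Q
        # Update with the current observation.
        S = C @ P @ C.T + R                  # innovation covariance
        K = np.linalg.solve(S, C @ P).T      # Kalman gain, equal to P C^T S^-1
        x = x + K @ (y - C @ x)
        P = (identity - K @ C) @ P
        estimates[t] = x
    return estimates
```

Given a `T x n_obs` array of observations, the function returns a `T x n_state` array whose rows can be plotted as the decoded trajectory.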

Source data

Source Data Fig. 2

Statistical source data (ZIP).

Source Data Fig. 3

Statistical source data (CSV).

Source Data Fig. 4

Statistical source data (CSV).

Source Data Fig. 5

Statistical source data (CSV).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, S., Song, Z., Zhang, X. et al. A generative spike prediction model using behavioral reinforcement for re-establishing neural functional connectivity. Nat Comput Sci 6, 179–192 (2026). https://doi.org/10.1038/s43588-025-00915-5

  • DOI: https://doi.org/10.1038/s43588-025-00915-5
