A generative spike prediction model using behavioral reinforcement for re-establishing neural functional connectivity

Wu, Shenghui; Song, Zhiwei; Zhang, Xiang; Huang, Yifan; Chen, Shuhang; Shen, Xiang; Tan, Jieyuan; Li, Mingdong; Wang, Ziyi; Chen, Yujun; Liu, Kai; Farina, Dario; Principe, Jose C.; Wang, Yiwen

doi:10.1038/s43588-025-00915-5

Article
Published: 02 January 2026

Nature Computational Science volume 6, pages 179–192 (2026)

A preprint version of the article is available at bioRxiv.

Abstract

Prediction models that generate neuronal spikes from upstream neural activities offer a promising way to re-establish neural functional connectivity. Traditional methods train these models by supervised learning, which requires downstream recordings as ground truth. However, functional downstream activity cannot be recorded when neurological disorders exist. Here we introduce a reinforcement learning (RL)-based point process framework to generate spike trains that directly maximize behavior-level rewards, thus bypassing downstream recordings. This yields a generative spike model that directly transforms upstream activity into spike patterns modulated to desired behavior. We show that these RL-based generative models produce movement-modulated spike patterns akin to downstream recordings from healthy subjects, providing a biomimetic spike encoding framework. This RL framework outperforms existing methods and demonstrates a strong adaptation capability across different decoder settings, highlighting its potential for neural prostheses in restoring transregional communication with biomimetic cortical stimulation.
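The core idea in the abstract, training a spike generator from a behavior-level reward instead of downstream recordings, can be illustrated with a toy sketch. This is not the authors' RLPP implementation: it swaps in a plain REINFORCE update with a running-average baseline, and the population sizes, firing rates, fixed linear decoder and two-label task are all hypothetical choices made only for illustration.

```python
import numpy as np

N_UP, N_DOWN = 8, 6                          # hypothetical population sizes
DECODER = np.array([1, 1, 1, -1, -1, -1])    # fixed toy readout (assumption)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def upstream_spikes(label, rng):
    """Binary upstream activity whose firing pattern depends on the behavior label."""
    rates = np.where(np.arange(N_UP) < N_UP // 2, 0.8, 0.1)
    if label == 1:
        rates = rates[::-1]
    return (rng.random(N_UP) < rates).astype(float)


def step(W, label, rng):
    """One time bin: generate downstream spikes, decode them, score the behavior."""
    x = np.append(upstream_spikes(label, rng), 1.0)   # upstream spikes plus bias term
    p = sigmoid(W @ x)                                # per-neuron spike probability
    y = (rng.random(N_DOWN) < p).astype(float)        # sampled downstream spikes
    action = int(DECODER @ y > 0)                     # decoder maps spikes to an action
    reward = float(action == label)                   # behavior-level reward only
    return x, p, y, reward


def train(n_trials=4000, lr=0.1, rng=None):
    """REINFORCE on the Bernoulli spike likelihood; no downstream ground truth is used."""
    rng = np.random.default_rng(0) if rng is None else rng
    W = np.zeros((N_DOWN, N_UP + 1))
    baseline = 0.5
    for _ in range(n_trials):
        label = int(rng.integers(2))
        x, p, y, reward = step(W, label, rng)
        # Gradient of log P(y | x) with respect to W is outer(y - p, x).
        W += lr * (reward - baseline) * np.outer(y - p, x)
        baseline += 0.05 * (reward - baseline)        # running-average baseline
    return W


def evaluate(W, n_trials, rng):
    """Mean reward of the generator over fresh trials, without learning."""
    return float(np.mean([step(W, int(rng.integers(2)), rng)[3]
                          for _ in range(n_trials)]))


W = train()
print("mean reward after training:", evaluate(W, 1000, np.random.default_rng(1)))
```

The generator only ever sees the upstream activity and a scalar reward, so the decoder can be changed without touching the update rule, which is the property the abstract highlights when it mentions adaptation across decoder settings.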

**Fig. 1: The general structure and information flow of RL-based spike generation.**

**Fig. 2: Neural modulations of model predictions with respect to movements.**

**Fig. 3: Temporal patterns and statistical performance of generated neural activity.**

**Fig. 4: Adaptation and transfer learning among different decoder settings under the RLPP framework.**

**Fig. 5: Information analysis for spike prediction models.**

Data availability

The datasets that support the findings of this study include both publicly available and proprietary data. Public datasets used in this work are available via refs. ¹⁹,²⁰,²¹. The proprietary dataset was collected from Sprague-Dawley rats for a previously published study⁶⁰. The data are available for research purposes from the corresponding author. A minimal subset of the proprietary dataset, sufficient to interpret and verify the research, is available via GitHub at https://github.com/WuShenghui97/RLPP and via Zenodo at https://doi.org/10.5281/zenodo.17221566 (ref. ⁶⁴). Source data are provided with this manuscript.

Code availability

The RLPP framework code is available via GitHub at https://github.com/WuShenghui97/RLPP and via Zenodo at https://doi.org/10.5281/zenodo.17221566 (ref. ⁶⁴).

References

Rao, R. P. Towards neural co-processors for the brain: combining decoding and encoding in brain–computer interfaces. Curr. Opin. Neurobiol. 55, 142–151 (2019).
Belkacem, A. N., Jamil, N., Khalid, S. & Alnajjar, F. On closed-loop brain stimulation systems for improving the quality of life of patients with neurological disorders. Front. Hum. Neurosci. 17, 1085173 (2023).
Bouton, C. E. et al. Restoring cortical control of functional movement in a human with quadriplegia. Nature 533, 247–250 (2016).
Ajiboye, A. B. et al. Restoration of reaching and grasping movements through brain-controlled muscle stimulation in a person with tetraplegia: a proof-of-concept demonstration. Lancet 389, 1821–1830 (2017).
Capogrosso, M. et al. A brain–spine interface alleviating gait deficits after spinal cord injury in primates. Nature 539, 284–288 (2016).
Bryan, M. J., Jiang, L. P. & Rao, R. P. N. Neural co-processors for restoring brain function: results from a cortical model of grasping. J. Neural Eng. 20, 036004 (2023).
Deadwyler, S. A. et al. A cognitive prosthesis for memory facilitation by closed-loop functional ensemble stimulation of hippocampal neurons in primate brain. Exp. Neurol. 287, 452–460 (2017).
Hampson, R. E. et al. Developing a hippocampal neural prosthetic to facilitate human memory encoding and recall. J. Neural Eng. 15, 036014 (2018).
Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P. & Brown, E. N. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. J. Neurophysiol. 93, 1074–1089 (2005).
Song, D. et al. Nonlinear dynamical modeling of human hippocampal CA3-CA1 functional connectivity for memory prostheses. In Proc. 2015 7th International IEEE/EMBS Conference on Neural Engineering (NER) 316–319 (IEEE, 2015).
Qian, C. et al. Binless kernel machine: modeling spike train transformation for cognitive neural prostheses. Neural Comput. 32, 1863–1900 (2020).
Choi, J. S. et al. Eliciting naturalistic cortical responses with a sensory prosthesis via optimized microstimulation. J. Neural Eng. 13, 056007 (2016).
Upadhyay, U., De, A. & Gomez-Rodriguez, M. Deep reinforcement learning of marked temporal point processes. In Advances in Neural Information Processing Systems Vol. 31 (eds Bengio, S. et al.) 3172–3182 (Curran Associates Inc., 2018).
Li, S. et al. Learning temporal point processes via reinforcement learning. In Advances in Neural Information Processing Systems Vol. 31 (eds Bengio, S. et al.) 10804–10814 (Curran Associates Inc., 2018).
Zhu, S., Li, S., Peng, Z. & Xie, Y. Imitation learning of neural spatio-temporal point processes. IEEE Trans. Knowl. Data Eng. 34, 5391–5402 (2022).
DiGiovanna, J., Mahmoudi, B., Fortes, J., Principe, J. C. & Sanchez, J. C. Coadaptive brain–machine interface via reinforcement learning. IEEE Trans. Biomed. Eng. 56, 54–64 (2009).
Marsh, B. T., Tarigoppula, V. S. A., Chen, C. & Francis, J. T. Toward an autonomous brain machine interface: integrating sensorimotor reward modulation and reinforcement learning. J. Neurosci. 35, 7374–7387 (2015).
Shen, X., Zhang, X., Huang, Y., Chen, S. & Wang, Y. Task learning over multi-day recording via internally rewarded reinforcement learning based brain machine interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 28, 3089–3099 (2020).
International Brain Laboratory et al. A brain-wide map of neural activity during complex behaviour. Nature 645, 177–191 (2025).
Steinmetz, N., Zatka-Haas, P., Carandini, M. & Harris, K. Main dataset from Steinmetz et al. 2019. figshare https://doi.org/10.6084/M9.FIGSHARE.9598406.V2 (2019).
Steinmetz, N. A., Zatka-Haas, P., Carandini, M. & Harris, K. D. Distributed coding of choice, action and engagement across the mouse brain. Nature 576, 266–273 (2019).
Narayanan, N. S. & Laubach, M. Top-down control of motor cortex ensembles by dorsomedial prefrontal cortex. Neuron 52, 921–931 (2006).
Li, W. et al. The neural mechanism exploration of adaptive motor control: dynamical economic cell allocation in the primary motor cortex. IEEE Trans. Neural Syst. Rehabil. Eng. 25, 492–501 (2016).
Haan, R. D. et al. Neural representation of motor output, context and behavioral adaptation in rat medial prefrontal cortex during learned behavior. Front. Neural Circuits 12, 75 (2018).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Seidler, R. D., Kwak, Y., Fling, B. W. & Bernard, J. A. in Progress in Motor Control (eds Richardson, M. J. et al.) Vol. 782, 39–60 (Springer, 2013).
Cross, L., Cockburn, J., Yue, Y. & O’Doherty, J. P. Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments. Neuron 109, 724–738.e7 (2020).
Domenech, P., Rheims, S. & Koechlin, E. Neural mechanisms resolving exploitation-exploration dilemmas in the medial prefrontal cortex. Science 369, eabb0184 (2020).
Sugawara, M. & Katahira, K. Dissociation between asymmetric value updating and perseverance in human reinforcement learning. Sci. Rep. 11, 3574 (2021).
Bermudez-Contreras, E. Deep reinforcement learning to study spatial navigation, learning and memory in artificial and biological agents. Biol. Cybern. 115, 131–134 (2021).
Lubianiker, N., Paret, C., Dayan, P. & Hendler, T. Neurofeedback through the lens of reinforcement learning. Trends Neurosci. 45, 579–593 (2022).
Carmena, J. M., Ganguly, K., Dimitrov, D. F. & Wallis, J. D. Reversible large-scale modification of cortical networks during neuroprosthetic control. Nat. Neurosci. 14, 662–667 (2011).
Orsborn, A. L. et al. Closed-loop decoder adaptation shapes neural plasticity for skillful neuroprosthetic control. Neuron 82, 1380–1393 (2014).
Zhao, Y., Hessburg, J. P., Kumar, J. N. A. & Francis, J. T. Paradigm shift in sensorimotor control research and brain machine interface control: the influence of context on sensorimotor representations. Front. Neurosci. 12, 579 (2018).
Sakellaridi, S. et al. Intrinsic variable learning for brain–machine interface control by human anterior intraparietal cortex. Neuron 102, 694–705.e3 (2019).
Rowald, A. et al. Activity-dependent spinal cord neuromodulation rapidly restores trunk and leg motor functions after complete paralysis. Nat. Med. 28, 260–271 (2022).
Bonizzato, M. et al. Autonomous optimization of neuroprosthetic stimulation parameters that drive the motor cortex and spinal cord outputs in rats and monkeys. Cell Rep. Med. 4, 101008 (2023).
Nieves-Vazquez, H. A., Kim, E. & Ueda, J. Closed-loop estimation of individualized inter-stimulus interval window for transient neuromodulation via paired mechanical and brain stimulation. IEEE Trans. Med. Robot. Bionics 5, 110–119 (2023).
Golub, M. D. et al. Learning by neural reassociation. Nat. Neurosci. 21, 607–616 (2018).
Mahmoudi, B. & Sanchez, J. C. A symbiotic brain–machine interface through value-based decision making. PLoS ONE 6, e14760 (2011).
Fidêncio, A. X., Klaes, C. & Iossifidis, I. Error-related potentials in reinforcement learning-based brain-machine interfaces. Front. Hum. Neurosci. 16, 806517 (2022).
Tan, J., Zhang, X., Wu, S., Song, Z. & Wang, Y. Hidden brain state-based internal evaluation using kernel inverse reinforcement learning in brain-machine interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 32, 4219–4229 (2024).
Valle, G. et al. Biomimetic computer-to-brain communication enhancing naturalistic touch sensations via peripheral nerve stimulation. Nat. Commun. 15, 1151 (2024).
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).
Schulman, J., Moritz, P., Levine, S., Jordan, M. & Abbeel, P. High-dimensional continuous control using generalized advantage estimation. Preprint at https://arxiv.org/abs/1506.02438 (2018).
Mei, H. & Eisner, J. M. The neural Hawkes process: a neurally self-modulating multivariate point process. In Advances in Neural Information Processing Systems Vol. 30 (eds Guyon, I. et al.) (Curran Associates, Inc., 2017).
Cunningham, J. P. & Yu, B. M. Dimensionality reduction for large-scale neural recordings. Nat. Neurosci. 17, 1500–1509 (2014).
Sadtler, P. T. et al. Neural constraints on learning. Nature 512, 423–426 (2014).
Gallego, J. A., Perich, M. G., Miller, L. E. & Solla, S. A. Neural manifolds for the control of movement. Neuron 94, 978–984 (2017).
Wärnberg, E. & Kumar, A. Perturbing low dimensional activity manifolds in spiking neuronal networks. PLoS Comput. Biol. 15, e1007074 (2019).
Gallego, J. A., Perich, M. G., Chowdhury, R. H., Solla, S. A. & Miller, L. E. Long-term stability of cortical population dynamics underlying consistent behavior. Nat. Neurosci. 23, 260–270 (2020).
Perich, M. G., Narain, D. & Gallego, J. A. A neural manifold view of the brain. Nat. Neurosci. 28, 1582–1597 (2025).
Churchland, M. M. et al. Neural population dynamics during reaching. Nature 487, 51–56 (2012).
Pandarinath, C. et al. Inferring single-trial neural population dynamics using sequential auto-encoders. Nat. Methods 15, 805–815 (2018).
Abbaspourazad, H., Choudhury, M., Wong, Y. T., Pesaran, B. & Shanechi, M. M. Multiscale low-dimensional motor cortical state dynamics predict naturalistic reach-and-grasp behavior. Nat. Commun. 12, 607 (2021).
Safaie, M. et al. Preserved neural dynamics across animals performing similar behaviour. Nature 623, 765–771 (2023).
Wu, S., Zhang, X. & Wang, Y. Neural manifold constraint for spike prediction models under behavioral reinforcement. IEEE Trans. Neural Syst. Rehabil. Eng. 32, 2772–2781 (2024).
Fetz, E. E., Jackson, A. & Mavoori, J. Long-term motor cortex plasticity induced by an electronic neural implant. Nature 444, 56–60 (2006).
Guggenmos, D. J. et al. Restoration of function after brain damage using a neural prosthesis. Proc. Natl Acad. Sci. USA 110, 21177–21182 (2013).
Wu, S. et al. Spike prediction on primary motor cortex from medial prefrontal cortex during task learning. J. Neural Eng. 19, 046025 (2022).
Hawkes, A. G. Spectra of some self-exciting and mutually exciting point processes. Biometrika 58, 83–90 (1971).
Møller, M. F. A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw. 6, 525–533 (1993).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 2018).
Wu, S. et al. A generative spike prediction model using behavioral reinforcement for re-establishing neural functional connectivity. Zenodo https://doi.org/10.5281/ZENODO.17221566 (2025).

Acknowledgements

This work was supported by STI 2030-Major Projects under grant no. 2021ZD0200403, the Research Grants Council of the Hong Kong Special Administrative Region, China (project no. HKUST C6049-24G), Special Research Support from the Chau Hoi Shuen Foundation under grant no. R9051, the Seed Fund of the Big Data for Bio-Intelligence Laboratory from HKUST under grant no. Z0428 and the Innovation and Technology Commission under grant no. ITCPD/17-9.

Author information

Authors and Affiliations

Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Shenghui Wu, Zhiwei Song, Xiang Zhang, Yifan Huang, Shuhang Chen, Xiang Shen, Jieyuan Tan, Mingdong Li, Ziyi Wang & Yiwen Wang
Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, USA
Xiang Zhang
Department of Orthopaedics and Traumatology, The University of Hong Kong, Hong Kong SAR, China
Yifan Huang
Hong Kong Center for Neurodegenerative Diseases, Hong Kong SAR, China
Shuhang Chen & Kai Liu
Guangdong Institute of Intelligence Science and Technology, Zhuhai, China
Xiang Shen
Robotics and Autonomous Systems Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Mingdong Li
Division of Life Science, State Key Laboratory of Nervous System Disorders, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Yujun Chen & Kai Liu
SIAT-HKUST Joint Laboratory for Brain Science, Hong Kong SAR, China
Kai Liu
Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Kai Liu & Yiwen Wang
Department of Bioengineering, Imperial College London, London, UK
Dario Farina
Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
Jose C. Principe

Contributions

S.W. and Y.W. conceived the study, developed the methodology and performed data analyses. S.W., Z.S. and Z.W. developed the code. S.W. and Z.S. performed the analyses for the video demonstration. J.T. and M.L. conducted implementations and analyzed results on open datasets. X.Z., Y.H., S.C. and X.S. performed the rat experiments and collected the dataset. Y.C. and K.L. contributed to histology imaging and electrophysiology studies. S.W. wrote the paper. D.F., J.C.P. and Y.W. reviewed and revised the paper. Y.W. supervised the work. All authors helped to prepare or edit the manuscript.

Corresponding author

Correspondence to Yiwen Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Daniel N. Zdeblick and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Ananya Rastogi, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figs. 1–6 and Note 1 (PDF).

Reporting Summary (PDF).

Supplementary Video 1 | Evaluation of RLPP-generated spike trains in online decoding (MP4)

The video demonstrates the behavioral experimental setup and online decoding results. The rat performed the two-lever discrimination task. The recorded spike trains of M1 neurons were decoded in real time into trajectories in a 2D space by a Kalman filter (KF), serving as an online prosthetic control. The spike trains generated by the RLPP model were classified by the movement decoder and subsequently decoded into 2D trajectories by a KF. During model training, a constraint was applied to limit excessive firing in the generated spike trains. The video presents three consecutive example trials of behavioral decoding, showing the rat's behavior alongside the recorded spiking activity. The position of the screen cursor decoded from the experimental recordings accurately matched the rat's actual movement. In contrast, randomly sampled spike trains failed to control the cursor, reflecting conditions in which M1 recordings are unavailable because of neural pathway damage. Notably, the RL-trained model successfully generated spike trains from online mPFC recordings; these generated spike trains showed modulation patterns similar to those of the experimental data and produced accurate decoding trajectories that aligned with the rat's behavior.

Source data

Source Data Fig. 2 (ZIP)

Statistical source data.

Source Data Fig. 3 (CSV)

Statistical source data.

Source Data Fig. 4 (CSV)

Statistical source data.

Source Data Fig. 5 (CSV)

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, S., Song, Z., Zhang, X. et al. A generative spike prediction model using behavioral reinforcement for re-establishing neural functional connectivity. Nat Comput Sci 6, 179–192 (2026). https://doi.org/10.1038/s43588-025-00915-5

Received: 27 November 2024
Accepted: 29 October 2025
Published: 02 January 2026
Version of record: 02 January 2026
Issue date: February 2026
DOI: https://doi.org/10.1038/s43588-025-00915-5