Abstract
Recent research demonstrates that large language models can predict neural activity recorded via electrocorticography during natural language processing. To predict word-by-word neural activity, most prior work evaluates encoding models within individual electrodes and participants, limiting generalizability. Here we analyze electrocorticography data from eight participants listening to the same 30-min podcast. Using a shared response model, we estimate a common information space across participants. This shared space substantially enhances large language model-based encoding performance and enables denoising of individual brain responses by projecting back into participant-specific electrode spaces—yielding a 37% average improvement in encoding accuracy (from r = 0.188 to r = 0.257). The greatest gains occur in brain areas specialized for language comprehension, particularly the superior temporal gyrus and inferior frontal gyrus. Our findings highlight that estimating a shared space allows us to construct encoding models that better generalize across individuals.
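As a rough illustration of the pipeline described above (not the authors' released analysis code — see Code availability below), the sketch that follows fits a deterministic shared response model by alternating orthogonal Procrustes updates, trains a ridge encoding model from LLM word embeddings to the shared space, and projects held-out predictions back into one participant's electrode space. All dimensions, variable names and the random data are hypothetical placeholders; BrainIAK's brainiak.funcalign.srm module provides a standard probabilistic SRM implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical setup: 8 participants with differing electrode counts,
# T word-aligned samples, k shared dimensions, d embedding dimensions.
n_subj, T, k, d = 8, 2000, 50, 768
X = [rng.standard_normal((100 + 5 * i, T)) for i in range(n_subj)]  # electrodes x time
E = rng.standard_normal((T, d))                                     # one LLM embedding per word
split = int(0.8 * T)                                                # train/test split in time

def fit_srm(X_train, k, n_iter=10, seed=0):
    """Deterministic SRM: orthonormal maps W_i (electrodes x k) and a shared
    time series S (k x T) minimizing sum_i ||X_i - W_i @ S||_F^2."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((k, X_train[0].shape[1]))
    for _ in range(n_iter):
        W = []
        for Xi in X_train:
            # Closest orthonormal-column map to Xi @ S.T (Procrustes solution).
            U, _, Vt = np.linalg.svd(Xi @ S.T, full_matrices=False)
            W.append(U @ Vt)
        S = np.mean([Wi.T @ Xi for Wi, Xi in zip(W, X_train)], axis=0)
    return W, S

# Fit the shared space on the training portion only.
W, S_train = fit_srm([Xi[:, :split] for Xi in X], k)

# Encoding model: ridge regression from word embeddings to shared responses.
model = Ridge(alpha=1.0).fit(E[:split], S_train.T)
pred_shared = model.predict(E[split:])            # (T_test, k)

# Project test-set predictions back into one participant's electrode space
# and score each electrode with a held-out correlation.
i = 0
pred_elec = W[i] @ pred_shared.T                  # electrodes x T_test
actual = X[i][:, split:]
r = [np.corrcoef(pred_elec[e], actual[e])[0, 1] for e in range(actual.shape[0])]
print(f"participant {i}: mean held-out encoding r = {np.mean(r):.3f}")
```

With the random data used here the correlations hover near zero; the sketch only shows the shape of the computation (align, encode, project back), not the cross-validation and electrode selection procedures used in the paper.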
Data availability
Sample data are available via Zenodo at https://zenodo.org/records/15220273 (ref. 53), and the full raw dataset is publicly available at https://openneuro.org/datasets/ds005574/versions/1.0.0 (ref. 54). Source data are provided with this paper.
Code availability
Code used to analyze the data is publicly available via GitHub at https://github.com/pritamarnab/SRM-Encoding (ref. 55).
References
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (eds Burstein, J., Doran, C. & Solorio, T.) 4171–4186 (Association for Computational Linguistics, 2019).
Brown, T. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems Vol. 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).
Manning, C. D., Clark, K., Hewitt, J., Khandelwal, U. & Levy, O. Emergent linguistic structure in artificial neural networks trained by self-supervision. Proc. Natl Acad. Sci. USA 117, 30046–30054 (2020).
Schrimpf, M. et al. The neural architecture of language: integrative modeling converges on predictive processing. Proc. Natl Acad. Sci. USA 118, e2105646118 (2021).
Caucheteux, C. & King, J.-R. Brains and algorithms partially converge in natural language processing. Commun. Biol. 5, 134 (2022).
Goldstein, A. et al. Shared computational principles for language processing in humans and deep language models. Nat. Neurosci. 25, 369–380 (2022).
Toneva, M., Mitchell, T. M. & Wehbe, L. Combining computational controls with natural text reveals aspects of meaning composition. Nat. Comput. Sci. 2, 745–757 (2022).
Kumar, S. et al. Shared functional specialization in transformer-based language models and the human brain. Nat. Commun. 15, 5523 (2024).
Cai, J., Hadjinicolaou, A. E., Paulk, A. C., Williams, Z. M. & Cash, S. S. Natural language processing models reveal neural dynamics of human conversation. Nat. Commun. 16, 3376 (2025).
Goldstein, A. et al. A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations. Nat. Hum. Behav. 9, 1041–1055 (2025).
Mischler, G., Li, Y. A., Bickel, S., Mehta, A. D. & Mesgarani, N. Contextual feature extraction hierarchies converge in large language models and the brain. Nat. Mach. Intell. 6, 1467–1477 (2024).
Goldstein, A. et al. Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns. Nat. Commun. 15, 2768 (2024).
Hong, Z. et al. Scale matters: large language models with billions (rather than millions) of parameters better match neural representations of natural language. eLife 13, RP101204 (2024).
Zada, Z. et al. A shared model-based linguistic space for transmitting our thoughts from brain to brain in natural conversations. Neuron 112, 3211–3222 (2024).
Lerner, Y., Honey, C. J., Silbert, L. J. & Hasson, U. Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. J. Neurosci. 31, 2906–2915 (2011).
Honey, C. J. et al. Slow cortical dynamics and the accumulation of information over long timescales. Neuron 76, 423–434 (2012).
Hasson, U., Chen, J. & Honey, C. J. Hierarchical process memory: memory as an integral component of information processing. Trends Cogn. Sci. 19, 304–313 (2015).
Nastase, S. A., Gazzola, V., Hasson, U. & Keysers, C. Measuring shared responses across subjects using intersubject correlation. Soc. Cogn. Affect. Neurosci. 14, 667–685 (2019).
Nastase, S. A. et al. The ‘Narratives’ fMRI dataset for evaluating models of naturalistic language comprehension. Sci. Data 8, 250 (2021).
Fedorenko, E., Hsieh, P.-J., Nieto-Castañón, A., Whitfield-Gabrieli, S. & Kanwisher, N. New method for fMRI investigations of language: defining ROIs functionally in individual subjects. J. Neurophysiol. 104, 1177–1194 (2010).
Nieto-Castañón, A. & Fedorenko, E. Subject-specific functional localizers increase sensitivity and functional resolution of multi-subject analyses. NeuroImage 63, 1646–1669 (2012).
Braga, R. M., DiNicola, L. M., Becker, H. C. & Buckner, R. L. Situating the left-lateralized language network in the broader organization of multiple specialized large-scale distributed networks. J. Neurophysiol. 124, 1415–1448 (2020).
Lipkin, B. et al. Probabilistic atlas for the language network based on precision fMRI data from >800 individuals. Sci. Data 9, 529 (2022).
Haxby, J. V. et al. A common, high-dimensional model of the representational space in human ventral temporal cortex. Neuron 72, 404–416 (2011).
Chen, P.-H. et al. A reduced-dimension fMRI shared response model. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) (Curran Associates, 2015).
Guntupalli, J. S. et al. A model of representational spaces in human cortex. Cereb. Cortex 26, 2919–2934 (2016).
Haxby, J. V., Guntupalli, J. S., Nastase, S. A. & Feilong, M. Hyperalignment: modeling shared information encoded in idiosyncratic cortical topographies. eLife 9, e56601 (2020).
Feilong, M. et al. The Individualized Neural Tuning Model: precise and generalizable cartography of functional architecture in individual brains. Imag. Neurosci. 1, 1–34 (2023).
Owen, L. L. W. et al. A Gaussian process model of human electrocorticographic data. Cereb. Cortex 30, 5333–5345 (2020).
Van Uden, C. E. et al. Modeling semantic encoding in a common neural representational space. Front. Neurosci. 12, 378029 (2018).
Nastase, S. A., Liu, Y.-F., Hillman, H., Norman, K. A. & Hasson, U. Leveraging shared connectivity to aggregate heterogeneous datasets into a common response space. NeuroImage 217, 116865 (2020).
Radford, A. et al. Language models are unsupervised multitask learners (OpenAI Blog, 2019).
Naselaris, T., Kay, K. N., Nishimoto, S. & Gallant, J. L. Encoding and decoding in fMRI. NeuroImage 56, 400–410 (2011).
Huth, A. G., De Heer, W. A., Griffiths, T. L., Theunissen, F. E. & Gallant, J. L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 532, 453–458 (2016).
Desikan, R. S. et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage 31, 968–980 (2006).
Hickok, G. & Poeppel, D. The cortical organization of speech processing. Nat. Rev. Neurosci. 8, 393–402 (2007).
de Heer, W. A., Huth, A. G., Griffiths, T. L., Gallant, J. L. & Theunissen, F. E. The hierarchical cortical organization of human speech processing. J. Neurosci. 37, 6539–6557 (2017).
Honnibal, M. et al. spaCy: industrial-strength natural language processing in Python. Zenodo https://doi.org/10.5281/zenodo.1212303 (2020).
Pennington, J., Socher, R. & Manning, C. GloVe: global vectors for word representation. In Proc. 2014 Conference on Empirical Methods in Natural Language Processing (eds Moschitti, A., Pang, B. & Daelemans, W.) 1532–1543 (Association for Computational Linguistics, 2014).
Antonello, R., Vaidya, A. & Huth, A. Scaling laws for language encoding models in fMRI. In Advances in Neural Information Processing Systems Vol. 36 (eds Oh, A. et al.) 21895–21907 (Curran Associates, 2023).
Conwell, C., Prince, J. S., Kay, K. N., Alvarez, G. A. & Konkle, T. A large-scale examination of inductive biases shaping high-level visual representation in brains and machines. Nat. Commun. 15, 9383 (2024).
Wang, A. Y., Kay, K., Naselaris, T., Tarr, M. J. & Wehbe, L. Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset. Nat. Mach. Intell. 5, 1415–1426 (2023).
Feilong, M., Nastase, S. A., Guntupalli, J. S. & Haxby, J. V. Reliable individual differences in fine-grained cortical functional architecture. NeuroImage 183, 375–386 (2018).
Metzger, S. L. et al. A high-performance neuroprosthesis for speech decoding and avatar control. Nature 620, 1037–1046 (2023).
Willett, F. R. et al. A high-performance speech neuroprosthesis. Nature 620, 1031–1036 (2023).
Wolf, T. et al. Transformers: state-of-the-art natural language processing. In Proc. 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (eds Liu, Q. & Schlangen, D.) 38–45 (Association for Computational Linguistics, Online, 2020).
Kauf, C., Tuckute, G., Levy, R., Andreas, J. & Fedorenko, E. Lexical-semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network. Neurobiol. Lang. 5, 7–42 (2024).
Tuckute, G. et al. Driving and suppressing the human language network using large language models. Nat. Hum. Behav. 8, 544–561 (2024).
Manning, J. R., Jacobs, J., Fried, I. & Kahana, M. J. Broadband shifts in local field potential power spectra are correlated with single-neuron spiking in humans. J. Neurosci. 29, 13613–13620 (2009).
Jia, X., Tanabe, S. & Kohn, A. Gamma and the coordination of spiking activity in early visual cortex. Neuron 77, 762–774 (2013).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289–300 (1995).
Cohen, J. D. et al. Computational approaches to fMRI analysis. Nat. Neurosci. 20, 304–313 (2017).
Bhattacharjee, A. ECoG data of 8 subjects listening to a podcast. Zenodo https://doi.org/10.5281/zenodo.15220273 (2025).
Zada, Z. et al. The ‘podcast’ ECoG dataset for modeling neural activity during natural language comprehension. Sci. Data 12, 1135 (2025).
Bhattacharjee, A. Software for the paper titled ‘Aligning brains into a shared space improves their alignment to large language models’. Zenodo https://doi.org/10.5281/zenodo.15644439 (2025).
Acknowledgements
We thank our funders: NIH grant DP1HD091948 (U.H.), NIH grant R01NS109367 (A.F.), NIH CRCNS R01DC022534 (U.H.) and the J. Insley Blair Pyne Fund (P.J.R. and U.H.).
Author information
Authors and Affiliations
Contributions
Conceptualization, A.B., S.A.N., A.G. and U.H.; data curation, B.A., W.D., D.F., P.D., A.F. and O.D.; formal analysis, A.B.; funding acquisition, P.J.R. and U.H.; methodology, A.B., S.A.N. and U.H.; project administration, A.B., S.A.N., P.J.R. and U.H.; software, A.B.; supervision, S.A.N. and U.H.; visualization, A.B., S.A.N. and U.H.; writing—original draft, A.B. and S.A.N.; writing—review and editing, A.B., S.A.N., Z.Z., H.W. and U.H.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Mark Lescroart, Jeremy R. Manning and Alex Murphy for their contribution to the peer review of this work. Primary Handling Editor: Ananya Rastogi, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Source data
Source Data Fig. 1
Contains the .mat files with statistical source data for Fig. 1, plus a README.txt pointing to the GitHub notebook code that shows how to use these files to regenerate the figure.
Source Data Fig. 2
Contains the .mat files with statistical source data for Fig. 2, plus a README.txt pointing to the GitHub notebook code that shows how to use these files to regenerate the figure.
Source Data Fig. 3
Contains the .mat files with statistical source data for Fig. 3, plus a README.txt pointing to the GitHub notebook code that shows how to use these files to regenerate the figure.
Source Data Fig. 4
Contains the .mat files with statistical source data for Fig. 4, plus a README.txt pointing to the GitHub notebook code that shows how to use these files to regenerate the figure.
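Since the source data ship as .mat files, a minimal way to inspect one in Python is shown below; the filename and variable names here are hypothetical, and the bundled README.txt lists the actual ones along with the notebook code that consumes them.

```python
from scipy.io import loadmat

# Hypothetical filename; consult the README.txt for the real file names.
data = loadmat("source_data_fig1.mat")

# loadmat returns a dict mapping variable names to arrays;
# keys starting with "__" are MATLAB file metadata, not data.
for name, value in data.items():
    if not name.startswith("__"):
        print(name, getattr(value, "shape", type(value)))
```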
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bhattacharjee, A., Zada, Z., Wang, H. et al. Aligning brains into a shared space improves their alignment with large language models. Nat Comput Sci (2025). https://doi.org/10.1038/s43588-025-00900-y
This article is cited by
- Viability of using LLMs as models of human language processing. Nature Computational Science (2025).