Aligning brains into a shared space improves their alignment with large language models

Abstract

Recent research demonstrates that large language models can predict neural activity recorded via electrocorticography during natural language processing. To predict word-by-word neural activity, most prior work evaluates encoding models within individual electrodes and participants, limiting generalizability. Here we analyze electrocorticography data from eight participants listening to the same 30-min podcast. Using a shared response model, we estimate a common information space across participants. This shared space substantially enhances large language model-based encoding performance and enables denoising of individual brain responses by projecting back into participant-specific electrode spaces—yielding a 37% average improvement in encoding accuracy (from r = 0.188 to r = 0.257). The greatest gains occur in brain areas specialized for language comprehension, particularly the superior temporal gyrus and inferior frontal gyrus. Our findings highlight that estimating a shared space allows us to construct encoding models that better generalize across individuals.
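The shared-space idea can be illustrated with a minimal sketch. It assumes a deterministic shared response model fitted by alternating orthogonal Procrustes updates on synthetic data; it is not the authors' implementation (their code is linked under Code availability), and the function name `fit_srm`, the noise level and the electrode counts are all hypothetical. Each participant gets an orthonormal basis that maps their electrodes into a common k-dimensional space, and projecting the shared response back through that basis gives a denoised estimate in the participant's own electrode space.

```python
import numpy as np


def fit_srm(data, k, n_iter=20, seed=0):
    """Deterministic shared response model (illustrative sketch).

    data: list of arrays, one per participant, each of shape
          (n_electrodes_i, n_timepoints), with n_electrodes_i >= k.
    Returns a list of orthonormal bases W_i of shape (n_electrodes_i, k)
    and the shared response S of shape (k, n_timepoints).
    """
    rng = np.random.default_rng(seed)
    # Start from a random orthonormal basis for every participant.
    bases = [np.linalg.qr(rng.standard_normal((x.shape[0], k)))[0] for x in data]
    shared = np.mean([w.T @ x for w, x in zip(bases, data)], axis=0)
    for _ in range(n_iter):
        # Basis update: orthogonal Procrustes solution for each participant.
        bases = []
        for x in data:
            u, _, vt = np.linalg.svd(x @ shared.T, full_matrices=False)
            bases.append(u @ vt)
        # Shared-response update: average of the projected participant data.
        shared = np.mean([w.T @ x for w, x in zip(bases, data)], axis=0)
    return bases, shared


# Synthetic stand-in for ECoG data: three participants with different
# electrode counts, all driven by the same k-dimensional latent signal.
rng = np.random.default_rng(1)
n_timepoints, k = 500, 5
latent = rng.standard_normal((k, n_timepoints))
electrode_counts = [12, 20, 16]
data = []
for n_elec in electrode_counts:
    mixing = np.linalg.qr(rng.standard_normal((n_elec, k)))[0]
    noise = 0.5 * rng.standard_normal((n_elec, n_timepoints))
    data.append(mixing @ latent + noise)

bases, shared = fit_srm(data, k=k)

# Project the shared response back into each participant's electrode space.
denoised = [w @ shared for w in bases]
```

In an encoding analysis, either `shared` or the entries of `denoised` would replace the raw electrode signals as regression targets for the language-model embeddings. To avoid leakage, the bases would need to be estimated on training data only and then applied to held-out data.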

Fig. 1: Improving model-based encoding performance with SRM.
Fig. 2: Reconstructing electrode activity via the shared space.
Fig. 3: Comparison of encoding performance for SRM-reconstructed data and original electrode data for different regions of the language network.
Fig. 4: Cross-participant encoding performance via the shared space.

Data availability

Sample data are available via Zenodo at https://zenodo.org/records/15220273 (ref. 53), and the full raw dataset is publicly available at https://openneuro.org/datasets/ds005574/versions/1.0.0 (ref. 54). Source data are provided with this paper.

Code availability

Code used to analyze the data is publicly available via GitHub at https://github.com/pritamarnab/SRM-Encoding (ref. 55).

References

  1. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (eds Burstein, J., Doran, C. & Solorio, T.) 4171–4186 (Association for Computational Linguistics, 2019).

  2. Brown, T. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems Vol. 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).

  3. Manning, C. D., Clark, K., Hewitt, J., Khandelwal, U. & Levy, O. Emergent linguistic structure in artificial neural networks trained by self-supervision. Proc. Natl Acad. Sci. USA 117, 30046–30054 (2020).

  4. Schrimpf, M. et al. The neural architecture of language: integrative modeling converges on predictive processing. Proc. Natl Acad. Sci. USA 118, e2105646118 (2021).

  5. Caucheteux, C. & King, J.-R. Brains and algorithms partially converge in natural language processing. Commun. Biol. 5, 134 (2022).

  6. Goldstein, A. et al. Shared computational principles for language processing in humans and deep language models. Nat. Neurosci. 25, 369–380 (2022).

  7. Toneva, M., Mitchell, T. M. & Wehbe, L. Combining computational controls with natural text reveals aspects of meaning composition. Nat. Comput. Sci. 2, 745–757 (2022).

  8. Kumar, S. et al. Shared functional specialization in transformer-based language models and the human brain. Nat. Commun. 15, 5523 (2024).

  9. Cai, J., Hadjinicolaou, A. E., Paulk, A. C., Williams, Z. M. & Cash, S. S. Natural language processing models reveal neural dynamics of human conversation. Nat. Commun. 16, 3376 (2025).

  10. Goldstein, A. et al. A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations. Nat. Hum. Behav. 9, 1041–1055 (2025).

  11. Mischler, G., Li, Y. A., Bickel, S., Mehta, A. D. & Mesgarani, N. Contextual feature extraction hierarchies converge in large language models and the brain. Nat. Mach. Intell. 6, 1467–1477 (2024).

  12. Goldstein, A. et al. Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns. Nat. Commun. 15, 2768 (2024).

  13. Hong, Z. et al. Scale matters: large language models with billions (rather than millions) of parameters better match neural representations of natural language. eLife 13, RP101204 (2024).

  14. Zada, Z. et al. A shared model-based linguistic space for transmitting our thoughts from brain to brain in natural conversations. Neuron 112, 3211–3222 (2024).

  15. Lerner, Y., Honey, C. J., Silbert, L. J. & Hasson, U. Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. J. Neurosci. 31, 2906–2915 (2011).

  16. Honey, C. J. et al. Slow cortical dynamics and the accumulation of information over long timescales. Neuron 76, 423–434 (2012).

  17. Hasson, U., Chen, J. & Honey, C. J. Hierarchical process memory: memory as an integral component of information processing. Trends Cogn. Sci. 19, 304–313 (2015).

  18. Nastase, S. A., Gazzola, V., Hasson, U. & Keysers, C. Measuring shared responses across subjects using intersubject correlation. Soc. Cogn. Affect. Neurosci. 14, 667–685 (2019).

  19. Nastase, S. A. et al. The ‘Narratives’ fMRI dataset for evaluating models of naturalistic language comprehension. Sci. Data 8, 250 (2021).

  20. Fedorenko, E., Hsieh, P.-J., Nieto-Castañón, A., Whitfield-Gabrieli, S. & Kanwisher, N. New method for fMRI investigations of language: defining ROIs functionally in individual subjects. J. Neurophysiol. 104, 1177–1194 (2010).

  21. Nieto-Castañón, A. & Fedorenko, E. Subject-specific functional localizers increase sensitivity and functional resolution of multi-subject analyses. NeuroImage 63, 1646–1669 (2012).

  22. Braga, R. M., DiNicola, L. M., Becker, H. C. & Buckner, R. L. Situating the left-lateralized language network in the broader organization of multiple specialized large-scale distributed networks. J. Neurophysiol. 124, 1415–1448 (2020).

  23. Lipkin, B. et al. Probabilistic atlas for the language network based on precision fMRI data from >800 individuals. Sci. Data 9, 529 (2022).

  24. Haxby, J. V. et al. A common, high-dimensional model of the representational space in human ventral temporal cortex. Neuron 72, 404–416 (2011).

  25. Chen, P.-H. et al. A reduced-dimension fMRI shared response model. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) (Curran Associates, 2015).

  26. Guntupalli, J. S. et al. A model of representational spaces in human cortex. Cereb. Cortex 26, 2919–2934 (2016).

  27. Haxby, J. V., Guntupalli, J. S., Nastase, S. A. & Feilong, M. Hyperalignment: modeling shared information encoded in idiosyncratic cortical topographies. eLife 9, e56601 (2020).

  28. Feilong, M. et al. The Individualized Neural Tuning Model: precise and generalizable cartography of functional architecture in individual brains. Imag. Neurosci. 1, 1–34 (2023).

  29. Owen, L. L. W. et al. A Gaussian process model of human electrocorticographic data. Cereb. Cortex 30, 5333–5345 (2020).

  30. Van Uden, C. E. et al. Modeling semantic encoding in a common neural representational space. Front. Neurosci. 12, 378029 (2018).

  31. Nastase, S. A., Liu, Y.-F., Hillman, H., Norman, K. A. & Hasson, U. Leveraging shared connectivity to aggregate heterogeneous datasets into a common response space. NeuroImage 217, 116865 (2020).

  32. Radford, A. et al. Language Models Are Unsupervised Multitask Learners (OpenAI Blog, 2019).

  33. Naselaris, T., Kay, K. N., Nishimoto, S. & Gallant, J. L. Encoding and decoding in fMRI. NeuroImage 56, 400–410 (2011).

  34. Huth, A. G., De Heer, W. A., Griffiths, T. L., Theunissen, F. E. & Gallant, J. L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 532, 453–458 (2016).

  35. Desikan, R. S. et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage 31, 968–980 (2006).

  36. Hickok, G. & Poeppel, D. The cortical organization of speech processing. Nat. Rev. Neurosci. 8, 393–402 (2007).

  37. de Heer, W. A., Huth, A. G., Griffiths, T. L., Gallant, J. L. & Theunissen, F. E. The hierarchical cortical organization of human speech processing. J. Neurosci. 37, 6539–6557 (2017).

  38. Honnibal, M. et al. spaCy: industrial-strength natural language processing in Python. Zenodo https://doi.org/10.5281/zenodo.1212303 (2020).

  39. Pennington, J., Socher, R. & Manning, C. GloVe: global vectors for word representation. In Proc. 2014 Conference on Empirical Methods in Natural Language Processing (eds Moschitti, A., Pang, B. & Daelemans, W.) 1532–1543 (Association for Computational Linguistics, 2014).

  40. Antonello, R., Vaidya, A. & Huth, A. Scaling laws for language encoding models in fMRI. In Advances in Neural Information Processing Systems Vol. 36 (eds Oh, A. et al.) 21895–21907 (Curran Associates, 2023).

  41. Conwell, C., Prince, J. S., Kay, K. N., Alvarez, G. A. & Konkle, T. A large-scale examination of inductive biases shaping high-level visual representation in brains and machines. Nat. Commun. 15, 9383 (2024).

  42. Wang, A. Y., Kay, K., Naselaris, T., Tarr, M. J. & Wehbe, L. Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset. Nat. Mach. Intell. 5, 1415–1426 (2023).

  43. Feilong, M., Nastase, S. A., Guntupalli, J. S. & Haxby, J. V. Reliable individual differences in fine-grained cortical functional architecture. NeuroImage 183, 375–386 (2018).

  44. Metzger, S. L. et al. A high-performance neuroprosthesis for speech decoding and avatar control. Nature 620, 1037–1046 (2023).

  45. Willett, F. R. et al. A high-performance speech neuroprosthesis. Nature 620, 1031–1036 (2023).

  46. Wolf, T. et al. Transformers: state-of-the-art natural language processing. In Proc. 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (eds Liu, Q. & Schlangen, D.) 38–45 (Association for Computational Linguistics, Online, 2020).

  47. Kauf, C., Tuckute, G., Levy, R., Andreas, J. & Fedorenko, E. Lexical-semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network. Neurobiol. Lang. 5, 7–42 (2024).

  48. Tuckute, G. et al. Driving and suppressing the human language network using large language models. Nat. Hum. Behav. 8, 544–561 (2024).

  49. Manning, J. R., Jacobs, J., Fried, I. & Kahana, M. J. Broadband shifts in local field potential power spectra are correlated with single-neuron spiking in humans. J. Neurosci. 29, 13613–13620 (2009).

  50. Jia, X., Tanabe, S. & Kohn, A. Gamma and the coordination of spiking activity in early visual cortex. Neuron 77, 762–774 (2013).

  51. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289–300 (1995).

  52. Cohen, J. D. et al. Computational approaches to fMRI analysis. Nat. Neurosci. 20, 304–313 (2017).

  53. Bhattacharjee, A. ECoG data of 8 subjects listening to a podcast. Zenodo https://doi.org/10.5281/zenodo.15220273 (2025).

  54. Zada, Z. et al. The ‘podcast’ ECoG dataset for modeling neural activity during natural language comprehension. Sci. Data 12, 1135 (2025).

  55. Bhattacharjee, A. Software for the paper titled aligning brains into a shared space improves their alignment to large language models. Zenodo https://doi.org/10.5281/zenodo.15644439 (2025).

Acknowledgements

We thank our funders: NIH grant DP1HD091948 (U.H.), NIH grant R01NS109367 (A.F.), NIH CRCNS R01DC022534 (U.H.) and J. Insley Blair Pyne Fund (P.J.R. and U.H.).

Author information

Contributions

Conceptualization, A.B., S.A.N., A.G. and U.H.; data curation, B.A., W.D., D.F., P.D., A.F. and O.D.; formal analysis, A.B.; funding acquisition, P.J.R. and U.H.; methodology, A.B., S.A.N. and U.H.; project administration, A.B., S.A.N., P.J.R. and U.H.; software, A.B.; supervision, S.A.N. and U.H.; visualization, A.B., S.A.N. and U.H.; writing—original draft, A.B. and S.A.N.; writing—review and editing, A.B., S.A.N., Z.Z., H.W. and U.H.

Corresponding author

Correspondence to Arnab Bhattacharjee.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Mark Lescroart, Jeremy R. Manning and Alex Murphy for their contribution to the peer review of this work. Primary Handling Editor: Ananya Rastogi, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–8 and Tables 1–5.

Reporting Summary

Peer Review file

Source data

Source Data Fig. 1

Contains the .mat files with the statistical source data for Fig. 1, along with a README.txt file that points to the GitHub notebook showing how to generate the figure from these files.

Source Data Fig. 2

Contains the .mat files with the statistical source data for Fig. 2, along with a README.txt file that points to the GitHub notebook showing how to generate the figure from these files.

Source Data Fig. 3

Contains the .mat files with the statistical source data for Fig. 3, along with a README.txt file that points to the GitHub notebook showing how to generate the figure from these files.

Source Data Fig. 4

Contains the .mat files with the statistical source data for Fig. 4, along with a README.txt file that points to the GitHub notebook showing how to generate the figure from these files.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bhattacharjee, A., Zada, Z., Wang, H. et al. Aligning brains into a shared space improves their alignment with large language models. Nat Comput Sci (2025). https://doi.org/10.1038/s43588-025-00900-y

  • DOI: https://doi.org/10.1038/s43588-025-00900-y
