
Active use of latent tree-structured sentence representation in humans and large language models

Abstract

Understanding how sentences are represented in the human brain, as well as in large language models (LLMs), poses a substantial challenge for cognitive science. Here we develop a one-shot learning task to investigate whether humans and LLMs encode tree-structured constituents within sentences. Participants (total N = 372; native Chinese speakers, native English speakers and Chinese–English bilinguals) and LLMs (for example, ChatGPT) were asked to infer which words should be deleted from a sentence. Both groups tended to delete constituents rather than non-constituent word strings, following rules specific to Chinese and English, respectively. The results cannot be explained by models that rely only on word properties and word positions. Crucially, the underlying constituency tree structure can be successfully reconstructed from the word strings deleted by either humans or LLMs. Altogether, these results demonstrate that latent tree-structured sentence representations emerge in both humans and LLMs.
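The constituency criterion used in this task can be made concrete with a small amount of code. The sketch below (Python) shows one way a deleted word string could be scored against a reference parse: the deleted span counts as a constituent only if it exactly matches a subtree. This is an illustrative sketch, not the authors' analysis code (their scripts are in the repository listed under Code availability); the toy sentence, parse and helper names are assumptions for demonstration.

```python
# Illustrative sketch only (not the authors' code): score whether a deleted
# word span exactly matches a subtree of a reference constituency parse.
from nltk import Tree


def constituent_spans(parse):
    """Collect the (start, end) word-index span of every subtree."""
    spans = set()

    def walk(node, start):
        if isinstance(node, str):      # leaf: a single word
            return start + 1
        end = start
        for child in node:
            end = walk(child, end)
        spans.add((start, end))
        return end

    walk(parse, 0)
    return spans


def is_constituent(start, end, parse):
    """True if deleting words [start, end) removes exactly one subtree."""
    return (start, end) in constituent_spans(parse)


# Toy parse of "the dog chased the cat" (hypothetical example sentence):
parse = Tree.fromstring(
    "(S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT the) (NN cat))))"
)
print(is_constituent(3, 5, parse))  # "the cat"    -> True (an NP)
print(is_constituent(2, 4, parse))  # "chased the" -> False (non-constituent)
```

On this toy parse, 'the cat' is an NP embedded in the VP, whereas 'chased the' spans no single subtree and would count as a non-constituent deletion.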


Fig. 1: Task and constituent rate in Dataset 1.
Fig. 2: Sensitivity to syntactic category in Dataset 1.
Fig. 3: Sensitivity to syntactic category in Dataset 2 and influence of input language.
Fig. 4: Results on Dataset 3, in which an arbitrary constituent is deleted instead of an NP embedded in a VP.
Fig. 5: Procedures to reconstruct a constituency tree based on the word-deletion task.
Fig. 6: Properties of the constituency tree constructed from word-deletion behaviour.
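Fig. 5 outlines the authors' procedure for reconstructing a constituency tree from deletion behaviour. As a rough, hypothetical illustration of the general idea only (not the published procedure), the sketch below keeps the most frequently deleted word spans that nest without crossing and renders them as a bracketing; the toy deletion counts and function names are invented for demonstration.

```python
# Hypothetical sketch (not the procedure in Fig. 5): build a bracketing from
# aggregated word-deletion counts by keeping frequent, non-crossing spans.
from collections import Counter


def compatible(span, kept):
    """A span is compatible if it is disjoint from, contains, or is contained
    in every span already kept (that is, it crosses none of them)."""
    i, j = span
    return all(j <= a or b <= i or (a <= i and j <= b) or (i <= a and b <= j)
               for a, b in kept)


def induce_brackets(n, span_counts):
    """Greedily keep the most frequently deleted multi-word spans that nest."""
    kept = {(0, n)}                      # the whole sentence is always a span
    for span, _ in sorted(span_counts.items(), key=lambda kv: -kv[1]):
        if 1 < span[1] - span[0] < n and compatible(span, kept):
            kept.add(span)
    return kept


def to_bracketed(words, kept, span=None):
    """Render the nested spans as a bracketed string."""
    if span is None:
        span = (0, len(words))
    i, j = span
    parts, k = [], i
    while k < j:
        # largest kept span that starts at k and sits strictly inside (i, j)
        ends = [b for a, b in kept if a == k and (a, b) != (i, j) and b <= j]
        if ends:
            parts.append(to_bracketed(words, kept, (k, max(ends))))
            k = max(ends)
        else:
            parts.append(words[k])
            k += 1
    return "(" + " ".join(parts) + ")"


# Toy deletion counts over word-index spans of "the dog chased the cat":
words = "the dog chased the cat".split()
counts = Counter({(3, 5): 40, (2, 5): 35, (0, 2): 30, (1, 4): 3})
print(to_bracketed(words, induce_brackets(len(words), counts)))
# -> ((the dog) (chased (the cat))); the crossing span (1, 4) is discarded
```

In this toy example the crossing span 'dog chased the' is discarded, and the surviving spans assemble into a nested bracketing.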


Data availability

All sentences and experiment results are available via GitHub at https://github.com/y1ny/WordDeletion, except for the treebank sentences, which are available at https://catalog.ldc.upenn.edu/LDC99T42 (PTB) and https://catalog.ldc.upenn.edu/LDC2013T21 (CTB). Source data are provided with this paper.

Code availability

All scripts are available via GitHub at https://github.com/y1ny/WordDeletion.


Acknowledgements

We thank S. Wang for discussions, L. Jin and X. Pan for helping to construct ambiguous sentences and M. Wolpert and F. H. Wang for comments on earlier versions of the manuscript. This work was supported by the National Science and Technology Innovation 2030 Major Project 2021ZD0204100 (2021ZD0204105 to W.L. and N.D.) and the Fundamental Research Funds for the Central Universities (226-2025-00035 to N.D.).

Author information

Contributions

N.D. conceived the study. W.L., M.X. and N.D. designed the experiment. W.L. implemented and conducted the experiments. W.L. analysed the data. W.L., M.X. and N.D. wrote the manuscript.

Corresponding author

Correspondence to Nai Ding.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Human Behaviour thanks Jonathan Brennan, Taro Watanabe and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–15, Supplementary Tables 1–3, Supplementary Results, Supplementary Discussion, lists of parallel sentences, meaningless sentences, syntactically ambiguous sentences, and demonstration sentences for syntactically ambiguous sentences, as well as the instructions and prompts for human participants and LLMs.

Reporting Summary

Peer Review File

Supplementary Data 1

Statistical source data for the Supplementary Information.

Source data

Source Data Figs. 1–4 and 6

Statistical source data for the main text and figures.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, W., Xiang, M. & Ding, N. Active use of latent tree-structured sentence representation in humans and large language models. Nat Hum Behav (2025). https://doi.org/10.1038/s41562-025-02297-0


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1038/s41562-025-02297-0
