Abstract
Altruism underlies cooperative behaviours that facilitate social complexity. In late 2022 and early 2023, we tested whether particular large language models—then in widespread use—generated completions that simulated altruism when prompted with text inputs similar to those used in ‘dictator game’ experiments measuring human altruism. Here we report that one model in our initial study set—OpenAI’s text-davinci-003—consistently generated completions that simulated payoff maximization in a non-social decision task yet simulated altruism in dictator games. Comparable completions appeared when we replicated our experiments, altered prompt phrasing, varied model parameters, altered currencies described in the prompt and studied a subsequent model, GPT-4. Furthermore, application of explainable artificial intelligence techniques showed that results changed little when instructing the system to ignore past research on the dictator or ultimatum games but changed noticeably when instructing the system to focus on the needs of particular participants in a simulated social encounter.
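To make the experimental setup concrete, here is a minimal sketch of how a dictator-game prompt could be sent to text-davinci-003 through the legacy OpenAI completions interface that served the model in late 2022 and early 2023. The prompt wording, temperature value and response parsing below are illustrative assumptions, not the authors' actual materials; the exact prompts, parameter sweeps and analysis code are available via OSF (see Data availability and Code availability).

```python
# Minimal sketch of a dictator-game prompt sent to a legacy OpenAI
# completions model. The prompt text, parameters and parsing are
# illustrative assumptions, not the study's actual materials. Note
# that both text-davinci-003 and the pre-1.0 openai SDK used here
# have since been retired by OpenAI.
import re

import openai  # pre-1.0 SDK, matching the 2022-2023 API surface

openai.api_key = "YOUR_API_KEY"  # placeholder

# Hypothetical dictator-game prompt: the model acts as the 'dictator'
# who unilaterally splits an endowment with a passive recipient.
PROMPT = (
    "You have been given $10. You may give any portion of it to a "
    "stranger, who must accept whatever you decide. The stranger keeps "
    "what you give; you keep the rest.\n"
    "How many dollars do you give to the stranger? Answer with a number."
)

def sample_allocation(temperature: float = 0.7) -> float | None:
    """Request one completion and parse the simulated allocation."""
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=PROMPT,
        max_tokens=16,
        temperature=temperature,  # one of the parameters varied across runs
    )
    text = response["choices"][0]["text"]
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None

if __name__ == "__main__":
    # Repeated sampling approximates a distribution of simulated
    # decisions, analogous to collecting choices from many human
    # participants in a dictator-game experiment.
    gifts = [sample_allocation() for _ in range(10)]
    print(gifts)
```

Varying the prompt phrasing, the currency named in the prompt, the sampling parameters or the model identifier, as the study describes, amounts to repeating this sampling loop across those conditions and comparing the resulting allocation distributions.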
Data availability
To enable readers to replicate the study and explore the implications of our methods, we have deposited all datasets alongside this Article. These data are available via OSF at https://osf.io/c83m4.
Code availability
To facilitate replication and exploration of our methods, we have likewise deposited all code used in this Article alongside the study data. This code is available via OSF at https://osf.io/c83m4.
Acknowledgements
The wisdom and insight that M. Cebrian shared during our research provided both practical ways to improve our work and the inspiration to do so. T.J. thanks M. Selby, G. Johnson, E. Johnson, J. Chen and O. Smirnov for helpful conversations that improved the present research. T.J. also thanks the late J. M. Orbell for insights concerning the study of altruism and cooperation that continue to carry influence. Finally, we thank OpenAI’s Researcher Access Program for generous resources that supported this work.
Author information
Authors and Affiliations
Contributions
T.J. devised the research idea and experimental design; N.O. edited the experimental design; T.J. implemented the experiment; T.J. and N.O. designed the data analysis and figures; T.J. conducted data analysis and produced figures; N.O. contributed to data analysis and reproduced figures; T.J. and N.O. interpreted the results; T.J. drafted the manuscript; T.J. and N.O. revised the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors are participants in OpenAI’s Researcher Access Program.
Peer review
Peer review information
Nature Human Behaviour thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Sections 1–16, Figs. 1–34 and Tables 1–17.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Johnson, T., Obradovich, N. Testing for completions that simulate altruism in early language models. Nat Hum Behav 9, 1861–1870 (2025). https://doi.org/10.1038/s41562-025-02258-7
Received:
Accepted:
Published:
Issue date:
DOI: https://doi.org/10.1038/s41562-025-02258-7