Abstract
Altruism underlies cooperative behaviours that facilitate social complexity. In late 2022 and early 2023, we tested whether particular large language models—then in widespread use—generated completions that simulated altruism when prompted with text inputs similar to those used in ‘dictator game’ experiments measuring human altruism. Here we report that one model in our initial study set—OpenAI’s text-davinci-003—consistently generated completions that simulated payoff maximization in a non-social decision task yet simulated altruism in dictator games. Comparable completions appeared when we replicated our experiments, altered prompt phrasing, varied model parameters, altered currencies described in the prompt and studied a subsequent model, GPT-4. Furthermore, application of explainable artificial intelligence techniques showed that results changed little when instructing the system to ignore past research on the dictator or ultimatum games but changed noticeably when instructing the system to focus on the needs of particular participants in a simulated social encounter.
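To make the experimental setup concrete, here is a minimal sketch of how a dictator-game prompt could be sent to text-davinci-003 through the legacy OpenAI completions interface that served the model in late 2022 and early 2023. The prompt wording, temperature value and response parsing below are illustrative assumptions, not the authors' actual materials; the exact prompts, parameter sweeps and analysis code are available via OSF (see Data availability and Code availability).

```python
# Minimal sketch of a dictator-game prompt sent to a legacy OpenAI
# completions model. The prompt text, parameters and parsing are
# illustrative assumptions, not the study's actual materials. Note
# that both text-davinci-003 and the pre-1.0 openai SDK used here
# have since been retired by OpenAI.
import re

import openai  # pre-1.0 SDK, matching the 2022-2023 API surface

openai.api_key = "YOUR_API_KEY"  # placeholder

# Hypothetical dictator-game prompt: the model acts as the 'dictator'
# who unilaterally splits an endowment with a passive recipient.
PROMPT = (
    "You have been given $10. You may give any portion of it to a "
    "stranger, who must accept whatever you decide. The stranger keeps "
    "what you give; you keep the rest.\n"
    "How many dollars do you give to the stranger? Answer with a number."
)

def sample_allocation(temperature: float = 0.7) -> float | None:
    """Request one completion and parse the simulated allocation."""
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=PROMPT,
        max_tokens=16,
        temperature=temperature,  # one of the parameters varied across runs
    )
    text = response["choices"][0]["text"]
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None

if __name__ == "__main__":
    # Repeated sampling approximates a distribution of simulated
    # decisions, analogous to collecting choices from many human
    # participants in a dictator-game experiment.
    gifts = [sample_allocation() for _ in range(10)]
    print(gifts)
```

Varying the prompt phrasing, the currency named in the prompt, the sampling parameters or the model identifier, as the study describes, amounts to repeating this sampling loop across those conditions and comparing the resulting allocation distributions.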
Data availability
To enable readers to replicate the study and explore the implications of our methods, we have deposited all datasets alongside this Article. These data are available via OSF at https://osf.io/c83m4.
Code availability
To facilitate replication and exploration of our methods, we have likewise deposited all code used in this Article alongside the study data. This code is available via OSF at https://osf.io/c83m4.
Acknowledgements
The wisdom and insight that M. Cebrian shared during our research provided both practical ways to improve our work and the inspiration to do so. T.J. thanks M. Selby, G. Johnson, E. Johnson, J. Chen and O. Smirnov for helpful conversations that improved the present research. T.J. also thanks the late J. M. Orbell for insights concerning the study of altruism and cooperation that continue to carry influence. Finally, we thank OpenAI’s Researcher Access Program for generous resources that supported this work.
Author information
Authors and Affiliations
Contributions
T.J. devised the research idea and experimental design; N.O. edited the experimental design; T.J. implemented the experiment; T.J. and N.O. designed the data analysis and figures; T.J. conducted data analysis and produced figures; N.O. contributed to data analysis and reproduced figures; T.J. and N.O. interpreted the results; T.J. drafted the manuscript; T.J. and N.O. revised the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors are participants in OpenAI’s Researcher Access Program.
Peer review
Peer review information
Nature Human Behaviour thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Sections 1–16, Figs. 1–34 and Tables 1–17.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Johnson, T., Obradovich, N. Testing for completions that simulate altruism in early language models. Nat Hum Behav 9, 1861–1870 (2025). https://doi.org/10.1038/s41562-025-02258-7
Received:
Accepted:
Published:
Issue date:
DOI: https://doi.org/10.1038/s41562-025-02258-7