

Testing for completions that simulate altruism in early language models

Abstract

Altruism underlies cooperative behaviours that facilitate social complexity. In late 2022 and early 2023, we tested whether particular large language models—then in widespread use—generated completions that simulated altruism when prompted with text inputs similar to those used in ‘dictator game’ experiments measuring human altruism. Here we report that one model in our initial study set—OpenAI’s text-davinci-003—consistently generated completions that simulated payoff maximization in a non-social decision task yet simulated altruism in dictator games. Comparable completions appeared when we replicated our experiments, altered prompt phrasing, varied model parameters, altered currencies described in the prompt and studied a subsequent model, GPT-4. Furthermore, application of explainable artificial intelligence techniques showed that results changed little when instructing the system to ignore past research on the dictator or ultimatum games but changed noticeably when instructing the system to focus on the needs of particular participants in a simulated social encounter.
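
The study's exact prompts, parameters and analysis code are deposited on OSF (see the data and code availability statements below). As a minimal sketch of the general procedure the abstract describes, and assuming the legacy OpenAI Completions endpoint (openai Python package <1.0) through which text-davinci-003 was served before its retirement, a dictator-game-style prompt could be submitted as follows; the prompt wording, parameter values and response parsing here are illustrative assumptions, not the authors' materials.

```python
# A minimal sketch, not the authors' code: submitting a dictator-game-style
# prompt to text-davinci-003 via the legacy OpenAI Completions endpoint
# (openai Python package <1.0). Prompt wording, parameter values and response
# parsing are illustrative assumptions; text-davinci-003 has since been
# retired by OpenAI, so this snippet is historical.
import re

import openai  # pip install "openai<1.0"

openai.api_key = "sk-..."  # supply your own API key

PROMPT = (
    "You have $10 to divide between yourself and an anonymous stranger, "
    "who must accept whatever you decide.\n"
    "How many dollars do you give to the stranger?\n"
    "Answer:"
)

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=PROMPT,
    temperature=1.0,  # one of the model parameters the study varies
    max_tokens=10,
    n=30,             # repeated completions approximate a response distribution
)

# Pull the first number from each completion and express it as a proportion
# of the $10 endowment, discarding malformed or out-of-range answers.
shared = []
for choice in response["choices"]:
    match = re.search(r"\d+(?:\.\d+)?", choice["text"])
    if match and 0 <= float(match.group()) <= 10:
        shared.append(float(match.group()) / 10)

if shared:
    print(f"Mean simulated proportion shared: {sum(shared) / len(shared):.2f}")
```

Sampling repeated completions per prompt in this way yields a distribution of simulated sharing rather than a single point estimate.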


Fig. 1: Simulated proportion shared in completions responding to social-decision task prompts involving text-davinci-003-to-human and text-davinci-003-to-AI pairings.
Fig. 2: Distribution of simulated altruism when prompts provide text-davinci-003 with further instructions in the social-decision task.
Fig. 3: Simulated sharing behaviour depicted in GPT-4’s completions from the text-insertion variant of the dictator game.
Fig. 4: Simulated decisions in GPT-4’s completions when prompts described battery life and system usage as resources available for consumption.


Data availability

To enable readers to replicate the study and explore the implications of our methods, we have submitted all datasets with this Article. These data are available via OSF at https://osf.io/c83m4.

Code availability

To facilitate replication and exploration of our methods, we have likewise submitted all code used in this Article. This code is available via OSF at https://osf.io/c83m4.
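
Both deposits live in the same public OSF project, so they can be fetched programmatically as well as through the browser. A minimal sketch, assuming the third-party osfclient package rather than any tooling the authors used:

```python
# A minimal sketch, assuming the third-party osfclient package
# (pip install osfclient); anonymous read access works because the
# OSF project is public. Not part of the authors' published workflow.
import os

from osfclient import OSF

project = OSF().project("c83m4")  # the project ID from the URL above

for storage in project.storages:
    for remote_file in storage.files:
        local_path = os.path.join("osf_c83m4", remote_file.path.lstrip("/"))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with open(local_path, "wb") as fp:
            remote_file.write_to(fp)  # stream the file's bytes to disk
        print("fetched", remote_file.path)
```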


Acknowledgements

The wisdom and insight that M. Cebrian shared during our research provided both practical ways to improve our work and the inspiration to do so. T.J. thanks M. Selby, G. Johnson, E. Johnson, J. Chen and O. Smirnov for helpful conversations that improved the present research. T.J. also thanks the late J. M. Orbell for insights concerning the study of altruism and cooperation that continue to carry influence. Finally, we thank OpenAI’s Researcher Access Program for generous resources that supported this work.

Author information


Contributions

T.J. devised the research idea and experimental design; N.O. edited the experimental design; T.J. implemented the experiment; T.J. and N.O. designed the data analysis and figures; T.J. conducted data analysis and produced figures; N.O. contributed to data analysis and reproduced figures; T.J. and N.O. interpreted the results; T.J. drafted the manuscript; T.J. and N.O. revised the manuscript.

Corresponding author

Correspondence to Tim Johnson.

Ethics declarations

Competing interests

The authors are participants in OpenAI’s Researcher Access Program.

Peer review

Peer review information

Nature Human Behaviour thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–16, Figs. 1–34 and Tables 1–17.

Reporting Summary

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Johnson, T., Obradovich, N. Testing for completions that simulate altruism in early language models. Nat Hum Behav 9, 1861–1870 (2025). https://doi.org/10.1038/s41562-025-02258-7

