Abstract
Large language models (LLMs) offer potential benefits in clinical care. However, concerns remain regarding socio-demographic biases embedded in their outputs. Opioid prescribing is one domain in which these biases can have serious implications, especially given the ongoing opioid epidemic and the need to balance effective pain management with addiction risk. We tested ten LLMs—both open access and closed source—on 1,000 acute-pain vignettes. Half of the vignettes were labelled as non-cancer and half as cancer. Each vignette was presented in 34 socio-demographic variations, including a control group without demographic identifiers. We analysed the models’ recommendations on opioids, anxiety treatment, perceived psychological stress, risk scores and monitoring recommendations, yielding 3.4 million model-generated responses overall. Using logistic and linear mixed-effects models, we measured how these outputs varied by demographic group and whether a cancer diagnosis intensified or reduced observed disparities. Across both cancer and non-cancer cases, historically marginalized groups—especially cases labelled as individuals who were unhoused or Black or who identified as LGBTQIA+—often received more or stronger opioid recommendations, sometimes exceeding 90% in cancer settings, despite being labelled as high risk by the same models. Meanwhile, low-income or unemployed groups were assigned elevated risk scores yet fewer opioid recommendations, hinting at inconsistent rationales. Disparities in anxiety treatment and perceived psychological stress similarly clustered within marginalized populations, even when clinical details were identical. These patterns diverged from standard guidelines and point to model-driven bias rather than acceptable clinical variation. Our findings underscore the need for rigorous bias evaluation and the integration of guideline-based checks in LLMs to ensure equitable and evidence-based pain care.
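The counterfactual design described above (identical clinical details, varying only the socio-demographic label, plus an unlabelled control) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the prompt wording and the `control` key are hypothetical, and the raw difference in recommendation rates is a simpler descriptive stand-in for the logistic and linear mixed-effects models used in the study.

```python
from collections import defaultdict

# Hypothetical key for the variant that carries no demographic identifiers.
CONTROL = "control"


def make_variants(vignette, labels):
    """Return one prompt per demographic label plus an unlabelled control.

    The clinical text is identical in every variant; only the label differs.
    The sentence template is an illustrative assumption.
    """
    variants = {CONTROL: vignette}
    for label in labels:
        variants[label] = f"The patient is {label}. {vignette}"
    return variants


def recommendation_rates(records):
    """Compute the share of opioid recommendations per group.

    records: iterable of (group, recommended) pairs, where recommended is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [recommended, total]
    for group, recommended in records:
        counts[group][0] += int(recommended)
        counts[group][1] += 1
    return {group: yes / total for group, (yes, total) in counts.items()}


def gaps_vs_control(rates):
    """Difference between each group's rate and the control rate."""
    baseline = rates[CONTROL]
    return {g: r - baseline for g, r in rates.items() if g != CONTROL}


if __name__ == "__main__":
    # Toy records invented for illustration only; not study data.
    toy = [
        (CONTROL, True), (CONTROL, False),
        ("group A", True), ("group A", True),
        ("group A", True), ("group A", False),
    ]
    print(gaps_vs_control(recommendation_rates(toy)))
```

A positive gap means the labelled group received opioid recommendations more often than the control for the same clinical content. A mixed-effects model would additionally account for repeated measurements within each vignette and model, which this sketch ignores.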
Data availability
The data and clinical vignettes used in this study can be fully accessed by qualified researchers through the corresponding author for research purposes related to the implementation or evaluation of artificial intelligence in pain management for cancer or non-cancer conditions. Requests should be submitted by email to the corresponding author and will be reviewed and responded to within one month.
Code availability
The code used for data processing and analysis is provided in Supplementary Section 1. Additional scripts can be made available by the corresponding author for academic or research use related to artificial intelligence applications in pain management. Requests will be reviewed and answered within one month.
Acknowledgements
This work was supported in part through the computational and data resources and staff expertise provided by Scientific Computing and Data at the Icahn School of Medicine at Mount Sinai, as well as Clinical and Translational Science Awards grant UL1TR004419 from the National Center for Advancing Translational Sciences (to G.N.N.). Research reported in this publication was also supported by the Office of Research Infrastructure of the National Institutes of Health under awards S10OD026880 and S10OD030463 (to G.N.N.). The content of this Article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The funders played no role in study design, data collection, analysis and interpretation of data, or the writing of this paper.
Author information
Contributions
M.O. led the study design, case validation, data analysis, visualizations and paper drafting. S.S. contributed to study design, case validation, draft writing and editing. R.A., Y.L.H., D.U.A., A.W.C., N.L.B., D.L.R., B.S.G., G.N.N. and E.K. contributed substantially to the project and to editing and revising the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Health thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lorenzo Righetto, in collaboration with the Nature Health editorial team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Sections 1–3, Tables 1–4, Figs. 1 and 2, methods and raw results.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Omar, M., Soffer, S., Agbareia, R. et al. Socio-demographic gaps in pain management guided by large language models. Nat. Health 1, 216–225 (2026). https://doi.org/10.1038/s44360-025-00017-6
DOI: https://doi.org/10.1038/s44360-025-00017-6


