Abstract
Background and aim
Managing obesity requires a comprehensive approach that involves therapeutic lifestyle changes, medications, or metabolic surgery. Many patients seek health information from online sources and artificial intelligence models such as ChatGPT, Google Gemini, and Microsoft Copilot before consulting health professionals. This study aims to evaluate the appropriateness of the responses of Google Gemini and Microsoft Copilot to questions on the pharmacologic and surgical management of obesity, and to assess whether their responses are biased toward either the ADA or the AACE guidelines.
Methods
Ten questions were compiled into a set and posed separately to the free editions of Google Gemini and Microsoft Copilot. The corresponding recommendations were extracted from the ADA and AACE websites, and reviewers graded each response for appropriateness, completeness, and bias toward either guideline.
Results
All responses from Microsoft Copilot (10/10; 100%) and 8/10 (80%) responses from Google Gemini were appropriate. There were no inappropriate responses: Google Gemini declined to answer the remaining two questions and instead advised consulting a physician. Microsoft Copilot (10/10; 100%) provided a higher proportion of complete responses than Google Gemini (5/10; 50%). None of the eight responses from Google Gemini was biased toward either guideline, while two of the responses from Microsoft Copilot were biased.
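The proportions above follow directly from the reported counts. As a minimal sketch (not the authors' analysis code), the tally can be reproduced in Python; the dictionary layout and the `percent` helper are illustrative assumptions, and the only inputs are the counts stated in the Results. The bias denominator for Google Gemini is taken as the eight questions it answered, as the text states.

```python
from fractions import Fraction

# Counts copied from the Results paragraph as (count, denominator).
# The structure and names here are hypothetical, for illustration only.
reported = {
    "Microsoft Copilot": {
        "appropriate": (10, 10),
        "complete": (10, 10),
        "biased": (2, 10),
    },
    "Google Gemini": {
        "appropriate": (8, 10),
        "complete": (5, 10),
        "biased": (0, 8),  # bias was graded only on the eight answered questions
    },
}


def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * Fraction(count, total))


# Percentage for every model and grading criterion.
summary = {
    model: {label: percent(*pair) for label, pair in grades.items()}
    for model, grades in reported.items()
}

for model, row in summary.items():
    print(model, row)
```

With only ten questions per model, each question shifts a proportion by ten percentage points, so the percentages are best read alongside the raw counts.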
Conclusion
The study highlights the potential role of Microsoft Copilot and Google Gemini as sources of information on weight loss management. The differences between their responses may be attributable to variation in the quality and scope of their training data and in their design.
Data availability
All data generated or analyzed during this study are included in this published article.
Author information
Contributions
Eugene Annor: conceptualization, methodology, drafting, and revision of manuscript. Joseph Atarere: conceptualization, methodology, drafting of the manuscript. Nneoma Ubah: drafting of the manuscript. Bryce Kunkle: drafting of the manuscript. Olachi Egbo: drafting of the manuscript. Oladoyin Jolaoye: drafting of the manuscript. Daniel K. Martin: drafting of the manuscript and reviewing for scientific content.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
This study does not contain identifying information about the patients. The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its subsequent amendments.
About this article
Cite this article
Annor, E., Atarere, J., Ubah, N. et al. Assessing online chat-based artificial intelligence models for weight loss recommendation appropriateness and bias in the presence of guideline incongruence. Int J Obes 49, 896–901 (2025). https://doi.org/10.1038/s41366-025-01717-5
DOI: https://doi.org/10.1038/s41366-025-01717-5