
Communications Medicine
Optimizing prompting strategies improves large language model classification of pain- and fatigue-related functional impact in childhood cancer survivors
  • Article
  • Open access
  • Published: 25 March 2026

  • Jin-ah Sim1,
  • Madeline R. Horan2,
  • Xiaolei Huang3,
  • Minsu Kim4,
  • Deo Kumar Srivastava  ORCID: orcid.org/0000-0001-6693-81205,
  • Kirsten K. Ness  ORCID: orcid.org/0000-0002-2084-15076,
  • Melissa M. Hudson6,7,
  • Justin N. Baker8 &
  • I-Chan Huang  ORCID: orcid.org/0000-0002-1194-39236 

Communications Medicine, Article number: (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Paediatric cancer
  • Signs and symptoms

Abstract

Background

Understanding how symptoms affect daily functioning is central to improving care for childhood cancer survivors. As narrative symptom reporting becomes increasingly common in survivorship care, scalable automated tools are needed to interpret these descriptions and identify their functional impact. This study evaluates how two large language models (ChatGPT-4o, Llama-3.1) perform this task across different prompt engineering strategies.

Methods

We analyzed semi-structured interviews from 30 childhood cancer survivors and their caregivers, yielding 819 pain- and fatigue-related symptom narratives. Each narrative was expert-annotated for physical, social, or cognitive functional impact, serving as the reference standard. ChatGPT-4o and Llama-3.1 were evaluated using four prompting strategies: zero-shot, few-shot, step-by-step reasoning (Chain-of-Thought), and generated knowledge. Model outputs were compared with expert annotations, and performance was quantified using standard classification and discrimination metrics with resampling-based confidence intervals.
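The four prompting strategies can be illustrated with a minimal sketch. The template wording, the example narratives, and the `build_prompt` helper below are illustrative assumptions for exposition, not the study's actual prompts:

```python
# Hypothetical sketch of the four prompting strategies evaluated in the study
# (zero-shot, few-shot, chain-of-thought, generated knowledge). All template
# wording and example narratives are assumptions, not the authors' prompts.

TASK = ("Classify whether this symptom narrative describes an impact on "
        "physical, social, or cognitive functioning. Narrative: {text}")

# Illustrative labeled examples for the few-shot condition.
FEW_SHOT_EXAMPLES = [
    ("My legs hurt so much I stopped playing soccer.", "physical"),
    ("I was too tired to hang out with my friends.", "social"),
]

def build_prompt(strategy: str, text: str) -> str:
    """Assemble a prompt for one narrative under a given strategy."""
    task = TASK.format(text=text)
    if strategy == "zero-shot":
        # Task instruction only, no examples or reasoning scaffold.
        return task
    if strategy == "few-shot":
        # Prepend a handful of labeled demonstrations.
        shots = "\n".join(f"Narrative: {t}\nLabel: {lab}"
                          for t, lab in FEW_SHOT_EXAMPLES)
        return f"{shots}\n{task}"
    if strategy == "chain-of-thought":
        # Elicit step-by-step reasoning before the label.
        return f"{task}\nLet's think step by step before answering."
    if strategy == "generated-knowledge":
        # Ask the model to generate relevant background facts first.
        return ("First, list relevant facts about how pain and fatigue can "
                f"limit daily functioning. Then answer: {task}")
    raise ValueError(f"unknown strategy: {strategy}")

for s in ("zero-shot", "few-shot", "chain-of-thought", "generated-knowledge"):
    print(s, "->", len(build_prompt(s, "My back aches after walking.")), "chars")
```

In practice each assembled prompt would be sent to the model's chat API and the returned label compared against the expert annotation.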

Results

Here, we show that prompting strategies based on generated knowledge and step-by-step reasoning consistently outperform zero-shot and few-shot prompting across both models. Overall, these strategies produce the most accurate and stable classification of physical, social, and cognitive functional impact. Specifically, ChatGPT-4o achieves more balanced precision and discrimination across the three domains, whereas Llama-3.1 demonstrates higher sensitivity but substantially lower precision, particularly for physical and social functioning.
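The resampling-based confidence intervals used to compare these metrics can be sketched as a simple percentile bootstrap. The `precision` metric, the toy labels, and all parameter choices below are illustrative assumptions rather than the paper's exact procedure:

```python
# Minimal percentile-bootstrap sketch for a classification metric such as
# precision, assuming binary labels per functional domain. Illustrative only;
# the study's exact resampling procedure may differ.
import random

def precision(y_true, y_pred):
    """Precision = TP / (TP + FP) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement,
    recompute the metric, and take the alpha/2 and 1-alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy reference labels and model predictions for one functional domain.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
lo, hi = bootstrap_ci(y_true, y_pred, precision, n_boot=500)
print(f"precision CI: [{lo:.2f}, {hi:.2f}]")
```

The same resampling loop can wrap any of the reported metrics (sensitivity, F1, AUC) by swapping the `metric` function.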

Conclusions

Prompt engineering improves how large language models interpret survivor-reported pain and fatigue. These findings support the use of carefully designed prompts to enable automated, context-aware analysis of symptom narratives, providing a scalable approach to support symptom monitoring and survivor-centered care.

Plain language summary

Many children who survive cancer continue to have pain and tiredness that affect how they move, think, and take part in daily life. These experiences are often recorded in interviews or written notes, which are hard for doctors to review quickly. This study tested whether two computer programs could aid in interpreting these symptom narratives. We provided the programs with different types of instructions and asked them to determine how pain and fatigue affected physical, social, and thinking activities. We found that instructions that included background information or step-by-step thinking led to more accurate results. These tools could help doctors better track symptoms, identify problems earlier, and provide more personalized care for childhood cancer survivors.


Data availability

Source data underlying Table 1 are in Supplementary Table 1; source data underlying Tables 2–3 are in Supplementary Data parts 3 and 4. These source data are available on Zenodo.org (https://zenodo.org/records/18526848)53. These public datasets enable reproduction of the reported tables but do not include unstructured symptom narratives. The full analytic datasets used for model training and statistical analyses, including unstructured symptom narratives, may contain protected health information (PHI) and therefore cannot be made publicly available. These data are stored within secure St. Jude data systems and are available to qualified researchers upon reasonable request, subject to institutional review and execution of a data use agreement (DUA). Requests for access to restricted analytic data should be directed to the corresponding author, who will coordinate access in accordance with St. Jude data-governance policies.

References

  1. Ohlsen, T., Martos, M. & Hawkins, D. Recent advances in the treatment of childhood cancers. Curr. Opin. Pediatr. 36, 57–63 (2023).

  2. Siegel, R. L., Miller, K. D., Wagle, N. S. & Jemal, A. Cancer statistics, 2023. CA Cancer J. Clin. 73, 17–48 (2023).

  3. Miller, K. D. et al. Cancer treatment and survivorship statistics, 2022. CA Cancer J. Clin. 72, 409–436 (2022).

  4. Armstrong, G. T. et al. Reduction in late mortality among 5-year survivors of childhood cancer. N. Engl. J. Med. 374, 833–842 (2016).

  5. Hudson, M. M. et al. Clinical ascertainment of health outcomes among adults treated for childhood cancer. JAMA 309, 2371–2381 (2013).

  6. Bhakta, N. et al. The cumulative burden of surviving childhood cancer: an initial report from the St Jude Lifetime Cohort Study (SJLIFE). Lancet 390, 2569–2582 (2017).

  7. Shin, H. et al. Associations of symptom clusters and health outcomes in adult survivors of childhood cancer: a report from the St Jude Lifetime Cohort Study. J. Clin. Oncol. 41, 497–507 (2023).

  8. Horan, M. R. et al. Multilevel characteristics of cumulative symptom burden in young survivors of childhood cancer. JAMA Netw. Open 7, e2410145 (2024).

  9. Molcho, M., D’Eath, M., Alforque Thomas, A. & Sharp, L. Educational attainment of childhood cancer survivors: a systematic review. Cancer Med. 8, 3182–3195 (2019).

  10. Leahy, A. B. & Steineck, A. Patient-reported outcomes in pediatric oncology: the patient voice as a gold standard. JAMA Pediatr. 174, e202868 (2020).

  11. Basch, E., Leahy, A. B. & Reeve, B. B. Symptom monitoring with patient-reported outcomes during pediatric cancer care. JAMA 332, 1979–1980 (2024).

  12. Aiyegbusi, O. L. et al. Recommendations to address respondent burden associated with patient-reported outcome assessment. Nat. Med. 30, 650–659 (2024).

  13. Minvielle, E., di Palma, M., Mir, O. & Scotté, F. The use of patient-reported outcomes (PROs) in cancer care: a realistic strategy. Ann. Oncol. 33, 357–359 (2022).

  14. Lu, Z. et al. Natural language processing and machine learning methods to characterize unstructured patient-reported outcomes: validation study. J. Med. Internet Res. 23, e26777 (2021).

  15. Sezgin, E., Hussain, S. A., Rust, S. & Huang, Y. Extracting medical information from free-text and unstructured patient-generated health data using natural language processing methods: feasibility study with real-world data. JMIR Form. Res. 7, e43014 (2023).

  16. Castro, A., Pinto, J., Reino, L., Pipek, P. & Capinha, C. Large language models overcome the challenges of unstructured text data in ecology. Ecol. Inf. 82, 102742 (2024).

  17. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 4171–4186 (Association for Computational Linguistics, 2019).

  18. Yang, X. et al. Multiple large language models versus experienced physicians in diagnosing challenging cases with gastrointestinal symptoms. NPJ Digit. Med. 8, 85 (2025).

  19. Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).

  20. Hadi, A., Tran, E., Nagarajan, B. & Kirpalani, A. Evaluation of ChatGPT as a diagnostic tool for medical learners and clinicians. PLoS ONE 19, e0307383 (2024).

  21. Liévin, V., Hother, C. E., Motzfeldt, A. G. & Winther, O. Can large language models reason about medical questions? Patterns 5, 100943 (2024).

  22. Wei, W. I. et al. Extracting symptoms from free-text responses using ChatGPT among COVID-19 cases in Hong Kong. Clin. Microbiol. Infect. 30, 142.e141–142.e143 (2024).

  23. Nazi, Z. A., Hossain, M. R. & Mamun, F. A. Evaluation of open and closed-source LLMs for low-resource language with zero-shot, few-shot, and chain-of-thought prompting. Nat. Lang. Proc. J. 10, 100124 (2025).

  24. Sim, J. A. et al. Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: a systematic review. Artif. Intell. Med. 146, 102701 (2023).

  25. Sim, J. A., Huang, X., Horan, M. R., Baker, J. N. & Huang, I. C. Using natural language processing to analyze unstructured patient-reported outcomes data derived from electronic health records for cancer populations: a systematic review. Expert Rev. Pharmacoecon. Outcomes Res. 24, 467–475 (2024).

  26. Cho, S. et al. Leveraging large language models for improved understanding of communications with patients with cancer in a call center setting: proof-of-concept study. J. Med. Internet Res. 26, e63892 (2024).

  27. Gupta, G. K., Singh, A., Manikandan, S. V. & Ehtesham, A. Digital diagnostics: the potential of large language models in recognizing symptoms of common illnesses. AI 6, 13 (2025).

  28. Huang, X. et al. Evaluating the performance of ChatGPT in clinical pharmacy: a comparative study of ChatGPT and clinical pharmacists. Br. J. Clin. Pharm. 90, 232–238 (2024).

  29. Han, C. et al. Evaluation of GPT-4 for 10-year cardiovascular risk prediction: insights from the UK Biobank and KoGES data. iScience 27, 109022 (2024).

  30. Goh, E. et al. Large language model influence on diagnostic reasoning: a randomized clinical trial. JAMA Netw. Open 7, e2440969 (2024).

  31. Rydzewski, N. R. et al. Comparative evaluation of LLMs in clinical oncology. NEJM AI 1, 1–14 (2024).

  32. Hirosawa, T. et al. ChatGPT-generated differential diagnosis lists for complex case-derived clinical vignettes: diagnostic accuracy evaluation. JMIR Med. Inf. 11, e48808 (2023).

  33. Chen, A., Chen, D. O. & Tian, L. Benchmarking the symptom-checking capabilities of ChatGPT for a broad range of diseases. J. Am. Med. Inf. Assoc. 31, 2084–2088 (2024).

  34. Forrest, C. B. et al. Self-reported health outcomes of children and youth with 10 chronic diseases. J. Pediatr. 246, 207–212 (2022).

  35. Pierzynski, J. A. et al. Patient-reported outcomes in paediatric cancer survivorship: a qualitative study to elicit the content from cancer survivors and caregivers. BMJ Open 10, e032414 (2020).

  36. Forrest, C. B. et al. Establishing the content validity of PROMIS Pediatric pain interference, fatigue, sleep disturbance, and sleep-related impairment measures in children with chronic kidney disease and Crohn’s disease. J. Patient Rep. Outcomes 4, 11 (2020).

  37. Varni, J. W. et al. PROMIS Pediatric Pain Interference Scale: an item response theory analysis of the pediatric pain item bank. J. Pain 11, 1109–1119 (2010).

  38. Lai, J. S. et al. Development and psychometric properties of the PROMIS(®) pediatric fatigue item banks. Qual. Life Res. 22, 2417–2427 (2013).

  39. Hahn, E. A. et al. Precision of health-related quality-of-life data compared with other clinical measures. Mayo Clin. Proc. 82, 1244–1254 (2007).

  40. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y. & Iwasawa, Y. Large language models are zero-shot reasoners. In Proceedings of the 36th International Conference on Neural Information Processing Systems 22199–22213 (NeurIPS, 2022).

  41. Wei, J. et al. Chain-of-thought prompting elicits reasoning in large language models. In Proceedings of the 36th International Conference on Neural Information Processing Systems 24824–24837 (NeurIPS, 2022).

  42. Liu, J. et al. Generated knowledge prompting for commonsense reasoning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 3154–3169 (Association for Computational Linguistics, 2022).

  43. Zelikman, E., Wu, Y., Mu, J. & Goodman, N. Shape of thought: When distribution matters more than correctness in reasoning tasks. In Proceedings of the 36th International Conference on Neural Information Processing Systems 15476–15488 (NeurIPS, 2022).

  44. Kakaday, R., Herrera, E. Z., Coskey, O., Hertel, A. W. & Kaiser, P. The STREAMLINE Pilot—study on time reduction and efficiency in AI-mediated logging for improved note-taking experience. Appl. Clin. Inf. 16, 614–621 (2025).

  45. Wornow, M. et al. The shaky foundations of large language models and foundation models for electronic health records. NPJ Digit. Med. 6, 135 (2023).

  46. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55, 1–38 (2023).

  47. Lewis, P. et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proceedings of the 34th International Conference on Neural Information Processing Systems 9459–9474 (NeurIPS, 2020).

  48. Li, M., Kilicoglu, H., Xu, H. & Zhang, R. BiomedRAG: a retrieval augmented large language model for biomedicine. J. Biomed. Inf. 162, 104769 (2025).

  49. Liu, P. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55, 1–35 (2023).

  50. Xu, D. et al. Editing factual knowledge and explanatory ability of medical large language models. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2660–2670 (Association for Computing Machinery, 2024).

  51. Rajkomar, A., Hardt, M., Howell, M. D., Corrado, G. & Chin, M. H. Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 169, 866–872 (2018).

  52. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. & Galstyan, A. A survey on bias and fairness in machine learning. ACM Comput. Surv. 54, 1–35 (2021).

  53. Source data from: Optimizing prompting strategies improves large language model classification of pain- and fatigue-related functional impact in childhood cancer survivors. Zenodo https://doi.org/10.5281/zenodo.18526848 (2026).


Acknowledgements

The research reported in this manuscript was supported by the U.S. National Cancer Institute (NCI) under award numbers NCI U01CA195547, NCI R21CA202210, NCI R01CA238368, and NCI R01CA258193. This research was also supported by a grant from the National Research Foundation of Korea (NRF), funded by the Korean government (grant no. 2022R1C1C1009902). The content is solely the responsibility of the authors and does not represent the official views of the funding agencies. Support for St. Jude Children’s Research Hospital was also provided by the Cancer Center Support (CORE) grant (P30 CA21765, C. Roberts, Principal Investigator) and the American Lebanese-Syrian Associated Charities (ALSAC). In addition, the authors thank Rachel M. Keesey and Ruth J. Eliason for conducting in-depth interviews with study participants; Jennifer L. Clegg and Conor M. Jones, MD, for annotating symptom data from the interview data; and Christopher B. Forrest, MD, PhD, for adjudicating discrepancies in the annotation of symptom data.

Author information

Authors and Affiliations

  1. Graduate School of Public Health and Healthcare Management, The Catholic University of Korea, Seoul, Republic of Korea

    Jin-ah Sim

  2. Department of Pediatrics, Wake Forest University School of Medicine, Winston-Salem, NC, USA

    Madeline R. Horan

  3. Department of Computer Science, College of Arts & Sciences, University of Memphis, Memphis, TN, USA

    Xiaolei Huang

  4. Department of AI Convergence, Hallym University, Chuncheon, Republic of Korea

    Minsu Kim

  5. Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN, USA

    Deo Kumar Srivastava

  6. Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN, USA

    Kirsten K. Ness, Melissa M. Hudson & I-Chan Huang

  7. Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN, USA

    Melissa M. Hudson

  8. Division of Quality of Life and Pediatric Palliative Care, Department of Pediatrics, Stanford University School of Medicine and Stanford Medicine Children’s Health, Palo Alto, CA, USA

    Justin N. Baker


Contributions

Conceptualization: J.A.S. and I.C.H.; Funding acquisition: J.A.S., K.K.N., M.M.H., and I.C.H.; Methodology: J.A.S., X.H., and I.C.H.; Data analysis: J.A.S. and M.K.; Interpretation of data: all co-authors; Project administration: I.C.H.; Resources: K.K.N., M.M.H., J.N.B., and I.C.H.; Supervision: I.C.H.; Writing: J.A.S., M.R.H., and I.C.H.; Review and editing: all co-authors. All authors have read and agreed to the submitted version of the manuscript.

Corresponding author

Correspondence to I-Chan Huang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Medicine thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Transparent Peer Review file (PDF)

Supplemental Information (PDF)

Description of Additional Supplementary files (DOCX)

Supplementary Data 1–4 (XLSX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Sim, Ja., Horan, M.R., Huang, X. et al. Optimizing prompting strategies improves large language model classification of pain- and fatigue-related functional impact in childhood cancer survivors. Commun Med (2026). https://doi.org/10.1038/s43856-026-01499-5


  • Received: 13 July 2025

  • Accepted: 23 February 2026

  • Published: 25 March 2026

  • DOI: https://doi.org/10.1038/s43856-026-01499-5


Communications Medicine (Commun Med), ISSN 2730-664X (online)