Accurate discharge summary generation using fine-tuned large language models with self-evaluation
Article · Open access · Published: 17 January 2026

Wenbin Li1,2, Hui Feng3, Chao Hu4, Minpeng Xu1 & Longlong Cheng1,5

Scientific Reports (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Health care
  • Mathematics and computing

Abstract

Discharge summaries are critical for continuity of patient care, clinical decision-making, and legal documentation, yet their creation is labor-intensive: clinicians must manually integrate diverse data from multiple sources under time constraints, which often leads to delays, inconsistencies, and omissions. This study introduces a novel framework that automates discharge summary generation using advanced natural language processing (NLP) techniques, aiming to reduce clinician workload while ensuring accurate, complete, and standardized documentation. We combine the Weight-Decomposed Low-Rank Adaptation (DoRA) fine-tuning method with a novel self-evaluation mechanism to adapt large language models (LLMs) to medical text generation. DoRA efficiently adapts pre-trained LLMs to the specialized medical domain and outperforms traditional methods such as LoRA and QLoRA, yielding higher BERTScore and lower perplexity across all evaluated models. The self-evaluation mechanism, inspired by cognitive psychology, iteratively re-feeds generated summaries together with segmented clinical data into the model, allowing it to systematically detect and correct omissions in each data segment and thereby ensuring that the outputs accurately and comprehensively represent the original input. This approach was rigorously compared against few-shot prompting and Chain-of-Thought (CoT) methods: self-evaluation improves BERTScore by 6.9% and 4.1% and increases ROUGE-L by 69.6% and 0.4% relative to the few-shot and CoT baselines, respectively, while qualitative metrics also show consistent gains in accuracy and completeness. Together, these results demonstrate substantial improvements in the quality and consistency of generated discharge summaries alongside a reduction in the time required for their creation, underscoring the potential of AI-driven tools for medical documentation that adheres to high standards of accuracy and relevance.
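To make the fine-tuning setup concrete, the following minimal sketch applies DoRA to a causal LLM via the Hugging Face peft library, which exposes DoRA through the use_dora flag of LoraConfig (peft >= 0.9). The base model, target modules, and rank here are illustrative assumptions, not the configuration reported in the paper.

    # Minimal DoRA fine-tuning sketch (assumes Hugging Face peft >= 0.9,
    # which implements DoRA via LoraConfig(use_dora=True)). Base model and
    # hyperparameters are illustrative, not the authors' reported settings.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

    dora_config = LoraConfig(
        r=16,                          # low-rank dimension (assumed)
        lora_alpha=32,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        use_dora=True,                 # decompose weights into magnitude and direction
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, dora_config)
    model.print_trainable_parameters()  # DoRA trains the LoRA matrices plus a magnitude vector

The self-evaluation loop can likewise be sketched from the abstract's description alone: re-feed the draft summary together with each segment of the clinical record, flag omissions, and revise until no segment raises an issue. In the sketch below, chat() is a hypothetical wrapper around one generation call to the fine-tuned model, and the prompts paraphrase the abstract rather than reproduce the authors' templates.

    # Hedged sketch of the self-evaluation mechanism described above.
    # chat() is a hypothetical single-call wrapper around the fine-tuned
    # LLM; prompts are paraphrases, not the authors' actual templates.
    from typing import List

    def chat(prompt: str) -> str:
        """Hypothetical helper: one generation call to the fine-tuned model."""
        raise NotImplementedError

    def generate_discharge_summary(segments: List[str], max_rounds: int = 3) -> str:
        record = "\n\n".join(segments)
        summary = chat("Write a discharge summary for this clinical record:\n" + record)
        for _ in range(max_rounds):
            revised = False
            for segment in segments:
                # Re-feed the draft together with one data segment and ask
                # the model to flag omissions or misstatements.
                verdict = chat(
                    "Check the summary against this clinical data segment.\n"
                    "Segment:\n" + segment + "\nSummary:\n" + summary +
                    "\nReply OK, or describe any omission to fix."
                )
                if verdict.strip() != "OK":
                    summary = chat(
                        "Revise the summary to fix this issue: " + verdict +
                        "\nSegment:\n" + segment + "\nSummary:\n" + summary
                    )
                    revised = True
            if not revised:  # no segment flagged an omission: converged
                break
        return summary

Checking the draft segment by segment, rather than against the whole record in a single pass, is what allows omissions in each data segment to be detected systematically, as the abstract describes.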


Data availability

Due to ethical restrictions, the raw data cannot be made publicly available. However, de-identified data may be obtained from the first author upon reasonable request.


Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

  1. Academy of Medical Engineering and Translational Medicine, Medical College, Tianjin University, Tianjin, China

    Wenbin Li, Minpeng Xu & Longlong Cheng

  2. China Electronics Corporation, Shenzhen, China

    Wenbin Li

  3. Renmin Hospital of Wuhan University, Wuhan, China

    Hui Feng

  4. China Electronics Cloud Technology Co., Ltd., Wuhan, China

    Chao Hu

  5. China Electronics Cloud Brain (Tianjin) Technology Co., Ltd., Tianjin, China

    Longlong Cheng


Contributions

Wenbin Li conceptualized the research idea and wrote the main manuscript text. Hui Feng and Minpeng Xu conducted the literature review and contributed to drafting and editing portions of the manuscript. Chao Hu and Longlong Cheng compiled and analyzed the latest developments in machine learning techniques, wrote the technical methods section, and assisted in final manuscript revisions. All authors reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Longlong Cheng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Li, W., Feng, H., Hu, C. et al. Accurate discharge summary generation using fine-tuned large language models with self-evaluation. Sci Rep (2026). https://doi.org/10.1038/s41598-026-35552-z


  • Received: 09 October 2025

  • Accepted: 06 January 2026

  • Published: 17 January 2026

  • DOI: https://doi.org/10.1038/s41598-026-35552-z


Keywords

  • Natural Language Processing
  • Large Language Models
  • Discharge Summaries