npj Digital Medicine
Multidimensional evaluation of large language models in radiology report readability
  • Article
  • Open access
  • Published: 01 April 2026

  • Yunhai Mao  ORCID: orcid.org/0009-0000-4940-1479,
  • Chunyan Wang  ORCID: orcid.org/0009-0007-3490-8358,
  • Yuxin Li  ORCID: orcid.org/0009-0000-7072-060X,
  • Wei Wang  ORCID: orcid.org/0009-0004-6081-9760 &
  • Mengchao Zhang  ORCID: orcid.org/0000-0002-3733-1597

npj Digital Medicine (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Health care
  • Mathematics and computing
  • Medical research

Abstract

This study systematically investigated the influence of demographic characteristics on the readability of patient-centric radiology reports and compared the performance of different large language models (LLMs) in generating patient-centered reports. Using a sequential two-stage design, the research first conducted a retrospective evaluation of 320 radiology reports, followed by validation in a clinical setting with 800 patients. All three LLMs significantly improved the readability of radiology reports (P < 0.05), with DeepSeek-R1 showing potentially superior performance within this specific cohort. Demographic analysis revealed significant interaction effects: higher education and, within a given educational level, older age were associated with better comprehension. Clinical validation further indicated that reading simplified reports significantly improved patients’ subjective and objective comprehension while alleviating medical anxiety (P < 0.05). However, limitations persist, including inconsistent model outputs, missing anatomical details, and variation in comprehension driven by demographic factors. Consequently, LLMs should be integrated as auxiliary communication tools for radiologists rather than standalone solutions, with personalized interventions tailored to specific demographic profiles.
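Readability gains of the kind reported above are typically quantified with standard formulas such as Flesch Reading Ease; the paper's exact metrics are not shown in this excerpt, so the following is only an illustrative sketch. The syllable counter below is a rough vowel-group heuristic, not a linguistically exact method:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups; every word has at least one syllable."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1  # treat a trailing 'e' as silent
    return max(n, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text (roughly 0-100)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * len(words) / len(sentences)
            - 84.6 * syllables / len(words))

# A jargon-heavy finding scores far lower than its plain-language version:
original = "Radiographic evaluation demonstrates circumscribed pulmonary nodularity."
simplified = "The scan shows a small round spot in the lung."
assert flesch_reading_ease(simplified) > flesch_reading_ease(original)
```

Under this metric, an LLM simplification is successful if the patient-facing version of a report scores meaningfully higher than the original.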


Data availability

The datasets generated or analyzed during the study are available from the corresponding author on reasonable request.


Acknowledgements

We gratefully acknowledge the Radiology Department of the Third Hospital of Jilin University for supporting this research, and Professor Mengchao Zhang of the research team.

Author information

Author notes
  1. These authors contributed equally: Yunhai Mao, Chunyan Wang.

Authors and Affiliations

  1. Department of Radiology, the Third Hospital of Jilin University, Changchun, China

    Yunhai Mao, Chunyan Wang, Yuxin Li, Wei Wang & Mengchao Zhang


Contributions

M.Z. conceptualized the study, performed formal analysis and investigation, and was responsible for project administration and supervision. Y.M. and C.W. (equal contributors) contributed to data curation, formal analysis, methodology, validation, visualization, and wrote the original draft and revised the manuscript. Y.L. contributed to data curation, methodology, and visualization. W.W. contributed to methodology and writing the original draft. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Mengchao Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Mao, Y., Wang, C., Li, Y. et al. Multidimensional evaluation of large language models in radiology report readability. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02589-3


  • Received: 15 December 2025

  • Accepted: 17 March 2026

  • Published: 01 April 2026

  • DOI: https://doi.org/10.1038/s41746-026-02589-3


Associated content

Collection

Evaluating the Real-World Clinical Performance of AI
