npj Digital Medicine
Domain specific multimodal large language model for automated endoscopy reporting with multicenter prospective validation
  • Article
  • Open access
  • Published: 28 March 2026


Ruiqing Jiang, Boru Chen, Zehua Dong, Xiaoquan Zeng, Hang You, Yanxia Li, Yunchao Deng, Ganggang Mu, Jing Wang, Li Huang, Jia Li, Du Cheng, Wei Zhou & Honggang Yu

npj Digital Medicine (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Diseases
  • Gastroenterology
  • Health care
  • Medical research

Abstract

Accurate endoscopy reports are crucial for the diagnosis and management of patients with upper gastrointestinal (UGI) diseases, yet errors and omissions are common, and preparing routine reports for common diseases is labor-intensive and time-consuming. To address this, we developed Report-Angel, an integrated AI system that combines a multimodal large language model (MLLM) with conventional deep learning models, and trained it on 20,617 image-text pairs to automatically generate detailed draft reports for UGI endoscopy. Report-Angel achieved a clinically acceptable report rate of 79.3% (95% CI: 74.4–83.5%) in the prospective internal cohort and 83.3% (95% CI: 78.7–87.3%) in the prospective external cohort. At the case level, it achieved a report completeness of 88.51% (95% CI: 84.64–92.38%) and a report accuracy of 78.93% (95% CI: 73.98–83.88%), with an average processing time of 1.5 s per lesion in the internal prospective video dataset. Lesion-level reporting accuracies were 91.92% (95% CI: 90.58–93.25%), 89.07% (95% CI: 87.57–90.57%), and 83.94% (95% CI: 81.58–86.31%) on the retrospective image dataset and the prospective single- and multi-center video datasets, respectively. Report-Angel generates expert-level draft endoscopy reports and demonstrates robust generalizability. By providing reliable foundation drafts, the system has the potential to standardize reporting and reduce endoscopists' workloads.
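The interval estimates quoted above are standard two-sided 95% confidence intervals for binomial proportions. As an illustration of how such an interval is obtained, the sketch below computes a Wilson score interval for an acceptable-report rate of 79.3%; note that the count and sample size used here (238 of 300) are hypothetical values chosen for illustration, not figures taken from the study.

```python
from math import sqrt

def wilson_ci(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Two-sided Wilson score confidence interval for a binomial proportion k/n.

    z = 1.96 corresponds to a 95% confidence level.
    """
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - margin, center + margin

# Hypothetical example: 238 clinically acceptable reports out of 300 (79.3%).
low, high = wilson_ci(238, 300)
print(f"{238/300:.1%} (95% CI: {low:.1%}-{high:.1%})")
```

The Wilson interval is preferred over the simple normal approximation for proportions near 0 or 1 and for moderate sample sizes, which is why it is a common default in clinical performance reporting; the paper does not state which interval method was used, so this is only one plausible choice.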


Data availability

Individual de-identified participant data underlying the results reported in this article can be shared with investigators for research purposes. Access can be requested from the first corresponding author (yuhonggang1968@163.com) and will be granted after signing a data access agreement. The pretrained model, software, source code used in the paper, and associated test data and parameters are available at https://github.com/endo-angel/MLLM-for-Automatically-Reporting-Lesions-of-Upper-GI-Endoscopy.


Acknowledgements

This work was supported by the National Key Research and Development Program of China (grant no. 2022YFC2505105, to Lianlian Wu, W.Z.); the Natural Science Foundation of Wuhan (grant no. 2025040601020197, to Z.H.D.); the Hubei Provincial Key Laboratory Open Project (grant no. 2024KFZ005, to Z.H.D.); the Key Research and Development Program of Hubei Province (grant no. 2023BCB153, to H.G.Y.); and the National Natural Science Foundation of China Youth Science Fund (grant no. 82202257, to Lianlian Wu). The funders had no role in the study design, data collection, data analysis, interpretation, or manuscript preparation.

Author information

Author notes
  1. These authors contributed equally: Ruiqing Jiang, Boru Chen, Zehua Dong.

Authors and Affiliations

  1. Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan, China

  2. Hubei Provincial Clinical Research Center for Digestive Disease Minimally Invasive Incision, Renmin Hospital of Wuhan University, Wuhan, China

  3. Key Laboratory of Hubei Province for Digestive System Disease, Renmin Hospital of Wuhan University, Wuhan, China

  4. Engineering Research Center for Artificial Intelligence Endoscopy Interventional Treatment of Hubei Province, Wuhan, China

     Affiliations 1–4: Ruiqing Jiang, Boru Chen, Zehua Dong, Xiaoquan Zeng, Hang You, Yanxia Li, Yunchao Deng, Ganggang Mu, Jing Wang, Li Huang, Jia Li, Du Cheng, Wei Zhou & Honggang Yu

  5. Taikang Center for Life and Medical Sciences, Wuhan University, Wuhan, China

     Honggang Yu


Contributions

Conceptualization: R.Q.J., B.R.C., Z.H.D. Methodology: R.Q.J., B.R.C., Z.H.D., X.Q.Z., H.Y. Investigation: R.Q.J., B.R.C., Z.H.D., X.Q.Z., H.Y., Y.X.L., Y.C.D., G.G.M., J.W., L.H., J.L., D.C., W.Z. Visualization: R.Q.J., B.R.C., Z.H.D., X.Q.Z. Funding acquisition: H.G.Y., W.Z., Z.H.D. Project administration: R.Q.J., B.R.C., Z.H.D., X.Q.Z. Supervision: H.G.Y., W.Z. Writing—original draft: R.Q.J., B.R.C., Z.H.D. Writing—review and editing: H.G.Y., W.Z., Z.H.D.

Corresponding authors

Correspondence to Wei Zhou or Honggang Yu.

Ethics declarations

Competing interests

Wuhan EndoAngel Co., Ltd. provided equipment for this study. The sponsor had no role in the design or conduct of the study; data collection, management, analysis, and interpretation; manuscript preparation; or the decision to submit the manuscript for publication. The other authors declare no competing financial or non-financial interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Materials (PDF)

Video S1. Demonstration of a UGI endoscopic video (MP4)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Jiang, R., Chen, B., Dong, Z. et al. Domain specific multimodal large language model for automated endoscopy reporting with multicenter prospective validation. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02569-7


  • Received: 11 December 2025

  • Accepted: 11 March 2026

  • Published: 28 March 2026

  • DOI: https://doi.org/10.1038/s41746-026-02569-7

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Download PDF

Associated content

Collection

Multimodal AI for Digital Medicine
