Scientific Reports
AI caption generation model for digital pathology of adenocarcinoma in endoscopic histopathology using multi-instance attention mechanisms
  • Article
  • Open access
  • Published: 12 March 2026


  • Youngseop Lee, ORCID: orcid.org/0009-0002-6361-3616 (affiliation 1)
  • Kyungah Bai, ORCID: orcid.org/0009-0002-5265-4674 (affiliation 2)
  • Young Jae Kim, ORCID: orcid.org/0000-0003-0443-0051 (affiliations 1, 3)
  • Jisup Kim, ORCID: orcid.org/0000-0002-0742-5517 (affiliation 4)
  • Kwang Gi Kim, ORCID: orcid.org/0000-0001-9714-6038 (affiliations 1, 5)

Scientific Reports, Article number:  (2026)

  • 742 Accesses


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Cancer
  • Computational biology and bioinformatics
  • Gastroenterology

Abstract

Gastric adenocarcinoma is a leading cause of cancer-related mortality worldwide, and histopathologic examination of endoscopic biopsy samples remains essential for its diagnosis and grading. In this study, we propose a novel AI-based caption generation model, termed MIAC (Multi-instance Attention Captioning), designed to produce descriptive diagnostic reports from digital pathology images. The model leverages a multi-instance learning framework with permutation-invariant self-attention to aggregate features from multiple histopathology image patches into a unified representation, effectively capturing whole-slide characteristics. Using the publicly available PatchGastricADC22 dataset for training and validation, and an external test dataset from Gil Hospital of Gachon University for clinical testing, the model demonstrated strong performance across standard natural language generation metrics (BLEU@4, ROUGE-L, METEOR, CIDEr). Notably, MIAC maintained high captioning accuracy even when evaluated on previously unseen data, particularly after color normalization using the Macenko method. These results underscore the model's robustness, generalizability, and potential for integration into routine digital pathology workflows, assisting pathologists in generating structured diagnostic reports.
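The permutation-invariant aggregation the abstract describes can be illustrated with a minimal, dependency-free sketch of attention-based multi-instance pooling: each patch feature vector receives a learned attention score, and the slide-level embedding is the attention-weighted sum over the bag. The feature dimensions, attention parameters, and function names below are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_mil_pool(patch_feats, w):
    """Aggregate per-patch feature vectors into one slide-level vector.

    Each patch i gets a scalar attention score w . h_i; a softmax over
    the bag yields weights a_i, and the bag embedding is sum_i a_i * h_i.
    Because scoring and the weighted sum treat the patches as a set,
    the result does not depend on patch order (permutation invariance).
    """
    scores = [sum(wj * hj for wj, hj in zip(w, h)) for h in patch_feats]
    attn = softmax(scores)
    dim = len(patch_feats[0])
    return [sum(a * h[d] for a, h in zip(attn, patch_feats)) for d in range(dim)]

random.seed(0)
bag = [[random.random() for _ in range(4)] for _ in range(6)]  # 6 patches, 4-dim features
w = [0.5, -0.2, 0.1, 0.3]  # illustrative attention parameters

pooled = attention_mil_pool(bag, w)
shuffled = list(bag)
random.shuffle(shuffled)
pooled2 = attention_mil_pool(shuffled, w)
# The pooled embedding is the same regardless of patch order.
print(all(abs(a - b) < 1e-9 for a, b in zip(pooled, pooled2)))
```

In a real model the attention scores would come from a small learned network over deep patch features, but the order-invariance property shown here is what lets a bag of patches stand in for the whole slide.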


Data availability

The publicly available PatchGastricADC22 dataset used in this study can be accessed at https://www.kaggle.com/datasets/sanikapadegaonkar/patchgastricadc22. The code used for model training and evaluation is available at https://github.com/Leeyoungsup/histopathology_captioning. The clinical test dataset collected from Gachon University Gil Medical Center is not publicly available due to patient privacy and institutional data protection policies; however, data may be available from the corresponding author upon reasonable request and with appropriate institutional approvals.


Funding

This work was supported by the Digital Medical Products Development Based on Medical Data Synthesis and AI Technologies Program (RS-2025-02305698, Development of On-Device AI Digital Medical Products Utilizing Synthetic Technology and Synthetic Data for Atypical Medical Data), funded by the Ministry of Trade, Industry & Energy (MOTIE) of Korea. This work was also supported by the Gachon University research fund of 2024 (GCU-2024-202410530001).

Author information

Authors and Affiliations

  1. Medical Devices R&D Center, Gachon University Gil Medical Center, Incheon, 21565, South Korea

    Youngseop Lee, Young Jae Kim & Kwang Gi Kim

  2. Department of Pathology, Graduate School of Medicine, College of Medicine, Seoul National University, Seoul, 03080, South Korea

    Kyungah Bai

  3. Gachon Biomedical & Convergence Institute, Gachon University Gil Medical Center, Incheon, 21936, South Korea

    Young Jae Kim

  4. Department of Pathology, Gil Medical Center, Gachon University College of Medicine, Incheon, 21565, South Korea

    Jisup Kim

  5. Department of Biomedical Engineering, College of IT Convergence, Gachon University, Seongnam, 13120, South Korea

    Kwang Gi Kim


Contributions

Youngseop Lee conceptualized the study, developed the methodology, conducted the experiments, and analyzed the results. Kyungah Bai contributed to data collection, validation of experimental results, and manuscript editing. Young Jae Kim provided technical support for model implementation and contributed to the preprocessing pipeline. Jisup Kim and Kwang Gi Kim supervised the study, served as corresponding authors, and provided pathological and experimental guidance. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Jisup Kim or Kwang Gi Kim.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This study was approved by the Institutional Review Board (IRB) of Gil Hospital of Gachon University (Approval Number: GBIRB2024-121). All experimental protocols were conducted in accordance with relevant guidelines and regulations, strictly adhering to the ethical principles outlined in the Declaration of Helsinki.

Informed consent

The requirement for informed consent was waived due to the retrospective nature of the study design.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Lee, Y., Bai, K., Kim, Y. et al. AI caption generation model for digital pathology of adenocarcinoma in endoscopic histopathology using multi-instance attention mechanisms. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37455-5


  • Received: 30 July 2025

  • Accepted: 22 January 2026

  • Published: 12 March 2026

  • DOI: https://doi.org/10.1038/s41598-026-37455-5


Keywords

  • Deep learning
  • Digital pathology
  • Captioning model
  • Multi-instance learning
  • Endoscopic histopathology

Scientific Reports (Sci Rep)

ISSN 2045-2322 (online)
