A universal foundation model for grounded biomedical image interpretation

Wu, Linshan; Nie, Yuxiang; He, Sunan; Zhuang, Jiaxin; Luo, Luyang; Li, Tao; Xie, Zhuoyao; Chen, Dexuan; Zhao, Yinghua; Mahboobani, Neeraj; Vardhanabhuti, Varut; Chan, Ronald Cheong Kin; Peng, Yifan; Rajpurkar, Pranav; Chen, Hao

doi:10.1038/s41467-026-73986-1

Article
Open access
Published: 04 June 2026

Nature Communications (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Integrating AI-assisted biomedical image analysis into clinical practice demands AI-generated findings that are not only accurate but also interpretable. However, existing models generally cannot generate diagnostic findings and localize the corresponding targets at the same time, which makes it difficult to correlate AI-generated findings with visual evidence and to interpret the results. To address this limitation, we introduce UniBiomed, a universal foundation model for grounded biomedical image interpretation that generates accurate diagnostic findings and segments the corresponding biomedical targets. UniBiomed integrates a Multi-modal Large Language Model with the Segment Anything Model, unifying diverse biomedical tasks in universal training to advance grounded interpretation. To develop UniBiomed, we curate a large-scale dataset comprising 27 million triplets of images, region annotations, and text descriptions. Extensive validation on 70 internal and 14 external datasets demonstrates the state-of-the-art performance of UniBiomed across diverse biomedical tasks.
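
As a rough illustration of the image, region annotation, and text triplet described above, a single record might be represented as follows. This is a minimal sketch and not the authors' data format: the class name `GroundedTriplet`, its field names, the helper `bounding_box`, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GroundedTriplet:
    """One hypothetical record: image, region annotation, text description."""
    image_path: str         # path to the biomedical image (assumed field)
    mask: List[List[int]]   # binary region annotation, 1 = target pixel
    description: str        # text description tied to the annotated region


def bounding_box(mask: List[List[int]]) -> Tuple[int, int, int, int]:
    """Return (row_min, col_min, row_max, col_max) of the annotated pixels."""
    cells = [(r, c) for r, row in enumerate(mask)
             for c, value in enumerate(row) if value]
    if not cells:
        raise ValueError("mask contains no annotated pixels")
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)


# Made-up sample values, for illustration only.
sample = GroundedTriplet(
    image_path="example_ct_slice.png",
    mask=[[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 0, 0]],
    description="A small lesion in the upper-left region.",
)
box = bounding_box(sample.mask)
```

Keeping the mask and the text in the same record is what lets a generated finding be checked against a specific image region.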

Acknowledgements

We thank HKUST SuperPOD for providing the GPU platform used for model training. Icons in Figs. 1 (a, c, d), 5 (a, b), 6, and Supplementary Figs. A5 (b), A8, A9 (a, b) were made by Freepik from www.flaticon.com.

This project has been reviewed and approved by the Human and Artefacts Research Ethics Committee (HAREC). The protocol number is HREP-2025-0188.

Funding

This work was supported by the Hong Kong Innovation and Technology Commission (Project Nos. MHP/002/22, GHP/006/22GD and ITCPD/17-9), HKUST (Project No. FS111), and the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Reference Number: T45-401/22-N).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Linshan Wu, Yuxiang Nie, Sunan He, Jiaxin Zhuang, Luyang Luo & Hao Chen
Department of Biomedical Informatics, Harvard University, Boston, MA, USA
Luyang Luo & Pranav Rajpurkar
Department of Radiology, The Third Affiliated Hospital of Southern Medical University, Guangzhou, China
Tao Li, Zhuoyao Xie, Dexuan Chen & Yinghua Zhao
Department of Imaging and Interventional Radiology, The Chinese University of Hong Kong, Hong Kong, China
Neeraj Mahboobani
Department of Diagnostic Radiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
Varut Vardhanabhuti
Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Hong Kong, China
Ronald Cheong Kin Chan
State Key Laboratory of Translational Oncology, The Chinese University of Hong Kong, Hong Kong, China
Ronald Cheong Kin Chan
Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
Yifan Peng
Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Hao Chen
Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
Hao Chen
State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Hong Kong, China
Hao Chen
Shenzhen-Hong Kong Collaborative Innovation Research Institute, The Hong Kong University of Science and Technology, Shenzhen, China
Hao Chen

Corresponding author

Correspondence to Hao Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Reporting Summary (PDF)

Transparent Peer Review file (PDF)

Source data

Source Data (ZIP)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wu, L., Nie, Y., He, S. et al. A universal foundation model for grounded biomedical image interpretation. Nat Commun (2026). https://doi.org/10.1038/s41467-026-73986-1

Received: 07 July 2025
Accepted: 27 May 2026
Published: 04 June 2026
DOI: https://doi.org/10.1038/s41467-026-73986-1