npj Digital Medicine
Decipher-MR: a vision-language foundation model for 3D MRI representations
  • Article
  • Open access
  • Published: 04 April 2026

  • Zhijian Yang1,
  • Noel DSouza1,
  • Istvan Megyeri2,
  • Xiaojian Xu1,
  • Amin Honarmandi Shandiz2,
  • Farzin Haddadpour1,
  • Krisztian Koos2,
  • Laszlo Rusko2,
  • Emanuele Valeriano1,
  • Bharadwaj Swaminathan1,
  • Lei Wu1,
  • Parminder Bhatia1,
  • Taha Kass-Hout1 &
  • Erhan Bas1

npj Digital Medicine, Article number: (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Engineering
  • Health care
  • Mathematics and computing

Abstract

Magnetic resonance imaging (MRI) is a critical imaging modality in clinical diagnosis and research, yet its complexity and heterogeneity hinder scalable, generalizable machine learning. Although foundation models have revolutionized language and vision tasks, their application to MRI remains constrained by data scarcity and narrow anatomical focus. We present Decipher-MR, a 3D MRI-specific vision-language foundation model trained on 200,000 MRI series from over 22,000 studies spanning diverse anatomical regions, sequences, and pathologies. Decipher-MR integrates self-supervised vision learning with report-guided text supervision to build robust representations for broad applications. To enable efficient use, Decipher-MR adopts a modular design in which lightweight, task-specific decoders are tuned on top of a frozen pretrained encoder. Following this setting, we evaluate Decipher-MR across disease classification, demographic prediction, anatomical localization, and cross-modal retrieval, demonstrating consistent improvements over existing foundation models and task-specific approaches. These results support Decipher-MR as a promising and reusable foundation for MRI-based AI, within the scope of the tasks and datasets evaluated.
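The abstract describes a modular tuning pattern: a frozen pretrained encoder with lightweight, task-specific decoders trained on top. The minimal NumPy sketch below illustrates only that pattern; the toy random projection, the shapes, and the linear softmax decoder are illustrative assumptions, not details of Decipher-MR.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: the real Decipher-MR encoder is a 3D vision
# model; here a fixed random projection plays its role so that only
# the frozen-encoder/trainable-decoder pattern is shown.
EMB_DIM, N_CLASSES = 64, 3
W_enc = rng.normal(size=(32 * 32, EMB_DIM))   # frozen "encoder" weights

def encode(image):
    """Frozen feature extractor: no gradient ever updates W_enc."""
    return np.tanh(image.reshape(-1) @ W_enc)

# Lightweight task-specific decoder: the only trainable parameters.
W_dec = np.zeros((EMB_DIM, N_CLASSES))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(x, label, lr=0.1):
    """One cross-entropy gradient step on the decoder head only."""
    global W_dec
    feat = encode(x)                      # frozen forward pass
    p = softmax(feat @ W_dec)
    grad_logits = p.copy()
    grad_logits[label] -= 1.0             # dCE/dlogits for one example
    W_dec -= lr * np.outer(feat, grad_logits)  # encoder untouched
    return -np.log(p[label])

x = rng.normal(size=(32, 32))
losses = [train_step(x, label=1) for _ in range(50)]
print(losses[0] > losses[-1])  # decoder loss decreases, encoder frozen
```

Because only `W_dec` receives updates, swapping tasks means training a new small head while reusing the same encoder forward pass, which is the efficiency argument the abstract makes.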


Data availability

The MRI datasets used for model pretraining and internal evaluations are proprietary and cannot be shared publicly due to institutional, contractual, and privacy restrictions. These include the pretraining MRI corpus, the held-out validation set, and three additional internal evaluation datasets (Source1, Source2 Head and Neck, and MRDLAS). Access to these datasets is not permitted outside the hosting institutions, and therefore, they cannot be made available. An example benchmark for missing organ localization is available at registry.opendata.aws/gehcai-mapsmr. This study also makes use of several publicly available datasets for benchmarking, including ADNI, PI-CAI, ACDC, LLD-MMRI, MRART, and AMOS. These datasets can be accessed through their respective data portals in accordance with their data-use agreements (citations provided in the manuscript). All dataset identifiers, accession links, and licensing details are provided in the Methods and Supplementary Information. Only aggregated results and derived summary statistics are reported in this manuscript. No individual-level proprietary data or raw imaging files can be released. Additional non-sensitive materials may be provided by the corresponding author upon reasonable request and subject to institutional approval.

Code availability

The pretraining code is primarily adapted from several open-source frameworks, including DINOv2 (https://github.com/facebookresearch/dinov2), the HuggingFace Transformers Trainer for BERT models (https://huggingface.co/docs/transformers/en/main_classes/trainer), and OpenCLIP (https://github.com/mlfoundations/open_clip). We introduced customized components unique to the healthcare domain, specifically the MR modality, such as patch tokenization, data augmentation, and data organization, as detailed in the paper. Due to considerations related to safety, intellectual property, and commercial viability, the pretrained Decipher-MR weights are not directly shared at this time. The algorithmic procedures and model architectures are fully described in the "Methods" section to support reproducibility, and additional methodological details may be provided by the corresponding author upon reasonable request.
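The customized patch tokenization mentioned above is not specified on this page. As an illustration of the general idea only, the sketch below extends ViT-style patch tokenization to a 3D volume; the patch size and the no-overlap choice are assumptions, not the settings used by Decipher-MR.

```python
import numpy as np

def patchify_3d(volume, patch=(4, 4, 4)):
    """Split a 3D volume into non-overlapping patches and flatten each
    patch into one token vector (ViT-style tokenization in 3D).

    The (4, 4, 4) patch size is an illustrative default; real pipelines
    would also pad or resample volumes whose shape is not divisible.
    """
    d, h, w = volume.shape
    pd, ph, pw = patch
    assert d % pd == 0 and h % ph == 0 and w % pw == 0, "pad volume first"
    tokens = (volume
              .reshape(d // pd, pd, h // ph, ph, w // pw, pw)
              .transpose(0, 2, 4, 1, 3, 5)   # group the three patch axes
              .reshape(-1, pd * ph * pw))    # one row per patch token
    return tokens

vol = np.arange(8 * 8 * 8, dtype=np.float32).reshape(8, 8, 8)
tok = patchify_3d(vol)
print(tok.shape)  # (8, 64): 2*2*2 patches, each 4*4*4 voxels flattened
```

Each token row would then be linearly projected to the transformer's embedding dimension; the reshape/transpose trick avoids any explicit Python loop over patches.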

References

  1. Carré, A. Standardization of brain MR images across machines and protocols: bridging the gap for MRI-based radiomics. Sci. Rep. 10, 12340 (2020).

  2. OpenAI et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2024).

  3. Radford, A. et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning (2021). https://api.semanticscholar.org/CorpusID:231591445.

  4. Oquab, M. et al. DINOv2: learning robust visual features without supervision. Transactions on Machine Learning Research (2024). https://openreview.net/forum?id=a68SUt6zFt.

  5. He, K. et al. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 16000–16009 (2022).

  6. Pérez-García, F. et al. Exploring scalable medical image encoders beyond text supervision. Nat. Mach. Intell. 7, 119–130 (2025).

  7. Moutakanni, T. et al. Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning. Preprint at https://arxiv.org/abs/2405.01469 (2024).

  8. Blankemeier, L. et al. Merlin: a vision-language foundation model for 3D computed tomography. Preprint at https://arxiv.org/abs/2406.06512 (2024).

  9. Yang, L. et al. Advancing multimodal medical capabilities of Gemini. Preprint at https://arxiv.org/abs/2405.03162 (2024).

  10. Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).

  11. Lu, M. Y. et al. A visual-language foundation model for computational pathology. Nat. Med. 30, 863–874 (2024).

  12. Codella, N. C. F. et al. MedImageInsight: an open-source embedding model for general domain medical imaging. Preprint at https://arxiv.org/abs/2410.06542 (2024).

  13. Ye, Y. et al. Continual self-supervised learning: towards universal multi-modal medical data representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11114–11124 (2024).

  14. Zhang, S. et al. A multimodal biomedical foundation model trained from fifteen million image-text pairs. NEJM AI 2, AIoa2400640 (2025).

  15. Zhao, T. et al. A foundation model for joint segmentation, detection and recognition of biomedical objects across nine modalities. Nat. Methods 22, 166–176 (2025).

  16. Cox, J. et al. BrainSegFounder: towards 3D foundation models for neuroimage segmentation. Preprint at https://arxiv.org/abs/2406.10395 (2024).

  17. Ma, J. et al. Segment anything in medical images. Nat. Commun. 15, 654 (2024).

  18. Sun, J. et al. Medical image analysis using improved SAM-Med2D: segmentation and classification perspectives. BMC Med. Imaging 24 (2024).

  19. Tak, D. et al. A foundation model for generalized brain MRI analysis. Preprint at medRxiv https://www.medrxiv.org/content/early/2024/12/03/2024.12.02.24317992 (2024).

  20. Wang, S. et al. Triad: vision foundation model for 3D magnetic resonance imaging. Preprint at https://arxiv.org/abs/2502.14064 (2025).

  21. Cox, J. et al. BrainSegFounder: towards 3D foundation models for neuroimage segmentation. Med. Image Anal. 97, 103301 (2024).

  22. Ji, Y. et al. AMOS: a large-scale abdominal multi-organ benchmark for versatile medical image segmentation. In Advances in Neural Information Processing Systems, vol. 35, 36722–36732 (Curran Associates, Inc., 2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/ee604e1bedbd069d9fc9328b7b9584be-Paper-Datasets_and_Benchmarks.pdf.

  23. Bernard, O. et al. Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: is the problem solved? IEEE Trans. Med. Imaging 37, 2514–2525 (2018).

  24. Wang, H. et al. SAM-Med3D: towards general-purpose segmentation models for volumetric medical images. Preprint at https://arxiv.org/abs/2310.15161 (2024).

  25. Isensee, F., Jaeger, P. F., Kohl, S. A. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18, 203–211 (2021).

  26. Wang, Y. et al. DETR3D: 3D object detection from multi-view images via 3D-to-2D queries. In Proceedings of the 5th Conference on Robot Learning, vol. 164 of Proceedings of Machine Learning Research, 180–191 (PMLR, 2022). https://proceedings.mlr.press/v164/wang22b.html.

  27. Chen, Z. et al. Medical phrase grounding with region-phrase context contrastive alignment. In MICCAI (2023).

  28. Silva, M. V. F. et al. Alzheimer’s disease: risk factors and potentially protective measures. J. Biomed. Sci. 26, 33 (2019).

  29. Hasanzadeh, F. et al. Bias recognition and mitigation strategies in artificial intelligence healthcare applications. npj Digital Med. 8, 154 (2025).

  30. Dutt, R., Bohdal, O., Tsaftaris, S. A. & Hospedales, T. FairTune: optimizing parameter efficient fine tuning for fairness in medical image analysis. In International Conference on Learning Representations (2024).

  31. Wang, R. et al. Drop the shortcuts: image augmentation improves fairness and decreases AI detection of race and other demographics from medical images. eBioMedicine 102 (2024). https://doi.org/10.1016/j.ebiom.2024.105047.

  32. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).

  33. Doshi, J., Erus, G., Ou, Y., Gaonkar, B. & Davatzikos, C. Multi-atlas skull-stripping. Acad. Radiol. 20, 1566–1576 (2013).

  34. He, X., Wang, A. Q. & Sabuncu, M. R. Neural pre-processing: a learning framework for end-to-end brain MRI pre-processing. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2023, 258–267 (Springer Nature Switzerland, 2023).

  35. Yuan, Y., Ahn, E., Feng, D., Khadra, M. & Kim, J. Z-SSMNet: zonal-aware self-supervised mesh network for prostate cancer detection and diagnosis with bi-parametric MRI. Comput. Med. Imaging Graph. 122, 102510 (2025).

  36. Petersen, R. C. et al. Alzheimer’s disease neuroimaging initiative (ADNI): clinical characterization. Neurology 74, 201–209 (2010).

  37. Saha, A. et al. Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study. Lancet Oncol. 25, 879–887 (2024).

  38. Lou, M. et al. SDR-Former: a siamese dual-resolution transformer for liver lesion classification using 3D multi-phase imaging. Neural Networks 107228 (2025).

  39. Nárai, Á. et al. Movement-related artefacts (MR-ART) dataset of matched motion-corrupted and clean structural MRI brain scans. Sci. Data 9, 630 (2022).

  40. Dosovitskiy, A. et al. An image is worth 16x16 words: transformers for image recognition at scale. In International Conference on Learning Representations (2021). https://openreview.net/forum?id=YicbFdNTTy.

  41. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186 (Association for Computational Linguistics, 2019). https://aclanthology.org/N19-1423/.

  42. Gu, Y. et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans. Comput. Healthcare 3 (2021). https://doi.org/10.1145/3458754.

  43. Grattafiori, A. et al. The Llama 3 herd of models. Preprint at https://arxiv.org/abs/2407.21783 (2024).

  44. Lee, D., de Keizer, N., Lau, F. & Cornet, R. Literature review of SNOMED CT use. J. Am. Med. Inform. Assoc. 21, e11–e19 (2013).

  45. Amazon Web Services. Amazon Comprehend Medical - extract insights from medical text. https://aws.amazon.com/comprehend/medical/.

  46. Dong, H. et al. MRI-CORE: a foundation model for magnetic resonance imaging. Preprint at https://arxiv.org/abs/2404.09957 (2024).

  47. Myronenko, A. 3D MRI brain tumor segmentation using autoencoder regularization. In Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, 311–320 (Springer International Publishing, Cham, 2019).

  48. Isensee, F. et al. nnU-Net revisited: a call for rigorous validation in 3D medical image segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2024, 488–498 (Springer Nature Switzerland, 2024).


Acknowledgements

The authors thank Seyed Iman Zare Estakhraji, Marc Lebel, Michail Fanariotis, and anonymous reviewers for their constructive feedback and discussions throughout the development of this study.

Author information

Authors and Affiliations

  1. GE Healthcare, Seattle, WA, USA

    Zhijian Yang, Noel DSouza, Xiaojian Xu, Farzin Haddadpour, Emanuele Valeriano, Bharadwaj Swaminathan, Lei Wu, Parminder Bhatia, Taha Kass-Hout & Erhan Bas

  2. GE Healthcare, Budapest, Hungary

    Istvan Megyeri, Amin Honarmandi Shandiz, Krisztian Koos & Laszlo Rusko


Contributions

Z.Y. designed the pretraining model with contributions from X.X., N.D., and I.M. on model design and data preparation. N.D., I.M., X.X., A.H., F.H., K.K., and L.R. worked on decoder design and carried out the fine-tuning experiments. E.V., B.S., L.W., P.B., and T.K.H. assisted with the experimental setup for fine-tuning and the analysis of results. E.B. designed and directed the project with contributions from P.B. and T.K.H. Z.Y. wrote the majority of the manuscript with input from N.D., I.M., X.X., A.H., F.H., K.K., L.R., and E.B. All authors discussed the results and contributed to the final manuscript.

Corresponding authors

Correspondence to Zhijian Yang or Erhan Bas.

Ethics declarations

Competing interests

All authors are employees of GE Healthcare. The authors declare no additional competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article


Cite this article

Yang, Z., DSouza, N., Megyeri, I. et al. Decipher-MR: a vision-language foundation model for 3D MRI representations. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02596-4


  • Received: 29 August 2025

  • Accepted: 21 March 2026

  • Published: 04 April 2026

  • DOI: https://doi.org/10.1038/s41746-026-02596-4

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Download PDF

Associated content

Collection

Multimodal AI for Digital Medicine


npj Digital Medicine (npj Digit. Med.)

ISSN 2398-6352 (online)
