npj Digital Medicine
Clinical utility of foundation models in musculoskeletal MRI for biomarker fidelity and predictive outcomes
  • Article
  • Open access
  • Published: 24 March 2026

  • Gabrielle Hoyer1,2,3,
  • Michelle W. Tong1,2,3,
  • Rupsa Bhattacharjee1,4,
  • Valentina Pedoia1,5 &
  • …
  • Sharmila Majumdar1,3 

npj Digital Medicine (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Biomarkers
  • Computational biology and bioinformatics
  • Health care
  • Medical research

Abstract

Precision medicine in musculoskeletal imaging requires scalable measurement infrastructure. We developed a modular system that converts routine MRI into standardized quantitative biomarkers suitable for clinical decision support. Promptable foundation segmenters (SAM, SAM2, MedSAM) were fine-tuned across heterogeneous musculoskeletal datasets and coupled to automated detection for fully automatic prompting. Fine-tuned segmentations yielded clinically reliable measurements with high concordance to expert annotations across cartilage, bone, and soft tissue biomarkers. Using the same measurements, we demonstrate two applications: (i) a three-stage knee triage cascade that reduces verification workload while maintaining sensitivity, and (ii) 48-month landmark models that forecast knee replacement and incident osteoarthritis with favorable calibration and net benefit across clinically relevant thresholds. Our model-agnostic, open-source architecture enables independent validation and development. This work validates a pathway from automated measurement to clinical decision: reliable biomarkers drive both workload optimization today and patient risk stratification tomorrow, and the developed framework shows how foundation models can be operationalized within precision medicine systems.
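
To make the "net benefit across clinically relevant thresholds" in the abstract concrete, here is a minimal Python sketch. It assumes the standard decision-curve definition: net benefit at threshold probability p_t is TP/n − (FP/n) · p_t/(1 − p_t). The function names and toy data are hypothetical and are not taken from the study or its code. The sketch also ignores censoring, which a 48-month time-to-event analysis would normally need to handle.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], risk: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on every case whose predicted risk >= threshold.

    Assumes binary outcomes (1 = event, 0 = no event) with no censoring.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0 or n != len(risk):
        raise ValueError("y_true and risk must be non-empty and of equal length")
    # True positives: flagged cases that had the event.
    tp = sum(1 for y, p in zip(y_true, risk) if p >= threshold and y == 1)
    # False positives: flagged cases that did not have the event.
    fp = sum(1 for y, p in zip(y_true, risk) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: act on everyone regardless of predicted risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0:
        raise ValueError("y_true must be non-empty")
    prevalence = sum(y_true) / n
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 10 knees, 3 events.
outcomes = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
risks = [0.80, 0.10, 0.30, 0.60, 0.05, 0.20, 0.40, 0.10, 0.15, 0.02]

nb_model = net_benefit(outcomes, risks, threshold=0.25)
nb_all = net_benefit_treat_all(outcomes, threshold=0.25)
print(f"model: {nb_model:.3f}, treat-all: {nb_all:.3f}")
```

At a given threshold, a model is useful in this sense only if its net benefit exceeds both the treat-all value and zero (the treat-none strategy). Repeating the calculation over a range of thresholds traces the decision curve.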

Data availability

All Supplementary Tables (S0–S25) and study-generated Data Tables (D1-D57) cited in the manuscript are publicly available at Figshare: https://doi.org/10.6084/m9.figshare.29633207. Osteoarthritis Initiative (OAI) MRI scans can be accessed via the OAI data portal with registration and data use agreement. Additional institution-specific MRI datasets are subject to institutional review board restrictions; deidentified versions are available from the corresponding author upon reasonable request. Fine-tuned segmentation weights used in this study will be deposited in a public repository at the time of publication. Versioned metadata describing training datasets, label maps, and inference settings will be included to support reproduction and external validation.

Code availability

All code, configuration files, and preprocessing scripts are available at: https://github.com/gabbieHoyer/AutoMedLabel. Documentation and environment files are provided for reproducibility. Additional details are described in the Supplementary Engineering Framework.

Acknowledgements

This research was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIH-NIAMS) through grants R01AR069006, UH3AR076724, R00AR070902, R01AR078762, P50AR060752, R61AR073552, R33AR073552, R01AR0796471, R01AR046905, and R01AR078917. Data and resources from the Osteoarthritis Initiative (OAI) were used in this study. The OAI is a public-private partnership supported by NIH contracts N01-AR-2-2258 through N01-AR-2-2262 and the Foundation for the National Institutes of Health, with contributions from Merck, Novartis, GlaxoSmithKline, and Pfizer. The funders had no role in study design, data acquisition, analysis, interpretation, or manuscript preparation. We thank members of the Musculoskeletal Quantitative Imaging Research (MQIR) group at UCSF for their input and support throughout the development of this work.

Author information

Authors and Affiliations

  1. Center for Intelligent Imaging, Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, USA

    Gabrielle Hoyer, Michelle W. Tong, Rupsa Bhattacharjee, Valentina Pedoia & Sharmila Majumdar

  2. Department of Bioengineering, University of California, Berkeley, CA, USA

    Gabrielle Hoyer & Michelle W. Tong

  3. Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA

    Gabrielle Hoyer, Michelle W. Tong & Sharmila Majumdar

  4. Department of Medical Sciences and Technology, Indian Institute of Technology Madras, Chennai, India

    Rupsa Bhattacharjee

  5. Bay Area Institute of Computation, Altos Labs, Redwood City, CA, USA

    Valentina Pedoia

Contributions

All authors contributed to the conception and design of the study, as well as to the preparation and approval of the manuscript and Supplementary Materials. G.H. developed the software infrastructure, including metadata management, model fine-tuning, evaluation pipelines, and the automated AutoLabel inference system. Data aggregation and preprocessing were carried out by G.H. and M.W.T., providing a unified basis for analysis. Model training, fine-tuning, and evaluation were conducted by G.H. Statistical design was a collaborative effort among G.H., V.P., and S.M., with G.H. conducting the analysis and V.P. and S.M. performing technical verification. Biomarker evaluation was jointly ideated by all authors, with implementation and analysis executed by G.H.; contributions from M.W.T. and R.B. in data preparation and results validation were integral to the process. G.H. conceptualized and carried out clinical utility proof-of-concept analyses, which were validated by S.M. V.P. and S.M. provided leadership in conceptualizing the study, securing funding, and offering valuable input during the manuscript revision process.

Corresponding author

Correspondence to Gabrielle Hoyer.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hoyer, G., Tong, M.W., Bhattacharjee, R. et al. Clinical utility of foundation models in musculoskeletal MRI for biomarker fidelity and predictive outcomes. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02520-w

  • Received: 15 November 2025

  • Accepted: 26 February 2026

  • Published: 24 March 2026

  • DOI: https://doi.org/10.1038/s41746-026-02520-w

Associated content

Collection

Emerging Applications of Machine Learning and AI for Predictive Modeling in Precision Medicine
