Abstract
While thousands of AI prediction models are published annually, few are adopted into routine practice, partly because improved statistical performance does not necessarily translate into meaningful impact on clinical decision-making. We evaluated how a machine learning–based prognostic tool influences clinician performance in colorectal liver metastases (CRLM). In a prospective, randomized multi-reader multi-case trial (NCT07027605; registration date: January 1, 2025), 12 surgical oncologists assessed 166 retrospective CRLM cases with and without tool assistance in a crossed design with a 5-week washout. The primary endpoint was the difference in AUC for predicting 3-year mortality. Between January and July 2025, the 12 readers completed 3984 assessments. Model assistance significantly improved the AUC for 3-year mortality prediction (mean difference 0.091; 95% CI 0.001–0.181; P = 0.048) and consistently improved accuracy across secondary prognostic endpoints. It also reduced decision time (2.53 vs. 3.04 minutes) and increased reader confidence. Benefits were greatest for junior to mid-level surgical oncologists. This exploratory study demonstrates that a machine learning prognostic tool can significantly improve accuracy, efficiency, and confidence in CRLM evaluation.
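To make the primary endpoint concrete, the following is a minimal sketch of a reader-averaged AUC difference in a fully crossed design. It is not the authors' analysis: the reader scores are synthetic, the 50% event rate and the noise levels are assumptions chosen for illustration, and the helper names (`auc`, `simulate_reader`) are hypothetical. Only the reader count (12) and case count (166) come from the abstract. The sketch also omits the confidence interval, which in a multi-reader multi-case study has to account for correlation across readers and cases.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Empirical AUC: chance that a case with the event is scored above
    a case without it, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate_reader(labels, noise, rng):
    """Hypothetical reader: risk estimate = true outcome + Gaussian noise."""
    return [y + rng.gauss(0, noise) for y in labels]


N_READERS, N_CASES = 12, 166  # figures reported in the abstract
rng = random.Random(0)

# Synthetic 3-year mortality labels; the real event rate is not given here.
labels = [1 if i < N_CASES // 2 else 0 for i in range(N_CASES)]

diffs = []
for _ in range(N_READERS):
    unaided = simulate_reader(labels, noise=1.5, rng=rng)  # assumed noise
    aided = simulate_reader(labels, noise=1.0, rng=rng)    # assumed noise
    diffs.append(auc(aided, labels) - auc(unaided, labels))

# Each reader scores every case twice (aided and unaided).
total_assessments = N_READERS * N_CASES * 2
mean_diff = mean(diffs)

print("assessments:", total_assessments)
print("reader-averaged AUC difference:", round(mean_diff, 3))
```

The assessment count in the sketch, 12 readers × 166 cases × 2 reading conditions, matches the 3984 assessments reported above.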
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to restrictions designed to protect patient privacy and in accordance with the ethical approval governing this study, but are available from the corresponding author on reasonable request.
Code availability
The code generated in this study will be made available by the corresponding author upon reasonable request.
Acknowledgements
This study was supported by the National Natural Science Foundation of China (Grant nos. 82503985 and 82141127), the National Science and Technology Major Project of the Ministry of Science and Technology of China (Grant no. 2024ZD0520500), the National Key Research and Development Program of China (Grant nos. 2023YFC3403800 and 2023YFC3403804), and the CAMS Innovation Fund for Medical Sciences (Grant no. 2021-I2M-C&T-B-057).
Author information
Contributions
Concept and design: Acquisition, analysis, or interpretation of data: Qichen Chen, Jinliang Tong, Yiqiao Deng, Xinyu Bi, Yuan Li, Kan Li, and Hong Zhao. Drafting of the manuscript: Qichen Chen, Jinliang Tong, Yiqiao Deng, Yuan Li, and Kan Li. Critical review of the manuscript for important intellectual content: Qichen Chen, Yuan Li, Kan Li, and Hong Zhao. Statistical analysis: Qichen Chen, Jinliang Tong, Yiqiao Deng, and Kan Li. Administrative, technical, or material support: All authors. Supervision: Qichen Chen, Kan Li, Hong Zhao, and Yuan Li.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, Q., Tong, J., Deng, Y. et al. Impact of an AI prognostic tool on clinician performance in colorectal liver metastases. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02606-5
DOI: https://doi.org/10.1038/s41746-026-02606-5