Impact of an AI prognostic tool on clinician performance in colorectal liver metastases

Chen, Qichen; Tong, Jinliang; Deng, Yiqiao; Bi, Xinyu; Li, Yuan; Li, Kan; Zhao, Hong

doi:10.1038/s41746-026-02606-5

Article
Open access
Published: 08 April 2026

Qichen Chen¹, Jinliang Tong², Yiqiao Deng², Xinyu Bi², Yuan Li³, Kan Li⁴ & Hong Zhao²

npj Digital Medicine (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

While thousands of AI prediction models are published annually, few are adopted into routine practice, partly because improved statistical performance does not necessarily translate into meaningful impact on clinical decision-making. We evaluated how a machine learning–based prognostic tool influences clinician performance in colorectal liver metastases (CRLM). In a prospective, randomized multi-reader multi-case trial (NCT07027605; registration date: January 1, 2025), 12 surgical oncologists assessed 166 retrospective CRLM cases with and without tool assistance in a crossed design with a 5-week washout. The primary endpoint was the difference in area under the curve (AUC) for predicting 3-year mortality. Between January and July 2025, the 12 readers completed 3984 assessments. Model assistance significantly improved the AUC for 3-year mortality prediction (mean difference 0.091; 95% CI 0.001–0.181; P = 0.048) and consistently improved accuracy across secondary prognostic endpoints. It also reduced mean decision time (2.53 vs. 3.04 minutes with and without assistance, respectively) and increased reader confidence. Benefits were greatest for junior to mid-level surgical oncologists. This exploratory study demonstrates that a machine learning prognostic tool can significantly improve accuracy, efficiency, and confidence in CRLM evaluation.
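The figures reported in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the authors' analysis code. It assumes that every reader assessed every case under both conditions, and it uses a normal approximation to relate the reported confidence interval to a two-sided P value. Multi-reader multi-case analyses typically rely on t-based inference, so the approximate P value need not match the reported one exactly. The function names are hypothetical.

```python
from statistics import NormalDist

# Figures taken from the abstract.
READERS, CASES, CONDITIONS = 12, 166, 2
MEAN_DIFF, CI_LOW, CI_HIGH = 0.091, 0.001, 0.181


def total_assessments(readers: int, cases: int, conditions: int) -> int:
    """Number of reads in a fully crossed design (assumed here)."""
    return readers * cases * conditions


def approx_two_sided_p(diff: float, low: float, high: float,
                       level: float = 0.95) -> float:
    """Approximate two-sided P value implied by a symmetric CI.

    Assumption: normal sampling distribution, not the t-based
    inference a multi-reader multi-case analysis would normally use.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = (high - low) / (2 * z_crit)
    z = diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


print(total_assessments(READERS, CASES, CONDITIONS))  # prints 3984
print(round(approx_two_sided_p(MEAN_DIFF, CI_LOW, CI_HIGH), 3))
```

Under these assumptions the assessment count reproduces the reported 3984, and the approximate P value falls just below 0.05, in line with a confidence interval whose lower bound sits just above zero.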

Data availability

The datasets generated and/or analyzed during the current study are not publicly available due to restrictions designed to protect patient privacy and in accordance with the ethical approval governing this study, but are available from the corresponding author on reasonable request.

Code availability

The code generated in this study will be made available by the corresponding author upon reasonable request.

Acknowledgements

This study was supported by the National Natural Science Foundation of China (Grant nos. 82503985 and 82141127), the National Science and Technology Major Project of the Ministry of Science and Technology of China (Grant no. 2024ZD0520500), the National Key Research and Development Program of China (Grant nos. 2023YFC3403800 and 2023YFC3403804), and the CAMS Innovation Fund for Medical Sciences (Grant no. 2021-I2M-C&T-B-057).

Author information

These authors contributed equally: Qichen Chen, Jinliang Tong, Yiqiao Deng.

Authors and Affiliations

Department of Colorectal Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Qichen Chen
Department of Hepatobiliary Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Jinliang Tong, Yiqiao Deng, Xinyu Bi & Hong Zhao
Department of Colorectal Surgery, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
Yuan Li
Daiichi Sankyo, Basking Ridge, NJ, USA
Kan Li

Contributions

Concept and design: Acquisition, analysis, or interpretation of data: Qichen Chen, Jinliang Tong, Yiqiao Deng, Xinyu Bi, Yuan Li, Kan Li, and Hong Zhao. Drafting of the manuscript: Qichen Chen, Jinliang Tong, Yiqiao Deng, Yuan Li, and Kan Li. Critical review of the manuscript for important intellectual content: Qichen Chen, Yuan Li, Kan Li, and Hong Zhao. Statistical analysis: Qichen Chen, Jinliang Tong, Yiqiao Deng, and Kan Li. Administrative, technical, or material support: All authors. Supervision: Qichen Chen, Kan Li, Hong Zhao, and Yuan Li.

Corresponding authors

Correspondence to Qichen Chen, Yuan Li, Kan Li or Hong Zhao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

41746_2026_2606_MOESM1_ESM (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chen, Q., Tong, J., Deng, Y. et al. Impact of an AI prognostic tool on clinician performance in colorectal liver metastases. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02606-5

Received: 06 September 2025
Accepted: 25 March 2026
Published: 08 April 2026
DOI: https://doi.org/10.1038/s41746-026-02606-5