Abstract
Deep learning (DL) applications in healthcare are expanding beyond proof-of-concept studies. Yet the extent of their real-world implementation and their impact on patient care and clinical workflows remain unclear because prospective real-world evidence is limited. Understanding how DL tools perform in real clinical environments is critical for guiding successful and sustainable deployment. Using a layered methodology built on established implementation science frameworks, this systematic review aimed to map the implementation strategies and outcomes of prospective DL implementation studies and, based on the gaps identified, to propose recommendations that can guide future implementation of DL systems. Twenty articles were included: 3 from radiology, 1 from otolaryngology, 3 from dermatology, and 13 from ophthalmology. All studies assessed clinical outcomes, demonstrating the effectiveness and feasibility of integrating DL systems into existing clinical workflows. Adoption and appropriateness were the most frequently evaluated implementation outcomes; only one study evaluated implementation costs, and none evaluated sustainability. Stakeholder acceptability was evaluated in only 8 studies. Given the paucity of real-world DL implementation research, continued research into the clinical deployment of DL systems, framed by hybrid effectiveness-implementation study designs, is essential to facilitate their seamless and effective adoption into clinical practice.
Data availability
The data that support the results of this study will be made available upon request to the corresponding author.
Acknowledgements
This project was supported by the National Medical Research Council of Singapore (NMRC/MOH/HCSAINV21nov-0001) (YCT).
Author information
Contributions
R.M.W.T.T., E.L., and Y.C.T. contributed to the design of the study. R.M.W.T.T. and L.C. performed the literature search and interpreted the results. R.M.W.T.T., L.C., and E.L. drafted the initial version of the manuscript. R.M.W.T.T. and E.L. created the figures. R.M.W.T.T., L.C., J.H.L.G., Y.C., T.C., E.L., and Y.C.T. contributed additional content and made revisions to the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Tseng, R.M.W.W., Ong, L.C., Goh, J.H.L. et al. Prospective real-world implementation of deep learning systems in healthcare: a systematic review guided by implementation science. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02358-2
DOI: https://doi.org/10.1038/s41746-026-02358-2


