Accelerating AI innovation in healthcare: real-world clinical research applications on the Mayo Clinic Platform

Yu, Yue; Hu, Xinyue; Rajaganapathy, Sivaraman; Feng, Jingna; Abdelhameed, Ahmed; Li, Xiaodi; Li, Jianfu; Liu, Xiaoke; Yang, Liu; Ertekin-Taner, Nilüfer; Fiero, Phil; Boroumand, Soulmaz; Larsen, Richard; Goyal, Maneesh; Otley, Clark C.; Zong, Nansu; Shah, Vijay H.; Halamka, John D.; Tao, Cui

doi:10.1038/s44401-026-00068-1

Brief Communication
Open access
Published: 16 February 2026

Yue Yu¹,
Xinyue Hu¹,
Sivaraman Rajaganapathy¹,
Jingna Feng¹,
Ahmed Abdelhameed¹,
Xiaodi Li¹,
Jianfu Li¹,
Xiaoke Liu²,
Liu Yang³,
Nilüfer Ertekin-Taner^4,5,
Phil Fiero⁶,
Soulmaz Boroumand⁶,
Richard Larsen⁶,
Maneesh Goyal⁶,
Clark C. Otley^6,7,
Nansu Zong¹,
Vijay H. Shah⁸,
John D. Halamka⁶ &
Cui Tao^1,6

npj Health Systems volume 3, Article number: 17 (2026)

Abstract

Artificial intelligence (AI) holds promise for healthcare, but real-world implementation remains difficult. The Mayo Clinic Platform (MCP) addresses this by providing scalable, multi-institutional, de-identified data and analytical tools. Through four research projects, we demonstrate MCP’s ability to support efficient cohort identification, AI model development, and real-world evidence generation. MCP enables broader accessibility and standardization compared to institutional EHRs, positioning it as a powerful platform for advancing translational research and precision medicine.

Introduction

In recent years, artificial intelligence (AI) has emerged as a transformative force poised to revolutionize the field of biomedicine^1,2. However, the transition of AI algorithms from in silico simulations to practical, real-world clinical applications remains challenging^3,4. Effective implementation of AI in healthcare requires a comprehensive integration of the entire ecosystem, extending beyond the algorithms themselves⁵. For instance, within the medical domain, a prominent trend is the development of multimodal AI models that integrate diverse data types across multiple modalities^6,7,8. This advancement, however, introduces complex issues, such as safeguarding patient privacy amidst the aggregation of sensitive information^9,10. From the perspective of AI model development, advancing beyond retrospective design and validation poses an additional hurdle^11,12. Moreover, ensuring the accessibility of advanced tools and adequate computational resources to accommodate the variable requirements of individual users is essential for widespread adoption and effectiveness^9,13. Another significant challenge is the integration of expert-in-the-loop systems that require no-code AI platforms¹⁴, which are crucial for enabling non-technical medical professionals to effectively use and interact with AI tools without needing extensive programming knowledge.

To accelerate the development of medical AI, several established initiatives, including i2b2/TranSMART^15,16 and OHDSI/OMOP^17,18, have significantly advanced real-world data integration and analytics. In addition, large-scale research platforms such as the All of Us Research Program^19,20 and the UK Biobank^21,22 have emerged to support AI research and data science studies. These platforms offer standardized longitudinal real-world data, with both All of Us and UK Biobank providing EHR data in the OMOP common data model (CDM) format, along with secure, cloud-based environments designed for high-performance computing.

Since 2019, Mayo Clinic has been developing the Mayo Clinic Platform (MCP)²³, which focuses on transforming healthcare through data science and digital health technologies. By leveraging a vast array of standardized clinical data, advanced analytics, and collaborative networks like the Mayo Clinic Care Network, the platform aims to improve patient care and health outcomes. It fosters innovation by enabling healthcare organizations, providers, and digital health companies to access real-time insights and deploy cutting-edge solutions.

In this brief communication, we aim to demonstrate how MCP enables real-world clinical research and AI innovation through practical applications. Specifically, we explore the platform’s capabilities by conducting four representative research projects using real-world EHR data and integrated MCP tools (Fig. 1). By leveraging MCP’s robust data infrastructure and versatile toolset—from intuitive visualizers to advanced AI workspaces—these projects collectively showcase how MCP facilitates scalable, reproducible, and collaborative research. Rather than providing exhaustive technical details for each project, this study highlights the platform’s integrative features, including standardized multi-institutional data, privacy-preserving design, and accessible analytical environments. Together, these examples demonstrate MCP’s pivotal role as an enabling infrastructure for AI-driven clinical research, accelerating translational medicine and advancing precision healthcare.

**Fig. 1: Workflow of the four demonstration projects.**

Demonstration Projects Enabled by MCP

Table 1 summarizes four clinical research projects conducted using the MCP, which include both standard statistical analysis and AI-based research. For the randomized controlled trial (RCT) simulation project, the Cohort Visualizer tool was used to build the study cohort. All projects also utilized the Schema Visualizer and Workspaces for EHR data collection and analysis. The results highlight the effectiveness of MCP’s data, tools, and computing environment in facilitating successful data science research. MCP contributed to significant outcomes across all projects, including the development of a reusable research pipeline, scientific validation of existing studies, and AI-based prediction models. Specifically, Project 1 delivered a reusable pipeline for simulating RCTs using real-world data, offering a cost-effective alternative for evaluating treatment efficacy. Project 2 validated prior findings through robust statistical analysis, contributing real-world evidence supporting the association between antihypertensive medication use and reduced dementia risk. Project 3 validated a deep learning model for predicting progression from mild cognitive impairment (MCI) to Alzheimer’s disease (AD) using EHR data from diverse healthcare systems, showcasing the potential of AI in early disease detection. Project 4 created an advanced predictive model for major adverse cardiovascular events (MACE) following liver transplantation, enabling improved clinical risk stratification. These outcomes highlight MCP’s capacity to accelerate data-driven research, support translational science, and generate actionable insights that can enhance patient care.

Table 1 Overview of Clinical Research Projects Leveraging MCP

As an example of system performance, the MACE after LT prediction project illustrates MCP’s efficiency. For a researcher familiar with the MCP data structure (or OMOP CDM), it typically takes about one week to collect all required structured EHR data (demographics, diagnoses, procedures, and medications) for approximately 15,000 patients. Using a medium-computing configuration (6 CPU cores, 38 GB RAM, no GPU), it takes only about 10 min to train and run the BiGRU deep learning model. This demonstrates that MCP can support both large-scale data processing and rapid model development, providing an efficient and accessible environment for real-world machine learning research.

Key contributions of MCP for real-world AI research

In this study, we demonstrated that MCP has played a critical role in enabling clinical studies using real-world EHR data. While multiple platforms have enabled advances in AI research, this paper focuses on the MCP environment to illustrate practical workflows, collaborations, and outcomes. The MCP provides not only comprehensive, standardized, de-identified, multi-institutional real-world data, but also powerful tools in the data science and healthcare domains. Key features that proved valuable in our work include the Cohort Visualizer, Schema Visualizer, and Workspaces. Our studies not only yielded publishable results from the research perspective, but also effectively leveraged AI-driven methodologies to address real-world clinical challenges, reinforcing the platform’s impact on both academic research and clinical innovation.

While multiple data-sharing and analytics frameworks—such as i2b2/TranSMART and OHDSI/OMOP—have provided valuable infrastructure and tools for real-world evidence research, MCP extends these concepts by integrating federated, multi-institutional data with standardized OMOP CDM formatting and embedding comprehensive research tools within a single cloud-based environment. This approach not only ensures interoperability with existing data standards but also expands accessibility for external researchers through a subscription-based model, supporting both open-source and proprietary analytic pipelines. By combining secure de-identified data access, code-free interfaces, and AI-ready computing environments, MCP serves as a next-generation platform that bridges real-world data analytics and AI-driven translational medicine. Its hybrid design enables scalability and reproducibility while ensuring privacy and compliance. This synthesis of data standardization, federated architecture, and integrated AI development distinguishes MCP as a novel, comprehensive framework for accelerating healthcare innovation.

Compared to traditional institutional real-world data research repositories (that is, standardized real-world clinical data repositories created within individual institutions for research use), the MCP offers distinct advantages for clinical research (Table 2). MCP provides de-identified data, streamlining IRB approvals and accelerating research timelines for users. Additionally, it enables external researchers to access high-quality Mayo EHR data for study analysis and validation, whereas institutional research repositories are usually restricted to internal use²⁴. MCP also incorporates extensive data standardization, particularly for unstructured clinical notes, by offering AI-powered processing to synthesize standardized data representations, thereby improving the utility of unstructured text for clinical decision support²⁵. In contrast, most institutional research repositories rely primarily on medical billing codes as the main data record for research use. Furthermore, MCP is more than a data warehouse: it supports a broad range of users through integrated tools that facilitate research across skill levels, from code-free interfaces to advanced programming environments, enhancing accessibility. In comparison, using institutional repositories is more coding-intensive and involves a steeper learning curve²⁶. Moreover, MCP will integrate not only Mayo Clinic’s data but also data from other academic medical centers that partner with the MCP to contribute de-identified data to MCP’s federated data network (each, a “Data Network Partner”), thereby broadening the scope of available research data. By offering these capabilities, MCP enhances data analysis, improves model validation, and facilitates more efficient and reliable clinical research.

Table 2 Advantages of MCP Compared to the Institutional Real-World Data Research Repositories

To improve accessibility, MCP provides both no-code and code-enabled tools to support researchers with diverse technical backgrounds. The Cohort Visualizer and Schema Visualizer allow non-technical users to explore data and define cohorts through intuitive interfaces, while advanced users can utilize Workspaces and coding environments such as JupyterLab and RStudio for customized analyses. We recognize that accessibility for users with limited data science or machine learning expertise remains a continual area for enhancement. Ongoing development efforts aim to further expand low-code and guided-analytics features, enabling clinicians and other domain experts to engage in AI-driven research more effectively. These initiatives align with MCP’s broader vision to democratize data science and make AI-powered research more inclusive across the healthcare community.

A limitation of this study is that all four projects focused exclusively on structured EHR data within MCP, without incorporating other data modalities. For example, MCP also supports the processing and analysis of unstructured EHR data, including free-text clinical notes, through integrated natural language processing (NLP) and large language model (LLM) pipelines. In the future, we plan to use additional data types available in MCP, including clinical notes, medical images, and omics data, to broaden research opportunities. Furthermore, as external datasets become available, cross-validation across institutions will further strengthen clinical research. Additionally, MCP provides state-of-the-art AI deployment capabilities via infrastructure designed to streamline the integration of AI solutions into clinical workflows. While we have yet to implement these four research projects using such deployment capabilities, future studies will explore them and assess their potential to accelerate the translation of AI-driven innovations into real-world clinical practice. By leveraging MCP, we aim to bridge the gap between research and clinical application.

In the era of AI, the MCP is poised to revolutionize clinical research by advancing multimodal AI, real-world evidence generation, and global data collaboration. By integrating structured EHR data, clinical notes, imaging, and genomics, researchers can leverage MCP’s harmonized data to enhance biomedical knowledge for large medical foundation models. This integration will boost downstream tasks such as predictive analytics for early disease detection and personalized treatment^7,8. MCP also ensures robust and generalizable AI model validation across multiple institutions. Its data ecosystem will facilitate large-scale studies while maintaining patient privacy, effectively bridging the gap between AI research and real-world clinical implementation²⁷. Moreover, MCP may transform drug development by enabling real-world evidence-based trials that extend beyond traditional clinical settings. This approach allows for broader participation and more diverse data collection, enhancing trial efficiency and relevance²⁸. In addition, MCP can facilitate our “Clinical Trials Beyond Walls” approach, which allows broader participation by removing barriers to patient involvement and includes initiatives with underserved communities to enhance the relevance and quality of clinical trials²⁹. With scalable research tools and expanded accessibility, MCP will empower a diverse research community, accelerating medical innovation and driving the future of precision medicine and proactive healthcare.

Method

Platform Architecture Overview

The MCP is a secure, cloud-based data science environment designed to accelerate research and innovation through access to large-scale, de-identified, standardized clinical data and integrated analytical tools. The platform architecture is built to ensure scalability, privacy, and accessibility for researchers across diverse disciplines.

Extensive De-identified and Standardized Data Resources: The MCP employs an innovative de-identification and standardization process applied to data from more than 15.1 million patients. To safeguard patient privacy, the platform uses a multilayered de-identification strategy that combines rule-based heuristics and deep learning models to identify and replace personally identifiable information³⁰. These measures ensure full compliance with HIPAA and institutional governance policies. In addition, the platform provides extensive data standardization, including mapping EHR data to standard medical terminologies and common data models. This rich, multimodal dataset enables a wide range of research applications, including AI model training, real-world evidence generation, and clinical insight discovery.
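To make the rule-based layer of such a strategy concrete, the following is a minimal sketch in Python. It is not the MCP de-identification pipeline: the three patterns, the placeholder labels, and the sample note are assumptions chosen for illustration, and a production system would pair far broader rules with a learned model, as described above.

```python
import re

# Hypothetical patterns for illustration only; real systems need many more
# rules and a learned model to catch identifiers that rules miss.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    ("MRN", re.compile(r"\bMRN[:\s]*\d+\b")),
]


def scrub(text: str) -> str:
    """Replace each matched span with a bracketed category placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


# Synthetic note; no real patient data.
note = "Seen on 03/14/2021, MRN: 123456. Call 507-555-0100 to follow up."
print(scrub(note))
# prints: Seen on [DATE], [MRN]. Call [PHONE] to follow up.
```

Rules of this kind are precise but brittle, which is one reason a multilayered design would add a statistical model on top of them.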

Integrated Research Tools: The MCP provides a comprehensive suite of research tools that streamline the entire data-to-discovery workflow. These tools enable secure data access, exploration, and analytics within a unified platform. Designed for scalability and ease of use, the MCP tool ecosystem supports both technical and non-technical users, promoting efficient, reproducible, and collaborative research across diverse data types while maintaining rigorous standards for privacy, governance, and compliance.

Dedicated Data Science Environment: Researchers access MCP through a secure, cloud-hosted data science environment tailored for their use. This environment integrates the MCP research tools and provides preconfigured support for open-source analytical frameworks such as Python, R, and TensorFlow. It offers controlled, compliant access to de-identified data and high-performance computing resources, enabling seamless model training and evaluation within a managed and privacy-preserving infrastructure.

This architecture establishes MCP as a scalable, privacy-preserving, and AI-ready research environment that enables investigators to generate actionable insights from de-identified real-world data while maintaining the highest standards of security and compliance.

Real-world observational data in MCP

MCP provides access to extensive, high-quality clinical data, including standardized structured data (e.g., diagnoses, lab results, medications) and unstructured data (e.g., clinical notes, images). These de-identified data span diverse demographics and capture patient journeys over time. Currently, MCP’s datasets include over 15.1 million patient records, 12 billion radiology images, 3.2 billion lab results, and 1.65 billion clinical notes, all accessible through a secure data science environment. In addition to the Mayo-specific standardized EHR data, MCP also provides EHR data in the OMOP CDM format, which enhances interoperability and allows users to leverage analytic pipelines and tools developed within the OHDSI ecosystem.
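As an illustration of what OMOP CDM formatting makes possible, the sketch below runs a portable cohort query against two standard OMOP tables, `person` and `condition_occurrence`. It uses an in-memory SQLite database as a stand-in, not the MCP database; the rows are synthetic and the concept ID 1001 is a placeholder rather than a real OMOP concept.

```python
import sqlite3

# In-memory stand-in for an OMOP CDM database, reduced to the columns used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
CREATE TABLE condition_occurrence (
    person_id INTEGER, condition_concept_id INTEGER, condition_start_date TEXT);
INSERT INTO person VALUES (1, 1950), (2, 1962), (3, 1948);
INSERT INTO condition_occurrence VALUES
    (1, 1001, '2015-03-02'), (1, 1001, '2017-08-19'),
    (2, 2002, '2016-01-11'), (3, 1001, '2019-05-30');
""")

TARGET_CONCEPT = 1001  # placeholder, not a real OMOP concept ID

# One row per patient with the target condition, indexed at first diagnosis.
rows = conn.execute("""
    SELECT p.person_id, p.year_of_birth,
           MIN(c.condition_start_date) AS index_date
    FROM person AS p
    JOIN condition_occurrence AS c ON c.person_id = p.person_id
    WHERE c.condition_concept_id = ?
    GROUP BY p.person_id, p.year_of_birth
    ORDER BY p.person_id
""", (TARGET_CONCEPT,)).fetchall()

print(rows)
# prints: [(1, 1950, '2015-03-02'), (3, 1948, '2019-05-30')]
```

Because the table and column names follow the common data model, a query written this way should need little change when moved between OMOP-formatted sources.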

MCP tools used in this study

MCP partners with nference, Inc.³¹ to make various tools available to accommodate different needs. Because this study used only structured EHR data within MCP, the following tools were utilized.

Cohort Visualizer facilitates the quick creation, characterization, and comparison of patient cohorts for hypothesis testing and analysis using EHR data. It supports both structured and unstructured data, offering code-free analytics and intuitive visualization tools. Using the Cohort Builder, users can load existing cohorts or create new ones and analyze them in graphical or tabular formats. With user-friendly navigation, it allows users, regardless of technical expertise, to explore vast clinical datasets using standard clinical codes or keywords, helping to accelerate clinical research and address unmet needs in translational medicine. Additionally, for more detailed downstream analysis, it provides SQL code to facilitate data retrieval from the EHR database. Figure 2A shows the user interface of the MCP Cohort Builder, where users can define and filter patient cohorts using structured and unstructured EHR data. Figure 2B illustrates the Cohort Comparison interface, which allows users to visualize and compare cohort characteristics through graphical summaries.
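The logic behind code-based cohort definition and comparison can be sketched in a few lines of Python. This is an illustration only and does not reflect how the Cohort Visualizer is implemented: the `Patient` record, the `build_cohort` and `summarize` helpers, and the toy codes such as "HF" and "DRUG_A" are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Patient:
    patient_id: int
    age: int
    codes: frozenset  # toy diagnosis or medication codes


def build_cohort(patients, include, exclude=frozenset()):
    """Keep patients having every code in `include` and none in `exclude`."""
    return [p for p in patients
            if include <= p.codes and not (exclude & p.codes)]


def summarize(cohort):
    """Return the size and mean age of a cohort."""
    return {"n": len(cohort),
            "mean_age": mean(p.age for p in cohort) if cohort else None}


# Synthetic patients; no real data.
patients = [
    Patient(1, 70, frozenset({"HF", "DRUG_A"})),
    Patient(2, 64, frozenset({"HF", "DRUG_B"})),
    Patient(3, 58, frozenset({"HF", "DRUG_A", "CKD"})),
    Patient(4, 66, frozenset({"HF", "DRUG_A"})),
]

cohort_a = build_cohort(patients, include=frozenset({"HF", "DRUG_A"}),
                        exclude=frozenset({"CKD"}))
cohort_b = build_cohort(patients, include=frozenset({"HF", "DRUG_B"}))
print(summarize(cohort_a), summarize(cohort_b))
```

Here cohort A keeps patients 1 and 4 (patient 3 is excluded by the "CKD" criterion), and cohort B keeps patient 2, so the two summaries can be compared side by side.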

Schema Visualizer provides an interactive interface for exploring the data dictionary and schema within MCP. It offers detailed information on tables, columns, and their relationships, along with query code examples for downstream data collection (Fig. 2C). Additionally, it features an advanced search tool that enables users to efficiently locate specific tables, columns, or values within the data schema.

Workspaces in MCP offer a comprehensive environment for accessing data and computing resources, supporting advanced analytics and data science workflows. The platform provides scalable computational resources tailored to a variety of research needs. For an individual researcher, the maximum available configuration includes 208 CPU cores, 1872 GB of RAM, and 8 NVIDIA H100 80 GB GPUs, ensuring capacity for complex, data-intensive machine learning workflows. Workspaces also provide the latest open-source tools, packages, and libraries for cloud-based computation, with integrated support for JupyterLab, VSCode, and RStudio to accommodate diverse coding needs. This all-in-one platform streamlines data collection, processing, and analysis. Additionally, Workspaces include high-performance computing capabilities for resource-intensive tasks such as data mining, machine learning, and deep learning. They also offer code-level guidance for various applications, including data extraction, large language model (LLM) execution, and medical image processing. Furthermore, users can leverage Git within Workspaces to efficiently manage and collaborate on their repositories in GitHub. Figure 2D, E shows the interface pages of the MCP Workspaces.

Research projects conducted on MCP

To comprehensively showcase MCP’s capabilities across various clinical research scenarios, we designed four distinct projects. Figure 3 illustrates the aims of these projects within their respective clinical research contexts. Detailed descriptions of each project are provided below.

Project 1. Simulating drug efficacy randomized controlled trials (RCTs) for heart failure (HF) patients using real-world observational clinical data. This project leverages the rich retrospective data available on MCP to simulate the conditions of traditional RCTs. By doing so, it enables high-quality research that sidesteps the usual costs and ethical concerns associated with traditional RCTs. More specifically, we developed methodologies to simulate RCTs for evaluating drug efficacy in HF patients using real-world observational data. Key objectives include identifying suitable RCT candidates for simulation and leveraging EHR data to replicate heart failure drug efficacy trials, thereby enabling robust comparative effectiveness research in the absence of traditional RCTs. Additionally, this project explores the use of the Cohort Visualizer, a code-free analytical tool designed for researchers without a data science background, facilitating accessible and efficient cohort analysis.
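One common ingredient when approximating a trial with observational data is balancing the treated and comparison groups on baseline covariates. The sketch below shows 1:1 exact matching followed by a risk difference. It is a generic illustration, not the methodology developed in Project 1: the covariates, the `match_exact` and `risk_difference` helpers, and the records are all assumptions.

```python
from collections import defaultdict


def match_exact(treated, controls, key):
    """Pair each treated record with one unused control sharing the same key."""
    pool = defaultdict(list)
    for c in controls:
        pool[key(c)].append(c)
    pairs = []
    for t in treated:
        candidates = pool[key(t)]
        if candidates:
            pairs.append((t, candidates.pop()))
    return pairs


def risk_difference(pairs):
    """Event rate among matched treated minus the rate among matched controls."""
    if not pairs:
        raise ValueError("no matched pairs")
    n = len(pairs)
    treated_rate = sum(t["event"] for t, _ in pairs) / n
    control_rate = sum(c["event"] for _, c in pairs) / n
    return treated_rate - control_rate


# Synthetic records: coarsened age band, sex, and a binary outcome.
treated = [
    {"id": 1, "age_band": "60-69", "sex": "F", "event": 0},
    {"id": 2, "age_band": "60-69", "sex": "M", "event": 1},
    {"id": 3, "age_band": "70-79", "sex": "F", "event": 0},
    {"id": 4, "age_band": "80-89", "sex": "M", "event": 0},  # no control available
]
controls = [
    {"id": 5, "age_band": "60-69", "sex": "F", "event": 1},
    {"id": 6, "age_band": "60-69", "sex": "M", "event": 1},
    {"id": 7, "age_band": "70-79", "sex": "F", "event": 0},
    {"id": 8, "age_band": "70-79", "sex": "M", "event": 1},
]

pairs = match_exact(treated, controls, key=lambda r: (r["age_band"], r["sex"]))
rd = risk_difference(pairs)
print(len(pairs), round(rd, 3))
```

In this toy data three of the four treated patients find a match, and the matched event rates are 1/3 versus 2/3. Real analyses would typically match on many more covariates or use propensity scores, and would account for follow-up time.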

Project 2. Impact of antihypertensive medications (AHMs) on Alzheimer’s Disease and Related Dementias (ADRD) risk in hypertensive patients with mild cognitive impairment (MCI). This study aims to validate findings from a prior study³² that suggested AHM use may be associated with a reduced risk of ADRD in hypertensive patients with MCI. Utilizing real-world observational data, the primary objective is to perform survival analysis to assess the relationship between AHM use and ADRD progression. Additionally, the study investigates potential drug-drug interactions between AHMs, statins, and metformin within the target patient cohort, providing further insights into pharmacological influences on dementia risk. This project serves as a simulation of traditional clinical research, employing statistical analysis to assess real-world evidence.
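A basic building block of the survival analysis mentioned above is the Kaplan-Meier estimator, sketched here in plain Python. The source does not state which estimators or models Project 2 used, so this is offered only as a generic example; the follow-up times and event indicators are synthetic.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct event time.

    durations: follow-up time per patient.
    events: 1 if the outcome occurred at that time, 0 if the patient was censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Synthetic follow-up in years for two hypothetical groups.
users_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
non_users_curve = kaplan_meier([1, 1, 2, 3, 4], [1, 1, 1, 0, 1])
print(users_curve)
print(non_users_curve)
```

For the first group the estimate steps to 0.8 at year 1, to about 0.533 at year 3, and to 0 at year 5; censored patients leave the risk set without causing a step. Comparing such curves between groups is usually followed by a formal test or a regression model that adjusts for confounders.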

Project 3. Building a Mild Cognitive Impairment (MCI) to Alzheimer’s Disease (AD) progression prediction model using EHR data and a deep learning method. This project focuses on training and validating a deep learning model³³ to predict the progression from MCI, considered to be a prodrome to dementia³⁴, to AD using longitudinal EHR data. Specifically, it employs a Bidirectional Gated Recurrent Unit (BiGRU) deep learning model to forecast MCI progression at varying time intervals, extending up to five years post-diagnosis. Additionally, the study aims to validate the model’s generalizability across diverse datasets and healthcare systems, ensuring its applicability in real-world clinical settings.
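For readers unfamiliar with the architecture, the sketch below shows the forward pass of a bidirectional GRU over a short visit sequence in plain Python. It is not the published model: the weights are random and untrained, the sizes are tiny, and the input encoding (one multi-hot vector per visit) and the final logistic layer are assumptions. In practice a framework such as TensorFlow or PyTorch would be used.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def init_gru(input_size, hidden_size, rng):
    """Random weights for the update (z), reset (r), and candidate (n) gates."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {g: {"W": mat(hidden_size, input_size),
                "U": mat(hidden_size, hidden_size),
                "b": [0.0] * hidden_size} for g in "zrn"}


def gru_step(p, x, h):
    """One GRU update: the gates decide how much of the old state h to keep."""
    def pre(g, hidden):
        return [a + b + c for a, b, c in
                zip(matvec(p[g]["W"], x), matvec(p[g]["U"], hidden), p[g]["b"])]
    z = [sigmoid(v) for v in pre("z", h)]
    r = [sigmoid(v) for v in pre("r", h)]
    n = [math.tanh(v) for v in pre("n", [ri * hi for ri, hi in zip(r, h)])]
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def bigru_encode(fwd, bwd, sequence, hidden_size):
    """Run one GRU forward in time and one backward, then concatenate final states."""
    h_f = [0.0] * hidden_size
    for x in sequence:
        h_f = gru_step(fwd, x, h_f)
    h_b = [0.0] * hidden_size
    for x in reversed(sequence):
        h_b = gru_step(bwd, x, h_b)
    return h_f + h_b


rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 3, 4
fwd = init_gru(INPUT_SIZE, HIDDEN_SIZE, rng)
bwd = init_gru(INPUT_SIZE, HIDDEN_SIZE, rng)

# Three visits, each a toy multi-hot vector over three hypothetical codes.
visits = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
encoding = bigru_encode(fwd, bwd, visits, HIDDEN_SIZE)

# Untrained logistic output layer mapping the encoding to a probability.
w_out = [rng.uniform(-0.5, 0.5) for _ in encoding]
risk = sigmoid(sum(w * e for w, e in zip(w_out, encoding)))
print(len(encoding), 0.0 < risk < 1.0)
```

Reading the sequence in both directions lets the encoding reflect early and late visits alike, which is the usual motivation for a bidirectional recurrent layer on longitudinal records.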

Project 4. Developing a Deep Learning Model to Predict Major Adverse Cardiovascular Events (MACE) After Liver Transplantation (LT). This project focuses on leveraging longitudinal EHR data to develop advanced deep learning models on the MCP for predicting MACE following LT and on comparing their performance with our previously developed model based on medical claims data³⁵. By identifying high-risk candidates, the model aids clinicians in risk stratification and informs management strategies to improve transplant outcomes. Additionally, the model highlights key predictive features, enabling physicians to implement targeted preventive measures to reduce the likelihood of adverse cardiovascular events. This study demonstrates the capability of MCP in facilitating deep learning model development for clinical research.

Data collection and analysis approach

The MCP tools have played a crucial role in facilitating these projects by providing a unified platform for cohort development, data extraction, and analysis. Specifically, Project 1 leveraged the Cohort Visualizer to identify RCT candidates. Subsequently, all projects utilized Jupyter Notebook to execute SparkSQL API queries for extracting EHR data from the MCP database. Finally, data analysis—including statistical evaluations and deep learning modeling—was conducted within the Workspace using either R or Python.

Platform accessibility and reusability

The MCP is a subscription-based, cloud-hosted research environment accessible to external users following registration and approval. Researchers, healthcare organizations, and industry partners can register to access MCP’s de-identified datasets and integrated tools by completing the required onboarding process. Once registered, users have access to the same standardized data, analytical tools, and secure computing environments described in this paper. The platform supports both open-source and proprietary components—users can utilize open-source tools (e.g., Python, R, TensorFlow, PyTorch) within the MCP Workspaces, ensuring flexibility and reproducibility. This hybrid model promotes collaboration, scalability, and replicable research while maintaining robust privacy and security protections.

Data availability

This study involves analysis of de-identified Electronic Health Record (EHR) data via the Mayo Clinic Platform. Data shown and reported in this manuscript has been extracted from the EHR using an established protocol for data extraction, aimed at preserving patient privacy. The data has been determined to be de-identified pursuant to an expert’s evaluation, in accordance with the HIPAA Privacy Rule. Any data beyond what is reported in the manuscript, including but not limited to the raw EHR data, cannot be shared or released due to the parameters of the expert determination to maintain the data de-identification. Contact corresponding authors for additional details regarding the Mayo Clinic Platform.

References

Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56 (2019).
Article PubMed CAS Google Scholar
Yu, K. H., Beam, A. L. & Kohane, I. S. Artificial intelligence in healthcare. Nat. Biomed. Eng. 2, 719–731 (2018).
Article PubMed Google Scholar
London, A. J. Artificial intelligence in medicine: Overcoming or recapitulating structural challenges to improving patient care?. Cell Rep. Med. 3, 100622 (2022).
Article PubMed PubMed Central Google Scholar
Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17, 195 (2019).
Article PubMed PubMed Central Google Scholar
Kwong, J. C. C., Nickel, G. C., Wang, S. C. Y. & Kvedar, J. C. Integrating artificial intelligence into healthcare systems: more than just the algorithm. Npj Digit. Med. 7, 52 (2024).
Article PubMed PubMed Central Google Scholar
Rajpurkar, P., Chen, E., Banerjee, O. & Topol, E. J. AI in health and medicine. Nat. Med. 28, 31–38 (2022).
Article PubMed CAS Google Scholar
Acosta, J. N., Falcone, G. J., Rajpurkar, P. & Topol, E. J. Multimodal biomedical AI. Nat. Med. 28, 1773–1784 (2022).
Article PubMed CAS Google Scholar
Soenksen, L. R. et al. Integrated multimodal artificial intelligence framework for healthcare applications. Npj Digit. Med. 5, 149 (2022).
Article PubMed PubMed Central Google Scholar
Kaissis, G. A., Makowski, M. R., Rückert, D. & Braren, R. F. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2, 305–311 (2020).
Article Google Scholar
McCradden, M. D., Stephenson, E. A. & Anderson, J. A. Clinical research underlies ethical integration of healthcare artificial intelligence. Nat. Med. 26, 1325–1326 (2020).
Article PubMed CAS Google Scholar
Keane, P. A. & Topol, E. J. With an eye to AI and autonomous diagnosis. Npj. Digit. Med. 1, 40 (2018).
Article PubMed PubMed Central Google Scholar
Varoquaux, G. & Cheplygina, V. Machine learning for medical imaging: methodological failures and recommendations for the future. Npj. Digit. Med. 5, 48 (2022).
Article PubMed PubMed Central Google Scholar
Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med. 380, 1347–1358 (2019).
Article PubMed Google Scholar
Dasegowda, G. et al. No code machine learning: validating the approach on use-case for classifying clavicle fractures. Clin. Imaging 112, 110207 (2024).
Article PubMed Google Scholar
Murphy, S. N., Mendis, M. E., Berkowitz, D. A., Kohane, I. & Chueh, H. C. Integration of clinical and genetic data in the i2b2 architecture. AMIA Annu Symp. Proc. 2006, 1040 (2006).
PubMed PubMed Central Google Scholar
https://i2b2transmart.org/.
Stang, P. E. et al. Advancing the science for active surveillance: rationale and design for the Observational Medical Outcomes Partnership. Ann. Intern Med. 153, 600–606 (2010).
Article PubMed Google Scholar
https://www.ohdsi.org/.
All of Us Research Program Investigators, et al. The “All of Us” Research Program. N. Engl. J. Med. 381:668-676 (2019).
https://allofus.nih.gov/.
Ollier, W., Sprosen, T. & Peakman, T. UK Biobank: from concept to reality. Pharmacogenomics 6, 639–646 (2005).
Article PubMed Google Scholar
https://www.ukbiobank.ac.uk/.
https://www.mayoclinicplatform.org/.
Johnson, A. E. W. et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data. 10, 1 (2023).
Article PubMed PubMed Central CAS Google Scholar
Bongurala, A. R., Save, D., Virmani, A. & Kashyap, R. Transforming Health Care With Artificial Intelligence: Redefining Medical Documentation. Mayo Clin. Proc. Digit. Health 2, 342–347 (2024).
Article PubMed PubMed Central Google Scholar
Kotecha, D. et al. CODE-EHR best-practice framework for the use of structured electronic health-care records in clinical research. Lancet Digit. Health 4, e757–e764 (2022).
Article PubMed CAS Google Scholar
Price, W. N. 2nd & Cohen, I. G. Privacy in the age of medical big data. Nat. Med. 25, 37–43 (2019).
Article PubMed PubMed Central CAS Google Scholar
Subbiah, V. The next generation of evidence-based medicine. Nat. Med. 29, 49–58 (2023).
Article PubMed CAS Google Scholar
Goldberg, J. M., Amin, N. P., Zachariah, K. A. & Bhatt, A. B. The Introduction of AI Into Decentralized Clinical Trials. Preparing a Paradig. Shift. JACC Adv. 3, 101094 (2024).
Google Scholar
Murugadoss K., et al. Building a best-in-class automated de-identification tool for electronic health records through ensemble learning. Patterns (N Y). 12;2:100255. (2021).
https://nference.com/.
Lundin, S. K. et al. Association between risk of Alzheimer’s disease and related dementias and angiotensin receptor Ⅱ blockers treatment for individuals with hypertension in high-volume claims data. EBioMedicine 109, 105378 (2024).
Abdelhameed, A. et al. AI-powered model for accurate prediction of MCI-to-AD progression. Acta Pharm. Sin. B (2025). https://doi.org/10.1016/j.apsb.2025.01.027.
2025 Alzheimer’s disease facts and figures. Alzheimers Dement. 21, e70235 (2025). https://doi.org/10.1002/alz.70235.
Abdelhameed, A. et al. Deep Learning-Based Prediction Modeling of Major Adverse Cardiovascular Events After Liver Transplantation. Mayo Clin. Proc. Digit. Health 2, 221–230 (2024).

Acknowledgements

This study was supported by multiple grants from the National Institutes of Health (NIH), including the National Institute on Aging (R01AG083039, R01AG084236, U24AG088019, U01AG088076, RF AG051504, U01AG046139, R01AG061796, and U19AG074879) and the National Institute of General Medical Sciences (R00GM135488). Additional support was provided by the Alzheimer’s Association Zenith Fellows Award (ZEN-22-969810). We also acknowledge Morgan E. Schacht, J.D., from Mayo Clinic, for her assistance in reviewing the manuscript for legal editing.

Author information

Authors and Affiliations

Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, USA
Yue Yu, Xinyue Hu, Sivaraman Rajaganapathy, Jingna Feng, Ahmed Abdelhameed, Xiaodi Li, Jianfu Li, Nansu Zong & Cui Tao
Department of Cardiovascular Medicine, Mayo Clinic Health System, La Crosse, WI, USA
Xiaoke Liu
Division of Hepatology and Liver Transplant, Mayo Clinic, Jacksonville, FL, USA
Liu Yang
Department of Neurology, Mayo Clinic, Jacksonville, FL, USA
Nilüfer Ertekin-Taner
Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
Nilüfer Ertekin-Taner
Mayo Clinic Platform, Rochester, MN, USA
Phil Fiero, Soulmaz Boroumand, Richard Larsen, Maneesh Goyal, Clark C. Otley, John D. Halamka & Cui Tao
Department of Dermatology, Mayo Clinic, Rochester, MN, USA
Clark C. Otley
Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
Vijay H. Shah

Contributions

Y.Y., N.Z., J.D.H., and C.T. conceived the study. J.L., X.L., L.Y., and N.E.T. supported the design of the clinical research projects. Y.Y., X.H., S.R., and X.L. collected the data and performed the analyses. P.F., S.B., R.L., and M.G. provided support from the Mayo Clinic Platform. C.O. and V.H.S. contributed clinical expertise and feedback from a practice perspective. Y.Y. and C.T. drafted the manuscript. All authors reviewed, revised, and approved the final manuscript. J.D.H. and C.T. supervised the study.

Corresponding authors

Correspondence to John D. Halamka or Cui Tao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yu, Y., Hu, X., Rajaganapathy, S. et al. Accelerating AI innovation in healthcare: real-world clinical research applications on the Mayo Clinic Platform. npj Health Syst. 3, 17 (2026). https://doi.org/10.1038/s44401-026-00068-1

Received: 11 August 2025
Accepted: 08 January 2026
Published: 16 February 2026
Version of record: 16 February 2026
DOI: https://doi.org/10.1038/s44401-026-00068-1