Introduction

Post-approval studies of cancer therapies are often conducted to address ongoing gaps in evidence. These studies may be overseen by the US Food and Drug Administration (FDA) as Postmarketing Commitments (PMCs) or Postmarketing Requirements (PMRs) to collect additional data regarding the safety and efficacy of a treatment1. In addition to regulatory considerations, post-approval studies may be used to inform clinical decisions in routine practice2, establish registries with linked clinical and biologic data for discovery or clinical validation, or support payer access decisions3. Post-approval studies may also characterize long-term outcomes4, alternate dosing or treatment schedules5, or expanded indications6. Resource-intensive data collection requirements are associated with delays in completion of post-approval studies for patients with cancer, even those that are required by the FDA7. Current challenges to the timely and successful completion of post-approval studies for patients with cancer include complex data collection, limited site capacity for conducting research studies, constraints on resources, and lower prioritization by site investigators who favor interventional studies of novel agents8. New design and operational approaches to post-approval studies that streamline participation for patients and study sites can enable evidence generation and ensure access for participants wherever they receive care.

Historically, many post-approval studies have been designed and conducted with traditional clinical trial operational models that carry a heavy data collection burden and take place at specialized research centers (e.g., academic medical centers), which limits the representativeness of the patients who are enrolled9. Traditional methods of data capture require manual transfer from the electronic health record (EHR) and other data sources into a separate electronic data capture (EDC) system. This process increases the potential for transcription errors, introduces inconsistencies between source and study records, and requires substantial site resources. Adding to site burden, the data required for PMCs and PMRs often exceed what is routinely documented in clinical care. These barriers are particularly acute in community settings, where research infrastructure is often limited10. In addition, patient identification processes are labor-intensive and typically rely on investigator-clinician referrals or manual chart review, which can be inefficient, inconsistently applied, and prone to selection bias11. Traditional processes tend to favor academic medical centers, which have more robust research infrastructure, including support staff and dedicated non-clinical research time for investigators. This has historically contributed to geographic and demographic disparities in patient access and participation and to enrollment of less representative patient populations12,13.

Given that EHRs are widely used for documenting cancer care and much of the data needed for clinical studies already exists within these systems14, we developed a clinical study platform to seamlessly integrate research processes into EHR workflows. This platform supports more efficient conduct of post-approval studies in community settings. Key features of the platform include automated patient-matching tools to assist with real-time identification of eligible patients, and a Fast Healthcare Interoperability Resources (FHIR)-based EHR-to-EDC connector for automated transfer of EHR clinical data, including direct entry of study-specific data elements (e.g., adverse events [AEs]) within routine clinical workflows. Health Level Seven International (HL7) FHIR and proprietary application programming interfaces (APIs) are increasingly being recognized as enabling technologies for EHR-to-EDC data transfer15,16,17,18,19; however, to date, significant adoption challenges remain18. In this report, we describe a pilot implementation study of a technology-enabled platform for post-approval evidence generation in oncology. The clinical context for this study is the collection of infusion-related AEs in patients with multiple myeloma who are treated with daratumumab. Daratumumab is a monoclonal antibody that targets CD38, which is commonly expressed on abnormal clonal plasma cells20. Daratumumab may be administered intravenously (IV) or subcutaneously (SC), with the IV formulation historically associated with higher rates of infusion-related reactions (IRRs)21. In daratumumab clinical trials, IRRs were reported in approximately 37% of patients during the first IV infusion, with lower rates observed during subsequent administrations22. The SC formulation, though assessed in a different patient population, has been associated with a lower overall incidence of IRRs, approximately 7% in a pooled safety analysis23.
Given their frequency and clinical relevance, IRRs represent a suitable post-approval safety endpoint for evaluating the performance of an EHR-integrated research platform in real-world practice. This pilot study sought to assess the quality of data on IRRs in patients receiving daratumumab when captured through an EHR-integrated research platform.

Results

Study enrollment and cohort representativeness

The study enrolled 82 patients between April 2024 and March 2025 at four independent, community oncology practices (Table 1). Nearly all patients completed the study, with three early discontinuations: one due to death and two due to patient withdrawal.

Table 1 Baseline characteristics of enrolled participants

Centralized patient identification services screened 353,602 unique patients who were scheduled for upcoming visits, using a combination of automated matching and human review. Two hundred seventy-eight patients were deemed eligible, and another 1594 were classified as “watching” (patients who did not meet any exclusion criteria but did not yet satisfy all inclusion criteria). Eighty-two patients chose to enroll and were treated on study, 57 of whom were surfaced to the sites by the centralized matching service. Black and Latinx patients comprised 26.5% of overall site populations, 37.6% of patients eligible based on structured data alone (e.g., International Classification of Diseases [ICD] code), 39.9% of those eligible after human review, and 30.5% of those who ultimately enrolled (all proportions include patients with incomplete race/ethnicity data, based on harmonized EHR data as of August 2025).

Data completeness

A total of 28 fields (variables) are reported in this manuscript, including 16 fields for baseline characteristics and 12 fields for daratumumab administration characteristics. Among the 4568 data points across 82 patients relevant to the primary objective (participant baseline characteristics and daratumumab administration), 4541 (>99%) were collected, and 27 (0.59%; 95% CI: 0.40–0.87%) were missing.
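The completeness figures above reduce to simple arithmetic on the reported counts. As a sketch, the point estimate and a 95% interval can be recomputed; the Wilson score interval used here is an assumption (the manuscript does not state which interval method was used, and an exact method would yield slightly different bounds):

```python
from math import sqrt

def wilson_ci(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score 95% interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

missing, total = 27, 4568        # counts reported in the Results
p_missing = missing / total      # -> 0.59% missing
lo, hi = wilson_ci(missing, total)
print(f"missing: {100 * p_missing:.2f}% (95% CI {100 * lo:.2f}-{100 * hi:.2f}%)")
```

The Wilson bounds land close to the reported 0.40–0.87% interval, which is consistent with a standard binomial method having been applied to the same counts.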

Among the 82 participants, no missing data points were observed for 16 fields including demographics, vital signs, medical history, and disease characteristics. Similarly, there was no missingness for laboratory measurements. Missingness was observed only for investigator-assessed ISS stage (n = 9, 11.0%) and bone marrow plasma cell percentage (n = 4, 4.9%).

Missingness of daratumumab administration data is summarized in Table 2. Of the 12 fields used to describe daratumumab administrations, nine fields had no missing data points across all administrations. Among IV administrations (4 patients, 12 infusions), all four fields had no missing data points. Among SC administrations (78 patients, 228 doses), injection start time was missing for 1 dose, and anatomical location was missing for 10 doses. Among IV and SC administrations combined, 3 data points for IRR mitigation treatment(s) after daratumumab administration were missing.

Table 2 Missingness of IRR data elements among patients who received at least 1 administration of daratumumab

EHR-to-EDC automated data transfer

Overall, there were 43,941 data points collected in the EDC. Ninety-one percent of data points were eligible to be transferred via the EHR-to-EDC connector, with nine percent requiring manual site data entry into the EDC (e.g., the End of Treatment form). A total of 31,611 data points were pushed across 3173 EHR-to-EDC session submissions (10.0 data points per submission, on average). Table 3 shows the potential vs. actual use of EHR-to-EDC data transfer. Note that data points may be transferred more than once to reflect source data changes in the EDC, so data points pushed may not equal total data points collected.

Table 3 Data transfer eligibility/volume by data category

As timeliness of data acquisition is critical to study completion and analysis, we measured the timelines for data entry in the EDC, categorized by the type of data entry required. As shown in Table 4, the median time from scheduled visit to EDC data entry was lowest for those parameters that could be directly transferred by FHIR (1.7 days) and highest for those requiring centralized abstraction (9.1 days).

Table 4 Median (IQR) for days between scheduled visits and EDC data entry, by visit groupa

Data queries

In total, 5752 “field-level” queries (i.e., queries associated with a specific field in the EDC) were opened (Table 5). Of these, 5617 were opened manually by the sponsor, with the remainder generated by automated edit checks. A total of 4519 (80.5%) queries were resolved within two weeks, with a median time to resolution of 5.1 days.

Table 5 Query metrics

Discussion

This pilot study demonstrated that a technology-enabled research infrastructure integrating real-time patient identification, structured EHR-to-EDC data transfer, and centralized abstraction services can be operationalized in community oncology settings to support prospective post-approval evidence generation. The study was designed with the operational and regulatory elements required to fulfill a PMC, including alignment with FDA data standards guidance24 and guidance regarding the use of EHR data in clinical investigations25. Overall, data missingness was low (0.59%). Most structured data fields were successfully mapped from the EHR into the EDC system, while additional unstructured fields were populated through centralized abstraction. These processes reduced reliance on site-based transcription and contributed to low rates of missingness, particularly for protocol-defined variables related to IRRs.

Eighty-two patients with multiple myeloma were accrued to this study at four community oncology sites over an 11-month period. This rapid accrual was associated with a centralized technology-enabled patient identification service, designed to reduce site burden. Furthermore, 30.5% of enrolled patients identified as Black or Hispanic, far higher than in other recent studies of patients with myeloma26. This higher enrollment of Black and Hispanic patients is likely attributable in part to the intentional inclusion of study sites that serve diverse populations. Together these results raise the possibility that automation may improve patient ascertainment, mitigate unconscious bias associated with patient selection, and advance trial representativeness. However, a direct comparison of the race and ethnicity of patients who enrolled to those who did not enroll is hindered by greater missingness of self-reported demographic data among patients who did not participate in this study. In addition, we did not capture reasons for non-enrollment of patients who were surfaced as potentially eligible. For these reasons, any conclusions regarding enrollment representativeness should be viewed as exploratory and hypothesis-generating.

All 82 enrolled participants received daratumumab, and nearly all (79/82) completed study follow-up, supporting the feasibility of prospective safety data collection in a routine care environment. Importantly, the platform supported participation from community-based centers, which often have limited engagement in research due to administrative and infrastructure constraints. The ability to capture high-quality data in these settings illustrates the potential for broader inclusion in regulatory research studies, particularly those designed to contribute generalizable safety data post-approval.

Delays in data submission from study sites and query resolution impede the pace of evidence generation27,28,29. In this study, the time from patient visit to data receipt by the EDC was less than 1 week (median 5.2 days) overall, with a median time of 1.7 days for structured data transferred through the EHR-to-EDC connector. Furthermore, the proportion of manually opened field-level queries per data point was ≤2% among EHR-to-EDC transferred data (excluding out-of-range lab and vitals results). This rate is substantially lower than the 5.2% observed for elements populated manually by site staff, suggesting the potential to reduce transcription errors. Query resolution was rapid, with a median of 5.1 days. Together, these metrics suggest a high level of efficiency for clinical staff associated with centralization and automation of research processes, and rapid receipt of data for analysis.

Additional operational efficiencies associated with this EHR-based platform included a reduction in the burden of data entry at clinical sites, streamlined monitoring workflows, and minimized query volume, driven in part by structured EHR-to-EDC data transfer and centralized source verification. While traditional prospective clinical studies rely on site-based transcription and monitoring processes, this approach allowed for scalable, centralized operations with lower disruption to clinic workflows. Given the association of site research participation with high-quality care, this EHR-based platform with centralized services implemented in community practice settings could have broader implications for patient care30.

In this pilot study, we observed incomplete use of the EHR-to-EDC connector, with 67% of data elements transferred vs. 91% eligible for transfer. It is plausible that the learning curve associated with adoption of new technology in a clinical setting is in part responsible for this finding18,31. Additionally, we discovered specific technical and workflow factors that at least partially contributed to incomplete automated transfer. Fields designated for centralized abstraction took more time to populate in the EDC and did not involve site intervention, creating ambiguity about whether sites should enter these data themselves. Furthermore, laboratory tests that were not performed could not be flagged as such by the EHR-to-EDC tool; users were required to manually report in the EDC that the laboratory test was not performed. Qualitative feedback from the study operational team and study sites indicated that additional site training and improved timeliness of centralized abstraction contributed to increasing use of automated data transfer during the course of the study. However, it was not feasible to quantitatively describe this trend within the scope of this study because of differences in site activation timelines and data cleaning operations (which encouraged manual data entry due to the nature of EDC query workflows).

The results we present must be interpreted in the context of several potential limitations. Because this pilot study was conducted at only four clinical sites, the data may not be generalizable to all community oncology practices or to non-oncology contexts. In addition, we could not identify a published industry standard for data completeness or query frequency; these results are therefore not comparative and may be considered a baseline for future research.

These results provide initial support for the use of EHR-integrated platforms in generating regulatory-grade evidence in oncology. This platform was associated with robust and representative enrollment while reducing site operational burden. Timely data analysis was supported by low levels of missing data and data queries, with rapid receipt of data in the study database following patient encounters. Expanding clinical research from traditional academic trial sites to the community oncology setting can accelerate the generation of generalizable evidence. While this pilot study was observational, this platform is relevant to interventional studies, with use cases that include regulatory PMRs, expanded indications for cancer therapeutics, as well as non-regulatory evidence needs that inform routine practice.

Methods

Study design

The overall goal of this pilot study was to evaluate the feasibility of an EHR-integrated platform for post-approval safety monitoring. This was a non-interventional, prospective pilot study conducted across four US community practice clinical sites. The study was designed to evaluate data quality performance of an EHR-integrated data collection infrastructure to capture incidence and characteristics of severe (Grade 3–4) and fatal (Grade 5) IRRs in patients with multiple myeloma receiving daratumumab. Data were captured before and after the first three administrations of daratumumab, administered per routine clinical care and in accordance with approved local practice. Data quality was assessed based upon data completeness and query frequency.

The study was performed in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants, authorizing the use of source data for research and allowing verification in compliance with applicable local regulatory and ethical requirements. Institutional review board approval of the study protocol was obtained from WCG IRB (reference number: IRB00000533).

Objectives

This pilot study aimed to assess an EHR-based data collection infrastructure for capturing potential risk factors for severe and fatal IRRs in participants treated with IV or SC daratumumab for multiple myeloma in the community practice setting. Protocol-specified data quality objectives were:

  1. To assess the completeness of study data collected from the EHR and transmitted to the EDC via an EHR-to-EDC connector (primary objective). Relevant endpoints were missingness of IRR risk factor variables in the EDC for all enrolled participants and missingness of IRR data elements in the EDC for participants who received at least 1 administration of daratumumab.

  2. To describe the operational time and effort of data collection and processing. Relevant endpoints were the number of manual queries issued in the EDC, time to query resolution (personnel-related), time to EDC data availability after a study visit, data volume pushed per EHR-to-EDC user session, and percent of EDC data populated via the EHR-to-EDC connector (data collection infrastructure-related).

In addition to these outcomes, in this manuscript we also present data on the efficiency of patient screening and associated cohort representativeness.

The results pertinent to clinical objectives regarding the incidence and management of IRRs will be reported in a subsequent publication.

Eligibility criteria

Eligibility criteria were pragmatically designed32,33, consistent with routine use of daratumumab. Eligible subjects were at least 18 years old, with a diagnosis of multiple myeloma. Planned treatment with daratumumab in accordance with the approved label either as monotherapy or in combination with other agents was required. Patients with newly diagnosed multiple myeloma who were eligible for transplantation, and prescribed daratumumab according to National Comprehensive Cancer Network (NCCN) guidelines34 were also eligible. Patients who had previously received daratumumab or anti-CD38 antibody therapy or had received an investigational drug or medical device within two weeks of study start were excluded. Patients who were actively enrolled in an interventional clinical trial were not eligible.

EHR-based operational innovations

The study used a multi-component technology stack to support automated patient matching and data acquisition:

Patient identification and matching: A data-driven approach was used to identify eligible participants, leveraging custom trial matching algorithms based on available structured data in the patient chart, and clinical review using a combination of specially trained human abstractors and assistive technology. Site-specific and structured criteria were evaluated daily, and patients were considered potentially eligible if they: had an upcoming visit at a study site; were at least 18 years of age; had an ICD-10 code for multiple myeloma; had no death event from EHR, obituary, or Social Security Death Index data; and had no evidence of a drug order for anti-CD38 agents. Potentially eligible patients were rereviewed by human abstractors when new documents were added to the patient chart35. All potential trial patients were then surfaced to site investigators and research staff directly at the point of care via an EHR-embedded interface35. A visual representation of the data flow architecture is provided in Fig. 1.

Fig. 1: Patient selection process.

MM multiple myeloma.
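The structured screening criteria above amount to a daily rule-based filter over the patient chart. A minimal sketch, assuming a simplified patient record; the field names, the ICD-10 prefix handling, and the drug-name check are all hypothetical, and the production service layers human review and assistive technology on top of rules like these:

```python
from datetime import date

MM_ICD10_PREFIX = "C90.0"  # multiple myeloma ICD-10 codes (illustrative)

def structured_prescreen(patient: dict, today: date) -> bool:
    """Rule-based daily check mirroring the study's structured criteria:
    upcoming visit, age >= 18, MM ICD-10 code, no death record, and no
    prior anti-CD38 drug order (here checked by drug name only; a real
    implementation would use a drug-class code list)."""
    has_upcoming_visit = any(v > today for v in patient["visit_dates"])
    is_adult = patient["age"] >= 18
    has_mm_code = any(c.startswith(MM_ICD10_PREFIX) for c in patient["icd10_codes"])
    no_death_record = not patient["death_recorded"]
    no_anti_cd38 = not any("daratumumab" in d.lower() for d in patient["drug_orders"])
    return all([has_upcoming_visit, is_adult, has_mm_code,
                no_death_record, no_anti_cd38])

candidate = {
    "age": 67,
    "visit_dates": [date(2024, 6, 3)],
    "icd10_codes": ["C90.00"],
    "death_recorded": False,
    "drug_orders": ["lenalidomide"],
}
print(structured_prescreen(candidate, today=date(2024, 5, 1)))  # True
```

Patients passing such a filter would correspond to the “eligible” and “watching” groups described in the Results, pending human review of unstructured documents.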

EHR-to-EDC data transfer: A secure EHR-to-EDC connector (Flatiron Clinical Pipe) was used to enable direct data transfer from the EHR system to the EDC platform, to reduce the need for manual data transcription. Site Initiation Visits included mandatory EHR-to-EDC connector training prior to data entry and subject enrollment. Sites received ongoing support throughout the study, including a comprehensive Data Entry Guide and access to a dedicated help desk to encourage effective implementation. Study-specific mapping was performed in the EHR-to-EDC connector to categorize study data into the following data transfer eligibility categories:

  1. Structured data: includes elements such as lab observations and vital signs, transferred using HL7 FHIR standards.

  2. Unstructured data: includes elements contained in clinician narratives and clinical notes that are relevant to the study’s objectives and endpoints. Examples include documentation of signs and symptoms, IRRs, pre- and post-administration medications, and provider-reported impressions. Source documents from the EHR are ingested via a FHIR application programming interface into the EHR-to-EDC connector and then surfaced in an abstraction tool, software that efficiently organizes and displays unstructured documents to trained human abstractors, who capture key data elements in data entry forms. Centralized abstraction by trained personnel minimized site-level variability in unstructured data interpretation. Edit checks (automated discrepancy checks, e.g., a start date later than an end date) and routine monitoring enhanced consistency and accuracy.

  3. Electronic (E)-Source Forms: these forms, accessed through the EHR user interface and linked to EHR source data, permit clinical trial-specific information for study participants to be entered within routine workflows, largely as structured fields. Parameters included AE grading per Common Terminology Criteria for Adverse Events (CTCAE) version 5.036, relevant past medical/surgical history, and concomitant medications.

The data transfer workflow involved launching the EHR-to-EDC connector from within the EHR user interface. The user (site staff) was then able to select which data to submit and transfer directly into the corresponding electronic case report form (CRF) within the EDC, via “point and click.” Live site staff training was conducted before study initiation. Sites were responsible for ensuring all relevant source data were populated in the EDC for monitoring or audit purposes. Data points could only be transferred according to their mapped data eligibility category and associated transfer mechanism; however, site staff could directly enter any data point into the EDC if preferred.
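As an illustration of the structured-data path above, a single FHIR R4 Observation can be flattened into an EDC field entry. The resource shape follows the HL7 FHIR Observation structure, but the LOINC-to-CRF mapping table and the EDC field names are hypothetical:

```python
# Hypothetical mapping from LOINC code -> EDC (CRF) field name
LOINC_TO_CRF = {
    "718-7": "HGB",     # hemoglobin [g/dL]
    "8310-5": "TEMP",   # body temperature
}

def fhir_observation_to_edc(obs: dict) -> dict:
    """Flatten one FHIR R4 Observation into an EDC field/value/unit record."""
    loinc = obs["code"]["coding"][0]["code"]
    qty = obs["valueQuantity"]
    return {
        "field": LOINC_TO_CRF[loinc],
        "value": qty["value"],
        "unit": qty.get("unit"),
        "observed": obs["effectiveDateTime"],
    }

observation = {  # trimmed FHIR R4 Observation resource
    "resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org", "code": "718-7"}]},
    "valueQuantity": {"value": 11.2, "unit": "g/dL"},
    "effectiveDateTime": "2024-06-03T10:15:00Z",
}
print(fhir_observation_to_edc(observation))
```

In the actual workflow, site staff review and select such mapped values in the connector before submission, rather than the transfer occurring fully unattended.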

Data standards and mapping

Study data mapping began with the creation of a data model and a form- and field-level data dictionary. To ensure that the data collected would be suitable for a regulatory use case, the data elements were mapped for equivalence to an ongoing post-approval safety study37. After collection, all study data were mapped to the standard Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) to ensure standardization, harmonization, and suitability for regulatory submission of study data (establishing confidence in the application of this platform for future regulatory submissions).
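As a sketch of the SDTM mapping step, a single collected adverse-event record can be restructured into SDTM AE domain variables. STUDYID, DOMAIN, USUBJID, AETERM, AESTDTC, and AETOXGR are standard SDTM variable names; the input field names and study identifier below are hypothetical:

```python
def to_sdtm_ae(collected: dict, studyid: str) -> dict:
    """Map one collected adverse-event record to SDTM AE domain variables."""
    return {
        "STUDYID": studyid,
        "DOMAIN": "AE",
        "USUBJID": f"{studyid}-{collected['subject_id']}",  # unique subject ID
        "AETERM": collected["event_term"],
        "AESTDTC": collected["onset_date"],        # ISO 8601 date per SDTM
        "AETOXGR": str(collected["ctcae_grade"]),  # CTCAE v5.0 grade
    }

raw = {"subject_id": "001", "event_term": "Infusion related reaction",
       "onset_date": "2024-07-12", "ctcae_grade": 2}
print(to_sdtm_ae(raw, studyid="PILOT01"))
```

A full SDTM conversion would also populate sequence numbers, controlled terminology, and timing variables defined in the SDTM Implementation Guide.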

Quality control and monitoring

Data quality processes followed an integrated data review plan which included a cross-functional review across the clinical data management team, site monitors, medical monitors and clinical operations. A comprehensive quality control and monitoring framework was centered around evaluation of:

  • Data missingness. Missingness is defined as any absence of a required variable in the EDC. Missingness could result from (1) lack of documentation at the source or (2) failure of data transfer into the EDC. Fields explicitly marked as “not done” are not considered missing.

  • Query volume and median time to query resolution. As per Good Clinical Data Management Practices (GCDMP), a query is defined as a communication tool used to clarify and resolve discrepancies, inconsistencies, or missing information identified within the collected data of a clinical trial via manual and automatic system checks38. We measured the frequency of data queries and the timeliness of their resolution to elucidate efficiencies enabled by automated EHR-to-EDC data transfer. Time to query resolution is defined as the time (days) from when the sponsor initiated a query communication to the time when the sponsor marked the query as resolved. Multiple query communications on the same data point are possible.
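The two query metrics above reduce to straightforward computations over query open/resolve timestamps; a minimal sketch with hypothetical query records:

```python
from datetime import datetime
from statistics import median

def resolution_days(opened: str, resolved: str) -> float:
    """Days from sponsor opening a query to sponsor marking it resolved."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(resolved, fmt) - datetime.strptime(opened, fmt)
    return delta.total_seconds() / 86400

queries = [  # (opened, resolved) timestamps - illustrative only
    ("2024-06-01 09:00", "2024-06-04 09:00"),
    ("2024-06-02 12:00", "2024-06-09 12:00"),
    ("2024-06-05 08:00", "2024-06-10 20:00"),
]
times = [resolution_days(o, r) for o, r in queries]
print(f"median time to resolution: {median(times):.1f} days")
pct_two_weeks = 100 * sum(t <= 14 for t in times) / len(times)
print(f"resolved within two weeks: {pct_two_weeks:.1f}%")
```

Because multiple query communications on the same data point are possible, a production computation would aggregate communications to one record per query before summarizing.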

Centralized abstraction workflows were guided by data entry guidelines, which define the expectations for data entry into the EDC system or completion of the CRF for the Principal Investigators, Site Coordinators, and designees. Select metrics were aligned with risk-based monitoring and Risk Assessment Categorization Tools (RACT)39,40.

This study’s EDC system used a commonly employed rules-based approach (“Edit Checks”) for identification of discrepant data and had functionality for authoring, storing, managing, and executing the rules, as well as for tracking the lifecycle of identified discrepancies.
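A rules-based edit check of the kind described (e.g., a start date later than an end date) can be expressed as a small table of predicates evaluated against each record; the rule identifiers, messages, and record fields below are hypothetical:

```python
from datetime import date

# Each edit check: (rule id, message, predicate that flags a discrepancy)
EDIT_CHECKS = [
    ("EC001", "start date is after end date",
     lambda r: r["start_date"] > r["end_date"]),
    ("EC002", "CTCAE grade out of range (1-5)",
     lambda r: not 1 <= r["ctcae_grade"] <= 5),
]

def run_edit_checks(record: dict) -> list[tuple[str, str]]:
    """Return (rule id, message) for every check the record fails."""
    return [(rid, msg) for rid, msg, pred in EDIT_CHECKS if pred(record)]

bad = {"start_date": date(2024, 7, 15), "end_date": date(2024, 7, 12),
       "ctcae_grade": 3}
print(run_edit_checks(bad))  # flags EC001 only
```

In an EDC system, each flagged discrepancy would open a query whose lifecycle (open, answered, resolved) is then tracked as described above.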

Variable definitions and data sources

All exposures, outcomes, and risk factor variables related to study endpoints were predefined and standardized using the study data dictionary. Data sources included both structured fields and unstructured clinical notes in the EHR, verified through abstraction and mapping workflows.

For variables with more than one data source (e.g., medications available via structured fields and physician notes), harmonization and cross-source validation were applied. Diagnostic criteria for IRRs were based on CTCAE v5.036. Variables were categorized and transformed where needed (e.g., age categories, medication windows) based on clinically relevant groupings.

Statistical analyses

The study was initially designed with a sample size of 50 patients as a convenience cohort. To increase the precision of outcome estimates, and supported by rapid enrollment, the enrollment period was extended by an additional three months (final n = 82). All analyses were descriptive, and no formal hypothesis testing was planned. Categorical variables were summarized using frequencies and percentages, while continuous variables were described using medians and interquartile ranges.

Baseline characteristics and the number and percentage of missing characteristics were summarized among enrolled patients. Missingness was defined as the absence of a characteristic, or its value, in the EDC. Missingness may occur due to data not being available at the source or data not being transferred from the source into the EDC. Characteristics include baseline demographics, vital signs, medical history, disease characteristics, and laboratory measurements. The number and percentage of missing IRR data elements are summarized per daratumumab administration among patients who received at least one administration. Characteristics of daratumumab administration include route of administration, prescribed dose, administration date, start time, pre- and post-medications, infusion volume/interruptions/stop time (for IV administrations), and anatomical location of administration (for SC administrations).