Main

AF is the most common cardiac arrhythmia, affecting 59 million individuals worldwide, with a lifetime risk of 1 in 3 (refs. 1,2). AF is a major contributor to morbidity and mortality in the Western world, including risks of stroke, heart failure and dementia. The stroke rate is increased five-fold in patients with clinical AF but can be substantially reduced by prophylactic oral anti-coagulation (OAC)3.

The diagnosis of AF is made by its documentation in a surface ECG3. Because AF is often asymptomatic and can occur only intermittently, the diagnosis is often made too late, after complications have already occurred. In the United States, an estimated 700,000 people have undiagnosed AF, with half of them at least at moderate to high risk for stroke4. Nearly a quarter of patients are diagnosed with AF only after stroke5. Hence, there is great interest among both the public and clinicians in detecting AF earlier at the subclinical level, although the thresholds for initiation of OAC in subclinical AF are less well defined6.

Smart devices with optical sensors, such as smartphones or smartwatches, can detect irregularities in the pulse wave sequence suggestive of AF, which could make them a valuable screening tool in a broad population. Whereas digital screening with a smartwatch offers the possibility of continuous passive screening, digital screening with a smartphone is based on intermittent active screening. In a ‘direct-to-consumer’ approach targeting owners of brand-specific wrist-worn devices, three pragmatic, large-scale, single-arm, observational studies demonstrated the fundamental power of digital AF screening7,8,9. However, key questions remain unanswered. First, demonstrating the efficacy of a diagnostic intervention requires direct comparison with the current gold standard, ideally in the context of a randomized trial. Previous randomized trials have evaluated the incremental benefit of AF screening with handheld ECG devices10,11,12. However, the extent to which scalable, digital, smart device–based AF screening strategies can increase AF detection rates compared to routine screening over a given period of time is currently unknown. Second, digital strategies should be accessible to a relevant proportion of the targeted population, in line with their claim to scalability. In the case of AF screening, this implies that elderly people should also be able to access and use this technology. Ideally, digital technology is not limited to the product of a specific brand and runs on devices with a high penetration rate. Third, diagnostic findings should also be of therapeutic relevance and lead to a change in treatment.

To address these issues, we invited policyholders of a large German health insurance company who were free of known AF but at increased risk of stroke to participate in eBRAVE-AF—a pragmatic siteless, digital, open-label randomized trial13. We developed a study app to handle the communication with the participants, including electronic informed consent, randomization and follow-up. There was no in-person contact with participants throughout the study. Through app-based randomization, we assigned participants to either 6-month digital AF screening or usual care. During digital screening, participants used their own smartphones to repetitively screen for irregularities of their pulse waves by means of a certified smartphone app. We used external ECG loop recorders to validate abnormal findings. Over a 6-month period, we directly compared digital screening to usual care, which comprised AF detection in clinical practice without study-related interventions. We defined the primary efficacy endpoint as the first diagnosis of treatment-relevant AF that led to initiation of OAC by independent physicians who were not involved in the study. To improve compliance and increase power for secondary analyses, participants who did not meet the primary endpoint in the first 6 months were invited to participate in a second 6-month study phase with cross-over assignment to usual care or digital screening, respectively. A detailed description of the trial protocol was reported previously13.

Results

Study participants

Of the 67,488 policyholders invited, 5,551 fulfilled the inclusion criteria, downloaded the study app (Extended Data Fig. 1a) to their smartphone and provided electronic consent (Fig. 1). Participants were recruited from all parts of Germany. Median age was 65 years (interquartile range (IQR) 11); 31% were female; and median CHA2DS2-VASc score was 3 (IQR 1) (Table 1).

Fig. 1: CONSORT diagram.

Figure shows the flow diagram of the study. ITT, intention-to-treat.

Table 1 Characteristics and medical treatments of the study participants

Randomization

By server-based, simple randomization triggered through the installed study app, 2,860 participants were assigned to digital screening, and 2,691 participants were assigned to usual care. Both groups were well matched with respect to age, sex, known heart diseases, cardiovascular risk factors and medical treatments (Table 1).

Follow-up

Within the first 6 months, 172 of the 5,551 participants (3.1%) were lost to follow-up; 198 participants (3.6%) withdrew their consent; and nine participants died (Fig. 1). Of the remaining 5,172 participants, 6-month follow-up information was available from the study app and/or phone calls for 5,077 participants (98.2%). For 95 participants (1.8%), the information was obtained solely from health insurance data. All but one of the primary endpoints could be verified by source documents.

Detection of treatment-relevant AF

Within the first 6 months, the primary efficacy endpoint of newly diagnosed AF leading to OAC treatment was reached by 38 participants (1.33%) of the 2,860 participants assigned to digital screening and by 17 participants (0.63%) of the 2,691 participants assigned to usual care. Because the proportional hazards assumption was not met, logistic regression analysis was used, and odds ratios (ORs) are reported. The corresponding OR for the primary endpoint was 2.12 (95% CI, 1.19–3.76; P = 0.010) (Fig. 2a and Table 2).
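
As a worked check of the arithmetic, the unadjusted OR and a Wald-type 95% CI can be computed directly from the four counts given above. This is a minimal sketch, not the study's analysis code: the function name `odds_ratio_wald` is hypothetical, and it assumes that the reported logistic regression contained the group assignment as its only covariate, in which case the regression estimate coincides with the 2×2 table estimate.

```python
import math

def odds_ratio_wald(events_a, n_a, events_b, n_b, z=1.959964):
    """Unadjusted odds ratio of group A vs. group B with a Wald 95% CI."""
    a, b = events_a, n_a - events_a      # group A: events, non-events
    c, d = events_b, n_b - events_b      # group B: events, non-events
    odds_ratio = (a / b) / (c / d)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Phase 1 counts from the text: 38 of 2,860 (digital screening)
# vs. 17 of 2,691 (usual care)
or_1, lo_1, hi_1 = odds_ratio_wald(38, 2860, 17, 2691)
print(f"OR = {or_1:.2f} (95% CI, {lo_1:.2f}-{hi_1:.2f})")
```

Under these assumptions, the result agrees with the reported estimate to two decimal places.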

Fig. 2: Cumulative event rates for the primary and secondary endpoints.

a–d, Cumulative detection rates of AF leading to OAC in phases 1 (a) and 2 (b) as well as cumulative detection rates of AF (including AF not treated with OAC) in phases 1 (c) and 2 (d). Red color indicates digital screening groups (group 1 in phase 1; group 2 in phase 2), and blue color indicates usual screening groups (group 2 in phase 1; group 1 in phase 2).

Table 2 Results of study endpoints for phases 1 and 2

A total of 4,752 participants without a primary endpoint in the first 6 months accepted the invitation for a second 6-month study phase with cross-over assignment (Fig. 1). The median time from invitation until completion of the cross-over questionnaire and inclusion in the second phase of the study was 10 days (IQR 22 days). During the second phase of the trial, 235 of the 4,752 participants (4.9%) were lost to follow-up. Again, digital screening was superior to usual care. Treatment-relevant AF was detected in 33 (1.38%) of the 2,387 participants assigned to digital screening and in 12 (0.51%) of the 2,365 participants assigned to usual care, corresponding to an OR of 2.75 (95% CI, 1.42–5.34; P = 0.003) (Fig. 2b and Table 2).

Secondary endpoints

Table 2 shows the secondary endpoints for both phases of the study. Both newly diagnosed AF (with or without subsequent OAC initiation) and prescription of OAC were more frequent during digital screening (Fig. 2c,d). The exact mode by which the 85 AF cases (with or without initiation of OAC) were detected during digital screening is shown in Extended Data Table 1. Fifty-nine of the 85 AF cases (69%) were detected by abnormal photoplethysmograms (PPGs) during digital screening, whereas 26 cases (31%) were detected by usual care. A detailed analysis of the 26 cases is presented in Extended Data Fig. 2, showing that only three of these cases had truly normal PPGs. Fifteen cases had partially abnormal PPGs not meeting the AF criteria; five cases had low compliance; and three cases had abnormal PPGs, which, however, were not related to AF detection. No significant differences were observed with respect to other secondary endpoints.

Ten deaths were reported during digital screening and five deaths were reported during usual care over the first and second phases of the study. The causes of death of all participants are provided in Extended Data Table 2. No difference was observed in major bleeding between digital screening and usual care.

Characteristics of detected AF

Table 3 depicts the characteristics of newly detected AF that led to prescription of OAC in both groups. Of the 71 primary endpoints that occurred during digital screening, 19 (27%) were related to persistent AF and 49 (69%) to paroxysmal AF, and, in three cases (4%), the type of AF could not be classified. Of the 29 primary endpoints that occurred during usual care, these numbers were six (21%), 19 (66%) and four (14%), respectively. Types of AF leading to OAC were not statistically significantly different between digital screening and usual care (P = 0.220). In participants who were anti-coagulated for paroxysmal AF during digital screening, the median maximum AF duration was 11 hours (IQR 4–24), with a median AF burden of 6.1% (IQR 2.0–8.5%) (Table 3).

Table 3 Characteristics of AF leading to OAC

Association of AF detection with clinical outcomes

The association between newly detected AF and abnormal PPG measurements with a major adverse cardiac or cerebrovascular event (MACCE) was investigated in an exploratory analysis. A total of 126 of the 5,551 participants developed a MACCE over a median follow-up time of 391 (IQR 64) days. The single components of MACCE are depicted in Extended Data Table 3. In Cox regression analysis, AF as a time-dependent covariate, whether detected digitally or by usual care, was a significant predictor of MACCE, with a hazard ratio (HR) of 6.13 (95% CI, 3.07–12.21; P < 0.001). This was also true for AF detected by abnormal PPG measurements (HR = 3.22; 95% CI, 1.00–10.33; P = 0.049) as well as for abnormal PPG measurements per se (that is, not necessarily followed by confirmed AF; HR = 2.74; 95% CI, 1.25–6.00; P = 0.012) (Fig. 3).

Fig. 3: Time-dependent association of AF and abnormal PPG measurements with MACCE.

PPG-detected AF refers to ECG-confirmed AF that was initially detected based on an abnormal PPG measurement. For this analysis, Cox regression analysis was used, and AF, PPG-detected AF and abnormal PPG were introduced as time-dependent covariates in the model. Differences were considered statistically significant when the two-sided P value was less than 0.05. No adjustments were made for multiple comparisons. The analysis was performed for all participants (n = 5,551). During the entire follow-up period, 135 participants developed AF, 62 had PPG-detected AF and 179 had abnormal PPG measurements. Due to censoring, the numbers can slightly differ from those in Extended Data Table 1. Error bars indicate 95% CI.

Details of digital AF screening

A total of 4,594 (88%) of the 5,247 participants assigned to digital screening during the entire study (phases 1 and 2 combined) performed at least one valid PPG measurement. A pre-specified per-protocol analysis for the primary endpoint is presented in Extended Data Table 4. The ORs improved to 2.56 (95% CI, 1.45–4.51; P = 0.001) and 3.48 (95% CI, 1.79–6.75; P < 0.001) for the first and second phase, respectively. During the entire study, participants performed a total of 300,509 PPG measurements, corresponding to a median of 53 (IQR 62) per active participant (compared to a scheduled number of 76 PPG measurements over 6 months). Throughout the study, 173 participants recorded an abnormal PPG (n = 104 in phase 1; n = 69 in phase 2), resulting in a confirmed AF diagnosis in 61 cases (Extended Data Table 1). Thus, an abnormal PPG measurement had a diagnostic yield for confirmed AF of 35.2% (95% CI, 28.3–42.2%). Of the 108 participants with abnormal PPGs but without diagnosed AF, 11 did not use an external ECG loop recorder (Extended Data Fig. 3). Over the course of the study, the number of active participants performing PPG self-measurements was relatively stable (Extended Data Fig. 4a,b). Advanced age as well as female sex were associated with a better compliance in terms of more frequent PPG measurements (Extended Data Fig. 4c–f). Increased number of PPG measurements as well as increased CHA2DS2-VASc score were independent predictors for the primary endpoint (Extended Data Fig. 5). Extended Data Fig. 6 illustrates the sensitivity for detecting AF as a function of PPG measurement frequency and AF burden.

Discussion

The results of our study show that a scalable, digital AF screening strategy using common smartphones can more than double the detection rate of treatment-relevant AF compared to routine screening among a broad elderly target population. Our study used a siteless, randomized design in which participants were recruited from the pool of policyholders of a large health insurance company. Participants were pre-selected by age and CHA2DS2-VASc score to increase both the pre-test probability for developing AF and the likelihood of clinical consequence in case of detected AF. To ensure treatment relevance, AF counted as a primary endpoint only if a study-independent physician initiated OAC. Digitally detected AF as well as abnormal PPG measurements per se were of prognostic significance as they predicted MACCE. Due to the cross-over design, both study groups underwent digital screening, which contributed to participant acceptance of the study and increased power for secondary analyses. The digital screening strategy was well received by the older participants, who tended to perform even more PPG measurements.

Digital AF screening by smart devices has been investigated in three large-scale, siteless observational studies: the Apple7, Huawei8 and Fitbit14 heart studies. All three studies took a similar ‘direct-to-consumer’ approach by inviting owners of brand-specific smartwatches or wrist-worn trackers to participate. Screening was performed in two steps, consisting of pre-screening by smart device–based PPG measurements followed by verifying 7-day ECG patches7,14, or, in the case of the Huawei heart study, by evaluation at dedicated telecare centers8. The recruitment efforts of these studies were impressive, with a combined total of more than 1 million participants. The recruitment strategy, however, resulted in enrollment of relatively young participants with a mean age of 41 and 35 years in the Apple and Huawei heart studies, respectively, and a median age between 40 and 54 years in the Fitbit heart study (exact value not specified14). Consequently, the proportion of participants over 65 years of age was low: 5.9%, 1.8% and 13% in the Apple, Huawei and Fitbit heart studies, respectively. However, it should be recognized that, because of the enormous recruitment effort, the absolute numbers of participants over 65 years of age were still very high—for example, 24,000 in the Apple heart study7. In combination, all three studies identified a total of 720 participants with newly diagnosed AF, corresponding to a detection rate of 0.07% per enrolled participant. This relatively small number, however, is due not only to the low pre-test probability of the population studied but also to the substantial proportion of participants who were lost to follow-up or were excluded during the various stages of the studies. The ratio of participants with abnormal PPG measurements to participants who could subsequently be evaluated with an ECG patch was 4.8:1 (2,161:450)7 and 4.5:1 (4,728:1,057)14 in the Apple and Fitbit heart studies, respectively. 
In the Huawei heart study, 262 of the 424 participants (61.8%) with abnormal PPG findings could be evaluated in telecare centers, corresponding to a ratio of 1.6:1 (ref. 8). In all three studies, PPG-based pre-screening was effective, with diagnostic yields offered by abnormal PPG findings for confirmed AF of 34% (153 of 450), 87% (227 of 262) and 32% (340 of 1,057) in the Apple, Huawei and Fitbit heart studies, respectively. It is important to note that none of these studies used smartwatch-based ECG recording as part of their AF screening algorithm. Advanced smartwatches offer the ability to perform ECG recording as an immediate confirmatory step of a pathological PPG measurement, which could reduce the rate of false-positive PPG measurements. However, because of their single-arm design, none of these studies allowed conclusions regarding the efficacy of digital AF screening compared to that of usual care.
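
The ratios and diagnostic yields quoted above follow directly from the counts given in parentheses. The short sketch below recomputes them; the dictionary layout and the helper name `summarize` are illustrative choices, and only counts stated in the text are used.

```python
# Counts quoted in the text: (participants with abnormal PPG,
# participants subsequently evaluated, confirmed AF among those evaluated)
studies = {
    "Apple": (2161, 450, 153),
    "Huawei": (424, 262, 227),
    "Fitbit": (4728, 1057, 340),
}

def summarize(abnormal, evaluated, confirmed):
    ratio = abnormal / evaluated              # abnormal PPG : evaluated
    yield_pct = 100 * confirmed / evaluated   # confirmed AF among evaluated
    return round(ratio, 1), round(yield_pct)

summary = {name: summarize(*counts) for name, counts in studies.items()}
for name, (ratio, yield_pct) in summary.items():
    print(f"{name}: {ratio}:1 abnormal-to-evaluated, yield {yield_pct}%")
```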

Our study differs from the aforementioned studies in key aspects and provides answers to important unresolved questions. Through its randomized design, our study is the first, to our knowledge, to demonstrate a diagnostic two-fold to three-fold gain in detection rate of AF requiring OAC. This is not self-evident because, as our study shows, a relevant number of new AF cases are detected in a given period of time by usual methods, even during the phases of digital screening. Similarly to previous digital screening studies, we chose a ‘direct-to-participant’ strategy for recruitment. However, rather than targeting owners of brand-specific smart devices in the general population, we selected participants from a pool of policyholders. As a result, participants in our study were significantly older, with a median age of 65 years, and had a significantly higher risk of stroke, with a median CHA2DS2-VASc score of 3. In contrast to previous digital AF screening studies, our study, therefore, targets a more relevant group of the population, where detection of AF has a high likelihood to trigger OAC or has other implications. Therefore, an intensified digital monitoring using intermittent smartphone-based PPG measurement is both reasonable and effective to detect AF in a high-risk population. Our study provides valuable insights into the applicability of digital screening strategies in elderly patients who accepted the digital screening technology surprisingly well. In fact, we found no difference between older and younger participants in terms of quality of measurements and effectiveness of digital screening compared to usual care.

In our study, participants used common smartphones for digital pre-screening, which has both advantages and disadvantages compared to wrist-worn devices used in previous studies. In the United States, around 84% of the population owns a smartphone15. Because of the higher market penetration, smartphone-based strategies currently scale much better than strategies that require dedicated, wrist-worn devices. In contrast, wrist-worn devices offer significant technical advantages. For example, measurements can be taken automatically at more frequent times or even continuously, which could substantially increase the sensitivity in AF detection. In contrast, smartphone-based AF screening essentially relies on the participant’s active willingness to screen, similarly to intermittent AF screening by handheld ECG devices used in previous studies10,11,12. These studies have already shown that the frequency of intermittent screening is an important factor. In the REHEARSE-AF trial, intensive twice-weekly self-screening with a handheld ECG in outpatients over 65 years of age led to an impressive 3.9-fold increase in the detection rate of new AF cases over 12 months10. In contrast, in the VITAL-AF study, screening outpatients aged 65 years or older with a handheld ECG at only two primary care physician visits (IQR 1–3) over 12 months did not significantly increase the AF detection rate (1.72% versus 1.59%; P = 0.38)11. Also in our study, the number of PPG measurements performed was an independent predictor of detected AF, presumably because the detection threshold for intermittent AF is lowered in relation to AF burden. In five of the 26 cases, in which AF was detected by usual care during digital screening, participants had no or low measurement activity. However, our study also shows that the mode of PPG acquisition presumably does not affect the diagnostic yield of an abnormal PPG finding for confirmed AF. 
This was 35% in our study, which is similar to those in previous studies with wrist-worn technologies (34% and 32% in the Apple and Fitbit heart studies, respectively).

Achieving high-quality and complete follow-up in digital studies without being in personal contact with study participants is a difficult task. To optimize follow-up, we combined claims data with information provided by participants themselves, whether through app-based questionnaires or phone calls, and verified endpoint-relevant data through requested source documents. In the first phase of the trial, which was relevant for the primary efficacy analysis, only 172 of the 5,551 participants (3.1%) were lost to follow-up and 198 (3.6%) withdrew consent, which we consider very low for a siteless digital study, and which is substantially lower than in previous digital screening trials. More than 98% of the follow-ups were achieved by the study app or telephone calls, whereas less than 2% relied solely on insurance claims data. Selective underreporting can be a critical problem in digital studies and can lead to overestimation of effect size when it affects the control arm. This is unlikely in our study, as the completeness of follow-up was even better with usual care than with digital screening (loss to follow-up rates of 2.2% versus 4.0%, respectively). Because we made the same observation after cross-over, this may be a general phenomenon in digital studies unmasked by our specific study design and possibly triggered by participant fatigue to the digital interventions.

Finally, in contrast to previous studies, we considered the need for OAC prescription as a prerequisite for reaching the primary endpoint. Notably, the indication for OAC was made by the treating physician who was not involved in the study. In this way, we were able to show that digital AF screening had not only a diagnostic effect but also a therapeutic effect. At this point, however, it is important to emphasize that our study cannot provide information on the prognostic implications of digital AF screening or screening-induced prophylactic OAC, which still remain uncertain12,16 and may be further elucidated by ongoing randomized trials (NCT01938248 and NCT02618577). The current US Preventive Services Task Force guidelines consider current evidence insufficient to change AF screening recommendations for adults 50 years of age or older17. AF duration and AF burden, as well as underlying stroke risk in relation to bleeding risk, may be the most important factors that determine clinical effectiveness of prophylactic OAC18,19. In this context, it is important to note that, in our study, the types of AF detected during digital screening or usual care did not differ. The maximum AF duration in participants who were prescribed OAC for paroxysmal AF during digital screening was 11 hours (median; IQR 4–24), which qualifies for individual shared decision-making according to current recommendations3.

However, detection of AF has important clinical implications that go far beyond the issue of OAC. AF is associated with increased mortality because it is linked to comorbidities, including heart failure20, myocardial infarction21, chronic kidney disease22, venous thromboembolism23, dementia20 and cancer24. In addition to common risk factors, there are direct causal interactions between AF and its comorbidities, leading to interdependence in disease development25. In our study, digitally detected AF as well as abnormal PPG measurements per se were significant predictors of MACCE. This suggests that common smartphones may also serve as digital risk assessment tools identifying candidates who qualify for intensive diagnostics and multifactorial risk factor and disease modification.

Digital AF screening, like any other screening, is not without side effects. False-positive PPG measurements lead to unnecessary testing and costs. In our study, two of three participants with abnormal PPG measurements did not have confirmed AF. AF screening may also lead to overtreatment with, for example, OACs. There was a numerically higher number of bleedings during digital screening in the second phase of the study.

Our study has several limitations. As our study was conducted in Germany, findings might not be representative for other healthcare systems. Participants of our study were recruited from the pool of policyholders of a large health insurance company. Thus, a selection bias cannot be ruled out. We used a specific smartphone app for PPG self-measurements, which raises the question of transferability to other providers or systems. Considering the very similar diagnostic yields offered by the different PPG-based technologies in the different studies, we think that the principal findings of our study are generalizable. In our study, we cannot comment on the positive predictive value of the PPG algorithm used, as no simultaneous ECG assessments were performed. We used an external ECG loop recorder to verify AF, which automatically documents AF episodes. A full 14-day ECG recording would have provided more detailed information. The participants in phase 2 of the study are characterized by a negative screening in phase 1. Therefore, the ORs between the two phases are not comparable. In our study, we used an intermittent smartphone-based screening that requires active PPG measurements by the participant and is, therefore, not comparable to continuous, passive, smartwatch-based screening, which might provide better compliance and increased sensitivities in AF detection. Our study was an open-label study with all inherent limitations that such a study design entails. In particular, increased awareness of AF due to study participation cannot be excluded, which could have favored usual care.

In conclusion, our study shows that scalable digital AF screening using ordinary smartphones can substantially increase the detection rate of AF requiring OAC compared to routine symptom-based screening. The technology is readily available, can be used on over 90% of smartphones currently available and poses no measurable barriers to older people, who stand to benefit most from screening.

Methods

Study participants and trial oversight

eBRAVE-AF (NCT04250220) was a siteless, prospective, digital, randomized, open-label study initiated and organized by Ludwig-Maximilians-University (LMU) Hospital in Munich, Germany13. Study participants were recruited from the pool of policyholders of Versicherungskammer Bayern, a large German health insurance company. Participants aged 50–90 years with a CHA2DS2-VASc score of ≥1 (men) or ≥2 (women), not known to have paroxysmal or persistent AF and not treated with OAC, were eligible. For AF, the following International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes were used: I48.0, I48.1 and I48.2. For treatment with OAC, the following Anatomical Therapeutic Chemical (ATC) codes were used: B01AA, B01AE, B01AE07, B01AF, B01AF01, B01AF02 and B01AF03 (Supplementary Tables 1 and 2). Baseline characteristics were derived from claims data and personal questionnaires. The timeframe to define the baseline was 2 years before study initiation. In total, 67,488 policyholders met the inclusion criteria and were invited by letter to participate. Between 4 February 2020 and 31 July 2020, 5,587 participants (8.3%) downloaded a specifically designed study app (eBRAVE-AF app, design-IT GmbH) from the Google Play Store (Android) or Apple App Store (iOS) on their smartphone and provided electronic informed consent (Fig. 1). A total of 5,551 participants met the inclusion criteria and were included in the study. The eBRAVE-AF app was a customized app based on the MORE framework26. The eBRAVE-AF app handled the communication with the participants throughout the study and was connected to a server structure holding the backend of the MORE framework, including the database within the secured network infrastructure of LMU Hospital Munich. There was no in-person contact with study participants throughout the study.
Clinical and follow-up information was collected via questionnaires in the study app or through telephone calls and was matched with health insurance data (Supplementary Tables 1–3). The study was approved by the medical ethics committee of LMU Hospital (no. 18–779). The study protocol is provided as a Supplementary Note.

Study design and randomization

The study was designed as a parallel-group randomized trial with a subsequent cross-over phase for secondary analyses. After confirming inclusion criteria in the study app and providing electronic informed consent, participants triggered a simple randomization process in the app by which they were assigned to one of two groups. The randomization was executed centrally in a server located at LMU Hospital. The generated random assignment was stored in a database located at LMU Hospital. The randomization was performed using the Mersenne Twister algorithm27, using PHP version 7.2.24 and the function array_rand. Investigators and participants were aware of the study group allocation. Group 1 performed a 6-month period of digital screening. Group 2 performed a 6-month period of usual care. After 6 months, participants who did not reach the primary endpoint in the first study phase, were still alive and did not withdraw consent were invited to participate in a second 6-month study phase with cross-over assignment to usual care or digital screening, respectively (Fig. 1). Phase 1 started at the time of signing the electronic informed consent. Phase 2 began after participants completed the cross-over questionnaire in the eBRAVE-AF app, which took place at least 6 months after enrollment in the study. For all participants, the study ended 12 months after signing the electronic informed consent. Access to digital screening was regulated by software.
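
For illustration, simple (unrestricted) randomization with a Mersenne Twister generator can be sketched as follows. This is not the trial's PHP code: it uses Python's `random.Random`, which is also a Mersenne Twister generator, and the function name `simple_randomize` and the seed value are assumptions made only for this example. Because simple randomization applies no blocking or stratification, the two group sizes are not forced to be equal.

```python
import random
from collections import Counter

def simple_randomize(rng):
    """Assign one participant to group 1 (digital screening first)
    or group 2 (usual care first) with equal probability."""
    return rng.choice((1, 2))

# random.Random is a Mersenne Twister generator; the seed is arbitrary and
# only makes this illustration repeatable (the trial's seeding is not described).
rng = random.Random(2020)
assignments = [simple_randomize(rng) for _ in range(5551)]
counts = Counter(assignments)
print(counts[1], counts[2])
```

Each call is independent of all previous assignments, which mirrors a design in which every participant triggers their own allocation from within the app.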

Digital AF screening and usual care

Digital AF screening was a two-stage procedure. It consisted of repeated 1-minute PPG pulse wave self-measurements using a smartphone app (Preventicus Heartbeats, Preventicus), followed by confirmatory ECG recordings by means of a 14-day external ECG loop recorder (CardioMem CM 100 XT, GETEMED) in case of an abnormal PPG finding. The PPG app is a CE-certified Class 2a medical device (according to the European Union Directive 93/42/EEC) for the detection of AF. PPG-based pre-screening relied on the analysis of the irregularity of the pulse wave sequence and morphology by a validated algorithm28,29, with a reported sensitivity and specificity of 89.9% (85.5–93.4) and 99.1% (97.5–99.8) for 1-minute measurements, respectively30. The external ECG loop recorder automatically detects and stores up to 200 AF episodes with a reported sensitivity of 91%. For every stored episode, the corresponding raw ECG with start and end time is available13. Participants were instructed to perform PPG measurements twice daily for the first 14 days and twice weekly thereafter, resulting in a scheduled 76 PPG measurements over 6 months. To perform a PPG measurement, the participant launched the PPG app and placed the smartphone camera on the fingertip. The PPG app activated the LED light and automatically started recording. The eBRAVE-AF app reminded participants to take self-measurements through push notifications. Abnormal findings were validated by a dedicated telemedicine center (Telecare Center Ulm). In case of a confirmed abnormal PPG finding, the study center contacted the participants by phone and sent them the external ECG loop recorder. The participants attached the loop recorder themselves and returned it to the study center for evaluation after 14 days. If AF was diagnosed by the study center, participants were asked to consult their local treating physician with the ECG report. The treating physicians were not involved in the study and made all treatment decisions, such as initiation of OAC, at their own discretion.
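The scheduled total of 76 PPG measurements follows directly from the measurement schedule. The short Python sketch below reproduces the arithmetic under one assumption that is not stated in the text, namely that 6 months are counted as 26 weeks; the constant and function names are illustrative.

```python
DAYS_INTENSIVE = 14        # twice-daily phase (from the text)
PER_DAY_INTENSIVE = 2
PER_WEEK_MAINTENANCE = 2   # twice-weekly phase (from the text)
WEEKS_TOTAL = 26           # assumption: 6 months counted as 26 weeks

def scheduled_measurements(weeks_total: int = WEEKS_TOTAL) -> int:
    """Total scheduled PPG self-measurements over the screening period."""
    intensive = DAYS_INTENSIVE * PER_DAY_INTENSIVE        # 14 days x 2 = 28
    maintenance_weeks = weeks_total - DAYS_INTENSIVE // 7  # 26 - 2 = 24 weeks
    return intensive + maintenance_weeks * PER_WEEK_MAINTENANCE

print(scheduled_measurements())  # 28 + 48 = 76
```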

Usual care aimed to mirror the natural history of AF detection in real life. Accordingly, no study-related diagnostic procedures were performed during usual care.

Follow-up

In both groups, follow-up information was obtained by electronic questionnaires in the eBRAVE-AF app at 4-week intervals, as well as from insurance claims data (last updated on 22 February 2022), which were available for all participants. If participants did not answer the electronic questionnaires, the information was obtained by telephone by the study team at LMU Hospital. Medical reports were requested for all endpoint-relevant data.

AF diagnosis

During digital screening, a diagnosis of AF was made when the CardioMem CM 100 XT loop recorder detected ≥30 seconds of AF. During both digital screening and usual care, the diagnosis of AF could also be made by the treating physicians. Suspected cases of AF detected during usual care were identified through the regular app-based questionnaires as well as by insurance claims data (Supplementary Table 3).

Study endpoints

The primary efficacy endpoint was newly diagnosed AF within the first 6 months (phase 1 of the study) that led to new prescription of OAC by an independent physician not involved in the study. Secondary endpoints were newly diagnosed AF, newly prescribed OAC, stroke, thromboembolic events, major bleeding (Bleeding Academic Research Consortium (BARC) ≥ 2) and cost-effectiveness. The analysis of cost-effectiveness is not yet completed and is, therefore, not reported here. Major adverse cardiac and cerebrovascular events (MACCE) included cardiovascular mortality, stroke, thrombosis, pulmonary embolism and hospitalization due to myocardial infarction or decompensated heart failure. All endpoints were assessed during the study by app-based questionnaires and telephone calls (in case of suspected endpoints or when the app-based questionnaires were not answered) as well as by insurance claims data for all participants at the end of the study (ICD-10-CM and ATC codes are listed in Supplementary Table 3). Source documents were requested for all suspected endpoints (medical reports, recorded ECGs, etc). If source data were not available, the endpoint documented by insurance claims data was considered to be correct. In contrast, events self-reported by study participants via the study app required confirmation by a source document. The primary endpoint was adjudicated by an independent endpoint committee blinded to the study group allocation. Death was considered a censoring event, with the exception of the endpoint MACCE. With the exception of MACCE, all endpoints were censored 12 months after signing the electronic informed consent. Sensitivity analyses of the primary and secondary endpoints that treated death as an endpoint event, as well as competing-risks analyses, were also performed.
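The censoring rules for endpoints other than MACCE can be sketched as follows. This is a minimal Python illustration, not the study's analysis code: the function name and day-based time scale are hypothetical, and counting 12 months as 365 days is an assumption.

```python
from typing import Optional, Tuple

CENSOR_DAYS = 365.0  # assumption: 12 months after consent counted as 365 days

def time_to_event(event_day: Optional[float],
                  death_day: Optional[float],
                  horizon: float = CENSOR_DAYS) -> Tuple[float, bool]:
    """Return (follow-up time in days, event observed) for a non-MACCE endpoint.

    Death is treated as a censoring event, and follow-up is censored at the
    horizon. None means the event or death did not occur.
    """
    censor_day = horizon if death_day is None else min(death_day, horizon)
    if event_day is not None and event_day <= censor_day:
        return event_day, True
    return censor_day, False

print(time_to_event(100, None))   # event observed on day 100
print(time_to_event(None, 200))   # censored at death on day 200
print(time_to_event(400, None))   # censored at the 12-month horizon
```

The sketch deliberately does not cover MACCE, for which death was not handled as a censoring event and the 12-month censoring did not apply.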

Sample size calculation

Based on results of previous randomized trials of AF screening using mobile ECG technologies10,31,32, we assumed a 6-month detection rate of 3% by digital screening and a detection rate of 1% by usual care. To detect a significant difference between study groups with a power of 90% at an alpha level of 5%, we calculated that at least 1,068 participants per study group would be necessary. To compensate for a possibly lower prevalence of AF in both study groups, to account for dropouts and to achieve greater power for secondary analyses, we planned to enroll at least 4,400 participants. The sample size calculation was performed for the primary endpoint using the parallel-group design. The subsequent cross-over phase did not affect the sample size calculation but served to increase compliance and power for secondary analyses.
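To show how the stated assumptions translate into a required group size, the following Python sketch applies a standard normal-approximation formula for comparing two proportions with unpooled variances. The exact method used for the published figure is not stated in the text, so this is an assumption; the sketch yields a value on the order of 1,000 per group, in the same range as, but not identical to, the reported 1,068. The function name is illustrative.

```python
import math
from statistics import NormalDist

def n_per_group(p1: float, p2: float,
                alpha: float = 0.05, power: float = 0.90) -> int:
    """Participants per group for a two-sided test of two proportions.

    Normal approximation with unpooled variances (an assumed method):
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Assumed 6-month detection rates from the text: 3% versus 1%.
print(n_per_group(0.03, 0.01))
```

Variants of this calculation (for example, with a continuity correction or based on time-to-event methods) give somewhat different numbers, which may explain the gap to the published value.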

Statistical analyses

Continuous data are presented as medians with lower and upper quartiles and were compared using Wilcoxon and Kruskal–Wallis tests where appropriate. Categorical data are summarized with the use of frequencies and proportions and were compared using the chi-square test. Outcomes were analyzed using time-to-event methods. The effect of the mode of screening on the primary endpoint was primarily tested using Cox regression analysis. The proportional hazards assumption was tested using graphical diagnostics based on the scaled Schoenfeld residuals. As the proportional hazards assumption was violated, logistic regression analysis was used instead, and the effect of digital screening on the primary endpoint was reported as an odds ratio (OR). Primary and secondary analyses were performed separately for both phases of the study. The effect of AF or positive PPG measurements on MACCE was assessed in an exploratory analysis by introducing AF or positive PPG as a time-dependent covariate in a Cox regression analysis. For this analysis, censoring was not performed at 12 months. The effect of the total number of PPG measurements on the primary endpoint for different CHA2DS2-VASc scores was tested using logistic regression analysis. Event rates and cumulative proportions were estimated using the Kaplan–Meier method, with 95% CI calculated based on Greenwood’s method, and were compared using log-rank statistics. Analyses were done on a modified intention-to-treat basis according to the protocol, in which participants who had known AF or an ongoing treatment with OAC were excluded. We used bootstrapping to calculate the 95% CI of mean values. For all analyses, differences were considered statistically significant when the two-sided P value was less than 0.05. All statistical analyses were performed using R version 4.2.1.
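As an illustration of the Kaplan–Meier estimate with Greenwood-based confidence intervals, the following self-contained Python sketch computes the survival function S(t) and a plain (untransformed) 95% interval at each event time. It is not the study's R code; the function name and the toy data are hypothetical, and statistical software may apply a transformed interval by default. The cumulative proportion with an event is 1 − S(t).

```python
import math
from typing import List, Sequence, Tuple

def kaplan_meier(times: Sequence[float], events: Sequence[bool],
                 z: float = 1.96) -> List[Tuple[float, float, float, float]]:
    """Return (time, S(t), lower, upper) at each distinct event time.

    Greenwood's formula: Var[S(t)] = S(t)^2 * sum_i d_i / (n_i * (n_i - d_i)),
    where d_i is the number of events and n_i the number at risk at time t_i.
    """
    n_at_risk = len(times)
    surv, gw_sum, curve = 1.0, 0.0, []
    for t in sorted(set(times)):
        leaving = [e for tt, e in zip(times, events) if tt == t]
        d = sum(leaving)                      # events at time t
        if d:
            surv *= 1 - d / n_at_risk
            if n_at_risk > d:                 # avoid division by zero
                gw_sum += d / (n_at_risk * (n_at_risk - d))
            se = surv * math.sqrt(gw_sum)
            curve.append((t, surv,
                          max(0.0, surv - z * se),
                          min(1.0, surv + z * se)))
        n_at_risk -= len(leaving)             # events and censored leave the risk set
    return curve

# Toy data for illustration only (not study data): False marks a censored time.
toy_times = [2, 3, 3, 5, 8]
toy_events = [True, True, False, True, False]
curve = kaplan_meier(toy_times, toy_events)
for t, s, lo, hi in curve:
    print(f"t={t}: S={s:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Participants censored at the same time as an event are counted as at risk at that time, which is the usual convention.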

Role of funding source

The study was primarily funded by resources of LMU Munich (to A.B. and S.M.). The study received partial funding from Pfizer Pharma GmbH (to A.B.). The PPG app was provided by Preventicus GmbH. The funders of the study had no role in study design, data collection, data analysis, data interpretation or writing of the report.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.