Introduction

Digital health technologies (DHTs) promise to transform observational and interventional studies because they enable remote, and therefore frequent and ecologically valid, quantification of patients’ signs in the environments that are most meaningful to patients: where they live and work1. These advantages are especially relevant for research in individuals with Parkinson’s disease (PD) because of the fluctuating nature of motor signs. As noted by Marsden et al.2, even in the early stages of the disease “Changes in environment, the mood, the degree of motor effort, sleep and fatigue will all profoundly modify the appearance of parkinsonism.” This variability hinders the quantification of motor sign severity at infrequent in-clinic visits. It also encumbers the quantification of motor sign progression over time, a critical metric for testing whether novel drugs slow disease progression. Here, we report exploratory data from daily at-home assessments with the Roche PD Mobile Application v2 DHT3 in the phase 2 PASADENA clinical trial of the anti-alpha synuclein monoclonal antibody prasinezumab4.

The core motor signs of PD are accessible to quantification with inertial measurement units (IMUs), most commonly accelerometers and gyroscopes, contained in custom DHT products as well as consumer-grade wearables and smartphones1. The recognition of this potential led to the development of DHTs in the form of smartphone and smartwatch apps, such as mPower (Sage Bionetworks, Seattle, Washington)5,6, HopkinsPD7,8, CloudUPDRS9, smartwatch-based motor assessments10, WATCH-PD11 and the Roche PD Mobile Application3,12. These approaches typically instruct individuals to perform “active tests” with a smartwatch or smartphone app, for example tapping buttons on the screen as quickly as possible to measure bradykinesia. Sensor features from the active tests show high test-retest reliability3,10,12; discriminate healthy controls (HCs) from individuals with PD (known-groups validity)6,12,13,14, even from those who showed no impairment at a site visit12; discriminate HCs from individuals with idiopathic REM sleep behavior disorder13; and predict Movement Disorder Society Unified PD Rating Scale (MDS-UPDRS15) scores3,6,9,10,12 as well as future disability milestones (e.g., new onset of falls, freezing)8. Sensors can also be used to passively acquire data on motor behavior while patients go about their daily lives. Such passive monitoring data may be used to estimate the impact of the disease on motor functioning in daily life (e.g., turning speed while walking, sit-to-stand transitions)12. These findings indicate that DHT approaches may be well-suited for clinical trials, in which robust longitudinal data are critically required to assess potential treatment responses16.

Here, we report the results of a DHT deployed as an exploratory outcome measure in the phase 2 PASADENA study of prasinezumab. Prasinezumab (formerly called RO7046015/PRX002) is an investigational humanized monoclonal antibody that binds with high affinity and avidity to human aggregated alpha-synuclein, the toxic protein hypothesized to cause PD17. The phase 2 randomized, double-blind, placebo-controlled PASADENA Part 1 study (NCT03100149) was designed to evaluate the efficacy and safety of monthly prasinezumab infusions over 52 weeks in individuals with early-stage PD who were not taking dopaminergic medication. All patients were randomized 1:1:1 to placebo, low dose (1500 mg) or high dose (4500 mg) prasinezumab for 52 weeks (Part 1)18. All participants received prasinezumab (blinded to dose) in the ensuing 52 weeks of the study (PASADENA Part 2). Data from Part 2 were used to determine whether potential group effects identified in Part 1 were also apparent when progression was measured over two years (i.e., Parts 1 and 2). While prasinezumab was safe and well tolerated, the primary endpoint (MDS-UPDRS sum of Parts I + II + III) was not met in Part 1 (ref. 4). Analysis of the secondary endpoint (MDS-UPDRS Part III scores) indicated a numeric benefit on motor progression for low-dose prasinezumab in Part 1, which persisted into Part 2 (ref. 4). Here we determine the extent to which exploratory sensor data from the Roche PD Mobile Application v2 (ref. 3), deployed for daily testing and passive monitoring over two years, are in line with the MDS-UPDRS Part III findings in Parts 1 and 2 of the PASADENA study. We stress that since the primary endpoint of this study was not met, and secondary and exploratory endpoints are not corrected for multiple comparisons, all statistical results pertaining to these endpoints are investigative and hypothesis-generating as opposed to indicative of significance per se. Thus, all results would require confirmation in future studies.

Results

Adherence to remote DHT active testing and passive monitoring over Parts 1 and 2 of PASADENA

The mean (± SD) adherence to the remote active testing protocol was 5.7 ± 1.7 days/week in Part 1 and 4.9 ± 2.3 days/week in Part 2. Adherence over two years did not differ between the pooled prasinezumab and placebo groups (placebo/delayed start group: 5.3 ± 2.0; pooled prasinezumab/early start group: 5.3 ± 1.8; U(df=185) = 11,156.5, p = 0.6). An average of 6.5 ± 4.2 and 9.8 ± 4.6 h/day of passive smartphone and smartwatch data, respectively, were collected in Part 1, and an average of 5.0 ± 4.2 and 7.3 ± 4.9 h/day of passive smartphone and smartwatch data, respectively, were collected in Part 2 (Fig. 1). Adherence to passive monitoring with the smartphone did not differ between groups over two years (placebo/delayed start group: 5.5 ± 4.0 h; pooled prasinezumab/early start group: 5.9 ± 4.1 h; U(df=213) = 11,654, p = 0.5). However, the placebo/delayed start group tended to collect fewer hours of smartwatch passive monitoring data compared to the pooled prasinezumab/early start group over two years (8.0 ± 4.2 h and 8.9 ± 4.3 h, respectively; U(df=209) = 12,468, p = 0.07). This difference was primarily apparent in PASADENA Part 1 (see Supplementary Fig. 1), leading to a significant interaction between group and device for two-year smartwatch passive monitoring adherence (estimate: -0.285; SE = 0.13; p < 0.01). We note that one sensor feature (passively monitored gesture power during non-gait segments) relied on smartwatch data.

Fig. 1: Adherence to remote DHT monitoring.

Mean (error shading represents 95% confidence intervals) adherence to daily (a) active testing (AT) with the smartphone and (b) passive monitoring (PM) with the smartwatch and smartphone Roche PD Mobile Application v2 during Parts 1 (i.e., weeks 0–52 since randomization) and 2 (i.e., weeks 52–104 since randomization) of the PASADENA study.

Results of exploratory analyses of motor sign progression as measured by the Roche PD Mobile Application v2

Motor sign progression as measured by six individual sensor features progressed numerically less in the pooled prasinezumab compared to the placebo arm: speeded tapping variability of the least and most affected sides, maximum speed of hand-turning in the least and most affected sides, median turn speed in the U-turn task and the median power of hand movements during passively monitored non-gait segments (Table 1 and Fig. 2).

Table 1 Progression of motor signs as measured by Roche PD Mobile Application v2 active test and passive monitoring individual sensor features and summary Simple Sum v1.2 score in the placebo vs. pooled prasinezumab arms in PASADENA Part 1

Fig. 2: Progression of individual sensor outcome measures in individuals with PD on placebo (black) and prasinezumab (blue).

Progression of sensor features which surpassed the exploratory threshold for a group effect in PASADENA Part 1 is shown: (a) passively monitored gesture power during non-gait segments; (b) hand-turning, least affected side; (c) U-turn test; (d) speeded tapping, most affected and (e) least affected sides; (f) hand-turning, most affected side. Data from both the RCT PASADENA Part 1 (first 52 weeks; placebo group data plotted in black, pooled prasinezumab group data plotted in blue) and PASADENA Part 2 (second 52 weeks) in which all participants received prasinezumab (all data plotted in blue) are shown. Only data collected prior to the start of dopaminergic therapy are analyzed and plotted. Statistics are reported beneath each figure: estimates from the LMEs are coefficients of the fortnight-by-treatment arm interaction term (difference in slopes between the two arms), while estimates from MMRM are reported as contrasts between the pooled prasinezumab and placebo arms at the last study fortnight in the analysis.

In addition, the slope of progression on the summary motor Simple Sum v1.2 score was lower in the prasinezumab compared to the placebo group (Table 1 and Fig. 3).

Fig. 3: Progression of summary Simple Sum Score sensor outcome measure and MDS-UPDRS Part III in individuals with PD on placebo (black) and prasinezumab (blue).

a Progression on motor signs in the pooled early- and delayed-start prasinezumab groups as measured by Simple Sum v1.2 score (results from the linear mixed effects models (LMEs) are reported) and (b) the MDS-UPDRS Part III. Only data collected prior to the start of dopaminergic therapy are analyzed and plotted. For both metrics, higher scores indicate more severe motor signs. b is adapted from ref. 4, Copyright ©2022 Massachusetts Medical Society. Reprinted with permission.

Results of exploratory analyses on the effect of group on motor sign progression over 2 years as measured by the Roche PD Mobile Application v2

Five numeric group findings from PASADENA Part 1 persisted in the analyses of Parts 1 and 2 data: finger tapping on the least and most affected sides, turn speed in the U-turn task, passively monitored hand movement power during non-gait activities, and Simple Sum v1.2 (Table 2 and Figs. 2 and 3). We note that two features (Draw-a-shape on the least affected side, rest tremor on the most affected side) showed numerically less progression in the pooled prasinezumab group over two years (i.e., Parts 1 and 2) but not in Part 1 only. Raw and modeled PASADENA Parts 1 and 2 data for all individual features are shown in Supplementary Fig. 2.

Table 2 Progression of motor signs as measured by Roche PD Mobile Application v2 active test and passive monitoring individual sensor features and summary Simple Sum v1.2 score in the early-start (i.e., prasinezumab in both PASADENA Parts 1 and 2) and delayed-start (i.e., placebo in PASADENA Part 1 and prasinezumab in PASADENA Part 2) groups

Exploration of relationship between sensor features and patient-reported quality of life

Potential relationships between sensor feature values and patient-reported quality of life measures (i.e., PDQ-3919 subscale scores) were explored by comparing sensor feature values between non-impaired and impaired groups (i.e., scores of 0 vs. >0) with Wilcoxon rank-sum (Mann-Whitney U) tests. Applying an exploratory p < 0.001 threshold, these tests indicated that sensor features reflecting upper limb bradykinesia (i.e., feature values from both sides of Speeded tapping and Hand turning, and the least affected side for Draw-a-shape) may be associated with the PDQ-39 Activities of Daily Living subscale scores, and that the Speech and Hand-turning (most affected side) sensor features may be associated with the PDQ-39 Communication subscale score (see Table 3). Moreover, baseline PDQ-39 global scores correlated with baseline Speeded tapping (most affected side) according to the p < 0.001 threshold (rho = 0.29) (see Supplementary Fig. 4).

Table 3 Comparison of baseline sensor feature values (order recapitulates that in Table 1) between individuals with PD with unimpaired vs. impaired PDQ-39 subscale and total scores (Mann-Whitney U (p-) values reported; comparisons with p < 0.001 are highlighted in yellow)

Discussion

Exploratory daily DHT active testing and passive monitoring data showed numerically different patterns of motor progression between the placebo and pooled prasinezumab groups of the randomized, double-blind, placebo-controlled PASADENA Part 1 study in early-stage, dopaminergic treatment-naïve PD. These effects were observed with sensor-based measures of bradykinesia (i.e., the finger-tapping and hand-turning active tests, turn speed in the active U-turn task, and passively monitored hand movement power during non-gait segments in daily life), as well as a composite DHT Simple Sum v1.2 score of motor sign severity. The majority of these findings were also apparent in the analyses of data from Part 1 + Part 2 (i.e., year 2) of the study. We reiterate that since the primary endpoint of PASADENA was negative4 and secondary and exploratory analyses were not corrected for multiple comparisons, the present findings must be deemed investigative and hypothesis-generating as opposed to indicative of a treatment effect. The present exploratory findings suggest that sensor-based outcome measures may contribute to quantifying motor sign progression in clinical research of early-stage, dopaminergic treatment-naïve PD.

Bradykinesia DHT features showed the numerically greatest group effect in this early-stage PD population. These findings are consistent with patterns in the MDS-UPDRS Part III PASADENA data, i.e., a numerically reduced progression of MDS-UPDRS Part III scores at one year in the pooled prasinezumab compared to the placebo group, whereby this finding was driven by bradykinesia subscale items4. Bradykinesia is known to worsen most in the early stages of PD20 and has been one of the best-studied and most accessible motor signs for DHTs21. In the present study, speeded tapping, hand turning, speed of turning while walking and passively monitored non-gait hand movement power all showed numeric group differences. Interestingly, the effect was evident in maximal capacity active U-turn testing, but not in passively monitored turn speed during walking in daily life. Further, the passively monitored hand movement findings are notable because they emanated from non-gait periods in daily life, and thus do not represent the well-known arm-swing reductions in PD22. Instead, they may comprise, e.g., spontaneous co-speech gestures23 and plate-to-mouth movements24, both of which are affected in PD and represent motor behaviors of central real-world patient relevance. However, we reiterate that the placebo group wore the smartwatch for fewer hours than the prasinezumab group. Thus, this finding requires confirmation in follow-up studies to rule out the possibility that group differences in adherence to smartwatch monitoring were responsible for this finding (i.e., that individuals who wore the smartwatch more also gesticulated with more power during non-gait segments).

An outstanding question for the digital biomarker field is the extent to which sensor features reflect aspects of the disease that are relevant to patients’ functioning in everyday life25. An early study by Martinez-Martin and colleagues26 demonstrated that clinicians’ motor examination ratings (MDS-UPDRS Part III) significantly correlated with an eight-item version of the PDQ-39 (rho = 0.47, n = 435). To explore the patient relevance of the present DHT sensor features, they were compared between patients with no versus some impairment in aspects of daily functioning deemed relevant to individuals with PD. PDQ-39 global scores correlated with a single bradykinesia feature (speeded tapping) at the exploratory threshold, in line with the primacy of this motor sign in early PD. Analyses of PDQ-39 subscale scores with digital feature values indicated that sensor features reflecting upper limb bradykinesia may be associated with the PDQ-39 Activities of Daily Living subscale scores, while both the Speech and Hand-turning sensor features may be associated with PDQ-39 Communication subscale scores. We note that the paucity of findings may be partly due to the limited impact of early-stage PD on daily activities as measured by the PDQ-39, resulting in a truncated range in PDQ-39 scores (cf. Supplementary Fig. 3). These highly exploratory results require further investigation in follow-up studies with individuals with PD exhibiting a larger range of daily functioning impacts.

Longitudinal research studies of PD are increasingly deploying DHTs to mitigate the challenges associated with traditional methods to track progression in PD27. The MDS-UPDRS15 provides neurologists with essential information to quantify disease progression; however, as with any infrequently administered clinical scale, the fluctuating nature of PD motor signs will impact the reliability of the corresponding scores2. Indeed, Evers and colleagues28 recently quantified the within-subject variance associated with MDS-UPDRS progression in the PPMI sample using linear Gaussian state space models. These demonstrated substantially lower within-subject reliability estimates for MDS-UPDRS Parts I, II, and III (ON and OFF scores)—i.e., 0.23–0.5—compared with previously reported ICCs measuring between-subject variance ( > 0.90)29. Isolated psychometric limitations (e.g., floor effects, item gaps between very mild motor signs and motor impact)20 compound these issues in early stages of PD. DHTs may therefore provide complementary data with which to quantify motor sign severity and progression, as they enable remote and frequent data capture with a more representative sampling of motor sign severity.

The primary advantages of DHTs (high sensitivity, objectivity of assessment, remote and frequent assessment opportunities, and the ecological validity of testing in environments in which patients live and work) recommend them for numerous applications in the context of PD. A recent systematic survey by Deb and colleagues27 identified four key areas of active and growing PD DHT research from 2008 to 2021: diagnosis, prognosis and monitoring, measuring treatment responses, and rehabilitation. Publications focusing on the measurement of treatment response represented the smallest proportion (8%), showed the slowest growth over time, and focused on approved therapies such as levodopa and deep brain stimulation. While these findings will inform the development of future DHTs aimed at optimizing treatment regimens, taken together with the present findings, they provide indirect evidence of the utility of DHTs in testing new investigational drugs16,25.

Exploratory analyses of data from daily interactions with a DHT suggest that this digital biomarker approach may contribute to hypothesis generation in signal-seeking phase 2 clinical trials in early-stage, dopaminergic treatment-naïve PD. The present exploratory findings require confirmation in an independent study. With respect to prasinezumab, the present findings must be viewed in the context of the PASADENA study not meeting its primary endpoint, whereby the MDS-UPDRS sum of Parts I, II and III did not differ between prasinezumab and placebo groups. In analyses uncorrected for multiple comparisons, MDS-UPDRS Part III data indicated a numeric benefit of prasinezumab on motor sign progression3. We also note that only one feature was pre-specified for each active test at the outset of the clinical trial. The analysis of a wider range of DHT features (e.g., hand turning amplitude in addition to speed), including novel features developed since the initial feature selection for the present study3 (e.g., speech and passive monitoring features), will provide important opportunities for further learning. Nevertheless, the present findings support the use of DHTs to quantify disease progression in clinical research and trials16,25 and potentially for treatment monitoring in routine clinical practice.

Methods

Ethics statement

The trial was conducted according to the principles of the Declaration of Helsinki and Good Clinical Practice guidelines, and was approved by central institutional review boards (Ethikkommission der Medizinischen Universität Innsbruck (Austria), Comité de Protection des Personnes (CPP) Ouest IV (France), Ethikkommission der LÄK Hessen (Germany), CEIm Hospital Universitari Vall d’Hebron (Spain), Copernicus Group Independent Review Board (United States) and Western Institutional Review Board (United States)) or ethics committees at each trial site (Ethikkommission der Universität Leipzig Geschäftsstelle der Ethikkommission an der Medizinischen Fakultät der Universität Leipzig (Germany), Ethikkommission der Fakultät für Medizin der Technischen Universität München (Germany), Ethikkommission der Universität Ulm (Oberer Eselsberg) (Germany), Landesamt für Gesundheit und Soziales Berlin Geschäftsstelle der Ethik-Kommission des Landes Berlin (Germany), Ethikkommission des FB Medizin der Philipps-Universität Marburg (Germany), Ethikkommission an der Medizinischen Fakultät der Eberhard-Karls-Universität und am Universitätsklinikum Tübingen (Germany), Ethikkommission an der Medizinischen Fakultät der HHU Düsseldorf (Germany), The University of Kansas Medical Center Human Research Protection Program (United States), Oregon Health & Science University Independent Review Board (United States), Northwestern University Institutional Review Board (United States), Spectrum Health Human Research Protection Program (United States), The University of Vermont Committees on Human Subjects (United States), Beth Israel Deaconess Medical Center Committee on Clinical Investigations, New Procedures and New Forms of Therapy (United States), Vanderbilt Human Research Protection Program Health (United States), University of Maryland, Baltimore Institutional Review Board (United States), University of Southern California Institutional Review Board (United States), Columbia University Medical Center Institutional Review Board (United States), University of Southern California San Francisco Institutional Review Board (United States), University of Pennsylvania Institutional Review Board (United States) and HCA-HealthOne Institutional Review Board (United States)).

All participants provided written informed consent before undergoing any trial-specific screening tests or evaluations.

Study design and participants

The PASADENA phase 2 study design (Clinical Trial Registry Name: ClinicalTrials.gov; Clinical Trial Registry ID: NCT03100149; registered 2017-03-29), participants, and primary data analyses and results have been previously reported4,18. Briefly, PASADENA Part 1 was a 52-week multicenter, randomized, double-blind, placebo-controlled study evaluating the effect of intravenous prasinezumab on disease progression in early-stage PD patients. Three hundred and sixteen individuals with PD were randomized to placebo, low dose (1500 mg), or high dose (4500 mg/3500 mg based on body weight) prasinezumab. The pooled dose group is compared to placebo in the present analyses as no appreciable dose group differences were apparent in preliminary analyses4. All had been diagnosed with PD for ≤2 years at screening, were in Hoehn & Yahr Stages I or II, had a dopamine transporter imaging with single photon emission computed tomography (DaT-SPECT) scan consistent with PD, and were either treatment-naïve or treated with monoamine oxidase B inhibitors (MAO-Bi) and were not expected to require dopaminergic therapy for at least 52 weeks. Table 4 presents the demographic and clinical characteristics of the PASADENA population used for the Part 1 analyses.

Table 4 Baseline (i.e., Day 1 of PASADENA) demographic and clinical characteristics of study participants with DHT data used in PASADENA Part 1 analyses

In Part 2 of PASADENA, placebo patients were re-randomized to receive either 1500 or 4500 mg prasinezumab (delayed-start group). Patients receiving prasinezumab in PASADENA Part 1 continued on the same dose of prasinezumab in PASADENA Part 2 (early-start group). In PASADENA Part 2, data from 7 patients were unavailable in the early-start group (n = 204). In the final study week 104 of PASADENA Part 2, uncensored data from 28 patients in the early-start and 55 patients in the delayed-start groups remained for analyses (i.e., 27% each).

All participants provided written informed consent before undergoing any study-specific test or evaluation.

Procedures

Participants were administered comprehensive clinical assessments, including the MDS-UPDRS, at baseline and study visits every 2 months thereafter18.

All participants were provided with a locked-down study smartphone (Galaxy S7, Samsung, Seoul, South Korea) and smartwatch (Moto G 360 2nd Gen Sport, Motorola, Chicago, USA) for remote monitoring with the Roche PD Mobile Application v2. The active tests described below, which were designed to measure the core motor symptoms associated with PD, were presented on the smartphone (see Lipsmeier et al.3,12 for details). All motor active tests requiring one arm or hand were performed twice, once with each side.

  • Draw-a-shape: drawing predefined shapes on the smartphone with the finger;

  • Finger-tapping: alternately tapping two circles on the smartphone as quickly and regularly as possible with the index finger;

  • Hand-turning: pronation/supination of the outstretched hand holding the smartphone;

  • Speech: reading three questions and answering these out loud;

  • Phonation: sustained phonation (i.e., saying “aaaaa”);

  • Postural tremor: holding the phone in the outstretched hand;

  • Rest tremor: seated, holding the phone in the hand while the forearm rests on the thigh;

  • Balance: standing with the smartphone worn in a running belt;

  • U-turn task: walking several paces, turning 180 degrees, and repeating for 30 s, while wearing a running belt containing the smartphone;

  • Symbol Digit Modalities Test (SDMT) (dominant hand only): an electronic version of the SDMT30 (eSDMT).

Patients were instructed to complete a pre-specified set of active tests each morning, with circa half of the tests presented on alternating days (Fig. 1). After completing the active tests, patients were instructed to place the smartphone in a running belt or their trouser pocket, and to wear the smartwatch, for the rest of the day to passively measure motor behavior during normal daily life (Fig. 4). Finally, patients were instructed to recharge the devices overnight.

Fig. 4: Roche PD Mobile Application v2 testing suite and schedule.

Patients used provisioned study smartphones to perform daily “active tests”, circa half on alternating days, and motor behavior in daily life was “passively monitored” via the smartphone worn in a running belt or pocket and a provisioned study smartwatch.

Patients were trained on the use of the technologies at the screening visit, and started the DHT testing procedure during the screening period (median: 33.5 days prior to baseline) to familiarize themselves with the technology and tests. Data from the screening period were not included in the analyses described below.

Throughout both the active tests and passive monitoring, the smartphone sensors (i.e., the accelerometer, gyroscope, and magnetometer inertial measurement units (IMUs)) sampled patients’ movement patterns at circa 50 Hz. For select active tests, touch and sound were also recorded. Device technical status and obfuscated location data were also collected, whereby patients could choose to pause the latter. These data were not analyzed in the present report. Video was not recorded. No patient-identifiable information was stored on the devices. Sensor data from both devices were encrypted and uploaded to secure servers when the smartphones were connected to WiFi; minimally, this occurred during site visits.

Sensor data were quality controlled (QC-ed), and “sensor features” were derived from the QC-ed sensor data associated with each task (and side of the body if applicable)3. Passively monitored sensor data from the smartwatch and smartphone were submitted to a Human Activity Recognition machine learning model which segmented portions of data into meaningful human activities (e.g., “walking”, “sitting”)31. Dedicated algorithms were then applied to the segmented data to generate passive monitoring features such as the turn speed during daily walking and the power of hand movements during non-walking segments3. All available sensor features were log-transformed and aggregated by calculating the median values over all available data obtained within each 14-day period (fortnights). However, if fewer than three QC-ed feature values were available within each testing interval, the sensor feature for the given fortnight was labeled as missing (average of 5.22% for all individual sensor features and 22.86% overall for Simple Sum v1.2). Fortnight 0 was defined as the period between Day 1 and Day 14 and was regarded as the baseline for change score calculation.
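The fortnightly aggregation described above can be sketched as follows. This is a minimal illustration and not the study pipeline: the function names, the use of the natural logarithm, and the dropping of non-positive values and pre-baseline days are assumptions made here, since the text specifies only a log transform, a 14-day median, and the three-value minimum.

```python
import math
from collections import defaultdict
from statistics import median

MIN_VALUES_PER_FORTNIGHT = 3  # minimum number of QC-ed values stated in the text


def aggregate_fortnights(observations):
    """Aggregate (study_day, value) pairs into fortnightly medians of log values.

    study_day is 1-based, so Days 1-14 map to fortnight 0 (the baseline).
    Returns {fortnight: median log value, or None if labeled missing}.
    """
    by_fortnight = defaultdict(list)
    for study_day, value in observations:
        # Assumption: screening-period days (< 1) and non-positive values,
        # for which the log is undefined, are skipped.
        if study_day < 1 or value <= 0:
            continue
        by_fortnight[(study_day - 1) // 14].append(math.log(value))
    return {
        fortnight: median(values) if len(values) >= MIN_VALUES_PER_FORTNIGHT else None
        for fortnight, values in sorted(by_fortnight.items())
    }


def change_from_baseline(aggregated):
    """Subtract the fortnight-0 value; missing fortnights stay None."""
    baseline = aggregated.get(0)
    if baseline is None:
        return {}
    return {f: (v - baseline if v is not None else None) for f, v in aggregated.items()}
```

A fortnight with only two valid values is returned as None rather than being imputed, mirroring the missing-data rule above.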

DHT outcome measures

Seventeen individual sensor features were pre-specified for analyses based on the literature: one per active task (and side of the body if applicable), and two for passive monitoring. Two sensor features were calculated for each lateralized motor active task: one for the most and one for the least affected side. Thus, the 17 features comprised: Draw-a-shape: spiral celerity (for most and least affected sides), i.e., accuracy of drawing divided by the time to complete the spiral drawing; speeded tapping: tapping variability (for most and least affected sides), i.e., the standard deviation (SD) of inter-tap time; hand-turning: maximum hand turning speed (for most and least affected sides); speech: standard deviation of all words’ individual mel-frequency cepstral coefficient 2 (MFCC2); sustained phonation: voice jitter, i.e., deviation from true periodicity of vocal cord resonance; postural and rest tremor: log median squared energy (for most and least affected sides); balance: jerk, i.e., rate at which acceleration changes with time; U-turn: median turn speed; eSDMT: number of correct responses; and for passive monitoring, hand movement power during non-gait intervals, and median turn speed while walking. These features are described in more detail in Lipsmeier et al.3.
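Two of the feature definitions above are simple enough to illustrate directly. The sketch below computes tapping variability as the SD of inter-tap times and a jerk series as the finite difference of acceleration. The function names, the choice of the sample (n - 1) SD, and the forward-difference approximation are assumptions for illustration; the actual feature implementations are described in Lipsmeier et al.3.

```python
from statistics import stdev


def tapping_variability(tap_times_s):
    """SD of inter-tap intervals, given tap timestamps in seconds.

    Returns None when fewer than two intervals are available.
    Assumption: the sample (n - 1) standard deviation is used.
    """
    intervals = [later - earlier for earlier, later in zip(tap_times_s, tap_times_s[1:])]
    if len(intervals) < 2:
        return None
    return stdev(intervals)


def jerk_series(acceleration, sampling_rate_hz=50.0):
    """Forward-difference approximation of jerk (rate of change of acceleration).

    acceleration is a list of samples along one axis; 50 Hz matches the
    approximate smartphone sampling rate reported in the Procedures.
    """
    return [(later - earlier) * sampling_rate_hz
            for earlier, later in zip(acceleration, acceleration[1:])]
```

Perfectly regular tapping yields a variability of zero, so larger values indicate less regular tapping.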

A composite summary Simple Sum v1.2 score was created to measure the core motor signs of PD. The Simple Sum v1.2 therefore included all of the above features with the exception of eSDMT and passive monitoring. In addition, given the ambiguity of the directionality of worsening in the spiral celerity (accuracy/time), this feature was excluded from the Simple Sum v1.2 score. To ensure the same range of variability across features, the minimally detectable change (MDC) was calculated for each feature based on their intra-class correlation coefficients (ICCs)32. For the Simple Sum v1.2 (i.e., not individual features), the directions of sensor feature values were amended such that increases corresponded to worsening motor signs, and change from baseline for each feature was calculated by scaling by their respective MDCs.
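The scaling step can be sketched as follows, under loudly stated assumptions: the text says only that the MDC was derived from each feature's ICC, so the conventional formula MDC = z * sqrt(2) * SEM with SEM = SD * sqrt(1 - ICC) is assumed here, as is the reading of the composite as a plain sum of direction-aligned, MDC-scaled changes. Feature names and function names are hypothetical.

```python
import math


def minimal_detectable_change(sd, icc, z=1.96):
    """Conventional MDC from the SD and ICC of a feature (assumed formula)."""
    sem = sd * math.sqrt(1.0 - icc)  # standard error of measurement
    return z * math.sqrt(2.0) * sem


def simple_sum(changes, mdcs, worsening_sign):
    """Sum MDC-scaled changes from baseline after aligning directions.

    changes:        {feature: change from baseline}
    mdcs:           {feature: minimal detectable change}
    worsening_sign: {feature: +1 if an increase means worsening, else -1}
    """
    total = 0.0
    for feature, change in changes.items():
        total += worsening_sign[feature] * change / mdcs[feature]
    return total
```

For example, a feature that improves when it increases (sign -1) and falls by one unit with an MDC of 0.5 contributes +2 to the composite, i.e., two minimally detectable steps of worsening.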

Sensor feature data aggregated over each two-week window, as described above, were the basis of all statistical analyses.

Statistical analyses

Adherence to the remote active testing procedure was quantified as the average number of days per week with complete active testing data. Adherence to passive monitoring was quantified as the average number of hours of passively collected smartphone and smartwatch data. Both metrics were calculated over all participants for every two-week study period separately. Potential group (pooled prasinezumab/early start vs. placebo/delayed start) differences in active testing and passive monitoring (considered separately for smartphone and smartwatch) adherence over two years were tested with Mann-Whitney U-tests. A potential interaction in passive monitoring adherence with respect to group and device (smartphone vs. smartwatch) was tested with a linear mixed effects model including age and sex as covariates.
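As an illustration of these two steps, the sketch below computes active-testing adherence in days per week and the Mann-Whitney U statistic for a two-group comparison. It is a minimal, standard-library example with hypothetical function names; it returns only the U statistic, and in practice a statistics library routine (e.g., `scipy.stats.mannwhitneyu`) would typically supply the p-value.

```python
def days_per_week(completed_days, n_weeks):
    """Average number of days per week with complete active testing data.

    completed_days: iterable of study days with a complete test set
                    (duplicates are counted once).
    n_weeks:        number of weeks in the period considered.
    """
    return len(set(completed_days)) / n_weeks


def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y.

    Counts the (x_i, y_j) pairs in which x_i exceeds y_j; ties count 0.5.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u
```

The pairwise definition is quadratic in sample size, which is acceptable for a few hundred participants per group; rank-based implementations give the same statistic more efficiently.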

Two sets of exploratory analyses were performed. The first tested for an effect of group on sensor feature progression over 52 weeks in the randomized, double-blind Part 1 of PASADENA (i.e., differences between pooled prasinezumab and placebo). The second tested for an effect of group on sensor feature progression over 104 weeks in Parts 1 and 2 of PASADENA, whereby all participants received prasinezumab in Part 2 (i.e., differences between early-start and delayed-start prasinezumab groups). Only sensor feature data collected entirely prior to the start of dopaminergic treatment or change in MAO-Bi dose were analyzed, consistent with the primary clinical readouts4. Thus, the size of the samples with available data at different time points throughout the study primarily reflects the loss of data from patients who started symptomatic therapy (cf. Figs. 2 and 3). As the present analyses were exploratory (i.e., hypothesis-generating) in nature, results were not corrected for multiple testing. For each sensor outcome measure, both raw sensor data (“raw data”) and adjusted sensor values resulting from the respective statistical model described below (i.e., linear mixed effect (LME) models or mixed models with repeated measures (MMRMs)) (“modeled data”) are plotted. To contextualize the DHT findings, data on motor sign progression as measured with the MDS-UPDRS Part III15 are also displayed4.

Statistical analysis of PASADENA part 1 DHT data: exploratory analyses of the effect of group on motor sign progression as measured by the Roche PD mobile application v2

The objective of this analysis was to evaluate the effect of group on sensor feature change from baseline over the first 52 weeks of the PASADENA study in the absence of dopaminergic PD treatment. DHT data were analyzed from all randomized participants who (a) received at least one dose of prasinezumab or placebo (corresponding to all 316 patients), (b) had a non-missing result at baseline and (c) had at least one non-missing feature value within the 52 weeks.

Linear mixed effect (LME) models were fitted to estimate and compare the sensor feature progression slopes in the pooled prasinezumab and placebo treatment arms. The model included change from baseline in the sensor feature as the dependent variable while adjusting for the following fixed effects: time (fortnight, treated as a continuous variable), treatment arm, fortnight-by-treatment interaction, baseline sensor feature value, age group (<60 years vs. ≥ 60 years), sex, MAO-Bi at baseline (yes vs. no), and baseline DaT-SPECT binding ratio in the putamen contralateral to the clinically most affected side. A random intercept and slope for each study participant were also modeled. The residual variance-covariance structure between fortnights within participants was assumed to follow an autoregressive process of order 1. If model residuals were normally distributed, the LME model was used to make exploratory inferences regarding group differences. The target of inference for the LME model was the coefficient of the fortnight-by-treatment interaction term: its estimate and standard error were extracted from the fitted model, and hypothesis testing used an exploratory 2-sided significance level of 0.2, in line with the primary clinical analyses4.
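For orientation, the LME model described above can be written out as follows. The symbols are notation introduced here for illustration only and do not appear in the study documentation: for participant i at observation j, t is the fortnight, G the treatment arm, Y_{i0} the baseline sensor feature value, A the age group, S sex, M MAO-Bi use at baseline, and D the baseline DaT-SPECT binding ratio in the contralateral putamen.

```latex
\begin{align}
\Delta Y_{ij} &= \beta_0 + \beta_1 t_{ij} + \beta_2 G_i + \beta_3 \,(t_{ij} \times G_i)
  + \beta_4 Y_{i0} + \beta_5 A_i + \beta_6 S_i + \beta_7 M_i + \beta_8 D_i \nonumber \\
  &\quad + b_{0i} + b_{1i}\, t_{ij} + \varepsilon_{ij}, \\
\operatorname{Corr}(\varepsilon_{ij}, \varepsilon_{ik}) &= \rho^{\,|t_{ij} - t_{ik}|}
\end{align}
```

Here b_{0i} and b_{1i} are the participant-level random intercept and slope, the second line expresses the first-order autoregressive residual structure, and \beta_3 is the fortnight-by-treatment coefficient that serves as the target of inference.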

For DHT feature data with non-normally distributed LME model residuals, MMRMs were fitted to estimate the mean change from baseline to week 52 in sensor features in the pooled prasinezumab vs. placebo groups, using the same model structure as for the LMEs. Fortnight was considered categorical and treated as the repeated variable within a participant. A compound symmetry variance-covariance matrix was applied to model the within-participant errors. The target of inference for the MMRMs was the difference in adjusted mean change from baseline to week 52 between the prasinezumab and placebo groups. Its estimate and standard error were extracted from the fitted model; degrees of freedom were calculated using the Satterthwaite method33, and hypothesis testing also used an exploratory 2-sided significance level of 0.2.
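The two within-participant variance-covariance structures named above differ in how the correlation between fortnights is assumed to behave: under a first-order autoregressive structure it decays with the distance between fortnights, whereas under compound symmetry it is the same for every pair of fortnights. The sketch below builds both matrices for a small number of fortnights; the function names and the parameter values are illustrative assumptions and are not estimates from the study.

```python
def ar1_cov(n, sigma2, rho):
    """AR(1) covariance: sigma2 * rho ** |i - j| for fortnights i and j."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def compound_symmetry_cov(n, sigma2, rho):
    """Compound symmetry: variance sigma2 on the diagonal and the same
    covariance sigma2 * rho for every pair of distinct fortnights."""
    return [[sigma2 if i == j else sigma2 * rho for j in range(n)]
            for i in range(n)]


# Illustrative values only (three fortnights, variance 2.0, correlation 0.5).
ar1 = ar1_cov(3, 2.0, 0.5)
cs = compound_symmetry_cov(3, 2.0, 0.5)
```

With these values, the covariance between the first and third fortnight is 0.5 under AR(1) but 1.0 under compound symmetry.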

Statistical analysis of parts 1 and 2 DHT data: exploratory analyses of the effect of group on motor sign progression over 2 years as measured by the Roche PD mobile application v2

The objective of this analysis was to evaluate the effect of group on sensor feature changes from baseline over the first 104 weeks of PASADENA. The analysis population consisted of all randomized participants who (a) completed Part 1 without starting dopaminergic therapy, (b) received at least one dose of prasinezumab in Part 2, (c) had a non-missing result at baseline and (d) had at least one non-missing fortnight value for any of the sensor feature scores in Part 2.

The same LME and MMRM model structures and decision rules as above were used for these analyses, with the following adaptations:

LME: two additional covariates were included: adjustments for the study period (Part 1 vs. Part 2), and for the switch from placebo to prasinezumab in the delayed-start arm (switch = 0 for patients treated with placebo during Part 1 and switch = 1 for patients treated with prasinezumab in Part 2). Two additional interaction terms were also included: a fortnight-by-study period term to enable the overall progression slope to change from Part 1 to Part 2, and a fortnight-by-switch term to enable progression slopes to change from Part 1 to Part 2 by a different amount in the early-start vs. delayed-start arm.

MMRM: an additional covariate was introduced to adjust for the study period (Part 1 vs. Part 2).

The target of inference for the LME models was the coefficient of the fortnight-by-treatment interaction term, as for the first set of analyses. The target of inference for the MMRMs was the difference in adjusted mean change from baseline to Week 104 between the early-start and delayed-start groups. Hypothesis testing used an exploratory 2-sided significance level of 0.2.
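One possible way to write the adapted LME model is sketched below. The notation is introduced here for illustration and is not taken from the study documentation: the fixed effects of the Part 1 model are abbreviated as a single linear predictor, P denotes the study period covariate (Part 1 vs. Part 2), and W denotes the switch covariate as defined in the text.

```latex
\begin{equation}
\Delta Y_{ij} = \mathbf{x}_{ij}^{\top} \boldsymbol{\beta}
  + \gamma_1 P_{ij} + \gamma_2 W_{ij}
  + \gamma_3 \,(t_{ij} \times P_{ij}) + \gamma_4 \,(t_{ij} \times W_{ij})
  + b_{0i} + b_{1i}\, t_{ij} + \varepsilon_{ij}
\end{equation}
```

In this notation, \mathbf{x}_{ij}^{\top} \boldsymbol{\beta} collects the intercept, fortnight, treatment arm, fortnight-by-treatment interaction and baseline covariates of the Part 1 model, \gamma_3 allows the overall slope to change between study periods, and \gamma_4 allows that change to differ between arms. The coefficient of the fortnight-by-treatment term inside \mathbf{x}_{ij}^{\top} \boldsymbol{\beta} remains the target of inference.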

Exploration of relationship between sensor features and patient-reported quality of life

The relationship between sensor feature values and patient-reported quality of life as measured by the 39-item Parkinson’s Disease Questionnaire (PDQ-3919) was explored by comparing baseline sensor feature values with PDQ-39 global scores using Spearman’s correlations. The ranges of PDQ-39 subscale scores were truncated in this early-stage PD population (see Supplementary Fig. 3). Therefore, an additional analysis categorized patients as having no vs. some impairment (i.e., scores of 0 vs. >0) on each PDQ-39 subscale score. Sensor feature values were then compared between the non-impaired and impaired groups using nonparametric Wilcoxon signed rank tests. Results are noted if an exploratory threshold of p < 0.001 is achieved.
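The correlation and dichotomization steps described above can be sketched in a few lines. The code below computes Spearman's rank correlation (as the Pearson correlation of average ranks, which handles ties) and splits sensor feature values by impairment status on a PDQ-39 subscale. All function names and the toy values are hypothetical and serve only to illustrate the procedure; the group comparison test itself is not implemented here.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks.
    Raises ZeroDivisionError if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def split_by_impairment(feature_values, subscale_scores):
    """Split feature values into no impairment (score 0) vs. some (score > 0)."""
    none = [f for f, s in zip(feature_values, subscale_scores) if s == 0]
    some = [f for f, s in zip(feature_values, subscale_scores) if s > 0]
    return none, some


# Toy data: a monotone but non-linear relationship gives a perfect rank correlation.
rho = spearman_rho([1, 2, 3, 4], [1, 8, 27, 64])
no_impairment, some_impairment = split_by_impairment([0.1, 0.2, 0.3], [0, 5, 0])
```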