Pathogenic variants in the LRRK2 gene are among the most common causes of autosomal dominant Parkinson’s disease (PD)1,2 and are thought to act through a gain-of-function mechanism that increases kinase activity3. The LRRK2 p.L1795F variant (chr12:40322386:G:T, hg38, rs111910483) has been shown to significantly enhance kinase activity, supporting its pathogenic role4. It was previously identified in eight PD cases reported from 2007 to 20195,6,7 and, most recently, in 20248, and has also been suggested as a genetic risk factor with an odds ratio (OR) of 2.59. However, insufficient evidence of segregation precluded this variant from being considered “pathogenic”. Determining pathogenicity is crucial for diagnosis, genetic counseling, and, even more so, for treatment, particularly now that LRRK2-specific clinical trials are underway10,11.

We screened a large cohort of PD cases and controls with short-read whole-genome sequencing (WGS) data, comprising 16,351 individuals from GP2 release 8 (DOI 10.5281/zenodo.13755496) and AMP-PD release 4 (for details see Methods and Supplementary Table 1), to identify recurrent rare coding variants of unknown significance co-segregating with PD in known PD genes (LRRK2, SNCA, VPS35, PINK1, PRKN, PARK7, and GBA1). We identified nine carriers of the LRRK2 p.L1795F variant (ENST00000298910.12:c.5385G>T; chr12:40322386:G:T; Supplementary Figs. 1–6). Among these carriers, kinship inference using genetic data identified two families (Fig. 1). The larger family (GP2-FAM-1) included four affected individuals, demonstrating segregation of this variant with PD. The second family (AMP-FAM-1) consisted of three carriers, one clinically affected with PD and two asymptomatic carriers (at ages 55 and 76 years, respectively). The remaining two carriers were PD cases with a positive family history of PD, but no additional family members were available for genetic testing. Notably, rs111910483 is multiallelic, and we identified seven additional carriers of the synonymous p.L1795L (ENST00000298910.12:c.5385G>A; chr12:40322386:G:A) variant. However, this synonymous variant is very unlikely to be disease-causing and was therefore excluded from further analyses. We did not identify other recurrent variants in known PD genes with supporting segregation evidence.

Fig. 1: Pedigrees of identified families in this study.

Pedigrees of families GP2-FAM-1 (A), CANADA-FAM-1 (B), and AMP-FAM-1 (C) with the LRRK2 p.L1795F variant. The pedigrees were drawn based on reported family history and may be incomplete. The index cases are indicated with arrows. Affected individuals are indicated by black symbols: circles (female) and squares (male). Diamonds indicate individuals of undefined sex. Unaffected individuals are indicated by open symbols. Unaffected variant carriers are indicated by open symbols with a dot in the middle. A diagonal line indicates deceased individuals. Red circles indicate individuals with genetic data available (WGS data for GP2-FAM-1 and AMP-FAM-1, single gene testing for CANADA-FAM-1). Heterozygous mutant (m) and wild-type (wt) genotypes are indicated with the corresponding age at sample collection (age) and age at motor symptom onset (if known; AAO). A The mother of the GP2-FAM-1 index case was reported to have eight additional siblings (#), several of whom are clinically affected with PD; however, no detailed family history is available for these relatives. B One maternal aunt (II-1) of the CANADA-FAM-1 index case was reported to have had Alzheimer’s disease (##).

Next, we screened the genotyping data of 54,153 affected and unaffected individuals generated within GP2 (DOI: 10.5281/zenodo.10962119), where the LRRK2 p.L1795F variant was directly genotyped using the NeuroBooster array. We identified three additional clinically affected variant carriers (Supplementary Fig. 7). We further screened the clinical exome data of 10,454 individuals from PDGENE, which yielded one additional variant carrier (Supplementary Fig. 8). Finally, by querying the CENTOGENE proprietary Databank CentoMD®12, we identified another family with four individuals carrying the LRRK2 p.L1795F variant, three of whom were PD cases and one of whom was an asymptomatic carrier. In total, we identified 17 individuals carrying this variant across all datasets, including nine index cases with PD as well as five affected and three unaffected family members.

The demographic and clinical details of all identified variant carriers are displayed in Table 1. More than two-thirds were female (70.6%; n = 12/17). All affected and unaffected carriers had a positive family history of PD. Notably, among the six singleton cases, two reported only second-degree relatives with PD, while three reported a multi-incident family history of the disease. Age at motor symptom onset (AAO) in affected individuals ranged from 36 to 66 years, with a median of 54.5 years (interquartile range 47–60 years). The asymptomatic carriers were 55, 76, and 76 years old, respectively, at the time of sample collection and clinical evaluation. Based on the available clinical data, the majority of affected individuals had classical PD with an asymmetric onset of symptoms and a good response to dopaminergic medication, and without obvious atypical signs suggestive of other diagnoses (missing data for up to 30%). Detailed data on non-motor symptoms and neuropsychiatric comorbidities were scarce. Cognition was reported to be unaffected in the majority of affected carriers, with good scores in cognition tests (including Montreal Cognitive Assessment [MoCA] and Mini Mental State Examination [MMSE]); however, one clinically affected individual had significant cognitive impairment (MoCA score of 17 points) and one unaffected carrier also showed some cognitive deficits (MoCA score of 23 points). More detailed characteristics of the individuals from the three identified families are available in the Supplementary Material.

Table 1 Demographic and clinical characteristics of identified LRRK2 p.L1795F variant carriers

The p.L1795F (ENST00000298910.12:c.5385G>T) variant is currently categorized as a variant of uncertain significance in ClinVar and shows conflicting evidence from various in-silico prediction tools and databases (Supplementary Table 2 and Supplementary Fig. 9). It is rare and confined to European populations in several investigated databases (including gnomAD v4.1, the Regeneron Genetics Center Million Exome Variant Browser13, and the UK Biobank14 500K genomes). Similarly, all LRRK2 p.L1795F carriers identified in this study were of European ancestry, whereas the variant was absent in other ancestral populations (n = 15,316) within the GP2 genotyping cohort. In Europeans, it had an allele frequency of 0.00012 among PD cases (5 heterozygous carriers and 20,812 noncarriers) while being absent in controls (n = 9,032; Table 2). The logistic regression analysis using the European population of the GP2 genotyping cohort did not reveal a significant association between this variant and PD, likely because the dataset contained too few controls given the rarity of the variant (P > 0.8, Supplementary Table 3). When comparing the distribution of carriers between PD cases from the combined genotyping and WGS dataset (6 heterozygous carriers and 23,270 noncarriers) and two non-Finnish European control populations, gnomAD v3.1.2 non-neuro (0 heterozygous carriers and 31,960 noncarriers) and gnomAD v4.1 (2 heterozygous carriers and 589,826 noncarriers), this variant was significantly associated with PD (P < 0.0056 using gnomAD v3.1.2 non-neuro, and P < 7.84e-08, OR = 76.04, 95% CI: 15.35–376.77 using gnomAD v4.1, two-tailed Fisher’s exact test). Given that this variant was observed only in the European population, we searched for overlapping identity-by-descent (IBD) segments among the variant carriers using the genotyping data. The median length of an IBD segment over LRRK2 in these individuals was 7.05 cM (range: 2.1–96.3 cM, Fig. 2). All genotyped carriers shared a core haplotype of 2.825 Mbp at this locus (Supplementary Table 4), suggesting that the p.L1795F variant descended from a common founder.

Table 2 Frequency of the LRRK2 p.L1795F and p.G2019S variants across ancestries in the GP2 genotyping cohort
Fig. 2: Overlapping identity-by-descent (IBD) segments spanning the LRRK2 p.L1795F variant among the variant carriers with genotyping data.

Each line represents an IBD segment inferred between a unique pair of individuals. IBD segments are colored based on whether both individuals in a pair belong to the same family (GP2-FAM-1) or are considered unrelated (UR). FS indicates an IBD segment between full siblings, 2nd degree refers to a segment between a pair of second-degree relatives, and PO represents a segment between a parent and offspring. The vertical grey line marks the genomic position of the LRRK2 p.L1795F variant.

To our knowledge, we report the largest number of LRRK2 p.L1795F variant carriers thus far, including 14 carriers clinically affected with PD and three asymptomatic carriers. The available data from the previously reported carriers5,6,7,8 do not align with our data, making an overlap of individuals between the studies unlikely. Including those reported in the literature, this brings the total to 22 clinically affected carriers of European ancestry. Still, the overall number of p.L1795F carriers is limited, and higher frequencies might be observed in specific European subpopulations. Our haplotype analysis indicating a common founder further supports this hypothesis, although we were only able to determine the geographical origin of one family of carriers in this study, which was of Ukrainian and Polish descent. Taken together with four recently published carriers of either Hungarian or Slovak origin, this likely indicates a Central-Eastern European origin8. Notably, we identified three asymptomatic p.L1795F carriers, who might still develop PD symptoms later in life. However, given the pedigree structure of these individuals, this may also reflect reduced penetrance, a common phenomenon in monogenic forms of PD, including other pathogenic LRRK2 variants.

Comparing the clinical phenotypes of p.L1795F carriers with those of other pathogenic LRRK2 variants, particularly p.G2019S15, revealed similarities among them and with idiopathic PD (iPD). While group differences in clinical phenotypes among LRRK2 variants may exist16, they do not enable meaningful genotype-phenotype correlations, and LRRK2-PD is clinically indistinguishable from iPD at the individual level. Most individuals with LRRK2-PD, including p.L1795F carriers, exhibit a classic PD phenotype with a good response to dopaminergic treatment. Atypical presentations have been described in single cases but are overall rare16. Notably, the p.L1795F variant is located in the COR-B domain, in close proximity to other pathogenic LRRK2 variants, namely p.Y1699C17 and p.F1700L18. Interestingly, for p.Y1699C carriers, a more heterogeneous phenotype has been reported, including atypical signs like amyotrophy, dementia and symptoms of behavioral disorders17,19,20,21. However, this observation might be coincidental and biased by the small number of variant carriers. Atypical features, prominent non-motor features, and neuropsychiatric comorbidities have not been specifically reported for the majority of p.L1795F carriers, but the overall data are limited, making it difficult to draw meaningful conclusions. Overall, the p.L1795F phenotype aligns well with the general characteristics of LRRK2-PD and appears comparable to that of other LRRK2 variants, although this should be interpreted cautiously given the limited number of identified carriers. The most significant differences between the genetic subtypes are their ancestral and geographical distributions.

In conclusion, this is the first study providing evidence of the LRRK2 p.L1795F variant segregating with disease in multiplex families, evidence that was missing from the previous reports5,6,7,8. Taken together with published functional data4 showing strongly enhanced LRRK2 kinase activity, our findings support classifying the LRRK2 p.L1795F variant as pathogenic. Large-scale studies can help not only to identify novel rare causes of PD but also to re-evaluate previously identified variants by providing additional evidence of pathogenicity through an increased number of variant carriers and segregation data. We therefore propose LRRK2 p.L1795F as a cause of PD, especially in the European population. Including this variant in the genetic screening of PD patients, particularly those of Central-Eastern European origin, may allow variant carriers to be included in ongoing gene-specific clinical trials.

Methods

Ethics declaration

This study was conducted in accordance with the ethical standards of the institutional and national research committees. It was approved by the ethics committees or institutional review boards of all sites participating in this study and providing samples and data, including the University of Cincinnati, Cincinnati, OH, USA (IRB#2017-5985); the Emory University School of Medicine, Atlanta, GA, USA; Michigan State University, MI, USA; and the University Health Network Research Ethics Board, Toronto, Canada. Informed consent for study participation was obtained from all participants.

Study design and participants

Our study workflow is highlighted in Fig. 3. Three sources of data were included in this study (Supplementary Table 1). First, we used the multi-ancestry whole-genome sequencing and genotyping data from the study participants recruited as part of GP222 (DOI 10.5281/zenodo.13755496) as previously described23,24. Individual-level demographic and clinical data were obtained from participating principal investigators and publicly available databases (e.g., for Coriell samples included in GP2). Second, we incorporated whole-genome sequencing data from AMP-PD. Participants in this initiative were recruited through multiple studies, including BioFIND, the Harvard Biomarkers Study (HBS), the Lewy Body Dementia Case-Control Cohort (LBD), the Parkinson’s Disease Biomarkers Program (PDBP), the Parkinson’s Progression Markers Initiative (PPMI), the LRRK2 Cohort Consortium (LCC), the Study of Isradipine as a Disease-Modifying Agent in Subjects with Early Parkinson Disease, Phase 3 (STEADY-PD3), and the Study of Urate Elevation in Parkinson’s Disease, Phase 3 (SURE-PD3). Clinical information and genetic samples from participants were obtained with appropriate written consent and local institutional and ethical approvals. Detailed information about these studies is available on the AMP-PD website (https://amp-pd.org) and the respective study websites. Third, we obtained the clinical exome sequencing data from PDGENE2, a large multi-center study in North America providing genetic testing and counseling to more than 15,000 participants.

Fig. 3: Study design.

Figure created with BioRender.com.

Whole-genome sequencing (WGS) data

We included 9,974 samples with sequence alignment data available from the BioFIND, HBS, LBD, PDBP, PPMI, STEADY-PD3, and SURE-PD3 cohorts through the AMP-PD release for joint genotyping with the GP2 cohort (Supplementary Table 5). Because sequence alignment data were unavailable for the LCC cohort, we used AMP-PD release 4 data to screen for potential pathogenic variants in this cohort.

Additionally, the DNA samples from 5,926 participants from the GP2 cohort (GP2 Data Release 8, DOI 10.5281/zenodo.13755496, Supplementary Table 5) were genome sequenced to an average of 30x coverage with 150 bp paired-end reads following Illumina’s TruSeq PCR-free library preparation protocol. We followed the same functional equivalence pipeline25 as AMP-PD to produce the sequence alignment against the GRCh38DH reference genome.

We used DeepVariant v.1.6.126 (https://github.com/google/deepvariant) to generate the single-sample variant calls for a total of 15,900 samples in GP2 and AMP-PD and performed joint-genotyping using GLnexus v1.4.3 (https://github.com/dnanexus-rnd/GLnexus) with the preset DeepVariant WGS configuration27. Genotypes that failed genotype-level quality control were set to missing (criteria for retention: genotype quality ≥10, read depth ≥10, and heterozygous allele balance between 0.2 and 0.8), and we retained high-quality variants with a call rate > 0.95 after quality control. After sample quality control following the quality metrics defined by AMP-PD28, we retained 15,752 samples (AMP-PD and GP2 combined) for the downstream analyses (Supplementary Table 5). Variant annotation was performed with Ensembl Variant Effect Predictor v111 (http://www.ensembl.org/info/docs/tools/vep/index.html, RRID:SCR_007931)29. We used KING v.2.3.0 (https://www.kingrelatedness.com, RRID:SCR_009251)30 to infer relatedness up to second-degree relatives in order to confirm the known relationships and identify cryptic familial relationships. Genetic ancestry was determined using GenoTools v1.2.3 (https://github.com/GP2code/GenoTools) with the default settings31.
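The genotype-level and variant-level filters described above can be sketched as follows. This is a minimal illustration only, not the pipeline used in this study: the `Call` record, its field names, and the reading of the thresholds as retention criteria are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    """One sample's genotype call at one variant (hypothetical record layout)."""
    genotype: str  # e.g. "0/0", "0/1", "1/1"
    gq: int        # genotype quality
    dp: int        # read depth
    ad_ref: int    # reads supporting the reference allele
    ad_alt: int    # reads supporting the alternate allele


def mask_call(call: Call) -> Optional[str]:
    """Return the genotype if it passes QC, otherwise None (set to missing)."""
    if call.gq < 10 or call.dp < 10:
        return None
    alleles = call.genotype.replace("|", "/").split("/")
    if len(set(alleles)) > 1:  # heterozygous: also check allele balance
        total = call.ad_ref + call.ad_alt
        if total == 0:
            return None
        balance = call.ad_alt / total
        if not 0.2 <= balance <= 0.8:
            return None
    return call.genotype


def call_rate(calls: List[Call]) -> float:
    """Fraction of samples with a non-missing genotype after masking."""
    if not calls:
        return 0.0
    masked = [mask_call(c) for c in calls]
    return sum(g is not None for g in masked) / len(masked)


def keep_variant(calls: List[Call], min_call_rate: float = 0.95) -> bool:
    """Retain a variant only if its post-QC call rate exceeds the threshold."""
    return call_rate(calls) > min_call_rate
```

In this sketch, homozygous calls are not subject to the allele-balance check, which is one possible reading of "heterozygous allele balance"; other pipelines may apply additional filters.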

Genome-wide genotyping with the NeuroBooster Array (GP2)

We screened the genotyping data published as part of GP2’s Data Release 732 (DOI: 10.5281/zenodo.10962119, Supplementary Table 6). Genotyping was performed by GP2 using the NeuroBooster Array (NBA; v.1.0, Illumina, San Diego, CA)33. Raw genotyping data underwent quality control and genetic ancestry prediction using GenoTools v1.2.3 with the default settings31. The LRRK2 p.L1795F variant was directly genotyped using NBA, and the quality of genotype calls was assessed by examining the signal intensity plots.

Clinical exome sequencing (PDGENEration)

We included 10,454 samples with clinical exome data available from PDGENE2 as part of GP2’s Data Release 8 (DOI 10.5281/zenodo.13755496)32. The sequence data were processed with the same pipeline as the WGS data described above. We performed joint-genotyping using GLnexus v1.4.3 with the preset DeepVariant WES configuration and applied the same criteria for sample and variant quality control as for the WGS data.

Querying additional databases (CENTOGENE)

We queried the CENTOGENE proprietary Databank CentoMD®12 to identify potential additional variant carriers. CENTOGENE is a globally operating genetic diagnostic laboratory. The genetic data included in this manuscript were generated by exon-wise PCR amplification followed by Sanger sequencing.

Statistical analyses

To estimate the allele frequency of the LRRK2 p.L1795F variant in multi-ancestral populations, we analyzed the GP2 genotyping data, the largest available dataset in this study. We excluded related individuals and samples from targeted recruitment, such as LRRK2 and GBA1 variant carriers within specific efforts of PPMI and LCC. Subsequently, we performed an association analysis of this variant with PD using the European population. We fitted a logistic regression model with PD status as the binary outcome variable and, as covariates, the genotype of the LRRK2 p.L1795F variant, sex, age, family history, and the first six principal components to account for population stratification. For cases, age at onset (AAO) or age at diagnosis was used, while for controls, age at sampling was used. Additionally, we merged the GP2 genotyping data with the combined AMP-PD and GP2 WGS data, resulting in a cohort of 23,276 PD cases of European ancestry after excluding duplicated, related, and targeted recruitment samples as mentioned above. This allowed us to compare the carrier distribution between PD cases and the non-Finnish European population from the Genome Aggregation Database (gnomAD v.3.1.2 non-neuro and v4.1, http://gnomad.broadinstitute.org/, RRID:SCR_014964), used as external population controls, with Fisher’s exact test. We excluded the PDGENE clinical exome data from this analysis as we could not estimate the genetic ancestry in the same manner as with the other datasets. A P value ≤ 0.05 was considered statistically significant for all analyses.
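As an illustration of the carrier-count comparison, the sketch below computes an allele frequency, a two-tailed Fisher’s exact test, and an odds ratio with a 95% confidence interval from the counts reported in the Results. It uses only the Python standard library and is not the software used in this study. The function names are hypothetical, and the Woolf (log) method for the confidence interval is an assumption, since the text does not state which interval method was applied.

```python
from math import exp, lgamma, log, sqrt


def allele_frequency(het: int, noncarriers: int, hom_alt: int = 0) -> float:
    """Alternate-allele frequency in a diploid sample."""
    n = het + noncarriers + hom_alt
    return (het + 2 * hom_alt) / (2 * n)


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def logp(x: int) -> float:
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = logp(a)
    total = sum(exp(logp(x)) for x in range(lo, hi + 1)
                if logp(x) <= observed + 1e-7)
    return min(1.0, total)


def odds_ratio_woolf_ci(a: int, b: int, c: int, d: int, z: float = 1.959964):
    """Odds ratio with a Woolf (log-scale) CI; requires all cells > 0."""
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


if __name__ == "__main__":
    # Counts as reported in the text: carriers / noncarriers.
    print(allele_frequency(het=5, noncarriers=20812))        # GP2 European PD cases
    print(fisher_exact_two_sided(6, 23270, 0, 31960))        # vs gnomAD v3.1.2 non-neuro
    print(fisher_exact_two_sided(6, 23270, 2, 589826))       # vs gnomAD v4.1
    print(odds_ratio_woolf_ci(6, 23270, 2, 589826))          # vs gnomAD v4.1
```

Under these assumptions, the gnomAD v4.1 comparison gives an odds ratio of (6 × 589,826)/(23,270 × 2) ≈ 76.04, in line with the reported value. The odds ratio is undefined for the gnomAD v3.1.2 comparison because no control carriers were observed.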

To determine whether carriers of the LRRK2 p.L1795F variant shared recent common ancestry, we phased the genotyping data from chromosome 12 in the European population using Beagle 5.4 (https://faculty.washington.edu/browning/beagle/beagle.html) with default settings34 and searched for identical-by-descent (IBD) segments of length ≥2 cM shared among the carriers using hap-ibd v1.0.0 (https://github.com/browning-lab/hap-ibd) with default settings35.
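One way to derive a shared core haplotype from pairwise IBD segments is to intersect all segments that span the variant position. The sketch below illustrates this idea under stated assumptions: the segment tuples, sample labels, and coordinates are hypothetical, and the text does not specify how the core haplotype was actually determined.

```python
from typing import List, Optional, Tuple

# (sample_1, sample_2, start_bp, end_bp); a hypothetical, simplified layout
Segment = Tuple[str, str, int, int]

# chr12 position (hg38) of LRRK2 p.L1795F, as given in the text
VARIANT_POS = 40_322_386


def spanning_segments(segments: List[Segment], pos: int) -> List[Segment]:
    """Keep only the IBD segments that cover the given position."""
    return [s for s in segments if s[2] <= pos <= s[3]]


def core_haplotype(segments: List[Segment], pos: int) -> Optional[Tuple[int, int]]:
    """Intersection of all segments spanning pos, or None if none span it."""
    hits = spanning_segments(segments, pos)
    if not hits:
        return None
    start = max(s[2] for s in hits)  # rightmost segment start
    end = min(s[3] for s in hits)    # leftmost segment end
    return start, end


if __name__ == "__main__":
    # Illustrative coordinates only; not data from this study.
    example = [
        ("A", "B", 39_000_000, 42_500_000),
        ("A", "C", 38_500_000, 41_800_000),
        ("B", "C", 39_400_000, 45_000_000),
        ("A", "D", 10_000_000, 12_000_000),  # does not span the variant
    ]
    core = core_haplotype(example, VARIANT_POS)
    print(core, (core[1] - core[0]) / 1e6, "Mbp")
```

Because every retained segment covers the variant, the intersection is never empty; its length shrinks as more distantly related carrier pairs are added.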