Parkinson’s disease (PD) is a complex condition influenced by multiple genes and environmental factors, with specific single-gene mutations accounting for only 1–2% of cases, excluding GBA11. Early-onset PD (EOPD; onset < 50 years) represents about 3–14% of all PD cases, depending on the population2, and has been linked to mutations in several genes (i.e., PRKN, PINK1, DJ-1, LRRK2, SNCA, GBA1)2. Considering variants in these genes only, the frequency of genetically associated PD in early-onset cases reaches up to 20%3,4.

Synaptojanin 1, encoded by SYNJ1, is a lipid phosphatase abundantly expressed in brain tissues5 that plays an important role in synaptic trafficking and autophagic clearance6,7,8. The protein consists of three functionally distinct domains: the Sac1, 5-phosphatase, and proline-rich (PRD) domains9. Previous studies suggested that biallelic SYNJ1 mutations cause autosomal recessive, early-onset parkinsonism and EOPD10,11,12,13. In the current study, we sought to examine the role of rare SYNJ1 variants in six non-familial cohorts with a total of 8,165 PD cases (including 818 EOPD) and 70,363 controls (detailed in Supplementary Table 1).

The average coverage of the SYNJ1 gene in the four cohorts sequenced at McGill was > 4000X, with 100% of nucleotides covered at > 30X (Supplementary Table 2). From these cohorts, 23–75 rare variants were included in the analysis, depending on the cohort (detailed in Supplementary Table 3). In the AMP-PD and UKBB cohorts, 523 and 440 rare variants were included in the analysis, respectively (coding and functional variants detailed in Supplementary Table 4).

Burden analysis with SKAT-O suggested a possible association between all rare variants and PD in the AMP-PD cohort (P = 1.42E-05; Pfdr = 2.98E-04), with nominal associations that did not survive false discovery rate (FDR) correction in the Columbia cohort (P = 0.030; Pfdr = 0.126) and the McGill cohort (P = 0.009; Pfdr = 0.095). No association was found in the meta-analysis of all six cohorts after FDR correction (P = 0.025; Pfdr = 0.131). We also found a nominal association between non-synonymous variants and PD in the Columbia cohort (P = 0.012; Pfdr = 0.084; Table 1).
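The Pfdr values reported here are FDR-adjusted p-values. The text does not name the adjustment procedure, so the following is only a minimal sketch that assumes the widely used Benjamini-Hochberg method. The function name and the example p-values are hypothetical and are not taken from Table 1.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values only (assumption, not study results).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
example_adjusted = benjamini_hochberg(example_p)
```

Under this procedure, a nominally significant p-value can exceed 0.05 after adjustment, which is the pattern described for the Columbia and McGill cohorts.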

Table 1 Burden analysis of rare SYNJ1 variants

We then analyzed only EOPD (age at onset < 50 years) and found an association between all rare variants and PD in the AMP-PD cohort (P = 3.48E-05; Pfdr = 7.31E-04) and a nominal association that did not survive FDR correction in the McGill cohort (P = 0.027; Pfdr = 0.189). In the meta-analysis, which included only EOPD patients and controls, all rare variants in SYNJ1 were associated with PD (P = 2.80E-03; Pfdr = 0.029). The association between non-synonymous variants and PD was nominally significant in the Columbia cohort (P = 0.044; Pfdr = 0.231) but not in the meta-analysis (Table 1).

To analyze variants within specific functional domains of SYNJ1 (Sac1, 5-phosphatase, PRD), we divided the gene into regions corresponding to these domains and repeated the SKAT-O analysis. We found a nominal association between the Sac1 domain of SYNJ1 and PD for non-synonymous variants with high Combined Annotation Dependent Depletion (CADD) scores in the UKBB cohort (P = 0.006; Pfdr = 0.082), which did not survive FDR correction. However, this association became significant in the meta-analysis (P = 0.002; Pfdr = 0.04; Supplementary Table 5). The association with the Sac1 domain of SYNJ1 was nominally significant in the EOPD group (P = 0.02; Pfdr = 0.37) and remained significant in the group with PD onset after 50 years (P = 0.001; Pfdr = 0.04). We observed that this association was primarily driven by the p.A195T SYNJ1 variant in the UKBB cohort (odds ratio (OR) = 4.87; 95% confidence interval (CI) 1.76–13.46; P = 0.002), which had a minor allele frequency (MAF) of 0.001 in cases and 0.0002 in both UKBB controls and the gnomAD population of European ancestry. This variant is classified as a Variant of Uncertain Significance (VUS) in ClinVar and as likely benign according to the ACMG classification, but it has a high CADD score. This variant was also detected in one patient from the Pavlov and Human Brain cohort and was not found in any of the other studied cohorts.
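For readers less familiar with how an OR and its 95% CI relate to carrier counts, the following is a minimal sketch of an unadjusted odds ratio with a Wald (Woolf) interval computed from a 2x2 table. It is not the model used in the study, which adjusted for sex and age, so it would not be expected to reproduce the reported estimate exactly. The function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(case_carriers: int, case_noncarriers: int,
                  control_carriers: int, control_noncarriers: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval."""
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts (assumption, not study data).
or_value, ci_low, ci_high = odds_ratio_ci(10, 990, 20, 9980)
```

With few carriers, the term 1/a dominates the standard error, which is why rare-variant ORs tend to have wide intervals such as the one reported above.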

We did not find any homozygous or compound heterozygous carriers of pathogenic variants (variants previously reported in association with PD11, variants defined as pathogenic or likely pathogenic by ClinVar, or variants with a CADD score > 20) or of deletions in any of the studied cohorts.

In the current study, we found that all rare heterozygous variants in SYNJ1, as well as heterozygous variants with high CADD scores in the Sac1 domain of SYNJ1, were associated with the risk of PD in some of the analyzed cohorts. These findings suggest that SYNJ1 could be associated with sporadic PD. Additional studies are required to determine whether this potential association holds in other cohorts. The association we found between rare heterozygous SYNJ1 variants and EOPD was more convincing, yet here too, additional studies are needed. We did not identify biallelic carriers in our analysis, although private biallelic SYNJ1 variants were previously associated with EOPD and atypical parkinsonism (detailed in ref. 11).

Multiple genes that are involved in the autophagy-lysosomal pathway are also associated with PD14. From a biological point of view, SYNJ1 has a role in two pathways relevant to PD: synaptic trafficking and autophagic clearance7,8. Recent functional studies demonstrated that mutations in SYNJ1 destabilize dopaminergic neurons, potentially due to defective clathrin uncoating, disrupting lipid metabolism and synaptic function15,16. However, SYNJ1 overexpression can counteract these effects, highlighting its therapeutic potential in PD15,16.

Our study has several limitations. In some of our cohorts, there were significant differences in sex and age between PD patients and controls. We adjusted for these covariates in our analysis to address this limitation. Additionally, our study predominantly included participants of European ancestry. Furthermore, we used different types of sequencing data, and quality control procedures were performed independently for each cohort, which could create differences in variant enrichment between cohorts. To partially overcome this limitation, we analyzed each cohort separately and conducted meta-analyses of the different cohorts, rather than a joint analysis of all cohorts. Finally, due to the limitations of the SKAT-O meta-analysis method, we were unable to assess heterogeneity across cohorts, which may influence the interpretation of our meta-analysis results.

To conclude, we found that rare heterozygous SYNJ1 variants were potentially associated with EOPD and that variants in the Sac1 domain were associated with sporadic PD. Larger studies in cohorts of different ethnic backgrounds are needed to replicate our results.

Methods

Population

The study population comprised 8,165 PD patients and 70,363 controls from six cohorts, including 818 EOPD (all demographic characteristics detailed in Supplementary Table 1). Four cohorts were sequenced at McGill University: (1) a cohort of French/French-Canadian participants from Quebec, Canada and Montpellier, France, (2) a cohort from Columbia University (New York, NY), (3) a cohort from Sheba Medical Center (Israel) and (4) a cohort from the Pavlov and Human Brain institutes (Russia)17. The Columbia cohort comprises patients and controls of varied racial and ethnic origin (European and Ashkenazi Jewish [AJ] descent and a minority of Hispanic and Black individuals). Patients and controls in the Sheba cohort, which was recruited in Israel, are of full AJ ancestry (all four grandparents are AJ). The Pavlov and Human Brain institutes cohort was collected from the North-Western region of Russia and is mainly of East-European ancestry. Additionally, we performed the analysis in the Accelerating Medicines Partnership – Parkinson Disease (AMP-PD) initiative cohorts (https://amp-pd.org/; detailed in the Acknowledgment) and the UK Biobank (UKBB) cohort. We only included participants of European ancestry from both cohorts, and we excluded any first- and second-degree relatives from the analysis. All PD patients were diagnosed by movement disorder specialists according to the UK Brain Bank criteria18 or the MDS clinical diagnostic criteria19. The ethics committee of McGill University gave ethical approval for this work.

Standard protocol approvals, registrations, and patient consents

All local IRBs approved the protocols and informed consent was obtained from all individual participants before entering the study.

Targeted next-generation sequencing by molecular inversion probes

The entire coding sequence of the SYNJ1 gene, including exon-intron boundaries (±50 bp) and the 5′ and 3′ untranslated regions (UTRs), was targeted using molecular inversion probes (MIPs) as described earlier20. The full protocol is available at https://github.com/gan-orlab/MIP_protocol. The library was sequenced on the Illumina NovaSeq 6000 SP PE100 platform at the Genome Quebec Innovation Centre. Reads were aligned to the hg19 reference genome21, with SYNJ1 coordinates chr21:34,001,069-34,100,351. The Genome Analysis Toolkit (GATK, v3.8) was used for post-alignment quality checking and variant calling22. We applied standard quality control procedures as described before23. In brief, using PLINK version 1.9 and GATK v3.8, we carried out quality control by eliminating variants and samples of poor quality. SNPs and samples with a genotyping rate lower than 90% were excluded. The analyses only included variants with a minimum depth of coverage of 30x, a MAF of less than 1% and a minimum genotype quality (GQ) score of 30.
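The final inclusion thresholds (depth of coverage of at least 30x, GQ of at least 30, MAF below 1%) can be sketched as a simple filter. This is an illustration only: the actual filtering was done with GATK and PLINK, where depth and GQ apply per genotype and MAF per variant, whereas the hypothetical `VariantCall` record below flattens these into one row. All names and values in the example are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class VariantCall:
    """One variant record, flattened for illustration (hypothetical schema)."""
    variant_id: str
    depth: int   # depth of coverage at the site
    gq: int      # genotype quality (GQ)
    maf: float   # minor allele frequency in the cohort


def passes_qc(call: VariantCall, min_depth: int = 30, min_gq: int = 30,
              max_maf: float = 0.01) -> bool:
    """Keep calls with depth >= 30x, GQ >= 30, and MAF below 1%."""
    return call.depth >= min_depth and call.gq >= min_gq and call.maf < max_maf


def filter_calls(calls: Iterable[VariantCall]) -> List[VariantCall]:
    """Return only the calls that satisfy all three thresholds."""
    return [call for call in calls if passes_qc(call)]


# Made-up records (assumption, not study data).
example_calls = [
    VariantCall("var_a", depth=4000, gq=99, maf=0.001),  # kept
    VariantCall("var_b", depth=25, gq=99, maf=0.001),    # dropped: depth < 30x
    VariantCall("var_c", depth=500, gq=20, maf=0.001),   # dropped: GQ < 30
    VariantCall("var_d", depth=500, gq=60, maf=0.05),    # dropped: MAF >= 1%
]
kept_ids = [call.variant_id for call in filter_calls(example_calls)]
```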

Detection of copy-number variations (CNVs)

To detect copy-number variations (CNVs) from MIPs, we utilized the ExomeDepth R package24, which has been validated for CNV detection in MIPs25. Briefly, ExomeDepth selects reference samples that show high correlation with the test samples and employs a hidden Markov model to call CNVs. We applied quality control measures to our MIPs library, excluding probes with less than 100x coverage and genes where over 10% of probes had insufficient coverage. Additionally, we removed low-quality samples with less than 50x coverage. CNVs in SYNJ1 were called following parameters from previous MIPs analyses25, including GC correction and a transition probability of 1e-06. The reference set consisted solely of control samples, and test samples were excluded when calling CNVs in the control group.

Whole-exome and whole-genome sequencing data

The genetic data in AMP-PD and UKBB were aligned to the human reference genome hg38 and we used the appropriate coordinates to extract the SYNJ1 data (chr21:32,628,759-32,728,039). Quality control procedures were performed as previously described in detail for AMP-PD26 and UKBB27 cohorts.
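Because the MIP cohorts use hg19 coordinates and AMP-PD and UKBB use hg38, a position must be tested against the interval from the matching genome build. The sketch below parses the two SYNJ1 intervals given in the Methods and checks membership; the helper names are hypothetical, and treating the intervals as 1-based and inclusive is an assumption.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


def parse_region(text: str) -> Region:
    """Parse a string such as 'chr21:32,628,759-32,728,039'."""
    chrom, span = text.split(":")
    start_text, end_text = span.replace(",", "").split("-")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError("region start must not exceed region end")
    return Region(chrom, start, end)


def in_region(region: Region, chrom: str, position: int) -> bool:
    """True if the position lies inside the region (inclusive bounds assumed)."""
    return chrom == region.chrom and region.start <= position <= region.end


# Intervals as stated in the Methods for each genome build.
SYNJ1_HG19 = parse_region("chr21:34,001,069-34,100,351")
SYNJ1_HG38 = parse_region("chr21:32,628,759-32,728,039")

# An arbitrary hg38 position inside the gene interval (illustrative only).
inside_hg38 = in_region(SYNJ1_HG38, "chr21", 32_700_000)
# The same number is outside the hg19 interval, so builds must not be mixed.
inside_hg19 = in_region(SYNJ1_HG19, "chr21", 32_700_000)
```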

Statistical analysis

To analyze rare variants (MAF < 0.01), we applied the optimized sequence kernel association test (SKAT-O, R package)28 in each cohort separately, followed by meta-analysis using the MetaSKAT package29. We performed separate analyses for all rare variants, non-synonymous variants, and variants with high pathogenicity scores of ≥ 20 (CADD v1.6)30. For the domain-based analysis, the boundaries of each domain were set to the widest interval obtained by combining estimates from publicly available domain annotation resources. These resources are SuperFamily, Pfam, Smart, Gene3D, PANTHER, Conserved Domains Database, and PROSITE31,32,33,34. We adjusted for sex and age in all analyses. For the Columbia cohort specifically, we also adjusted for ethnicity, as this cohort includes individuals of European, AJ, Hispanic, and Black descent. FDR correction was applied to all p-values.
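The rule for domain boundaries, taking the widest interval across annotation resources, amounts to using the smallest reported start and the largest reported end for each domain. A minimal sketch follows; the function name is hypothetical, and the coordinates are placeholders rather than the actual SYNJ1 domain positions.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[int, int]


def widest_interval(estimates: Iterable[Interval]) -> Interval:
    """Smallest start and largest end across all estimates for one domain."""
    estimate_list: List[Interval] = list(estimates)
    if not estimate_list:
        raise ValueError("at least one boundary estimate is required")
    starts = [start for start, _ in estimate_list]
    ends = [end for _, end in estimate_list]
    return min(starts), max(ends)


# Placeholder amino-acid coordinates from three unnamed resources
# (assumption for illustration, not real annotations).
example_estimates: Dict[str, List[Interval]] = {
    "domain_x": [(100, 300), (90, 280), (110, 320)],
    "domain_y": [(500, 800), (520, 790)],
}
example_boundaries = {
    name: widest_interval(intervals)
    for name, intervals in example_estimates.items()
}
```

Variants would then be assigned to a domain when their position falls within its combined interval, and SKAT-O repeated on each subset.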