In recent years, evidence on the role of UQCRC1 in Parkinson’s disease (PD) has been inconsistent across cohort studies. Two consecutive studies by Lin CH et al., published in 2019 and 2020 (refs. 1,2), identified an association between the rare variant UQCRC1 c.941 A > C (p.Tyr314Ser, minor allele frequency ≤ 0.0001) and early-onset familial PD in a Taiwanese cohort of 324 patients with early-onset PD (onset age < 50 years) and 247 PD pedigrees, followed by an extended analysis of a further 699 unrelated probands with autosomal-dominant PD and 1934 patients with sporadic PD. The extended analysis identified two further UQCRC1 variants among the probands with familial PD: c.931 A > C (p.Ile311Leu), and an allele carrying a splicing variant (c.70-1 G > A) together with a frameshift insertion (c.73_74insG, p.Ala25Glyfs*27). The authors then carried out a series of functional experiments using a CRISPR/Cas9-based knock-in technique and found that the p.Tyr314Ser variant, in human dopaminergic SH-SY5Y cell lines as well as in Drosophila and mouse models, disrupts the function of mitochondrial respiratory chain complex III, leading to peripheral neuropathy1,2.

However, several studies failed to replicate the association between UQCRC1 and PD in the Chinese mainland population. Lin et al. did not find an association between sporadic PD and UQCRC1 in 452 sporadic PD cases and 450 healthy controls from Eastern China3. Zhao et al. also found no association in 3274 unrelated samples, comprising probands from 477 PD families, 1440 patients with sporadic early-onset PD (onset age ≤ 50 years), and 1357 ethnicity-matched controls. Notably, they identified no carriers of loss-of-function variants in UQCRC1 and only a few carriers of missense (n = 48) and damaging missense (n = 6) variants4. Screening UQCRC1 variants in 913 Chinese patients with early-onset PD, Wang et al. found no excess burden of rare UQCRC1 variants5. Because the association between UQCRC1 and PD was first identified in a Taiwanese cohort, Liao and colleagues attempted to replicate it in 107 Taiwanese patients (98 with early-onset PD and nine with familial PD)6. They identified three missense variants along with seven rare variants, and found no significant difference in missense-variant carrier frequency between their cohort and individuals in the Taiwan Biobank6. Beyond these studies in Chinese populations, there have also been efforts to replicate the association in Europeans. Senkevich et al. performed a variety of burden analyses using whole-genome sequencing data from the Accelerating Medicines Partnership-Parkinson’s disease initiative, including 1647 patients with PD and 1050 controls, but found no association between rare variants in UQCRC1 and PD7. Courtin et al. sequenced 241 PD patients but did not find disease-causing variants in UQCRC1 in their population8. Thus, apart from the two studies by Lin et al.1,2, all follow-up studies failed to replicate the association between UQCRC1 and PD risk.

However, these studies had some limitations. Firstly, their sample sizes were small. Due to the rarity of UQCRC1 variants, it would be hard to identify them in such studies. Secondly, all of these studies are based on PD patient cohorts. Evidence from large-scale, general population-based cohorts is still absent. To overcome these limitations, we systematically explored the role of UQCRC1 in PD in the UK Biobank, a large-scale prospective population cohort with a half-million participants.

We conducted our analysis on 434,328 European samples with whole-exome sequencing data9, using the same approach as in our previous study (“Methods”)10. Among these samples, there were 3856 PD cases (0.89%), ascertained from several sources: self-report, hospital admission records (primary and secondary conditions), and death records (primary and contributory causes). The median age at PD diagnosis was 69.06 years (interquartile range (IQR): 64.09–75.65), and 2395 cases (62%) were male. We performed gene-burden testing with both BOLT-LMM and SKAT for rare variants (minor allele frequency (MAF) < 0.1%) in UQCRC1, using a mask of high-confidence (HC) protein-truncating variants (PTVs; stop-gained, frameshift, splice acceptor, and splice donor variants) as defined by LOFTEE, and missense variant masks annotated by different algorithms: CADD (>25), REVEL (>0.7 and >0.5), and AlphaMissense (>0.9, >0.7, and >0.56). We also estimated odds ratios (ORs) by logistic regression. We identified 32 HC PTVs in UQCRC1 with 120 carriers; across the missense masks, the number of variants ranged from 12 to 120 and the number of carriers from 36 to 990 (Table 1, Supplementary Table 1). We observed a significant association between HC PTVs in UQCRC1 and a higher risk of PD (BETA_BOLT-LMM = 0.04, SE_BOLT-LMM = 0.01, P_BOLT-LMM = 1.20 × 10^−6, OR_GLM = 6.59 [2.84, 15.27], P_GLM = 1.12 × 10^−5, P_SKAT = 1.17 × 10^−6, P_SKAT-O = 4.34 × 10^−8). PD prevalence was 5% among UQCRC1 HC PTV carriers (6/120), compared with 0.89% among non-carriers. Conversely, 0.16% (n = 6) of PD cases carried a qualifying UQCRC1 HC PTV, compared with 0.03% of controls (Table 1). All six of these PD cases were confirmed by hospital records (secondary diagnosis), and none of them reported a history of PD in their father, mother, or siblings. The six PD cases carried five distinct UQCRC1 HC PTVs (one frameshift variant, three stop-gained variants, and one splice acceptor variant; Table 2). The average genotype quality (GQ) and depth of coverage (DP) of the variants carried by these six PD cases were 38.8 (range: 33–41) and 65.7 (range: 43–87), respectively. Two carriers shared the same variant; we checked their relatedness and found a genetic relationship coefficient of 0, indicating that they are unrelated. As a sensitivity check, we also repeated the analysis after excluding all carriers of reported pathogenic/likely pathogenic PD risk variants (n = 5928) and the 27 pairs of first-degree relatives. The association between UQCRC1 PTVs and PD remained significant, with smaller P values than in the primary analysis (BETA_BOLT-LMM = 0.042, SE_BOLT-LMM = 0.001, P_BOLT-LMM = 7.50 × 10^−7, OR_GLM = 6.79 [2.92, 15.77], P_GLM = 8.37 × 10^−6, P_SKAT = 9.86 × 10^−7, P_SKAT-O = 1.42 × 10^−8). None of the six PD cases carrying PTVs carried a reported pathogenic/likely pathogenic PD variant. We did not observe a significant association between UQCRC1 missense variants and PD in either the primary analysis or the sensitivity check.
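The carrier counts reported above are enough to work out a crude, unadjusted odds ratio by hand, which can be a useful sanity check on the direction and rough size of the effect. The sketch below is illustrative only: it uses a plain 2×2 table with a Wald confidence interval, not the covariate-adjusted logistic regression described in the Methods, so its result is expected to differ somewhat from the reported OR_GLM. The function name and argument names are hypothetical.

```python
import math


def crude_odds_ratio(case_carriers, total_carriers, total_cases, total_samples, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2x2 carrier-by-disease table."""
    a = case_carriers                      # carriers with PD
    b = total_carriers - case_carriers     # carriers without PD
    c = total_cases - case_carriers        # non-carriers with PD
    d = total_samples - total_cases - b    # non-carriers without PD
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return {
        "prevalence_carriers": a / (a + b),
        "prevalence_non_carriers": c / (c + d),
        "odds_ratio": odds_ratio,
        "ci_lower": lower,
        "ci_upper": upper,
    }


# Counts taken from the text: 6 of 120 HC PTV carriers had PD,
# among 3856 PD cases in 434,328 samples.
result = crude_odds_ratio(6, 120, 3856, 434_328)
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

Because the crude estimate ignores age, sex, sequencing batch and ancestry components, it should be read only as a plausibility check on the adjusted estimate.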

Table 1 Results of gene burden testing for different gene masks of UQCRC1
Table 2 Characteristics of the 6 PD cases who carried UQCRC1 HC PTVs

This is the first study to systematically explore the role of UQCRC1 in Parkinson’s disease in a large-scale population-based cohort. We found that HC PTVs in UQCRC1 are extremely rare, with a carrier frequency of 0.03% in this general population. Although the frequency of UQCRC1 HC PTV carriers was higher in PD patients, it was still only 0.16%. Hence, carriers would have been hard to identify in previous studies, whose sample sizes ranged from 107 to 3274. As mentioned above, Zhao et al. found no LOF variant carriers, only missense variant carriers, in their study cohort4. The null association between predicted-deleterious missense variants in UQCRC1 and PD risk in our study is consistent with their observations. To further validate their observations, we attempted to extract the missense variants identified in their cohort from the UK Biobank. Of the 24 missense variants, 15 were present in the UK Biobank. Two of these (p.Arg269His and p.Glu435Lys) had a minor allele frequency (MAF) > 0.1%. We performed the same association test for each of these two variants individually and conducted a burden test combining the remaining variants with MAF ≤ 0.1%. No significant associations were observed: for p.Arg269His, P = 0.30, OR = 0.93 [0.82, 1.07], 26,283 carriers (221 PD cases); for p.Glu435Lys, P = 0.82, OR = 1.08 [0.58, 2.02], 1100 carriers (10 PD cases); and for the burden test, P = 0.91, 103 carriers (0 PD cases). Additionally, most of the above-mentioned cohorts consisted mainly of early-onset PD cases. In our data, however, the mean age at diagnosis of the PD cases carrying UQCRC1 HC PTVs was 73 years, which did not differ significantly from that of non-carriers (P_t-test = 0.30), and none of them had early-onset PD.

We propose that UQCRC1 PTVs may drive Parkinson’s pathogenesis through distinct mechanisms: (1) haploinsufficiency from monoallelic loss reduces functional protein below the threshold for proper Complex III assembly, impairing oxidative phosphorylation11; (2) truncated variants retaining interaction domains disrupt wild-type protein incorporation, exacerbating ROS via dominant-negative effects12,13; (3) together, these destabilise mitochondrial membrane potential and impair PINK1/Parkin-mediated mitophagy. Under this model, phenotypic severity would depend on variant position: N-terminal truncations would cause haploinsufficiency, while C-terminal variants would exert stronger dominant-negative effects. These defects could propagate neurodegeneration through bioenergetic failure, oxidative stress, and neuroinflammation (via mtDNA release), synergising with α-synuclein pathology14,15. Isogenic models are needed to dissect the contribution of each mechanism.

Our study also has some limitations. Firstly, the reported healthy-participant bias of the UK Biobank could lead to an underestimate of the carrier frequency of UQCRC1 variants. Secondly, because the number of PD cases carrying these variants is small and no other cohort of the same scale as the UK Biobank was available, we could not validate our findings in an independent dataset. As most participants in the UK Biobank are white British, the finding also needs to be replicated in more diverse populations in future studies. Finally, we report the strong association between UQCRC1 HC PTVs and PD purely on the basis of statistical tests. The underlying mechanisms need to be elucidated by future experimental work.

In summary, based on the data from a large-scale prospective population-based cohort, we found a significant association between rare PTVs in UQCRC1 and a higher risk of PD, supporting the link between UQCRC1 and PD. We also provided a plausible explanation for why the association between UQCRC1 and PD has been hard to replicate in previous studies. Future genetic studies based on large-scale prospective population cohorts are needed to validate our findings and functional assays are needed to decipher the mechanism of UQCRC1 in PD.

Methods

Ethics

Our research adheres to all pertinent ethical guidelines. All studies encompassed within this research were sanctioned by the appropriate board or committee. The UK Biobank has received approval from the NorthWest Multi-centre Research Ethics Committee (REC reference 13/NW/0157) as a Research Tissue Bank (RTB), and each participant has given informed consent. This approval signifies that researchers are not required to obtain separate ethical clearance and can operate under the umbrella of RTB approval. This RTB approval was initially granted in 2011 and is subject to renewal every five years; consequently, the UK Biobank has successfully renewed its approval in 2016 and 2021.

UK Biobank data processing and quality control

This study is based on the same approach as our previous study10. We used the algorithmically defined Parkinson’s disease outcomes (Data-Field 42032 and Data-Field 42033) provided by UK Biobank. There are 830 self-reported only cases, 0 hospital admission cases, 0 death only cases, 226 hospital primary cases, 73 death primary cases, 3290 hospital secondary cases and 87 death contributory cases. Most UK Biobank participants were healthy at recruitment, and disease cases were gradually reported during nearly 20 years of follow-up. Supplementary Table 2 shows the occurrence of 1124 newly reported broad disease types, as defined by ICD-10, among individuals without Parkinson’s disease. For the whole-exome sequencing (WES) data, we queried WES data from 469,835 individuals in UK Biobank9, excluding those with excess heterozygosity, those with ≥5% autosomal variant missingness on genotyping arrays, or those not included in the subset of phased samples as defined by Bycroft et al.16 WES data were stored as population-level variant call format (VCF) files, aligned to GRCh38 and accessed through the UK Biobank Research Analysis Platform (RAP). In addition to the quality control measures already applied to the released data, as described by Backman et al.9, we conducted several additional quality control procedures. First, we used ‘bcftools v1.14 norm’17 to split multiallelic sites and to left-align and normalise indels. Next, we filtered out genotypes that failed our quality control criteria, including those with: (1) read depth of <7; (2) genotype quality of <20; and (3) binomial test P value for alternative allele reads versus reference allele reads of ≤0.001 for heterozygous genotypes. For indel genotypes, we kept only those with read depth of ≥10 and genotype quality of ≥20. Genotypes that failed quality control criteria were marked as missing (i.e., ./.). After filtering, variants with more than 50% missing genotypes were excluded from downstream analyses18.
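The genotype-level filters described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline actually used: the function names are hypothetical, the allele-balance check is assumed to be a two-sided exact binomial test against an expected alternate-allele fraction of 0.5, and it is assumed to apply to heterozygous indel genotypes as well as SNVs.

```python
from math import comb


def het_allele_balance_p(ref_reads, alt_reads):
    """Two-sided exact binomial P value for ref vs alt read counts (null fraction 0.5)."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    k = min(ref_reads, alt_reads)
    # With a null of 0.5 the distribution is symmetric, so double the smaller tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def genotype_passes_qc(dp, gq, ref_reads, alt_reads, is_indel, is_het):
    """Apply the depth, genotype-quality and allele-balance thresholds from the text."""
    min_dp = 10 if is_indel else 7          # indels need read depth >= 10, SNVs >= 7
    if dp < min_dp or gq < 20:
        return False
    if is_het and het_allele_balance_p(ref_reads, alt_reads) <= 0.001:
        return False
    return True


def variant_passes_missingness(genotypes, max_missing=0.5):
    """Keep a variant unless more than 50% of its genotypes are missing ('./.')."""
    if not genotypes:
        return False
    missing = sum(1 for g in genotypes if g == "./.")
    return missing / len(genotypes) <= max_missing


# A balanced heterozygous SNV call passes; a heavily skewed one does not.
print(genotype_passes_qc(30, 40, 15, 15, is_indel=False, is_het=True))
print(genotype_passes_qc(40, 40, 38, 2, is_indel=False, is_het=True))
```

In a real workflow, genotypes failing `genotype_passes_qc` would be set to missing first, and `variant_passes_missingness` would then be evaluated on the resulting calls.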

The remaining variants were annotated using Ensembl Variant Effect Predictor (VEP v104)19 with the ‘--everything’ flag and additional plugins for REVEL20, CADD21, and LOFTEE22. The AlphaMissense scores were downloaded from https://github.com/deepmind/alphamissense (ref. 23). For each variant, a single Ensembl transcript was prioritised on the basis of whether the annotated transcript was protein-coding, MANE Select v0.9724 or the VEP canonical transcript. The individual consequences of each variant were then prioritised based on their severity as defined by VEP. Stop-gained, frameshift, splice acceptor and splice donor variants were merged into a combined PTV category, while annotations for missense and synonymous variants were adopted directly from VEP. We defined a subset of ‘white European’ ancestry samples using a k-means-clustering approach that was applied to the first four principal components calculated from genome-wide SNP genotypes. Individuals clustered into this group who self-identified by questionnaire as being of an ancestry other than white European were excluded. Our analyses focused primarily on individuals of ‘white European’ ancestry, and we excluded those who withdrew consent from the study, resulting in a final cohort of 434,328 individuals.

We downloaded all 625 reported pathogenic/likely pathogenic PD variants from the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/?term=parkinson%20disease) and then extracted them from the UK Biobank. In total, 135 of the 625 reported pathogenic/likely pathogenic PD variants had at least one carrier in UK Biobank (Supplementary Table 3). The total number of carriers was 5928, of whom 76 were PD cases. We identified all possible first-degree relatives as pairs of samples with a coefficient > 0.49 in the Genetic Relationship Matrix (GRM) computed across all samples with WES data.
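The two exclusion sets used in the sensitivity check can be illustrated as follows. This is a sketch with hypothetical function names and toy sample identifiers; it assumes the GRM is available as a mapping from sample pairs to relationship coefficients, and that both members of each flagged pair are removed (the text states that the pairs were excluded but does not spell out the exact handling).

```python
def first_degree_pairs(grm_pairs, threshold=0.49):
    """Return sample pairs whose GRM coefficient exceeds the threshold."""
    return [pair for pair, coef in grm_pairs.items() if coef > threshold]


def sensitivity_samples(all_samples, clinvar_carriers, relative_pairs):
    """Drop ClinVar P/LP variant carriers and both members of each relative pair."""
    excluded = set(clinvar_carriers)
    for sample_a, sample_b in relative_pairs:
        excluded.update((sample_a, sample_b))
    return [s for s in all_samples if s not in excluded]


# Toy example with made-up identifiers and coefficients.
grm = {("S1", "S2"): 0.51, ("S1", "S3"): 0.02, ("S4", "S5"): 0.25}
pairs = first_degree_pairs(grm)
kept = sensitivity_samples(["S1", "S2", "S3", "S4", "S5", "S6"], {"S6"}, pairs)
print(pairs, kept)
```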

Gene-burden testing in UK Biobank

We used BOLT-LMM v2.3.6 (ref. 25) as our primary analytical tool to conduct the gene-burden test for UQCRC1. To run BOLT-LMM, we first queried a set of genotypes with minor allele count (MAC) > 100, derived from the genotyping arrays for the individuals with WES data, to build the null model. Because BOLT-LMM requires imputed genotyping data as input rather than per-gene carrier status, we created dummy genotype files in which each gene was represented by a single variant. We coded individuals with a qualifying variant within a gene as heterozygous, regardless of the total number of variants they carried in that gene. HC PTVs were defined as stop-gain, frameshift, or canonical splice-site variants rigorously filtered by LOFTEE to exclude false positives. We created dummy genotypes for variants with MAF < 0.1% in each of the following masks: HC PTVs as defined by LOFTEE, and missense variants with CADD (>25), REVEL (>0.7 and >0.5) and AlphaMissense (>0.9, >0.7, and >0.56). We then used BOLT-LMM to analyse phenotypes using default parameters, except for the inclusion of the ‘lmmInfOnly’ flag. In addition to the dummy genotypes, we included all individual markers in the WES data to generate association test statistics for individual variants. We used age, age², sex, age*sex, the WES release batch (50k, 200k, 450k, 470k) and the first 20 principal components (PCs) as calculated by Bycroft et al. as covariates16. We also conducted SKAT analyses using the R package SKAT v2.25 (ref. 26), based on the same variant filtering and covariates as the BOLT-LMM framework. To estimate odds ratios, we ran logistic regression using the R function “glm” with the same settings.
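The mask definitions and the collapsing of carriers into one dummy genotype per gene can be sketched as below. This is an illustration under assumptions rather than the code used in the study: the `Variant` class, the mask names (for example `HC_PTV`, `MISS_REVEL0.5`) and the input layout are hypothetical, the consequence strings follow the Sequence Ontology terms reported by VEP, and the thresholds are those stated in the text.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set

PTV_TERMS = {"stop_gained", "frameshift_variant",
             "splice_acceptor_variant", "splice_donor_variant"}
MAX_MAF = 0.001  # rare variants only: MAF < 0.1%


@dataclass
class Variant:
    variant_id: str
    consequence: str                    # most severe VEP consequence
    maf: float
    loftee_hc: bool = False             # LOFTEE high-confidence flag
    cadd: Optional[float] = None
    revel: Optional[float] = None
    alphamissense: Optional[float] = None


def masks_for(v: Variant) -> Set[str]:
    """Return the names of all burden masks a variant qualifies for."""
    masks: Set[str] = set()
    if v.maf >= MAX_MAF:
        return masks
    if v.consequence in PTV_TERMS and v.loftee_hc:
        masks.add("HC_PTV")
    if v.consequence == "missense_variant":
        if v.cadd is not None and v.cadd > 25:
            masks.add("MISS_CADD25")
        for cut in (0.7, 0.5):
            if v.revel is not None and v.revel > cut:
                masks.add(f"MISS_REVEL{cut}")
        for cut in (0.9, 0.7, 0.56):
            if v.alphamissense is not None and v.alphamissense > cut:
                masks.add(f"MISS_AM{cut}")
    return masks


def dummy_genotypes(variants: Iterable[Variant],
                    carriers_by_variant: Dict[str, List[str]],
                    samples: List[str]) -> Dict[str, Dict[str, int]]:
    """Collapse qualifying variants into one 0/1 dummy genotype per mask."""
    dosage: Dict[str, Dict[str, int]] = {}
    for v in variants:
        for mask in masks_for(v):
            row = dosage.setdefault(mask, {s: 0 for s in samples})
            for sample in carriers_by_variant.get(v.variant_id, ()):
                if sample in row:
                    row[sample] = 1  # heterozygous, however many variants are carried
    return dosage


variants = [
    Variant("v1", "stop_gained", 1e-5, loftee_hc=True),
    Variant("v2", "frameshift_variant", 2e-5, loftee_hc=True),
    Variant("v3", "missense_variant", 1e-4, cadd=26.0, revel=0.6, alphamissense=0.95),
]
carriers = {"v1": ["A"], "v2": ["A", "B"], "v3": ["C"]}
print(dummy_genotypes(variants, carriers, ["A", "B", "C", "D"]))
```

Sample A carries two qualifying PTVs in this toy input but is still coded as a single heterozygous carrier, which mirrors the collapsing rule described above.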