Introduction

Parkinson’s disease (PD) is a common degenerative disease of the nervous system that is clinically characterized by resting tremor, bradykinesia, muscle rigidity, and postural instability1,2. PD has a high incidence, high disability rate, and poor prognosis in middle-aged and older individuals3,4. A 2021 meta-analysis5 of studies from China included 24,117 people aged > 60 years and found that the prevalence of PD in the older population was 1.37% (95% confidence interval [CI] 1.02–1.73%). PD is a chronic progressive disease whose incidence increases with age, and there is currently no cure. Because PD damages motor function, patients with late-stage disease lose the ability to care for themselves, have difficulty moving, and often die from complications such as respiratory or circulatory failure6,7. Together, these factors seriously affect patients’ quality of life and increase the burden on their families and society8,9. Therefore, it is necessary to identify risk factors for PD in the early stages and initiate treatment as soon as possible to delay disease progression.

The main pathological feature of PD is extensive loss of dopaminergic neurons in the substantia nigra, which is associated with enhanced intracellular oxidative stress and mitochondrial dysfunction10,11. A previous pathological report12 confirmed that the concentration of glutathione in the substantia nigra, which has antioxidant effects, decreases, whereas that of lipid peroxides increases. No such changes were observed in other parts of the brain. Therefore, the metabolism of cysteine, glutamic acid, and aminoacetic acid is closely associated with oxidative stress in the substantia nigra13,14. These amino acids also participate in signal transmission as neurotransmitters, and any change leading to an imbalance in neurotransmitters could alter the metabolic pathways in the brain, potentially leading to diseases of the central nervous system. Another study15 suggested that homocysteine directly damages neurons and participates in the degeneration of dopaminergic neurons in patients with PD. In a cohort study by Figura et al.16, the serum amino acid metabolism of patients with PD at different stages of progression was compared, and significant differences were found between groups for alanine, arginine, phenylalanine, and threonine, indicating that changes in the metabolism of these amino acids are related to the occurrence and progression of PD. However, there is currently no consensus on which amino acids independently affect the development of PD.

The expression levels of blood amino acids (BAAs) show dynamic fluctuations during synthesis and decomposition, which are related to various factors (including genetics)17,18. Wiklund et al.19 reported that the serum leucine and isoleucine levels in children can be used to predict hypertriglyceridemia in early adulthood and serve as markers to identify populations at high risk for cardiovascular disease. A study by Hu et al.20 suggested that PPM1K rs1440581 and rs7678928 single nucleotide polymorphisms (SNPs) are associated with elevated levels of serum branched chain amino acids, which could increase the risk of cardiovascular disease. Because of the known genetic pattern of randomly assigning parental alleles to offspring, some genotypes can be used to determine the serum amino acid phenotype and infer the relationship between the phenotype and disease occurrence21,22,23. However, research on this causal relationship is still very rare at present.

In this study, we used genome-wide association study (GWAS) summary data on BAAs and PD to identify key SNPs and conducted a two-sample Mendelian randomization (MR) study to determine the impact of the levels of specific BAAs on the pathogenesis of PD, in search of biomarkers and intervention targets for the early stages of PD.

Results

Roadmap of the analysis process

The roadmap is shown in Fig. 1.

Fig. 1

A. Schematic diagram of multivariate Mendelian randomization and mediation analysis. The basic assumptions of Mendelian randomization analysis are (1) the relevance assumption, which states that the selected instrumental variable must be significantly correlated with the exposure factor; (2) the independence assumption, which states that instrumental variables must have no significant correlation with potential confounding factors that may affect the exposure or outcome; and (3) the exclusion restriction, which states that instrumental variables can affect the outcome only through the path “instrumental variable → exposure → outcome”. B. The analytic process in this study. (SNP, single nucleotide polymorphism; IVW, inverse variance weighted; MR, Mendelian randomization; GWAS, genome-wide association study; LD, linkage disequilibrium.)

Instrumental variable screening

SNP sites related to 86 BAAs were selected from the compiled SNP data for preliminary causal analysis. After matching the GWAS data of PD (finn-b-G6_PARKINSON), the effect values of all instrumental variables were obtained. After normalization, the instrumental variables associated with BAAs and PD were obtained and included in the MR analysis. The specific instrumental variable data for the BAA exposure factors are shown in Table 1, which includes only significant indicators (P < 0.05). The F-statistics of the instrumental variables were all > 10, indicating that the SNPs selected in this study were strong instrumental variables with minimal weak-instrument bias.

Table 1 Instrumental variable screening of blood amino acid level and Parkinson’s disease and instrumental variable strength F test.

MR causal effect estimation of the BAA index and PD

Five models were used for the calculation, but only the inverse variance weighted (IVW) model showed a significant causal relationship between the BAA indices and the occurrence of PD: met-a-308 (phenylalanine) (odds ratio [OR] = 0.001, 95% confidence interval [CI]: 0.000–0.106, P = 0.00478), met-a-337 (5-hydroxyproline) (OR = 4.177, 95% CI: 1.125–15.518, P = 0.0327), and met-a-584 (X-12100 hydroxytryptophan) (OR = 0.276, 95% CI: 0.082–0.934, P = 0.0385) (Table 2). The MR results are presented in a forest plot (Fig. 2a). Figure 2b–d shows the linear relationship between the instrumental variables of the BAA indices and the occurrence of PD. The effect size of met-a-337 estimated by the simple mode deviated markedly from that of the IVW model, whereas the effect sizes of met-a-308 and met-a-584 estimated by the other four models were consistent with those of the IVW model.

Table 2 Results of Mendelian randomization analysis of blood amino acid index on the incidence of Parkinson’s disease (PD).
Fig. 2

Mendelian randomization analysis of the causal relationship between blood amino acid indices and Parkinson’s disease (PD). (a) Forest plot of the results of the Mendelian randomization analysis of blood amino acid indices and PD. (b–d) Scatter plots of the results of the Mendelian randomization analysis between the blood amino acid indices met-a-308 (b), met-a-337 (c), and met-a-584 (d) and PD. (e–g) Funnel plots showing the heterogeneity test of the results of the Mendelian randomization analysis of causality between the blood amino acid indices met-a-308 (e), met-a-337 (f), and met-a-584 (g) and PD. (h–j) Forest plots of the effect estimates from the single-SNP analysis of the causal relationship between the blood amino acid indices met-a-308 (h), met-a-337 (i), and met-a-584 (j) and PD.

Sensitivity analysis

Cochran’s Q-test was used to detect heterogeneity in the IVW model results. The P-values for met-a-308, met-a-337, and met-a-584 were 0.651, 0.545, and 0.636, respectively, all of which were > 0.05, indicating no significant heterogeneity (Table 3).

Table 3 Cochran’s Q test of heterogeneity for the Mendelian randomization analysis of blood amino acid index on the incidence of Parkinson’s disease (PD).

We also used MR-Egger regression to test for horizontal pleiotropy among the instrumental variables of the BAAs (met-a-308, met-a-337, and met-a-584). The P-values for each indicator were > 0.05, and the intercepts were close to 0, indicating that the causal effects of the BAA indices on the development of PD were not affected by horizontal pleiotropy (Table 4).

Table 4 Horizontal pleiotropy test for Mendelian randomization analysis of blood amino acid index on the incidence of Parkinson’s disease (PD).

Funnel plots of the instrumental variables for the BAA indices are shown in Fig. 2e–g. The funnel plot of met-a-308 (Fig. 2e) was not symmetrically distributed, indicating a potential bias in the results. The funnel plots of met-a-337 (Fig. 2f) and met-a-584 (Fig. 2g) were symmetrical, indicating no potential bias in the results for these two indices.

Through leave-one-out analysis, each instrumental variable was removed individually to analyze the causal effects of met-a-308, met-a-337, and met-a-584 on the occurrence of PD (Fig. 2h–j). The total effects of the met-a-308, met-a-337, and met-a-584 instrumental variable sets did not show significant deviations.

Reverse causal assessment of the risk of PD on the BAA index

We used the same method to screen the SNP sites, and nine instrumental variables were included. Five models were used for the reverse analysis, and the IVW model results showed no significant causal relationship between the occurrence of PD and the BAA indices: met-a-308 (OR = 1.0069, 95% CI: 0.998–1.016, P = 0.127), met-a-337 (OR = 0.995, 95% CI: 0.977–1.013, P = 0.5616), and met-a-584 (OR = 1.005, 95% CI: 0.978–1.032, P = 0.7249). The occurrence of PD did not result in a decrease in the value of the BAA indices (met-a-308, met-a-337, and met-a-584) (Table 5; Fig. 3).

Table 5 Results of Mendelian randomization analysis of the effect of Parkinson’s disease (PD) on blood amino acid index.
Fig. 3

Mendelian randomization analysis of the causal effect of Parkinson’s disease (PD) on blood amino acid indices. Forest plot of the results of the Mendelian randomization analysis of PD on the blood amino acid indices met-a-308, met-a-337, and met-a-584.

Multivariate MR analysis of the BAA index in the pathogenesis of PD

We first conducted univariate MR analysis of BAA (met-a-308, met-a-337, and met-a-584) levels, and 61 PD-related exposure factors were identified (Table S1). The factors with a significant causal relationship (P < 0.05) with the BAAs were ukb-a-469, ukb-b-11361, ukb-b-14461, ukb-b-18700, ukb-b-18786, ukb-b-12227, ukb-d-30270_irnt, ukb-d-1508_3, and ukb-d-2654_2. These factors were included in multivariate MR studies combined with the BAA (met-a-308, met-a-337, and met-a-584) levels (Table 6). As shown in Table 6, we obtained nine multivariate MR models that met the requirements (all multivariate MR model results are shown in Table S4). The indirect effects of the exposure factors were adjusted for in the nine multivariate MR models, and the results showed that the BAA (met-a-308, met-a-337, and met-a-584) levels had a significant and direct impact on the onset of PD.

Table 6 Results of multivariable Mendelian randomization analysis of blood amino acid index on the incidence of Parkinson’s disease (PD).

Mediation effect analysis

Mediating effects of PD-related exposure factors on BAA levels were estimated with 95% CIs in all nine multivariate MR models (0.501, 0.805, 0.434, 0.041, -0.042, -0.089, -0.111, -0.016, and -0.182) (Table 7). However, the 95% CIs of the mediating effects included 0; thus, they were not significant, indicating that none of the nine models showed a mediating effect.

Table 7 The mediating effect of Mendelian randomization of blood amino acid levels mediated by Parkinson’s disease (PD)-related exposure factors on Parkinson’s disease.

Discussion

In this MR study, the levels of three BAAs were found to have a significant causal relationship with the development of PD. Among them, the high-expression levels of met-a-308 (phenylalanine) and met-a-584 (X-12100 hydroxytryptophan) could reduce the risk of PD and were protective factors, while that of met-a-337 (5-hydroxyproline) increased the risk of PD and was a risk factor. We also found that the levels of other amino acids, such as aspartic acid, serine, lysine, and cysteine, were not independent exposure factors that could affect the risk of developing PD.

Phenylalanine (chemical name 2-amino-3-phenylpropanoic acid) is an organic compound and one of the essential amino acids in the human body. It is converted to tyrosine by phenylalanine hydroxylase, participates in the synthesis of neurotransmitters and hormones, and plays an important role in sugar and fat metabolism24,25,26. However, high concentrations in the cerebrospinal fluid, as occur in phenylketonuria, are neurotoxic and can ultimately lead to cognitive impairment. High concentrations of phenylalanine in the plasma may also affect the blood-brain barrier transport of other neutral amino acids, leading to a deficiency in other types of amino acids in the brain27. Phenylalanine may also be involved in the synthesis of serotonin and catecholamines, which affect brain function28. Further, this study identified phenylalanine as an independent protective factor against the development of PD, which to our knowledge has not been reported previously. Sanayama et al.29 reported a negative correlation between phenylalanine levels and glutathione peroxidase activity, indicating that higher phenylalanine levels led to lower oxidative stress responses in patients. Patients with PD are highly sensitive to oxidative stress; under a low oxidative stress response, the risk of developing PD is low30,31. However, the regulatory pathway of phenylalanine in the oxidative stress response remains unclear.

In this study, we found that high expression of X-12100 hydroxytryptophan reduces the risk of PD, whereas 5-hydroxyproline increases the risk of PD. This is consistent with the significant decrease in the tryptophan (TRP) level and increase in the proline level in the serum of patients with PD32. TRP is an essential amino acid in the human body that is mainly metabolized via the kynurenine pathway. The main metabolite of kynurenine (KYN), kynurenic acid (KYNA), plays an important role in maintaining normal physiological functions of the central nervous system and has neuroprotective effects in neurodegenerative diseases such as PD, cerebral ischemia, stroke, and epilepsy33,34,35. Plasma-free TRP and branched-chain amino acids (leucine, isoleucine, and valine) participate in regulating the concentration of TRP in the brain. When the concentration of plasma-free TRP increases, the amount of TRP entering the brain increases and the activation of the KYN metabolic pathway in the brain is enhanced, thereby regulating the concentration of KYNA36. Notably, in this study, TRP (met-a-304) and C-glycosyltryptophan (met-a-502) were not independent influencing factors for PD, and only X-12100 hydroxytryptophan reduced the risk of PD.

Proline is an amino acid in the human body that is hydroxylated by catalytic enzymes to form hydroxyproline37,38,39. Hydroxyproline is an important component of collagen in animals and is also present in various plant proteins, particularly in cell walls40. Plewa et al.41 adopted a metabolomics approach and observed an increase in the serum concentration of hydroxyproline in patients with PD compared with healthy controls, which is consistent with the results of this study. These serological changes may be associated with the development of neurodegenerative diseases. Collagen contains a large amount of proline and hydroxyproline, and an increase in the hydroxyproline content in the serum is partly due to increased degradation of collagen, which might be caused by the activity of matrix metalloproteinases, which are considered potential pathogenic factors for PD42,43,44.

The heterogeneity detection results showed that the P-values of the IVW model for all three amino acids were > 0.05, indicating the absence of heterogeneity. Further, the MR-Egger regression analysis showed that the results of this study were not affected by horizontal pleiotropy; the scatter plot showed no significant bias, and the leave-one-out analysis showed no significant deviation in the results. We also used five models to reverse-deduce the effect of PD on BAAs and found no significant reverse causal relationship. We also introduced a multivariate analysis and found that the mediating effects in the multivariate MR model were not significant, indicating that external factors had no significant impact on the results of this study. Therefore, the results are stable and reliable.

Cheng et al.45 also built an MR model to assess the causal role of 9 amino acids in 6 neurodegenerative diseases, which is methodologically similar to this study. In contrast, this study included 86 amino acids, focused on PD as the only target disease, and used 5 regression models for the analysis, which is more reliable than using the IVW method alone as in their study. In this study, mediation effect analysis was also carried out to exclude the influence of exposure factors, strengthening the level of evidence for the influencing factors.

This study has some limitations. First, although the results of this study are stable and reliable, the underlying causes of the impact of the BAA indices on the pathogenesis of PD still need to be explored. Second, the results obtained in this study are all based on data analysis, and their effectiveness still needs to be verified in both in vivo and in vitro experiments.

In conclusion, the MR results in this study suggest a causal relationship between the three blood amino acid indices (phenylalanine, X-12100 hydroxytryptophan, and 5-hydroxyproline) and the risk of developing Parkinson’s disease. These findings indicate that alterations in these amino acids may play a role in the disease process, although further research is needed to clarify the mechanisms underlying these associations and their clinical implications.

Methods

Study design

This study used public datasets to investigate the impact of the BAA indices on the development of PD, including the mediating effects of other exposure factors. This study was conducted in accordance with the STROBE-MR Statement46.

Data sources

GWAS data on BAA levels were downloaded from the OpenGWAS database (https://gwas.mrcieu.ac.uk/). The human blood metabolite (met-a) ID list provided by Shin et al.47 and the circulating metabolite (met-a) ID list provided by Kettunen et al.48 were intersected, and only BAA-related IDs were retained. Finally, 86 BAA IDs were obtained (specific BAA IDs and names are listed in Table S2) and used for subsequent analyses. The data of these 86 BAAs were standardized and organized using the R package “TwoSampleMR”.

The GWAS data for PD (finn-b-G6_PARKINSON) were obtained from a meta-analysis49 of a Finnish database (https://www.finngen.fi/en/access_results), with data sourced from European ethnicities including both male and female individuals. We downloaded correlation summary statistical data for this analysis from the GWAS Catalog50.

The GWAS data for PD-related exposure factors, including all possible factors, were also downloaded from the OpenGWAS database (https://gwas.mrcieu.ac.uk/). We performed univariate MR analysis with PD data using these factors and retained only factors that had P values of < 0.05. We identified 61 qualified PD-related exposure factors (Table S3).

Instrumental variables

Because of the small sample size of the GWAS data on BAA levels in this study, SNPs were screened using the following criteria during the MR analyses: SNPs associated with the exposure at P < 5 × 10⁻⁶ were retained, and SNPs in linkage disequilibrium were excluded by clumping (r² < 0.001 within a physical distance of 10,000 kb). Next, we calculated the F-statistics of the screened instrumental variables to guard against weak-instrument bias. F < 10 indicated that the genetic variant was a weak instrumental variable that might have biased the results51. The specific formula for calculating the F-statistic is

$$F=\frac{N-k-1}{k}\times\frac{R^{2}}{1-R^{2}}$$

where N represents the sample size, k represents the number of instrumental variables used, and R² reflects the proportion of variance in the exposure explained by the instrumental variables (IVs). R² = 2 × (1 - MAF) × MAF × β², where MAF is the minor allele frequency and β is the allele effect value.
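
The F-statistic above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code used in the study; the function names and the example values (sample size, MAF, and β) are hypothetical.

```python
def r_squared(maf: float, beta: float) -> float:
    """Variance explained by one SNP: R^2 = 2 * (1 - MAF) * MAF * beta^2."""
    return 2.0 * (1.0 - maf) * maf * beta ** 2


def f_statistic(n: int, k: int, r2: float) -> float:
    """F = (N - k - 1) / k * R^2 / (1 - R^2)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return (n - k - 1) / k * r2 / (1.0 - r2)


# Hypothetical single SNP; these numbers are invented for illustration only.
r2 = r_squared(maf=0.30, beta=0.10)
f = f_statistic(n=7800, k=1, r2=r2)
print(f"R^2 = {r2:.4f}, F = {f:.1f}, weak instrument: {f < 10}")
```

An instrument would be flagged as weak under the F < 10 rule stated above.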

MR causal effect estimation

Five two-sample MR methods (IVW, MR-Egger, weighted median, simple mode, and weighted mode) were used to evaluate the causal effects of exposure on the outcome. According to a previous study52, the IVW method is better suited to the evaluation of causal effects; therefore, the IVW method was treated as the main method in the MR analysis, and the other methods were treated as supplementary. If pleiotropy was present, the MR-Egger method was used to calculate the results.
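
For readers unfamiliar with the IVW estimator, the following is a minimal sketch of its fixed-effect form, assuming per-SNP effect estimates on the exposure and outcome and the outcome standard errors are available. It is not the “TwoSampleMR” implementation (which also offers a random-effects variant), and the function name and example data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_fixed_effect(beta_exp: Sequence[float],
                     beta_out: Sequence[float],
                     se_out: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of beta_out on beta_exp through
    the origin, with weights 1 / se_out^2.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exp, beta_out, se_out))
    den = sum(bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for three SNPs (illustration only).
beta, se = ivw_fixed_effect(beta_exp=[0.12, 0.08, 0.15],
                            beta_out=[0.05, 0.03, 0.09],
                            se_out=[0.02, 0.02, 0.03])
print(f"IVW beta = {beta:.3f}, SE = {se:.3f}")
```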

Sensitivity analysis

Heterogeneity investigation

Cochran’s Q test was used to evaluate heterogeneity between the estimates from individual SNPs. If significant heterogeneity existed, the random-effects model was adopted for the IVW analysis53. The I² statistic was used to reflect the proportion of heterogeneity in the total variation of the instrumental variables; I² > 50% indicated high heterogeneity, whereas I² < 25% indicated mild or no heterogeneity. The formula for calculating I² is

$$I^{2}=\frac{Q-df}{Q}\times 100\%$$
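
As a rough illustration, Cochran’s Q and I² can be computed from per-SNP Wald ratios as sketched below. The function names and data are hypothetical, and truncating I² at 0 when Q < df is a common convention assumed here rather than something stated in the text.

```python
from typing import Sequence


def cochran_q(beta_exp: Sequence[float],
              beta_out: Sequence[float],
              se_out: Sequence[float]) -> float:
    """Cochran's Q across per-SNP Wald ratios with IVW weights."""
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))


def i_squared(q: float, df: int) -> float:
    """I^2 = (Q - df) / Q * 100%, truncated at 0 (assumed convention)."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical summary statistics for four SNPs (illustration only).
bx = [0.12, 0.08, 0.15, 0.10]
by = [0.05, 0.03, 0.09, 0.02]
se = [0.02, 0.02, 0.03, 0.02]
q = cochran_q(bx, by, se)
print(f"Q = {q:.3f}, I^2 = {i_squared(q, df=len(bx) - 1):.1f}%")
```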

Horizontal pleiotropy

We used the MR-Egger method to conduct a pleiotropy test on the instrumental variables. If the intercept of the MR-Egger regression met P < 0.05, this indicated significant horizontal pleiotropy of genetic variation.
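
The MR-Egger intercept comes from a weighted regression of the SNP-outcome effects on the SNP-exposure effects that, unlike IVW, is not forced through the origin. The sketch below shows only the point estimates under that assumption; it omits the standard error and P-value of the intercept, and the function name and data are hypothetical rather than taken from “TwoSampleMR”.

```python
from typing import Sequence, Tuple


def mr_egger_point_estimates(beta_exp: Sequence[float],
                             beta_out: Sequence[float],
                             se_out: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the weighted MR-Egger regression."""
    # Orient every SNP so that its effect on the exposure is non-negative.
    x = [abs(bx) for bx in beta_exp]
    y = [by if bx >= 0 else -by for bx, by in zip(beta_exp, beta_out)]
    w = [1.0 / se ** 2 for se in se_out]

    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical summary statistics for four SNPs (illustration only).
intercept, slope = mr_egger_point_estimates(
    beta_exp=[0.12, -0.08, 0.15, 0.10],
    beta_out=[0.05, -0.03, 0.09, 0.02],
    se_out=[0.02, 0.02, 0.03, 0.02])
print(f"Egger intercept = {intercept:.4f}, slope = {slope:.3f}")
```

An intercept near 0 is what the pleiotropy test above looks for; a formal test would additionally need the intercept’s standard error.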

Leave-one-out verification

The MR results of the remaining instrumental variables were calculated by removing individual SNPs. A significant change after excluding the SNP indicated that the MR effect estimation was sensitive to the SNP.
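
The leave-one-out procedure can be sketched as a loop that drops each SNP in turn and re-estimates the effect. The example assumes a fixed-effect IVW estimator as the re-fitted model; the function names and data are hypothetical.

```python
from typing import List, Sequence


def ivw(beta_exp: Sequence[float],
        beta_out: Sequence[float],
        se_out: Sequence[float]) -> float:
    """Fixed-effect IVW point estimate."""
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exp, beta_out, se_out))
    den = sum(bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out))
    return num / den


def leave_one_out(beta_exp: Sequence[float],
                  beta_out: Sequence[float],
                  se_out: Sequence[float]) -> List[float]:
    """IVW estimate after removing each SNP in turn."""
    n = len(beta_exp)
    estimates = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        estimates.append(ivw([beta_exp[j] for j in keep],
                             [beta_out[j] for j in keep],
                             [se_out[j] for j in keep]))
    return estimates


# Hypothetical summary statistics for four SNPs (illustration only).
bx = [0.12, 0.08, 0.15, 0.10]
by = [0.05, 0.03, 0.09, 0.02]
se = [0.02, 0.02, 0.03, 0.02]
print(f"all SNPs: {ivw(bx, by, se):.3f}")
for i, est in enumerate(leave_one_out(bx, by, se)):
    print(f"without SNP {i}: {est:.3f}")
```

A single estimate that departs sharply from the others would point to an influential SNP.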

Multivariate MR analysis and estimation of mediating effects

Multivariate MR is an extension of MR that uses genetic variations related to multiple possible exposures to estimate their impact on a single outcome. Before conducting the multivariate MR analysis, we first conducted a univariate MR analysis of BAA levels and exposure factors. Significant exposure factors were included in the multivariate MR analysis to construct the model. By estimating the direct effects of BAA levels and PD-related exposure factors on PD through multivariate MR analysis, the indirect effect of BAA levels on PD can be obtained through the BAA → PD-related exposure factor → PD pathway. The effect value and standard error of the mediating effect were calculated using the following formula:

$$\beta_{M}=\beta_{A}\times\beta_{B},\qquad SE_{M}=\sqrt{(\beta_{A}\times SE_{B})^{2}+(\beta_{B}\times SE_{A})^{2}}$$

where βM is the value of the mediating effect, βA is the MR effect value of BAA levels on PD-related exposure factors, βB is the direct effect value of PD-related exposure factors on PD, SEM is the standard error of the mediating effect, SEA is the standard error of the MR analysis of BAA levels on PD-related exposure factors, and SEB is the standard error of the MR analysis of PD-related exposure factors on PD. The proportion of the mediating effect is calculated as |βM/βC| × 100%, where βC is the effect value of BAA levels on PD in univariate MR.
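
A minimal sketch of the mediation calculation follows. It assumes the proportion mediated is |βM/βC| × 100% and that the 95% CI is formed as βM ± 1.96 × SEM (a normal approximation that the text does not state explicitly). The function names and input values are hypothetical.

```python
import math
from typing import Tuple


def mediation_effect(beta_a: float, se_a: float,
                     beta_b: float, se_b: float
                     ) -> Tuple[float, float, Tuple[float, float]]:
    """Product-of-coefficients mediating effect, its SE, and a 95% CI."""
    beta_m = beta_a * beta_b
    se_m = math.sqrt((beta_a * se_b) ** 2 + (beta_b * se_a) ** 2)
    ci = (beta_m - 1.96 * se_m, beta_m + 1.96 * se_m)  # assumed normal CI
    return beta_m, se_m, ci


def proportion_mediated(beta_m: float, beta_c: float) -> float:
    """|beta_M / beta_C| * 100, in percent."""
    return abs(beta_m / beta_c) * 100.0


# Hypothetical effect estimates (illustration only).
beta_m, se_m, (lo, hi) = mediation_effect(beta_a=0.20, se_a=0.15,
                                          beta_b=0.50, se_b=0.30)
print(f"beta_M = {beta_m:.3f}, SE_M = {se_m:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
print(f"CI includes 0: {lo <= 0.0 <= hi}")
print(f"proportion mediated = {proportion_mediated(beta_m, beta_c=-1.2):.1f}%")
```

When the interval includes 0, the mediating effect is treated as not significant, which is the criterion applied in the Results.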

Statistical analyses

All data calculations and statistical analyses were performed using the R software (https://www.r-project.org/, version 4.2.2). The “TwoSampleMR” package was used for the analyses54. Cochran’s Q test and leave-one-out analysis were used to evaluate the robustness and reliability of the results. The MR-Egger intercept method was used to test horizontal pleiotropy. ORs and 95% CIs were calculated for the effect size. All statistical tests were two-sided, and statistical significance was set at P < 0.05.
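
As an illustration of how an MR effect estimate on the log-odds scale maps to an OR with a 95% CI and a two-sided P-value, the sketch below assumes a normal approximation. The function name and input values are hypothetical and do not reproduce any result reported in this study.

```python
import math
from typing import Tuple


def odds_ratio_summary(beta: float, se: float,
                       z_crit: float = 1.96
                       ) -> Tuple[float, float, float, float]:
    """Return (OR, CI lower, CI upper, two-sided P) from a log-odds estimate."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    # Two-sided P-value under a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return odds_ratio, lower, upper, p_value


# Hypothetical log-odds estimate and standard error (illustration only).
or_, lo, hi, p = odds_ratio_summary(beta=-0.8, se=0.35)
print(f"OR = {or_:.3f}, 95% CI: {lo:.3f}-{hi:.3f}, P = {p:.4f}")
```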