Classifying the risk of cognitive impairment in Parkinson’s disease using serum bile acid profiles and machine learning

Shen, Tong; Cui, Can; Liu, Ruiqi; Yin, Xin; Zu, Jie; Zhang, Wei; Dong, Liguo; Xu, Chuanying; Lu, Hailong; Cui, Guiyun; Qu, Xuebin; Zhang, Chu

doi:10.1038/s41531-025-01229-z


Article
Open access
Published: 08 December 2025


npj Parkinson's Disease volume 12, Article number: 18 (2026)


Abstract

Cognitive impairment (CI) is a common and early non-motor manifestation of Parkinson’s disease (PD), yet its biochemical basis remains poorly understood. Given the emerging link between bile acids (BAs) and neurodegeneration, we investigated whether serum BA profiles differ by cognitive status in PD and whether they can classify CI. A total of 363 participants were enrolled, including 63 healthy controls, 154 PD patients with normal cognition, and 146 with CI. Serum BA concentrations were quantified by ultra-performance liquid chromatography–tandem mass spectrometry, and multivariate as well as machine learning analyses were applied. Compared with cognitively normal PD patients, those with CI exhibited distinct BA alterations, characterized by elevated deoxycholic and cholic acids and reduced glyco- and tauro-conjugated species. Deoxycholic acid showed the strongest negative correlations with cognitive scores. Machine learning models based on combined BA profiles, particularly the random forest classifier, achieved robust discrimination between PD-CI and PD-NC groups (AUC up to 0.90). These findings indicate that BA dysregulation is closely linked to cognitive impairment in PD and may serve as a promising metabolic biomarker for early detection. Clinical trial number. Not applicable.

Introduction

Cognitive impairment (CI) is a common non-motor manifestation of Parkinson’s disease (PD), often emerging in the early stages of the disease and significantly impacting patient quality of life and caregiver burden^1,2. It primarily affects executive/attention, memory, visuospatial and language domains². Approximately 30% of patients have mild cognitive impairment in PD (PD-MCI), which may progress to PD with dementia (PDD)³. Although widely used, the Mini-Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA) show imperfect sensitivity and specificity for PD-related CI and may miss milder or domain-specific deficits^4,5, so early detection of CI remains challenging. Consequently, there is an urgent need for objective biomarkers that can assist in early diagnosis and monitoring of CI in PD.

Recent studies have linked bile acid (BA) dysregulation with neurodegeneration in both PD and Alzheimer’s disease (AD)^6,7. However, it is essential to distinguish the disease-specific mechanisms that drive cognitive decline in these two neurodegenerative disorders. In AD, BA dysregulation has been primarily associated with the accumulation of amyloid plaques and tau. In contrast, PD is characterized by the accumulation of misfolded α-synuclein within the central nervous system, initially involving the dorsal motor nucleus of the vagus and anterior olfactory structures, and subsequently progressing in a stereotyped caudo-rostral pattern toward cortical regions⁸. Notably, α-syn pathology has been detected in the gastrointestinal tract before motor symptoms appear, suggesting a role for the gut-brain axis in PD pathogenesis^9,10. Human studies report dysbiosis accompanied by elevations of secondary, microbially derived BAs in PD, suggesting a microbiota–BA link¹¹.

BAs, synthesized in the liver from cholesterol and subsequently metabolized by the gut microbiota, act as signaling molecules that regulate metabolic, inflammatory, and neuronal processes^12,13. Evidence from non-PD models suggests that BAs can influence cognitive function and blood–brain barrier (BBB) integrity¹⁴, though this remains hypothesis-generating and requires direct validation in PD. Several reviews have indicated that specific BAs can signal within the central nervous system (CNS)^14,15,16, and circulating BA concentrations correlate with brain levels¹⁷, suggesting that peripheral BA alterations may modulate CNS physiology. Recent studies indicate that BA metabolism is disrupted in prodromal PD animal models¹⁸, and human investigations have revealed altered serum levels of specific BAs—most notably reduced unconjugated primary BAs such as cholic acid (CA) and chenodeoxycholic acid (CDCA), along with elevated conjugated species like glycochenodeoxycholic acid (GCDCA)—which correlate with the severity of CI. In addition, neuroprotective BAs such as ursodeoxycholic acid (UDCA) and tauroursodeoxycholic acid (TUDCA) have demonstrated mitochondrial rescue and anti-apoptotic effects in PD experimental models and early clinical studies^19,20,21,22.

In PD, dysregulated BA metabolism may contribute to cognitive decline through the accumulation of neurotoxic secondary BAs, particularly deoxycholic acid (DCA), and the depletion of protective BAs such as UDCA^20,23. Evidence from AD supports an association between altered BA profiles and cognitive deterioration, and experimental studies suggest that secondary BAs such as DCA may exacerbate neurodegeneration in AD through neuroinflammatory and oxidative stress pathways²⁴. However, their specific contribution to CI associated with PD remains to be elucidated. Prior studies investigating BA dysregulation in PD have often been constrained by small sample sizes, inconsistent results, and limited analytical sensitivity of conventional assay platforms. Moreover, few studies have integrated advanced computational approaches, such as machine learning, to assess the discriminative utility of BA profiles for distinguishing cognitive subtypes of PD. Variability in cognitive assessment tools has further hindered reproducibility and interpretability.

To address these gaps, the present study employed ultra-performance liquid chromatography–tandem mass spectrometry (UPLC–MS/MS) for targeted quantification of serum BAs, combined with machine learning models to classify PD patients with normal cognition (PD-NC) and cognitive impairment (PD-CI). We hypothesize that alterations in serum BA profiles, particularly involving secondary BAs such as DCA, are associated with cognitive decline in PD and may serve as non-invasive biomarkers for early detection and monitoring of CI. By integrating metabolomic profiling with computational modeling, this study aims to clarify the role of bile acid dysregulation in PD and provide a foundation for biomarker-guided approaches to PD-related CI.

Results

Participant characteristics and serum bile acid dysregulation

The study included 63 participants in the healthy control (HC) group, 154 in the PD-NC group, and 146 in the PD-CI group. As illustrated in Fig. 1A, all participants underwent targeted UPLC-MS/MS profiling of 15 serum BAs, followed by integration of these metabolomic data with clinical variables and multivariate analyses to identify a diagnostic BA panel. Demographic and clinical characteristics are summarized in Table 1. Groups did not differ significantly in sex distribution, age, smoking status, alcohol use, hypertension or diabetes prevalence (all p > 0.05). By contrast, PD-CI patients had fewer years of education (p < 0.0001) and a lower proportion with high school education or above (p = 0.0497). Disease duration was longer (p = 0.0204) and Hoehn-Yahr (H-Y) stage higher (p = 0.0022) in PD-CI than PD-NC. MMSE and MoCA scores decreased stepwise across HC, PD-NC and PD-CI (p < 0.0001).

**Fig. 1: Overview of study design and serum bile acid profiles in HC and PD groups.**

Table 1 Demographic and clinical data for the PD participants in the cohort


Beyond these clinical differences, routine biochemical indices showed a mixed pattern across groups (Table 1). A significant overall group effect was observed for body mass index (BMI), total cholesterol (TC), and low-density lipoprotein cholesterol (LDL), driven primarily by higher levels in HC compared with both PD-NC and PD-CI groups (all p < 0.05). In contrast, PD-NC and PD-CI did not differ in BMI, fasting glucose, serum lipid profiles, or liver enzyme levels, including alanine aminotransferase (ALT) and aspartate aminotransferase (AST), and all liver-function parameters remained within normal physiological ranges. Thus, while selective metabolic differences were present between HC and PD groups, the two PD subgroups were metabolically comparable, suggesting that PD-related BA abnormalities are unlikely to result from hepatic dysfunction or metabolic impairment.

We next compared overall serum BA levels between HC and all PD patients, regardless of cognitive status (Fig. 1B). PD patients showed marked dysregulation of unconjugated and conjugated BAs: CA, CDCA, DCA, glycolithocholic acid (GLCA) and taurocholic acid (TCA) were all significantly elevated in PD compared to HC (all p < 0.001). In contrast, lithocholic acid (LCA), taurolithocholic acid (TLCA), UDCA, glycoursodeoxycholic acid (GUDCA) and TUDCA were significantly reduced in PD patients (all p < 0.05) (Table 2). These findings reveal a pronounced disturbance of BA metabolism in PD patients.

Table 2 Comparison of serum bile acid concentrations between HC and PD patients


Comparison of serum bile acid profiles between PD-CI and PD-NC groups

Building on our observation that several BAs are dysregulated in PD versus healthy controls, we next sought to determine which of these, and which additional BA species, distinguish PD-CI from PD-NC. We quantified 15 serum BAs in 154 PD-NC and 146 PD-CI patients using UPLC-MS/MS. Univariate comparisons revealed that CA (p = 0.008), DCA (p < 0.001) and TUDCA (p = 0.018) were significantly elevated in PD-CI, whereas GLCA (p = 0.024), TCA (p = 0.045) and UDCA (p = 0.002) were decreased (Fig. 2A, Table 3). No significant differences were observed for total BA concentration or for the ratios of primary to secondary and conjugated to unconjugated BAs (all p > 0.05; Fig. S1A–H).

**Fig. 2: Serum bile acid profiles in PD-CI and PD-NC groups.**

Table 3 Comparison of serum bile acid levels between PD-NC and PD-CI groups


To further explore metabolic differences between PD-CI and PD-NC, we performed orthogonal partial least squares discriminant analysis (OPLS-DA), which revealed a modest yet distinct separation between the two groups (R²Y = 0.478, Q² = 0.293; 200-time permutation test, p = 0.005; Fig. 2B, C, Fig. S2). The permutation-derived intercepts (R²Y = −0.086, Q² = −0.198) further confirmed that the model was not overfitted. Given its exploratory nature, the primary purpose of applying OPLS-DA was to identify BA species contributing to group separation. Variable importance in projection (VIP) scores from the OPLS-DA model identified DCA (VIP = 1.761), UDCA (VIP = 1.412) and GLCA (VIP = 1.411) as the top contributors (Table S1). Consistent with these findings, Random Forest analysis (10-fold CV; Fig. 2D, Table S2) ranked DCA, GLCA and UDCA highest by mean decrease in accuracy, and support vector machine (SVM) modeling (radial kernel; Fig. 2E, Table S3) again highlighted DCA as the most discriminatory feature.
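The logic of a label-permutation test can be sketched in a few lines. This is a minimal illustration only: it does not reproduce OPLS-DA or the SIMCA workflow, and it assumes a much simpler statistic (the absolute difference in group means). The function name, the toy values and the add-one correction are assumptions made for the example.

```python
import random

def permutation_p_value(values, labels, n_permutations=200, seed=0):
    """Label-permutation p-value for the absolute difference in group means.

    Hypothetical helper: labels are 0/1 and both groups must be non-empty.
    """
    def stat(lbls):
        g1 = [v for v, l in zip(values, lbls) if l == 1]
        g0 = [v for v, l in zip(values, lbls) if l == 0]
        return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

    rng = random.Random(seed)
    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)            # break any link between value and label
        if stat(shuffled) >= observed:   # permuted statistic at least as extreme
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Toy data (not study data): two clearly separated groups.
toy_values = list(range(1, 11)) + list(range(101, 111))
toy_labels = [0] * 10 + [1] * 10
p_toy = permutation_p_value(toy_values, toy_labels)
```

With 200 permutations, the smallest p-value this estimator can return is 1/201, so the resolution of the test is set by the number of permutations.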

Correlation between DCA levels and cognitive performance

PD patients were stratified into High-DCA (n = 150) and Low-DCA (n = 150) groups at the median serum DCA concentration (27.7279 nmol/g). Chi-square tests indicated that the High-DCA group had a significantly higher rate of CI than the Low-DCA group by both MMSE (p = 0.0055) and MoCA (p = 0.03755) (Table 4). Spearman’s rank correlation (two-tailed) further revealed an inverse relationship between serum DCA levels and global cognitive performance: MMSE (r = −0.30, p < 0.0001) and MoCA (r = −0.26, p < 0.0001) (Fig. 3A, B). This negative association remained significant after controlling for education using rank-based partial correlation analysis (partial ρ = –0.30, p < 0.0001 for MMSE; partial ρ = –0.23, p < 0.0001 for MoCA), indicating that the association between DCA and cognitive performance is not explained by educational differences (Table S4).
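A rank-based partial correlation of this kind can be sketched as follows. The source does not state which implementation was used, so this is an assumption: it takes the common approach of computing Spearman correlations (Pearson on average ranks) and then applying the first-order partial correlation formula with education as the single covariate. All function names and the toy values are hypothetical.

```python
import math

def ranks(xs):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def spearman(a, b):
    return pearson(ranks(a), ranks(b))

def partial_spearman(x, y, z):
    """Spearman correlation of x and y after removing the rank-linear effect of z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy values (not study data): DCA, MMSE score, years of education.
dca = [10, 20, 30, 40, 50, 60]
mmse = [30, 29, 27, 28, 25, 24]
education = [12, 9, 15, 6, 9, 12]
rho_partial = partial_spearman(dca, mmse, education)
```

If the partial coefficient stays close to the unadjusted one, as reported above for MMSE, the covariate accounts for little of the shared rank variation.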

**Fig. 3: Correlation between serum DCA levels and cognitive domain scores in PD patients.**

Table 4 Comparison of cognitive status between Low-DCA and High-DCA groups


Building on this inverse association, we next examined domain-specific cognitive performance. High-DCA participants scored significantly lower than Low-DCA participants across all seven MMSE and MoCA domains—orientation, memory recall, attention, visuospatial ability, language/naming, executive function, and calculation (all p < 0.0001; Fig. 3C, Table S5). These domain-wide deficits reinforce the association between elevated DCA levels and both the prevalence and severity of cognitive impairment in PD.

Predictive value of serum DCA levels for cognitive impairment in PD patients

To evaluate the predictive value of serum BAs for cognitive normalcy in PD (MMSE ≥ 27; MoCA ≥ 26), we first assessed DCA alone by ROC analysis, which yielded moderate discrimination (MMSE: AUC = 0.645, 95% CI 0.588–0.702; MoCA: AUC = 0.725, 95% CI 0.672–0.778; Table 5) (Fig. 4A, B). Building on this finding, we performed univariate logistic regressions on all 15 BAs, identifying five independent predictors: CA, GLCA, TCA, GLCA and TLCA (Tables 6–7). We then incorporated these five markers into a forward-stepwise multivariate model, which improved discrimination to AUC = 0.741 (95% CI 0.5318–0.7395) for MMSE and AUC = 0.751 (95% CI 0.5901–0.7897) for MoCA (Fig. 4C, D; Tables 5–7).

**Fig. 4: Sequential modeling and LASSO-based refinement for predicting cognitive normalcy in PD.**

Table 5 Performance of bile acid models for predicting cognitive normalcy in PD patients


Table 6 Logistic regression of bile acid biomarkers for predicting MMSE-defined cognitive status in PD patients


Table 7 Logistic regression of bile acid biomarkers for predicting MoCA-defined cognitive status in PD patients


To further refine this multibiomarker panel, we applied LASSO logistic regression with 10-fold cross-validation to the full 15-acid profile (training 70%: n = 210; test 30%: n = 90). For MMSE prediction, the minimal CV error occurred at λ_min = 0.0258, retaining nine BAs (DCA, LCA, CDCA, GLCA, TCA, TDCA, CA, TLCA, GCA), and the refitted test-set model achieved AUC = 0.715 (95% CI 0.608–0.821; Fig. 4E–G; Table 8). Similarly, for MoCA prediction, λ_min = 0.0284 preserved seven acids (DCA, LCA, GLCA, TCA, CA, TLCA, GCA), with a test-set AUC = 0.737 (95% CI 0.634–0.839; Fig. 4H–J; Table 8). A bubble plot illustrates the absolute LASSO coefficients (bubble size = |coefficient|; color = direction) for each metabolite (Fig. 4G, K). Finally, by intersecting the MMSE- and MoCA-derived LASSO panels, we defined a robust seven BA signature—DCA, LCA, GLCA, TCA, CA, TLCA, GCA—that consistently discriminates cognitive status in PD while balancing model parsimony and performance.
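The intersection step can be checked directly from the two panels listed above. The sketch below uses only the BA names given in the text; the variable names are illustrative.

```python
# Panels retained by LASSO at lambda_min, as listed in the text.
mmse_panel = ["DCA", "LCA", "CDCA", "GLCA", "TCA", "TDCA", "CA", "TLCA", "GCA"]
moca_panel = ["DCA", "LCA", "GLCA", "TCA", "CA", "TLCA", "GCA"]

moca_set = set(moca_panel)

# Keep the MMSE ordering while intersecting with the MoCA panel.
shared_signature = [ba for ba in mmse_panel if ba in moca_set]

# BAs retained for MMSE only.
mmse_only = [ba for ba in mmse_panel if ba not in moca_set]

print(shared_signature)  # ['DCA', 'LCA', 'GLCA', 'TCA', 'CA', 'TLCA', 'GCA']
print(mmse_only)         # ['CDCA', 'TDCA']
```

Because the MoCA panel is a subset of the MMSE panel, the intersection equals the MoCA panel, and CDCA and TDCA are the two BAs dropped.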

Table 8 Optimal LASSO Regularization Parameters for MMSE and MoCA Prediction Models


Machine learning classification of cognitive status using the seven-acid signature

Building on our LASSO-derived seven-BA signature (DCA, LCA, GLCA, TCA, CA, TLCA, GCA), we trained five classifiers, namely random forest (RF), XGBoost, radial-basis SVM, linear SVM and multilayer perceptron (MLP), using 70% of subjects for training (n = 210) and 30% for testing (n = 90), with 10-fold cross-validation. In the test cohort, RF achieved the highest discrimination for MMSE ≥ 27 (AUC = 0.777, 95% CI 0.680–0.875), followed by XGBoost (AUC = 0.729, 95% CI 0.624–0.834), linear SVM (0.715, 95% CI 0.605–0.826), MLP (0.700, 95% CI 0.589–0.811) and radial SVM (0.696, 95% CI 0.584–0.808) (Fig. 5A; Table 9). Calibration analysis demonstrated that the random forest model achieved the best agreement between predicted and actual MMSE status, exhibiting the narrowest distribution of residual errors and the highest proportion of predictions falling within a small error margin (Fig. 5B, C); quantitatively, MMSE Brier scores ranged from 0.197 to 0.232, and Hosmer–Lemeshow (HL) p-values were >0.05 for SVMs and RF, whereas XGBoost and MLP showed marginal miscalibration (Table S6). Feature-importance analysis across cross-validation folds highlighted DCA, GLCA, and CA as the leading contributors (Fig. 5D; Table S7). Bootstrap-based stability analysis further confirmed robust rankings for key predictors: DCA showed perfect stability across algorithms, with GLCA/TCA/TLCA/CA/LCA exhibiting generally high stability (Table S8).
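For readers unfamiliar with the Brier score quoted above, a minimal sketch follows. It is the mean squared difference between the predicted probability and the observed 0/1 outcome, so lower is better. The function name and toy predictions are illustrative and not taken from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)

# Toy examples (not study data).
uninformative = brier_score([0.5, 0.5, 0.5, 0.5], [0, 1, 0, 1])  # always predicts 0.5
perfect = brier_score([1.0, 0.0, 1.0], [1, 0, 1])                # confident and correct
```

A model that always predicts 0.5 scores 0.25, which is a useful reference point when reading the ranges reported in Table S6.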

**Fig. 5: Machine-learning performance and normalized feature-importance profiles for MMSE and MoCA prediction.**

Table 9 Performance of machine learning classifiers for predicting MMSE ≥ 27 using the seven-acid signature


Applying the same models to predict MoCA ≥ 26 produced a nearly identical pattern: RF again led with AUC = 0.833 (95% CI 0.750–0.916), followed by XGBoost (0.803, 95% CI 0.710–0.895), MLP (0.784, 95% CI 0.687–0.881), linear SVM (0.777, 95% CI 0.678–0.875) and radial SVM (0.740, 95% CI 0.637–0.844) (Fig. 5E; Table 10). RF also demonstrated superior MoCA calibration (Fig. 5F, G); quantitatively, MoCA Brier scores ranged from 0.207 to 0.278, with HL p > 0.05 for SVMs and RF and < 0.05 for XGBoost and MLP (Table S6). The stability-validated importance ranking again prioritized DCA, GLCA, and CA (Fig. 5H; Tables S9–10). These two complementary analyses indicate that our compact 7 BA signature delivers robust discrimination, reliable calibration, and clear biochemical interpretability for both MMSE and MoCA outcomes in PD.

Table 10 Performance of machine learning classifiers for predicting MoCA ≥ 26 using the seven-acid signature


Random forest-based integration of clinical covariates and bile-acid biomarkers

To establish a baseline from clinical covariates alone, we first trained an RF model using the four variables that differed significantly between PD-CI and PD-NC: disease duration, H–Y stage, UPDRS-III score and years of education. In the test cohort, this clinical-only RF achieved an AUC of 0.697 (95% CI 0.586–0.807) for predicting MMSE ≥ 27 (Fig. 6A, Table 11), with variable importance (mean decrease in Gini) ranking UPDRS-III highest, followed by disease duration, education and H–Y stage (Fig. 6B). When the same clinical model was applied to MoCA ≥ 26, discrimination improved to an AUC of 0.783 (95% CI 0.688–0.879) (Fig. 6C), and the importance order remained essentially unchanged (Fig. 6D).

**Fig. 6: Random Forest performance and feature importance for clinical-only versus clinical and bile acid models.**

Table 11 Performance of RF models integrating clinical covariates and bile-acid biomarkers for predicting cognitive normalcy in PD patients


Building on this foundation, we then evaluated whether adding our seven-BA signature (DCA, LCA, GLCA, TCA, CA, TLCA, GCA) could further boost performance via RF-based recursive feature elimination (RF-RFE). For MMSE prediction, RF-RFE accuracy peaked when retaining 11 features (Fig. 6E), and the optimized RF model achieved an AUC of 0.841 (95% CI 0.750–0.932) on the test set (Fig. 6F, Table 11). In its importance profile, DCA ranked first, followed by education, TLCA, UPDRS-III, CA and LCA (Fig. 6G). A parallel analysis for MoCA showed maximum accuracy with the same 11 features (Fig. 6H) and yielded an RF AUC of 0.901 (95% CI 0.801–1.000) (Fig. 6I, Table 11), with DCA again leading the feature ranking alongside GLCA, CA and education (Fig. 6J). We next examined whether BA features conferred additional predictive value beyond demographic covariates by comparing a clinical-only model (age, sex, and education) with a combined clinical + BA model using the DeLong test for correlated ROC curves. The combined model achieved significantly higher discrimination (ΔAUC = 0.144, p < 0.001 for MMSE; ΔAUC = 0.118, p < 0.001 for MoCA; Supplementary Table S11), demonstrating that BA profiles add independent predictive power beyond clinical variables. These results demonstrate that integrating BA biomarkers with key clinical covariates and applying RF-RFE markedly enhances discrimination of cognitive normalcy in PD, with DCA emerging as the most powerful single predictor.
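The DeLong comparison of two correlated ROC curves can be sketched as below. This is a plain-Python illustration of the standard structural-components formulation, not the authors' implementation, which the source does not describe. The function names, the toy scores and the convention that label 1 marks the positive class are assumptions.

```python
import math

def _components(scores, labels):
    """Structural components V10 (per positive) and V01 (per negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]

    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0

    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def _cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(scores_a, scores_b, labels):
    """Return (auc_a, auc_b, z, two-sided p) for two models scored on the same cases."""
    v10a, v01a = _components(scores_a, labels)
    v10b, v01b = _components(scores_b, labels)
    auc_a = sum(v10a) / len(v10a)
    auc_b = sum(v10b) / len(v10b)
    var = ((_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / len(v10a)
           + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / len(v01a))
    if var <= 0:  # identical rankings: no evidence of a difference
        return auc_a, auc_b, 0.0, 1.0
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p

# Toy predicted probabilities (not study data) for the same eight cases.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
clinical_only = [0.6, 0.4, 0.7, 0.5, 0.5, 0.3, 0.6, 0.2]
clinical_plus_ba = [0.9, 0.6, 0.8, 0.7, 0.4, 0.3, 0.5, 0.2]
auc_clin, auc_comb, z_stat, p_value = delong_test(clinical_only, clinical_plus_ba, labels)
```

The test accounts for the fact that both models are evaluated on the same patients, which is why the covariance terms between the two sets of components enter the variance.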

Discussion

BAs, cholesterol-derived metabolites transformed by the gut microbiota and recirculated via enterohepatic circulation, can interact with receptors and signaling pathways within the CNS, where such effects are biologically plausible, though direct evidence for comprehensive CNS BA metabolism remains limited^15,25. The CNS contains enzymes and receptors relevant to BA signaling; their activation influences neuronal energy metabolism, survival, and function, thereby playing important roles in neurological diseases^26,27. Consequently, BAs are emerging as key mediators through which the gut microbiota modulate CNS activity.

Recent studies have proposed that PD may originate in the gut, with misfolded α-synuclein propagating to the brain via neural pathways, underscoring the significance of peripheral systems in PD pathogenesis²⁸. Clinical investigations further suggest that dysregulation of peripheral BA homeostasis contributes to PD development²⁹. Beyond metabolic regulation, BAs and their receptors exert immunomodulatory effects that may influence neuroinflammation and microglial activity³⁰. In this context, our study provides a systematic evaluation of serum BA remodeling in PD and its relationship with CI. Using a targeted UPLC–MS/MS platform combined with machine learning, we identified significant alterations in peripheral BA profiles among PD patients. Compared with cognitively normal participants, the PD-CI group exhibited higher levels of the secondary BA DCA and its conjugated form GLCA, along with decreased levels of the primary BA TCA and the neuroprotective secondary BA UDCA. This pattern indicates an imbalance in BA synthesis and conversion within the enterohepatic cycle, leading to overproduction of neurotoxic secondary BAs and reduced generation of protective species. Similar alterations have been reported in AD, including enhanced conversion of CA to DCA and depletion of UDCA³¹. However, such parallels should be interpreted cautiously, as the mechanisms in PD are likely disease-specific and require direct experimental validation.

Our findings suggest a coexistence of neurotoxic BA accumulation and depletion of neuroprotective BAs that may collectively contribute to CI in PD. Serum DCA exhibited moderate yet consistent inverse correlations with global cognition, and these associations remained significant after adjusting for education, indicating that the observed effect is unlikely to be driven by demographic differences. Although higher DCA levels were associated with worse cognitive outcomes, this relationship does not imply a causal or driving role, but rather reflects a potential biochemical signature of cognitive impairment. While a single-metabolite model for DCA showed only moderate discriminative ability, the multivariate bile acid model substantially enhanced overall prediction performance. The inclusion of 7 LASSO-selected BAs with key clinical covariates (disease duration, H–Y stage, UPDRS-III, education) within a RF-RFE framework further enhanced performance. These results emphasize the advantage of integrating metabolic and clinical dimensions using non-linear ensemble algorithms.

Notably, several primary BAs (CA, CDCA) were elevated in PD compared with controls but did not differ between PD-NC and PD-CI, suggesting that altered primary BA synthesis represents a generalized metabolic signature of PD rather than a cognition-specific feature. Conversely, reduced GUDCA levels across PD groups may reflect systemic cholestatic or mitochondrial dysfunction unrelated to CI. The marked variability observed in certain BAs (Fig. S1) further indicates that a subset of PD patients may exhibit disproportionate metabolic alterations, potentially influenced by inter-individual differences in gut microbiota composition, small intestinal bacterial overgrowth, genetic variation in BA-synthetic enzymes, or medication use^32,33,34,35. These findings highlight the importance of personalized metabolic profiling and caution against overgeneralization from group-level averages.

Despite these insights, several limitations warrant consideration. First, the cross-sectional design precludes causal or temporal inference. While our models accurately classified cognitive status based on current serum BA profiles, longitudinal validation will be essential to determine whether these signatures can predict future CI. Second, although medications known to alter BA metabolism (e.g., statins, BA sequestrants, proton pump inhibitors, cholinesterase inhibitors) were reviewed and users excluded, unmeasured factors—such as diet, microbiota composition, renal function, and comorbidities—may still contribute to variability. Routine hepatic indices (ALT, AST, bilirubin) were comparable across groups, but detailed microbiome and dietary assessments were not conducted. Future multi-omics studies integrating metagenomic and liver-function analyses are needed to delineate gut–liver–brain mechanisms underlying BA dysregulation in PD. Third, although rigorous calibration, pooled quality control, and isotope-labeled internal standards were used to minimize analytical bias, inter-laboratory reproducibility remains to be established. Validation across analytical centers will be necessary to confirm the robustness and clinical portability of UPLC–MS/MS BA quantification. Fourth, the clinical-only model (age, sex, education) showed moderate discrimination, and adding the seven-acid signature yielded a significant incremental improvement, indicating additive value of BA features while not yet meeting thresholds for clinical deployment. These results suggest that serum BA profiles may serve as adjunctive, rather than standalone, biomarkers for cognitive stratification in PD. Future studies should refine feature selection, expand population diversity, and include external validation cohorts to assess clinical applicability. 
Finally, our study did not incorporate established CSF or plasma biomarkers of neurodegeneration (e.g., Aβ, tau, NfL), which primarily reflect neuronal injury. BAs instead represent a complementary metabolic axis linking peripheral and central systems. Future multi-biomarker frameworks should evaluate whether BA signatures provide incremental value beyond traditional neuronal markers for predicting cognitive outcomes in PD.

In conclusion, our targeted metabolomics and machine-learning framework identified several BAs, particularly DCA, as potential biochemical correlates of CI in PD. These findings suggest that peripheral BA dysregulation reflects, rather than drives, PD-related cognitive changes. Comprehensive longitudinal, multi-center, and multi-omics studies integrating microbiome, hepatic, and neurodegenerative markers will be essential to validate these associations and determine the translational potential of BA signatures for early detection and monitoring of cognitive decline in PD.

Methods

Participants and clinical assessment

A total of 300 patients diagnosed with idiopathic PD were consecutively recruited from the Neurology Department of Xuzhou Medical University Affiliated Hospital between October 2022 and March 2024. Diagnosis was independently confirmed by two senior neurologists according to the 2015 Movement Disorder Society (MDS) Clinical Diagnostic Criteria for PD. Inclusion criteria included the ability to complete H-Y staging and MDS-UPDRS III scoring with sufficient language and cognitive function. Exclusion criteria were: secondary or atypical parkinsonism; comorbid neurological or psychiatric disorders (e.g., schizophrenia, stroke, major depression); systemic diseases affecting major organs or metabolic function; and medications known to influence BA metabolism (e.g., statins, BA sequestrants, proton pump inhibitors, cholinesterase inhibitors). Participants using these medications were excluded from the study to minimize confounding effects on BA metabolism. Additionally, participants with active liver disease, gastrointestinal disorders, or other major metabolic conditions (e.g., hyperlipidemia, diabetes) were also excluded, as these conditions can affect BA metabolism.

In parallel, 63 age- and sex-matched HCs, with no history of neurological or psychiatric illness, were recruited from the local community. Clinical parameters were measured, including BMI, fasting blood glucose, lipid levels (TC, triglycerides [TG], LDL, and high-density lipoprotein cholesterol [HDL]), and liver function markers (AST, ALT). Participant characteristics are shown in Table 1. All participants provided written informed consent. The study protocol was approved by the Ethics Committee of Xuzhou Medical University (approval no. XYFY2021-KL054-01).

Cognitive and motor evaluation

Cognitive assessments were conducted on the second morning of admission while participants were in a stable condition. Trained neurologists administered the MMSE and MoCA^36,37. One point was added to the MoCA score for participants with ≤12 years of education. Cognitive normality was defined as MMSE scores ≥27 and MoCA scores ≥26, with values below these thresholds indicating CI. Based on these criteria, PD patients were classified into PD-CI or PD-NC. Motor function was assessed using MDS-UPDRS part III and disease stage using the H-Y scale. Non-motor symptoms were evaluated with the Non-Motor Symptoms Scale (NMSS). Cognitive assessments were cross-validated by a third neurologist in cases of disagreement, with final classification by consensus. All assessors received standardized training before the study.
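The scoring rules in this paragraph can be written out as a short sketch. The thresholds and the education adjustment come from the text. Capping the adjusted MoCA at its 30-point maximum is an assumption, as are the function names. The two scales are returned as separate flags because the text does not spell out how they were combined into a single PD-CI label.

```python
MMSE_CUTOFF = 27   # MMSE >= 27 counts as cognitively normal (from the text)
MOCA_CUTOFF = 26   # MoCA >= 26 counts as cognitively normal (from the text)
MOCA_MAX = 30      # assumption: adjusted score is capped at the scale maximum

def adjusted_moca(raw_moca, education_years):
    """Add one point for participants with <= 12 years of education."""
    bonus = 1 if education_years <= 12 else 0
    return min(raw_moca + bonus, MOCA_MAX)

def cognitive_status(mmse, raw_moca, education_years):
    """Return per-scale normality flags plus the adjusted MoCA score."""
    moca = adjusted_moca(raw_moca, education_years)
    return {
        "moca_adjusted": moca,
        "mmse_normal": mmse >= MMSE_CUTOFF,
        "moca_normal": moca >= MOCA_CUTOFF,
    }

# Hypothetical participant: MMSE 27, raw MoCA 25, 10 years of education.
example = cognitive_status(mmse=27, raw_moca=25, education_years=10)
```

In this hypothetical case the education adjustment moves the MoCA score from 25 to 26, which is enough to cross the normality threshold.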

Serum sampling and bile acid quantification

Peripheral venous blood (4–5 mL) was collected on admission. After centrifugation at 3500 rpm for 10 min at 4 °C, serum was aliquoted and stored at –80 °C. Biochemical parameters were measured on a Beckman AU5800 autoanalyzer. For BA profiling, a targeted UPLC–MS/MS platform (Waters Corp., USA) was used, identifying 15 BA species including 6 primary and 9 secondary BAs. Internal deuterated standards were obtained from Sigma-Aldrich (Germany). Serum (50 μL) was mixed with 5 μL IS and 500 μL precooled methanol, centrifuged at 12,000 × g, dried under nitrogen, and reconstituted in 100 μL water. Chromatographic separation was performed on a UPLC BEH C18 column (1.7 μm, 100 × 2.1 mm) with a mobile phase of ammonium acetate and formic acid in acetonitrile, gradient-eluted at 0.4 mL/min. Quantification used multiple-reaction monitoring (MRM) in negative ESI mode. All analyses were performed in technical triplicates, with calibration curves (R² > 0.995) and pooled QC samples included for batch correction.

Peripheral venous blood was processed within 2 h and stored at –80 °C in single-use aliquots to prevent repeated freeze–thaw cycles. Stability testing confirmed <8% intra-sample variation over 6 months. Quantification was performed by targeted UPLC–MS/MS (Waters Corp.) using 18 isotope-labeled internal standards (DCA-d₄, CA-d₄, CDCA-d₄, etc.) (Table S12). Each analytical batch was independently calibrated using freshly prepared standard curves (all with R² > 0.99). Every run included low, mid, and high concentration quality control (QC) samples, with both intra- and inter-batch coefficients of variation maintained below 15%. All quantifications were performed using a stable isotope-labeled internal standard to correct for matrix effects and procedural variations.

Statistical analysis

Data were analyzed using R (version 4.4.3), SPSS (version 28.0), and GraphPad Prism (version 8.0). Continuous variables were tested for normality using the Shapiro–Wilk test. Normally distributed variables were compared using t-tests or one-way ANOVA, while non-normally distributed data were analyzed with the corresponding non-parametric tests (Mann–Whitney U or Kruskal–Wallis). Categorical variables were assessed using the Chi-square (χ²) test or Fisher’s exact test. Correlations between bile acids and clinical measures were assessed using Spearman correlation and rank-based partial correlation analyses, with the latter controlling for key demographic factors. Given the number of targeted bile acids and subgroup analyses, we applied the Benjamini–Hochberg false discovery rate (FDR) correction to control for multiple comparisons. Both raw and adjusted p-values are reported, and only FDR-adjusted p < 0.05 was considered statistically significant.
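
For readers unfamiliar with the Benjamini–Hochberg procedure, the adjustment can be written in plain Python as below. This is a generic sketch of the standard step-up method, not the authors' R code; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each raw p-value is multiplied by m/rank, and the backward pass ensures that an adjusted value never exceeds the adjusted value of any larger raw p-value. A bile acid would then be called significant when its adjusted value is below 0.05.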

Multivariate analyses were performed using OPLS-DA in SIMCA (version 13.0) to visualize group separation and identify BAs contributing to discrimination. Model robustness was assessed through sevenfold internal cross-validation and permutation testing with 200 permutations. BAs with VIP > 1.0 and FDR-adjusted p < 0.05 were considered significant contributors. Low permutation intercepts (R²Y < 0.3, Q² < 0.05) were taken to indicate the absence of overfitting. To identify independent variables associated with cognitive impairment, we used Cox regression with forward selection as an exploratory feature-screening step. Final multivariable models were constructed using LASSO regression, which applies penalized likelihood estimation to improve stability and reduce overfitting. Optimal regularization parameters were determined via 10-fold cross-validation using the glmnet package in R.

Machine learning classifiers (logistic regression, random forest, support vector machine) were trained on 80% of the data and validated using stratified 10-fold cross-validation and bootstrap resampling (1,000 iterations). Model discrimination was evaluated using the AUC, sensitivity, and specificity. To determine whether BA signatures provided incremental value beyond demographic variables, DeLong’s test for correlated ROC curves was used to compare (i) clinical-only and (ii) combined clinical + BA models. A p < 0.05 by DeLong test indicated significant improvement in classification performance. All statistical tests were two-sided, and FDR-adjusted p < 0.05 was considered statistically significant. For all bile-acid comparisons, fold change (PD-CI/PD-NC) was computed in addition to p-values to describe the magnitude of between-group differences. For classification models, ROC analyses included 95% bootstrapped confidence intervals for AUCs and displayed bootstrapped confidence bands to visualize model uncertainty.
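
The AUC and its bootstrapped 95% confidence interval can be illustrated with a small standard-library sketch. It assumes the percentile bootstrap and a rank-based (Mann–Whitney) AUC estimate; the text does not state which bootstrap variant was used, and the function names, the fixed seed, and the skipping of single-class resamples are illustrative choices.

```python
import random


def auc(labels, scores):
    """Mann-Whitney estimate of the AUC; tied scores count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class (AUC undefined)
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Here `labels` would hold 1 for PD-CI and 0 for PD-NC, and `scores` the predicted probabilities from a fitted classifier. Comparing two correlated AUCs with DeLong's test requires a covariance estimate that this sketch does not cover.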

Machine learning modeling

All predictors, comprising seven BAs (DCA, LCA, GLCA, TCA, CA, TLCA, GCA) and four clinical variables (disease duration, education years, Hoehn–Yahr stage, and UPDRS-III), were standardized (z-score) after median imputation of missing values. Participants were classified by cognitive status using both the MMSE (≥27 = normal cognition, <27 = impairment) and MoCA (≥26 = normal cognition, <26 = impairment) thresholds. To capture both linear and non-linear relationships between BAs and cognition, five algorithmic families were implemented: logistic regression (baseline linear model), support vector machine with radial and linear kernels, random forest (500 trees), extreme gradient boosting (XGBoost), and a multilayer perceptron (MLP) neural network. Hyperparameters for each classifier were optimized within a nested 10-fold cross-validation framework (inner loop for grid-search tuning, outer loop for model evaluation). This design reduces bias from random data partitioning and limits overfitting. Final performance metrics represent the mean ± SD across 100 randomized resampling iterations.
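
The preprocessing step (median imputation followed by z-scoring) can be sketched for a single predictor as follows. The function names are hypothetical. The sketch assumes that the median, mean, and standard deviation are estimated on the training portion only and then applied to held-out data, which is the usual way to avoid information leakage inside cross-validation; the text does not say whether the authors did this.

```python
from statistics import mean, median, stdev


def fit_scaler(train_column):
    """Estimate imputation and scaling parameters from training data only."""
    observed = [v for v in train_column if v is not None]
    med = median(observed)
    filled = [med if v is None else v for v in train_column]
    return med, mean(filled), stdev(filled)


def transform(column, params):
    """Impute missing values with the training median, then z-score."""
    med, mu, sd = params
    return [((med if v is None else v) - mu) / sd for v in column]


# Hypothetical DCA values with one missing measurement (None)
params = fit_scaler([1.0, None, 3.0, 2.0])
z_test = transform([None, 2.0], params)
```

In the example, the missing training value is replaced by the median (2.0), so a held-out missing value and a held-out value of 2.0 both map to a z-score of 0.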

Model performance was quantified by AUC, accuracy, sensitivity, and specificity. Model calibration was assessed both graphically and quantitatively using the Brier score and Hosmer–Lemeshow goodness-of-fit test. Feature importance was computed for each algorithm (permutation importance for tree-based models; absolute weight coefficients for linear models). To evaluate robustness, 1000 bootstrap resamples were used to estimate 95% confidence intervals and rank-stability indices for each feature’s importance. Recursive feature elimination based on the random-forest algorithm (RF-RFE) was then applied to identify the smallest predictor subset achieving maximal accuracy. The RF-RFE-derived feature set was retrained using the random forest classifier, and its classification and calibration metrics were re-evaluated under the same cross-validation procedure. To assess incremental discrimination contributed by bile acids beyond demographic covariates, DeLong’s test for correlated ROC curves compared the AUCs of (i) clinical-only and (ii) combined clinical + BA models. A two-tailed p < 0.05 was considered statistically significant.
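
Of the calibration measures mentioned above, the Brier score is the simplest to state: it is the mean squared difference between predicted probability and observed outcome. A minimal sketch, with a hypothetical function name, is:

```python
def brier_score(labels, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(labels) != len(probabilities):
        raise ValueError("labels and probabilities must have equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)
```

A perfectly calibrated and perfectly confident model scores 0, while a model that always predicts 0.5 scores 0.25, so lower values indicate better probabilistic predictions.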

Data availability

The data supporting this study’s findings are available from the corresponding author upon reasonable request. The data are not publicly available due to patient privacy and institutional ethical restrictions.

Code availability

The code supporting the findings of this study is available from the corresponding author upon reasonable request.

References

Leite Silva, A. B. R. et al. Premotor, nonmotor and motor symptoms of Parkinson’s disease: a new clinical state of the art. Ageing Res. Rev. 84, 101834 (2023).
Aarsland, D. et al. Parkinson disease-associated cognitive impairment. Nat. Rev. Dis. Prim. 7, 47 (2021).
Baiano, C., Barone, P., Trojano, L. & Santangelo, G. Prevalence and clinical aspects of mild cognitive impairment in Parkinson’s disease: a meta-analysis. Mov. Disord. 35, 45–54 (2020).
Hoops, S. et al. Validity of the MoCA and MMSE in the detection of MCI and dementia in Parkinson disease. Neurology 73, 1738–1745 (2009).
Kim, J. I., Sunwoo, M. K., Sohn, Y. H., Lee, P. H. & Hong, J. Y. The MMSE and MoCA for screening cognitive impairment in less educated patients with Parkinson’s disease. J. Mov. Disord. 9, 152–159 (2016).
Nie, K. et al. Distinct bile acid signature in Parkinson’s disease with mild cognitive impairment. Front. Neurol. 13, 897867 (2022).
Nabizadeh, F., Valizadeh, P., Fallahi, M. S. & Alzheimer’s Disease Neuroimaging Initiative. Bile acid profile associated with CSF and PET biomarkers in Alzheimer’s disease. Aging Clin. Exp. Res. 36, 62 (2024).
Braak, H. et al. Staging of brain pathology related to sporadic Parkinson’s disease. Neurobiol. Aging 24, 197–211 (2003).
Tan, A. H., Lim, S. Y. & Lang, A. E. The microbiome-gut-brain axis in Parkinson disease—from basic research to the clinic. Nat. Rev. Neurol. 18, 476–495 (2022).
Zhang, X., Tang, B. & Guo, J. Parkinson’s disease and gut microbiota: from clinical to mechanistic and therapeutic studies. Transl. Neurodegener. 12, 59 (2023).
Li, P. et al. Gut microbiota dysbiosis is associated with elevated bile acids in Parkinson’s disease. Metabolites 11, https://doi.org/10.3390/metabo11010029 (2021).
Ridlon, J. M. & Gaskins, H. R. Another renaissance for bile acid gastrointestinal microbiology. Nat. Rev. Gastroenterol. Hepatol. 21, 348–364 (2024).
Lin, S. et al. Bile acids and their receptors in regulation of gut health and diseases. Prog. Lipid Res. 89, 101210 (2023).
Quinn, M. et al. Bile acids permeabilize the blood brain barrier after bile duct ligation in rats via Rac1-dependent mechanisms. Dig. Liver Dis. 46, 527–534 (2014).
McMillin, M. & DeMorrow, S. Effects of bile acids on neurological function and disease. FASEB J. 30, 3658–3668 (2016).
Perino, A. et al. Central anorexigenic actions of bile acids are mediated by TGR5. Nat. Metab. 3, 595–603 (2021).
Castellanos-Jankiewicz, A. et al. Hypothalamic bile acid-TGR5 signaling protects from obesity. Cell Metab. 33, 1483–1492 e1410 (2021).
Graham, S. F. et al. Biochemical profiling of the brain and blood metabolome in a mouse model of prodromal Parkinson’s disease reveals distinct metabolic profiles. J. Proteome Res. 17, 2460–2469 (2018).
Graham, S. F. et al. Metabolomic profiling of bile acids in an experimental model of prodromal Parkinson’s disease. Metabolites 8, https://doi.org/10.3390/metabo8040071 (2018).
Payne, T. et al. A double-blind, randomized, placebo-controlled trial of ursodeoxycholic acid (UDCA) in Parkinson’s disease. Mov. Disord. 38, 1493–1502 (2023).
Rosa, A. I. et al. Novel insights into the antioxidant role of tauroursodeoxycholic acid in experimental models of Parkinson’s disease. Biochim. Biophys. Acta Mol. Basis Dis. 1863, 2171–2181 (2017).
Khalaf, K., Tornese, P., Cocco, A. & Albanese, A. Tauroursodeoxycholic acid: a potential therapeutic tool in neurodegenerative diseases. Transl. Neurodegener. 11, 33 (2022).
Zhao, Z. et al. Multi-omics identifies microbiota-derived deoxycholic acid as a key mediator of blood-brain barrier dysfunction in Parkinson’s disease. Imeta. 4, e70076 (2025).
Li, C. et al. TGR5 deficiency in excitatory neurons ameliorates Alzheimer’s pathology by regulating APP processing. Sci. Adv. 10, eado1855 (2024).
Mohanty, I. et al. The underappreciated diversity of bile acid modifications. Cell 187, 1801–1818 e1820 (2024).
Fleishman, J. S. & Kumar, S. Bile acid metabolism and signaling in health and disease: molecular mechanisms and therapeutic targets. Signal Transduct. Target. Ther. 9, 97 (2024).
Xing, C. et al. Roles of bile acids signaling in neuromodulation under physiological and pathological conditions. Cell Biosci. 13, 106 (2023).
Kim, S. et al. Transneuronal propagation of pathologic alpha-synuclein from the gut to the brain models Parkinson’s disease. Neuron 103, 627–641 e627 (2019).
Shao, Y. et al. Comprehensive metabolic profiling of Parkinson’s disease by liquid chromatography-mass spectrometry. Mol. Neurodegener. 16, 4 (2021).
Bertolini, A., Fiorotto, R. & Strazzabosco, M. Bile acids and their receptors: modulators and therapeutic targets in liver inflammation. Semin Immunopathol. 44, 547–564 (2022).
Ehtezazi, T., Rahman, K., Davies, R. & Leach, A. G. The pathological effects of circulating hydrophobic bile acids in Alzheimer’s disease. J. Alzheimers Dis. Rep. 7, 173–211 (2023).
Nishiwaki, H. et al. Meta-analysis of shotgun sequencing of gut microbiota in Parkinson’s disease. NPJ Parkinsons Dis. 10, 106 (2024).
Efremova, I. et al. Epidemiology of small intestinal bacterial overgrowth. World J. Gastroenterol. 29, 3400–3421 (2023).
Balesaria, S. et al. Common genetic variants in the bile acid synthesis enzyme CYP7A1 are associated with severe primary bile acid diarrhea. Gastroenterology 163, 517–519 e512 (2022).
Vich Vila, A. et al. Impact of commonly used drugs on the composition and metabolic function of the gut microbiota. Nat. Commun. 11, 362 (2020).
Dubois, B. et al. Diagnostic procedures for Parkinson’s disease dementia: recommendations from the movement disorder society task force. Mov. Disord. 22, 2314–2324 (2007).
Kandiah, N. et al. Montreal Cognitive Assessment for the screening and prediction of cognitive decline in early Parkinson’s disease. Park. Relat. Disord. 20, 1145–1148 (2014).

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 82371360), the Natural Science Foundation of Jiangsu Province (Grant No. BK20231237), the Science and Technology Innovation Team and Leading Talent Project of Jiangsu Medical College, the Jiangsu Provincial Geriatric Health Research Project (Grant No. LKM2024016), the Construction Project of High-Level Hospital of Jiangsu Province (Grant Nos. LCZX202514 and GSPSJ20240804), the Project supported by the Affiliated Hospital of Xuzhou Medical University (Grant No. 2021ZA24), the Science and Technology Project of Xuzhou Health Commission (Grant No. XWKYHT20230062), the Jiangsu Association for Science and Technology Youth Science and Technology Talents Lifting Project (Grant No. JSTJ-2025-771), and the Jiangsu Provincial Health Commission Scientific Research Project (Grant No. ZQ2024018).

Author information

These authors contributed equally: Tong Shen, Can Cui, Ruiqi Liu.

Authors and Affiliations

Department of Neurology, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu Province, China
Tong Shen, Jie Zu, Wei Zhang, Liguo Dong, Chuanying Xu, Hailong Lu & Guiyun Cui
Jiangsu Key Laboratory of Brain Disease and Bioinformation, Xuzhou Medical University, Xuzhou, China
Can Cui
Department of Neurology and Cell Biology, School of Life Science, Xuzhou Medical University, Xuzhou, China
Ruiqi Liu & Chu Zhang
Department of Genetic Medicine, Xuzhou Maternity and Child Health Care Hospital Affiliated to Xuzhou Medical University, Xuzhou, Jiangsu, China
Xin Yin
Department of Basic Medical Science, Jiangsu Medical College, Yancheng, Jiangsu, China
Xuebin Qu

Contributions

Conceptualization, Guiyun Cui, Xuebin Qu, Chu Zhang; Methodology, Tong Shen, Can Cui, Ruiqi Liu, Xin Yin; Investigation (recruitment, clinical assessment, sampling), Jie Zu, Wei Zhang, Liguo Dong, Chuanying Xu, Hailong Lu; Metabolomics and quality control, Can Cui, Xin Yin; Data curation, Ruiqi Liu, Tong Shen; Formal analysis (statistics/OPLS-DA), Ruiqi Liu, Tong Shen; Software and machine learning, Tong Shen; Resources and project administration, Guiyun Cui, Xuebin Qu, Chu Zhang; Supervision, Guiyun Cui, Xuebin Qu, Chu Zhang; Funding acquisition, Guiyun Cui, Xuebin Qu, Chu Zhang; Writing—original draft, Tong Shen, Can Cui, Ruiqi Liu; Writing—review and editing, all authors.

Corresponding authors

Correspondence to Guiyun Cui, Xuebin Qu or Chu Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Disclosure statement

Language editing assistance was provided using ChatGPT4O (OpenAI, San Francisco, USA) to improve the clarity, consistency, and grammar of the manuscript. This tool was used solely for linguistic refinement and abbreviation standardization; no scientific content, data interpretation, or conclusions were modified.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Shen, T., Cui, C., Liu, R. et al. Classifying the risk of cognitive impairment in Parkinson’s disease using serum bile acid profiles and machine learning. npj Parkinsons Dis. 12, 18 (2026). https://doi.org/10.1038/s41531-025-01229-z

Received: 11 August 2025
Accepted: 24 November 2025
Published: 08 December 2025
Version of record: 13 January 2026
DOI: https://doi.org/10.1038/s41531-025-01229-z
