Abstract
Associations between the gut microbiota and Parkinson’s disease (PD) have been widely investigated. However, the replicable biomarkers for PD diagnosis across multiple populations remain elusive. Herein, we performed a meta-analysis to investigate the pivotal role of the gut microbiome in PD and its potential diagnostic implications. Six 16S rRNA gene amplicon sequence datasets from five independent studies were integrated, encompassing 550 PD and 456 healthy control samples. The analysis revealed significant alterations in microbial composition and alpha and beta diversity, emphasizing altered gut microbiota in PD. Specific microbial taxa, including Faecalibacterium, Roseburia, and Coprococcus_2, known as butyrate producers, were notably diminished in PD, potentially contributing to intestinal inflammation. Conversely, genera such as Akkermansia and Bilophila exhibited increased relative abundances. A network-based algorithm called NetMoss was utilized to identify potential biomarkers of PD. Afterwards, a classification model incorporating 11 optimized genera demonstrated high performance. Further functional analyses indicated enrichment in pathways related to neurodegeneration and metabolic pathways. These findings illuminate the intricate relationship between the gut microbiota and PD, offering insights into potential therapeutic interventions and personalized diagnostic strategies.
Introduction
Parkinson’s disease (PD) is a common neurodegenerative disorder characterized by motor symptoms, including tremor, rigidity, bradykinesia, and postural instability. The hallmark pathology of PD is the loss of dopaminergic neurons in the substantia nigra and the presence of α-synuclein-containing Lewy bodies1. Many PD patients experience constipation and other gastrointestinal symptoms, which often manifest years before the onset of motor symptoms. These gastrointestinal disturbances may be linked to the observed dysbiosis of gut microbiota in PD patients2. The bidirectional communication between the gut and the central nervous system, known as the microbiota-gut-brain axis, has garnered significant attention in the context of neurodegenerative diseases, including PD3. Emerging research suggests that the gut microbiota, a complex community of microorganisms residing in the gastrointestinal tract, plays a crucial role in modulating neuroinflammation, neurotransmitter systems, and neuronal function, potentially influencing the development of PD4. However, the exact pathogenesis has not been fully elucidated.
Recent studies investigating gut microbiota profiles in PD patients have reported dysbiosis compared to healthy controls (HCs). Bacterial genera, including Prevotella, Faecalibacterium, Blautia, Fusicatenibacter, and Haemophilus appeared to be depleted in fecal samples from PD patients in multiple cohort studies5. These genera are known to produce short-chain fatty acids (SCFAs), possessing anti-inflammatory effects in the gut and supporting intestinal epithelial integrity6. In contrast, the relative abundances of Akkermansia, Lactobacillus, Bifidobacterium, Alistipes, Christensenella, and Escherichia genera have been found to increase consistently in PD patients relative to HCs3,7. Several mechanisms have been proposed to link these genera to increased inflammation, α-synuclein aggregation, and modulation of neurotransmitters in PD pathogenesis5,8. However, heterogeneity exists across published studies due to factors such as geography, diet, medication use, and methodology. For example, some genera, including Bacteroides, Parabacteroides, Butyricicoccus, Butyrivibrio, Clostridium, Lachnospira, Megasphaera, and Klebsiella, showed alterations in opposite directions3. Large and global meta-analyses of bacterial taxa in PD patients are needed to deliver a more nuanced understanding of the gut microbiota associated with PD, effectively mitigating study-specific biases and enhancing our insights into this intricate relationship.
To comprehensively address the complexities of the gut-brain axis in PD, rigorous large-scale meta-analyses have been conducted. Nie et al. performed a significant analysis with 2269 samples via 16S rRNA gene sequencing and 236 samples through shotgun metagenomics, revealing noteworthy shifts in microbial composition, indicating the potential role of pro-inflammatory factors and suggesting microbial biomarkers for PD prediction9. Another meta-analysis across ten case-control studies was carried out by Toh et al., revealing significant but geographically varied microbiome differences between 969 PD patients and 734 controls10. In addition, Romano et al. performed a meta-analysis of ten 16S microbiome datasets, identifying consistent PD-associated gut microbiome alterations, indicating a potential pro-inflammatory dysbiosis linked to PD’s gastrointestinal symptoms11. Bardenhorst et al. harmonized methodologies in a systematic review, pinpointing core microbial associations, particularly the reduction of butyrate-producing taxa and disruptions in the gut mucosal layer, potentially contributing to PD pathogenesis4. These studies collectively emphasized the critical role of the gut microbiome in PD, showcasing the potential of microbial biomarkers and shedding light on PD etiology. However, none of these studies have taken batch effects across different studies into consideration, possibly causing bias when integrating the microbial data from various studies.
Our current research focused on comprehensively investigating the relationship between PD and the gut microbiome by integrating large-scale microbiota datasets, with the goal of identifying microbial biomarkers for PD diagnosis using a network-based approach to eliminate the possible bias caused by batch effects from different studies. Additionally, the study aimed to develop a robust classification model based on key microbial genera to facilitate accurate detection of PD.
Results
Characteristics of datasets included in this meta-analysis
After searching the published literature for studies with available raw 16S rRNA gene sequencing data of human fecal samples and patient information, we included a total of six 16S rRNA gene sequencing datasets from 5 studies conducted in the Netherlands, Finland, China, Japan, and Germany. In total, 550 PD and 456 HC fecal samples were analyzed (Fig. 1). The PD groups ranged from 26 to 223 patients, with mean ages of 64.9–68.2 years. The HC groups ranged from 38 to 137 subjects, aged 64.5–69.8 years on average. Female participants comprised 26.9–57.4% of the PD groups and 34.2–50% of HCs. Body mass index (BMI) was similar between PD and HC. Microbiome sequencing targeted different 16S regions (including V3-V4 and V4) using various primers across studies. All sequencing was performed on the Illumina MiSeq v3 platform to obtain paired-end reads of different lengths. Sequence data are available via public databases, including the European Nucleotide Archive (ENA), the National Center for Biotechnology Information (NCBI), and the DNA Data Bank of Japan (DDBJ), under the accession numbers listed in Table 1. By integrating 16S rRNA gene sequencing data on the gut microbiome from 5 independent studies, this meta-analysis allows for a more robust analysis of microbiome signatures associated with PD compared to healthy controls.
Six 16S rRNA gene sequencing datasets from 5 studies were collected for meta-analysis. After the curation of taxonomy data, a network-based approach called NetMoss was applied to feature selection. Finally, machine learning was used for the construction of the classification model. Created with BioRender.com.
Alterations of microbial communities in PD
Microbial composition analysis was performed on the 6 combined datasets to discern differences between PD patients and HCs. Notably, the alpha diversity of the gut microbiota differed significantly between these two groups (Shannon index, P = 0.0014, Fig. 2a; Simpson index, P = 0.0016, Fig. 2b), indicating an altered gut microbiome in the PD group. Then, beta diversity was compared using permutational multivariate analysis of variance (PERMANOVA), showing that microbial composition differed significantly both among studies and between groups (both P < 0.001, Fig. 2c). Principal coordinate analysis (PCoA) further revealed that samples separated significantly by study along both axes (PC1 and PC2, both P < 0.001, Fig. 2c). Between the PD and HC groups, by contrast, only PC2 differed significantly (PC1, P = 0.153; PC2, P = 0.003; Fig. 2c). These results demonstrated that the difference among studies was much greater than that between case–control groups, suggesting that the factor “study” is a predominant confounder when integrating microbial datasets.
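To make the PERMANOVA comparison concrete, the following minimal Python sketch computes the one-way pseudo-F statistic from a distance matrix and derives a permutation P value. It is an illustration only, not the vegan implementation used in this study; the function names, the toy one-dimensional data, and the default of 999 permutations are assumptions made for the example.

```python
import random

def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F from a square distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        pairs = sum(dist[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(dist, labels, n_perm=999, seed=0):
    """Return (pseudo-F, permutation P value) by shuffling group labels."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)

# Hypothetical one-dimensional "samples" (illustrative values only).
points = [0, 1, 2, 10, 11, 12]
groups = ["HC", "HC", "HC", "PD", "PD", "PD"]
dist = [[abs(a - b) for b in points] for a in points]
f_obs, p_value = permanova(dist, groups)
```

With Euclidean distances the pseudo-F coincides with the classical ANOVA F statistic, which gives a simple way to sanity-check the sketch; with Bray–Curtis dissimilarities the same formula applies, but the value has no univariate counterpart.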
Alpha diversity of the gut microbiota in the two groups, as indicated by the (a) Shannon index and (b) Simpson index. c Principal coordinate analysis of samples from all six datasets based on Bray–Curtis distance. d Relative abundances of microbes at the phylum level in PD and HC across six datasets. e Cladogram plot showing differential analysis of microbial relative abundances at different levels. Taxa with |LDA|> 3, and P value < 0.05 were chosen for illustration. Statistical significance was calculated by Wilcoxon rank-sum test (a, b) and PERMANOVA analysis or Kruskal–Wallis test (c). Abbreviations for different datasets (NL, CHN, JPN, GER, FIN1, and FIN2) were indicated in Table 1.
The dominant phyla in the gut microbiota of these individuals included Firmicutes and Bacteroidetes, followed by Actinobacteria, Proteobacteria, Verrucomicrobia, and Fusobacteria (Fig. 2d). At the class level, the gut microbiota was dominated by Clostridia, Bacteroidia, Actinobacteria, and Negativicutes (Additional file 1: Supplementary Fig. S1a). In addition, our analyses also demonstrated the intestinal microbial compositions at other taxa levels in PD and HC groups across different datasets (Additional file 1: Supplementary Fig. S1b–d).
To further identify the differences in gut microbial taxa between PD and HC individuals, linear discriminant analysis effect size (LEfSe) analysis was performed. In general, some remarkably altered microbial taxa and their evolutionary relationships were illustrated in the cladogram (Fig. 2e). At the phylum level, Euryarchaeota was found to be enriched in the HC group. In addition, the relative abundances of the class Methanomicrobia, the order Methanosarcinales, the families Methanosarcinaceae, Clostridiaceae_1, and Veillonellaceae, and the genera Methanosarcina, Butyricicoccus, and Faecalibacterium significantly decreased in the PD group in comparison with those in the HC group (Fig. 2e & Additional file 1: Supplementary Fig. S2). In contrast, the phyla Actinobacteria, Synergistetes, Tenericutes, and Verrucomicrobia exhibited significantly increased relative abundances in PD patients compared to healthy controls (Fig. 2e & Additional file 1: Supplementary Fig. S2a). In addition, a total of 9 classes (e.g., Actinobacteria, Bacilli, Coriobacteriia; Additional file 1: Supplementary Fig. S2b), 10 orders (e.g., Bacillales, Coriobacteriales, Desulfovibrionales; Additional file 1: Supplementary Fig. S2c), 20 families (e.g., Akkermansiaceae, Barnesiellaceae, Desulfovibrionaceae; Additional file 1: Supplementary Fig. S2d), and 38 genera were enriched in the PD group (Additional file 1: Supplementary Fig. S2e). To summarize, these results highlighted the distinct gut microbiota profiles of PD patients compared to healthy controls, suggesting the potential role of fecal microbiota as biomarkers for the diagnosis of PD.
Identification of key bacterial genera as potential PD biomarkers
At the genus level, our further differential relative abundance analysis unveiled a notable dysbiosis in the composition of the gut microbiota in patients with PD. In the combined 6 datasets, the relative abundances of 32 genera exhibited significant alterations in the PD patients (FDR < 0.05, Wilcoxon rank-sum test, Fig. 3a). Evidently, several genera known as butyrate producers, including Faecalibacterium, Roseburia, and Coprococcus_2, were conspicuously diminished in the PD cohort. Butyrate is an SCFA with anti-inflammatory effects mainly produced by microbiota in the gut, suggesting that these genera may be linked to anti-inflammation in PD. Conversely, some genera, such as Fournierella, Faecalitalea, Akkermansia, Hungatella, Negativibacillus, and Bilophila, displayed discernible increases in relative abundance among PD patients (Fig. 3a).
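The FDR threshold applied to the per-genus Wilcoxon P values can be illustrated with a short sketch. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment below is an assumption, and the function name and example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values for four genera (illustrative values only).
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in fdr]
```

A genus would then be reported as significantly altered when its adjusted value falls below 0.05, mirroring the FDR < 0.05 criterion used here.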
a Differential analysis of relative abundances at the genus level in the integrated datasets (left) or across studies (right). The significantly differential genera with FDR < 0.05 in the integrated analysis were chosen for visualization. The values in the cells in the right panel represent FDR values. b The top 30 genera with the highest NetMoss scores. NetMoss scores were generated by the NetMoss algorithm to identify the key microbial genera that influenced the network of dysbiotic microbiome. FC Fold Change, indicates fold change in relative abundances. c The network for healthy and PD samples on the integrated six datasets. Each node represents one taxon, each edge represents the correlation between two taxa, while the node size indicates NetMoss score. Statistical significance was calculated by Wilcoxon rank-sum test. Abbreviations for different datasets (NL, CHN, JPN, GER, FIN1, and FIN2) were indicated in Table 1.
Subsequently, we conducted further differential analyses on these altered genera within separate datasets. Notably, the relative abundances of certain genera, including Oscillibacter (Additional file 1: Supplementary Fig. S3a), Akkermansia, Hungatella, Eisenbergiella, UBA1819, Barnesiella, Parabacteroides, Anaerotruncus, and Erysipelatoclostridium (Fig. 3a), demonstrated consistent elevation in PD patients across all the studies. However, for some other genera, the trends of alterations in relative abundances exhibited inconsistency across the studies, such as Faecalibacterium (Additional file 1: Supplementary Fig. S3b), Faecalitalea, Fournierella, Lactobacillus, Tyzzerella, Staphylococcus, and Coprococcus_2 (Fig. 3a).
Considering the marked differences in gut microbiota profiles among various studies, a network-based algorithm called NetMoss was applied to eliminate batch effects and to identify potential biomarkers of PD. When integrating the 6 microbiota datasets, we calculated the NetMoss score of each microbial genus by using the NetMoss package (Version 2). The top 30 genera with the highest NetMoss scores are shown in Fig. 3b, including Faecalibacterium, Blautia, Coprococcus_2, Sutterella, Negativibacillus, and Odoribacter. We next identified 35 potential microbial biomarkers of PD based on these results (NetMoss score >0.6).
So far, we have used 3 distinct methods, NetMoss, Wilcoxon, and LEfSe, to identify differential bacterial taxa between PD patients and controls. As demonstrated in the Venn diagram, a total of 5 genera, including Negativibacillus, Alistipes, Faecalibacterium, Butyricicoccus, and Coprococcus_2, were identified as potential microbial biomarkers by all 3 analytical approaches (Additional file 1: Supplementary Fig. S4). Moreover, 22 of the 35 bacterial genera were uniquely identified by the NetMoss algorithm, suggesting that accounting for batch effects can reveal more candidate genera than the Wilcoxon or LEfSe methods when multiple datasets are pooled (Additional file 1: Supplementary Fig. S4). To further analyze the associations of bacterial taxa, we then constructed the network separately in the PD and HC groups. The results displayed different hub microbial taxa and network structures between the PD and HC groups (Fig. 3c), suggesting the importance of microbial interaction networks in exploring the contribution of gut microbiota to PD.
Construction of the microbial classification model for PD
The above results have demonstrated the significant changes in the relative abundances of potential key microbial taxa of PD. The hub microbial genera identified by NetMoss were then used to build classification models through a machine learning method called random forest. By integrating the 6 PD datasets, we utilized 10-fold cross-validation to construct the random forest model. Additionally, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to evaluate the model performance. Based on the 35 hub microbial genera determined by the NetMoss method (NetMoss score >0.6), we obtained a classification model with high performance (Accuracy = 0.842, AUC = 0.909, Fig. 4a).
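The evaluation scheme described above can be sketched in a few lines of Python. The sketch shows only how samples might be partitioned into 10 folds and how an AUC can be computed from predicted scores; it does not train a random forest. Stratifying the folds by class is an assumption not stated in the text, and the function names and toy values are hypothetical.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal the shuffled members of each class round-robin into folds.
        for position, i in enumerate(members):
            folds[position % k].append(i)
    return folds

def roc_auc(labels, scores):
    """AUC as the probability that a random case outranks a random control."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if c > h else 0.5 if c == h else 0.0
               for c in cases for h in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical labels (1 = PD, 0 = HC) and classifier scores, for illustration only.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)

# Hypothetical cohort of 20 PD and 30 HC samples split into 10 folds.
cohort = [1] * 20 + [0] * 30
folds = stratified_folds(cohort, k=10)
```

In each round, one fold would serve as the test set and the remaining nine as the training set, with accuracy and AUC summarized over the held-out predictions.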
a The receiver operating characteristic (ROC) curve of the models was calculated based on 35 genera identified by NetMoss (NetMoss score >0.6 as threshold) using the random forest method. b Recursive feature elimination (RFE) method showing the accuracy of models based on different numbers of features. c Variable importance plot showing the MeanDecreaseGini scores of 35 genera calculated by the random forest method. d The ROC curve of the models was calculated based on 11 genera using the random forest method. In our independent metagenomics validation, relative abundances of the genera (e) Blautia, (f) Lachnoclostridium and (g) Faecalibacterium in the HC and PD groups. For the box plots: centerline, median; box, IQR (the range between the 25th and 75th percentiles); whiskers, 1.5× IQR. Ns not significant; *P < 0.05; **P < 0.01. Statistical significance was calculated by Wilcoxon rank-sum test.
Considering the cost of measured genera during the actual diagnosis process, it is necessary to optimize the classification model by reducing the included feature numbers. To determine the minimum features required for good performance, the recursive feature elimination (RFE) method was applied. For each feature number ranging from 1 to 35, we shuffled the 10-fold test sets 10 times and calculated the optimal performance of the models. Notably, the results showed that from 10 features, the accuracy of the models started to be stable at a relatively high level (Fig. 4b). Then, we calculated the MeanDecreaseGini scores of the 35 bacterial genera, which quantified the contribution of a feature to the model. Subsequently, the top genera with the highest MeanDecreaseGini scores were selected to build random forest models, demonstrating that the model using 11 genera had relatively high accuracy for prediction and minimum feature numbers (Fig. 4c). This optimal PD classification model achieved high performance both in the training set (Accuracy = 1, AUC = 1) and test set (Accuracy = 0.802, AUC = 0.864, Fig. 4d). This optimized classification model included Alistipes, Blautia, Butyricicoccus, Erysipelatoclostridium, Faecalibacterium, Family XIII AD3011 group, Family XIII UCG-001, Lachnoclostridium, Lachnospiraceae FCS020 group, Lachnospiraceae UCG-004, and Ruminococcaceae UCG-002 as predictors.
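As an illustration of what the MeanDecreaseGini score measures, the sketch below computes the decrease in Gini impurity produced by a single threshold split on one genus. In a random forest this quantity is accumulated over every split that uses the feature and averaged over trees; the sketch covers only the single-split building block, and the function names, abundances, and thresholds are hypothetical.

```python
def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting samples at values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted_children

# Hypothetical relative abundances of one genus in four samples.
abundance = [0.1, 0.2, 0.8, 0.9]
status = ["PD", "PD", "HC", "HC"]
clean_split = gini_decrease(abundance, status, threshold=0.5)
poor_split = gini_decrease(abundance, status, threshold=0.15)
```

A split that separates the groups perfectly removes all impurity, whereas a poorly placed threshold removes less, so genera that repeatedly yield clean splits accumulate higher importance scores.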
Consistently, we collected stool samples of 6 PD patients and 6 matched HC individuals from our hospital and conducted metagenomic analyses on the gut microbiota, showing a decrease in the relative abundances of Blautia (P < 0.01, Fig. 4e), Lachnoclostridium (P < 0.05, Fig. 4f) and Faecalibacterium (Fig. 4g) in PD patients. These data further validated the pivotal role of microbial features in our classification model, underscoring their effectiveness in distinguishing PD patients from healthy controls.
Changes in microbial functions of PD
Based on the integrated 6 datasets, we further examined the potential gut microbiota functional changes in PD patients by calculating the Reporter Score. A total of 33 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were enriched in the PD group. The pathways “neurodegeneration-multiple diseases” and “Parkinson disease” were two important predicted pathways with increased abundances in the PD group, suggesting that alterations in gut microbiota profiles might be involved in PD pathogenesis (Fig. 5a). Moreover, the pathway named “Metabolic pathways” was predicted to be enriched in PD patients (Fig. 5a), indicating that disturbed metabolic pathways might be potential mechanisms of PD pathogenesis. Specifically, some PD-related metabolic pathways, including lipoic acid metabolism (Fig. 5b), glycerophospholipid metabolism (Fig. 5c), and alpha-linolenic acid metabolism, were also predicted to be elevated in PD (Fig. 5a). For lipoic acid metabolism, the increase in the pdhA, pdhB, aceE, sucA, bkdA1, bkdA2, and LIPT1 genes of gut microbiota might be related to the altered pathway in PD (Fig. 5b & Additional file 1: Supplementary Fig. S5a). Meanwhile, the elevation of the PCYT1, pcs, pldA, clsC, dagK, pcrB, CRLS, and pgpC genes in intestinal bacteria was involved in the enrichment of glycerophospholipid metabolism (Fig. 5c & Additional file 1: Supplementary Fig. S5b). Consistently, our metagenomic analyses validated the remarkable increase in the abundances of the aceE, sucA, pldA, and pgpC genes in PD patients (Additional file 1: Supplementary Fig. S6). These functional analyses suggested that the altered metabolic pathways regarding lipoic acid, glycerophospholipid, and alpha-linolenic acid might be the possible mechanisms underlying the involvement of gut microbiota dysbiosis in PD.
a Reporter score for level-3 KEGG pathways showing functional enrichment in different groups. Reporter Score >0 represents enrichment of pathways in PD group, and Reporter Score = 2 was set as the threshold for visualization. Network plots for (b) lipoic acid metabolism and (c) glycerophospholipid metabolism pathways. d Heatmap showing the associations between genus-level relative abundances and pathway abundances. The color gradient indicates the Spearman’s rank coefficient of correlation. **P < 0.01; ***P < 0.001. Statistical significance was calculated by Spearman correlation test.
To determine the impact of key genera we identified in the classification model on these potentially altered functional pathways, Spearman correlation analyses were performed based on the KEGG function abundance and the genus-level relative abundances. As shown in Fig. 5d, Blautia, Butyricicoccus, Erysipelatoclostridium, Faecalibacterium, and Lachnoclostridium abundances were significantly correlated with Parkinson disease and lipoic acid metabolism pathways. Moreover, the genera Blautia, Erysipelatoclostridium, Alistipes, and Lachnoclostridium were remarkably associated with Glycerophospholipid metabolism.
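For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. The following sketch implements this definition with the standard library; the function names and the example abundances are hypothetical, and no significance test is included.

```python
import math

def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical genus abundances and pathway abundances in four samples.
genus = [0.01, 0.04, 0.09, 0.16]
pathway = [10.0, 20.0, 30.0, 40.0]
rho = spearman(genus, pathway)
```

Because only ranks enter the calculation, any monotonic relationship between a genus and a pathway yields a coefficient of magnitude 1, regardless of whether the relationship is linear.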
Discussion
Our current research is the first meta-analysis aiming to construct a classification model for PD diagnosis by integrating the large-scale microbial sequencing data of PD using a network-based algorithm that minimized the confounding effects across different studies. We first revealed alterations in the gut microbiota of PD patients at various taxon levels, assessing the capability of microbial markers for the diagnosis of PD as a noninvasive approach. Then, potential biomarkers for PD were identified by a novel method called NetMoss. In addition, features were further selected for the random forest model construction. Finally, functional analyses indicated the possible mechanisms by which the gut microbiota predicts PD. The findings presented in this study contribute significant insights into the intricate relationship between PD and the gut microbiome.
As bidirectional communication along the microbiota-gut-brain axis has been an area of growing interest, an increasing number of studies have reported the gut microbiota as a diagnostic marker for PD. Previous studies have identified microbial biomarkers by comparing the gut microbiota between PD patients and healthy controls. In one study, three different supervised machine learning algorithms were used to analyze 16S rRNA gene sequencing data derived from 6 studies, obtaining a model for PD prediction based on a subset of 22 bacterial families12. Likewise, another study demonstrated that random forest models for PD prediction using 11 genera or 6 genes related to inflammation achieved accuracy rates of 80.7% or 91.7%, respectively9. However, the changes in gut microbiota can vary greatly across different studies3,13, which was consistent with the controversial alterations in genus-level relative abundances across different studies revealed by beta diversity and differential analysis of genus-level microbes in our research. Therefore, advanced methods for removing batch effects across studies are fundamental for the construction of a robust and reliable prediction model based on microbial markers.
Multiple algorithms have been developed to integrate large-scale sequencing data of gut microbiota for the recognition of key taxa, such as analysis of compositions of microbiomes with bias correction (ANCOM-BC), percentile normalization14, and Bayesian Dirichlet-multinomial regression meta-analysis15. Some meta-analyses have utilized such algorithms to identify the fundamental bacterial taxa in PD development. For example, Toh et al. used the ANCOM-BC method, which is based on a linear regression framework, to perform differential abundance analyses of bacterial taxa10. In addition, Romano et al. applied a classical random-effects meta-analysis method to conduct the integrative analysis11. However, the novel network-based algorithm NetMoss, which has several advantages, has rarely been used in PD research16. First, it focuses on shifts in network modules rather than the relative abundances of individual taxa to identify biomarkers, making it more sensitive to subtle perturbations between health and disease states. Second, it integrates data across multiple cohorts while reducing batch effects. By weighting networks from larger cohorts more heavily, it captures biological features from different studies more accurately. The bacterial genera identified by NetMoss show better prediction performance in case‒control classification than those identified by Wilcoxon, ComBat, limma, Partial Least Squares Discriminant Analysis (PLS-DA), and random forest methods16. Third, compared to other network-based methods such as Neighbor Shift (NESH) and Jaccard Edge Index (JEI), NetMoss better distinguishes transitioned network modules and is more robust to noise16. Collectively, NetMoss utilizes module shifts in microbial association networks to find robust biomarkers, enabling more sensitive and accurate data integration across studies compared to traditional differential abundance techniques.
We first observed elevated levels of mucin-degrading genera, Akkermansia and Hungatella, and inflammation-related genus like Bilophila, suggesting a potential shift toward a disrupted intestinal barrier or a pro-inflammatory gut environment17,18,19. Conversely, alterations in the gut microbiota of PD patients, particularly decreases in butyrate-producing genera such as Faecalibacterium, Roseburia, and Coprococcus_2, have significant implications20,21,22,23. Butyrate, an SCFA, plays a crucial role in maintaining intestinal epithelial integrity and possesses anti-inflammatory properties. Its reduction in PD patients may contribute to compromised gut barrier function and heightened inflammation, potentially exacerbating the progression of PD24,25,26. These distinct differences in microbial relative abundances between PD and healthy controls further supported their potential as predictors for PD classification.
Moreover, the potential for developing a streamlined classification model based on selected microbial genera is a promising advancement. In our current research, a random forest prediction model for PD based on 11 genera was developed. Some genera were identified as critical taxa markers for PD diagnosis, although no marked differences were observed in our research. Blautia and Butyricicoccus are two butyrate-producing genera related to inflammatory diseases that were found to be decreased in other microbial analyses of PD patients. Their downregulation might be involved in PD pathogenesis due to the reduced production of SCFAs7,27,28. Associated with chronic intestinal inflammation, the Alistipes genus was also found to be elevated in PD patients7,29. However, the underlying mechanisms by which these genera contribute to PD development still need further investigation. Moreover, there have been few reports on alterations of some other predictors included in the model, such as Erysipelatoclostridium, Family XIII AD3011 group, Lachnoclostridium, and Ruminococcaceae UCG-002. Future studies exploring the impact of these critical genera on PD pathogenesis are warranted.
Furthermore, functional analyses were conducted to predict potential pathways affected by the altered gut microbiota. The enrichment of KEGG pathways related to neurodegeneration, particularly those specific to Parkinson’s disease, highlighted the functional consequences of gut microbial dysbiosis. Additionally, the identification of enriched pathways associated with the metabolism of lipoic acid, glycerophospholipids, and alpha-linolenic acid underscored the potential mechanisms of these key bacterial taxa in PD pathogenesis. As reported in other studies, lipoic acid is a natural metabolite with antioxidant and iron chelator properties, demonstrating neuroprotective effects in several PD mouse models30,31,32. In addition, a growing number of lipidomic analyses have revealed the prominent role of glycerophospholipids as one main class of bioactive lipids in PD development33,34. Furthermore, alpha-linolenic acid has been found to exhibit neuroprotection by inhibiting neuroinflammation and oxidative stress in various models35. These findings together suggested the possible mechanisms underlying the impact of the key microbial taxa on PD, further supporting the reliability of our classification model.
This meta-analysis reinforces the importance of understanding the gut microbial signatures associated with PD. Furthermore, the optimization process, which identified a set of 11 key genera for accurate classification, reflected a practical approach for diagnostic application. Our model could serve as a valuable tool for the noninvasive diagnosis of PD, potentially allowing for more timely interventions and personalized treatment strategies. To translate our microbial prediction model into a clinical setting, we propose the use of qPCR or 16S rRNA sequencing to detect the abundances of key genera in fecal samples. This approach could offer a cost-effective and timely diagnostic method for predicting PD, leveraging the microbial taxa identified by our analysis. Further research and validation in larger, more diverse cohorts will be crucial to refining this method and assessing its practical utility in clinical diagnostics.
Although this meta-analysis provided valuable insights, several limitations should be acknowledged. Heterogeneity in study methodologies, including sequencing techniques and data processing pipelines, might have influenced the observed microbial alterations. Additionally, factors such as geography, diet, and medication use can introduce variability, which should be considered in future analyses. Moreover, it is essential to recognize that further validation and standardization are needed before clinical implementation. Prospective studies with larger cohorts and longitudinal designs will be crucial in confirming the diagnostic utility of these microbial signatures.
In conclusion, this comprehensive meta-analysis advances our understanding of the intricate relationship between the gut microbiota and PD. The identified microbial alterations, potential biomarkers, and functional implications offer a promising foundation for future research and clinical applications. By elucidating the complex interplay between the gut microbiota and PD pathogenesis, this study opens avenues for targeted interventions and personalized approaches in the diagnosis and management of PD.
Methods
Data collection
We searched PubMed for studies on “PD” and “gut microbiota” published up to July 31, 2023. Only studies with available raw 16S rRNA gene sequencing data (FASTQ) and patient information were included in the final analysis. After a thorough search and eligibility assessment, a total of 6 raw 16S rRNA gene sequencing datasets from patients with PD and HCs from 5 studies were included. Raw sequencing data were downloaded from the ENA with SRA tools using the following accession numbers: PRJEB55464 for Boertien et al.36, PRJNA391524 for Qian et al.37, DRA009229 for Nishiwaki et al.38, PRJNA381395 for Heintz-Buschart et al.39, and PRJEB27564 for Aho et al.40.
16S rRNA data analysis
The 16S rRNA raw sequencing data were processed by Magigene Cooperation (Guangdong, China). Briefly, paired-end reads of 200 bp were overlapped at their 3’ ends and merged into longer tags using Fastp (V0.14.1)41 with the following parameters: window size = 4, quality = 20, read length = 200 bp, -q 15, -u 40. Then, Operational Taxonomic Units (OTUs) were clustered with Usearch (V10.0.240)42 using the UPARSE algorithm. Taxonomy was subsequently annotated at levels from phylum to genus based on the Silva V132 reference database43. To normalize the 16S rRNA gene sequencing data, the proportions of the microbial taxa in each sample were calculated to obtain relative abundances.
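The normalization step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study; the function name and the read counts are hypothetical and serve only to show how raw counts become proportions.

```python
def relative_abundance(counts):
    """Convert one sample's taxon read counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical read counts for a single sample (illustrative values only).
sample_counts = {"Faecalibacterium": 300, "Roseburia": 100, "Akkermansia": 100}
profile = relative_abundance(sample_counts)
```

Because every sample is rescaled to sum to 1, samples with different sequencing depths become comparable, at the cost of making the data compositional.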
Alpha diversity indices were calculated based on the relative abundances of known OTUs. Beta diversity was analyzed by calculating Bray‒Curtis dissimilarities, followed by PERMANOVA. Afterwards, group differences in PC1 and PC2 were compared by Wilcoxon rank-sum tests and study differences by the Kruskal‒Wallis test; the results were displayed by PCoA using the vegan R package (V2.6.4). Linear discriminant analysis (LDA) was conducted to compare the differences in microbial community compositions among groups using LEfSe Tools (V1.0)44. Finally, functional prediction was performed using Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) to identify enrichment of KEGG pathways45. The Reporter Score method was then applied for further microbial function enrichment analysis using the ReporterScore R package (V1.0.0; https://github.com/Asa12138/ReporterScore)46.
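For readers unfamiliar with these measures, the following Python sketch shows how one alpha diversity index and the Bray‒Curtis dissimilarity are computed from relative abundances. The study itself used the vegan R package; the choice of the Shannon index here is an assumption made for illustration, and the two abundance vectors are hypothetical.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p), skipping taxa with zero abundance."""
    total = sum(abundances)
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    if len(x) != len(y):
        raise ValueError("samples must cover the same taxa in the same order")
    denominator = sum(a + b for a, b in zip(x, y))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator


# Hypothetical relative abundances of three taxa in two samples.
sample_a = [0.5, 0.5, 0.0]
sample_b = [0.25, 0.25, 0.5]

h_a = shannon_index(sample_a)
dissimilarity = bray_curtis(sample_a, sample_b)
```

A dissimilarity of 0 means identical profiles and 1 means no shared taxa; the matrix of pairwise values is the input to both PCoA and PERMANOVA.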
Feature selection and model construction
First, the NetMoss R package (Version 2; https://github.com/xiaolw95/NetMoss2)16 was used to evaluate the importance of bacterial genera: it measures the driving force behind the transition of the network structure and calculates a corresponding NetMoss score for each genus. The relative abundance matrices of case and control samples from the different datasets were supplied separately as input to the NetMoss function. The NetMoss method was also used to remove batch effects across studies. After identifying potential biomarkers using a NetMoss score above 0.6 as the threshold, we built random forest models with the randomForest R package (V4.7-1.1) and 10-fold cross-validation to distinguish PD from healthy controls. Briefly, one tenth of the samples from the PD and HC groups, respectively, were randomly selected as the training set for constructing the model, while the remaining samples were used for validation. The 10-fold cross-validation was performed using the caret R package (V6.0-94). Then, recursive feature elimination (RFE) was used to optimize the number of features required by the model. Subsequently, the MeanDecreaseGini scores of the potential biomarkers were calculated with the randomForest package. During model construction, the AUC and accuracy were used to evaluate the performance of the optimized models.
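Two ingredients of this evaluation, class-balanced fold assignment and the AUC, can be sketched in plain Python. This is an illustrative sketch only: the study used the caret and randomForest R packages, the function names below are hypothetical, and no classifier is trained here.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign every sample to one of k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [None] * len(labels)
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, i in enumerate(indices):
            folds[i] = position % k
    return folds


def auc(scores, labels, positive="PD"):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes are required to compute the AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cohort of 20 PD and 20 HC samples.
labels = ["PD"] * 20 + ["HC"] * 20
folds = stratified_folds(labels, k=10)

# Hypothetical predicted PD probabilities for four samples.
example_auc = auc([0.9, 0.2, 0.8, 0.1], ["PD", "PD", "HC", "HC"])
```

Stratifying by class keeps the PD-to-HC ratio similar in every fold, so that per-fold AUC and accuracy estimates are not distorted by unbalanced splits.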
Fecal DNA extraction and metagenomic analysis
In this study, fecal samples were collected from 6 PD patients and 6 matched healthy controls (spouses of participating PD patients or age-matched healthy volunteers from the same geographical area). These patients were diagnosed with PD at Peking University Sixth Hospital. The fecal samples were collected at the hospital and stored at −80 °C within half an hour of collection. Fecal DNA extraction and metagenomic analysis were then performed by Beijing Genomics Institute. Initially, a standard multi-step procedure was employed to extract and process genomic DNA from fecal samples with the MagPure Stool DNA KF Kit B following the manufacturer’s instructions. Library preparation and quality control were then conducted using the MGIEasy Universal DNA Library Prep Set (BGI-Shenzhen, China). Subsequently, metagenomic sequencing was performed. DNA nanoballs (DNBs) containing multiple copies of DNA were generated, and high-quality DNBs were loaded into patterned nanoarrays using a high-intensity DNA nanochip technique. Paired-end reads of 150 bases (PE150) were generated on the DNBSEQ-2000 platform (BGI-Shenzhen, China). The study was approved by Peking University Third Hospital Medical Science Research Ethics Committee (No. S2024418) and individual consent was waived.
Raw data underwent preprocessing using SOAPnuke (v.2.2.1)47, followed by mapping to the host genome using SOAP2 for the removal of host-originated reads48. High-quality reads were subjected to de novo assembly utilizing MEGAHIT49, with contigs shorter than 300 bp discarded from subsequent analyses. Gene prediction over contigs was achieved using MetaGeneMark, and redundant genes were eliminated via CD-HIT50. To construct a normalized gene abundance matrix, Salmon software was employed for quantification51. Annotation information was generated by aligning protein sequences against functional databases using DIAMOND52 with an E value cutoff of 1e−5. Taxonomic annotation was assigned using the Kraken LCA algorithm53. Taxonomic and functional abundance profiles were created using Bracken software (https://github.com/jenniferlu717/Bracken) with the default setting.
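The contig length filter mentioned above is simple enough to sketch in Python. This is an assumed, minimal illustration rather than the script used in this study; the helper names and the two toy contigs are hypothetical.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(records, min_length=300):
    """Keep contigs of at least min_length bp; shorter ones are discarded."""
    return [(h, s) for h, s in records if len(s) >= min_length]


# Two hypothetical contigs: one of 350 bp and one of 200 bp.
fasta_lines = [">contig_1", "A" * 350, ">contig_2", "ACGT" * 50]
kept = filter_contigs(parse_fasta(fasta_lines), min_length=300)
```

Dropping very short contigs before gene prediction reduces the number of fragmentary gene calls passed to the downstream clustering and quantification steps.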
Statistical analysis
The comparisons of alpha diversity, beta diversity, and microbial community differences between the PD and HC groups were performed using the Wilcoxon rank-sum test, whereas the differences in beta diversity among various studies were compared using the Kruskal‒Wallis test in R (V4.3.0). In addition, plots including PCoA, taxonomy community, differential analysis, heatmap and boxplots were generated by the ggplot2 R package (V3.4.3). In this study, P < 0.05 was considered statistically significant, and the False Discovery Rate (FDR) method was used for p-value correction during multiple comparisons.
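The FDR correction can be illustrated with the Benjamini‒Hochberg procedure, which is the most common FDR method; the text above does not name the specific procedure, so its use here is an assumption. The p-values in the sketch are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four per-genus comparisons.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adj_p]
```

In this toy example only the first comparison remains below 0.05 after adjustment, which shows why raw and FDR-adjusted significance should be reported separately.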
Data availability
Raw 16S rRNA gene sequencing data are available at the ENA under the following accession numbers: PRJEB55464, PRJNA391524, DRA009229, PRJNA381395, and PRJEB27564. The datasets generated during the current study are available from the corresponding author on reasonable request.
References
Leite Silva, A. B. R. et al. Premotor, nonmotor and motor symptoms of Parkinson’s Disease: A new clinical state of the art. Ageing Res. Rev. 84, 101834 (2023).
Tansey, M. G. et al. Inflammation and immune dysfunction in Parkinson disease. Nat. Rev. Immunol. 22, 657–673 (2022).
Tan, A. H., Lim, S. Y. & Lang, A. E. The microbiome-gut-brain axis in Parkinson disease - from basic research to the clinic. Nat. Rev. Neurol. 18, 476–495 (2022).
Kleine Bardenhorst, S. et al. Gut microbiota dysbiosis in Parkinson disease: A systematic review and pooled analysis. Eur. J. Neurol. 30, 3581–3594 (2023).
Claudino Dos Santos, J. C., Lima, M. P. P., Brito, G. A. C. & Viana, G. S. B. Role of enteric glia and microbiota-gut-brain axis in parkinson disease pathogenesis. Ageing Res. Rev. 84, 101812 (2023).
Dalile, B., Van Oudenhove, L., Vervliet, B. & Verbeke, K. The role of short-chain fatty acids in microbiota-gut-brain communication. Nat. Rev. Gastroenterol. Hepatol. 16, 461–478 (2019).
Li, Z. et al. Gut bacterial profiles in Parkinson’s disease: A systematic review. CNS Neurosci. Ther. 29, 140–157 (2023).
Shandilya, S., Kumar, S., Kumar Jha, N., Kumar Kesari, K. & Ruokolainen, J. Interplay of gut microbiota and oxidative stress: Perspective on neurodegeneration and neuroprotection. J. Adv. Res. 38, 223–244 (2022).
Nie, S., Wang, J., Deng, Y., Ye, Z. & Ge, Y. Inflammatory microbes and genes as potential biomarkers of Parkinson’s disease. NPJ Biofilms Microbiomes 8, 101 (2022).
Toh, T. S. et al. Gut microbiome in Parkinson’s disease: New insights from meta-analysis. Parkinsonism Relat. Disord. 94, 1–9 (2022).
Romano, S. et al. Meta-analysis of the Parkinson’s disease gut microbiome suggests alterations linked to intestinal inflammation. NPJ Parkinsons Dis. 7, 27 (2021).
Pietrucci, D. et al. Can Gut Microbiota Be a Good Predictor for Parkinson’s Disease? A Machine Learning Approach. Brain Sci. 10, 242 (2020).
Nowak, J. M., Kopczyński, M., Friedman, A., Koziorowski, D. & Figura, M. Microbiota Dysbiosis in Parkinson Disease-In Search of a Biomarker. Biomedicines 10, 2057 (2022).
Gibbons, S. M., Duvallet, C. & Alm, E. J. Correcting for batch effects in case-control microbiome studies. PLoS Comput. Biol. 14, e1006102 (2018).
Dai, Z., Wong, S. H., Yu, J. & Wei, Y. Batch effects correction for microbiome data with Dirichlet-multinomial regression. Bioinformatics 35, 807–814 (2019).
Xiao, L., Zhang, F. & Zhao, F. Large-scale microbiome data integration enables robust biomarker identification. Nat. Comput. Sci. 2, 307–316 (2022).
Vandeputte, D. et al. Prebiotic inulin-type fructans induce specific changes in the human gut microbiota. Gut 66, 1968–1974 (2017).
Patel, V. C. et al. Rifaximin-α reduces gut-derived inflammation and mucin degradation in cirrhosis and encephalopathy: RIFSYS randomised controlled trial. J. Hepatol. 76, 332–342 (2022).
Cani, P. D., Depommier, C., Derrien, M., Everard, A. & de Vos, W. M. Akkermansia muciniphila: paradigm for next-generation beneficial microorganisms. Nat. Rev. Gastroenterol. Hepatol. 19, 625–637 (2022).
Machiels, K. et al. A decrease of the butyrate-producing species Roseburia hominis and Faecalibacterium prausnitzii defines dysbiosis in patients with ulcerative colitis. Gut 63, 1275–1283 (2014).
Shen, Z. et al. Roseburia intestinalis stimulates TLR5-dependent intestinal immunity against Crohn’s disease. EBioMedicine 85, 104285 (2022).
Lopez-Siles, M., Duncan, S. H., Garcia-Gil, L. J. & Martinez-Medina, M. Faecalibacterium prausnitzii: from microbiology to diagnostics and prognostics. ISME J. 11, 841–852 (2017).
Tian, H. et al. Depletion of butyrate-producing microbes of the Firmicutes predicts nonresponse to FMT therapy in patients with recurrent Clostridium difficile infection. Gut Microbes 15, 2236362 (2023).
Xie, A. et al. Bacterial Butyrate in Parkinson’s Disease Is Linked to Epigenetic Changes and Depressive Symptoms. Mov. Disord. Off. J. Mov. Disord. Soc. 37, 1644–1653 (2022).
Cirstea, M. S. et al. Microbiota Composition and Metabolism Are Associated With Gut Function in Parkinson’s Disease. Mov. Disord. Off. J. Mov. Disord. Soc. 35, 1208–1217 (2020).
Chen, S. J. et al. Association of Fecal and Plasma Levels of Short-Chain Fatty Acids With Gut Microbiota and Clinical Severity in Patients With Parkinson Disease. Neurology 98, e848–e858 (2022).
Trachsel, J., Humphrey, S. & Allen, H. K. Butyricicoccus porcorum sp. nov., a butyrate-producing bacterium from swine intestinal tract. Int. J. Syst. Evolut. Microbiol. 68, 1737–1742 (2018).
Liu, X. et al. Blautia-a new functional genus with potential probiotic properties? Gut microbes 13, 1–21 (2021).
Parker, B. J., Wearsch, P. A., Veloo, A. C. M. & Rodriguez-Palacios, A. The Genus Alistipes: Gut Bacteria With Emerging Implications to Inflammation, Cancer, and Mental Health. Front. Immunol. 11, 906 (2020).
Tai, S. et al. Alpha-Lipoic Acid Mediates Clearance of Iron Accumulation by Regulating Iron Metabolism in a Parkinson’s Disease Model Induced by 6-OHDA. Front. Neurosci. 14, 612 (2020).
Zhang, J. et al. Alpha-lipoic acid improved motor function in MPTP-induced Parkinsonian mice by reducing neuroinflammation in the nigral and spinal cord. Neurosci. Lett. 781, 136669 (2022).
Liu, L., Yang, S. & Wang, H. α-Lipoic acid alleviates ferroptosis in the MPP(+) -induced PC12 cells via activating the PI3K/Akt/Nrf2 pathway. Cell Biol. Int. 45, 422–431 (2021).
Chiurchiù, V. et al. Lipidomics of Bioactive Lipids in Alzheimer’s and Parkinson’s Diseases: Where Are We? Int. J. Mol. Sci. 23, 6235 (2022).
Galper, J. et al. Lipid pathway dysfunction is prevalent in patients with Parkinson’s disease. Brain J. Neurol. 145, 3472–3487 (2022).
Yuan, Q. et al. The review of alpha-linolenic acid: Sources, metabolism, and pharmacology. Phytother. Res. PTR 36, 164–188 (2022).
Boertien, J. M. et al. Fecal microbiome alterations in treatment-naive de novo Parkinson’s disease. npj Parkinsons Dis. 8, 129 (2022).
Qian, Y. et al. Alteration of the fecal microbiota in Chinese patients with Parkinson’s disease. Brain Behav. Immun. 70, 194–202 (2018).
Nishiwaki, H. et al. Meta-Analysis of Gut Dysbiosis in Parkinson’s Disease. Mov. Disord. Off. J. Mov. Disord. Soc. 35, 1626–1635 (2020).
Heintz‐Buschart, A. et al. The nasal and gut microbiome in Parkinson’s disease and idiopathic rapid eye movement sleep behavior disorder. Mov. Disord. 33, 88–98 (2017).
Aho, V. T. E. et al. Gut microbiota in Parkinson’s disease: Temporal stability and relations to disease progression. EBioMedicine 44, 691–707 (2019).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Edgar, R. C. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat. Methods 10, 996–998 (2013).
Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–596 (2013).
Segata, N. et al. Metagenomic biomarker discovery and explanation. Genome Biol. 12, R60 (2011).
Langille, M. G. et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821 (2013).
Patil, K. R. & Nielsen, J. Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc. Natl Acad. Sci. USA 102, 2685–2689 (2005).
Chen, Y. et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. GigaScience 7, 1–6 (2018).
Li, R. et al. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25, 1966–1967 (2009).
Li, D., Liu, C. M., Luo, R., Sadakane, K. & Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 257 (2019).
Acknowledgements
This study was supported by the National Natural Science Foundation of China (82071552, 22376006), the Chinese Academy of Sciences Grant (JCTD-2021-06), Peking University Clinical Scientist Program (BMU2023PYJH018), Clinical Medicine Plus X—Young Scholars Project of Peking University (PKU2023LCXQ015), and Key Clinical Projects of Peking University Third Hospital (BYSY2022053).
Author information
Contributions
J.Y., R.Z., Z.Z. were primarily responsible for the study design. Z.Z. wrote the manuscript. J.C., D.Z., and B.C. contributed to the collection of datasets. Y.L., J.C., C.B., and X.G. contributed to data analysis. Q.W., J.C., N.H., and B.Z. contributed to patient management and sample collection. Z.Z., D.Z., B.C., J.C., X.G., and Q.W. provided expertise, interpreted results, and commented on the manuscript. All authors contributed to editing of the manuscript. All the authors have read and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhao, Z., Chen, J., Zhao, D. et al. Microbial biomarker discovery in Parkinson’s disease through a network-based approach. npj Parkinsons Dis. 10, 203 (2024). https://doi.org/10.1038/s41531-024-00802-2
DOI: https://doi.org/10.1038/s41531-024-00802-2