Introduction

Alzheimer’s disease (AD) is a neurological disorder associated with aging that affects memory, thinking, and behavior, accounting for 60–70% of dementia cases in older adults worldwide. An estimated 6.7 million elderly Americans are currently living with AD, resulting in over 121,499 deaths and unpaid dementia caregiving valued at $339.5 billion in 20221. Furthermore, the incidence of this disease is on an upward trajectory, primarily due to the increasing average human lifespan. While symptoms vary greatly, one of the first problems individuals notice is severe forgetfulness, which significantly impairs their ability to function and care for themselves. This reduces the quality of life of both patients and their caregivers and imposes economic and time burdens on families and society2. Given this substantial impact, extensive efforts have been devoted to understanding the complex pathogenesis of AD. Despite this ongoing research, our current understanding remains limited.

The major neuropathological hallmarks of AD are amyloid beta (Aβ) plaques and neurofibrillary tangles. The amyloid-β pathway has been placed at the core of AD pathophysiology for decades. The accumulation of Aβ initially appears in brain regions with neurons exhibiting high metabolic activity, such as the association cortices in AD patients3. It then spreads from the neocortex to the allocortex, brainstem, and cerebellum. Aβ biogenesis, known as the amyloidogenic pathway, involves the cleavage of amyloid precursor protein (APP), a type I transmembrane protein synthesized in the endoplasmic reticulum and distributed throughout the central nervous system4,5. APP is transported to the cell membrane, where it plays roles in synaptic formation, vesicle exocytosis, long-term potentiation, and neuron plasticity6,7,8,9. In the amyloidogenic pathway, β-secretase first cleaves APP, producing soluble APPβ and a C-terminal fragment (CTFβ). This fragment is then cleaved by γ-secretase, resulting in the APP intracellular domain and Aβ10. Aβ oligomers, not monomers, inhibit synaptic function, affecting long-term potentiation and depression in mice, indicating their role in AD pathogenesis6. While many proteins interacting with APP have been identified, little is understood about synaptic proteins that interact with the ectodomain of APP and potentially regulate Aβ generation11. Tau is a microtubule-associated protein (MAP) primarily found in axons, where it regulates microtubule stability12. Tau maintains microtubule stability through two mechanisms: the existence of different isoforms and phosphorylation13. Tau mutations or hyperphosphorylation can impair its ability to bind to microtubules, leading to disruptions in axonal transport and synaptic metabolism12,14. In AD, the hyperphosphorylation of Tau results in increased microtubule depolymerization15. Consequently, Tau monomers detach from microtubules and aggregate into oligomers, which eventually form neurofibrillary tangles16. These Tau oligomers are released outside the cell membrane, triggering microglial activation, neuroinflammation, and subsequent neuronal damage. The interplay between Aβ and tau proteins is pivotal in the pathogenesis of AD. Whole-brain dynamic modeling has revealed that Aβ has a dominant effect over tau in the early stages of AD, while tau’s influence becomes more pronounced in the later stages17. This suggests that Aβ and tau have crucial roles in neuronal dynamics, affecting the excitation-inhibition balance of brain nodes. Indeed, Aβ accelerates tau phosphorylation and interferes with tau oligomerization18. Other research indicates that Aβ accumulation can facilitate the spread of tau into critical brain regions, such as the posterior cingulate cortex, through major white matter tracts19.

In recent times, the field of bioinformatics has emerged as a valuable tool for exploring novel pathways implicated in the pathogenesis of various diseases, including AD. Gene expression profiling, for example, has revealed higher expression of genes related to metallothionein and mitochondrial processes in individuals resilient to AD, offering insights into potential protective mechanisms against the disease20. Additionally, long-read RNA sequencing has identified key gene expressions and RNA isoforms that are altered in AD, which could serve as novel biomarkers or therapeutic targets21. The identification of differentially expressed genes (DEGs) has been instrumental in uncovering potential diagnostic biomarkers. For instance, studies have found genes like INPP5D and HLA-DQA1 to be differentially expressed in both brain and blood samples of AD patients, suggesting their involvement in AD pathology22. Single-cell sequencing techniques have provided unprecedented resolution in AD research, allowing the identification of cell-type-specific gene expression changes and the characterization of cellular heterogeneity within the AD brain23. Moreover, recent advancements have seen a surge in comprehensive brain profiling studies, leveraging the power of bioinformatics to compare the entire brain transcriptomes of humans and mice. Projects like the Allen Brain Atlas have pioneered this effort, providing an open-access platform that maps gene expression across the brain, aiding in the understanding of neurological function and disease24. Such endeavors have been instrumental in developing diagnostic models based on transcriptomic data, offering a beacon of hope for early detection of Alzheimer’s, and exemplifying the transformative impact of bioinformatics in medical science.

In this study, we integrated transcriptome data from a wide range of AD data sources, including blood and brain tissue samples, to give a thorough view of the molecular landscape. We also incorporated single-cell datasets, enabling a more detailed and reliable assessment of gene expression patterns. Pathway analysis revealed an association between the candidate genes and downregulation of the synaptic signaling pathway in AD. Furthermore, the identified hub genes were validated in an AD animal model. Our findings highlight RPH3A, a vesicle-associated presynaptic protein, as a potential biomarker for the development of AD diagnostics and therapeutics. Figure 1 summarizes the data acquisition and simplified workflow of this study.

Figure 1

Data acquisition and overall workflow of the current study. (A) All public datasets were obtained from the Gene Expression Omnibus database. The microarray analysis utilized blood samples (GSE97760) and brain tissue from the hippocampus, superior frontal gyrus (GSE48350), and entorhinal cortex (GSE5281). Single-cell expression was examined using RNA sequencing for the superior frontal gyrus and entorhinal cortex (GSE147528), along with the hippocampus (GSE175814). (B) Schematic diagram summarizing the methodology outline of the study.

Results

Transcriptional profiling of AD

We collected bulk transcriptomic data from the Gene Expression Omnibus (GEO) for blood samples and for tissue samples from the entorhinal cortex, hippocampus, and superior frontal gyrus of AD patients and healthy individuals, and conducted differential gene expression (DGE) analysis between AD and control samples. In the blood dataset (GSE97760), we identified a total of 7029 differentially expressed genes (DEGs) that met the pre-defined threshold of |log2FC| > 0.5 and p-adj < 0.05. These 7029 DEGs were categorized into two subgroups: Up-regulated AD (Up AD, consisting of 3812 genes) and Down-regulated AD (Down AD, consisting of 3217 genes) (Fig. 2A). Applying the same criteria to the tissue samples, we found 832 genes upregulated and 1796 downregulated in AD in the superior frontal gyrus (SFG) (GSE48350), 257 genes upregulated and 759 downregulated in the hippocampus (GSE48350), and 945 genes upregulated and 1831 downregulated in the entorhinal cortex (GSE5281) (Fig. 2B–D). Heatmaps provided a visual representation of the expression of 20 significantly variable genes in each dataset (Fig. 2E–H).
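The thresholding step described above can be illustrated with a minimal sketch. It assumes that a table of per-gene log2 fold changes and adjusted p-values is already available; the gene names and values below are invented for illustration, and the function name `split_degs` is hypothetical rather than part of the pipeline used in this study.

```python
# Hypothetical per-gene results: (gene, log2 fold change, adjusted p-value).
# All values are invented for illustration only.
results = [
    ("GENE_A", 1.20, 0.001),
    ("GENE_B", -0.80, 0.020),
    ("GENE_C", 0.30, 0.010),   # fails the fold-change cutoff
    ("GENE_D", -1.50, 0.200),  # fails the adjusted p-value cutoff
]

LFC_CUTOFF = 0.5    # |log2FC| > 0.5, as stated in the text
PADJ_CUTOFF = 0.05  # p-adj < 0.05, as stated in the text


def split_degs(rows, lfc_cutoff=LFC_CUTOFF, padj_cutoff=PADJ_CUTOFF):
    """Return (up, down) gene lists that pass both thresholds."""
    up, down = [], []
    for gene, log2fc, padj in rows:
        if abs(log2fc) > lfc_cutoff and padj < padj_cutoff:
            # Positive fold change means higher expression in AD.
            (up if log2fc > 0 else down).append(gene)
    return up, down


up_ad, down_ad = split_degs(results)
print("Up AD:", up_ad)
print("Down AD:", down_ad)
```

The same function could be applied unchanged to each tissue dataset, which mirrors the statement that identical criteria were used in parallel.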

Figure 2

Visualization of Differentially Expressed Genes and Corresponding Heatmaps among different AD datasets. (A–D) Volcano plots highlight the DEGs between AD and control groups in blood (GSE97760), superior frontal gyrus, hippocampus (GSE48350), and entorhinal cortex (GSE5281), respectively. (E–H) Corresponding heatmaps elucidate the expression profiles of the top 40 most variable DEGs among datasets.

Next, we conducted Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis with the clusterProfiler package, using the DEGs from the two subgroups as input and treating pathways with p < 0.05 as significant. As a result, “signaling pathways regulating pluripotency of stem cells” was found to be the most dominant pathway in the AD group within the blood dataset (Fig. 3A). Indeed, 19 out of 24 genes linked to this signaling pathway were significantly upregulated in AD samples compared to control samples (Supplementary Fig. S1). A gene set enrichment analysis (GSEA) of the blood dataset was also conducted, highlighting the top four upregulated signatures and four downregulated signatures (Fig. 3B). The intersection of DEGs between the three tissue datasets (Fig. 3C) revealed that “Synaptic vesicle cycle” was the most significant pathway (Fig. 3D). Additionally, the heatmap (Fig. 3E) shows the common upregulated and downregulated pathways shared among the entorhinal cortex, hippocampus, and superior frontal gyrus, with the normalized enrichment score (NES) obtained through GSEA. Following this, a comprehensive Gene Ontology (GO) analysis was performed, encompassing Biological Process (BP), Cellular Component, and Molecular Function. The analysis covered both the upregulated and downregulated gene lists in the blood AD dataset (Supplementary Fig. S1A). Additionally, GO analysis was extended to all overlapping DEGs derived from the three AD brain tissue datasets (Supplementary Fig. S2B).
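As background for the enrichment step, over-representation tools of this kind are commonly based on a hypergeometric test. The sketch below shows that test in isolation; whether the analysis in this study was configured exactly this way is an assumption, the function name `hypergeom_pvalue` is hypothetical, and the counts in the example are invented.

```python
from math import comb


def hypergeom_pvalue(universe, pathway_size, n_degs, overlap):
    """Upper-tail probability P(X >= overlap).

    universe     : number of background genes (N)
    pathway_size : number of background genes annotated to the pathway (K)
    n_degs       : number of DEGs drawn from the background (n)
    overlap      : number of DEGs that fall in the pathway (k)
    """
    upper = min(pathway_size, n_degs)
    total = comb(universe, n_degs)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(pathway_size, i) * comb(universe - pathway_size, n_degs - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Invented toy numbers: 10 background genes, 5 in the pathway,
# 5 DEGs, all 5 of which fall in the pathway.
p_toy = hypergeom_pvalue(10, 5, 5, 5)
print(p_toy)
```

A small p-value indicates that the pathway contains more DEGs than expected by chance given the background; in practice the p-values are usually corrected for multiple testing across pathways.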

Figure 3

Functional enrichment characteristics of distinct AD datasets. (A) Dot plot displays the KEGG pathway analysis results in the DEGs of the blood dataset. (B) A gene set enrichment analysis of the blood dataset is presented. The top four upregulated signatures and four downregulated signatures are shown. (C) The Venn diagram illustrates the overlap between DEGs of three brain tissue datasets. (D) KEGG results of the intersection obtained from (C). (E) The heatmap shows the common upregulated pathways and downregulated pathways shared between the entorhinal cortex, hippocampus, and superior frontal gyrus. Numbers within the boxes represent the normalized enrichment score (NES) obtained using GSEA. Upregulations are color-coded in red, whereas downregulations are marked in blue. Permission has been obtained from Kanehisa Laboratories for using the KEGG pathway database25.

Identification of AD-predictive significance genes across multiple datasets

Along with DEG analysis, we also computed the area under the curve (AUC) score for each gene in identifying patients as AD or normal controls using the DEGs found in our previous analysis. The input includes 42 common genes obtained by overlapping the list of DEGs from four microarray datasets analyzed previously. These genes were used as features for the regression model. The input dataset was divided into subsets where DEG symbols were represented by columns and sample IDs by rows. DEG profiles were used as input variables, with AD status as the target variable. The Receiver Operating Characteristic (ROC) curve and AUC score for each gene were calculated, and only genes with an AUC score greater than 0.75 were selected for further analysis (Fig. 4A).
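To make the per-gene AUC filter concrete, the following sketch computes a rank-based AUC (equivalent to the Mann–Whitney statistic) directly from expression values. This is a swapped-in simplification and not the regression model described above. Treating the AUC as direction-agnostic, so that downregulated genes can also score highly, is an assumption of this sketch, and all sample values and gene names are invented.

```python
def auc_score(values, labels):
    """Probability that a random AD sample (label 1) has a higher value
    than a random control (label 0); ties count as 0.5."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def directional_auc(values, labels):
    """Assumption: a gene is informative whether it is higher or lower in AD."""
    auc = auc_score(values, labels)
    return max(auc, 1.0 - auc)


AUC_CUTOFF = 0.75  # threshold stated in the text

# Invented toy data: three AD samples (1) followed by three controls (0).
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "GENE_UP": [5, 6, 7, 1, 2, 3],
    "GENE_DOWN": [1, 2, 3, 5, 6, 7],
    "GENE_FLAT": [1, 5, 3, 2, 6, 4],
}

selected = [
    gene for gene, values in expression.items()
    if directional_auc(values, labels) > AUC_CUTOFF
]
print(selected)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of this size and keeps the definition of the AUC explicit.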

Figure 4

Identification of AD-predictive significance genes across multiple datasets. (A) Workflow summarizing the identification of AD-predictive significance genes across multiple datasets. A total of 19 genes met the criteria and were subdivided into Up-AD common genes and Down-AD common genes. (B) Comprehensive AUC scores for each of the 19 genes in different AD datasets, including blood, hippocampus, superior frontal gyrus, and entorhinal cortex.

Our next objective was to identify genes that displayed consistent expression patterns between disease samples and control samples through differential gene expression analysis. We established criteria whereby genes that were upregulated in AD in blood samples were expected to exhibit upregulation in AD in the three brain region tissue samples, and vice versa. Following our analysis, we identified 19 genes out of the initial set of 42 genes that met our criteria. Specifically, two genes, NOTCH2NL and OSBPL11, were found to be consistently upregulated in all four microarray datasets. Conversely, 17 genes (ADD2, AP2M1, ATP6V0C, ATP6V1G2, C12orf10, CACNG2, FAM174B, GLT1D1, INA, KCNQ2, KLHL35, MCAT, RPH3A, SEZ6L2, SV2A, SYT3 and TMEM59L) exhibited consistent downregulation in these datasets. These 19 genes were retained for further analysis in subsequent steps. The AUC values of each of the 19 genes are shown in Fig. 4B.
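The consistency criterion described above amounts to requiring the same direction of change in every dataset. A minimal sketch, assuming that each dataset has already been reduced to sets of upregulated and downregulated genes; the dataset keys, the placeholder gene names, and the function name `consistent_genes` are hypothetical.

```python
def consistent_genes(candidates, directions):
    """Return (up_in_all, down_in_all) for genes whose direction of change
    is the same in every dataset."""
    up_all, down_all = [], []
    for gene in candidates:
        if all(gene in d["up"] for d in directions.values()):
            up_all.append(gene)
        elif all(gene in d["down"] for d in directions.values()):
            down_all.append(gene)
    return up_all, down_all


# Invented toy example with four datasets and placeholder gene names.
directions = {
    "blood": {"up": {"G1", "G3"}, "down": {"G2", "G4"}},
    "sfg": {"up": {"G1"}, "down": {"G2", "G3", "G4"}},
    "hippocampus": {"up": {"G1"}, "down": {"G2", "G3"}},
    "entorhinal": {"up": {"G1", "G4"}, "down": {"G2", "G3"}},
}

up_common, down_common = consistent_genes(["G1", "G2", "G3", "G4"], directions)
print("Consistently up:", up_common)
print("Consistently down:", down_common)
```

Genes that change direction between blood and brain tissue, or that are missing from one dataset, are dropped by construction.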

Single-cell transcriptome analysis of AD in different human brain regions

To strengthen our findings and increase the reliability of the results obtained from microarray data, we conducted additional analysis using single-cell data from corresponding brain regions. Specifically, we analyzed single-cell datasets from the superior frontal gyrus (GSE147528), entorhinal cortex (GSE147528), and hippocampus (GSE175814), aiming to maintain consistency with the tissue types investigated in the microarray data. The preprocessing steps for the single-cell data followed the methodologies described in the original studies, including filtering and quality control, data normalization, dimensionality reduction, clustering, and uniform manifold approximation and projection (UMAP) visualization (Fig. 5A–C). All three single-cell datasets exhibited similar cell type annotations, with cells clustered into oligodendrocyte, oligodendrocyte progenitor cell (OPC), microglia, astrocyte, endothelial cell, excitatory neuron and inhibitory neuron. The UMAP visualization of all 19 candidate genes across various cell types in different brain regions was presented in Supplementary Fig. S3. A dot plot illustrates marker gene expressions for different cell types, using canonical markers from previous publications (Fig. 5D–F). Corresponding UMAPs of the brain regions illustrate the cellular distribution across distinct conditions, including Braak 0, Braak 2, and Braak 6 stages (from superior frontal gyrus and entorhinal cortex datasets), along with AD and control states (from hippocampus dataset) (Fig. 5G-I).

Figure 5

Cellular annotation of AD single-cell datasets. (A–C) Uniform manifold approximation and projection (UMAP) visualization of three different single-cell datasets obtained from the superior frontal gyrus (GSE147528), entorhinal cortex (GSE147528), and hippocampus (GSE175814). (D–F) Dot plots illustrate the expression level of canonical markers from different cell types, including SLC17A7 for Excitatory neuron, GAD1 for Inhibitory neuron, CLDN5 for Endothelial, MBP for Oligodendrocyte, DOCK8 for Microglia, GFAP for Astrocyte and PDGFRA for OPC. (G–I) Corresponding UMAPs of the brain regions depict the cellular distribution in different conditions, including Braak 0, Braak 2, and Braak 6 stages (superior frontal gyrus and entorhinal cortex datasets), as well as AD and control states (hippocampus dataset).

Next, we proceeded to examine the expression levels of each gene from the list of 19 genes obtained through the above result. Initially, we screened the expression patterns of all 19 genes across various cell types (Supplementary Fig. S4). The violin plot for these genes indicated a predominant expression in neuron cells (Supplementary Fig. S4A–C). Intrigued by the potential role of these genes in AD, particularly how their dysregulation in neuron cells could contribute to AD development, we proceeded with a targeted approach to filter potential gene targets. Our selection criteria were as follows: genes upregulated in AD groups compared to healthy control groups in the microarray data were expected to show similar patterns in single-cell analyses of neuron cells, and vice versa for downregulated genes.

Overall, we identified four genes, KCNQ2, RPH3A, SV2A and TMEM59L, that met all the specified criteria. First, all four genes showed significant enrichment in excitatory and inhibitory neurons across the entorhinal cortex, superior frontal gyrus, and hippocampus regions. Indeed, the violin plot indicates a higher expression level of these genes in neurons than in other cell types (Supplementary Fig. S4D–F). Moreover, expression levels of the four genes significantly decreased in both early (Braak stage 2) and advanced (Braak stage 6) AD stages relative to healthy controls (Braak stage 0) in the entorhinal cortex and superior frontal gyrus, as well as in AD hippocampal samples versus healthy controls (Fig. 6A,C,E). As shown in Fig. 6B,D,F, the UMAP projections depict the expression patterns of KCNQ2, RPH3A, SV2A and TMEM59L across excitatory and inhibitory neuronal cell types in each brain tissue dataset.

Figure 6

Gene expression patterns and differential analysis in neuron cells across Braak stages and disease states. (A,C,E) Violin plots depicting the significant differences in gene expression levels of KCNQ2, RPH3A, SV2A and TMEM59L between Braak stages 0 and 2, Braak stages 0 and 6, and Braak stages 2 and 6 (entorhinal cortex and superior frontal gyrus) as well as between AD and control samples (hippocampus) among excitatory and inhibitory neurons. *p < 0.05, **p < 0.01, ****p < 0.0001. (B,D,F) Corresponding gene expression patterns of 4 selected genes plotted on the UMAP coordinates in entorhinal cortex, superior frontal gyrus and hippocampus, respectively. These genes exhibit cell-type specificity in excitatory and inhibitory neurons. The color intensity corresponds to the expression level, with a gradient scale ranging from low (blue) to high (red) expression.

Establishment of molecular signature for the target genes

Next, we examined the involvement of the target genes KCNQ2, RPH3A, SV2A and TMEM59L in biological pathways associated with AD. Because the pathogenic mechanisms of AD can be explained more thoroughly in brain tissue than in blood samples, we assessed the degree of functional pattern similarity among the three brain tissue datasets. An initial gene set enrichment analysis was carried out in the hippocampus, entorhinal cortex, and superior frontal gyrus datasets, using Gene Ontology Biological Process (GO:BP) terms as pathway references. We then selected the pathways that were significantly different in each dataset, using the following thresholds: p-value < 0.05, False Discovery Rate < 0.25, and |Normalized Enrichment Score| > 1.5. We specifically targeted downregulated pathways, consistent with our finding that the candidate genes are downregulated in AD relative to controls, and determined which pathways in each dataset involve at least one of the listed genes. The entorhinal cortex dataset showed 11 pathways that met these criteria, the superior frontal gyrus dataset showed 25 pathways, and the hippocampus dataset showed 37 pathways, as shown in Fig. 7A–C. Afterwards, our focus shifted towards identifying the pivotal pathways that are collectively regulated across all three brain regions. To achieve this, we intersected the pathways associated with each gene. The Sankey plot illustrates the participation of individual genes in specific pathways (Fig. 8). KCNQ2 plays a crucial role in “synaptic signaling”. RPH3A is linked to five signaling pathways, while SV2A is associated with ten shared pathways across all three brain regions. In contrast, TMEM59L is not involved in any signaling pathway according to the GO:BP reference. Notably, the three remaining candidate genes, KCNQ2, RPH3A, and SV2A, all contribute to “synaptic signaling”, emphasizing the pivotal role of this pathway in the development of AD.
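The pathway selection and intersection logic described above can be sketched as follows. The thresholds are those stated in the text; the class `PathwayResult`, the helper names, the region keys, the pathways labelled as hypothetical, and every numeric value are invented for illustration and do not reproduce the actual enrichment results.

```python
from dataclasses import dataclass

TARGET_GENES = frozenset({"KCNQ2", "RPH3A", "SV2A", "TMEM59L"})


@dataclass(frozen=True)
class PathwayResult:
    name: str
    pvalue: float
    fdr: float
    nes: float          # normalized enrichment score
    genes: frozenset    # genes annotated to the pathway


def keep_pathway(p, targets=TARGET_GENES):
    """Significant, downregulated (NES < -1.5), and containing a target gene."""
    return (
        p.pvalue < 0.05
        and p.fdr < 0.25
        and p.nes < -1.5
        and bool(p.genes & targets)
    )


def shared_pathways(results_by_region):
    """Names of pathways that pass the filter in every region."""
    kept = [
        {p.name for p in rows if keep_pathway(p)}
        for rows in results_by_region.values()
    ]
    return set.intersection(*kept) if kept else set()


syn = frozenset({"KCNQ2", "RPH3A", "SV2A"})
results_by_region = {
    "entorhinal": [
        PathwayResult("synaptic signaling", 0.001, 0.01, -2.1, syn),
        PathwayResult("hypothetical pathway X", 0.2, 0.5, -1.0, frozenset({"SV2A"})),
    ],
    "sfg": [
        PathwayResult("synaptic signaling", 0.002, 0.02, -1.9, syn),
        PathwayResult("hypothetical pathway Y", 0.01, 0.05, -1.8, frozenset({"OTHER"})),
    ],
    "hippocampus": [
        PathwayResult("synaptic signaling", 0.004, 0.03, -1.7, syn),
        PathwayResult("hypothetical pathway Z", 0.01, 0.05, 1.9, frozenset({"RPH3A"})),
    ],
}

shared = shared_pathways(results_by_region)
print(shared)
```

Filtering each region first and intersecting afterwards keeps the per-region pathway counts available for reporting before the shared set is derived.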

Figure 7

Functional pattern similarity among three brain tissue datasets. (A–C) Lollipop plots illustrating shared downregulated pathways, with at least one of KCNQ2, RPH3A, SV2A, and TMEM59L, in the entorhinal cortex, superior frontal gyrus, and hippocampus, respectively.

Figure 8

The Sankey plot illustrates the participation of individual genes in specific pathways overlapped between the three brain tissue datasets, using bulk-seq data.

To confirm the results from the microarray datasets, we explored whether these pathways are consistently present in single-cell datasets, particularly in neuronal cell types. For this purpose, we conducted differential gene expression analysis across the single-cell datasets using the ‘FindAllMarkers’ function in Seurat. We compared neurons with other cell types and performed enrichment analysis. We observed that the 11 pathways found in Fig. 7 also ranked among the top significantly upregulated pathways in neurons (Supplementary Fig. S5). Additionally, we identified downregulated pathways in brain tissue single-cell datasets, particularly immune-related pathways.

Validation of the target proteins’ expression level in AD model mouse brain

Drawing on the findings from our initial analyses, we decided to verify the expression levels of three candidate proteins, rabphilin-3A (RPH3A), potassium voltage-gated channel subfamily KQT member 2 (KCNQ2), and synaptic vesicle glycoprotein 2A (SV2A), in a mouse model to explore their potential roles in AD pathology. As shown in Fig. 9, there was a significant reduction of RPH3A levels in the hippocampus of the 3xTg-AD group compared to the control group. Although KCNQ2 and SV2A levels also decreased, the changes were not statistically significant.

Figure 9

Expression level of RPH3A, KCNQ2, and SV2A in the hippocampus of wildtype (WT) and AD (3xTg-AD) mice. (A) Western blotting of the three candidate proteins in mouse hippocampus; three mice in each group were examined. The samples derive from the same experiment, and the gels/blots were processed in parallel. Original blots are presented in Supplementary Figs. S6–S8. (B) Bar graph showing the statistical results of the Western blots in WT and 3xTg-AD groups. **p < 0.01.

Discussion

AD is a complex and prevalent disorder with significant impacts on global health that poses major challenges for health care systems. Investigating the pathogenesis of AD could provide new insights into the etiology and outcomes of the disease, as well as new avenues for diagnosis, prevention, and treatment.

In this study, we analyzed a variety of datasets from both healthy individuals and AD patients, including blood samples and tissue samples from different brain regions: the entorhinal cortex, hippocampus, and superior frontal gyrus. These regions are known to be significantly affected by AD, making them relevant for investigating the molecular changes associated with the disease. The superior frontal gyrus is involved in executive functions, working memory, and spatial processing26. Reduced cortical thickness in the superior frontal gyrus is a characteristic feature of AD and is associated with the observed cognitive impairments27,28. The entorhinal cortex, which plays a crucial role in memory formation and learning, is frequently affected by early histological changes in AD, including the formation of neurofibrillary tangles and neuronal cell death29,30. In the entorhinal cortex, dysregulation of Yin Yang 1 (YY1) affects the expression of genes essential for neuronal survival, leading to neuronal death and exacerbating the neurodegenerative process in this brain region30,31. Functional magnetic resonance imaging studies have specifically identified the lateral entorhinal cortex as the first region affected by AD before the disease spreads to other brain regions. Lastly, the hippocampus, a crucial brain region involved in spatial navigation and in the formation and retrieval of long-term memories, is significantly affected in the early stages of AD32,33. This approach aimed to provide robust evidence that our target genes not only show specificity for AD classification in blood samples but are also relevant in the brain regions best known to be affected by AD. The set of 42 common DEGs from blood and tissue datasets was used to calculate the AUC score for each gene in classifying patients as AD or normal controls. By validating the AUC scores of these common DEGs across different types of datasets, we enhanced the reliability and generalizability of our findings, thereby identifying genes that hold promise as potential biomarkers or therapeutic targets in AD research.

Nineteen candidates were identified from the pool of 23 common genes obtained through AUC-based sorting, including ADD2, AP2M1, ATP6V0C, ATP6V1G2, C12orf10, CACNG2, FAM174B, GLT1D1, INA, KCNQ2, KLHL35, MCAT, NOTCH2NL, OSBPL11, RPH3A, SEZ6L2, SV2A, SYT3 and TMEM59L, which met our predefined consistency criteria. We anticipated consistent changes in gene expression patterns, with genes elevated in AD blood samples also upregulated in the tissue microarray datasets from different brain regions, and downregulated genes likewise downregulated in those datasets. These criteria were established to detect consistent molecular alterations across several sample types, strengthening the reliability of our results. We then expected a consistent pattern in the single-cell profiling, aligning with the gene expression changes observed in the AD microarray datasets. Among the 19 candidates, only four genes, KCNQ2, RPH3A, SV2A, and TMEM59L, exhibited a significant decrease in expression in AD patients compared to control individuals in single-cell datasets from the superior frontal gyrus, entorhinal cortex, and hippocampus. Furthermore, the predominant expression of these four genes in neuronal cell types, including excitatory and inhibitory neurons, compared to other cell types highlights their potential importance in the context of AD. Since neurons are among the primary functional units of the brain, they are central to understanding the biological complexities of AD development. The higher expression of these genes in neurons than in other cell types suggests that these cells may be affected by processes closely linked to the development of AD. This led us to prioritize neuronal cell types for further analysis. To gain a more in-depth understanding, we focused on brain tissue datasets, and our gene set enrichment analysis revealed distinct downregulated pathways, aligning with our observation of downregulated candidate genes in AD compared to controls in the entorhinal cortex, superior frontal gyrus, and hippocampus datasets. Unexpectedly, TMEM59L, while not directly associated with specific pathways, may still have an effect through complex interactions that are not yet fully understood. The reduced levels of the remaining candidate genes KCNQ2, RPH3A, and SV2A were associated with the shared pathway “synaptic signaling” in all three tissue datasets.

To verify the bioinformatic findings, we validated the expression levels of 3 proteins—RPH3A, KCNQ2, and SV2A—through in vivo experiments using a triple transgenic AD mouse model. Analysis of mouse brain tissue revealed a significant reduction only in RPH3A levels, particularly in the hippocampus, but not KCNQ2 or SV2A. The lack of significant decrease in KCNQ2 and SV2A levels in the AD mouse model could be attributed to several factors. Firstly, the stage of disease progression could influence the expression of these proteins, as certain proteins might only show significant changes at specific stages of the disease. Additionally, tissue-specific expression might also play a role, as changes in KCNQ2 and SV2A levels might be more pronounced in other brain regions, not just hippocampus, entorhinal cortex, or superior frontal gyrus. Besides, cellular regulatory mechanisms could maintain the levels of these proteins despite the disease condition. Moreover, post-translational modifications could alter their detection by the antibodies used in the study, potentially affecting the observed expression levels. Lastly, factors such as differences in biological variability, or individual differences between the mice could contribute to the lack of significant changes. These are potential explanations, and further investigations would be needed to confirm them. Nonetheless, the experimental findings suggest that RPH3A could indeed serve as a reliable indicator of AD pathology.

RPH3A encodes a protein involved in membrane trafficking and exocytosis, particularly in the regulation of synaptic vesicle exocytosis in neurons and in facilitating the release of neurotransmitters34,35. Via an N-terminal Rab-binding domain, RPH3A binds to Ras-related protein RAB3A, a small GTP-binding protein, and interacts with synaptosome-associated protein (SNAP)-25 to regulate secretory vesicles and neurotransmitter exocytosis by controlling the docking and fusion of synaptic vesicles to the presynaptic membrane36,37,38. A recent study has revealed that missense mutations in RPH3A cause an ultra-rare neurodevelopmental disorder with varied expressivity, which is associated with learning difficulties, intellectual disability, autistic spectrum disorder, and epilepsy39. In addition, recent research has demonstrated an association between RPH3A depletion and neurodegenerative disorders such as AD and Huntington's disease, indicating its critical role in synaptic function40,41. However, the underlying mechanisms related to this protein in AD, particularly in the context of synaptic signaling and trafficking, remain unclear. According to Michelle G. K. Tan et al.’s research, there were distinct decreases in RPH3A when compared to aged controls, and the loss of RPH3A was linked to higher levels of Aβ, cholinergic deafferentation, and dementia severity40. Additionally, the study demonstrated that RPH3A expression is downregulated in cultured neurons treated with Aβ25–35 peptides. RPH3A has been observed in conjunction with amyloid plaques and appears to be more associated with cerebral amyloid angiopathy than with NFTs or Tau. Its expression was reduced in several brain regions including hippocampus and entorhinal cortex42. On the other hand, one study indicates that a group of synaptic vesicle transport-related proteins, including RPH3A, exhibited a strong correlation with both total tau and phosphorylated tau in CSF of asymptomatic individuals43. Research on mouse models with elevated human tau expression has demonstrated a link between tau level and reduced synaptic proteins, yet the specific expression pattern of RPH3A within these models remains to be elucidated44. Moreover, the precise mechanisms by which RPH3A interacts with Aβ or tau proteins in AD warrant more dedicated investigation. This targeted research is essential to unravel the complex interplay and potential therapeutic implications of these interactions in the pathology of AD. Furthermore, the interplay between Aβ and tau proteins in AD is complex and reciprocal: Aβ accumulation induces tau hyperphosphorylation, which in turn exacerbates Aβ pathology, creating a feedback loop45. This complexity makes it challenging to isolate the effect of each hallmark on RPH3A reduction.

In both excitatory and inhibitory neurons, RPH3A, as a synaptic protein, plays a pivotal role in synaptic vesicle exocytosis, influencing neurotransmitter release and synaptic function. In our study, the observed decrease in RPH3A expression in the AD groups aligns with recent studies highlighting synaptic dysfunction as a hallmark of AD pathology42,46. A recent study revealed that RPH3A undergoes phase separation, a process that regulates the mobility and surface clustering of the GluN2A subunit of N-methyl-D-aspartate (NMDA) receptors47. These receptors are pivotal for excitatory neuron function, playing a key role in brain activity and synaptic plasticity. Disruption of RPH3A phase separation impairs the synaptic localization and stability of GluN2A, which could contribute to the synaptic deficits observed in AD. Although direct studies linking RPH3A specifically to inhibitory neurons in AD are limited, the overall loss of synaptic integrity observed in AD likely affects both excitatory and inhibitory neurons48. Inhibitory neurons, which are crucial for maintaining the balance of neural activity, may become dysregulated, leading to increased neural network excitability and contributing to the cognitive deficits observed in AD. The disruption of synaptic proteins, including RPH3A, in inhibitory neurons could therefore affect neural circuitry and brain function. In addition, the downregulation of RPH3A expression observed in AD could potentially disrupt RAB3A GTPase activity, leading to the impairment of APP anterograde transport along axons. APP, known to be anterogradely transported in neurons, undergoes proteolytic processing during transit11,49. Kinesin molecular motors mediate the fast anterograde transport of APP-containing vesicles, ensuring the continuous replenishment of APP at synaptic terminals. 
Importantly, disruption of APP transport could contribute to the dysregulation of APP metabolism and the accumulation of Aβ peptides, hallmark features of AD pathology9,50,51. Furthermore, synaptic dysfunction resulting from the accumulation of intracellular Aβ (iAβ) has been implicated in the early stages of AD52. iAβ-induced presynaptic dysfunction may arise from aberrant APP processing and trafficking, leading to the accumulation of toxic Aβ oligomers and impairing neuronal communication.

RPH3A has also been shown to interact with SNARE proteins, which mediate vesicle docking and fusion during regulated exocytosis35. The formation of the SNARE complex facilitates membrane fusion and neurotransmitter release in response to calcium influx, a critical step in synaptic transmission. RPH3A facilitates the assembly of the SNARE complex by binding its C2B domain to the N-peptide of SNAP25. This interaction promotes a pre-fusion function in vesicle exocytosis36. In the context of synaptic vesicle traffic, it was revealed that APP is a cargo of a vesicle containing the kinesin heavy chain isoform kinesin-1C, the small GTPase RAB3A, and a specific subset of presynaptic protein components such as synapsin-I, SNAP25, syntaxin-1B, VAMP2, Munc13-1, RIM2, Rab3, and Rab3GAP p130 and p15049. Notably, all these protein cargoes are known to play crucial roles in various aspects of synaptic vesicle fusion, highlighting the potential involvement of APP in the regulation of vesicle fusion and synaptic function. Hence, the interaction between RPH3A and SNARE proteins, along with the presence of APP in these vesicles, points to the complex molecular mechanisms underlying synapse-related pathways in AD.

This study has several limitations. First, our investigation into RPH3A expression in neuronal subtypes is limited by the complexity of synaptic functions and by the inability of current bioinformatic tools to model these interactions precisely. While we can measure expression levels, understanding the detailed mechanisms requires more sophisticated experimental methods not employed in this study. Second, we acknowledge that while AD model mice provide valuable preliminary insights, they cannot fully replicate the complexity of human disease. Our findings require validation through further experiments using clinical samples from AD patients. Third, to gain a deeper understanding of the mechanisms at play, in vitro studies using human neurons, specifically excitatory and inhibitory subtypes, are essential. Finally, the blood dataset used in this study (GSE97760) comprises exclusively female samples, which may limit the generalizability of our findings to the broader population. Future research should aim to include a sex-balanced cohort.

Conclusion

Through bioinformatic analyses and in vivo experiments, we have identified RPH3A as a promising candidate biomarker for AD pathology, with significant downregulation observed in the hippocampus of AD model mice. Our identification of associated pathways, particularly in synaptic signaling, provides crucial insights into RPH3A’s role in AD molecular mechanisms. While this study lays groundwork for potential biomarkers and therapies, further functional validation is essential for advancing AD diagnosis and treatment.

Methods

Data collection

The data used in this study were collected from publicly available gene expression datasets obtained from the GEO database using the ‘GEOquery’ package (version 2.68.0). For the differential gene expression (DGE) analysis, we obtained the GSE97760 microarray dataset53, which includes gene expression profiles from blood samples of 9 AD patients and 10 control subjects. Furthermore, AD tissue datasets covering three distinct brain regions were employed for validation: hippocampus and superior frontal gyrus (GSE48350)54, and entorhinal cortex (GSE5281)55. Lastly, the expression patterns of the key gene panel at single-cell resolution were assessed using high-throughput single-nucleus RNA sequencing datasets obtained from the same brain regions. Specifically, we analyzed the superior frontal gyrus and entorhinal cortex dataset (GSE147528)56 and the hippocampus dataset (GSE175814)57. All expression matrices from the microarray datasets, as well as the raw gene-barcode matrices output by Cellranger, have been filtered and preprocessed. They can be downloaded directly from the GEO database using the corresponding GSE numbers.

Differential gene expression (DGE) analysis

We performed a differential gene expression analysis using the ‘limma’ package to compare gene expression profiles between AD patients and control individuals58. Volcano plots and expression heatmaps were generated using the ‘ggplot2’ (version 3.5.1) and ‘pheatmap’ (version 1.0.12) packages, respectively. Genes were considered DEGs if they exhibited |logFC| > 0.5 and adjusted p-value (p-adj) < 0.05.
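
The DEG criterion above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the analysis itself used limma in R), and the `GeneStat` record, the `select_degs` helper, and the toy values are hypothetical rather than results from this study.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log_fc: float   # log fold change, AD vs. control
    p_adj: float    # multiple-testing-adjusted p-value


def select_degs(stats, lfc_cutoff=0.5, padj_cutoff=0.05):
    """Keep genes with |logFC| > lfc_cutoff and p-adj < padj_cutoff."""
    return [s for s in stats
            if abs(s.log_fc) > lfc_cutoff and s.p_adj < padj_cutoff]


def direction(stat):
    """Label a DEG as up- or downregulated in AD from the sign of logFC."""
    return "up" if stat.log_fc > 0 else "down"


# Toy values for illustration only; not data from the study.
toy_table = [
    GeneStat("GENE_A", -1.2, 0.001),   # passes both cutoffs
    GeneStat("GENE_B", 0.3, 0.0001),   # fails the |logFC| cutoff
    GeneStat("GENE_C", 0.9, 0.2),      # fails the p-adj cutoff
    GeneStat("GENE_D", 0.7, 0.01),     # passes both cutoffs
]
degs = select_degs(toy_table)
deg_labels = {s.gene: direction(s) for s in degs}
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted p-value is excluded.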

Functional enrichment analysis

We identified the pathways in which our target gene is involved by utilizing several databases, including Gene Ontology and KEGG. The potential biological processes and pathways associated with the DEG list obtained from the DGE analysis were identified using the ‘clusterProfiler’ R package (version 4.10.1)59 and the SRplot online tool (https://www.bioinformatics.com.cn/). In addition, GSEA was performed to calculate the normalized enrichment score (NES) of each gene set that is positively or negatively correlated with the pre-ranked gene list, in order to identify pathways enriched in AD compared with control60. The ‘msigdbr’ package (version 7.5.1) was used to obtain gene sets from the Molecular Signatures Database. Additionally, we utilized the ‘fgsea’ package (version 1.28.0) to calculate the enrichment score of each pathway and the ‘gggsea’ package (version 0.1.0) to plot the enrichment curves.
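
As a rough illustration of what an enrichment score measures, the sketch below computes the raw (unnormalized) running-sum statistic for one gene set over a pre-ranked list. It is a simplified Python illustration and not the fgsea implementation: it assumes a weighting exponent of 1 and omits the permutation step needed to obtain the NES and its significance. The function name and toy ranking are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Raw GSEA-style enrichment score for one gene set.

    ranked   : list of (gene, score) pairs sorted by score, descending
    gene_set : set of gene symbols
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    hit_weights = [abs(score) ** weight if gene in gene_set else 0.0
                   for gene, score in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for gene, _ in ranked if gene not in gene_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")

    running, best = 0.0, 0.0
    for (gene, _), w in zip(ranked, hit_weights):
        if gene in gene_set:
            running += w / total_hit      # step up at a gene-set member
        else:
            running -= 1.0 / n_miss       # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking for illustration only.
toy_ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0),
              ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es_bottom = enrichment_score(toy_ranked, {"E", "F"})  # set sits at the bottom
es_top = enrichment_score(toy_ranked, {"A", "B"})     # set sits at the top
```

A gene set concentrated at the bottom of the ranking yields a negative score, which is the situation described as downregulation in AD.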

Validation and predictive modeling of hub genes for AD by receiver operating characteristic curve analysis

To validate the potential hub genes and assess their predictive power, we built a predictive model and scored it by the area under the curve (AUC). The model aimed to predict AD status from gene expression profiles. First, we utilized the gene expression data from our study cohort, which consisted of samples from AD and control individuals. We selected the DEGs identified in the DGE analysis as features for the regression model. Subsequently, we subset the dataset into gene expression matrices in which the columns represented gene symbols from the DEG list and the rows corresponded to sample IDs. The gene expression profiles of the DEGs were used as input variables, while disease status (AD or control) served as the target variable. We computed the receiver operating characteristic (ROC) curve, which plots the true positive rate against the false positive rate, and calculated the AUC as a measure of the model’s discriminatory power, using the ‘pROC’ R package61.
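
For a single gene, the empirical AUC equals the probability that a randomly chosen AD sample has a higher value than a randomly chosen control, with ties counted as one half. The sketch below illustrates that identity in plain Python; it is not the pROC code path used in the study, and the `direction_free_auc` helper, which reports the AUC regardless of whether the gene is up- or downregulated, is an assumption added for illustration.

```python
def auc(case_values, control_values):
    """Empirical AUC: P(case > control) + 0.5 * P(case == control)."""
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def direction_free_auc(case_values, control_values):
    """AUC folded around 0.5 so that downregulated genes also score high."""
    a = auc(case_values, control_values)
    return max(a, 1.0 - a)


# Toy expression values for illustration only.
toy_cases = [1.0, 2.0, 3.0]
toy_controls = [2.5, 4.0, 5.0]
toy_auc = auc(toy_cases, toy_controls)
toy_auc_folded = direction_free_auc(toy_cases, toy_controls)
```

In the toy data only one of the nine case-control pairs has the case value higher, so the raw AUC is 1/9 and the folded value is 8/9.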

Initially, we applied our selected methods to analyze the blood dataset (GSE97760) associated with AD and brain tissue samples obtained from three distinct regions: the superior frontal gyrus and hippocampus (GSE48350) and the entorhinal cortex (GSE5281). We explored the intersection of DEGs across all microarray datasets using the ‘UpSetR’ package (version 1.4.0). Genes that exhibited an AUC value exceeding 0.75 in all datasets were selected for further analysis.
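
The selection rule in this paragraph amounts to a set intersection across datasets. A minimal Python sketch follows, assuming per-dataset AUC values are already available as dictionaries; the dataset labels, gene names, and AUC values are placeholders, not results from the study.

```python
def genes_passing_everywhere(auc_by_dataset, threshold=0.75):
    """Return genes whose AUC exceeds the threshold in every dataset.

    auc_by_dataset maps a dataset label to a {gene: AUC} dictionary.
    A gene missing from any dataset cannot pass.
    """
    passing_sets = [
        {gene for gene, value in aucs.items() if value > threshold}
        for aucs in auc_by_dataset.values()
    ]
    if not passing_sets:
        return set()
    return set.intersection(*passing_sets)


# Placeholder AUC values for illustration only.
toy_aucs = {
    "blood":             {"GENE_A": 0.91, "GENE_B": 0.80, "GENE_C": 0.95},
    "hippocampus":       {"GENE_A": 0.78, "GENE_B": 0.70, "GENE_C": 0.88},
    "entorhinal_cortex": {"GENE_A": 0.82, "GENE_B": 0.85},
}
selected = genes_passing_everywhere(toy_aucs)
```

Here `GENE_B` fails in one dataset and `GENE_C` is absent from another, so only `GENE_A` is retained.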

Single-cell transcriptomic analysis

The expression patterns of the identified genes were further explored using single-cell resolution data from various brain regions mentioned in the data collection section. All hub genes were screened in three single-cell datasets obtained from entorhinal cortex, superior frontal gyrus (GSE147528)56, and hippocampus (GSE175814)57. We examined the expression profiles of our genes of interest in different cell types, including neurons, astrocytes, microglia, oligodendrocytes, OPCs and endothelial cells, which are known to have important roles in the pathophysiology of AD.

The preprocessing steps for the single-cell data followed the methodologies described in the original studies, including filtering and quality control, data normalization, dimensionality reduction, clustering, and UMAP visualization. Briefly, we obtained all CellRanger output files for each sample and then created merged Seurat objects using the ‘Seurat’ package (version 5.0.3). We excluded cells with fewer than 500 UMIs. Next, we followed a standard workflow: normalizing the expression matrix, identifying variable features (2000 features), scaling the data, performing clustering (with a resolution of 1), and running UMAP. Cell types were annotated using canonical markers, specifically SLC17A7 for excitatory neurons, GAD1 for inhibitory neurons, CLDN5 for endothelial cells, MBP for oligodendrocytes, DOCK8 for microglia, GFAP for astrocytes, and PDGFRA for OPCs. The reliability of the annotations was validated using “The Alzheimer Cell Atlas” (https://taca.lerner.ccf.org/)62, SCAD-Brain (https://www.bioinform.cn/SCAD/)63, and the original publications.
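
The UMI filter and the marker-based annotation can be sketched as follows. This is a simplified Python illustration, not the Seurat workflow itself: the rule of assigning each cluster the cell type whose marker has the highest mean expression is an assumption made for the example, and the cluster values are made up.

```python
# Canonical markers as listed in the text (marker gene -> cell type).
MARKERS = {
    "SLC17A7": "Excitatory Neuron",
    "GAD1": "Inhibitory Neuron",
    "CLDN5": "Endothelial",
    "MBP": "Oligodendrocyte",
    "DOCK8": "Microglia",
    "GFAP": "Astrocyte",
    "PDGFRA": "OPC",
}


def keep_cell(total_umis, min_umis=500):
    """Quality filter: drop cells with fewer than min_umis UMIs."""
    return total_umis >= min_umis


def annotate_cluster(mean_expression):
    """Assign the cell type whose marker is most highly expressed.

    mean_expression maps gene symbols to the cluster's mean normalized
    expression; markers that are absent are treated as 0. The argmax rule
    is an illustrative assumption, not the authors' stated procedure.
    """
    best_marker = max(MARKERS, key=lambda g: mean_expression.get(g, 0.0))
    return MARKERS[best_marker]


# Made-up cluster profiles for illustration only.
toy_clusters = {
    0: {"SLC17A7": 2.4, "GAD1": 0.1, "GFAP": 0.2},
    1: {"GAD1": 1.9, "SLC17A7": 0.3},
    2: {"GFAP": 3.1, "MBP": 0.4},
}
toy_labels = {cid: annotate_cluster(expr) for cid, expr in toy_clusters.items()}
```

In practice, annotations derived this way would still be checked against reference atlases, as described above.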

For the entorhinal cortex and superior frontal gyrus single-cell datasets, three pairwise comparisons were performed: “Braak 0 and Braak 2,” “Braak 0 and Braak 6,” and “Braak 2 and Braak 6.” For the hippocampus single-cell dataset, an “AD and Control” comparison was performed. Differences between groups were assessed using the unpaired Wilcoxon test. Statistical significance was indicated by: *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001.
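
A minimal Python sketch of an unpaired Wilcoxon (rank-sum) comparison and the star notation is given below. It assumes a two-sided normal approximation with tie and continuity corrections, which may differ from the exact procedure used by the software in the study, and the "ns" label for non-significant results is an assumption since the text does not define one.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0          # average of ranks i+1 .. j
        t = j - i                            # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0                           # all values identical
    z = max((abs(u - mean_u) - 0.5) / sqrt(var_u), 0.0)
    return 2.0 * (1.0 - NormalDist().cdf(z))


def stars(p):
    """Map a p-value to the star notation used in the figures."""
    for cutoff, label in ((0.0001, "****"), (0.001, "***"),
                          (0.01, "**"), (0.05, "*")):
        if p < cutoff:
            return label
    return "ns"  # assumed label for p >= 0.05


# Toy values for illustration only.
p_same = rank_sum_p([1, 2, 3], [1, 2, 3])
p_apart = rank_sum_p(list(range(1, 11)), list(range(11, 21)))
```

With three pairwise Braak-stage comparisons per gene, the same function would simply be called once per pair.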

Establishment of molecular signature for the target gene

To further investigate the functional relevance of the biological pathways linked to the candidate genes, we compared the pathway signature scores between the AD and control groups by screening the previously described GSEA data. Next, we used a filtering approach to identify important pathways that our target genes shared across brain regions. Gene set enrichment analysis was conducted for each tissue dataset, and significantly downregulated gene sets were identified (NES < −1.5, False Discovery Rate < 0.25). Shared downregulated pathways were then sought, each containing at least one of the genes KCNQ2, RPH3A, SV2A in the entorhinal cortex, superior frontal gyrus, and hippocampus, respectively.
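
The filtering step can be sketched as a per-region threshold followed by an intersection. The Python example below is illustrative only: it assumes that "containing" refers to gene set membership (rather than, for example, the leading-edge subset), and the `GseaResult` record, pathway names, and values are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class GseaResult(NamedTuple):
    """One GSEA result row (hypothetical layout)."""
    pathway: str
    nes: float               # normalized enrichment score
    fdr: float               # false discovery rate
    genes: FrozenSet[str]    # members of the gene set


def downregulated_with_gene(results, gene, nes_cutoff=-1.5, fdr_cutoff=0.25):
    """Pathways with NES < nes_cutoff, FDR < fdr_cutoff, containing `gene`."""
    return {r.pathway for r in results
            if r.nes < nes_cutoff and r.fdr < fdr_cutoff and gene in r.genes}


def shared_pathways(results_by_region, gene):
    """Pathways passing the filter in every brain region."""
    per_region = [downregulated_with_gene(results, gene)
                  for results in results_by_region.values()]
    return set.intersection(*per_region) if per_region else set()


# Placeholder results for illustration only.
toy_results = {
    "entorhinal_cortex": [
        GseaResult("PATHWAY_X", -2.0, 0.01, frozenset({"RPH3A", "SV2A"})),
        GseaResult("PATHWAY_Y", -1.2, 0.01, frozenset({"RPH3A"})),
    ],
    "hippocampus": [
        GseaResult("PATHWAY_X", -1.8, 0.10, frozenset({"RPH3A", "SV2A"})),
        GseaResult("PATHWAY_Y", -1.9, 0.05, frozenset({"RPH3A"})),
    ],
}
toy_shared = shared_pathways(toy_results, "RPH3A")
```

`PATHWAY_Y` is dropped because its NES in one region does not fall below the cutoff, even though it passes in the other.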

Animal model

Wild-type (C57BL/6 or B6) and 3xTg-AD mice [B6; 129-Psen1tm1Mpm Tg(APP-Swe, TauP301L)1Lfa/Mmjax] expressing three mutant genes, namely Psen1-M146V, APP-K670N/M671L, and Tau-P301L, were obtained from the Jackson Laboratory (Bar Harbor, ME, USA) and have been used in our other studies9,64. All mice were maintained under identical conditions and fed the same diet until they reached 9 months of age, with no experimental interventions applied. In this study, three adult 3xTg-AD and three C57BL/6J mice were anesthetized using isoflurane and subsequently euthanized by decapitation. Their hippocampal tissues were then quickly collected for western blot experiments. All animal procedures were approved by the Institutional Animal Care and Use Committee of Taipei Medical University (Laboratory Access Certificate No.: LAC-2022-0130). All experimental procedures adhered to the ARRIVE guidelines and the American Veterinary Medical Association Guidelines for the Euthanasia of Animals (2020).

Immunoblotting

After the hippocampal tissues were homogenized to collect tissue lysates, the total protein concentration of the mouse brain lysates was determined using Bradford reagent (Bio-Rad Laboratories, Inc., CA, USA). Samples were then denatured by boiling, and 25 µg of total protein per sample was separated on SDS-PAGE gels and transferred onto PVDF membranes. After blocking and washing, the membranes were probed with primary antibodies against KCNQ2 (Cell Signaling Technology, #14752), RPH3A (Proteintech, #11396-1-AP), and SV2A (Abcam, ab254351) at 4 °C overnight. After adequate washing, the membranes were incubated with HRP-labeled goat anti-rabbit IgG and goat anti-mouse IgG secondary antibodies for 1 h. Afterwards, the membranes were washed again and incubated with a chemiluminescent substrate (Bio-Rad). Bands were detected and imaged using an ImageQuant LAS 4000 imager. Band densities were quantified using ImageJ software.

Statistical analysis

Statistical tests were performed using R software (version 4.1.3). Statistical analysis and graphical representations for experimental data were conducted using Prism software.