Introduction

Subarachnoid hemorrhage (SAH) is a disease with high mortality and morbidity, with approximately 80% of cases caused by the rupture of intracranial aneurysms1,2. Despite notable advancements in surgical resection and endovascular interventions in recent years, the prognosis for patients with ruptured aneurysms remains unfavorable1,3,4. Following SAH, the body undergoes a series of pathophysiological changes, including the activation of immune and inflammatory responses, neuronal cell death, microvascular dysfunction, delayed cerebral ischemia, and widespread systemic dysfunction5,6,7. The pathophysiological process of SAH is extremely complex, yet our understanding of its mechanisms remains limited. Therefore, an in-depth knowledge of these intricate pathological mechanisms and the identification of effective therapeutic targets are crucial for improving the long-term prognosis of SAH patients.

The objective of this study is to use bioinformatics methods and machine learning algorithms to investigate the distinctive gene profiles and immune cell infiltration patterns following SAH. Candidate genes and key modules are identified through Weighted Gene Co-Expression Network Analysis (WGCNA), and the characteristic genes of SAH are then screened further using machine learning algorithms and proteomics (based on our cerebrospinal fluid (CSF) proteomics data from SAH patients). Finally, we analyze correlations between diverse immune cells and the characteristic genes, and verify high levels of A2M and GLIPR2 expression in the cerebrospinal fluid of SAH patients and the cortex of SAH rats, providing promising targets for future research.

Results

Variance analysis and quality control

Quality control and differential analysis of the transcriptome and metabolome are detailed in our previous article. A total of 1198 differentially expressed genes (DEGs, |Fold Change| ≥ 2 and False Discovery Rate (FDR) < 0.05) are identified. Between the sham group and the SAH 1d group, 942 DEGs are identified in rat cortical tissues, including 799 upregulated genes and 143 downregulated genes. A total of 540 differentially expressed metabolites (DEMs, p value < 0.05 and Variable Importance in Projection (VIP) > 1) are identified in rat cortical tissues between the sham group and the SAH 1d group, including 264 upregulated metabolites and 276 downregulated metabolites8.
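The DEG thresholds above can be illustrated with a minimal Python sketch. It assumes that fold changes are linear SAH/sham ratios, that "|Fold Change| ≥ 2" means a ratio of at least 2 or at most 1/2, and that the FDR is obtained by Benjamini-Hochberg adjustment; none of these details is stated in the text, and the function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (FDR) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, fold_changes, p_values, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Split genes into (upregulated, downregulated) lists.

    Assumption: fold_changes are linear ratios, so a ratio <= 1 / fc_cutoff
    is treated as downregulated.
    """
    fdr = benjamini_hochberg(p_values)
    up, down = [], []
    for gene, fc, q in zip(genes, fold_changes, fdr):
        if q >= fdr_cutoff:
            continue  # not significant after multiple-testing correction
        if fc >= fc_cutoff:
            up.append(gene)
        elif fc <= 1.0 / fc_cutoff:
            down.append(gene)
    return up, down
```

Calling `select_degs(genes, fold_changes, p_values)` returns the two lists whose lengths correspond to the upregulated and downregulated counts reported above.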

WGCNA network construction and module identification

WGCNA was conducted on the transcriptome of samples from the sham and the SAH 1d groups. Using β = 17 (scale-free R² = 0.84) as the soft threshold, hierarchical clustering is conducted on the samples to construct a scale-free network (Fig. 1A). Hierarchical clustering identifies seven distinct co-expression modules, each comprising a minimum of 30 genes (Fig. 1B). Correlation analysis reveals that the MEyellow module (cor = 0.98, p = 2e−08) and the MEbrown module (cor = 0.95, p = 2e−06) are significantly associated with SAH (Fig. 1B). We proceed to analyze the module core genes (Module Membership, MM > 0.8 and Gene Significance, GS > 0.5) identified in these modules. A combined WGCNA was then conducted on the transcriptome and metabolome data, and the correlation between the transcriptome modules and metabolome modules was calculated following dimensionality reduction. A heatmap of the correlation coefficient (CC) and the p value of the correlation (CCP) is plotted (Fig. 1C, Supplementary Table 1). Modules with CCP < 0.05 and the top 10 correlation coefficients were selected for further analysis. Intersection analyses were performed on the core genes of the transcriptome WGCNA modules, the top 10 related module genes from the transcriptome-metabolome combined WGCNA, and the differentially expressed genes (DEGs) between the sham and the SAH 1d groups. Ultimately, 73 potential DEGs are identified (Fig. 1D) and carried forward to functional analysis.
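The core-gene filter and the Venn-style overlap described above can be sketched in Python as follows. This is an illustration only: the thresholds are applied to the raw MM and GS values as written (some workflows use absolute values), the overlap is assumed to be a plain three-way set intersection, and both function names are hypothetical.

```python
def core_genes(module_membership, gene_significance, mm_cutoff=0.8, gs_cutoff=0.5):
    """Return genes with MM > mm_cutoff and GS > gs_cutoff.

    Both arguments map gene symbol -> value; genes missing a GS value
    are never selected.
    """
    return {
        gene
        for gene, mm in module_membership.items()
        if mm > mm_cutoff and gene_significance.get(gene, float("-inf")) > gs_cutoff
    }


def intersect_candidates(*gene_sets):
    """Intersect one or more gene collections (the overlap of a Venn diagram)."""
    return set.intersection(*map(set, gene_sets))
```

For example, `intersect_candidates(core, combined_module_genes, degs)` would return the genes shared by all three inputs, which corresponds to the central region of a three-set Venn diagram.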

Fig. 1

Construction of WGCNA. (A) The clustering dendrogram of transcriptomics; each leaf corresponds to a different gene module. (B) Heatmap of correlations between modules and traits; each cell contains the corresponding correlation and p value. (C) Heatmap of metabolite module-gene module correlation, with gene modules on the right and metabolite modules on the bottom. (D) Venn diagram identifying the intersection between DEGs and WGCNA-related genes.

Enrichment analysis of 73 DEGs

Subsequently, a functional enrichment analysis was conducted on the 73 DEGs. GO enrichment analysis indicates that these DEGs are predominantly enriched in the following biological processes: response to viruses, leukocyte-mediated immunity, immune response, and complement activation. With regard to cellular components, they are predominantly enriched in blood microparticles, collagen-containing extracellular matrix, and specific granules. In terms of molecular functions, they are predominantly enriched in integrin binding, calcium-dependent protein binding, and enzyme inhibitor activity (Fig. 2A). The KEGG enrichment analysis shows that the DEGs are mainly enriched in the following pathways: Pertussis, Complement and coagulation cascades, TNF signaling pathway, Sphingolipid signaling pathway, and Leukocyte transendothelial migration (Fig. 2B). The DO enrichment analysis demonstrates that the DEGs are involved in nephritis, blood platelet disease, systemic lupus erythematosus, and ischemia, among others (Fig. 2C). To gain further insight into the differences in pathways between the sham and the SAH 1d groups, we conduct a GSEA. The results demonstrate that five pathways are significantly active in the sham group: the calcium signaling pathway, cardiac muscle contraction, long-term potentiation, oxidative phosphorylation, and Parkinson’s disease (Fig. 2E). In contrast, five pathways are significantly active in the SAH 1d group: complement and coagulation cascades, cytokine-cytokine receptor interaction, JAK-STAT signaling pathway, systemic lupus erythematosus, and toll-like receptor signaling pathway (Fig. 2D).

Fig. 2

Functional enrichment of 73 DEGs. (A) Significantly enriched GO terms for DEGs. (B) The significantly enriched KEGG pathways. (C) DO analysis of DEGs. (D) Five active pathways of sham group in GSEA analysis. (E) Five active pathways of SAH 1d group in GSEA analysis.

Machine learning screening for genes

Subsequently, the LASSO analysis is conducted on the aforementioned 73 DEGs, resulting in the identification of eight related genes, namely A2M, CXCL10, GFAP, GLIPR2, GPNMB, LCN2, MCM5, and SPTSSB (Fig. 3A, B). Figure 3C illustrates the differential expression levels of these genes in the sham group, SAH 1d, and SAH 7d groups. Except for SPTSSB, these genes exhibit high expression levels in the SAH 1d group.

Fig. 3

Machine learning screening for relevant genes. (A) LASSO regression cross-validation curve. When 8 genes are used in the analysis, LASSO fits best and the cross-validation error is minimized. (B) LASSO coefficient path diagram; each curve represents one gene. (C) Expression of the 8 related genes in transcriptomics.

Potential association of the LASSO-related genes with metabolites

The association between the LASSO-related genes and metabolites was further analyzed. An intersection analysis is performed on the top 10 related module metabolites from the transcriptome-metabolome combined WGCNA analysis and the differentially expressed metabolites (DEMs) between the sham and the SAH 1d groups (Fig. 4A). Ultimately, 267 potential DEMs are identified (Fig. 4A). These DEMs were mapped to the KEGG database and then imported into MetaboAnalyst 6.0 for KEGG analysis. The results show that the DEMs are enriched in the following pathways: phenylalanine, tyrosine, and tryptophan biosynthesis; phenylalanine metabolism; pentose phosphate pathway; arginine and proline metabolism; and pantothenate and CoA biosynthesis (Fig. 4B, C). Based on the Pearson correlation method, the correlations between these DEMs and the LASSO-related genes are analyzed and plotted as heatmaps (Fig. 4E). Results with |CC| > 0.8 (correlation coefficient) and CCP < 0.05 (p value of the correlation) are imported into Cytoscape for visualization (Fig. 4D, Supplementary Table 2). The results indicate a significant correlation between LASSO-related genes and DEMs.

Fig. 4

Correlation between related genes and DEMs. (A) Venn diagram identifying the intersection between DEMs and WGCNA related genes. (B, D) KEGG enrichment analysis of 267 DEMs. (C, E) Correlation of LASSO-related genes with DEMs. (*p < 0.05, **p < 0.01, ***p < 0.001, n = 6)

Gene expression in human cerebrospinal fluid

4D-DIA protein quantification was performed on patients with SAH (SAH group) and a control group of non-SAH patients (sham group). The principal component analysis (PCA) of the protein expression profiles reveals a distinct separation between the two groups, suggesting a high degree of correlation within each group (Fig. 5A). Figure 5B shows the number of proteins identified in the two groups. A total of 1336 differentially expressed proteins are identified (|Fold Change| ≥ 1.5 and P value < 0.05). In comparison to the sham group, the SAH group exhibits 729 upregulated proteins and 607 downregulated proteins (Fig. 5C). To analyze the expression patterns of samples between and within groups, and thus test the rationality of the grouping, we perform a hierarchical clustering analysis on the differentially expressed proteins and display the result as a heatmap (Fig. 5D). The results show that the differentially expressed proteins effectively distinguish the groups, indicating that the alterations in protein expression levels reflect the substantial impact of the disease on the samples. In the intersection analysis of LASSO-related genes and proteomics data, five potentially significant feature genes are identified: A2M, GFAP, GLIPR2, GPNMB, and LCN2 (Fig. 5E). Figure 5F shows the expression of these potential feature genes in the proteomics data.

Fig. 5

Proteomic analysis. (A) 3D PCA distribution of all samples. (B) Venn diagram showing proteins identified in different groups. (C) Differential histogram of protein quantification in various groups. (D) Heatmap of clustering analysis of differentially expressed proteins in different groups. (E) Venn diagram demonstrating the intersection of proteins with Lasso-associated genes. (F) Characterized genes in proteomics expression.

Immune characterization of samples and immune-related functions of feature genes

The CIBERSORT algorithm was employed to assess differences in the 22 categories of infiltrating immune cells across the samples. The results demonstrate that, in comparison to the sham group, the SAH 1d group exhibits a notable reduction in plasma cells and naïve CD4+ T cells, accompanied by a considerable elevation in activated dendritic cells (Fig. 6A, B). A comparison of the sham and SAH 7d groups reveals a significant downregulation of plasma cells, naïve CD4+ T cells, and M0 macrophages (Fig. 6C, D). These results suggest alterations in the immune microenvironment at SAH 1d, with continued evidence of such changes at SAH 7d. Furthermore, the immune microenvironments in these two phases display similarities. We next explored the correlation between the five feature genes and immune cells. The results demonstrate that, in comparison to the sham group, the expression of LCN2 is positively correlated with the levels of activated dendritic cells (correlation coefficient, r = 0.91) and monocytes (r = 0.58) in SAH 1d (Fig. 7A). The expression of A2M is positively correlated with the levels of activated dendritic cells (r = 0.82) and naïve B cells (r = 0.64), and negatively correlated with the levels of plasma cells (r = −0.66) and naïve CD4+ T cells (r = −0.73) in SAH 1d (Fig. 7B). The expression of GLIPR2 is positively correlated with the levels of activated dendritic cells (r = 0.77), follicular helper T cells (r = 0.73), and eosinophils (r = 0.64), and negatively correlated with the levels of naïve CD4+ T cells (r = −0.62) and plasma cells (r = −0.73) in SAH 1d (Fig. 7C). The expression of GPNMB is positively correlated with the levels of activated dendritic cells (r = 0.87) and naïve B cells (r = 0.59), and negatively correlated with the levels of plasma cells (r = −0.65) and naïve CD4+ T cells (r = −0.73) in SAH 1d (Fig. 7D). The expression of GFAP is positively correlated with the levels of activated dendritic cells (r = 0.79) and naïve B cells (r = 0.65), and negatively correlated with the levels of plasma cells (r = −0.70) in SAH 1d (Fig. 7E). In the sham group vs. SAH 7d group, we similarly investigate the correlation between the expression of LCN2, A2M, GLIPR2, GPNMB, and GFAP and immune cells (Supplementary Fig. 1). In summary, the expression of these genes may be associated with the levels of immune cell infiltration.

Fig. 6

Infiltration analysis of immune cells. (A) Relative abundance of 22 infiltrating immune cells between sham and SAH 1d samples. (B) Violin plot of all 22 immune cell differentially infiltrated fractions between sham and SAH 1d samples. (C) Relative abundance of 22 infiltrating immune cells between SAH 1d and SAH 7d samples. (D) Violin plot of all 22 immune cell differentially infiltrated fractions between SAH 1d and SAH 7d samples. (*p < 0.05, **p < 0.01, ***p < 0.001, n = 6)

Fig. 7

Characterized genes and immune cells correlation analysis between sham and SAH 1d samples. Lollipop charts of the correlation of LCN2 (A), A2M (B), GLIPR2 (C), GPNMB (D), GFAP (E), respectively, with 22 immune cell types.

Construction of a regulatory network of CeRNAs and TFs for feature genes

To further explore the molecular mechanisms of SAH, we constructed competing endogenous RNAs (ceRNAs) and transcription factors (TFs) regulatory networks for feature genes.

The ceRNA hypothesis reveals a new mechanism of RNA interactions. ceRNAs are a newly discovered gene expression regulation mechanism involving protein-coding mRNAs, long non-coding RNAs (lncRNAs), miRNAs, and pseudogene transcripts. It is established that miRNA induces gene silencing by binding to mRNA9. LncRNAs have been demonstrated to influence gene expression through several mechanisms, including the recruitment of transcription activators, the regulation of gene promoters, the modulation of mechanisms, the promotion of mRNA stability, and functioning as a ‘sponge’ for miRNAs, thereby regulating their activity10,11. Next, we predicted not only the miRNAs bound by feature genes but also the lncRNAs bound by these miRNAs. The results demonstrate that A2M can bind to 2 miRNAs (Fig. 8A), LCN2 can bind to 3 miRNAs (Fig. 8B), 40 lncRNAs can compete with GLIPR2 for binding 8 miRNAs (Fig. 8C), 33 lncRNAs can compete with GPNMB for binding 7 miRNAs (Fig. 8D), and 23 lncRNAs can compete with GFAP for binding 4 miRNAs (Fig. 8E). Additionally, the TFs binding to feature genes were predicted. Among them, 18 TFs can bind to GFAP (Fig. 9A), 21 TFs can bind to A2M (Fig. 9B), 24 TFs can bind to GPNMB (Fig. 9C), 34 TFs can bind to LCN2 (Fig. 9D), and 43 TFs can bind to GLIPR2 (Fig. 9E).

Fig. 8

CeRNA of characterized genes. The ceRNA regulatory networks of A2M (A), LCN2 (B), GLIPR2 (C), GPNMB (D) and GFAP (E), separately. Red represents characterized genes, green represents miRNAs, and blue represents lncRNAs.

Fig. 9

TFs regulatory networks of characterized genes. TFs regulatory networks of A2M (A), LCN2 (B), GLIPR2 (C), GPNMB (D) and GFAP (E), respectively.

WB analysis and immunofluorescence validation of A2M and GLIPR2 expression in rat cortex and cerebrospinal fluid of SAH patients

The WB analysis reveals that the expression of A2M and GLIPR2 is elevated in the cerebrospinal fluid of patients with SAH and cortical tissue of rats following SAH (Fig. 10A, B, C). The above results are further verified by immunofluorescence (Fig. 11A, B, C). Figure 11D shows the number of cortical microthrombi in different groups of rats after SAH. In addition, immunodouble-labeling co-localization curve analysis and Pearson’s R-value show that A2M has significant co-localization with CD31 and FGA (Fig. 11E, F, G).

Fig. 10

A2M and GLIPR2 are highly expressed after SAH. (A, C) Western blot images and quantitative analyses of A2M and GLIPR2 in rat cortical tissues. (B) Western blot images of A2M and GLIPR2 in the cerebrospinal fluid of patients. (*p < 0.05, **p < 0.01, ***p < 0.001)

Fig. 11

Relationship between A2M and microthrombosis. (A, B) Typical micrographs showing immunofluorescence staining of A2M with blood vessels and microthrombi in different experimental groups. n = 4 per group. Scale bar = 50 μm. (C) Quantitative analysis of A2M in different experimental groups. (D) Number of microthrombi in different experimental groups. (E, F, G) Immunodouble-labelling co-localisation curve analysis and Pearson’s R-value for A2M with microvessels and microthrombi. (*p < 0.05, **p < 0.01, ***p < 0.001)

Discussion

Subarachnoid hemorrhage (SAH) is an acute and highly fatal hemorrhagic stroke typically caused by the rupture of an intracranial aneurysm that can lead to impaired microvascular function, delayed cerebral ischemia, and widespread systemic dysfunction1,7. Transcriptomic, proteomic, and metabolomic data were analyzed using WGCNA, machine learning, and other bioinformatics techniques to screen for feature genes following SAH. Finally, we verified the proteins corresponding to the identified genes in the cerebrospinal fluid of SAH patients and the cortex of SAH rats, and analyzed correlations between different immune cells and characteristic genes, providing promising targets for future research.

The DEGs were analyzed in conjunction with the core module genes of WGCNA, resulting in the identification of 73 potential genes. Functional enrichment analysis revealed that these genes were significantly enriched in the complement and coagulation cascades, TNF signaling pathway, sphingolipid signaling pathway, blood platelet disease, systemic lupus erythematosus, and ischemia pathways. Machine learning was employed to further refine the selection of relevant genes, which were then combined with proteomics analysis to identify five feature genes associated with SAH: A2M, GFAP, GLIPR2, GPNMB, and LCN2. Through analysis of immune infiltration characteristics, we further found that changes in the immune microenvironment might be closely related to the pathogenesis of SAH, and that these feature genes were closely related to immune cells. In addition, the experimental results showed that A2M and GLIPR2 were highly expressed after SAH, and A2M was closely associated with microthrombosis.

The A2M gene encodes Alpha-2-Macroglobulin, a protease inhibitor that inhibits a variety of proteases including elastase, protease 3, cathepsin G, and MMP-9. Additionally, it functions as a cytokine transport protein, capable of inhibiting inflammatory cytokines, thereby disrupting the inflammatory cascade and regulating the functions of immune cells, including neutrophils, macrophages, and lymphocytes12. Our results also showed that A2M was closely related to macrophages, NK cells, and T cells. A2M not only significantly inhibits inflammation processes mediated by TNF-α, IL-1β, NF-κB, LPS, and IFN-γ, but also reduces the expression of ROS and inflammatory cytokines, thereby alleviating persistent damage in osteoarthritis and acute inflammatory damage in myocardial infarction in vivo13,14. More importantly, the elevated plasma A2M levels are significantly and positively correlated with the survival rate of human sepsis patients and can enhance the phagocytic and bactericidal abilities of neutrophils and macrophages, improve the chemotaxis and activity of neutrophils, promote the adhesion of neutrophils to endothelial cells, and reduce neutrophil extravasation into normal tissues15. In patients with SAH, Walcott et al. reported that A2M in plasma also exhibited notable changes, which might be a potential biomarker after SAH16. GLIPR2 belongs to the cysteine-rich secretory proteins and pathogenesis-related (CAP) protein superfamily. By binding to hBeclin1, it retains hBeclin1 in the cytoplasmic leaflet of the Golgi complex, thereby regulating autophagy function17,18. MiR-140-5p ameliorates high glucose-induced apoptosis, oxidative stress, and inflammation in HK-2 cells by inhibiting GLIPR2 expression19. In addition, the overexpression of GLIPR2 promotes the development of renal fibrosis by enhancing the epithelial-mesenchymal transition (EMT) process through the activation of the ERK1/2 signaling pathway20.

After SAH, both our rat cortex transcriptome and human cerebrospinal fluid proteomics studies showed elevated levels of A2M and GLIPR2. Furthermore, the levels of A2M and GLIPR2 in cortical tissue of rats and cerebrospinal fluid of patients after SAH were confirmed by WB and immunofluorescence. In addition, we found that A2M was closely related to microvascular thrombosis in rat cortical tissue. Available evidence indicates that A2M not only acts as a component of thrombus but also inhibits platelet activation and thereby thrombus formation21,22,23. To further validate A2M and GLIPR2, we performed ROC analysis using the external dataset GSE54083 (Supplementary Fig. 2). The AUC values of A2M and GLIPR2 were 0.812 and 0.787, respectively, further supporting the accuracy of the machine learning-based gene screening. However, the precise mechanisms through which they influence the physiological and pathological processes following SAH remain unclear and require further comprehensive investigation.
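For readers unfamiliar with the AUC values quoted above, the following Python sketch shows one standard way to compute an AUC from expression values in two groups: as the fraction of case-control pairs in which the case has the higher value, with ties counted as one half. The function name is hypothetical, and the sketch does not reproduce the software used for the ROC analysis in this study.

```python
def roc_auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case has a higher value
    than a randomly chosen control; ties contribute 0.5.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))
```

An AUC of 0.5 corresponds to no discrimination between groups, and 1.0 to perfect separation.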

Lipocalin-2 (LCN2), also known as neutrophil gelatinase-associated lipocalin (NGAL), encodes a protein from the lipid transport protein family. Members of this family transport small hydrophobic molecules such as lipids, steroid hormones, and retinoids24. In stroke (including hemorrhagic stroke), the upregulation of LCN2 activates reactive astrocytes and microglia, promotes neutrophil infiltration, and amplifies the post-stroke inflammatory response, leading to blood–brain barrier disruption, white matter damage, and neuronal death25,26. After SAH, LCN2 inhibits oligodendrocyte differentiation and maturation through SLC22A17/EGR1 signaling, leading to white matter damage and insufficient myelin regeneration27. Additionally, ultra-early cerebral thrombosis and blood–brain barrier disruption are significant after SAH. LCN2 deficiency significantly reduced thrombosis, erythrocyte embolism, and blood–brain barrier disruption in mice after SAH24,28,29,30. In a clinical trial, levels of LCN2 in cerebrospinal fluid were positively correlated with levels of IL-6, TNF-α, and MMP-9. Additionally, higher levels of LCN2 were associated with poorer 6-month prognosis in patients31. Glycoprotein NMB (GPNMB) is a type I transmembrane glycoprotein that is widely present in neural tissue, epithelial tissue, bone tissue, and the mononuclear-phagocyte system32. In addition, GPNMB is upregulated in diseases such as Alzheimer’s disease, Parkinson’s disease, and amyotrophic lateral sclerosis, and is closely associated with the activation of microglia and neuroinflammation33,34,35. After SAH, GPNMB improves brain edema and neuroinflammation, protects the blood–brain barrier, and improves medium- to long-term neurological function through AMPK/NFκB signaling36. Glial Fibrillary Acidic Protein (GFAP) is a marker of astrocytes and can reflect the severity of reactive astrogliosis37,38. In pathological conditions such as inflammation, neurodegenerative diseases, and injury, astrocytes become “reactive” and interact with microglia and the central nervous system vasculature, thereby modulating immune-inflammatory and neurotrophic functions of the central nervous system and altering blood–brain barrier integrity39. Astrocytes play a dual role after SAH. On one hand, they help clear extracellular glutamate by increasing glutamate uptake and enhancing Na+/K+-ATPase activity, which helps reduce neuronal excitotoxicity, and they release neurotrophic factors that are crucial for neuronal recovery following SAH. On the other hand, astrocytes form glial scars that can hinder axonal regeneration and affect neural repair, and they produce pro-inflammatory cytokines, free radicals, and neurotoxic molecules that may further exacerbate neural damage40. In addition, after SAH, the astrocyte-derived HIF-1α-LCN2-VEGFA pathway and glial-vascular interaction increase the permeability of the blood–brain barrier (BBB) in low-perfusion brain regions41,42. Some clinical studies suggest that plasma GFAP may be an important biomarker for identifying SAH and isolated traumatic subarachnoid hemorrhage (tSAH) and is closely associated with disease severity and prognosis43,44. Previous studies have shown that alterations in the levels of LCN2, GPNMB, and GFAP are tightly related to the progression of SAH, further supporting the reliability of our screening results.

As sequencing technologies continue to be refined, our understanding of RNA and transcription factors (TFs) potential increasingly deepens. Accordingly, this study further constructs a regulatory network of ceRNAs and TFs involving five feature genes. A considerable number of miRNAs are involved in the physiological and pathological processes associated with SAH45,46. Extracellular vesicles derived from mesenchymal stem cells have been demonstrated to reduce M1 polarization and inflammation, alleviate neuronal apoptosis following SAH, and improve neurological function through the interaction of miR-140-5p and HDAC747. MiR-124 has been demonstrated to inhibit microglia activation by targeting protein C/EBPα following SAH, which reduces the release of inflammatory factors such as TNF-α, IL-1α, and C1q and alleviates neuroinflammation and neuronal death48. Extracellular vesicle-mediated transfer of miR-21-5p from mesenchymal stromal cells to neurons alleviates early brain injury to improve cognitive function via the PTEN/Akt pathway after SAH49. Furthermore, one or more lncRNAs can competitively bind to miRNAs with feature genes. LncRNAs play important roles in both physiological and pathological processes by coordinating and integrating multiple signaling pathways50,51. For instance, LncRNA H19 exacerbates cerebral edema and neuronal injury by promoting the expression of NLRP3 inflammasome and microglial activation following SAH52. LncRNA TM7SF3-AU1 exacerbates white matter injury and neurological dysfunction following SAH through the MiR-702-3p/SARM1 signaling pathway53. Meanwhile, TFs are also closely related to SAH. For instance, ATF2 regulates the expression of BAP1 by binding to the BAP1 promoter. The upregulation of BAP1 expression can additionally augment the activity and stability of P53, thereby facilitating neuronal apoptosis and exacerbating neurological dysfunction after SAH54. 
Netrin-1 alleviates early brain injury by regulating ferroptosis via the PPARγ/Nrf2/GPX4 signaling pathway after SAH55. These findings lend further support to the accuracy of the ceRNAs and TFs regulatory networks constructed by this study. A more profound comprehension of these regulatory networks may potentially inform novel strategies for the future treatment of SAH.

It is undeniable that this study still has some limitations. First, it was mainly based on bioinformatics analysis of our own sequencing data, and the sample size of the sequencing data was small, which limits the conclusions. Second, we only confirmed the high expression of A2M and GLIPR2 in brain tissues of SAH rats and cerebrospinal fluid of SAH patients, so more in vivo and in vitro experiments are needed to explore their potential mechanisms of action in SAH. Finally, GFAP, GPNMB, and LCN2 have been extensively validated in clinical trials and randomized controlled trials in SAH; therefore, these genes were not further validated in this study.

Conclusion

In the present study, five SAH-related feature genes (A2M, GFAP, GLIPR2, GPNMB, and LCN2) were screened by multi-omics analysis of different biological specimens, combined with WGCNA and machine-learning algorithms. Their pertinent ceRNAs and TFs were further predicted. We experimentally verified that A2M and GLIPR2 are highly expressed after SAH and that A2M is closely related to microthrombosis. These molecules may play a key role in the development of SAH.

Materials and methods

Transcriptomics and metabolomics data sources

Transcriptomics and metabolomics data sources were detailed in our previously published articles8.

Animals

In this study, adult Sprague Dawley rats (SD, aged 8–10 weeks, weighing approximately 260–280 g) were sourced from Hunan Slike Jingda Laboratory Animal Co., Ltd. All rats were housed under standard environmental conditions with a 12-hour light/dark cycle and had free access to water and food at the Animal Experimental Center of Wuhan University People’s Hospital for 3 days. Throughout the experiment, we maintained each rat’s weight between 280 and 300 g. All animal experimental protocols were approved by the Institutional Animal Care and Use Committee of Wuhan University People’s Hospital (Ethical Approval Number for Animals: 20230105 A). All rats were euthanized prior to the experiment by high-dose intraperitoneal injection of sodium pentobarbital.

Establishment of animal models

The construction of animal models was detailed in our previous articles8.

Construction of the WGCNA

The co-expression network was constructed with the “WGCNA” package in R, a systems biology approach for describing gene interactions within samples. The fundamental premise of WGCNA is to group genes with similar expression patterns into the same module and then to identify pivotal genes and prospective therapeutic targets by assessing the correlation between gene connectivity and phenotypes56. The “pickSoftThreshold” function was employed to identify the optimal soft threshold, after which the adjacency matrix was transformed. Subsequently, a hierarchical clustering dendrogram was created from the Topological Overlap Matrix (TOM) and used to classify genes with similar expression into the same module, with a minimum gene count of 30 per module. The dynamic tree-cutting algorithm was applied with a cutting threshold of 0.25 to define module boundaries more accurately. Modules exhibiting a significant correlation with the trait (p < 0.05) were selected for further analysis using “moduleTraitCor” and “moduleTraitPvalue”.
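The effect of the soft threshold can be illustrated with a short Python sketch. It assumes an unsigned network in which the adjacency between two genes is the absolute Pearson correlation raised to the power β; whether the study used a signed or unsigned network is not stated, and the analysis itself was run with the R package rather than with this code. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expression, beta=17):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta.

    expression maps gene symbol -> list of values across samples.
    Assumption: unsigned network; beta = 17 follows the value in the Results.
    """
    genes = list(expression)
    adjacency = {gene: {} for gene in genes}
    for i, gene_i in enumerate(genes):
        for gene_j in genes[i:]:
            if gene_i == gene_j:
                value = 1.0
            else:
                value = abs(pearson(expression[gene_i], expression[gene_j])) ** beta
            adjacency[gene_i][gene_j] = value
            adjacency[gene_j][gene_i] = value
    return adjacency


def connectivity(adjacency):
    """Sum of adjacencies to all other genes (node degree in the weighted network)."""
    return {
        gene: sum(value for other, value in row.items() if other != gene)
        for gene, row in adjacency.items()
    }
```

Raising correlations to a high power suppresses weak correlations much more strongly than strong ones, which is what pushes the network toward a scale-free topology.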

Functional enrichment analysis

The “clusterProfiler”, “enrichplot”, “DOSE”, and “org.Hs.eg.db” packages were employed to analyze significant functions and pathways of the differentially expressed genes (DEGs), including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Disease Ontology (DO)57,58,59. The reference gene set file “c2.cp.kegg.Hs.symbols.gmt” was used to examine pathway differences between the sham and SAH 1d groups by Gene Set Enrichment Analysis (GSEA). All results were visualized using the “ggplot2” package, and the statistical significance threshold was set at a p value < 0.05.
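Over-representation analyses of this kind are commonly based on the hypergeometric test. The Python sketch below shows that calculation for a single term, under the assumption that the background, term, and query gene counts are known; it illustrates the statistic only and is not the implementation used by the packages named above. The function name is hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_query, n_overlap):
    """Upper-tail hypergeometric p value for one term.

    n_background: genes in the background universe
    n_term:       background genes annotated to the term
    n_query:      genes in the query list (e.g. the DEGs)
    n_overlap:    query genes annotated to the term
    Returns P(X >= n_overlap) when n_query genes are drawn at random.
    """
    total = comb(n_background, n_query)
    upper = min(n_term, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

In practice the p values of all tested terms would also be corrected for multiple testing before applying a significance threshold.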

Machine learning-based hub gene screening

The DEGs were first screened using the LASSO algorithm, which generated a “LASSO coefficient path” and a “LASSO regularization path”. The former shows how the feature coefficients change with the regularization parameter (λ), and the latter shows the model fit at different values of λ. Together, these plots enabled us to identify the optimal value of λ, which yielded the most accurate LASSO fit and minimized the cross-validation error.
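
To make the role of λ concrete, here is a minimal sketch of LASSO fitted by coordinate descent with soft thresholding. It assumes a linear model with centered data and no intercept, which may differ from the model family actually used in the study; all names and the toy data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b starting at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            if z == 0:
                continue
            # Correlation of feature j with the partial residual (feature j removed).
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta

# Toy data (hypothetical): y depends only on the first feature.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2, 2, -2, -2]
path = {lam: lasso_cd(X, y, lam) for lam in (0.0, 0.5, 3.0)}  # coefficient path
```

Evaluating the fit over a grid of λ values like this gives the coefficient path; choosing λ by cross-validation would additionally require refitting on held-out folds, which is omitted here.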

Correlation between genes and metabolites

For the joint WGCNA of the transcriptome and metabolome, Spearman correlation analysis was used to explore the correlations between genes and metabolites60.
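
The Spearman coefficient is the Pearson correlation computed on ranks. The following sketch shows this for one gene-metabolite pair, using average ranks for ties; the function names and values are hypothetical and serve only as an illustration.

```python
import math

def ranks(values):
    """1-based ranks, with tied values receiving the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))

# Hypothetical abundances of one gene and one metabolite across five samples.
gene = [1.2, 2.5, 3.1, 4.8, 5.0]
metabolite = [0.3, 0.9, 1.4, 6.0, 9.5]
rho = spearman(gene, metabolite)
```

Because only ranks enter the calculation, the coefficient is insensitive to the different scales of transcript and metabolite measurements.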

Immune cell infiltration

The relative proportions of 22 infiltrating immune cell types were estimated using the CIBERSORT algorithm. Histograms and violin plots were drawn using the R packages “ggplot2” and “vioplot”, respectively61. Furthermore, correlation coefficients between gene expression and the proportions of infiltrating immune cells were calculated, and the results were plotted with the R package “ggplot2”.

Constructing a regulatory network

miRanda (http://mirtoolsgallery.tech/mirtoolsgallery/node/1055), miRDB (https://mirdb.org/), and TargetScan (https://www.targetscan.org/vert_80/) were used together to predict miRNAs that bind to the core genes, and spongeScan was used to predict lncRNAs that bind to these miRNAs. The results were then imported into Cytoscape (Cytoscape_v3.9.1, https://cytoscape.org/) to map the ceRNA regulatory network62. Additionally, NetworkAnalyst (http://www.networkanalyst.ca) was used to construct the transcription factor (TF) regulatory network of the core genes63.

Clinical samples

Patients with SAH were diagnosed by computed tomography (CT) combined with CT angiography (CTA) and digital subtraction angiography (DSA). Six patients with SAH and six control patients were selected. Gender distribution did not differ between the two groups, although females were more common. The mean age of the patients in the SAH and control groups was 65.5 ± 13.4 and 61.0 ± 11.4 years, respectively. Histories of smoking, hypertension, hyperlipidemia, and diabetes did not differ significantly between the two groups. CSF specimens from SAH patients were obtained by lumbar puncture within 5–7 days after bleeding. CSF samples from the control group (e.g., patients with hydrocephalus undergoing CSF release tests or patients undergoing diagnostic lumbar puncture for headache) were collected in the same way. Patients and controls were excluded if they had a history of central nervous system (CNS) disease (e.g., stroke, brain or spinal cord injury, CNS infection) or other organ or system dysfunction within 6 months. Once collected, the specimens were immediately centrifuged at 1500 ×g for 5 min at 4 °C and then stored at −80 °C until assayed. The clinical characteristics of the SAH patients and controls are shown in Table 1. All human procedures of the study were approved by the Research Ethics Committee of the Renmin Hospital of Wuhan University and performed according to the principles of Good Clinical Practice and the Declaration of Helsinki (Approval No. WDRY2023-K048).

Table 1 Patient demographics.

Proteomics data sources

4D-DIA proteomics is a next-generation DIA technology based on the timsTOF Pro ion mobility platform. It performs differential quantitative proteomic analysis using Trapped Ion Mobility Spectrometry (TIMS) with the Parallel Accumulation-Serial Fragmentation (PASEF) scanning mode. The core principle is that, in addition to the three conventional separation dimensions (retention time, mass-to-charge ratio (m/z), and ion intensity), ion mobility is added as a fourth dimension and combined with full-scan DIA, which effectively reduces the complexity of the mass spectra and greatly improves identification capability.

For each sample, 100 µL was taken for low-abundance protein enrichment using the magnetic bead method. Specifically, 100 µL of cerebrospinal fluid was added to 4 µL of magnetic beads (washed twice with wash buffer) and incubated for 2 h. The samples were placed on a magnetic rack and the cerebrospinal fluid was removed. The beads were then washed five times with wash reagent, with mixing by inversion for 5 min each time. All samples were digested with trypsin using the in-solution method. The peptides were desalted using a C18 cartridge, lyophilized, and reconstituted in 20 µL of 0.1% formic acid solution, and the peptide concentration was determined by OD280.

For DDA (data-dependent acquisition) analysis, samples were separated chromatographically on a nanoliter flow rate Evosep One system (Denmark) and then analyzed on a timsTOF mass spectrometer (Bruker). Detection mode: positive ion. Mass spectrometry parameters: voltage, 1500 V; ion source temperature, 180 °C; drying gas, 3 L/min; PASEF cycles, 8; TIMS accumulation time, 100 ms; TIMS range (ion mobility), 0.75–1.35 V·s/cm²; MS and MS/MS range, 100–1700 Da; dynamic exclusion time, 24 s. For DIA (data-independent acquisition) analysis, samples were likewise separated on the nanoliter flow rate Evosep One system and analyzed on the timsTOF mass spectrometer (Bruker). Detection mode: positive ion; MS and MS/MS scanning range: 100–1700 m/z. MS2 used the DIA data acquisition mode with four TIMS scan acquisition windows, each with an accumulation time of 100 ms. In PASEF mode, the collision energy varied linearly with the ion mobility 1/K0, with fragmentation energies of 20 to 59 eV corresponding to 1/K0 = 0.85 to 1.30 V·s/cm².
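
The linear relationship between collision energy and ion mobility can be written as a simple interpolation between the two stated endpoints. The sketch below is illustrative only; the function name is hypothetical, and clamping values outside the 0.85–1.30 range to the endpoint energies is an assumption, not a documented instrument behavior.

```python
def collision_energy(inv_k0, lo=(0.85, 20.0), hi=(1.30, 59.0)):
    """Collision energy (eV) as a linear function of ion mobility 1/K0 (V·s/cm²).

    lo and hi are (1/K0, energy) endpoints taken from the acquisition settings.
    """
    (x0, e0), (x1, e1) = lo, hi
    x = min(max(inv_k0, x0), x1)  # assumption: clamp outside the stated range
    return e0 + (e1 - e0) * (x - x0) / (x1 - x0)

# Energy at the midpoint of the mobility range (1/K0 = 1.075).
mid_energy = collision_energy(1.075)
```

With these endpoints the slope is (59 − 20) / (1.30 − 0.85) ≈ 86.7 eV per V·s/cm².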

The DIA data were processed using Spectronaut software (Spectronaut™ 16, https://biognosys.com/resources/spectronaut-the-deepest-proteome-coverage-available/), with the same database as that used for library construction. The software parameters were set as follows: retention time prediction type, dynamic iRT; interference correction on MS2 level, enabled; cross-run normalization, enabled. All results were required to pass the filter, with the Q value cutoff set to 0.01 (equivalent to FDR < 1%).

Western blot (WB) analysis

The WB and IF procedures followed previously reported methods64. In short, following transcardial perfusion with pre-cooled saline, brain tissue samples from the injured cortex were promptly harvested and homogenized with a freezing grinding mill. Total protein was extracted following the RIPA buffer protocol (Servicebio, China). Subsequently, an equal volume of each rat cortical protein sample was subjected to SDS-PAGE and then transferred to PVDF membranes. In addition, a 10 µL sample of each patient’s cerebrospinal fluid was separated by SDS-PAGE and transferred to a PVDF membrane. The PVDF membranes were blocked for one hour at room temperature in 5% BSA and incubated overnight at 4 °C with the following primary antibodies: Rabbit Anti-GLIPR2 Antibody (GLIPR2, A10232-1, BOSTER), Rabbit Anti-alpha 2 Macroglobulin Antibody (A2M, ab316101, Abcam), Rabbit Anti-Fibrinogen alpha chain Antibody (FGA, ab92572, Abcam), Rabbit Anti-CD31 Antibody (CD31, ab182981, Abcam), and Rabbit Anti-β-actin Antibody (β-actin, AF7018, Affinity). After being rinsed with PBS, the PVDF membranes were incubated with secondary antibodies (Goat Anti-Rabbit IgG (H + L), 1:10000, GB23303, Servicebio, China) for one hour at room temperature. Finally, the protein bands were visualized using a chemiluminescence detection system (Bio-Rad, USA). Statistical analysis was carried out using GraphPad Prism 8.0. Comparisons among three groups were performed by one-way analysis of variance (ANOVA), followed by the LSD test or Dunnett’s T3 test for pairwise comparisons within the multiple groups. Statistical significance was accepted at p < 0.05.
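
For reference, the F statistic underlying a one-way ANOVA can be computed as in the sketch below. This is a plain-Python illustration with hypothetical band-intensity values, not the GraphPad Prism procedure used in the study; the p value and the post hoc tests (LSD, Dunnett’s T3) are deliberately omitted.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared deviation of its mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical normalised band intensities for three groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
```

The F value would then be compared against the F distribution with the returned degrees of freedom to obtain the p value.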

Immunofluorescence

Homologous double-label immunofluorescence staining was performed as follows. First, rat cortical tissues were embedded in paraffin and cut into sections (5 μm). Antigen retrieval was then performed with citrate-EDTA antigen retrieval solution (P0086, Beyotime, China) in a microwave oven for 25 min. The sections were blocked with 3% goat serum (SL038, Solarbio, China) for 30 min at room temperature. Subsequently, the sections were incubated at 4 °C for 24 h with the first primary antibody (GLIPR2 (A10232-1, BOSTER), A2M (ab316101, Abcam)) and then with the secondary antibody for one hour. TSA dye was then added and incubated for 10 min in the dark; the sections were washed, and antigen retrieval was performed once more. A second primary antibody (CD31 (ab182981, Abcam), FGA (ab92572, Abcam)) was added and incubated at 4 °C overnight, followed by one hour of incubation with the secondary antibody. Finally, the nuclei were counterstained with DAPI for 10 min in the dark. Images were captured using a fluorescence microscope (OLYMPUS BX53).