Introduction

Chronic Kidney Disease (CKD) is a progressive condition affecting over 800 million individuals globally1. It is defined by a persistent estimated glomerular filtration rate (eGFR) of less than 60 mL/min/1.73 m² body surface area, or urinary albumin excretion equal to or greater than 30 mg/day, or both, sustained for a duration exceeding three months2. Historically, the management of CKD primarily focused on delaying renal failure, implementing end-stage dialysis, or pursuing renal transplantation3. However, recent advancements in the field have led researchers to explore pharmacological interventions, such as empagliflozin, which has shown promise in decelerating CKD progression and mitigating cardiovascular risks associated with the condition4. Despite advancements in medical research, effective treatments for CKD remain elusive. Consequently, the identification and exploration of novel drug targets for CKD treatment are imperative.

Expression quantitative trait loci (eQTLs), which are single nucleotide polymorphisms (SNPs) that influence gene expression levels, present a promising avenue for this endeavor. In Mendelian randomization (MR) analysis of druggable genes, cis-eQTLs (SNPs located near a target gene and closely associated with its expression) serve as instrumental variables (IVs) to elucidate causal relationships between druggable genes and CKD, which facilitates the identification of potential drug targets for CKD treatment. The same approach has been applied extensively to other diseases5,6,7.

Mendelian Randomization (MR), a statistical technique that leverages genetic variation as an instrumental variable, has demonstrated significant potential in recent years for causal inference in the context of complex diseases8. By utilizing genetic variants that are strongly associated with an exposure as IVs, MR facilitates the assessment of causal relationships between exposures and outcomes while mitigating the influence of confounding variables. Consequently, MR analysis is frequently employed for etiological inference in epidemiological research9,10. Importantly, MR is analogous to a randomized controlled trial: because alleles are allocated at random at conception, the approach is less susceptible to confounding and reverse causality than conventional observational analyses (Fig. 1)11.

Fig. 1

The principle of Mendelian randomization analysis is shown in the figure: the instrumental variables are associated with the exposure (druggable gene expression), are independent of confounding factors, and affect the outcome (chronic kidney disease) only through the exposure.

Methods

Study design

In this MR analysis, we selected cis-expression quantitative trait loci (cis-eQTL) of druggable genes as the exposure variable and employed genome-wide association study (GWAS) data for CKD from the FinnGen database as the outcome variable. This approach was designed to elucidate the complex relationship between the expression of druggable genes and CKD. To identify the most suitable SNPs as instrumental variables, we conducted association analyses of the exposure factors and established stringent inclusion and exclusion criteria. A series of sensitivity analyses was implemented in order to ensure the robustness and reliability of the MR analysis. Furthermore, to elucidate the functional roles of the significant genes identified through MR analysis in terms of cellular composition, biological processes, and molecular functions, as well as to interpret the biological pathways they are involved in according to the Kyoto Encyclopedia of Genes and Genomes (KEGG), we performed comprehensive functional annotations. We conducted Gene Ontology (GO) and KEGG enrichment analyses on these significant genes, and provided detailed information regarding their chromosomal positions. Additionally, we performed colocalization analysis to evaluate whether cis-eQTL and CKD share the same causal variants. The overall design idea is detailed in Fig. 2.

Fig. 2

The overall design of this study is shown in the figure.

Data sources

By comparing the sequences and structures of proteins present in human blood with those of known drug target proteins, researchers identified proteins that are structurally similar to known drug targets and could potentially be modulated by drug-like small molecules, thereby defining a set of druggable genes for further investigation12. Finan et al. identified 1,427 genes encoding drug targets that have been approved or are in clinical development, 682 genes encoding proteins that bind known drug molecules or are similar to approved drug targets, and 2,370 genes that are members of key druggable gene families or encode proteins with more distant similarity to approved drug targets12. These 4,479 druggable genes have also been used to investigate potential therapeutic targets for pulmonary fibrosis5 (Supplementary Material: Table S1). We queried the eQTLGen Consortium database for these 4,479 druggable genes and retrieved blood cis-eQTL data for 2,888 of them13. The eQTL data identify genetic variants associated with gene expression levels in blood samples, specifically variants located within 1 Mb of the center of each gene.

FinnGen is a large, well-known resource that provides genetic insights from a phenotypically well-characterized, isolated population14. The CKD outcome data were derived from FinnGen and comprised 447,473 individuals: 11,265 cases and 436,208 controls.

Selection of instrumental variables

Reliable and accurate instrumental variables satisfy three conditions: (1) a very strong association with the exposure, (2) no association with confounders, and (3) the absence of horizontal pleiotropy, that is, no effect on the outcome other than through the exposure. It is essential to note that horizontal pleiotropy should be entirely absent in MR analyses. To identify IVs meeting these criteria, we rigorously screened each druggable gene. First, we performed association analyses to select SNPs from the cis-eQTL data, retaining only SNPs with P-values below the genome-wide significance threshold (5.0 × 10⁻⁸). Second, to remove SNPs in linkage disequilibrium (LD), we used the “TwoSampleMR” package in R to clump SNPs at an LD threshold of r² < 0.1 within a 10,000 kb window, using the 1000 Genomes Project European (EUR) reference panel15. In addition, we calculated the proportion of variance explained (R²) and the F statistic to quantify instrument strength using the following equations: R² = 2 × MAF × (1 − MAF) × β², and F = R²(n − k − 1) / [k(1 − R²)], where “MAF” is the minor allele frequency of the SNP used as the IV, “β” is the estimated effect of the SNP on gene expression, “n” is the sample size, and “k” is the number of IVs used16. To avoid weak-instrument bias, we retained only SNPs with F > 10 as IVs.
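The screening rule described above can be illustrated with a minimal Python sketch. It is not the TwoSampleMR workflow itself: LD clumping is omitted, the function names, SNP records, and sample size are hypothetical, and β is assumed to be expressed on a standardized scale.

```python
GENOME_WIDE_P = 5.0e-8  # significance threshold stated in the text
MIN_F = 10.0            # weak-instrument cut-off stated in the text


def variance_explained(maf, beta):
    """R^2 = 2 * MAF * (1 - MAF) * beta^2 for a single SNP."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def f_statistic(r2, n, k=1):
    """F = R^2 * (n - k - 1) / (k * (1 - R^2))."""
    return r2 * (n - k - 1) / (k * (1.0 - r2))


def select_instruments(snps, n):
    """Keep SNPs that pass both the p-value filter and the F > 10 filter."""
    kept = []
    for snp in snps:
        r2 = variance_explained(snp["maf"], snp["beta"])
        f = f_statistic(r2, n, k=1)
        if snp["p"] < GENOME_WIDE_P and f > MIN_F:
            kept.append({**snp, "r2": r2, "f": f})
    return kept


# Hypothetical cis-eQTL records and sample size (illustrative values only).
example_snps = [
    {"rsid": "rs_example_1", "maf": 0.30, "beta": 0.20, "p": 1e-20},
    {"rsid": "rs_example_2", "maf": 0.05, "beta": 0.02, "p": 3e-3},
]
selected = select_instruments(example_snps, n=30000)
```

In this toy input only the first SNP is retained; the second fails the genome-wide significance filter.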

Mendelian randomization analysis

MR analyses were conducted using the “TwoSampleMR” package in R (version 4.4.1). The Wald ratio method was employed when a single SNP served as the IV. When two or more SNPs were available, we applied five methods: inverse variance weighting (IVW), MR-Egger, weighted median, simple mode, and weighted mode. We used IVW as the primary method because previous studies have shown that it is more conservative but more robust17. To account for multiple testing, we applied False Discovery Rate (FDR) correction to identify significant MR results. Additionally, to ensure the robustness of our findings, we performed several sensitivity analyses. Potential heterogeneity among IVs was assessed using Cochran’s Q test18, and heterogeneity was considered absent if the p-value of the Q test exceeded 0.05. Potential horizontal pleiotropy between exposure and outcome was examined using MR-Egger regression18, and pleiotropy was considered absent if the p-value for the MR-Egger intercept was greater than 0.05.
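For readers unfamiliar with these estimators, the following Python sketch shows the arithmetic behind the Wald ratio, a fixed-effect IVW estimate, Cochran's Q statistic, and a Benjamini-Hochberg FDR adjustment. It is a simplified illustration under stated assumptions, not the TwoSampleMR implementation (whose defaults may differ); all function names and input values are hypothetical, and the Wald ratio standard error uses a first-order approximation.

```python
import math


def wald_ratio(beta_exp, beta_out, se_out):
    """Single-SNP estimate: outcome effect divided by exposure effect."""
    estimate = beta_out / beta_exp
    se = abs(se_out / beta_exp)  # first-order approximation
    return estimate, se


def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance-weighted estimate across SNPs."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    return estimate, se, ratios, weights


def cochran_q(estimate, ratios, weights):
    """Weighted squared deviation of per-SNP ratios from the IVW estimate."""
    return sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest p-value
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical summary statistics for two SNPs instrumenting one gene.
single_estimate, single_se = wald_ratio(0.2, 0.02, 0.01)
ivw_estimate, ivw_se, ratios, weights = ivw([0.2, 0.4], [0.02, 0.04], [0.01, 0.01])
q_stat = cochran_q(ivw_estimate, ratios, weights)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

Under the null of no heterogeneity, Q is compared with a chi-square distribution with (number of SNPs − 1) degrees of freedom; the p-value computation is left out here to keep the sketch free of external dependencies.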

Colocalization

For druggable genes showing significant MR results, colocalization analysis was performed using the R package “coloc”19. The default prior probabilities were used: P1 = 1.0 × 10⁻⁴, P2 = 1.0 × 10⁻⁴, and P12 = 1.0 × 10⁻⁵, which are the prior probabilities that a SNP is associated with the expression of the druggable gene, with the outcome, or with both, respectively. The colocalization analysis generated posterior probabilities for five hypotheses: PPH 0, no association with either the expression of the druggable gene or the outcome; PPH 1, association with expression but not with the outcome; PPH 2, association with the outcome but not with expression; PPH 3, association with both expression and the outcome, with different causal variants; and PPH 4, association with both expression and the outcome, with a shared causal variant. PPH 4 > 0.80 was considered strong evidence of colocalization.
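To make the five hypotheses concrete, the sketch below combines per-SNP Bayes factors into posterior probabilities in the way colocalization methods of this type do, assuming at most one causal variant per trait in the region. It is an illustration only and not the coloc package API: the function name is hypothetical, the Bayes factors are made-up inputs rather than values derived from summary statistics, and the computation is done on the natural scale rather than in log space.

```python
def coloc_posteriors(bf_expr, bf_outcome, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] from per-SNP Bayes factors.

    bf_expr[i] and bf_outcome[i] are the Bayes factors for SNP i being
    associated with gene expression and with the outcome, respectively.
    """
    sum1 = sum(bf_expr)
    sum2 = sum(bf_outcome)
    sum12 = sum(a * b for a, b in zip(bf_expr, bf_outcome))
    support = [
        1.0,                                # H0: no association
        p1 * sum1,                          # H1: expression only
        p2 * sum2,                          # H2: outcome only
        p1 * p2 * (sum1 * sum2 - sum12),    # H3: two distinct causal SNPs
        p12 * sum12,                        # H4: one shared causal SNP
    ]
    total = sum(support)
    return [s / total for s in support]


# Hypothetical region of three SNPs.
# Same SNP carries the signal for both traits: H4 should dominate.
shared = coloc_posteriors([1e6, 1.0, 1.0], [1e6, 1.0, 1.0])
# Different SNPs carry the two signals: H3 should dominate.
distinct = coloc_posteriors([1e6, 1.0, 1.0], [1.0, 1.0, 1e6])
```

With these toy inputs, the `shared` case passes the PPH 4 > 0.80 threshold used in the text, whereas the `distinct` case places most of its posterior mass on H3.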

Gene function and pathway enrichment analysis

GO is the world’s largest source of information on gene function. This knowledge is both human-readable and machine-readable, and is the basis for large-scale computational analyses20,21. KEGG is a database for understanding high-level functions and utilities of biological systems from molecular-level information22,23,24. For biological process and pathway enrichment analyses, KEGG and GO analyses were performed using the R “clusterProfiler” software package. Finally, the R “circlize” package was used25 to display information on the location of these genes on the corresponding chromosomes.
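Over-representation analyses of this kind are typically based on a hypergeometric test, which asks how surprising it is to see at least k annotated genes among n selected genes. The Python sketch below shows that calculation under stated assumptions; it is not the clusterProfiler interface, and the function name, background size, and term size are hypothetical.

```python
from math import comb


def enrichment_p_value(background, annotated, selected, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(background, annotated, selected).

    background: number of genes in the reference set
    annotated:  number of background genes carrying the term
    selected:   number of genes in the query list
    overlap:    number of query genes carrying the term
    """
    upper = min(annotated, selected)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background, selected)


# Hypothetical term: 200 of 20,000 background genes are annotated,
# and 3 of the 12 query genes carry the annotation.
p_three = enrichment_p_value(20000, 200, 12, 3)
p_one = enrichment_p_value(20000, 200, 12, 1)
p_zero = enrichment_p_value(20000, 200, 12, 0)
```

In practice the resulting p-values are adjusted for the number of terms tested before enrichment is declared.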

Results

Twelve significant genes were identified by MR analysis of druggable genes

We employed the IVW method as the primary analytical approach. Sixteen druggable genes were significantly associated with the development of CKD at an FDR of less than 0.05: PTPN22, KSR1, MAP2K5, CASP9, C4A, S100B, MYLK4, CHSY1, TUBB, C4B, XYLT1, DHFR, GUCY1B2, CXCR1, CD96, and VAMP8. To assess pleiotropy for these 16 genes, we used MR-Egger regression. The intercept p-value for S100B was less than 0.05, indicating horizontal pleiotropy, whereas the intercept p-values for the remaining 15 genes were greater than 0.05, providing no evidence of horizontal pleiotropy. Potential heterogeneity among IVs was assessed using Cochran’s Q test, which indicated significant heterogeneity for C4A, C4B, and XYLT1 but not for the other significant genes. To ensure the reliability of our findings, we excluded genes showing horizontal pleiotropy or heterogeneity. Consequently, we concluded that a causal association exists between CKD and 12 genes: PTPN22, KSR1, MAP2K5, CASP9, MYLK4, CHSY1, TUBB, DHFR, GUCY1B2, CXCR1, CD96, and VAMP8 (Fig. 3). The MR-PRESSO test likewise revealed no evidence of pleiotropy among these 12 genes.
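The exclusion step can be written out explicitly. The short Python sketch below uses only the gene names reported in this section; the variable names are illustrative.

```python
# Genes significant at FDR < 0.05 in the IVW analysis (as listed in the text).
fdr_significant = [
    "PTPN22", "KSR1", "MAP2K5", "CASP9", "C4A", "S100B", "MYLK4", "CHSY1",
    "TUBB", "C4B", "XYLT1", "DHFR", "GUCY1B2", "CXCR1", "CD96", "VAMP8",
]

# Genes flagged by the sensitivity analyses (as reported in the text).
pleiotropy_flagged = {"S100B"}                    # MR-Egger intercept p < 0.05
heterogeneity_flagged = {"C4A", "C4B", "XYLT1"}   # Cochran's Q p < 0.05

excluded = pleiotropy_flagged | heterogeneity_flagged
retained = [gene for gene in fdr_significant if gene not in excluded]
```

Removing the four flagged genes from the sixteen leaves the twelve genes carried forward to colocalization.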

Fig. 3

The 12 druggable genes are closely related to the risk of CKD after FDR correction. An OR greater than 1 indicates that expression of the gene is positively associated with the occurrence of CKD, and an OR less than 1 indicates a negative association. FDR is the corrected p-value; an FDR of less than 0.05 is considered statistically significant.

Among these 12 significant genes, TUBB (OR: 1.2352; 95% CI: 1.1305–1.3496; FDR = 0.0085), PTPN22 (OR: 1.2897; 95% CI: 1.1893–1.3985; FDR < 0.001), MAP2K5 (OR: 1.1055; 95% CI: 1.0679–1.1445; FDR < 0.001), CHSY1 (OR: 1.1152; 95% CI: 1.0667–1.1659; FDR = 0.0044), DHFR (OR: 1.0390; 95% CI: 1.0214–1.0570; FDR = 0.0347), CXCR1 (OR: 1.1188; 95% CI: 1.0634–1.1771; FDR = 0.0430) and CD96 (OR: 1.0945; 95% CI: 1.0505–1.1404; FDR = 0.0475) expression were positively correlated with the risk of CKD. KSR1 (OR: 0.8809; 95% CI: 0.8441–0.9192; FDR < 0.001), CASP9 (OR: 0.8782; 95% CI: 0.8389–0.9195; FDR < 0.001), MYLK4 (OR: 0.9078; 95% CI: 0.8744–0.9423; FDR = 0.0011), GUCY1B2 (OR: 0.9068; 95% CI: 0.8679–0.9475; FDR = 0.0359) and VAMP8 (OR: 0.9215; 95% CI: 0.8879–0.9565; FDR = 0.0489) expression were negatively associated with CKD risk. (Information on the IVs used for these significant genes is detailed in Supplementary Material Table S2, and the results of MR analyses are detailed in Supplementary Material Table S3)
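The reported odds ratios and confidence intervals can be converted back to the log-odds scale on which MR estimates are computed. The Python sketch below assumes a symmetric 95% Wald interval on the log scale, which is an assumption about how the intervals were constructed; the function name is illustrative, and the example uses the TUBB values reported above.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover the log odds ratio and its standard error from an OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return beta, se


# TUBB: OR 1.2352 (95% CI 1.1305 to 1.3496), as reported in the text.
tubb_beta, tubb_se = log_or_and_se(1.2352, 1.1305, 1.3496)

# Consistency check: the log OR should sit midway between the log CI limits.
tubb_midpoint = (math.log(1.1305) + math.log(1.3496)) / 2.0
```

For TUBB the log odds ratio is about 0.211 with a standard error of about 0.045, and the midpoint of the log-scale interval agrees with the point estimate.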

Colocalization analysis of 12 significant genes

Colocalization analysis can provide additional evidence for Mendelian randomization findings, suggesting that the associations between the instrumental variables and the outcome may be causal. It tests whether the association between genetic variation and the exposure and the association between genetic variation and the outcome are driven by the same variant. A positive colocalization result means that the signals for exposure and outcome at a locus are consistent with a shared causal variant, which strengthens the belief that the variant affects the exposure and, through it, the outcome. In addition, it reduces the likelihood of pleiotropy to some extent.

To strengthen the evidence for an association between these key genes and CKD, we performed colocalization analyses for the 12 genes to estimate the probability that the cis-eQTL and CKD signals share a causal variant. The results indicated that CKD and TUBB expression might share a causal variant, with PPH 4 exceeding 0.80 (TUBB: PPH 4 = 0.9727, i.e. 97.27%).

Consequently, based on the MR and colocalization analyses, we infer that TUBB is a potential drug target for CKD treatment. (For details of the colocalization results for the significant genes, see Table S4 in the Supplementary Material.)

Chromosomal locations of the key genes

The position of a gene on a chromosome may be related to its function, expression regulation, and interactions with other genes, so chromosomal location can provide clues about a gene’s potential involvement in specific biological processes or pathways. In addition, many genetic disorders are associated with abnormalities in specific genes or gene clusters on chromosomes, and knowing where these genes lie helps identify disease-related genetic variation, providing important information for disease diagnosis, treatment, and prevention. Moreover, chromosomes are carriers of genetic information, and their structure, organization, and evolution are crucial for understanding the functional and evolutionary history of an organism’s genome; gene locations can therefore also give clues about genome structure, chromosomal rearrangements, and evolutionary events. We therefore report the chromosomal positions of the significant genes. For example, the TUBB gene is located on chromosome 6 at 30,687,978–30,693,203, and the PTPN22 gene is located on chromosome 1 at 114,356,433–114,414,381 (Fig. 4). (See Supplementary Material Table S1 for details.)

Fig. 4

This figure shows the relationship between significant genes and chromosome positions. For example, TUBB is located on chromosome 6.

Analysis of specific biological processes, functional classes and signalling pathways corresponding to key genes by GO/KEGG enrichment

GO enrichment analysis showed that, in the Biological Process module, these genes were mainly enriched in regulation of MAP kinase activity and regulation of protein serine/threonine kinase activity. In the Cellular Component module, they were mainly enriched in primary lysosome. In the Molecular Function module, they were mainly enriched in protein serine kinase activity (Fig. 5). KEGG pathway analysis showed that these genes were mainly enriched in Gap junction, Platelet activation, and Oxytocin signaling pathway (Fig. 6). (For details of the results of GO and KEGG enrichment analyses, please refer to Table S5 and Table S6 in the Supplementary Material.)

Fig. 5

This figure shows the results of GO enrichment analysis of 12 significant genes. For example, in the Biological Process module, there are three significant genes enriched in the regulation of MAP kinase activity.

Fig. 6

This figure shows the results of KEGG enrichment analysis of 12 significant genes. For example, there are two significant genes enriched in Gap junction.

Notably, in the KEGG analysis TUBB was linked to pathways in the categories Cellular community - eukaryotes, Infectious disease/bacterial, and Cellular Processes. In the GO enrichment analysis, TUBB was associated with natural killer cell mediated immunity, leukocyte mediated immunity, lymphocyte mediated immunity, primary lysosome, azurophil granule, nuclear envelope lumen, spindle, intercellular bridge, GTPase activating protein binding, MHC class I protein binding, MHC protein binding, and other related terms.

Discussion

In this study, we found that the expression of 12 genes was closely related to CKD. For genes demonstrating significant associations in the MR analysis, we performed a comprehensive suite of sensitivity analyses and functional assessments, including Cochran’s Q test, MR-Egger regression, colocalization analysis, and GO and KEGG enrichment analyses. Our findings revealed a strong association between TUBB expression and CKD. MR analysis of druggable genes, together with colocalization analysis, provides compelling evidence that TUBB is a potential therapeutic target for CKD.

The TUBB gene encodes β-tubulin, a structural component of microtubules that binds to α-tubulin to form a heterodimer, which subsequently contributes to microtubule assembly26. Microtubules (MTs) are long, hollow polymers with a diameter of 25 nm and lengths ranging from less than 1 μm to over 100 μm. They were initially described as consisting of approximately 13 linear protofilaments, forming a polymer with a fast-growing plus end (exposed β-tubulin) and a slow-growing minus end (exposed α-tubulin)27. MTs are integral to the formation of cellular structures and the regulation of cellular functions, such as cell polarity and morphology, chromosome segregation during cell division, and the localization and transport of organelles28. Furthermore, MTs facilitate intracellular transport, assemble into larger structures with the aid of auxiliary proteins, and interact with various cell types to establish mature networks27. Microtubule-targeting compounds have been proposed to inhibit microtubule polymerization, thereby causing an imbalance in microtubule dynamics that results in spindle apparatus damage, cell cycle arrest, and ultimately tumor cell death29. These compounds can therefore interfere with a range of critical cellular processes30. Among the proteins constituting microtubules, β-tubulin is particularly significant: during microtubule polymerization, only the β-tubulin subunit is capable of hydrolyzing GTP after a tubulin dimer has been incorporated31. This underscores the critical role of TUBB and its encoded β-tubulin in regulating cell proliferation, differentiation, and human immunity.

Microtubule-targeting agents (MTAs) that act on β-tubulin have a long history and are considered among the earliest anti-cancer drugs30. For instance, vincristine, vinblastine, and vindesine target the Vinca domain on β-tubulin, located at the inter-dimer interface between two longitudinally aligned tubulin dimers32,33,34. Colchicine, benzimidazoles, and combretastatins bind the colchicine site, situated in a deep pocket of β-tubulin at the interface between the α- and β-tubulin subunits35. Binding to the taxane site of β-tubulin in the inner lumen of MTs stabilizes the MT lattice, and MT-stabilizing agents such as paclitaxel and the epothilones have been developed accordingly36. Maytansine and spongistatin bind near the Vinca site, in an exposed pocket of β-tubulin37,38. Similarly, laulimalide and peloruside target a β-tubulin pocket oriented towards the exterior of MTs39,40. In 2021, the gatorbulin site was identified, located within the α-tubulin subunit, between α- and β-tubulin, and proximal to the colchicine site. This site functions to form a wedge between two longitudinally aligned MT dimers at the MT terminus41. Recently, a novel compound named cevipabulin has been reported to bind to this pocket42. The numerous drug-binding sites associated with TUBB, whether currently known or yet to be discovered, present substantial opportunities for TUBB to be considered a potential drug target in CKD.

As discussed above, TUBB encodes β-tubulin, which is a building block of microtubules. Microtubule-associated proteins (MAPs) bind to microtubules and regulate their stability, polymerization, and depolymerization; together, they maintain the cytoskeleton and cell morphology24. This is in line with our GO enrichment results, in which, in the Biological Process module, the significant genes were mainly enriched in the regulation of MAP kinase activity. GTPase-activating proteins (GAPs) are also closely related to tubulin, and the microtubule network formed by the β-tubulin encoded by TUBB can serve as a transport track and localization platform for signaling molecules28. This is in line with our KEGG enrichment results, in which the genes were mainly enriched in the Gap junction pathway.

These β-tubulin-based microtubule-targeting agents (MTAs) play a crucial role in tumor treatment by disrupting microtubule homeostasis within the spindle apparatus, thereby activating mitotic checkpoints. This disruption arrests cells in mid-mitosis and ultimately leads to cell death through apoptosis43,44. These agents have also demonstrated efficacy against parasites45. Recent research has indicated that TUBB may play a role in the pathological mechanisms underlying various renal diseases, including tubulointerstitial disease and renal fibrosis. For instance, Sang et al. demonstrated that inhibiting microtubule dynamics hinders repair after renal ischemia/reperfusion injury and promotes renal fibrosis46. Yang et al. identified β-tubulin as a serum marker for renal clear cell carcinoma, noting that this type of carcinoma is associated with elevated TUBB expression47. Although few studies have specifically addressed the relationship between TUBB and CKD, emerging research has increasingly concentrated on the interplay between MTs and their associated proteins in the context of CKD. A cross-sectional study identified β-tubulin as a potential biomarker for cognitive impairment in CKD patients48. In addition, recent findings indicate that the long non-coding RNA (lncRNA) small nucleolar RNA host gene 12 (SNHG12) exacerbates inflammation and apoptosis while inhibiting autophagy in acute kidney injury by upregulating TUBB expression49. This finding aligns with the results of our analysis, which demonstrated a positive association between TUBB expression and CKD risk. Consequently, we propose that TUBB is a potential therapeutic target for CKD that merits further in-depth investigation of its underlying mechanisms. Apart from TUBB, several other potential targets were identified (Fig. 3). Although these additional significant genes were not corroborated by colocalization analysis, their value cannot be entirely dismissed, and they continue to offer a broad range of possibilities for CKD drug development.

Despite our rigorous analytical approach, which included excluding results that showed heterogeneity or horizontal pleiotropy, our study has certain limitations. First, our findings are based on statistical inference, which may not fully capture the complexity of real-world settings in which other confounding factors may interfere. Consequently, our results cannot substitute for basic research or clinical trials, nor can they guarantee the actual efficacy of a drug. Further basic and clinical research is therefore warranted to elucidate the mechanisms and effects of TUBB in CKD. Second, the current analysis is restricted in ancestry, which may limit the generalizability of our findings to other populations. Nevertheless, our study offers useful insights and guidance for research on identifying drug targets for CKD.

Conclusion

The causal relationships between druggable gene expression and CKD identified through MR analysis suggest new avenues for targeted drug research. In summary, our MR analysis indicates that TUBB may serve as a potential target for reducing CKD risk.