Abstract
Immunity has shown potentials in informing drug development for cardiometabolic diseases, such as type 2 diabetes (T2D) and coronary artery disease (CAD). Here, we performed a transcriptome-wide Mendelian randomization (MR) study to estimate the putative causal effects of 11,021 gene expression profiles during CD4+ T cells activation on the development of T2D and CAD. Robust MR and colocalization evidence was observed for 162 genes altering T2D risk and 80 genes altering CAD risk, with 12% and 16% respectively demonstrating CD4+ T cell specificity. We observed temporal causal patterns during T cell activation in 69 gene-T2D pairs and 34 gene-CAD pairs. These genes were eight times more likely to show robust genetic evidence. We further identified 25 genes that were targets for drugs under clinical investigation, including LIPA and GCK. This study provides evidence to support immune-to-metabolic disease connections, and prioritises immune-mediated drug targets for cardiometabolic diseases.
Similar content being viewed by others
Introduction
Immunity has been linked with the revolution of type 1 diabetes (T1D) treatment1. Recently, an immunotherapy drug targeting chimeric antigen receptor (CAR) regulatory T cell, Teplizumab, was approved by the Food and Drug Administration (FDA) for T1D treatment2. Immunomodulatory drugs have also shown their value for cardiovascular disease (CVD) treatment3,4, including the repurposing of interleukin-6 receptor (IL6R) inhibitors5. The efficacy of some immunotherapies on CVDs, especially those mediated via CD4+ T cells, has been examined in animal models and clinical trials6,7. In contrast to T1D, which has long been recognized as having immune etiology, type 2 diabetes (T2D) has historically been categorized as a metabolic disease. However, recent studies have highlighted the involvement of the immune system in the development of T2D and the progressive deterioration of β-cell function, suggesting that the cross-talk between immunity and metabolism may contribute to T2D pathogenesis8,9. Chronic low-grade or systemic inflammation and islet autoimmunity10 have been linked to T2D, and its progression is accompanied by quantitative changes in CD4+ T cells, CD8+ T cells and T regulatory (Treg) cells11. Moreover, the presence of islet-reactive T cells in some T2D patients12 has been associated with more severe β-cell dysfunction13, indicating the potential role of T lymphocytes in the progression of T2D. Thus, targeting T lymphocytes through treatments or prevention strategies may have beneficial effects for both T1D and T2D. Current immunomodulatory therapies, such as IL-1β receptor antagonist (IL-1βRa) and tumor necrosis factor (TNF)-α blockers, have exhibited clinical efficacy in the treatment of T2D14. However, few studies have been conducted on T lymphocytes, especially on CD4+ T cells, to systemically investigate the potential immune-mediated targets for T2D prevention or treatment. Therefore, understanding the complex interplay between CD4+ T cells and T2D could lead to the identification of immune-mediated drug targets for T2D prevention.
Previous studies have integrated summary data from genome-wide association studies (GWASs) of diseases and expression quantitative trait loci (eQTL) to identify putative causal genes for complex traits and diseases15,16. However, most of these studies used eQTL data from bulk tissues, which cannot capture cell-type specific effects on gene expression. Recently, advances in single-cell technologies have enabled us to map gene expression across cell types, which are challenging to isolate using conventional approaches17. Dynamic eQTLs refer to genetic loci containing sequence variations that influence gene expression differentially according to the dynamically changing cell states, influencing the cell-state specificity in single-cell eQTLs18. Thus single-cell sequencing approaches can reveal previously hidden regulatory roles of gene expression in the etiology and molecular mechanisms of complex diseases.
Mendelian randomization (MR) is a cost-effective approach that has demonstrated its value in prioritizing drug targets19,20. Recent studies have provided evidence that drugs targeting validated genetic targets are more likely to be successful in clinical trials21,22. In light of this, we propose that integrating dynamic eQTL data with the MR approach could provide a more comprehensive assessment of the effects of gene expression across diverse subsets of activated T cells on disease outcomes. This approach aligns with the dynamic nature of T cell activation and better reflects T cell effector function, which often plays a role in driving or contributing to disease phenotypes.
In this study, we aimed to utilize dynamic single-cell eQTL data obtained at five time points during CD4+ T cell activation (0 h, lowly active [LA], 16, 40 h and 5 days) to systematically estimate the putative causal effects of 11,021 gene expression profiles encompassing 1817 genes involved in 46 expression profiles17 on T2D and coronary artery disease (CAD) using MR approaches. This has the potential to enhance our understanding of the underlying genetic mechanisms linking immunity with cardiometabolic diseases, as well as facilitate the prioritization of potential immune-mediated drug targets for T2D and CAD.
Results
Summary of instrument selection
Figure 1 summarized the study design. We used eQTL (genetic variants associated with gene expression) as instruments for the MR analysis. The eQTLs were derived from single-cell transcriptomes of CD4+ T cells across five-time points [from resting to the most activated cell state, including cells in resting state (0 h), lowly active cells after activation (LA), 16 hours after activation (16 h, before dividing), 40 hours after activation (40 h, after the first cell division) and 5 days after activation (5 days, after acquiring effector functions)], in samples taken from 119 European individuals17. In addition, these single-cell transcriptomes were subjected to unsupervised clustering of cells based on the associated gene expression patterns and activation markers of CD4+ T cell across five-time points, identifying 17 cell types (CD4 Memory, CD4 Naive, heat shock protein [HSP], nTreg, T ER-stress, central memory T cell [TCM], effector memory T cell [TEM], TEM human leukocyte antigen [HLA] positive, effector memory cells re-expressing CD45RA [TEMRA], TM cycling, TM ER-stress, TN, TN cycling, TN HSP, TN interferon [IFN], TN nuclear factor κB [NFKB], TN2)17. These 17 cell types were eventually merged into 46 expression profiles with cell types and time points (e.g. CD4_Memory_5d). Thus, the eQTLs in this study represented genetic variants being associated with gene expression under a resting/activation time-point in the corresponding cell types, which we defined as a gene expression profile containing a combination of gene, cell type and time point.
Soskic et al.17 identified 482,971 eQTLs with the expression of 13,379 genes and 46 expression profiles during CD4+ T cell activation, these were considered as candidate instruments. To better fit the MR assumptions23, we took a three-step approach to select instruments. In step one, we used linkage disequilibrium (LD) clumping to exclude correlated eQTLs (LD r2 < 0.001). In step two, we only included instruments with F-statistics greater than 10 to reduce weak instrument bias24. In step three, we applied the Steiger filtering approach25 to minimize the risk of eQTLs representing reverse causality. After filtering, 11,021 eQTLs mapping to 1,817 genes involved in 46 expression profiles during CD4+ T cell activation were selected as instruments for the dynamic eQTL MR analysis (Supplementary Data 1).
To include more instruments and apply additional MR sensitivity methods, we re-selected instruments, using a more relaxing P-value cut-off (P < 0.05), LD correlation threshold (LD r2 < 0.1; data from 1000 Genome Europeans) and F-statistic threshold (≥5). To maximise the number of instruments, only SNPs that passed Steiger filtering for T2D or CAD were subjected to clumping, and instruments for 11,007 exposures (cell-type and time point-specific gene expression profiles) can be identified. Thus, we identified 45,913 instruments for T2D analyses and 46,766 instruments for CAD, among which 10,112 and 10,140 of the exposures have two or more instruments, respectively (Supplementary Data 2).
Given that existing eQTL data primarily come from resting cells or bulk tissues, we aimed to explore whether the genes discovered in dynamic eQTL data could be detected in any previous databases with various cell/tissue types. As a comparison, for genes that showed robust MR evidence in the main dynamic eQTL MR analysis, we further selected non-dynamic eQTLs from four independent datasets of European ancestry, including immune cell eQTL data from the OneK1K cohort (n = 982)26 and DICE (n = 91)27, and tissue-specific eQTL data from GTEx v8 (n = 73–706)28 and the eQTLGen consortium (n = 1418–31,684)29. The same instrument selection process was applied to these datasets. After selection, 7,762 eQTLs of 222 genes were selected as instruments for this comparison MR analysis (Supplementary Data 3–6; noted as non-dynamic eQTL MR in this study).
Causal effect of gene expressions on T2D and CAD using dynamic eQTLs across CD4+ T cell activation and non-dynamic eQTLs from four studies
We undertook a two-sample MR to systematically evaluate the putative causal effects of 11,021 gene expression profiles during CD4+ T cell activation on T2D and CAD. The T2D and CAD data were derived from the GWAS summary statistics of the DIAGRAM (DIAbetes Genetics Replication And Meta-analysis) consortium30 and Aragam KG, et al.31, with 80,154 T2D cases (853,816 controls) and 181,522 CAD cases (984,168 controls), respectively (Supplementary Data 7). All exposure and outcome data sources were restricted to subjects of European ancestry to minimize the possibility of confounding due to population stratification. The Wald ratio method32 was applied, as each gene expression profile only had one instrument.
Among the 15,982 gene–disease pairs, 766 gene–T2D pairs and 211 gene–CAD pairs showed robust evidence in the MR analysis (Benjamini–Hochberg false discovery rate [FDR] < 0.05), with little evidence of reverse causality using Steiger filtering. To further evaluate the evidence for causality in the MR analyses, we conducted genetic colocalization33 and LD check34 (more details in Methods). The colocalization analysis revealed 530 MR signals for T2D and 153 MR signals for CAD with strong evidence of colocalization (colocalization probability > 70%), whereas the LD check analysis identified 578 MR signals for T2D and 170 MR signals for CAD with approximate colocalization evidence (pair-wise LD r2 > 0.7). Collectively, 652 (85.1%) gene–T2D pairs with expression profiles (e.g. PEAK1-CD4 memory-16h-T2D) and 197 (93.4%) gene–CAD pairs with expression profiles showed robust MR and colocalization evidence, which consisted of 150 unique genes altering T2D risk and 62 unique genes altering CAD risk (a unique gene indicates a gene–disease pair regardless of cell type and time point, e.g. PEAK1-T2D; Fig. 2c, Supplementary Data 8). These results were noted as the main MR results in this study.
a Illustrating the number of all the MR estimates, MR estimate with Benjamini–Hochberg FDR corrected P value < 0.05 (two-sided) and with evidence of colocalization. The bar chart shows the number of MR signals with and without colocalization evidence according to the disease. b Validation in the results of the eight sensitivity analysis methods compared to the Wald ratio results. c A Manhattan plot illustrating the associations of genetically influenced gene expression derived from CD4+ T cells on T2D and CAD, respectively. The Y-axis shows the minus log 10 P values (two-sided) of the MR estimates. The labeled genes are the most significant genes with strong MR (FDR < 0.05) and colocalization evidence on each chromosome among the 162 genes affecting T2D and 80 genes affecting CAD. T2D type 2 diabetes, CAD coronary artery disease.
To validate the main MR findings, we applied eight sensitivity MR methods using the re-selected instruments, including Wald ratio, cML-MA, MR-RAPS, MR-Robust, debiased IVW, weighted median approach, weighted mode estimator, and MR-PRESSO. The additional MR estimates obtained from these eight MR methods were then compared with the main MR results. We found that 148 (99%) of the 150 T2D genes and 61 (98%) of the 62 CAD genes identified in the main MR analysis could be replicated using the eight sensitivity analyses (Supplementary Data 9). The correlation of MR estimates between each sensitivity method and the main MR estimates ranged from 80% for the MR-PRESSO method to 96% for the weighted mode method (Supplementary Fig. 1). With an increased number of instruments, we also identified 12 additional genes associated with T2D and 18 additional genes associated with CAD. These genes showed MR evidence in three or more sensitivity methods and exhibited colocalization evidence (Supplementary Data 10). Among the top MR findings, 121 (75%) of the 162 T2D genes and 45 (56%) of the 80 CAD genes were previously reported as associated with T2D or CAD in the OpenTargets database35 (Supplementary Data 11).
Subsequently, we performed a differential gene expression analysis on the 162 T2D genes identified using single-cell gene expression data from seven studies (Fig. 3a and b, Supplementary Data 12; more details in the “Methods” section). We revealed that 79 (49%) of the 162 T2D genes exhibited evidence of differential expression between T2D and non-T2D individuals in at least one of the seven studies (Supplementary Data 13, 14). Specifically, 32 genes expressed in naive CD4+ T cells and other 29 genes expressed in memory CD4+ T cells showed differential expression analysis evidence in two or more of the seven studies (Supplementary Data 13-14). For example, our MR results indicated that STK17B gene expression in naive CD4+ T cells was associated with a reduced risk of T2D (OR = 0.95, 95% CI: 0.92–0.98, P = 0.3 × 10−3), which is consistent with our findings from the differential gene expression analysis (Fig. 3c and d).
a Number of genes among 162 genes differentially expressed and detected in different scRNA-seq data of diverse tissues from T2D patients. The number of samples from each group is as follows: Intestine (Control naive CD4+ T cells: n = 643, T2D naive CD4+ T cells: n = 1403, Control memory CD4+ T cells: n = 3300, T2D memory CD4+ T cells: n = 1509), Kidney (Control naive CD4+ T cells: n = 409, T2D naive CD4+ T cells: n = 1594, Control memory CD4+ T cells: n = 112, T2D memory CD4+ T cells: n = 866), Liver (Control naive CD4+ T cells: n = 741, T2D naive CD4+ T cells: n = 107, Control memory CD4+ T cells: n = 3264, T2D memory CD4+ T cells: n = 454), Lung (Control naive CD4+ T cells: n = 3411, T2D naive CD4+ T cells: n = 393, Control memory CD4+ T cells: n = 1210, T2D memory CD4+ T cells: n = 649), PBMC1 (Control naive CD4+ T cells: n = 2008, T2D naive CD4+ T cells: n = 3691, Control memory CD4+ T cells: n = 678, T2D memory CD4+ T cells: n = 773), PBMC2 (Control naive CD4+ T cells: n = 13072, T2D naive CD4+ T cells: n = 7301, Control memory CD4+ T cells: n = 4928, T2D memory CD4+ T cells: n = 4138), Wound-edge (Control naive CD4+ T cells: n = 689, T2D naive CD4+ T cells: n = 173, Control memory CD4+ T cells: n = 288, T2D memory CD4+ T cells: n = 880). b Genes validated in differential expression analysis. The color represents their average expression fold change levels. The size of a point represents the number of times it has been validated in different datasets. c UMAP embedding of scRNA-seq data of naive CD4+ T cells from T2D patients and healthy controls. d STK17B gene expression in naive CD4+ T cells.
We further conducted the MR analysis by using four non-dynamic immune cell and tissue-specific eQTL datasets from the OneK1K cohort, DICE, GTEx v8, and the eQTLGen consortium26,27,28,29 for comparison. Of the 162 T2D genes and 80 CAD genes, 158 gene–T2D pairs and 77 gene–CAD pairs showed MR evidence in one or more of the four non-dynamic eQTL datasets (Fig. 4a–c, Supplementary Data 15–18).
Red and yellow dots represent genes causally affecting T2D and/or CAD risk with colocalization evidence in the dynamic eQTL data or in any of the non-dynamic eQTL datasets. Gray dots indicate the gene is absent in a given database or the result was tested in a given database but did not show robust MR and colocalization evidence. a Genes causally affecting T2D risk that are only detected in dynamic eQTL data and for example, genes detected in dynamic or non-dynamic eQTL datasets. b Genes causally affecting CAD risk that were only detected in dynamic eQTL data and for example, genes detected in dynamic or non-dynamic eQTL datasets. c Distribution of the 162 T2D genes and 80 CAD genes in dynamic and non-dynamic datasets.
It is worth noting that compared with the bulk tissue eQTL MR results, 21 of the 242 (9%) gene–disease pairs were only detectable in immune-cell specific eQTL MR analysis (Fig. 4c). For example, expression of CFD, the product of which is a target for Lampalizumab, a complement factor D inhibitor for the treatment of age-related macular degeneration in phase III clinical trial (NCT02745119), was observed to be associated with T2D risk in memory CD4+ T cell at 16 h and TEM at 5 days using dynamic eQTLs (OR = 1.03, 95% CI: 1.01–1.05, P = 4.5 × 10−3; OR = 1.03, 95% CI: 1.01–1.05, P = 3.8 × 10−3, respectively; Supplementary Data 8).
The pathway and protein-protein interaction analysis results of the identified genes are in Supplementary Fig. 2 and Supplementary Data 19, 20.
Cell-type-specific causal effect of CD4+ T cell gene expression on T2D and CAD
Among 46 CD4+ T cell expression profiles, we identified 39 expression profiles with evidence from MR and colocalization supporting the association with T2D, as well as 35 profiles associated with CAD, respectively (Fig. 5a). Besides memory and naive CD4+ T cells, we observed that the causal genes for T2D and CAD were predominantly expressed in TCM and TEM subsets, across five time points of CD4+ T cell activation (0 h, LA, 16 h, 40 h, 5 days). These findings aligned with the abundance of instruments in these specific cell types and suggested a correlation between the number of MR signals for each expression profile and the number of eGenes per cell type reported in Soskic et al.17 (R = 0.98, P < 1 × 10−4; Supplementary Fig. 3). Here, an eGene was defined as a gene with cis-eQTL associated with expression levels of the corresponding gene reported by Soskic et al.17.
a Distribution of the number of genes with colocalization evidence and the number of genes for instruments in dynamic eQTL data across CD4+ T cell expression profiles. Numbers in the box represent the number of genes corresponding to each expression profile. b Correlation of gene expression in CD4+ T cells and that in other immune cell types (n = 91). c Identified genes were specific to CD4+ T cells and not highly expressed in any other immune cell types.
We performed some additional analyses to determine the cell type specificity of the T2D and CAD genes identified in the main MR and sensitivity analyses. We first assessed whether the identified genes were exclusively expressed in CD4+ T cells using gene expression data from DICE (n = 91)27, where 159 T2D genes and 80 CAD genes have cell-type expression data in DICE. We revealed that 153 (96%) of the 159 T2D genes and 73 (91%) of the 80 CAD genes identified in CD4+ T cells using MR were also expressed in CD4+ T cells in the DICE dataset (Supplementary Data 21).
The correlations between gene expressions of these identified genes in CD4+ T cells compared to other cell types were as follows: 76% for monocytes, 86% for NK cells, 78% for B cells, and 94% for CD8+ T cells (Fig. 5b). Considering the high correlation between expression in CD8+ T and CD4+ T cells, we combined gene expressions in T cells and found that 23 (14%) of the 162 T2D genes and 8 (10%) of the 80 CAD genes were specific to T cells and not highly expressed in other immune cell types (Fig. 5c). The cell-type enrichment analysis, conducted using the entropy method36, revealed that 91.3% of the T2D and CAD genes were ranked within the top 50% quantiles of specificity (Supplementary Fig. 4). Additionally, simulations utilizing expression data from DICE indicated that T2D genes identified through genetic tools exhibited a higher likelihood of enrichment in T cells compared to randomly selected genes. This observation was consistent across various gene categories, including expressed, specifically expressed, and differentially expressed genes in T cells. In contrast, there was minimal evidence to suggest that CAD genes exhibited similar enrichment within T cells (Supplementary Fig. 5).
We further investigated whether these genes with eQTLs in CD4+ T cells, as reported by Sockic et al., also exhibited eQTLs in other immune-cell types, using data from the OneK1K project. Similar to the gene expression comparison, we found that 16% of the CAD genes and 12% of the T2D genes had eQTLs exclusively in CD4+ T cells (Supplementary Data 21). These analyses provided further evidence that our identified genes were highly expressed in CD4+ T cells, with over 10% of the identified T2D and CAD genes likely demonstrating T cell specificity.
Activation time-specific causal effect of gene expression on T2D and CAD during CD4+ T cell activation
Using dynamic CD4+ T cell eQTL data enables us to estimate the putative causal effects of gene expression on disease throughout CD4+ T cell activation. Among the 162 T2D genes and 80 CAD genes, 73 (32.9%) genes were identified at resting state, while 192 (86.5%) genes were detected upon activation (some genes showed MR signals for both resting and active states). Specifically, 5.0%, 37.8%, 50.0%, and 37.4% of the genes were detectable in LA, 16, 40 h, and 5 days, respectively. In total, 149 (67.1%) genes demonstrated MR effects on T2D and/or CAD solely in activated T cells (Fig. 6a). This emphasizes the importance of investigating CD4+ T cell activation status, as it may aid in the identification of additional causal genes for cardiometabolic diseases. For example, the expression levels of MRPL21 and GCDH showed a strong causal effect on T2D only following activation, with the MR effects reaching their peak or nadir at 40 h in activated memory CD4+ T cells (Fig. 6b, Supplementary Data 8). In other instances, the MR effects of HLA-C expression on CAD peaked at 16 h after naive CD4+ T cell activation (Fig. 6b, Supplementary Data 8).
a Proportion of the 162 T2D genes and 80 CAD genes by activation status and five time points. b Example genes that showed distinguish causal effects across activation time points. Effect estimates were odds ratio and 95% confidence interval of disease risk per unit change of the expression levels of the relevant gene. The error bars represent 95% confidence intervals. c Pair-wise Z score of gene–disease pairs across different activation time points. X-axis is Z-score from two-sided pair-wise Z test with degree of freedom of one. The Y-axis is minus log 10 P value of the pair-wise Z test. d Number and two-sided Fisher’s exact test of gene-disease pairs with or without robust MR evidence and time point-specific effects.
We next conducted two additional analyses to identify gene-disease pairs that exhibited distinctive MR signals across different CD4+ T cell activation time points. First, among all the 10,089 tested gene–disease pairs, we extracted 111 gene–CAD and 263 gene–T2D MR results with robust MR signal (FDR adjusted P value < 0.05) at one activation time point and with MR estimates (either passing or not passing the FDR threshold) at other time points (Supplementary Data 22). Pair-wise Z test of MR estimates at different time points revealed that seven gene–CAD pairs and 24 gene–T2D pairs exhibited robust evidence to show temporal causal patterns across different time points (Fig. 6c and Supplementary Data 22). Second, among the 10,089 tested gene–disease pairs, 1865 gene–T2D and 1806 gene–CAD pairs have MR results at two or more time points (with no P value cut-off for MR estimates). We systematically conducted the heterogeneity test of MR estimates across time points among these pairs. The Cochrane Q test suggested that 34 gene–CAD pairs and 69 gene–T2D pairs showed time point-specific effects, in which seven and 30 of them have robust MR evidence in at least one-time point (FDR adjusted P value < 0.05 and colocalization probability > 70%; Fig. 6d; Supplementary Data 23, 24). Fisher’s exact test further suggested that the gene–T2D and gene–CAD pairs with temporal causal patterns are 9.5 and 7.8 times more likely to show robust MR and colocalization evidence (Fig. 6d).
Prioritization of immune-related drug targets for T2D and CAD
To identify immune-related drug targets and repurposing opportunities for T2D and CAD, we compared our MR findings (gene–disease pairs) with existing clinical trial evidence (drug–indication pairs). We mapped the prioritised genes with relevant drugs using Drugbank, OpenTargets and Drug–Gene Interaction database35,37,38, and further collected randomized clinical trial information of the relevant drugs using data from ClinicalTrials.gov. After a comprehensive search, 25 prioritised genes with dynamic eQTL MR and colocalization evidence could be mapped to relevant drugs that are currently under clinical trial investigations (Table 1). The IMPC database search, literature review, and pathway analysis collectively revealed that all 25 prioritized genes are involved in immune-mediated mechanisms and pathways related to CD4+ T cells (or T cells) (Table 1 and Supplementary Data 25).
Among the top MR findings, four gene–disease pairs with MR and colocalization evidence matched the same target-indication pairs of cardiometabolic drugs that were under clinical trial evaluation. Among these four genes, KHK, GCK, and ERN1 protein products are targets of antidiabetic drugs, whereas drugs targeting GCK are still active in clinical trials. Another gene, LIPA, is a target for Sebelipase alfa (BLA: 125561), which has been approved by the U.S. Food and Drug Administration (FDA) for the treatment of lysosomal acid lipase deficiency, which can result in atherosclerosis. Lysosomal acid lipase encoded by LIPA has also been tested in a clinical trial for atherosclerosis treatment (Table 1).
In addition, 13 (52%) of the genes are targets of drugs currently under clinical development, our MR results showed putative causal evidence to support repurposing these targets for T2D and/or CAD prevention. For instance, ERAP2 is the target gene for ERAP inhibitors (Grey Wolf Therapeutics), an immunotherapy targeting the ERAP proteases in the antigen presentation that is under preclinical trial investigation for cancer treatment (https://greywolftherapeutics.com). It has also been proposed as a potential target for the treatment of autoimmune diseases39,40. Our MR results showed strong genetic evidence to support the causal role of inhibiting expression levels of ERAP2 in reducing CAD risk (e.g., ERAP2 expression in memory CD4+ T cell at 0 h: OR = 1.02, 95% CI: 1.01–1.03, P = 7.3 × 10−4), which implies a repositioning opportunity of the ERAP inhibitors for CAD prevention.
Discussion
In this study, we estimated putative causal effects of 11,021 gene expression profiles throughout CD4+ T cell activation on T2D and CAD, which identified 162 genes altering T2D risk and 80 genes altering CAD risk. The cell-type specific analysis showed that the identified T2D and CAD genes were enriched in TCM and effector TEM, with over 10% of the T2D and CAD genes showing T cell specificity. The activation time point-specific analysis identified 69 gene-T2D and 34 gene-CAD pairs with temporal causal patterns, and showed that these pairs are 9.5 and 7.8 times more likely to be enriched in those with robust MR and colocalization evidence. Integrated with drug trial information, we identified 25 genes that were targeted for drugs with clinical trial information. Collectively, the connection between immune-related genes and two major cardiometabolic diseases conducted by MR analysis showcases the value of immune-metabolism convergency in identifying potential causal genes, functional pathways, and drug targets for T2D and CAD prevention.
Recent large-scale genetic studies of T2D have highlighted the disease heterogeneity, revealing that T2D can be clustered into several subgroups based on the metabolic features of the genetic variants. Although some cell-type specific analyses have been conducted on T2D-associated genes41, these analyses were primarily focused on key tissues such as the pancreas, brain, and liver. As a result, our understanding of the influence of immunity and immune cell gene expressions on T2D remains limited. In this study, we enhanced our understanding of the potential causal relationships between immune-mediated gene expressions and T2D/CAD. We have identified a comprehensive set of immune-mediated genes associated with T2D risk, indicating a robust connection between immunity and T2D. These findings not only enhance our comprehension of the immunological underpinnings of T2D but also provide a foundation for future research into targeted therapies that address the immune components of the disease.
Although previous studies have linked CD4+ T cells with T2D, there remains a significant opportunity to develop immune-related drug targets, as some immunomodulatory therapies have demonstrated clinical efficacy in T2D treatment11. In this study, we identified 162 causal genes for T2D expressed in CD4+ T cells, with 90% of their expressions in CD4+ T cells validated in an independent dataset. However, only 12% of the genes exhibit CD4+ T cell-specific effects on T2D, implying a substantial interplay between immune-cell gene expressions and T2D, not exclusive to CD4+ T cells. These findings acknowledge the complexity of gene expression across different immune cells and its implications for disease mechanisms. We identified four unique putative causal genes for T2D at three activation time points of nTreg cells, whereas we only identified one causal gene for CAD in nTreg cells at a single activation time point. Of particular interest is TOMM7, identified as a putative causal gene for T2D. TOMM7 has previously been reported to be associated with mitochondrial dysfunction42, and neurological and metabolic diseases43,44. Our findings indicate that higher expression levels of TOMM7 in resting nTreg cells could increase T2D risk. This aligns with existing evidence that Treg cells are involved in glucose metabolism and insulin resistance45,46,47,48. One previous study reported that Treg cells engineered with chimeric antigen-receptor (CAR) were considered as potential immunotherapy candidates for T1D treatment, and the therapeutic drug Teplizumab, targeting Treg cells, was recently approved by the FDA2. Given the substantial economic burden diabetes imposes on the global healthcare system and individuals, it is crucial to develop updated prevention strategies. We believe that leveraging human genetics tools to connect immune-related drug targets with T2D prevention represents a promising avenue for future research.
In addition, recent studies have indicated the pathogenic role of immune dysregulation in cardiometabolic diseases6. Immune cells, including Tregs, T helper cell 17 (Th17), and B cells, have been considered essential for maintaining local immunity during the progression of atherosclerosis49 and hypertension50. Currently, immunotherapies targeting immune checkpoints, including CTLA4, CD40, and CD40L51, have been explored for CVD treatment. Our MR results support these findings and enhance our understanding of the impact of gene expression in T cells on cardiometabolic disease and prioritize drug targets for CAD prevention. For example, our study revealed that decreased expression levels of ADAMTS7, one of the prioritized genes, were causally associated with reduced CAD risk. This is consistent with previous evidence highlighting ADAMTS7 as an emerging therapeutic target and demonstrating the protective effects of ADAMTS7 inhibitors against atherothrombosis52. Importantly, our results demonstrated a cell type-specific effect of ADAMTS7 in CD4+ T cells on CAD, which complements previous findings.
This study focuses on combining the MR and colocalization methods to identify putative causal genes for T2D and CAD. These two methods address similar causal questions and have been previously linked34. We also noticed the existence of other methods, including Transcriptome-Wide Association Studies (TWAS) and MetaXcan53,54, which utilize similar data but employ different settings and assumptions. However, some previous studies have demonstrated that the agreement between these methods is not always perfect55, which sometimes may introduce complexity when interpreting the results. Our results suggested the dynamic single-cell eQTL data may enhance the identification of relevant genes. Among the prioritized genes identified in our study, the majority were identified in two later time-points of activation. This observation may be partly due to the relatively high number of eGenes identified at 40 h and 5 days in the original study. Additionally, it is possible that some metabolic pathways function primarily after T cell activation56. Our results further support the notion that temporal causal patterns are likely to be enriched in genes with strong MR and colocalization evidence at least in one activation time point. Therefore, when utilizing genetic tools for future drug target identification, it is crucial to carefully consider the effect of the drug during cell activation. Such an approach may provide valuable insights into determining the optimal activation time for targeting a specific disease.
Our study has some limitations. First, the estimated effects at different activation time points are based on the assumption of a dynamic interaction between gene expression and activation time. This assumption has been previously tested by Soskic et al., who suggested that expression levels of most genes interacted with the dynamics of CD4+ T cells17. Therefore, this is likely to be a valid assumption for our study. Secondly, we were unable to compare our results with other dynamic eQTL data on immune cells due to the scarcity of such data. However, we applied multiple study designs to increase reliability, and our results demonstrated good statistical power. Future studies should aim to incorporate large-scale independent dynamic eQTL data to minimize the potential for winner’s curse57. Thirdly, as the genetic associations in our gene expression data were limited to the cis-acting region, we were unable to directly conduct reverse MR analyses of the liability of T2D or CAD on gene expression. Instead, we applied Steiger filtering to check for the directionality of the relationship. Finally, it is important to notice that the putative causal genes we identified were based on genetically predicted gene expression levels rather than protein expression. Considering that single-cell gene expression does not always correlate with protein levels of the same target58, this introduces an additional layer of uncertainty in terms of identifying the drug target effects for T2D and CAD.
In conclusion, this study identified 162 putative causal genes for T2D and 80 putative causal genes for CAD, with 9% of these gene-disease pairs identifiable only through immune-cell-specific eQTLs. The cell-type and activation time point-specific analyses extended our understanding of gene expressions patterns in relation to T2D and CAD. Our study highlights the potential for investigating the convergence of immune and metabolic pathways, paving the way for the development of immunomodulatory medications targeting cardiometabolic diseases.
Methods
Genetic instrument selection for dynamic expressions of genes in CD4+ T cells
Summary-level eQTL data were obtained from the recent GWAS study of single-cell gene expressions in CD4+ T cells, which was conducted in 119 participants with European ancestry17. We started the instrument selection process by accessing this GWAS data with 482,971 eQTLs associated with expression levels of genes in unstimulated and activated CD4+ T cells. We selected conditionally independent cis-eQTLs from full summary statistics as candidate instruments to fit with the data requirements of the MR and colocalization analyses. We applied linkage disequilibrium (LD) clumping to remove eQTLs in LD with the top association signals (LD r2 < 0.001). To ensure the statistical power of the eQTLs, we calculated the F-statistics of each eQTL and selected instruments with F-statistics > 10, which were considered to have sufficient instrument strength. Furthermore, we used Steiger filtering25 to test the directionality of the eQTL-outcome associations for all MR instruments. Only eQTLs with the proposed directionality (which means the eQTL influences the expression of the gene first and then influences the disease outcome as a causal consequence) were included in the instruments. As a result, we selected 11,021 dynamic eQTLs of 1817 genes for the main MR analysis using dynamic eQTLs.
In addition, we re-selected genetic instruments, which included more additional weaker instruments, so that we could apply more MR sensitivity methods. A more relaxing association P-value cut-off (P < 0.05) and F-statistic (≥5) were applied, and Steiger filtering was used before removing highly correlated SNPs through clumping (LD r2 < 0.1 referring to the 1000 Genome Project Europeans) to maximise the numbers of instruments. We finally identified 11,007 exposures (cell-type and time point-specific gene expression profiles) having at least one instrument for analyses of T2D (45,913 instruments) and CAD (46,766 instruments), in which 10,112 and 10,140 of the exposures have two or more instruments, respectively (Supplementary Data 2).
Genetic instrument selection for expressions of genes from non-dynamic eQTL datasets
For consistency, the same strong instrument selection criteria of eQTLs were applied for non-dynamic datasets, including LD clumping with LD r2 < 0.001, F-statistics > 10, and Steiger filtering to confirm directionality. We selected conditionally independent eQTLs of the 162 T2D genes and 80 CAD genes that were identified in dynamic MR analysis from four non-dynamic datasets of European ancestry: (i) 952 eQTLs of 180 genes expressed in 14 types of immune cells were selected from the OneK1K cohort; (ii) 1232 eQTLs of 149 genes expressed in 15 types of immune cells were selected from the DICE consortium; (iii) 4990 eQTLs of 221 genes expressed in 49 tissues were selected from GTEX v8; iv) 588 eQTLs of 201 genes expressed in whole blood were selected from eQTLGen consortium.
Outcome selection
We selected T2D GWAS meta-analysis summary statistics from the DIAGRAM (DIAbetes Genetics Replication And Meta-analysis) consortium, which is comprised of 80,154 cases and 853,816 controls of European ancestry, published by Mahajan et al.30. The CAD GWAS summary statistics were extracted from Aragam KG, et al.31, and we selected the European-only GWAS data that includes 181,522 cases and 984,168 controls.
Statistical analysis
Dynamic single-cell eQTL MR analysis (main analysis)
In the dynamic eQTL MR analysis, we estimated the putative causal effects of 11,021 gene expression profiles on T2D and CAD by using the Wald ratio method32. The MR estimates with Benjamini–Hochberg FDR corrected P value < 0.05 were selected as candidate gene-disease signals for follow-up analyses. All the analyses were conducted using the TwoSampleMR59 packages. Spearman correlation analysis was used to estimate the relationship between the number of eGenes with instruments and the number of eGenes with robust MR results for each CD4+ T cell expression profile.
Colocalization analyses of candidate MR signals
To increase the credibility of the candidate gene–disease MR signals, we applied two sets of sensitivity analyses to obtain more reliable colocalization evidence. The aim of the analysis was to distinguish causal MR signals from gene-disease pairs confounded by LD (therefore introducing a fake association between gene expression of the gene and the disease outcome). First, we conducted an approximate colocalization analysis that was called LD check34. We estimated the LD r2 between each centinal eQTL and disease-associated GWAS signals in the cis region, and r2 > 0.7 was considered as evidence for approximate colocalization. Second, we used the ‘Coloc’ R package (version 5.2.0)33 to perform genetic colocalization analysis to test whether there are shared causal variants influencing T2D or CAD within 500 kb upstream and downstream of each candidate eQTL. A colocalization probability [PP.H4/(PP.H3 + PP.H4)] > 70% would suggest that the two genetic association signals (for expression and disease) are likely to share the same causal variant under the conditional that there are signals for both the exposure and outcome in the test region. Finally, either LD r2 > 0.7 or colocalization probability > 70% were considered as evidence to support the existence of colocalization.
Validation of MR findings using re-selected instruments and eight MR sensitivity analyses
We estimated the Wald ratio method for 881 exposures (expression profiles), as only one SNP was identified for each of them. For the rest 10,140 exposures, we applied a range of sensitivity analyses, including weighted median60, weighted mode61, MR-Robust62, cML-MA63, MR-PRESSO64, debiased IVW65, MR-RAPS66, to explore and/or minimize potential bias due to violations of MR assumptions. In particular, MR-PRESSO, an extension of IVW with the removal of outlier SNPs if identified, can only be applied with four or more instruments. These analyses were conducted using ‘TwoSampleMR’59, ‘MendelianRandomization’62, ‘MR-PRESSO’64 and ‘mr.raps’66 R packages. More specifically, MR-Egger was not applied in this study since all the instruments we used were within a specific genomic region rather than across the whole genome. Therefore, the INSIDE assumption of MR-Egger is not likely to stay in our case.
MR analysis for non-dynamic eQTL datasets
In the non-dynamic eQTL MR analysis, the putative causal effects of expression levels of the genes identified in the main MR analysis on T2D and CAD were estimated by using the Wald ratio or IVW method. A Benjamini–Hochberg FDR corrected P value > 0.05 was considered.
Identification of immune-cell specific expressed genes associated with type 2 diabetes and coronary artery disease
To investigate whether our identified genes showed T cell-specific expression in T cells (or even CD4+ T cells) compared to other cell types, we conducted four additional analyses using immune cell-specific expression data from DICE27. The gene expression of five major immune cell types (CD4+ T cells, CD8+ T cells, B cells, NK cells, and monocytes) and their subtypes were included.
First, to examine the specificity of the expression of these genes in the five major cell types, we compared the expression levels of the same gene in different cell types. A gene was considered to be specifically expressed in T cells if its expression in T cells was two or more times higher than its expression in each of the other four cell types.
Second, differential expression analysis was performed by analyzing single-cell expression data from seven studies67,68,69,70,71,72,73 (Supplementary Data 12). We used the SCTransform function to normalize single-cell data in the Seurat package (version 4.1.2)74. Then, we used the FindClusters function (resolution = 0.1) and FindAllMarkers to identify CD4+ T cells from all PBMCs. The CD3E(+)CD4(+) clusters were considered as CD4+ T cells in this study. In CD4+ T cells, those with high expression of CCR7 / IL-7R are considered as naive CD4+ T cells, and those with high expression of CD44 are considered as memory CD4+ T cells. FindMarkers was performed to identify the differential genes in CD4+ T cells between diabetic individuals and healthy controls. Genes that meet the following conditions are considered differentially expressed genes: (i) min.PCT > 0.1; (ii) P < 0.05; and (iii) average log2FC > 0.25.
Third, we employed the entropy method as proposed by Scott et al.36 to rank CD4+ T cell specificity for all the detectable genes, assigning scores between 0 and 1, with 1 indicating the highest specificity. Among the 33,840 genes with data in DICE27, 15,113 of them expressed in CD4+ T cells and were included in this analysis. This ranking was compared against other immune cell types. Subsequently, we assessed the rankings of the T2D and CAD genes identified as T cell-specific (defined as genes with TPM > 1 and expression levels at least twice as high in T cells compared to other cell types) among the 15,113 genes.
Fourth, we conducted a simulation using the same dataset to determine if the percentage of our identified T2D and CAD genes exhibited higher T cell specificity compared to randomly selected genes (serving as the background expectation). The randomly selected genes were simulated 10,000 times to generate a distribution for the background expectation. The Shapiro–Wilk test75 was applied to assess the normality of the distribution. Subsequently, the hypergeometric test (using the “phyper” function in R) was employed to evaluate whether our identified genes were statistically different from the randomly selected genes76. Three sets of simulations were performed for CD4+ T cell expressed genes (159 genes in T2D and 80 genes in CAD), T cell specifically expressed genes (23 genes in T2D and 8 genes in CAD), and differentially expressed genes in T cells (79 genes in T2D).
Statistical analyses to estimate temporal causal patterns across activation time points of CD4+ T cells
We conducted three statistical analyses to understand the activation time-specific causal effect of gene expressions on T2D and CAD.
Pair-wise Z test
For the 4932 gene–CAD pairs and 5157 gene–T2D pairs been tested, we extracted 111 gene–CAD and 263 gene–T2D MR results with robust MR signal (FDR adjusted P value < 0.05) in one activation time point and with MR estimates (either pass or not pass the FDR threshold) in other time points. For each of the 374 (111 + 263) MR results, the time point with the most distinct MR effect estimate was set as a reference, pair-wise Z score was estimated between the reference MR effect and effect at other time points (i.e. MR effects of expression of SPATA20 in memory CD4+ T cell at 16 h on T2D [reference] were tested against that at 0, 40 h and 5d respectively; Supplementary Data 22).
Heterogeneity test
We further split the gene–T2D/CAD pairs into four groups, (1) pairs with time-specific effects and robust MR evidence in at least one time point; (2) pairs without time-specific effects but with robust MR evidence; (3) pairs with time-specific effects but without robust MR evidence; and (4) pairs without time specific effects and robust MR evidence (Supplementary Data 24). We identified 1806 and 1865 exposure–outcome pairs for CAD and T2D, respectively, which had main MR estimates available at two or more time points. Thus, for each exposure-outcome pair, we meta-analyzed associations across all available time points using fixed-effects with inverse variance weights and used I2 to evaluate heterogeneity among time points. Exposure-outcome pairs with I2 ≥ 50% were interpreted as having substantial heterogeneity.
Fisher’s exact test
We further split the gene-T2D/CAD pairs into four groups, (1) pairs with time-specific effects and robust MR evidence in at least one time point; (2) pairs with no time-specific effects but with robust MR evidence; (3) pairs with time-specific effects but without robust MR evidence; and (4) pairs without time specific effects and robust MR evidence (Fig. 6d). Fisher’s exact test was then applied based on the numbers within the four subgroups to systematically understand whether the gene-disease pairs with robust MR and colocalization evidence are more likely to be enriched in those with temporal causal patterns.
Gene set enrichment analysis
Gene set enrichment analyses were performed using metascape77, an efficient tool for the comprehensive analysis and interpretation of OMICs-based studies. Before analysis, we used STRING78 to construct the protein-protein interaction (PPI) network of the prioritised T2D and CAD genes. By using the Cytoscape (version 3.9.1)79 plugin cytoHubba80 based on the network maximal clique centrality (MCC) algorithm, we identified 47 core genes according to the criteria of MCC > 3. Then we performed pathway and process enrichment analysis and obtained PPI networks selected from these 47 core genes by using metascape. Pathway and process enrichment analysis was applied to each MCODE component independently, and the two or three best-scoring terms by p-value were retained as the functional description of the corresponding components.
Triangulation with drug trial evidence
We searched the prioritised T2D and CAD genes as drug targets in ClinicalTrials.gov (https://clinicaltrials.gov), Drugbank (https://go.drugbank.com), OpenTargets (https://platform.opentargets.org), and the Drug Gene Interaction database (https://dgidb.org), as well as searching related drug trial studies. Genes with register information or as intervention targets that were under clinical trials were considered to have RCT evidence. For the genes with drug trial evidence, we compared the indications with our results.
To further identify the potential immune-mediated mechanism of the prioritised drug target genes, we conducted three analyses:
-
(1)
Searched for existing mouse knockout evidence from the IMPC database (https://www.mousephenotype.org/);
-
(2)
Searched literature for existing CD4/T cell mechanisms of the identified genes;
-
(3)
Selected the top 10% and bottom 10% of the individuals based on the expression levels of each of the targeted genes, conducted differential expression analysis between the top and bottom group, and then conducted pathway enrichment analysis to identify downstream immune-mediated genes and pathways.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The dynamic single cell-eQTL Mendelian randomization data were available via the supplementary files. The T2D and CAD GWAS summary statistics used in the analyses are freely accessible via the DIAGRAM website (http://diagram-consortium.org/downloads.html) and CVDKP website (https://cvd.hugeamp.org/downloads.html), respectively. The GWAS summary statistics for the single-cell eQTLs can be downloaded via (https://zenodo.org/records/6006796). The single-cell gene expression data and relevant clinical data were accessible via Gut cell survey (https://www.gutcellatlas.org/), Lung cell atlas (https://www.lungcellatlas.org/), and the Gene Expression Omnibus (GEO) repository (https://www.ncbi.nlm.nih.gov/gds) with ID (GSE165816, GSE244515, GSE176415, GSE183279 and GSE192742). The genetic instruments used in this study are publicly available for download from the resources summarized in Supplementary Data.
Code availability
The dynamic single cell-eQTL Mendelian randomization pipeline was shared via the Omics Harbour GitHub repository https://github.com/geneinmylife/single-cell-eQTL-MR-T2D-CAD/.
References
Bluestone, J. A., Buckner, J. H. & Herold, K. C. Immunotherapy: building a bridge to a cure for type 1 diabetes. Science 373, 510–516 (2021).
Hirsch, J. S. FDA approves teplizumab: a milestone in type 1 diabetes. Lancet Diabetes Endocrinol. 11, 18 (2023).
Felten, D. L. et al. Noradrenergic sympathetic neural interactions with the immune system: structure and function. Immunol. Rev. 100, 225–260 (1987).
Bellinger, D. L., Lorton, D., Felten, S. Y. & Felten, D. L. Innervation of lymphoid organs and implications in development, aging, and autoimmunity. Int. J. Immunopharmacol. 14, 329–344 (1992).
Ogata, A., Kato, Y., Higa, S. & Maeda, K. Subcutaneous tocilizumab: recent advances for the treatment of rheumatoid arthritis. Expert Opin. Drug Deliv. 16, 639–648 (2019).
Carnevale, D. Neuroimmune axis of cardiovascular control: mechanisms and therapeutic implications. Nat. Rev. Cardiol. 19, 379–394 (2022).
Hermansson, A. et al. Immunotherapy with tolerogenic apolipoprotein B-100-loaded dendritic cells attenuates atherosclerosis in hypercholesterolemic mice. Circulation 123, 1083–1091 (2011).
Ying, W., Fu, W., Lee, Y. S. & Olefsky, J. M. The role of macrophages in obesity-associated islet inflammation and β-cell abnormalities. Nat. Rev. Endocrinol. 16, 81–90 (2020).
Donath, M. Y., Böni-Schnetzler, M., Ellingsgaard, H. & Ehses, J. A. Islet inflammation impairs the pancreatic beta-cell in type 2 diabetes. Physiology (Bethesda) 24, 325–331 (2009).
Blériot, C., Dalmas, É., Ginhoux, F. & Venteclef, N. Inflammatory and immune etiology of type 2 diabetes. Trends Immunol. 44, 101–109 (2023).
Brooks-Worrell, B., Narla, R. & Palmer, J. P. Biomarkers and immune-modulating therapies for type 2 diabetes. Trends Immunol. 33, 546–553 (2012).
Brooks-Worrell, B. M., Reichow, J. L., Goel, A., Ismail, H. & Palmer, J. P. Identification of autoantibody-negative autoimmune type 2 diabetic patients. Diabetes Care 34, 168–173 (2011).
Goel, A., Chiu, H., Felton, J., Palmer, J. P. & Brooks-Worrell, B. T-cell responses to islet antigens improves detection of autoimmune diabetes and identifies patients with more severe beta-cell lesions in phenotypic type 2 diabetes. Diabetes 56, 2110–2115 (2007).
Stanley, T. L. et al. TNF-alpha antagonism with etanercept decreases glucose and increases the proportion of high molecular weight adiponectin in obese subjects with features of the metabolic syndrome. J. Clin. Endocrinol. Metab. 96, E146–E150 (2011).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Richardson, T. G., Hemani, G., Gaunt, T. R., Relton, C. L. & Davey Smith, G. A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome. Nat. Commun. 11, 185 (2020).
Soskic, B. et al. Immune disease risk variants regulate gene expression dynamics during CD4(+) T cell activation. Nat. Genet. 54, 817–826 (2022).
Ackermann, M., Sikora-Wohlfeld, W. & Beyer, A. Impact of natural genetic variation on gene expression dynamics. PLoS Genet. 9, e1003514 (2013).
Chong, M. et al. Novel drug targets for ischemic stroke identified through mendelian randomization analysis of the blood proteome. Circulation 140, 819–830 (2019).
Henry, A. et al. Therapeutic targets for heart failure identified using proteomics and Mendelian randomization. Circulation 145, 1205–1217 (2022).
Ochoa, D. et al. Human genetics evidence supports two-thirds of the 2021 FDA-approved drugs. Nat. Rev. Drug Discov. 21, 551 (2022).
Nelson, M. R. et al. The support of human genetic evidence for approved drug indications. Nat. Genet. 47, 856–860 (2015).
Morrison, J., Knoblauch, N., Marcus, J. H., Stephens, M. & He, X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat. Genet. 52, 740–747 (2020).
Burgess, S. & Thompson, S. G. Avoiding bias from weak instruments in Mendelian randomization studies. Int. J. Epidemiol. 40, 755–764 (2011).
Hemani, G., Tilling, K. & Davey Smith, G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 13, e1007081 (2017).
Yazar, S. et al. Single-cell eQTL mapping identifies cell type-specific genetic control of autoimmune disease. Science 376, eabf3041 (2022).
Schmiedel, B. J. et al. Impact of genetic polymorphisms on human immune cell gene expression. Cell 175, 1701–1715.e1716 (2018).
GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 53, 1300–1310 (2021).
Mahajan, A. et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat. Genet. 54, 560–572 (2022).
Aragam, K. G. et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat. Genet. 54, 1803–1815 (2022).
Lawlor, D. A., Harbord, R. M., Sterne, J. A., Timpson, N. & Davey Smith, G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat. Med. 27, 1133–1163 (2008).
Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
Zheng, J. et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 52, 1122–1131 (2020).
Koscielny, G. et al. Open Targets: a platform for therapeutic target identification and validation. Nucleic Acids Res. 45, D985–d994 (2017).
Scott, L. J. et al. The genetic regulatory signature of type 2 diabetes in human skeletal muscle. Nat. Commun. 7, 11764 (2016).
Freshour, S. L. et al. Integration of the Drug–Gene Interaction Database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res. 49, D1144–d1151 (2021).
Wishart, D. S. et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34, D668–D672 (2006).
Hamilton, F. et al. Variation in ERAP2 has opposing effects on severe respiratory infection and autoimmune disease. Am. J. Hum. Genet. 110, 691–702 (2023).
Camberlein, V. et al. Discovery of the first selective nanomolar inhibitors of ERAP2 by kinetic target-guided synthesis. Angew. Chem. Int. Ed. Engl. 61, e202203560 (2022).
Suzuki, K. et al. Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. Nature 627, 347–357 (2024).
Hasson, S. A. et al. High-content genome-wide RNAi screens identify regulators of parkin upstream of mitophagy. Nature 504, 291–295 (2013).
Li, Z. et al. Identification of potential blood biomarkers for early diagnosis of schizophrenia through RNA sequencing analysis. J. Psychiatr. Res. 147, 39–49 (2022).
Garg, A. et al. Autosomal recessive progeroid syndrome due to homozygosity for a TOMM7 variant. J. Clin. Investig. 132, e156864 (2022).
Garidou, L. et al. The gut microbiota regulates intestinal CD4 T cells expressing RORγt and controls metabolic disease. Cell Metab. 22, 100–112 (2015).
Ilan, Y. et al. Induction of regulatory T cells decreases adipose inflammation and alleviates insulin resistance in ob/ob mice. Proc. Natl Acad. Sci. USA 107, 9765–9770 (2010).
Peng, M. & Li, M. O. Metabolism along the life journey of T cells. Life Metab. 2, load002 (2023).
He, J. et al. The effect of short-term intensive insulin therapy on inflammatory cytokines in patients with newly diagnosed type 2 diabetes. J. Diabetes 14, 192–204 (2022).
van Gils, J. M. et al. The neuroimmune guidance cue netrin-1 promotes atherosclerosis by inhibiting the emigration of macrophages from plaques. Nat. Immunol. 13, 136–143 (2012).
Marvar, P. J. et al. Central and peripheral mechanisms of T-lymphocyte activation and vascular inflammation produced by angiotensin II-induced hypertension. Circ. Res. 107, 263–270 (2010).
Bonfiglio, C. A., Weber, C., Atzler, D. & Lutgens, E. Immunotherapy and cardiovascular diseases: novel avenues for immunotherapeutic approaches. QJM 116, 271–278 (2023).
Feinstein, M. J. & Thorp, E. B. ADAMTS7 knockdown in context: emerging therapeutic targets in atherothrombosis. Circ. Res. 129, 471–473 (2021).
Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252 (2016).
Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).
Namba, S., Konuma, T., Wu, K. H., Zhou, W. & Okada, Y. A practical guideline of genomics-driven drug discovery in the era of global biobank meta-analysis. Cell Genom. 2, 100190 (2022).
Christoffersson, G., Ratliff, S. S. & von Herrath, M. G. Interference with pancreatic sympathetic signaling halts the onset of diabetes in mice. Sci. Adv. 6, eabb2878 (2020).
Huang, Q. Q., Ritchie, S. C., Brozynska, M. & Inouye, M. Power, false discovery rate and Winner’s Curse in eQTL studies. Nucleic Acids Res. 46, e133 (2018).
Reimegård, J. et al. A combined approach for single-cell mRNA and intracellular protein expression analysis. Commun. Biol. 4, 624 (2021).
Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife 7, e34408 (2018).
Bowden, J., Davey Smith, G., Haycock, P. C. & Burgess, S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted Median estimator. Genet. Epidemiol. 40, 304–314 (2016).
Hartwig, F. P., Davey Smith, G. & Bowden, J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int. J. Epidemiol. 46, 1985–1998 (2017).
Patel, A. et al. MendelianRandomization v0.9.0: updates to an R package for performing Mendelian randomization analyses using summarized data. Wellcome Open Res. 8, 449 (2023).
Hiraoka, T., Kiuchi, G., Hiraoka, R., Maruo, K. & Oshika, T. Multifocal contact lenses and 0.01% atropine eye drops for myopia control study: research protocol for a 1-year, randomized, four-arm, clinical trial in schoolchildren. Eye Contact Lens 49, 172–177 (2023).
Verbanck, M., Chen, C. Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).
Ye, T., Shao, J. & Kang, H. Debiased inverse-variance weighted estimator in two-sample summary-data Mendelian randomization. Ann. Stat. 49, 2079–2100 (2021).
Zhao, Q., Wang, J., Hemani, G., Bowden, J. & Small, D. S. Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. Ann. Stat. 48, 1742–1769 (2020).
Theocharidis, G. et al. Single cell transcriptomic landscape of diabetic foot ulcers. Nat. Commun. 13, 181 (2022).
Lee, H. et al. Immunological link between periodontitis and type 2 diabetes deciphered by single-cell RNA analysis. Clin. Transl. Med. 13, e1503 (2023).
Singh, K. et al. Genome-wide DNA hypermethylation opposes healing in patients with chronic wounds by impairing epithelial–mesenchymal transition. J. Clin. Investig. 132, e157279 (2022).
Lake, B. B. et al. An atlas of healthy and injured cell states and niches in the human kidney. Nature 619, 585–594 (2023).
Guilliams, M. et al. Spatial proteogenomics reveals distinct and evolutionarily conserved hepatic macrophage niches. Cell 185, 379–396.e338 (2022).
Madissoon, E. et al. A spatially resolved atlas of the human lung characterizes a gland-associated immune niche. Nat. Genet. 55, 66–77 (2023).
Elmentaite, R. et al. Cells of the human intestinal tract mapped across space and time. Nature 597, 250–255 (2021).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e3529 (2021).
SHAPIRO, S. S. & WILK, M. B. An analysis of variance test for normality (complete samples). Biometrika 52, 591–611 (1965).
Lee, J. S., Katari, G. & Sachidanandam, R. GObar: a gene ontology based analysis and visualization tool for gene sets. BMC Bioinform. 6, 189 (2005).
Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523 (2019).
Szklarczyk, D. et al. The STRING database in 2021: customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 49, D605–d612 (2021).
Doncheva, N. T., Morris, J. H., Gorodkin, J. & Jensen, L. J. Cytoscape StringApp: network analysis and visualization of proteomics data. J. Proteome Res. 18, 623–632 (2019).
Chin, C. H. et al. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 8, S11 (2014).
Acknowledgements
This work was supported by grants from the National Key Research and Development Program of China (Grant Nos. 2022YFC2505202, 2022YFC2505203, and 2023ZD0508402), National Natural Science Foundation of China (Grant Nos. 82370810, 82170819, and 82088102), Science and Technology Commission of Shanghai Municipality (Grant nos. 23JS1400900, 23Y11908400, 23XD1422400), Shanghai Clinical Research Center for Metabolic Disease (19MC1910100). J.Z. is supported by the Academy of Medical Sciences (AMS) Springboard Award, the Wellcome Trust, the Government Department of Business, Energy and Industrial Strategy (BEIS), the British Heart Foundation and Diabetes UK (SBF006\1117). E.E.V. is supported by Diabetes UK (17/0005587), the World Cancer Research Fund (WCRF UK), as part of the World Cancer Research Fund International grant program (IIG_2019_2009) and Cancer Research UK (CRUK) Integrative Cancer Epidemiology Program (C18281/A29019). M.C.B. is supported by a University of Bristol Vice-Chancellor’s Fellowship, the Bristol British Heart Foundation Accelerator Award (AA/18/1/34219). TRG and MCB are supported by the University of Bristol and the Medical Research Council (MRC) Integrative Epidemiology Unit (MC_UU_00032/03 and MC_UU_00032/05). T.R.G. is also supported by the NIHR Bristol Biomedical Research Centre, which is funded by the NIHR and is a partnership between University Hospitals Bristol and Weston NHS Foundation Trust and the University of Bristol. Department of Health and Social Care disclaimer: The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. We would like to express our gratitude to Dr. Youqiong Ye, Dr. Ying Wang, and Dr. Kankan Wang for their valuable discussions and insights.
Author information
Authors and Affiliations
Contributions
Conceptualization: X.Y.W., J.Z., and J.L.L.; Formal analysis: X.Y.W., H.L.Z., Q.Y., Y.L.D., and H.Y.L.; Methodology: H.L.Z. and J.Z.; Investigation: Z.H.C., R.Z.Z., H.L., S.Y.W., M.L., T.G.W., Z.Y.Z., M.X., Y.H.C., and Y.X.; Visualization: X.Y.W. and H.Y.L.; Funding acquisition: E.E.V., M.C.B., T.R.G., Y.F.B., J.Z., and J.L.L.; Project administration: G.N. and W.Q.W.; Supervision: W.Q.W., Y.F.B., J.Z., and J.L.L.; Writing—original draft: X.Y.W., H.Y., and Q.Q.Y.; Writing—review and editing: E.E.V., M.C.B., T.R.G., J.Z. and J.L.L. W.Q.W., Y.F.B., J.Z., and J.L.L. are the guarantors of this work and shall take responsibility for the full access and integrity of the data. All authors have approved the final version of the manuscript.
Corresponding authors
Ethics declarations
Competing interests
T.R.G. receives funding from Biogen and GSK for research not represented in this manuscript.
Peer review
Peer review information
Nature Communications thanks Marie Piqeyre and the other, anonymous, reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wu, X., Ying, H., Yang, Q. et al. Transcriptome-wide Mendelian randomization during CD4+ T cell activation reveals immune-related drug targets for cardiometabolic diseases. Nat Commun 15, 9302 (2024). https://doi.org/10.1038/s41467-024-53621-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467-024-53621-7
This article is cited by
-
Proteomic signatures of type 2 diabetes predict the incidence of coronary heart disease
Cardiovascular Diabetology (2025)
-
Dissecting immune cell-specific genetics in migraine: a multi-omics framework for target discovery and therapeutic prioritization
The Journal of Headache and Pain (2025)