Abstract
Comorbidity among atopic diseases (ADs) and gastrointestinal diseases (GIDs) has been repeatedly demonstrated by epidemiological studies, whereas the shared genetic liability remains largely unknown. Here we establish an atlas of the shared genetic architecture between 10 ADs or related traits and 11 GIDs, comprehensively investigating the comorbidity-associated genomic regions, cell types, genes and genetically predicted causality. Although distinct genetic correlations between AD-GID pairs are observed, including 14 genome-wide and 28 regional correlations, genetic factors of Crohn’s disease (CD), ulcerative colitis (UC), celiac disease and asthma subtypes converge on CD4+ T cells consistently across relevant tissues. Fourteen genes are associated with comorbidities, three of which are known treatment targets, indicating potential for drug repurposing. Lower expression of WDR18 and GPX4 in PBMC CD4+ T cells predicts decreased risk of CD and asthma, making these genes candidate novel drug targets. Mendelian randomization (MR) suggests that certain ADs lead to higher risk of GIDs or vice versa. Taken together, here we show distinct genetic correlations between AD-GID pairs, but the correlated genomic loci converge on the dysregulation of CD4+ T cells. Inhibiting WDR18 and GPX4 expression might be a candidate therapeutic strategy for CD and asthma. Estimated causality indicates potential guidance for preventing comorbidity.
Introduction
The prevalence of atopic diseases (ADs), including atopic dermatitis (or eczema), hay fever (or allergic rhinitis), and asthma, is increasing globally, affecting approximately 20% of the population worldwide1,2. Compared with the general population, patients with ADs have an increased risk of immune diseases, metabolic syndrome and mental disorders3. These comorbidities associated with ADs can lead to poor health status, a high economic burden and reduced quality of life4,5.
One of the main comorbidities of ADs pertains to gastrointestinal disorders (GIDs) observed in epidemiological studies6,7, including inflammatory bowel disease (IBD)8,9, celiac disease10, gastroesophageal reflux disease (GERD)11, irritable bowel syndrome (IBS)12, and gastrointestinal cancers13,14. For example, the risks of incident CD in patients with atopic dermatitis, allergic rhinitis, or asthma were reported to be higher than in individuals without those ADs, with hazard ratios (HRs) of 2.02, 1.33, and 1.60, respectively9. Similarly, the HRs of incident UC in patients with the above ADs were 1.51, 1.32, and 1.29, respectively9. Furthermore, the prevalence of GERD in asthmatic patients ranged from 32% to 80%11,15, and GERD was also associated with an increased risk of eczema (OR 1.21, 95% CI, 1.07–1.37)16. Likewise, IBS has been associated with an increased risk of allergic disease. The risk of allergic diseases in patients with IBS was reported to be 15–99%, higher than in healthy individuals12.
Multiple scenarios may explain the co-occurrence between ADs and GIDs, (1) one disease causes the other, (2) the treatment of the first disease influences the risk of the other, (3) shared etiology including genetic basis, environmental exposures, treatment history, epithelial barrier dysfunction and microbiota alterations10,17. Recent large-scale genome-wide association studies (GWAS) of ADs and GIDs have identified a considerable number of shared disease-associated genomic loci, for example, variants within genes SMAD3, GSDMB, ORMDL3, and IL1R118,19,20 conferred increased susceptibility for both asthma and IBD. Significant genetic correlations have also been reported between IBS-asthma and GERD-asthma21,22. These studies suggested that an individual’s genetics acts as a shared intrinsic factor of AD-GID comorbidities, providing an important view of the common etiology between two traits which are phenotypically different.
Despite this progress, a deeper investigation of genetic relationships between ADs and GIDs is still lacking. More specifically, genomic regions that are genetically correlated, genes which are likely to be causal and the relevant cell types are largely unknown. Understanding of the shared genetic architecture between ADs and GIDs has important clinical implications, (1) to determine whether one type of disease may represent a risk of another23, (2) to unveil shared molecular insights of comorbidities24, and (3) to re-evaluate current drugs for repurposing or new drug design based on pleiotropic genes25.
In this study, we used up-to-date GWAS summary statistics for seven AD traits (eczema, allergic rhinitis, asthma, asthma subtypes), three lung function measurements related to asthma, and 11 GIDs selected based on reported comorbidities. We first assessed their genetic correlations at both genome-wide and regional levels. We then hypothesized that (1) shared genetic factors imply a common molecular basis and (2) causal relationships between ADs and GIDs partially explain the comorbidity. Therefore, we integrated large-scale GWAS, sc-RNAseq, and bulk- and sc-eQTL data to prioritize comorbidity-associated cell types and genes. Multiple MR approaches were used to assess the causality between AD-GID pairs. With these approaches, we aimed to comprehensively explore the shared genetic architecture between ADs and GIDs in an attempt to improve the understanding of the molecular basis of these comorbidities.
Results
GERD/IBS and ADs were genetically correlated at genome-wide level
GID is one of the main comorbidities of ADs. We first summarized previous studies that reported AD-GID comorbidity evidence, which is provided in the Supplementary Materials. In total, seven ADs and disease subtypes, three lung function measurements that related to asthma, and 11 GI diseases were selected, and all samples of the selected studies were from European populations (Fig. 1, Table 1, Supplementary Data 1). We subsequently explored the genetic correlations among these 10 AD related traits and 11 GIDs by linkage disequilibrium score regression (LDSC). In total, we identified 14 pairs of AD-GID (FDR < 0.05, Fig. 2A, Supplementary Data 2). Gastroesophageal reflux disease (GERD) was correlated with eight ADs, six of which were positively correlated, including asthma, Child-onset asthma (COA), Adult-onset asthma (AOA), Moderate-to-severe asthma (MtoS asthma), hay fever and allergy (rg from 0.07–0.33), and two were negative, including FEV1 and FVC (rg =−0.082 and −0.081). Irritable bowel syndrome (IBS) showed similar correlation patterns with ADs as GERD, including asthma, AOA, allergy (rg from 0.26–0.29) and FVC (rg =−0.067). We also observed correlations between peptic ulcer disease with MtoS asthma (rg = 0.21), and celiac disease with allergy (rg = 0.24). These findings suggest that IBS and GERD might possess similar genetic background related to comorbidity with ADs while the correlations were limited across other AD-GID pairs at a genome-wide level.
The genetic correlations between 10 traits of ADs and 11 GIDs were first determined on both genome-wide level and genomic regional level by LDSC and LAVA, respectively. The subsequent analysis was conducted based on two hypotheses. (1) To detect if the shared genetic architecture between ADs and GIDs were converged on common cellular pathways, comorbidity-associated genes were prioritized using SMR in bulk-eQTLs data of nine tissues and sc-eQTL data from PBMC. Comorbidity-associated cell types were projected to single-cell RNAseq data from blood, gut, lung and airway tissues. Then the druggability of the prioritized genes were assessed in public drug databases. (2) To investigate the directional causality between AD-GID which might partially explain the comorbidities, multiple bi-directional MR approaches were incorporated, followed by a variety of sensitivity analysis. AD, atopic disease. GID, gastrointestinal diseases. LDSC, linkage disequilibrium score regression. LAVA, local analysis of [co]variant association. sc-eQTL, single cell expression quantitative trait loci. SMR, summary-data-based Mendelian Randomization. PBMC, peripheral blood mononuclear cells. COA, child-onset asthma. AOA, adult-onset asthma. MtoS asthma, Moderate-to-severe asthma. CD, Crohn’s disease. CRC, colorectal carcinoma. GERD, gastroesophageal reflux disease. UC, ulcerative colitis. IBS, irritable bowel syndrome. Created in BioRender. Hu, S. (2024) https://BioRender.com/q95x753.
A heatmap of 14 pairs of genome-wide significant correlations. # indicates significant signals with FDR < 0.05. The color represents the directionality of the correlation coefficients. B heatmap of 23 unique genomic regions which were correlated with at least one pair of AD-GID with FDR < 0.05. The numbers indicate the amount of genetically correlated loci. AOA adult-onset asthma, CD Crohn’s disease, COA childhood-onset asthma, CRC colorectal carcinoma, IBS irritable bowel syndrome, UC ulcerative colitis.
Correlated genomic regions were associated with previously unrelated diseases
Global genetic correlations capture the average association across the genome. Complementary to global correlation analysis, we applied Local Analysis of [co]Variant Association (LAVA) to assess the genetically correlated loci. In total, 23 unique genomic regions (rl) were identified to be shared between 28 pairs of AD-GID (FDR < 0.05, Fig. 2B, Supplementary Data 3). Notably, the majority of locally correlated loci (18 out of 23) showed no significant rg in LDSC analysis. For example, seven GIDs (diverticular disease, chronic gastritis, celiac disease, gastric cancer, IBS, colorectal carcinoma (CRC), Crohn’s disease (CD)) were correlated with measures of lung function (FVC, FEV1 and FEV1/FVC) at ten unique loci. Moreover, CD, ulcerative colitis (UC), celiac disease, diverticular disease and GERD were also associated with asthma or its subtypes at six specific regions, suggesting that widespread genetic correlations between ADs and GIDs are driven by local effects that are not captured by LDSC analysis.
A LD block (18.65–20.06 Mb at chromosome 2) contains gene NT5C1B, which has been reported in GWAS of diverticular disease, FEV1 and FVC. We observed strong correlations among these disease pairs (diverticular disease vs. FEV1 rl = 0.74 and diverticular disease vs. FVC rl = 0.73). This locus also revealed positive correlations of CRC with FEV1 and FVC (rl = 0.88 and 0.87 respectively), but did not include genome-wide significant SNPs in CRC GWAS (Supplementary Data 4). Another region (130.57–132.55 Mb) at chromosome 5 was positively associated with asthma and GERD (rl = 0.61). This region contained genome-wide significant SNPs from asthma GWAS but not from GERD GWAS. Pathway enrichment analysis showed that the genes IL3, IL4, IL5, and IL13 within this region were involved in inflammatory responses classically associated with a Th2-driven immune response in asthma (adjusted P = 1.32 × 10−7, Supplementary Data 4, Supplementary Fig. 1). A genomic region (0.76–1.5 Mb) on chromosome 19 presented significant correlation between CD and COA (rl =−0.65), including genes WDR18, GRIN3B, GPX4, TMEM259 and CNN2, enriched in functions related to secretory granule lumen (adjusted P = 3.69 × 10−3), cytoplasmic vesicle lumen (adjusted P = 3.89 × 10−3), and serine-type endopeptidase activity (adjusted P = 7.90 × 10−3, Supplementary Data 4, Supplementary Fig. 1). Collectively, the pleiotropic effect of these regions indicated their evolutionarily conserved properties and thus might contain core genes involved in comorbidity26.
SMR revealed 14 pleiotropic genes on AD-GID pairs
We subsequently aimed to prioritize key pleiotropic genes in the correlated genomic blocks identified by LAVA, through integration with eQTL data from multiple tissues by Summary-data-based Mendelian Randomization (SMR) analysis. After multiple testing correction, 70 genes showed putative causal effects on diseases, among which 14 genes were associated with at least one AD-GID pair (FDRSMR < 0.05 and PHEIDI > 0.01, Fig. 3A, Supplementary Data 5).
A dot plot showing the significant causal genes identified by SMR, using bulk eQTL analysis from different tissues (FDRSMR < 0.05). Blue diamond-shaped dots represent negative effect sizes estimated from SMR while red dots represent positive relationships. B, C Representative examples of locus zoom plots. The X-axis indicates the genomic positions and the Y-axis indicates the −log10 P-values representing the significance of GWAS of diseases and eQTLs. SC sigmoid colon, TC transverse colon, EGJ gastro-esophageal junction, EMa esophageal mucosa, EMs esophageal muscularis.
One prominent example concerns PABPC4. Here, higher expression of PABPC4 was associated with decreased risk of IBS and increased lung function measured by the FEV1/FVC ratio (Fig. 3B), and this pleiotropic effect was identified in five tissues, including blood, sigmoid colon, transverse colon, gastro-esophageal junction and esophageal mucosa. Another example is SLC22A5, an important transporter in active cellular uptake of carnitine associated with spirometry indices27, which was correlated with an increased risk of both asthma and GERD in esophageal muscularis (Fig. 3C). WDR18 and HMHA1 were causally associated with both CD and COA in blood. Interestingly, other genes in the same LD block as WDR18 (TMEM259, CNN2, ABCA7, POLR2E and MIDN) were specific to COA risk but not to CD, whereas STK11 was specific to CD, suggesting that multiple altered gene expressions might contribute to COA and CD onset at this genomic region (Supplementary Data 5).
Disease-associated genetic factors converged on common cell types across AD-GID
To determine whether ADs and GIDs converge on shared molecular basis, we performed association analysis between the GWAS signals of ADs and GIDs with gene expressions of a variety of cell types from four relevant tissues. In PBMCs, we observed a significant enrichment of ADs- (asthma, AOA, COA, allergy) and GIDs- (CD, UC) -associated loci in T cells, including γδ T cells, cytotoxic CD4+ T cells and central memory CD4+ T cells (FDR < 0.05, Fig. 4A). Considering the disease-relevant tissues differ between ADs and GIDs, we extended the analysis to scRNAseq data from lung, airway, colonic and ileal tissues. Strikingly, asthma, AOA, COA, allergy, hay fever, CD, UC and celiac disease showed consistently stronger enrichment in T cells across airway, colonic and ileal tissues (FDR < 0.05, Fig. 4B–D). For example, these disease-associated genetic factors were enriched in regulatory T cells, NK cells and dendritic cells in lung. In colon and ileum, CD4+ T cells, CD8+ T cells/NK-like cells and T reg cells were predominant. Moreover, B cells, plasma cells and monocytes were identified to be associated with these diseases in airway tissue (FDR < 0.05, Fig. 4E). These findings indicate that disease-associated genetics of ADs-GIDs captured common immune signatures across distinct tissues, pinpointing the potentially central role of CD4+ T cells in a shared molecular background.
The CELLECT method was used to associate SNPs derived from GWAS summaries with 178 different cell types from four tissues (lung, airway, colon and terminal ileum). Panels A–E demonstrate representative examples of shared cell type (T cells, B cells, NK cells, monocytes) enrichments across asthma, AOA, COA, hay fever, allergy, CD, UC and celiac disease in PBMCs and lung, airway and gut tissues (colon and terminal ileum). Red color indicated the significance at FDR < 0.05 level while yellow indicates a more lenient threshold with nominal P < 0.05. Full summaries are presented in Supplementary Figs. 2–8 and Supplementary Data 6.
Disease-specific cell-type enrichment was also observed. For instance, CRC-associated loci converged more prominently on NK cells and memory B cells in blood. Lung function-related traits, including FEV1, FVC and FEV1/FVC ratio, were enriched in fibroblasts across lung, colon and ileum, suggesting the specificity of disease- and tissue characteristics. Full summaries are provided in Supplementary Figs. 2–8 and Supplementary Data 6.
Activation of WDR18 and GPX4 in CD4+ T cells predicted an increased risk of CD and asthma
Accumulating evidence indicates that eQTLs can exert different effects across cell types28,29, which might confound bulk eQTLs. Therefore, we further integrated sc-eQTLs to identify comorbidity-associated genes at the single-cell level. Five genes were identified with evidence for disease causality at single-cell resolution (FDRSMR < 0.05) (Fig. 5A, Supplementary Data 7). Strikingly, WDR18 and GPX4 expressed in blood CD4+ T cells were associated with FEV1, FVC or MtoS asthma. When we adopted a more lenient threshold of PSMR < 0.05, higher expression of both genes was also associated with increased risk of CD, asthma or MtoS asthma. Colocalization analysis further supported that the expression of WDR18 potentially shared the same causal variant with CD and asthma in naïve central memory CD4+ T cells (PPH4 > 0.8, Supplementary Data 8), which complemented the discoveries from SMR and scRNAseq data, presenting additional evidence for the key role of CD4+ T cells in mediating CD and asthma onset. However, the causal effect of GPX4 was disease- and cell-type specific, indicating different roles of GPX4 across CD4+ T cell subtypes.
A Illustrative plot of the five SMR-identified candidate genes presenting associations with ADs or GIDs or both, expressed in CD4+ T cells, CD8+ T cells, NK cells or monocytes. Solid lines indicate significance with FDR < 0.05 while dashed lines indicate a more lenient significance with nominal P < 0.05. B Schematic visualization of the overlaps between predicted candidate genes from bulk- and sc-eQTLs and three public drug target databases, including DGIdb, DrugBank and OpenTargets. Druggable candidates were defined as genes that are well-established drug targets.
Druggability of candidate causal genes
An interrogation of three drug databases and the 14 SMR-identified pleiotropic genes revealed that three genes were well-established drug targets, including SLC22A5 (levocarnitine, treatment of carnitine deficiency), GM2A (Choline alfoscerate, treatment of neurodegenerative and vascular diseases) and ARHGEF28 (methylphenidate, treatment of attention deficit hyperactivity disorder), indicating potential candidates for drug repurposing (Fig. 5B, Supplementary Data 9). In addition, seven genes identified by SMR that were only related to one type of disease, have also been reported as drug targets (Supplementary Data 9). Moreover, the pleiotropic effects of WDR18 and GPX4 suggested that suppressing their expressions could be new therapeutic strategies for AD-GID comorbidities.
Causality between ADs and GIDs
Shared genetic architecture could increase the risk of one phenotype, the onset of which causes another30. This phenomenon can be examined by causal inference. To test this hypothesis, we further applied primary and complementary Mendelian Randomization (MR) approaches for each pair of AD and GID (Fig. 6A). In total, five pairs of AD-GID showed significant unidirectional causality and one pair showed bi-directional causality with FDRIVW < 0.05 (Fig. 6B).
A Schematic workflow of MR analysis (Methods), including IVs selection and quality controls, primary and complementary MR approaches, and sensitivity analysis. B A total of five unidirectional causal relationships with FDRIVW < 0.05 were identified, and one significant bi-directional causality between GERD and asthma (both directional FDRIVW < 0.05). All results shown passed quality control and sensitivity analysis. NS, non-significant.
When GIDs were treated as exposures and ADs as outcomes, we observed that GERD increased the risk of AOA (OR = 1.18) and was associated with reduced FVC (OR = 0.92). CD decreased the risk of allergy (OR = 0.97). Conversely, when ADs were treated as exposures and GIDs as outcomes, asthma slightly reduced the risk of CD (OR = 0.86). Moreover, allergy was a potential causal factor of IBS (OR = 1.07).
Interestingly, a significant causal association was identified of asthma on the risk of GERD (OR = 1.04, 95% CI, 1.02–1.06) while GERD also increased the risk of asthma (OR = 1.18, 95% CI, 1.08–1.29). These findings indicate the causal relationships between ADs and GIDs which might partially explain comorbidities. Full summaries of IV quality controls, results and sensitivity analyses are provided in Supplementary Data 10, 11.
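As an illustration of how an inverse-variance weighted (IVW) estimate such as those reported above translates into an odds ratio with a 95% confidence interval, a minimal Python sketch is given here. It assumes a fixed-effect IVW estimator with independent instruments and an outcome effect on the log-odds scale; the function names and the toy per-SNP numbers are hypothetical and do not come from the study.

```python
import math
from typing import List, Tuple


def ivw_estimate(beta_exposure: List[float],
                 beta_outcome: List[float],
                 se_outcome: List[float]) -> Tuple[float, float]:
    """Fixed-effect IVW estimate of the exposure-to-outcome effect.

    Each SNP contributes the ratio beta_outcome / beta_exposure, weighted by
    beta_exposure^2 / se_outcome^2 (assumption: uncorrelated instruments).
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    numerator = sum(bx * by / (se * se)
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    total_weight = sum(weights)
    beta = numerator / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log-odds estimate into an OR with a 95% confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Toy instruments (hypothetical values, for illustration only)
beta, se = ivw_estimate([0.1, 0.2], [0.02, 0.04], [0.01, 0.01])
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta, se)
```

In practice such estimates are complemented by the sensitivity analyses mentioned in the text, which this sketch does not cover.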
Discussion
In this study, we identified significant genome-wide correlations between GERD and IBS with ADs. Instead, the associations between CD, UC, celiac and diverticular diseases with ADs were locally driven by distinct genomic regions. Following integration of GWAS summaries with bulk-eQTL analysis of nine distinct tissues, 14 key genes were prioritized and involved in comorbidity. Despite the varied genetic correlation patterns, cell type enrichment analysis using scRNA-seq data identified CD4+ T cells as shared etiologic cells between AD and GID across four relevant tissues. We further projected the candidate key genes on a single-cell level and found causal evidence for five genes (CDC42SE2, WDR18, GPX4, GM2A, and AFF4) expressed in CD4+ T cells, CD8+ T cells, NK cells and monocytes. Moreover, three genes were established druggable targets, emphasizing their potential therapeutic amenability. Finally, nine pairs of ADs-GIDs were suggested to present causal relations with one significant bi-directional relationship found between GERD and asthma.
Until now, comorbidity between ADs and GIDs has largely been investigated through epidemiological work. Most of the genome-wide correlations in our study were consistent with epidemiological evidence, e.g., both GERD and IBS were correlated with multiple asthma-related traits11,31. However, we did not identify significant global genetic correlations between IBD and asthma, which frequently co-occur according to epidemiological data8,9. Considering that the global genetic correlations only capture the average of the associations across the genome, there is still the possibility that local genetic correlations exist in the absence of global relations32. Indeed, the majority of locally correlated loci was not detected in global co-occurrence analysis (LDSC), which was exemplified by the identification of a locus that was genetically correlated between CD and COA, while the genes from this locus were enriched in pathways relating to secretory functions and leukocyte activation, which may form the common genetic basis of CD and COA within this specific region.
Pleiotropic genomic regions might imply conserved genetic properties during evolution and thus contain important genes involved in cross-disease causality26. We identified 14 genes that showed potential causality on both ADs and GIDs on bulk tissue level and further projected these at single-cell level with five genes. WDR18 was causally associated with both CD and COA in CD4+ T cells and CD8+ T cells of PBMCs, and encodes a member of the WD40-repeat protein family, being related to DNA damage checkpoint signaling33. Elevated expression of GPX4 mediated higher risk of asthma, chronic gastritis and CD, and suggested the dysregulation of CD4+ and CD8+ T cells. GPX4 has been shown to restrict cytokine responses of small intestinal epithelial cells (IECs), and mice lacking one allele of Gpx4 in IECs can develop mucosal inflammation, resembling CD34. However, the role of GPX4 in asthma onset needs further investigations. Nevertheless, highly consistent patterns of enriched immune cells, especially for CD4+ T cells, were found across asthma, AOA, COA, allergy, CD, UC and celiac disease. This consistency did not largely differ across tissues, indicating the systematic nature of immune perturbations in these diseases. Taken together, these findings revealed that the genetics of some ADs and GIDs converge on shared cell types, especially for those immune-related diseases, implying common immune responses underlying these diseases. A more comprehensive discussion on candidate genes is provided in the Supplementary discussion.
MR analysis was used to infer whether altered gene expression implies causality for diseases and thus may accelerate the development or repurposing of drugs35. Three genes identified from SMR, including GM2A, SLC22A5 and ARHGEF28, were well-established drug targets for neurodegenerative or metabolic diseases. GM2A is a target of choline alfoscerate, which is used in the treatment of neurodegenerative and vascular diseases. Levocarnitine, targeting SLC22A5, was developed for carnitine deficiency. Methylphenidate, targeting ARHGEF28, is FDA-approved for treating attention deficit hyperactivity disorder (ADHD). The pleiotropic effects of these genes further suggest the feasibility of drug repurposing for asthma, impaired lung function, celiac disease, CD and diverticular disease. Moreover, higher expression of WDR18 and GPX4 increased the risk of CD, COA, asthma or chronic gastritis. WDR18 has been described as putatively involved in the differentiation and self-renewal of intestinal stem cells36. Recently, GPX4 was identified as a central hub gene among autophagy- and ferroptosis-related genes that were associated with postoperative recurrence in patients with CD who underwent ileocecal resection37. Colocalization analysis further supported that these genes shared causal variants with AD-GID pairs or individual diseases, indicating that inhibiting these genes might be a novel therapeutic strategy for CD and asthma. However, further studies investigating the roles of these genes across different cell types, followed by cell-specific drug targeting approaches, are warranted.
Shared genetic factors have been shown to be involved in cross-disease causality, which is related to comorbidities30,38. A bi-directional causal effect between GERD and asthma was observed in our analysis, which was consistent with a study by Ahn et al.16, indicating that the comorbidity of the two diseases was partially due to mutual causality. In addition, we identified an inverse association between the risk of asthma and CD, which was also reported by Freuer et al.39. These previously reported findings support the robustness of our study. By incorporating a more comprehensive set of phenotypes for respiratory traits, we also uncovered novel leads of causality between ADs and GIDs. For example, GERD could be an important predictor of reduced spirometry indices and high risk of AOA, and allergy showed a positive causal effect on IBS, calling for healthcare providers to monitor whether ADs develop in patients who have manifested GIDs, or vice versa.
Other underlying mechanisms may also explain the link between ADs and GIDs in addition to shared genetics. Both the airway and GI tracts arise from the foregut40,41, are located in close proximity, and share similar physiological structures of epithelial tissues. Moreover, it has been reported that early life environmental exposures are associated with a predisposition towards atopic diseases and with changes in the intestinal microbiota42,43, and studies have also suggested a cross-talk between the intestinal and airway microbiota compartments44,45, which may explain the mechanism of the airway-gut axis.
The main strengths of this study include (1) novel genetic evidence that may explain comorbidities of ADs and GIDs, with distinct genetic correlation patterns identified between AD-GID pairs, (2) the identification of potentially shared etiologic cell types and genes by integration of a large number of bulk- and sc-RNAseq datasets for drug repurposing, and (3) the observation that disease-to-disease causality might participate in comorbidities. On the other hand, some limitations warrant recognition. (1) This study is restricted to European populations, limiting external generalizability of our results to other ethnicities. (2) Although we integrated the largest publicly available eQTL datasets, more layers of omics data such as proteins and metabolites could have deepened the understanding of the functional consequences of the observed key genes. (3) The genetic correlations and MR-based causalities reported in this study depend on the sample sizes of the original datasets, which might affect our observations. Therefore, follow-up studies with larger datasets and greater statistical power are required to validate the robustness of these findings.
In conclusion, in this study we constructed a comprehensive atlas of the shared genetic architecture between ADs and GIDs, showing that the correlated genomic loci converge on dysregulation of CD4+ T cells. More importantly, we uncovered underlying pleiotropic genes which are relevant to current drug re-evaluation efforts and the development of novel therapeutic targets. The potential causalities between a group of AD-GID pairs may inform clinical prevention strategies to reduce the incidence of these comorbidities.
Methods
GWAS datasets
The criteria for selecting disease datasets were: (1) the presence of reported comorbidities of AD-GID14,46,47,48,49,50 (Supplementary Materials), and (2) public data availability. In total, seven atopic diseases (general asthma, hay fever, eczema, allergy) and disease subtypes (childhood-onset asthma (COA), adult-onset asthma (AOA), moderate to severe asthma (MtoS asthma)), three lung function measurements (forced expiratory volume in one second (FEV1), forced vital capacity (FVC) and FEV1/FVC) related to asthma, and 11 GI diseases (Crohn’s disease (CD), ulcerative colitis (UC), colorectal cancer (CRC), celiac disease, chronic gastritis, diverticular disease, esophageal cancer, gastric cancer, gastroesophageal reflux disease (GERD), irritable bowel syndrome (IBS) and peptic ulcer) were selected, and all samples of the selected studies were from European populations, as previously described in the datasets (Table 1). The full GWAS summaries were obtained from the MRC IEU OpenGWAS database (https://gwas.mrcieu.ac.uk/) and the GWAS Catalog (https://www.ebi.ac.uk/). The data were quality controlled by (1) removal of SNPs with MAF < 1%, and (2) allele harmonization using the R package SNPlocs.Hsapiens.dbSNP144.GRCh37 (v. 0.99.20).
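The two quality-control steps above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which relied on the R package named above; the record fields (`snp`, `ea`, `oa`, `eaf`, `beta`), the `reference` lookup, and the convention that the effect allele is aligned to the reference's alternate allele are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

MAF_THRESHOLD = 0.01  # SNPs with MAF < 1% are removed, as stated in the text


def minor_allele_frequency(eaf: float) -> float:
    """Return the minor allele frequency given the effect allele frequency."""
    return min(eaf, 1.0 - eaf)


def harmonize(record: Dict, reference: Dict[str, Tuple[str, str]]) -> Optional[Dict]:
    """Align one GWAS record to a (ref, alt) allele pair, or return None to drop it.

    Hypothetical convention: after harmonization the effect allele equals alt.
    """
    if record["snp"] not in reference:
        return None
    ref, alt = reference[record["snp"]]
    if (record["ea"], record["oa"]) == (alt, ref):
        return dict(record)  # already aligned
    if (record["ea"], record["oa"]) == (ref, alt):
        flipped = dict(record)
        flipped["ea"], flipped["oa"] = alt, ref
        flipped["beta"] = -record["beta"]      # effect sign flips with the allele
        flipped["eaf"] = 1.0 - record["eaf"]   # frequency now refers to the new effect allele
        return flipped
    return None  # alleles do not match the reference


def quality_control(records: List[Dict],
                    reference: Dict[str, Tuple[str, str]]) -> List[Dict]:
    """Apply the MAF filter, then allele harmonization, to a list of records."""
    kept = []
    for record in records:
        if minor_allele_frequency(record["eaf"]) < MAF_THRESHOLD:
            continue
        aligned = harmonize(record, reference)
        if aligned is not None:
            kept.append(aligned)
    return kept
```

Strand-ambiguous (palindromic) SNPs are not handled in this sketch and would need a separate rule.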
Global and local genetic correlations
We used linkage disequilibrium score regression (LDSC) (https://github.com/bulik/ldsc) to estimate the SNP-based heritability (h2) of each of the 21 traits. Pre-computed LD scores from 1000 Genomes Project European (1000 G) data were used to estimate global genetic correlations (rg) between each pair of AD-GID, representing shared genetic factors that are not influenced by environmental confounders. In addition, the intercepts and standard errors were also obtained from LDSC as an estimate of sample overlap. Significance was determined by the Benjamini-Hochberg (BH) procedure, considering the total number of tested disease pairs.
Local Analysis of [co]Variant Association (LAVA) (https://github.com/josefin-werme/LAVA) was used to test local genetic correlations of 2495 independent genomic loci as defined by Werme et al.32. For each pair of traits, we selected only the loci significantly associated with both traits in univariate analysis. The final significance was determined by FDR < 0.05 (Benjamini-Hochberg procedure, considering the total number of trait-trait pair-wise tests).
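Both the global and local correlation analyses rely on the Benjamini-Hochberg procedure to control the false discovery rate. A minimal, dependency-free Python sketch of the standard BH adjustment is shown here for reference; the function name and the example P values are illustrative only and are not taken from the study.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return BH-adjusted P values in the same order as the input."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Illustrative P values for four hypothetical trait pairs
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in fdr]
```

The number of tests `m` here corresponds to the total number of trait pairs tested, as described in the text.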
Summary-data-based Mendelian randomization analysis with bulk- and sc-eQTLs
To prioritize pleiotropic genes within genetically correlated genomic regions, we obtained expression quantitative trait loci (eQTL) data which assessed the genetic effects on the transcriptome, including bulk eQTLs of blood from eQTLGen datasets (n = 31,864) and bulk eQTLs of lung, esophagus-gastroesophageal junction, esophagus-mucosa, esophageal muscularis, stomach, ileum, transverse colon and sigmoid colon from the Genotype-Tissue Expression (GTEx) project (n = 860). In addition, sc-eQTLs (peripheral blood mononuclear cells, PMBCs) from Onek1k project51 (n = 982, cell count =1,267,758) were applied to project the candidate genes to single-cell level.
Summary-data-based Mendelian randomization (SMR) multi-tools52 was used to test whether the effects of SNPs on the phenotype were mediated by gene expression. SNPs were treated as instruments, gene expression as the exposure and diseases as the outcomes. The 1000 G reference was used to calculate LD. A two-step SMR was conducted: in step 1, the SMR test was run on the eQTL data and one trait of an AD-GID pair; in step 2, it was run on the other trait53. Comorbidity-associated candidate genes were required to (1) reach suggestive genome-wide significance (P < 1 × 10−5) in both the eQTL and GWAS results54,55, (2) have FDR < 0.05 (BH method, considering the total number of trait-gene pairs) in at least one AD-GID pair, and (3) show no heterogeneity in the heterogeneity in dependent instruments (HEIDI) test (P > 0.05).
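For readers unfamiliar with the SMR test, the following Python sketch computes the single-instrument SMR estimate and its approximate chi-square statistic as described by Zhu et al.127. It is an illustration only: the function name and the input numbers are invented, the study itself ran the SMR software, and the HEIDI test is not reproduced here.

```python
import math


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """SMR estimate for one top cis-eQTL.

    b_zx, se_zx: SNP effect on gene expression (eQTL) and its standard error.
    b_zy, se_zy: SNP effect on the trait (GWAS) and its standard error.
    Returns (b_xy, t_smr, p_value).
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    b_xy = b_zy / b_zx                      # effect of expression on the trait
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Invented numbers: eQTL z-score of 10, GWAS z-score of 5.
b_xy, t_smr, p_smr = smr_test(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02)
print(b_xy, t_smr, p_smr)
```

In the two-step design described above, a gene would be tested once per trait of an AD-GID pair and kept only if it passed the thresholds for both traits.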
As a complementary analysis to SMR, we used colocalization analysis (coloc R package with default parameters)56 to test whether gene expression and disease shared causal variants. This analysis further helped to (1) assess the validity of the instrumental variable assumptions and (2) prioritize the most likely therapeutic targets within the same locus causally associated with disease57.
Cell type enrichment
To explore the etiologic cell types underlying the genetic factors, we used CELL-type Expression-specific integration for Complex Traits (CELLECT) analysis to associate the GWAS signals with sc-RNAseq data from four tissues: PBMC (Onek1k project)51, lung tissue58, airway tissue59 and gut60 (terminal ileum and colon). The MHC region was excluded and only healthy individuals were kept in this analysis. In total, 178 cell types from the four tissues were included, and significance was determined with the BH approach at FDR < 0.05, considering the total number of trait-cell type pairs.
Pathway enrichment
We used gProfiler (https://biit.cs.ut.ee/gprofiler/gost) for functional enrichment analyses (WikiPathways, GO and KEGG) and reported the significant pathways (BH-adjusted P < 0.05).
Druggability of candidate causal genes
To assess whether the SMR-selected bulk and sc-eQTL genes were potential drug targets, we overlapped the candidate genes with DGIdb, DrugBank and OpenTargets (https://www.dgidb.org/, https://www.drugbank.ca/, www.opentargets.org).
Trait pair-wise bi-directional MR analysis
Shared genetic architecture could increase the risk of one phenotype, the onset of which causes another30. This possibility can be examined by causal inference. To test this hypothesis, we performed MR analysis for each AD-GID pair. The instrumental variables (IVs) were selected and quality controlled according to stringent criteria61, as follows. (1) Independent SNPs were selected from the GWAS summary statistics of the exposure using the clump function in PLINK 2.0 (https://www.cog-genomics.org/plink/2.0/), with the European population of the 1000 G dataset as the LD reference. The clumping parameters were an r2 threshold of 0.001, a window size of 1 Mb, minor allele frequency (MAF) > 0.05 and a significance threshold of P < 5e−08 in each GWAS. (2) A heterogeneity test was performed to detect IV outliers using the ivw_radial (alpha = 0.05, weights = 1, tol = 0.0001) and egger_radial (alpha = 0.05, weights = 1) functions in the RadialMR R package (https://github.com/WSpiller/RadialMR). IVs with a nominally significant P value < 0.05 in either approach were defined as outliers and removed. (3) IVs that explained more variance in the outcome than in the exposure were also excluded (nominal P < 0.05 with Steiger’s test). (4) Finally, we calculated F-statistics and retained only IVs with F-statistics > 10. All IVs were harmonized with the SNPs from the outcome trait (Fig. 6A).
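The threshold-based parts of this selection can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the per-SNP F-statistic is approximated as (beta/se)^2, which is a common shortcut and may differ from the formula the authors used, and the record layout and toy SNPs are invented. LD clumping, RadialMR outlier removal and Steiger filtering are not reproduced.

```python
def f_statistic(beta, se):
    """Approximate per-SNP instrument strength as the squared z-score."""
    return (beta / se) ** 2


def select_instruments(snps, p_max=5e-8, maf_min=0.05, f_min=10.0):
    """Keep SNPs passing the significance, MAF and F-statistic thresholds.

    Each SNP is a dict with the (hypothetical) keys rsid, beta, se, pval, maf,
    all taken from the exposure GWAS.
    """
    kept = []
    for snp in snps:
        if snp["pval"] >= p_max:
            continue
        if snp["maf"] <= maf_min:
            continue
        if f_statistic(snp["beta"], snp["se"]) <= f_min:
            continue
        kept.append(snp)
    return kept


# Invented exposure summary statistics.
candidates = [
    {"rsid": "rsA", "beta": 0.08, "se": 0.01, "pval": 1e-15, "maf": 0.20},
    {"rsid": "rsB", "beta": 0.03, "se": 0.01, "pval": 2.7e-3, "maf": 0.30},
    {"rsid": "rsC", "beta": 0.09, "se": 0.01, "pval": 1e-18, "maf": 0.02},
]
instruments = select_instruments(candidates)
print([snp["rsid"] for snp in instruments])
```

Under this approximation a SNP with P < 5e−08 has a squared z-score of roughly 30, so the F > 10 filter mainly acts as a safeguard when other F-statistic formulas or thresholds are used.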
Three two-sample MR approaches were used to explore potential causal relationships across AD-GID pairs: inverse variance weighting (IVW) as the primary analysis and two complementary approaches, MR-Egger and weighted median, which relax certain MR assumptions. IVW treats each valid SNP as independent, estimates a Wald ratio for each SNP and then meta-analyzes the ratios under a fixed-effects model. The weighted median takes the weighted median rather than the weighted mean of the SNP ratios and can identify true causality if no more than 50% of the weight comes from invalid SNPs. MR-Egger further allows for directional (i.e. non-zero mean) uncorrelated pleiotropy by adding an intercept to the IVW regression, which excludes confounding from such pleiotropy. All methods were implemented with the mr_ivw, mr_weighted_median and mr_egger_regression functions in the TwoSampleMR R package (v0.4.26).
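To make the three estimators concrete, the sketch below implements them in plain Python from per-SNP summary statistics. It follows the textbook definitions (fixed-effect IVW with first-order weights, interpolated weighted median, weighted regression with intercept for MR-Egger) and is not a reimplementation of the TwoSampleMR functions, whose defaults may differ. The data are invented: four SNPs with a true ratio of 0.5 and one pleiotropic SNP with a ratio of 1.5.

```python
import math


def wald_ratios(bx, by, se_y):
    """Per-SNP Wald ratios and first-order inverse-variance weights."""
    ratios = [y / x for x, y in zip(bx, by)]
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]
    return ratios, weights


def ivw(bx, by, se_y):
    """Fixed-effect inverse variance weighted estimate and standard error."""
    ratios, weights = wald_ratios(bx, by, se_y)
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


def weighted_median(bx, by, se_y):
    """Weighted median of the Wald ratios, interpolated at the 50th percentile."""
    ratios, weights = wald_ratios(bx, by, se_y)
    pairs = sorted(zip(ratios, weights))
    total = sum(w for _, w in pairs)
    positions, cumulative = [], 0.0
    for _, w in pairs:
        cumulative += w
        positions.append((cumulative - 0.5 * w) / total)
    if positions[0] >= 0.5:
        return pairs[0][0]
    for k in range(1, len(pairs)):
        if positions[k] >= 0.5:
            frac = (0.5 - positions[k - 1]) / (positions[k] - positions[k - 1])
            return pairs[k - 1][0] + frac * (pairs[k][0] - pairs[k - 1][0])
    return pairs[-1][0]


def mr_egger(bx, by, se_y):
    """Weighted regression of outcome on exposure effects with an intercept."""
    x = [abs(v) for v in bx]                       # orient exposure effects positive
    y = [v if b >= 0 else -v for b, v in zip(bx, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, ybar - slope * xbar              # (causal slope, intercept)


# Invented data: the last SNP is pleiotropic (ratio 1.5 instead of 0.5).
bx = [0.10, 0.20, 0.15, 0.30, 0.10]
by = [0.05, 0.10, 0.075, 0.15, 0.15]
se_y = [0.01, 0.02, 0.015, 0.02, 0.01]
print("IVW:", ivw(bx, by, se_y)[0])
print("Weighted median:", weighted_median(bx, by, se_y))
print("MR-Egger (slope, intercept):", mr_egger(bx, by, se_y))
```

In this toy example the single invalid SNP pulls the IVW estimate away from 0.5, while the weighted median stays at the value shared by the valid SNPs.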
The MR results were further verified by sensitivity analyses (Fig. 6A). First, leave-one-out analysis was used to examine whether the causal association was driven by a single SNP. Second, MR-Egger regression was conducted to test for potential bias from directional pleiotropy, represented by the intercept. Third, the MR-PRESSO approach was used to test for horizontal pleiotropy62. Fourth, we performed the Cochran Q test for heterogeneity.
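Two of these checks, Cochran's Q and leave-one-out, are simple enough to sketch in Python. The code below is an illustration with invented data, not the authors' implementation. It returns the Q statistic and its degrees of freedom without a p-value, and it uses the fixed-effect IVW estimate with first-order weights as an assumption.

```python
def ivw_estimate(bx, by, se_y):
    """Fixed-effect IVW estimate from per-SNP summary statistics."""
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]
    ratios = [y / x for x, y in zip(bx, by)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def cochran_q(bx, by, se_y):
    """Cochran's Q around the IVW estimate, with its degrees of freedom."""
    estimate = ivw_estimate(bx, by, se_y)
    q = sum(
        ((x / s) ** 2) * (y / x - estimate) ** 2
        for x, y, s in zip(bx, by, se_y)
    )
    return q, len(bx) - 1


def leave_one_out(bx, by, se_y):
    """IVW estimate recomputed with each SNP removed in turn."""
    return [
        ivw_estimate(bx[:i] + bx[i + 1:], by[:i] + by[i + 1:], se_y[:i] + se_y[i + 1:])
        for i in range(len(bx))
    ]


# Invented data: the last SNP has a Wald ratio of 1.5, the others 0.5.
bx = [0.10, 0.20, 0.15, 0.30, 0.10]
by = [0.05, 0.10, 0.075, 0.15, 0.15]
se_y = [0.01, 0.02, 0.015, 0.02, 0.01]

q_stat, q_df = cochran_q(bx, by, se_y)
loo = leave_one_out(bx, by, se_y)
print("Q =", q_stat, "on", q_df, "df")
print("Leave-one-out estimates:", loo)
```

Here the large Q relative to its degrees of freedom flags heterogeneity, and the leave-one-out estimate returns to 0.5 only when the last SNP is removed, which is the pattern expected when one SNP drives the result.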
We applied the MR analysis bi-directionally, from ADs to GIDs and from GIDs to ADs. A causal estimate was considered significant if it (1) had FDR (BH approach) < 0.05 with the primary IVW method and nominal P < 0.05 with the MR-Egger or weighted median method, and (2) passed all sensitivity checks with nominal P > 0.05.
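Read as a decision rule, these criteria can be written as a small Python function. This is one interpretation of the text, not the authors' code: the function name and arguments are hypothetical, and it assumes each sensitivity check is summarized by a single p-value (for example the MR-Egger intercept, MR-PRESSO and Cochran Q tests).

```python
def is_significant(ivw_fdr, egger_p, wmedian_p, sensitivity_pvalues, alpha=0.05):
    """Apply the two significance criteria to one exposure-outcome pair."""
    primary = ivw_fdr < alpha                                # FDR of the IVW estimate
    supported = egger_p < alpha or wmedian_p < alpha         # either complementary method
    robust = all(p > alpha for p in sensitivity_pvalues)     # no sensitivity test rejects
    return primary and supported and robust


# Invented values for three hypothetical trait pairs.
print(is_significant(0.01, 0.20, 0.03, [0.40, 0.60, 0.30]))
print(is_significant(0.01, 0.20, 0.30, [0.40, 0.60, 0.30]))
print(is_significant(0.01, 0.01, 0.03, [0.40, 0.02]))
```

The second pair fails because neither complementary method supports the IVW result, and the third fails because one sensitivity test is nominally significant.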
To control for potential confounders associated with both GIDs and ADs, we further adopted a multivariable MR (MVMR) approach, adjusting for genetically determined BMI and smoking63. The GWAS summary statistics for BMI and smoking were obtained from Pulit et al.64 and Karlsson Linnér et al.65, respectively. Causal relationships with an MVMR P value < 0.05 were finally reported.
Statistics and reproducibility
Statistical significance was determined by FDR correction for multiple testing (FDR < 0.05). All data used in this study are publicly available, and the analysis code has been provided to ensure reproducibility.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The description of all the GWAS summaries is provided in Table 1. The origins of sc-RNA seq data, bulk-eQTL and sc-eQTL data are described in Methods and Supplementary Materials. Numerical source data for figures in the manuscript can be found in Supplementary Data 1. All other data are available from the corresponding author upon reasonable request.
Code availability
All analytic code used for this study can be found at the following link: https://github.com/EMO-Consortium/GeneticCorrelation/.
References
Dierick, B. J. H. et al. Burden and socioeconomics of asthma, allergic rhinitis, atopic dermatitis and food allergy. Expert Rev. Pharmacoecon Outcomes Res. 20, 437–453 (2020).
Asher, M. I. et al. Worldwide time trends in the prevalence of symptoms of asthma, allergic rhinoconjunctivitis, and eczema in childhood: ISAAC Phases One and Three repeat multicountry cross-sectional surveys. Lancet (Lond., Engl.) 368, 733–743 (2006).
Brew, B. K. et al. Paediatric asthma and non-allergic comorbidities: A review of current risk and proposed mechanisms. Clin. Exp. Allergy 52, 1035–1047 (2022).
GBD 2015 Chronic Respiratory Disease Collaborators. Global, regional, and national deaths, prevalence, disability-adjusted life years, and years lived with disability for chronic obstructive pulmonary disease and asthma, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet Respir. Med. 5, 691–706 (2017).
Law, A. W., Reed, S. D., Sundy, J. S. & Schulman, K. A. Direct costs of allergic rhinitis in the United States: estimates from the 1996 Medical Expenditure Panel Survey. J. Allergy Clin. Immunol. 111, 296–300 (2003).
Ho, S.-W., Lin, C.-P. & Ku, M.-S. The impact of allergic rhinitis on gastrointestinal disorders among young adults. J. Eval. Clin. Pr. 26, 242–247 (2020).
Bekić, S. et al. Atopic dermatitis and comorbidity. Healthc. (Basel) 8, 70 (2020).
Bernstein, C. N., Wajda, A. & Blanchard, J. F. The clustering of other chronic inflammatory diseases in inflammatory bowel disease: a population-based study. Gastroenterology 129, 827–836 (2005).
Soh, H. et al. Atopic diseases are associated with development of inflammatory bowel diseases in korea: a nationwide population-based study. Clin. Gastroenterol. Hepatol. 19, 2072–2081.e6 (2021).
Rossi, C. M., Lenti, M. V., Merli, S., Santacroce, G. & Di Sabatino, A. Allergic manifestations in autoimmune gastrointestinal disorders. Autoimmun. Rev. 21, 102958 (2022).
Hait, E. J. & McDonald, D. R. Impact of gastroesophageal reflux disease on mucosal immunity and atopic disorders. Clin. Rev. Allergy Immunol. 57, 213–225 (2019).
Jones, M. P., Walker, M. M., Ford, A. C. & Talley, N. J. The overlap of atopy and functional gastrointestinal disorders among 23,471 patients in primary care. Aliment Pharm. Ther. 40, 382–391 (2014).
Tambe, N. A. et al. Atopic allergic conditions and colorectal cancer risk in the Multiethnic Cohort Study. Am. J. Epidemiol. 181, 889–897 (2015).
Choi, Y. J. et al. Allergic diseases and risk of malignancy of gastrointestinal cancers. Cancers (Basel) 15, 3219 (2023).
Sandur, V. et al. Prevalence of gastro-esophageal reflux disease in patients with difficult to control asthma and effect of proton pump inhibitor therapy on asthma symptoms, reflux symptoms, pulmonary function and requirement for asthma medications. J. Postgrad. Med. 60, 282–286 (2014).
Ahn, K. et al. Mendelian randomization analysis reveals a complex genetic interplay among atopic dermatitis, asthma, and gastroesophageal reflux disease. Am. J. Respir. Crit. Care Med. 207, 130–137 (2023).
Sánchez-Valle, J. & Valencia, A. Molecular bases of comorbidities: present and future perspectives. Trends Genet. 39, 773–786 (2023).
Lees, C. W., Barrett, J. C., Parkes, M. & Satsangi, J. New IBD genetics: common pathways with other diseases. Gut 60, 1739–1753 (2011).
Sazonovs, A. et al. Large-scale sequencing identifies multiple genes and rare variants associated with Crohn’s disease susceptibility. Nat. Genet. 54, 1275–1283 (2022).
Andreoletti, G. et al. Exome analysis of patients with concurrent pediatric inflammatory bowel disease and autoimmune disease. Inflamm. Bowel Dis. 21, 1229–1236 (2015).
Tsuo, K. et al. Multi-ancestry meta-analysis of asthma identifies novel associations and highlights the value of increased power and diversity. Cell Genom. 2, 100212 (2022).
Eijsbouts, C. et al. Genome-wide analysis of 53,400 people with irritable bowel syndrome highlights shared genetic pathways with mood and anxiety disorders. Nat. Genet. 53, 1543–1552 (2021).
Li, L. et al. Disease risk factors identified through shared genetic architecture and electronic medical records. Sci. Transl. Med. 6, 234ra57 (2014).
Grotzinger, A. D. et al. Genetic architecture of 11 major psychiatric disorders at biobehavioral, functional genomic and molecular genetic levels of analysis. Nat. Genet 54, 548–559 (2022).
Mishra, A. et al. Stroke genetics informs drug discovery and risk prediction across ancestries. Nature 611, 115–123 (2022).
Romero, C. et al. Exploring the genetic overlap between twelve psychiatric disorders. Nat. Genet. 54, 1795–1802 (2022).
Tang, M. F. et al. Genetic effects of multiple asthma loci identified by genomewide association studies on asthma and spirometric indices. Pediatr. Allergy Immunol. 27, 185–194 (2016).
Nathan, A. et al. Single-cell eQTL models reveal dynamic T cell state dependence of disease loci. Nature 606, 120–128 (2022).
van der Wijst, M. G. P. et al. Single-cell RNA sequencing identifies celltype-specific cis-eQTLs and co-expression QTLs. Nat. Genet. 50, 493–497 (2018).
Davey Smith, G. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98 (2014).
Koloski, N. et al. Population based study: atopy and autoimmune diseases are associated with functional dyspepsia and irritable bowel syndrome, independent of psychological distress. Aliment Pharm. Ther. 49, 546–555 (2019).
Werme, J., van der Sluis, S., Posthuma, D. & de Leeuw, C. A. An integrated framework for local genetic correlation analysis. Nat. Genet. 54, 274–282 (2022).
Yan, S. & Willis, J. WD40-repeat protein WDR18 collaborates with TopBP1 to facilitate DNA damage checkpoint signaling. Biochem Biophys. Res. Commun. 431, 466–471 (2013).
Mayr, L. et al. Dietary lipids fuel GPX4-restricted enteritis resembling Crohn’s disease. Nat. Commun. 11, 1775 (2020).
Minikel, E. V., Painter, J. L., Dong, C. C. & Nelson, M. R. Refining the Impact of Genetic Evidence on Clinical Success (2023). http://medrxiv.org/lookup/doi/10.1101/2023.06.23.23291765.
Fan, Y. et al. Cullin 4b-RING ubiquitin ligase targets IRGM1 to regulate Wnt signaling and intestinal homeostasis. Cell Death Differ. 29, 1673–1688 (2022).
Verstockt, S. et al. OP01 Sequencing-based gene network analysis reveals a profound role for ferroptosis key gene GPX4 in post-operative endoscopic recurrence in Crohn’s disease. J. Crohn’s Colitis 17, i1–i3 (2023).
Hemani, G., Bowden, J. & Davey Smith, G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum. Mol. Genet. 27, R195–R208 (2018).
Freuer, D., Linseisen, J. & Meisinger, C. Asthma and the risk of gastrointestinal disorders: a Mendelian randomization study. BMC Med. 20, 82 (2022).
Morrisey, E. E. & Hogan, B. L. M. Preparing for the first breath: genetic and cellular mechanisms in lung development. Dev. Cell 18, 8–23 (2010).
Faure, S. & de Santa Barbara, P. Molecular embryology of the foregut. J. Pediatr. Gastroenterol. Nutr. 52, S2–S3 (2011).
Marsland, B. J., Trompette, A. & Gollwitzer, E. S. The Gut-Lung Axis in Respiratory Disease. Ann. Am. Thorac. Soc. 12, S150–S156 (2015).
Wang, Y. et al. Decoding microbial genomes to understand their functional roles in human complex diseases. Imeta 1, e14 (2022).
Dang, A. T. & Marsland, B. J. Microbes, metabolites, and the gut-lung axis. Mucosal Immunol. 12, 843–850 (2019).
Sun, W. et al. eHypertension: a prospective longitudinal multi-omics essential hypertension cohort. iMeta 1, e22 (2022).
Deshmukh, F., Vasudevan, A. & Mengalie, E. Association between irritable bowel syndrome and asthma: a meta-analysis and systematic review. Ann. Gastroenterol. 32, 570–577 (2019).
Clarke, K. & Chintanaboina, J. Allergic and immunologic perspectives of inflammatory bowel disease. Clin. Rev. Allergy Immunol. 57, 179–193 (2019).
Chiesa Fuxench, Z. C. et al. Risk of inflammatory bowel disease in patients with atopic dermatitis. JAMA Dermatol 159, 1085–1092 (2023).
Rittmeyer, D. & Lorentz, A. Relationship between allergy and cancer: an overview. Int Arch. Allergy Immunol. 159, 216–225 (2012).
Azouz, N. P. & Rothenberg, M. E. Mechanisms of gastrointestinal allergic disorders. J. Clin. Invest. 129, 1419–1430 (2019).
Yazar, S. et al. Single-cell eQTL mapping identifies cell type-specific genetic control of autoimmune disease. Science 376, eabf3041 (2022).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Wu, Y. et al. Integrative analysis of omics summary data reveals putative mechanisms underlying complex traits. Nat. Commun. 9, 918 (2018).
Liu, X. et al. Mendelian randomization analyses support causal relationships between blood metabolites and the gut microbiome. Nat. Genet. 54, 52–61 (2022).
Xu, S. et al. Oxidative stress gene expression, DNA methylation, and gut microbiota interaction trigger Crohn’s disease: a multi-omics Mendelian randomization study. BMC Med. 21, 179 (2023).
Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
Zuber, V. et al. Combining evidence from Mendelian randomization and colocalization: review and comparison of approaches. Am. J. Hum. Genet. 109, 767–782 (2022).
Adams, T. S. et al. Single-cell RNA-seq reveals ectopic and aberrant lung-resident cell populations in idiopathic pulmonary fibrosis. Sci. Adv. 6, eaba1983 (2020).
Deprez, M. et al. A single-cell atlas of the human healthy airways. Am. J. Respir. Crit. Care Med. 202, 1636–1645 (2020).
Kong, L. et al. The landscape of immune dysregulation in Crohn’s disease revealed through single-cell transcriptomic profiling in the ileum and colon. Immunity 56, 444–458.e5 (2023).
Burgess, S. et al. Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res. 4, 186 (2019).
Verbanck, M., Chen, C.-Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).
Clayton, G. L. et al. A framework for assessing selection and misclassification bias in mendelian randomisation studies: an illustrative example between body mass index and covid-19. BMJ 381, e072148 (2023).
Pulit, S. L. et al. Meta-analysis of genome-wide association studies for body fat distribution in 694 649 individuals of European ancestry. Hum. Mol. Genet. 28, 166–174 (2019).
Karlsson Linnér, R. et al. Genome-wide association analyses of risk tolerance and risky behaviors in over 1 million individuals identify hundreds of loci and shared genetic influences. Nat. Genet. 51, 245–257 (2019).
Acknowledgements
Research reported in this publication was supported by Natural Science Foundation of China (NSFC) under grant number 82302610 (C.Q), NSFC under grant number 82300623 (S.H), NSFC (81970483, 82170537 and 82222010) (R.M), Key-Area Research and Development Program of Guangdong Province (2023B1111040003), and National Key R&D Program of China (2023YFC2507300). ARB acknowledges support from the Netherlands Organization for Scientific Research (NWO Rubicon, 452022317).
Author information
Contributions
Conceptualization: S.H. and C.Q. Investigation: S.H., H.Z., C.Q., and A.R.B. Methodology: C.Q. and S.H. Supervision: S.H., H.Z., and A.R.B. Writing—original manuscript: C.Q., S.H., A.R.B. and A.L. Writing—review and editing: C.Q., A.L., F.S., Y.W., L.Z., C.T., R.F., R.M., M.C., L.C., G.K., A.B., H.Z., and S.H.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Tong Gong and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Rosie Bunton-Stasyshyn. A peer review file is available.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Qi, C., Li, A., Su, F. et al. An atlas of the shared genetic architecture between atopic and gastrointestinal diseases. Commun Biol 7, 1696 (2024). https://doi.org/10.1038/s42003-024-07416-7
DOI: https://doi.org/10.1038/s42003-024-07416-7