Introduction

Autoimmune diseases (ADs) affect approximately one in ten individuals worldwide, posing significant health risks as the immune system mistakenly attacks healthy tissues1. These conditions have a strong genetic component. Genome-wide association studies (GWAS) have identified thousands of genetic loci linked to disease susceptibility, significantly advancing our understanding of the underlying mechanisms2. However, most loci reside in non-coding genomic regions or even “gene deserts”, complicating the identification of functional regulatory variants and target genes. Additionally, high linkage disequilibrium (LD) among common genetic variants further obscures the identification of risk‑influencing variants, leaving the functional mechanisms underlying most associated loci largely uncharacterized3.

Genetic sharing among ADs is common, with nearly half of the AD-associated loci linked to multiple traits4,5. This sharing is often mistaken for common association mechanisms. Understanding the details of these shared genetic loci and their functional regulations can uncover both common and specific immunopathogenic mechanisms and pathways, aiding in identifying potential targets for drug repurposing or developing novel therapies. While several studies have explored shared genetic functions across ADs4,6,7, a detailed and systematic analysis is crucial to fully understand disease association mechanisms, including both shared and disease-specific functions or pathways.

In this study, we systematically analyzed the association signals of the 15 most common ADs. We integrated GWAS with comprehensive functional genomics data, including expression quantitative trait loci (eQTLs), chromatin accessibility and enhancer-gene promoter connections, to identify relevant cell types and target genes at AD-associated signals. Utilizing a protein-protein interaction (PPI) network, we explored common and specific functional modules across these diseases. Our results highlight the importance of identifying target genes, functional cell types, and regulatory mechanisms at the signal level for various ADs. These analyses provide insights into precision treatment and drug repurposing opportunities. Using the IL23R/IL12RB2 locus as an example, we demonstrate that genetic studies can accurately predict the efficacy and specificity of drugs targeting molecules in relevant pathways for the treatment of psoriasis and inflammatory bowel disease (IBD).

Methods

Defining loci and signals: data collection for 15 autoimmune diseases

Fifteen autoimmune diseases were selected for further analysis based on their prevalence, clinical significance, and the availability of genomic data in the GWAS Catalog (https://www.ebi.ac.uk/gwas/). For each disease, GWAS studies were prioritized according to cohort size when summary statistics were available, or by the number of reported associated variants when summary statistics were unavailable in the GWAS Catalog. To ensure data quality and reduce redundancy, GWAS studies with overlapping samples were assessed, and only the study with the largest cohort size or the highest number of reported significant variants was retained. The selected GWAS studies for these autoimmune diseases were subsequently utilized for downstream analyses (Table 1, Supplementary Table 1).

Table 1 The shared and disease-specific association loci, signals and target genes among the 15 autoimmune diseases

To define independent loci and signals across these ADs, we first identified all reported variants with a significance threshold of P < 10⁻⁵ for each study from the GWAS Catalog. We then delineated loci by extending a ±250 kb region around each reported variant. Overlapping loci were merged into a single locus. We excluded loci that had only one reported variant, were reported by a single study, and had no variant reaching genome-wide significance (P < 5 × 10⁻⁸). For each reported variant, we identified all tag SNPs in high LD with it (r² ≥ 0.8) within a ±500 kb window, based on 1000 Genomes Project Phase 3 data from either European or East Asian populations, using the LDlinkR package8. Variants in high LD (r² ≥ 0.8) within the same locus were grouped into a single association signal; accordingly, a locus may harbor multiple signals. Within each signal, the reported variant with the smallest p-value for a disease was designated the lead variant.
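The locus-definition step described above (extend each reported variant by ±250 kb, then merge overlapping windows) can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study; the function name, the tuple layout, and the example coordinates are assumptions.

```python
from typing import List, Tuple

Variant = Tuple[str, int]      # (chromosome, position in bp); hypothetical layout
Locus = Tuple[str, int, int]   # (chromosome, start, end)

def define_loci(variants: List[Variant], flank: int = 250_000) -> List[Locus]:
    """Extend each variant by +/- flank bp and merge overlapping windows."""
    windows = sorted(
        (chrom, max(0, pos - flank), pos + flank) for chrom, pos in variants
    )
    loci: List[Locus] = []
    for chrom, start, end in windows:
        if loci and loci[-1][0] == chrom and start <= loci[-1][2]:
            # Window overlaps the previous one on the same chromosome: extend it.
            prev = loci[-1]
            loci[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            loci.append((chrom, start, end))
    return loci

# Made-up coordinates: the first two chr1 variants are 300 kb apart, so their
# windows overlap and collapse into one locus.
example = [("chr1", 1_000_000), ("chr1", 1_300_000),
           ("chr1", 5_000_000), ("chr2", 1_100_000)]
loci = define_loci(example)
```

Sorting by chromosome and start position first means each window only needs to be compared with the most recently emitted locus.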

The Jaccard score (J = |A ∩ B| / |A ∪ B|), the number of items shared by two diseases divided by the number of items found in either, was computed for each pair of diseases to measure the fraction of overlapping loci and of overlapping signals, respectively.
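As a small illustration of the pairwise overlap measure, the sketch below computes the Jaccard score for every pair of diseases from sets of signal identifiers. The identifiers and set contents are invented for the example, and returning 0.0 for two empty sets is an assumption.

```python
from typing import Dict, Set, Tuple

def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection over size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical signal identifiers per disease.
signals: Dict[str, Set[str]] = {
    "SLE": {"s1", "s2", "s3", "s4"},
    "RA": {"s2", "s3", "s5"},
    "T1D": {"s6"},
}

# One score per unordered pair of diseases.
pairwise: Dict[Tuple[str, str], float] = {
    (d1, d2): jaccard(signals[d1], signals[d2])
    for d1 in signals for d2 in signals if d1 < d2
}
```

Here RA and SLE share two of five distinct signals, so their score is 2/5, while T1D shares nothing with either.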

Ranking association signals in each disease

To better understand the genes associated with each AD and their functional implications, we adopted a ranking scheme for the associated signals. For each GWAS study (Table 1), we ranked the signals based on the original association p-values of the independent lead variants. For diseases with multiple studies, we used a robust rank aggregation method9 to combine the signal lists from each study and determine the final ranking of signals. This method aggregates multiple ranked lists into a single statistical consensus ranking and is used for meta-analysis to combine ranked lists from different datasets. Ranking the signals could facilitate identifying and understanding the important candidate genes associated with each disease in the subsequent steps.
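To make the aggregation step concrete, the sketch below implements the core idea of robust rank aggregation in plain Python: each signal's normalized ranks are compared with the order statistics expected for uniformly distributed ranks, and the smallest of the resulting probabilities is Bonferroni-corrected by the number of lists. It is an illustrative reimplementation under stated assumptions, not the software used in the study: signals missing from a list are assigned the worst normalized rank (1.0), and the function names and signal labels are hypothetical.

```python
from math import comb
from typing import List, Tuple

def order_stat_pvalue(x: float, k: int, n: int) -> float:
    """P(the k-th smallest of n uniform(0, 1) draws is <= x), as a binomial tail."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))

def rho_score(norm_ranks: List[float]) -> float:
    """Smallest order-statistic probability, multiplied by n and capped at 1."""
    r = sorted(norm_ranks)
    n = len(r)
    best = min(order_stat_pvalue(x, k, n) for k, x in enumerate(r, start=1))
    return min(1.0, n * best)

def aggregate(ranked_lists: List[List[str]]) -> List[Tuple[str, float]]:
    """Return (signal, score) pairs, best consensus rank first."""
    items = sorted({s for lst in ranked_lists for s in lst})
    scored = []
    for s in items:
        # Normalized rank per list; absent signals get 1.0 (an assumption).
        ranks = [(lst.index(s) + 1) / len(lst) if s in lst else 1.0
                 for lst in ranked_lists]
        scored.append((s, rho_score(ranks)))
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))

# Three hypothetical per-study rankings of the same four signals.
studies = [
    ["sigA", "sigB", "sigC", "sigD"],
    ["sigA", "sigC", "sigB", "sigD"],
    ["sigB", "sigA", "sigD", "sigC"],
]
consensus = aggregate(studies)
```

With only a few short lists the corrected score saturates at 1 for most signals, so the method is most informative when the ranked lists are long.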

Identifying the target gene(s) for each signal

We employed five main SNP-to-gene linking strategies across 13 databases to identify cis gene targets for the tag variants in various immune cell types, aiming to determine the target gene(s) for the associated variants. These strategies included:

  1. Functional consequence analysis for the variants using the annotation tool Variant Effect Predictor (VEP)10;

  2. Identification of significant cis-eQTLs (FDR < 0.05) from six studies on immune cells, namely DICE11, ImmuNexUT12, Genotype-Tissue Expression (GTEx)13, BLUEPRINT14, eQTLGen15, and scRNA-seq of ADs16;

  3. Enhancer-gene linking from two resources: EpiMap17 and the Activity-By-Contact (ABC) model18,19. The ABC model predicts enhancer-gene connections in each cell type based on measurements of enhancer activity and 3D contact frequencies (Hi-C)19, and EpiMap predicts enhancer-gene links using the Pearson correlation between gene expression and enhancer activity17;

  4. Promoter-interacting regions identified by Promoter-capture Hi-C (PCHiC)20;

  5. Integrated SNP-to-gene linking strategies from three tools: cS2G21, Open Targets V2G22, and L2G23.

By integrating these approaches, we developed a new method to identify target gene(s) based on a gene score for the association variants (Table 2). We took the significant raw SNP-to-gene linking values (p-values or scores) from each database and transformed them into scores ranging from 0 to 1 by quantile transformation, implemented with the Python package scikit-learn, which harmonized the scores across sources. When a database evaluated SNP-to-gene links in multiple cell types, only the top-scored SNP-gene pair was considered. Next, we integrated the scores of each variant-gene pair obtained from different sources into a unified framework.
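The effect of the quantile transformation can be illustrated with a dependency-free stand-in that maps each raw value to its empirical quantile in [0, 1]. This is not the scikit-learn implementation named above; the function name, the mid-rank handling of ties, and the choice to invert p-values so that stronger evidence scores higher are all assumptions made for the sketch.

```python
from typing import List, Sequence

def quantile_scores(values: Sequence[float],
                    higher_is_better: bool = True) -> List[float]:
    """Map raw values to [0, 1] by empirical quantile; ties share a mid-rank."""
    n = len(values)
    if n == 1:
        return [1.0]  # a lone value gets the top score (an assumption)
    scores = []
    for v in values:
        less = sum(1 for u in values if u < v)
        equal = sum(1 for u in values if u == v)
        q = (less + (equal - 1) / 2) / (n - 1)  # mid-rank scaled to [0, 1]
        scores.append(q if higher_is_better else 1.0 - q)
    return scores

# Hypothetical eQTL p-values for four SNP-gene pairs: smaller is stronger,
# so the scale is inverted.
eqtl_scores = quantile_scores([1e-8, 1e-3, 1e-5, 0.04], higher_is_better=False)

# Hypothetical enhancer-gene link scores: larger is stronger.
link_scores = quantile_scores([0.2, 0.2, 0.9])
```

Because only ranks matter, p-values spanning many orders of magnitude and bounded link scores end up on the same 0 to 1 scale.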

Table 2 List of strategies for SNP-to-gene linking approaches and identification of the relevant cell types involved in signals

For each variant-gene pair \(g\), we calculated the weighted total score (\({S}_{g}\)) as follows:

$${S}_{g}=\sum\limits_{i=1}^{8}{W}_{i}{S}_{i}$$

where \(i\) corresponds to the SNP-to-gene linking databases, \({W}_{i}\) corresponds to the weight assigned to the database \(i\), and \({S}_{i}\) corresponds to the transformed variant-gene score derived from category \(i\) (with six significant cis-eQTL databases serving as a single piece of evidence). The weight assigned to each database reflects the level of confidence in its evidence, as informed by prior knowledge and previous studies2,22.
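The weighted sum above can be sketched as follows. The weights shown are placeholders, since the values used in the study are not listed here, and treating a missing evidence category as a score of 0 is likewise an assumption. Variant and gene names are hypothetical.

```python
from typing import Dict, Tuple

# Placeholder weights for the eight evidence categories (NOT the study's values).
WEIGHTS: Dict[str, float] = {
    "VEP": 1.0, "cis-eQTL": 1.0, "EpiMap": 0.5, "ABC": 0.5,
    "PCHiC": 0.5, "cS2G": 0.75, "V2G": 0.75, "L2G": 0.75,
}

def weighted_score(evidence: Dict[str, float]) -> float:
    """S_g = sum_i W_i * S_i over the eight categories; absent categories add 0."""
    unknown = set(evidence) - set(WEIGHTS)
    if unknown:
        raise KeyError(f"unknown evidence categories: {sorted(unknown)}")
    return sum(w * evidence.get(cat, 0.0) for cat, w in WEIGHTS.items())

# Transformed (0 to 1) scores for two hypothetical variant-gene pairs.
pair_scores: Dict[Tuple[str, str], float] = {
    ("rs_example", "GENE_A"): weighted_score(
        {"cis-eQTL": 0.9, "ABC": 0.6, "L2G": 0.8}),
    ("rs_example", "GENE_B"): weighted_score({"PCHiC": 0.4}),
}
best_pair = max(pair_scores, key=pair_scores.get)
```

In this example GENE_A is supported by three categories and outranks GENE_B, which is supported by one.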

We computed a weighted score for each variant-gene pair to identify the target genes for each signal. For signals with multiple variants, the gene with the highest score was selected. This process resulted in a ranked list of genes for each signal. To pinpoint the most relevant genes for each signal, we applied an ‘elbow point’ cutoff, determined as the inflection point in the gene score curve. This cutoff served as a threshold for the gene list, excluding genes beyond it from being considered primary targets for downstream analysis. Ultimately, the top-ranked gene(s) for each signal were identified and referred to as the top-tier genes.
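How the elbow point is located is not specified in the text. As a simple stand-in, the sketch below cuts the ranked gene list at the largest drop between consecutive scores and keeps the genes above that drop. This largest-drop rule is an assumption substituted for the elbow criterion, and the gene names and scores are hypothetical.

```python
from typing import Dict, List

def top_tier(gene_scores: Dict[str, float]) -> List[str]:
    """Keep the genes ranked above the largest drop in the sorted score curve."""
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(gene_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    if len(ranked) < 2:
        return [gene for gene, _ in ranked]
    drops = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = drops.index(max(drops))  # position of the first largest drop
    return [gene for gene, _ in ranked[: cut + 1]]

# Two genes score well above the rest, so both are kept as top-tier candidates.
example_scores = {"GENE_A": 3.2, "GENE_B": 3.0, "GENE_C": 0.9, "GENE_D": 0.7}
selected = top_tier(example_scores)
```

A rule of this kind returns a single gene when one candidate clearly dominates and several genes when the leading scores are close together.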

Identifying the relevant cell types for each signal

We leveraged various databases to pinpoint the specific immune cell types in which each association signal might be active. These databases vary in cell type resolution, from a few major cell types to several dozen immune cell types, which complicates the integration of information from different studies. To address this issue, we consolidated the cell type information into six primary immune cell types: monocytes, B cells, dendritic cells, CD4⁺ T cells, CD8⁺ T cells, and NK cells. This consolidation was achieved using data from six main databases: ENCODE snATAC-seq and DNase-seq data24,25,26, EpiMap chromHMM17, ABC model19, FANTOM5 CAGE-seq27,28, and significant cis-eQTLs (Table 2). ENCODE snATAC-seq and DNase-seq identify open chromatin regions across cell types, while EpiMap chromHMM annotates functional genomic regions. FANTOM5 uses CAGE-seq to map transcription start sites and enhancers across cell types.

We implemented an evidence-based approach to determine the cell types relevant to a given signal. This involved calculating the number of databases that corroborated each of the six identified cell types for each variant within the signal. Additionally, we computed the cumulative number of databases supporting each cell type involved in the signal. By setting a cutoff at the ‘elbow point’ in the cumulative number of supporting databases for each cell type, we could determine the cell types that are potentially relevant to the association signal.
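The counting step described above can be sketched as follows: for every variant in a signal, each database contributes one vote to every cell type it supports, and the votes are summed across variants to give the cumulative support per cell type. The nested-dictionary input layout, the variant identifiers, and the evidence shown are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Set

CELL_TYPES = ["monocytes", "B cells", "dendritic cells",
              "CD4+ T cells", "CD8+ T cells", "NK cells"]

def cell_type_support(
    variant_evidence: Dict[str, Dict[str, Set[str]]]
) -> Counter:
    """Sum, over all variants in a signal, the databases backing each cell type."""
    support: Counter = Counter()
    for per_database in variant_evidence.values():
        for cell_types in per_database.values():
            # One vote per (variant, database, cell type) combination.
            support.update(ct for ct in cell_types if ct in CELL_TYPES)
    return support

# Hypothetical signal with two variants.
signal = {
    "rs_variant_1": {
        "ENCODE DNase-seq": {"monocytes", "dendritic cells"},
        "cis-eQTL": {"monocytes"},
    },
    "rs_variant_2": {
        "ENCODE snATAC-seq": {"monocytes"},
        "ABC": {"B cells"},
    },
}
support = cell_type_support(signal)
ranked_cell_types = support.most_common()
```

The ranked counts are the input on which a cutoff, such as the elbow point mentioned above, would then be applied.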

Pathway enrichment analysis for target genes

We performed Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome pathway enrichment analysis on the target genes identified for each AD using the R package clusterProfiler29,30 with default parameters. Additionally, because we ranked the signals within each AD as well as the target genes, we also conducted Gene Set Enrichment Analysis (GSEA)31 using clusterProfiler. GSEA is a rank-based method used to determine whether a predefined gene set is significantly enriched at either the top or bottom of a ranked list. We plotted the results with the R package ggplot232.

Protein-protein interaction (PPI) network analysis for functional connections and differences across ADs

Distinguishing functional variations within large sets of immune-related genes can pose a significant challenge. This is primarily because the dominant immune functions tend to overshadow less prevalent ones, complicating the detection of subtle differences or unique functionalities between two gene sets. To circumvent this issue, we constructed a PPI network by importing all top-tier target genes associated with ADs into the STRING database (version 12.0, https://string-db.org/), using protein-protein interaction evidence from high-throughput experiments, curated databases, and co-expression. The minimum confidence score for interactions was set to 0.4 (medium confidence). The resulting network was visualized using Cytoscape33 (version 3.10.1). Modules were determined using the default parameters of the MCODE plugin through clusterMaker234 (version 2.3.4) within Cytoscape. We assessed the networks by the number of genes within each module, along with node degree and closeness centrality, which indicate the importance of individual genes within the network.

We performed enrichment analyses for each module using KEGG, Reactome, WikiPathways, and Gene Ontology (GO) biological process terms. These were performed using g:Profiler from the EnrichmentTable35 (version 2.0.5) plugin within Cytoscape to identify significantly enriched pathways and biological process terms. By using modules as functional units for pathway analysis, we were able to perform a detailed functional analysis of target genes across various ADs. This approach also facilitated the comparison of these target genes, enabling us to identify functional differences between the ADs.

Comparison of modules across ADs utilizing GSEA

Using the modules identified from the PPI network, along with the genes in these modules, we constructed custom background gene sets, with each module constituting a gene set. These modules and genes were adapted to meet the requirements of the standard GMT format. We then performed GSEA using the ranked top-tier target genes for each disease with the clusterProfiler package. A positive normalized enrichment score (NES) was obtained from this analysis, reflecting the tendency of members within each gene set (module) to cluster at the top of the ranked list of associated genes for each AD. These positive NES values were then used to compare the relative importance of the modules for each AD.
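In the GMT format, each line holds one gene set as tab-separated fields: the set name, a free-text description, and then the member genes. The sketch below converts module membership into GMT text; the module names, the description string, and the membership shown are hypothetical.

```python
from typing import Dict, Iterable

def to_gmt(modules: Dict[str, Iterable[str]],
           description: str = "PPI module") -> str:
    """Serialize {module name: genes} as GMT text, one gene set per line."""
    lines = []
    for name, genes in modules.items():
        # Fields: set name, description, then genes (sorted for reproducibility).
        lines.append("\t".join([name, description] + sorted(set(genes))))
    return "\n".join(lines) + "\n" if lines else ""

# Hypothetical module membership.
modules = {
    "Module_1": ["IL23R", "IL12RB2", "IL12B"],
    "Module_2": ["IL10"],
}
gmt_text = to_gmt(modules)
```

The resulting text can be written to a file and read by tools that accept GMT gene sets.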

Identifying drug candidates and targets based on the functional modules

We utilized the ChEMBL (https://www.ebi.ac.uk/chembl/, Release 30) and DrugBank (https://go.drugbank.com/) databases for a target-drug search using the top-tier target genes. These databases are comprehensive and manually curated repositories that provide detailed information on drugs, their targets, and clinical trials for specific disease indications. To confidently identify potential drug targets relevant to the disease, our analysis specifically focused on the targets of drugs that are currently in phase III or IV clinical trials.

Inclusion and ethics

As the included studies were approved by their respective independent review boards, no additional ethical approval was required for our study, which is based on summary statistics data.

Results

Diverse association signals even when the locus is shared

After reviewing GWAS studies for each autoimmune disease from the GWAS Catalog, we selected 28 distinct studies that covered the 15 most prevalent ADs. These were chosen based on either the largest cohort sizes or the highest numbers of associated variants that surpassed genome-wide significance (Table 1). Detailed information for each GWAS source is provided in Supplementary Table 1. These studies yielded a total of 2129 lead variants associated with ADs. To capture additional related variants, we expanded the lead variants to include those in high linkage disequilibrium (LD, r² ≥ 0.8) with them, based on the 1000 Genomes Project for European or East Asian populations. This expansion resulted in 38,402 tag SNPs, providing a more comprehensive set of genetic variants for further analysis (Supplementary Table 2).

We grouped these variants into 502 loci based on their genomic locations. Of these, 255 loci were associated with at least two diseases, which we refer to as pleiotropic loci. Within each locus, we defined association signals as LD clusters of variants with r² ≥ 0.8 and treated distinct clusters within the locus as independent if they were not in high LD with one another, yielding 1800 potentially independent signals. These included 265 pleiotropic signals defined as associated with multiple diseases and 1535 disease-specific signals (Table 1, Supplementary Table 2). Locus sizes varied widely, with a median of 638.4 kb (ranging from 500 kb to 4.14 Mb). The HLA region was a notable outlier, spanning 8.22 Mb. Multiple independent signals per locus were quite common, ranging from 1 to 22 (median 2), whereas the HLA locus presented a remarkable 230 signals (Supplementary Table 2).

Locus-sharing (255/502, 50.8%) across ADs is much more common than signal-sharing across the diseases (265/1800, 14.7%). This pattern is further demonstrated by the clustering of diseases based on association signals rather than loci, as indicated by the Jaccard similarity scores (Fig. 1A–B; Supplementary Fig. 1A–B). Three distinct groups of diseases were identified using complete-linkage hierarchical clustering on Jaccard overlaps derived from the signal matrix: Group 1 (SJS, SS, SLE, RA, PBC), Group 2 (IBD, AS, PV, MS, CD, BD), and Group 3 (T1D, ATD, VIT, AA) (Fig. 1B). We also ranked the association signals within each disease (see “Ranking association signals in each disease”), and examined the pleiotropic signals and their corresponding ranked association values. This analysis revealed similar clustering patterns (Fig. 1C).

Fig. 1: Genetic locus-sharing and signal-sharing across autoimmune diseases.

A The Jaccard score measures the fraction of overlapping loci between each pair of diseases, highlighting shared genetic loci. B The Jaccard score measures the fraction of overlapping signals between each pair of diseases, with three distinct groups of ADs clustering together. Both (A) and (B) were generated using the R package corrplot71, where the color intensity and circle size are proportional to the fraction of overlap. C Clustering based on pleiotropic signals and their association ranking values within each disease. The right panel displays bars representing pleiotropic signals, with color intensity indicating the association ranking values for each disease. The bars are ordered by the number of diseases sharing a given signal. The left panel features a dendrogram illustrating the clustering of diseases in the right panel. Abbreviations: Multiple sclerosis (MS), Systemic lupus erythematosus (SLE), Inflammatory bowel disease (IBD, includes Crohn’s disease and ulcerative colitis), Rheumatoid arthritis (RA), Autoimmune thyroid disease (ATD), Type 1 diabetes (T1D), Psoriasis (PV), Primary biliary cholangitis (PBC), Celiac disease (CD), Vitiligo (VIT), Ankylosing spondylitis (AS), Systemic sclerosis (SS), Alopecia areata (AA), Behçet’s disease (BD), Sjögren’s syndrome (SJS).

This suggests that comparing diseases at the signal level may offer a more effective approach for identifying both shared and specific association mechanisms. The presence of independent association signals within a shared locus suggests mechanistic differences across these ADs, potentially involving different target genes, different relevant cell types or regulatory mechanisms. Comparing the similarities and differences in the association architecture of these diseases may provide a unique perspective for a deeper understanding of autoimmunity.

Cell type specificity of the association signals

Differences in functionality across cell types might explain why diseases exhibit different association signals when they share the same locus. Understanding the cellular context of various ADs could also lead to more precise treatments. To determine the relevant cell types for each association signal, we used an evidence-based approach that integrates multiple data types. This includes significant cis-eQTL data, histone modification marks, chromatin accessibility information, and other genomic features (see “Identifying the relevant cell types for each signal”).

To account for differences in cell type resolution across datasets and to improve statistical power, our analysis focused on six major immune cell types: CD4⁺ T cells, CD8⁺ T cells, B cells, NK cells, monocytes, and dendritic cells. We standardized cell classifications by aggregating higher-resolution cell subsets from the original datasets into these broader categories. However, for specific loci of particular interest, we referred back to the original studies with more detailed cell type annotations to facilitate a more nuanced interpretation.

Of the 1800 autoimmune association signals analyzed, relevant cell types were identified for 1693 signals (Supplementary Table 2). Among these, 856 (or 50.6%) were predicted to be functional in three or fewer cell types. Notably, 349 signals (or 20.6%) appeared to be specific to one of the six immune cell types, suggesting potential cell type specificity (Supplementary Fig. 2A). We examined the distribution of cell types for each disease based on the 856 signals ascribed to 1–3 cell types, aiming to assess potential cell type specificity in these associations. Our results show that CD4⁺ T and CD8⁺ T cells were the most frequently involved cell types among signals exhibiting variation across cell types. In contrast, NK and dendritic cells were less commonly implicated across the six cell types.

Broadly, these findings suggest that lymphoid cells are more likely to be implicated in ADs compared to myeloid cells (Fig. 2A). Additionally, when comparing cell type associations for each AD against the overall distribution across all ADs, we observed a significantly higher involvement of B cells in SLE and PBC relative to other diseases. Detailed results of these comparisons are provided in Supplementary Fig. 2B. We also attempted to analyze the distribution of more detailed sub-cell types using specific databases; however, limited statistical power prevented the detection of significant differences at the sub-cell type level across diseases. More genomic data with higher cell type resolution are needed to fully understand the relevant cell types for these diseases.

Fig. 2: Distribution of immune cell types within each AD and specific immune cell types involved in signals.

A CD4⁺ T and CD8⁺ T cells are the most prevalent immune cell types across all ADs, while NK or dendritic cells are less frequently involved. The proportion of B-cell signals in AA and AS is significantly lower compared to the other ADs. Conversely, B cells are more enriched in SLE and PBC. B Two independent signals at the IL10 locus: the right signal G1, upstream of IL10, is associated with BD, and the left signal G2, downstream of IL10, is associated with SLE, IBD and T1D. C ENCODE DNase-seq and snATAC-seq data suggest that Signal 1 (G1) is specific to monocytes/dendritic cells, whereas Signal 2 (G2) is accessible in various immune cell types, as visualized using the WashU Epigenome Browser72. Significant cis-eQTLs from DICE (D) and ImmuNexUT (E) indicate that G1 variants serve as monocyte-specific eQTLs for IL10, while no significant eQTLs were detected for G2 variants. F G1 is specifically associated with BD and may show regulatory activity in monocytes, while signal G2 is associated with SLE, IBD and T1D and may exert functions across various immune cell types.

Cell type specificity determines the associations of IL10 to different diseases

The role of cell type specificity in various ADs is illustrated by the associations observed at the IL10 locus, where two independent signals are linked to different ADs. Signal 1 (G1, depicted in Fig. 2B), located upstream of IL10, includes three reported variants (rs1518111, rs1800871, rs3024490) in the GWAS Catalog. They are in near-complete LD with one another and are associated with Behçet’s disease (BD). Conversely, Signal 2 (G2, Fig. 2B) is situated downstream of G1 and contains three reported variants (rs3024493, rs3024505, rs3122605) associated with SLE, IBD, and T1D in the GWAS Catalog (Supplementary Table 3). Data from both ENCODE DNase-seq and snATAC-seq (Fig. 2C) indicate that signal G1 is situated in an open chromatin region specific to monocytes/dendritic cells. This is further corroborated by significant cis-eQTL data from DICE and ImmuNexUT, which suggest that G1 variants are monocyte-specific eQTLs for IL10 expression (Fig. 2D–E), with the risk allele associated with reduced IL10 expression (Supplementary Fig. 3A). Moreover, based on HOCOMOCO human transcription factor-binding models36, the risk allele rs1518111-T potentially modifies the binding affinity of IRF4/8, providing a probable mechanism for the association (Supplementary Fig. 3B).

In contrast, signal G2 in the IL10 locus is associated with SLE, T1D and IBD37,38, and the region containing G2 appears to be accessible in a wide range of immune cells according to ENCODE DNase-seq and snATAC-seq data (Fig. 2C). We conducted colocalization analysis for this region using HyPrColoc39, which supported shared genetic etiology for SLE, T1D, and IBD (Supplementary Fig. 3C–E). Unlike the G1 signal, which has strong evidence of cell type-specific eQTLs, the G2 variants had no significant eQTLs in either the DICE or ImmuNexUT databases (Fig. 2D–E). For this locus, both different cellular contexts and different regulatory mechanisms appear to contribute to the different disease associations, with IL10 the most likely target gene in both cases.

Our evidence-based approach to identify relevant cell types for different signals at the IL10 locus agrees with prior studies: the BD-risk allele of signal G1 (rs1518111-T) reduces IL10 expression in purified monocytes40,41, whereas the SLE-risk allele of signal G2 is associated with higher IL10 at both mRNA and protein levels, and increased proportions of IL-10⁺p-ELK-1⁺ cells in B cells, T cells, and monocytes in SLE patients42. To further corroborate cell type specificity using public resources, we analyzed bulk RNA-seq from isolated immune subsets obtained from the GEO to compare case–control IL10 expression patterns by disease and cell type. For BD, we used GSE6139943, which includes CD4⁺ T cells (BD n = 9; healthy controls n = 3) and CD14⁺ monocytes (BD n = 8; controls n = 9); isolated B cells were not available. For SLE, we used GSE14860144, which profiles T cells (SLE n = 21; controls n = 14), B cells (SLE n = 9; controls n = 10), and monocytes (SLE n = 15; controls n = 14). After standard normalization and per–cell type comparisons using two-sided Wilcoxon rank-sum tests, IL10 expression in BD was significantly decreased in monocytes (P = 0.015) but not in T cells (P = 0.282) (Supplementary Fig. 4A). In SLE, IL10 expression was significantly increased in monocytes (P = 0.007), T cells (P = 0.006), and B cells (P = 0.017) (Supplementary Fig. 4B). All these lines of evidence support distinct, disease- and cell type–specific effects at the IL10 locus. The G1 signal is specifically associated with BD and links the BD-risk allele to reduced IL10 expression in monocytes, consistent with monocyte-specific regulatory activity. By contrast, the G2 signal is associated with IBD, T1D, and SLE; its SLE-risk allele corresponds to increased IL10 expression across multiple immune cell types, indicating a broader, multi–cell type regulatory mechanism (Fig. 2F).

Genetic evidence from the IL23R/IL12RB2 locus supports targeted treatment

For the locus around IL23R/IL12RB2, we identified nine signals associated with relevant cell types and linked to multiple ADs. They include signals mostly specific to CD4⁺ T cells or ubiquitous for all six cell types (Supplementary Table 3). Upon detailed examination, Signal 4 (G4, left), with the reported variant rs2295359, intronic to IL23R, is associated only with psoriasis (PV). In contrast, Signal 2 (G2, right) includes seven variants associated with various diseases and likely targets IL12RB2 (Supplementary Table 3, Fig. 3A). G2 appears to be specific to CD4⁺ T cells (Th1) and NK cells (Fig. 3B). On the other hand, G4 seems to be specific to CD4⁺ T cells, including follicular helper T cells, Th17, and memory regulatory T cells, based on cis-eQTL data from DICE (Fig. 3C). These findings are consistent with the expression patterns of the IL23R or IL12RB2 genes in immune cells, as documented on the Protein Atlas (https://www.proteinatlas.org/).

Fig. 3: Genetic signals in the IL23R/IL12RB2 locus.

A At the IL23R/IL12RB2 locus, the left signal G4 in IL23R is associated only with PV, and the right signal G2 at IL12RB2 is associated with SLE, SS, ATD and PBC. B Signal G2 near IL12RB2 is specific to CD4⁺ T cells (Th1) and NK cells. C Signal G4 near IL23R is specific to CD4⁺ T cells (follicular helper T, Th17, and memory regulatory T cells) based on cis-eQTL data. D IL-12 is composed of the IL-12/23p40 and IL-12p35 subunits, while IL-23 consists of IL-23p19 and IL-12/23p40. IL-12 signals through the IL-12Rβ1 and IL-12Rβ2 receptor subunits, whereas IL-23 signals through IL-12Rβ1 and IL-23R. The figure was created using BioRender (https://biorender.com/).

IL-12 and IL-23 signaling have been confirmed to drive aberrant Th1 and Th17 immune responses, respectively, contributing to ADs45. IL-23 comprises the p19 subunit (encoded by IL23A) and the p40 subunit (encoded by IL12B), and its receptor consists of IL23R and IL12RB1. IL-12 is composed of the p35 subunit (encoded by IL12A) and the p40 subunit (encoded by IL12B), and its receptor consists of IL12RB1 and IL12RB2 (Fig. 3D). IL23R is associated with PV, IBD, BD, and AS, while IL12RB2 is associated with multiple ADs except for PV (Supplementary Fig. 5). This indicates that IL-23 signaling, rather than the IL-12 pathway, is likely pivotal to the pathogenesis of PV and IBD. The p19 subunit is unique to IL-23, while the p40 subunit is shared by IL-12 and IL-23. Risankizumab, Tildrakizumab, and Guselkumab specifically target p19 in the IL-23 pathway, whereas Ustekinumab targets p40, thus potentially inhibiting both IL-23 and IL-12 pathways46.

Based on the genetic findings, focusing on p19 may represent an effective and more specific strategy than targeting p40 for treating PV and IBD. In recent years, therapies targeting p19 have been used more frequently than those targeting p4047. Recent experimental evidence supports this hypothesis, showing that anti-p19 antibodies are safe and do not increase the risk of adverse events when treating patients with moderate-to-severe PV compared to Ustekinumab48. A comprehensive analysis of association signals in the loci of IL23R/IL12RB2, IL12RB1, IL23A, IL12A and IL12B also suggests differential involvement of the IL23 and IL12 pathways in ADs (Supplementary Fig. 5). Psoriasis and IBD appear to involve the IL23 pathway, likely showing regulatory activity in Th17 cells, while SLE, PBC, and MS are more likely to involve the IL12 pathway and activation in Th1 cells. Genetic analyses nominate IL12A (IL12p35) as a potential therapeutic target for SLE, MS and PBC, and targeting IL23A (IL23p19) might be a more specific and safer approach compared to targeting IL12p40.

Other examples include IL12A, encoding p35 of IL-12, which has one signal associated with celiac disease (CD) and BD that appears to be specific to monocytes/dendritic cells. In contrast, another signal in the same locus is linked to PBC and SLE and appears to be B cell-specific (Supplementary Table 3, Supplementary Fig. 6A–E). In the WDFY4 locus, there are two signals, both associated with SLE but differing in their cellular specificity: one signal appears to be specific to naïve regulatory T cells, while the other seems to be functional exclusively in monocytes and neutrophils (Supplementary Table 3, Supplementary Fig. 6F–H). Investigating the underlying mechanisms of these specific associations could lead to a deeper understanding of disease pathogenesis and promote the development of precision treatments for ADs.

Functional comparison of pleiotropic and disease-specific signals

We functionally annotated these 1800 signals associated with various ADs, comprising 265 pleiotropic and 1535 disease-specific signals, using genomic data including cis-eQTLs, enhancer-gene links and Promoter-capture Hi-C (PCHiC) (see “Identifying the target gene(s) for each signal”). Pleiotropic signals were significantly more likely than disease-specific signals to carry functional annotation. For example, a larger proportion of pleiotropic signals overlapped cis-eQTLs (97% vs 86%, p = 2.298e-08, one-sided Fisher’s exact test). This indicates that pleiotropic signals have a stronger association with gene expression regulation, potentially acting across more cell types and showing more robust or more readily detectable effects. Similarly, pleiotropic signals were more often located in active enhancers identified using EpiMap data (71% vs 48%, p = 2.726e-12). They were also more likely to be supported by the ABC model (75% vs 50%, p = 7.398e-14) and by promoter-interaction data from PCHiC (89% vs 80%, p = 0.0003191) (Supplementary Fig. 7). These findings highlight the important roles of pleiotropic signals in mediating autoimmunity across multiple diseases.
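As a rough illustration of the comparison above, the sketch below runs a one-sided Fisher’s exact test in plain Python. The counts are assumptions back-calculated from the rounded percentages (about 257 of 265 pleiotropic and 1320 of 1535 disease-specific signals with cis-eQTL support), so the resulting p-value is not expected to reproduce the reported one exactly.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` annotated signals in the first group by chance.
    """
    n_group = a + b          # size of the first group (e.g. pleiotropic)
    n_hits = a + c           # all annotated signals in both groups
    n_total = a + b + c + d
    log_denom = log_comb(n_total, n_group)
    p = 0.0
    for k in range(a, min(n_group, n_hits) + 1):
        log_pmf = (log_comb(n_hits, k)
                   + log_comb(n_total - n_hits, n_group - k)
                   - log_denom)
        p += exp(log_pmf)
    return min(p, 1.0)


# Assumed counts (rounded from the reported percentages, not the study's raw data).
p_value = fisher_exact_greater(257, 8, 1320, 215)
print(f"one-sided p = {p_value:.3g}")
```

Working in log space avoids overflow, since the binomial coefficients for 1800 signals exceed the range of double-precision floats.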

Identifying target genes for the association signals

We developed a scoring system that combines five SNP-to-gene linking approaches to identify the target genes for each association signal (“Identifying the target gene(s) for each signal”, Table 2). Using this in-house scoring system, we identified 1554 target genes for 1740 of the 1800 signals. For most of these signals (68.3%, 1189 signals), a single target gene was identified. The candidates were narrowed down to two target genes for 17.0% of the signals (295 signals), and three or more target genes were detected for the remaining 14.7% (256 signals) (Supplementary Table 2).
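The in-house scoring scheme itself is specified in the section cited above and is not reproduced here. As a minimal sketch of the general idea, assuming each linking approach simply casts one equally weighted vote per gene, the top-scoring gene or genes for a signal could be selected as follows. The method labels, the equal weights, and the gene names are all hypothetical.

```python
from collections import Counter

# Hypothetical evidence for one signal: linking approach -> genes it supports.
evidence = {
    "cis_eqtl": {"GENE_A", "GENE_B"},
    "epimap": {"GENE_A"},
    "abc": {"GENE_A", "GENE_C"},
    "pchic": {"GENE_B"},
    "other_s2g": set(),
}


def score_genes(evidence_by_method: dict[str, set[str]]) -> Counter:
    """Count how many approaches support each gene (equal weights assumed)."""
    scores = Counter()
    for genes in evidence_by_method.values():
        scores.update(genes)
    return scores


def top_targets(evidence_by_method: dict[str, set[str]]) -> list[str]:
    """Return all genes tied for the highest score, sorted for stable output."""
    scores = score_genes(evidence_by_method)
    if not scores:
        return []
    best = max(scores.values())
    return sorted(gene for gene, score in scores.items() if score == best)


targets = top_targets(evidence)
print(targets)
```

Returning every gene tied for the top score mirrors the observation that some signals resolve to a single target gene while others retain two or more candidates.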

Furthermore, we ranked the target genes for each disease based on the number of supporting independent association signals and their GWAS significance (Supplementary Table 4). This ranking reflects the relative importance of each target gene for a given disease and offers a resource for understanding the functional implications of genetic associations. Of the 1554 target genes, 503 (32.4%) were associated with at least two ADs. Among these shared target genes, 90 were associated with five or more ADs (Fig. 4A), suggesting crucial roles in autoimmunity.
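The sharing statistics above amount to counting, for each target gene, the number of diseases it is linked to. A minimal sketch with a made-up gene-to-disease mapping (the associations shown are placeholders, not results from this study):

```python
# Placeholder mapping; gene names and disease sets are illustrative only.
gene_to_diseases = {
    "GENE_A": {"SLE", "RA", "MS", "IBD", "PBC"},
    "GENE_B": {"SLE", "RA"},
    "GENE_C": {"PV"},
    "GENE_D": {"IBD"},
}


def count_shared(mapping: dict[str, set[str]], min_diseases: int) -> int:
    """Number of genes associated with at least `min_diseases` diseases."""
    return sum(1 for diseases in mapping.values() if len(diseases) >= min_diseases)


n_genes = len(gene_to_diseases)
shared_two = count_shared(gene_to_diseases, 2)
shared_five = count_shared(gene_to_diseases, 5)
print(f"{shared_two}/{n_genes} genes ({100 * shared_two / n_genes:.1f}%) shared by >= 2 diseases")

# The reported proportion follows the same arithmetic: 503 of 1554 genes.
reported_pct = round(100 * 503 / 1554, 1)
print(reported_pct)
```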

Fig. 4: Target Genes across ADs and functional enrichment.

A A Circos plot73 shows 90 genes associated with five or more ADs. B The top 10 associated genes for each AD. C An UpSet plot74 illustrates the number of associated genes shared across ADs. The figure shows the number of shared or disease-specific genes exceeding four. KEGG (D) and Reactome (E) pathway enrichment analyses reveal major shared immune processes across ADs. F Certain pathways are significantly enriched in specific ADs, as demonstrated by Reactome enrichment results.

Excluding MHC Class II genes, the top three pleiotropic genes, SH2B3, STAT4, and BACH2, are shared by 11, 10, and 9 diseases, respectively (Supplementary Table 4). The top 10 target genes for each AD are presented in Fig. 4B. For instance, IL23R, NOD2, and TNFSF15 were among the top associated genes in IBD. As discussed above, IL23R is involved in the IL-23/Th17 signaling pathway, which is essential for maintaining intestinal immune homeostasis. NOD2 is specific to IBD and plays a key role in innate immunity, particularly in microbial recognition and autophagy49, while TNFSF15 promotes antimicrobial pathways and is currently being explored as a potential therapeutic target for IBD50.

Among the top ten target genes for each of the 15 diseases, we identified 111 unique genes. Remarkably, 80.2% (89 out of 111) of these unique genes were also associated with at least one other AD, although not all of them made the top ten list for each disease (Supplementary Fig. 8A). This proportion is notably greater than the proportion of all disease-shared genes (503/1554, 32.4%), suggesting that the highly ranked genes are more likely to play central functional roles common to multiple diseases. However, it is possible that detection bias may have influenced these results, as top-ranked genes tend to exhibit stronger associations, resulting in increased detection power.

Most of the target genes were specific to a single disease (Fig. 4C, Supplementary Table 4). Overall, sharing of target genes across different ADs is more prevalent than sharing of the underlying signals, with 32.4% of target genes being shared compared to 14.7% of the signals. This indicates that variations in cellular contexts or regulatory mechanisms may further differentiate disease associations.

Pathway characterization of the target genes

KEGG and Reactome pathway enrichment analyses were conducted on the target genes for each disease. The pathways significantly enriched for each disease are shown in Supplementary Table 5. The most significantly enriched pathways, ranked by adjusted p-values, were commonly shared across multiple ADs. Notably, these include T-cell differentiation and response to viral infection according to KEGG (Fig. 4D, Supplementary Fig. 8B, Supplementary Table 5). Similarly, according to the Reactome database, enrichment in TCR signaling, Interferon responses, and Interleukin signaling was observed (Fig. 4E, Supplementary Fig. 8C, Supplementary Table 5). When pathway enrichment was performed exclusively on genes shared by various ADs (Supplementary Table 6), we observed similar pathway enrichment patterns. In contrast, disease-specific genes exhibited limited pathway enrichment, indicating a limited knowledge of their specific functions in autoimmunity (Supplementary Table 7).
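Pathway enrichment of this kind is commonly an over-representation test followed by multiple-testing correction. The sketch below assumes a hypergeometric test with Benjamini-Hochberg adjustment; the tool settings actually used in the study are not stated in this section, and the gene sets and background size are toy values.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_sf(overlap: int, set_size: int, n_query: int, n_background: int) -> float:
    """P(X >= overlap) when n_query genes are drawn from n_background genes,
    of which set_size belong to the pathway."""
    log_denom = log_comb(n_background, n_query)
    p = 0.0
    for k in range(overlap, min(set_size, n_query) + 1):
        if n_query - k > n_background - set_size:
            continue  # impossible configuration
        p += exp(log_comb(set_size, k)
                 + log_comb(n_background - set_size, n_query - k)
                 - log_denom)
    return min(p, 1.0)


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy inputs: query genes for one disease and two made-up pathways.
query = {"G1", "G2", "G3", "G4"}
pathways = {
    "toy_pathway_1": {"G1", "G2", "G3", "G9"},
    "toy_pathway_2": {"G4", "G7", "G8"},
}
background_size = 100
names = list(pathways)
raw = [
    hypergeom_sf(len(query & pathways[n]), len(pathways[n]), len(query), background_size)
    for n in names
]
adj = benjamini_hochberg(raw)
for name, p_raw, p_adj in zip(names, raw, adj):
    print(f"{name}: p = {p_raw:.3g}, adjusted p = {p_adj:.3g}")
```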

Despite these commonalities, certain pathways were significantly enriched only in specific diseases. For instance, based on the Reactome enrichment, the initial triggering of complement was exclusively enriched in SLE, the VEGFA-VEGFR2 pathway was unique to ATD, and the formation of the cornified envelope/keratinization was specific to PV. Signaling by CSF3 (G-CSF) and FCERI-mediated MAPK activation were unique to MS. Additionally, TRAF6-mediated IRF7 activation was enriched in both SLE and ATD, whereas Interleukin-1 family signaling was solely enriched in IBD and CD (Fig. 4F, Supplementary Table 5).

Analysis of protein interaction network reveals common and specific functional modules

To further characterize the shared and specific functions across the ADs, we constructed a PPI network using the 1554 target genes as input to the STRING database. The network incorporated interaction evidence derived from experimental data, curated databases, and co-expression in STRING. Protein interaction pairs were included at a medium-confidence score threshold of 0.4. About 65.4% of the target genes (1016 out of 1554), connected by a total of 8978 interactions, were incorporated into the network (Supplementary Fig. 9A). Notably, 80% of the interacting genes (813 out of 1016) converged into 32 functional clusters (modules) identified by MCODE with default parameters via the clusterMaker2 plugin in Cytoscape (Fig. 5A, Supplementary Table 8).
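The network construction step can be approximated as a filter on interaction confidence scores followed by a tally of which input genes retain at least one interaction. The sketch uses toy edges; the tuple layout and protein names are assumptions, and module detection with MCODE is not reproduced here.

```python
# Toy interaction list: (protein_a, protein_b, confidence score on a 0-1 scale).
edges = [
    ("P1", "P2", 0.92),
    ("P2", "P3", 0.45),
    ("P3", "P4", 0.31),   # below threshold, dropped
    ("P5", "P6", 0.40),   # exactly at threshold, kept
]
input_genes = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]


def filter_edges(edge_list, threshold=0.4):
    """Keep interactions whose confidence score meets the threshold."""
    return [(a, b, s) for a, b, s in edge_list if s >= threshold]


def network_nodes(edge_list):
    """Set of proteins that take part in at least one retained interaction."""
    nodes = set()
    for a, b, _ in edge_list:
        nodes.update((a, b))
    return nodes


kept = filter_edges(edges)
nodes = network_nodes(kept)
coverage = len(nodes & set(input_genes)) / len(input_genes)
print(f"{len(nodes)} of {len(input_genes)} genes in network ({coverage:.1%}), "
      f"{len(kept)} interactions")
```

Genes without any interaction above the threshold (P4 and P7 in this toy case) drop out of the network, which is why only a fraction of the input target genes appears in it.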

Fig. 5: PPI network and module Identification.

A AD-associated genes are clustered into 32 multi-functional modules using MCODE. In the network, nodes represent proteins and edges represent interactions. Cytoscape was used to visualize the network. B The proportion of pleiotropic and disease-specific genes in each module. C The proportion of pleiotropic genes within each module for a specific disease.

Among the 813 genes across these 32 modules, roughly 35% were pleiotropic. The median proportion of pleiotropic genes per module was 34%, with substantial variation among modules (Fig. 5B). As expected, the products of the human leukocyte antigen (HLA) genes exhibited strong interactions and were clustered together in Module C1. Notably, by analyzing the pleiotropic genes in each module across different diseases, we identified five modules (C1, C3, C7, C12, and C14) that were predominantly shared across various ADs (Fig. 5C). Similar patterns were observed when examining the proportion of both pleiotropic and disease-specific genes (Supplementary Fig. 9B).
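The per-module summary described above distinguishes the overall pleiotropic fraction from the median of the per-module fractions, and the two need not agree. A small sketch with placeholder module memberships (not the study's modules):

```python
from statistics import median

# Placeholder module memberships and pleiotropy flags.
modules = {
    "M1": ["G1", "G2", "G3", "G4"],
    "M2": ["G5", "G6"],
    "M3": ["G7", "G8", "G9"],
}
pleiotropic = {"G1", "G2", "G3", "G5"}


def pleiotropic_fraction(genes, pleiotropic_genes):
    """Share of a module's genes that are flagged as pleiotropic."""
    return sum(g in pleiotropic_genes for g in genes) / len(genes)


fractions = {m: pleiotropic_fraction(g, pleiotropic) for m, g in modules.items()}
median_fraction = median(fractions.values())

all_genes = [g for genes in modules.values() for g in genes]
overall_fraction = pleiotropic_fraction(all_genes, pleiotropic)
print(fractions, median_fraction, round(overall_fraction, 3))
```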

Assessing the role and relative significance of individual modules in each disease

We treated the genes within each module as a gene set and performed gene set enrichment analysis (GSEA) to evaluate their enrichment in each disease. This analysis used the genes associated with each disease together with their ranking values within that disease (“Comparison of modules across ADs utilizing GSEA”). The GSEA-based method allows us to evaluate the relative importance of each module for a specific disease, as reflected by the Normalized Enrichment Score (NES). A positive NES indicates that a module is overrepresented among the top-ranked genes for that disease, and its magnitude reflects the extent of this overrepresentation (Fig. 6A). Additionally, we annotated the functions of these modules using KEGG, Reactome, WikiPathways, and GO biological processes. From these annotations, we selected the five most enriched pathways for each module (Supplementary Table 9). A summary of the modules and their major annotated functions is presented in Fig. 6B.
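To make the NES more concrete, the following sketch computes a GSEA-style running-sum enrichment score and normalizes it against random gene sets of the same size. The weighting exponent, the gene-set permutation null, and the toy ranking are assumptions for illustration; the settings used in the study are given in the cited Methods section.

```python
import random


def enrichment_score(ranked_genes, weights, gene_set, p=1.0):
    """Weighted Kolmogorov-Smirnov running-sum statistic used by GSEA.

    ranked_genes: genes sorted from most to least important.
    weights: ranking value of each gene, in the same order.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(w) ** p for w, h in zip(weights, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for w, h in zip(weights, hits):
        running += abs(w) ** p / hit_total if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def normalized_enrichment_score(ranked_genes, weights, gene_set, n_perm=1000, seed=0):
    """NES = observed ES divided by the mean same-sign ES of random gene sets
    of equal size (a gene-set permutation null, assumed here)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, weights, gene_set)
    size = len(gene_set & set(ranked_genes))
    null = [
        enrichment_score(ranked_genes, weights, set(rng.sample(ranked_genes, size)))
        for _ in range(n_perm)
    ]
    same_sign = [abs(x) for x in null if (x >= 0) == (observed >= 0)]
    if not same_sign or sum(same_sign) == 0:
        return 0.0
    return observed / (sum(same_sign) / len(same_sign))


# Toy ranked list of 20 genes with decreasing ranking values.
ranked_genes = [f"G{i}" for i in range(1, 21)]
weights = [float(20 - i) for i in range(20)]
top_module = {"G1", "G2", "G3"}        # concentrated at the top of the ranking
bottom_module = {"G18", "G19", "G20"}  # concentrated at the bottom

es_top = enrichment_score(ranked_genes, weights, top_module)
es_bottom = enrichment_score(ranked_genes, weights, bottom_module)
nes_top = normalized_enrichment_score(ranked_genes, weights, top_module, n_perm=200)
print(es_top, es_bottom, nes_top)
```

A module whose genes sit at the top of the ranking yields a positive score, while one concentrated at the bottom yields a negative score, which is why only positive NES values are read as overrepresentation among top-ranked genes.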

Fig. 6: Assessing the role of modules in ADs and functional analysis for each module.

A GSEA was performed on the ranked gene list for each AD. The top 10 gene sets (modules) with positive Normalized Enrichment Scores (NES) for each disease are shown, where the dot size represents the number of genes enriched per term. B Pathway enrichment analysis for each module was conducted using KEGG, Reactome, WikiPathways and GO biological process databases via g:Profiler. The number of drugs in clinical trials (phases III and IV, ChEMBL) and their respective targets are listed for each module.

Modules shared across multiple diseases may play critical roles in the underlying mechanisms of autoimmunity (Fig. 6A). For instance, Module C1 is primarily enriched in HLA-mediated antigen processing and presentation. Module C3 is linked to cytokine-cytokine receptor interaction, chemokine signaling pathway, and TNFR2 non-canonical NF-κB signaling. Module C5 shows enrichment in regulatory circuits of STAT3 signaling, natural killer cell-mediated cytotoxicity, and interleukin-2 family signaling. Module C7 is linked to differentiation pathways of Th1, Th2, and Th17 cells, as well as interferon signaling pathways. Module C12 is enriched for nuclear receptor signaling and TNFR1-induced proapoptotic signaling. Lastly, Module C14 is associated with the production of reactive oxygen species (ROS) in phagocytes and the MAPK signaling pathway.

Potential drug repurposing based on shared modules across the diseases

Module C1 comprises 32 genes from the major histocompatibility complex (MHC) region and contains associated genes for all 15 ADs. Among these, 24 genes exhibit varying levels of pleiotropy (Fig. 5B). Our analysis revealed enrichment of HLA class I genes for AS and PV, and of HLA class II genes for the other 13 ADs. Additionally, the top 10 proteins in this module, ranked by degree and closeness centrality, are frequently shared across the ADs (Fig. 7A).
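Degree and closeness centrality, the two measures used above to pick the central proteins of a module, can be computed as in the sketch below. The five-node graph is a toy example, and the closeness definition (reachable nodes divided by the sum of shortest-path distances) is one common convention that may differ in normalization from the tool used in the study.

```python
from collections import deque

# Toy undirected module; edges are illustrative only.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]


def build_adjacency(edge_list):
    """Adjacency sets for an undirected graph."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def closeness_centrality(adj):
    """(reachable nodes) / (sum of shortest-path distances), via BFS."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


adj = build_adjacency(toy_edges)
deg = degree_centrality(adj)
clo = closeness_centrality(adj)
most_central = max(adj, key=lambda node: (deg[node], clo[node]))
print(most_central, deg[most_central], clo[most_central])
```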

Fig. 7: Shared and specific modules of ADs.

A Module C1 contains genes in the MHC region. Node colors indicate the number of diseases associated with each gene. Central nodes have the highest degree and closeness centrality scores. B Module C7 is enriched for CD4+ T helper cell differentiation. Drugs in phase IV clinical trials (ChEMBL) and their targets are denoted with rhombuses and squares, respectively. C Module C16 is enriched for complement and coagulation cascades, as well as B cell-mediated immunity, and exhibits high importance specifically for SLE. D Module C8 is enriched for keratinization and epithelial cell differentiation, showing high specificity for PV. All modules were visualized in Cytoscape.

Module C7 involves CD4+ T helper cell differentiation and consists of 180 proteins encoded by genes associated with all 15 ADs (Fig. 7B). Our analysis identified 20 drug targets in this module, involving 28 drugs in phase III or phase IV clinical trials according to the ChEMBL and DrugBank databases (Supplementary Table 10). These drugs hold potential for repurposing. For example, NATALIZUMAB, a monoclonal antibody that targets ITGA4, is approved by the FDA for the treatment of multiple sclerosis and Crohn’s disease. Since ITGA4 is associated with MS, IBD, Crohn’s disease, autoimmune thyroid disease, and ankylosing spondylitis, NATALIZUMAB may also be effective for the other conditions listed. However, several caveats and considerations remain before this approach can be widely adopted. Additionally, FOSTAMATINIB, the first FDA-approved spleen tyrosine kinase (SYK) inhibitor for the treatment of chronic immune thrombocytopenia51, targets genes associated with PV, MS, IBD, and SLE, indicating potential opportunities for repurposing (Fig. 7B).

Module C3, consisting of 63 proteins encoded by genes associated with all 15 ADs, is primarily enriched in cytokine-cytokine receptor interactions and chemokine signaling pathways (Supplementary Fig. 9C). Within this module, we identified 15 targets of 40 drugs that are currently in phase III or phase IV clinical trials (Supplementary Table 10). For instance, IL-6 inhibitors such as SIRUKUMAB, OLOKIZUMAB, and SILTUXIMAB may be among the most promising candidates for repurposing in ADs. IL-6 is a key molecule in Module C3 and a cytokine with a central role in the pathogenesis of numerous autoimmune conditions. Notably, OLOKIZUMAB has been used for RA treatment52. Additionally, CYCLOSPORINE, which is listed in DrugBank for the treatment of RA and psoriasis, targets PPIA/PPIF, another important molecule in this module. Our analysis further indicates that PPIF is associated with multiple diseases, including MS, IBD, CD, and AS.

To systematically evaluate drug repurposing opportunities across all modules, we classified drug-disease pairings into three categories: drugs already approved for multiple ADs (27 drugs), drugs actively tested for new autoimmune indications (16 drugs), and novel repurposing candidates not yet explored in ADs (7 drugs) (Supplementary Table 11). By prioritizing drugs within these categories, our framework facilitates the identification of the most promising candidates for future research and clinical trials, ultimately supporting more efficient and targeted therapeutic development for autoimmune conditions.

Specific modules for ADs

Intriguingly, we noted specific modules exhibiting unique functions in certain diseases. These modules may inform targeted treatment of autoimmune conditions. For example, Module C20, enriched for mature B cell differentiation, emerges as one of the top functional modules in SLE, MS, IBD and RA (Fig. 6A). In contrast, Module C16, which involves complement and coagulation cascades and B cell-mediated immunity, is highly significant only in SLE (Fig. 7C). Complement C3, encoded by a gene in Module C16, is targeted by PEGCETACOPLAN. This drug may mitigate complement-mediated kidney damage in glomerular diseases where complement plays a pathogenic role53.

Notably, Module C8, which shows high specificity for psoriasis, is significantly enriched for processes related to keratinization and epithelial cell differentiation (Fig. 7D). Psoriasis, a chronic inflammatory skin disease, is characterized by acanthosis, abnormal keratinization, and inflammatory cell infiltrates54. The crucial role of keratinocytes in triggering and perpetuating the inflammatory state emphasizes the importance of targeting these cells for effective treatment55.

Discussion

It is widely reported that ADs share genetic effects and immunopathology extensively. However, without thorough examination, locus sharing is often interpreted as evidence of shared association mechanisms. Our study demonstrates that locus sharing is indeed prevalent; however, signal sharing is far less common, even when a locus is shared by multiple diseases. Thus, identifying target genes, active cell types, and regulatory mechanisms at the signal level is crucial for comparing the differences and similarities across ADs. This approach enhances our understanding of the associations and pathogenic mechanisms of these diseases.

We expanded each lead variant to include all variants in strong LD (r² ≥ 0.8), and used this threshold to define signals. This is consistent with established practice in GWAS tools (e.g., PLINK, HaploReg, LDlink) and with earlier studies56,57. The threshold balances sensitivity and specificity, capturing likely regulatory variants while minimizing noise from variants less likely to be causal. Although fine-mapping and colocalization can delineate association signals more precisely, these approaches require detailed summary statistics and greater statistical power, which are not always available for autoimmune diseases. Thus, LD expansion at r² ≥ 0.8 represents a widely accepted, evidence-based approach for defining candidate signals. Importantly, we performed colocalization analyses for a number of selected pleiotropic loci with signals defined by strong LD and found that these signals are indeed associated with multiple traits (Supplementary Fig. 3C–E). This result further supports the validity of our approach and indicates that the LD-based signal definition captures biologically meaningful, shared genetic architecture. Looking ahead, Bayesian fine-mapping frameworks can further narrow causal variants within each signal (e.g., credible sets with posterior inclusion probabilities, multi-ancestry models). In parallel, Mendelian randomization, particularly two-sample MR using cis-eQTL or cis-pQTL instruments with sensitivity analyses, can strengthen causal inference from target genes to disease risk and help distinguish mediation from horizontal pleiotropy. As larger, harmonized summary statistics across all studied ADs become available, including from diverse ancestries, these approaches will refine causal attribution, improve target prioritization, and clarify population-shared versus population-specific effects.
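The LD expansion step can be illustrated with a small sketch that computes r² from phased haplotypes and collects every variant at or above the 0.8 threshold relative to a lead variant. The haplotypes and variant IDs are toy placeholders; in practice r² would come from a reference panel through tools such as those named above.

```python
def r_squared(hap_a, hap_b):
    """LD r^2 between two biallelic variants from phased 0/1 haplotypes."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for x, y in zip(hap_a, hap_b) if x == 1 and y == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # monomorphic variant: LD undefined, treated as 0 here
    d = p_ab - p_a * p_b
    return d * d / denom


def expand_lead(lead, haplotypes, threshold=0.8):
    """Return the lead variant plus all variants with r^2 >= threshold to it."""
    return {
        variant
        for variant, hap in haplotypes.items()
        if variant == lead or r_squared(haplotypes[lead], hap) >= threshold
    }


# Toy phased haplotypes (8 chromosomes); variant IDs are placeholders.
haplotypes = {
    "var1": [1, 1, 1, 1, 0, 0, 0, 0],
    "var2": [1, 1, 1, 1, 0, 0, 0, 0],   # perfect LD with var1 (r^2 = 1)
    "var3": [1, 1, 1, 0, 0, 0, 0, 0],   # correlated, but r^2 = 0.6
    "var4": [1, 0, 1, 0, 1, 0, 1, 0],   # unlinked (r^2 = 0)
}
signal = expand_lead("var1", haplotypes)
print(sorted(signal))
```

In this toy case var3 is clearly correlated with the lead variant yet falls outside the signal, which illustrates the intermediate-LD situation where a fixed threshold can be restrictive.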

We found that CD4+ T cells were the most prominent cell type involved in the associations of ADs, consistent with multiple studies confirming the crucial role of CD4+ T cells in the pathogenesis of autoimmune diseases4,12. Additionally, we found that B-cell signals are significantly enriched in SLE, consistent with studies showing that SLE variants are particularly enriched in B cells58,59. Association signals can be strongly cell type-specific or disease-specific. We explored the conditions under which disease-associated signals exert their functions and identified several signals that, despite sharing the target gene(s), function in different cellular contexts across diseases, including signals around IL10, IL23R/IL12RB2, IL12A and WDFY4.

GWAS have successfully identified numerous variants associated with complex traits, with over 90% located in non-coding regions of the genome60. Identifying the target genes of these non-coding variants remains a significant challenge. Each GWAS locus typically contains multiple genes, and non-coding variants do not necessarily regulate the nearest gene23. Large-scale eQTL datasets from various immune cells have demonstrated their value in linking disease variants to their target genes11,12. Various strategies have also been developed to link regulatory SNPs to their target genes in specific cell types, including the Activity-by-Contact (ABC) model and EpiMap, which predict enhancer-gene connections in each cell type17,19. Recently, integrated SNP-to-gene linking tools such as V2G, L2G, and cS2G have been developed to improve the identification of target genes of genetic variants21,23. However, gene regulation can be cell type-specific or context-specific, which limits the power to detect target genes, especially for immune-related diseases, for which the general tools lack adequate resolution and resource data. In this study, we developed a scoring scheme that combines these strategies, making particular use of rich multi-omics data generated from various immune cell types, to identify the target genes for each signal. This approach enhances our ability to identify genes associated with ADs.

If a GWAS is sufficiently powered, it is reasonable to prioritize disease genes based on GWAS p-values, which reflect both effect size and population allele frequency. We prioritized the top-tier genes for each AD and found that 32.4% of genes were shared by at least two diseases. Despite the limited overlap in associated genes across these diseases, pathway enrichment analysis showed enrichment of the same top pathways, such as T-cell differentiation, Interferon signaling, and Interleukin signaling. This observation aligns with studies demonstrating that different AD groups have unique genetic association patterns but impact largely the same primary pathways6,7. On the other hand, it also reflects the limitations of pathway analysis, which is restricted by current knowledge and dominated by the most prominent pathways.

Interestingly, besides sharing similar top pathways, we identified several pathways significantly enriched in specific ADs. For instance, the initial triggering of complement was only enriched in SLE, characterized by the production of autoantibodies against nuclear and cytoplasmic antigens, leading to immune-mediated tissue injury61. The VEGFA-VEGFR2 pathway was only enriched in ATD, with VEGFA being a critical factor in these diseases62. The Formation of the cornified envelope/Keratinization pathway was only enriched in PV, where premature keratinocyte differentiation disrupts the cornified envelope formation in this disease63. TRAF6-mediated IRF7 activation was only enriched in SLE and ATD, with increased IRF7 expression promoting inflammation and autoantibody production64. Lastly, IL-1 family signaling was enriched only in IBD and CD, highlighting the therapeutic potential of targeting IL-1 family cytokines, such as Canakinumab (anti-IL-1β) for IBD65 and anti-IL-15 antibodies under investigation for CD66.

It has been shown that when a protein is involved in a molecular process, its direct interactors often participate in the same process and function67,68. Therefore, we used PPI-based clustering to investigate the similarities and differences across these diseases, on the hypothesis that partitioning the target genes into PPI clusters provides better resolution in functional analysis. We constructed a PPI network using all top-tier target genes for the 15 ADs, which allowed us to group these genes into 32 clusters. Compared to enrichment analysis that uses all the associated genes collectively, this approach enabled us to better understand the functional similarities and differences across these diseases. From this analysis, we identified five common modules (C1, C3, C7, C12, and C14) and several specific modules (e.g., C20, C16, C8) by assessing the proportion of pleiotropic genes in each module or by using GSEA on the ranked gene list of each AD. This implies that certain common modules may contribute to shared disease characteristics of ADs, while specific modules may have unique roles in individual diseases.

The process of drug discovery and development is fraught with risks and high costs, resulting in a relatively low success rate in translating discoveries into clinical applications. Research has indicated that GWAS can assist in identifying compounds suitable for drug repurposing, and that drugs whose targets are supported by GWAS evidence are more likely to be approved for clinical use69,70. Our study prioritized disease-associated genes and identified both common and specific gene modules. The common modules could present opportunities for drug repurposing, while disease-specific modules or pathways could suggest potential new therapeutic targets for drug development and precision treatment.

This study has several limitations. First, the use of LD to differentiate signals associated with various ADs can be restrictive, particularly when there is intermediate LD among the signals. Second, identifying cell types responsive to association signals and the target genes is constrained by the availability and resolution of genomic and epigenomic data. This limitation is especially pronounced when association signals are active only under specific conditions, such as infection or pathological states. Third, although we used immune cell-derived data to provide a consistent framework across all 15 ADs, critical non-immune cell types, such as intestinal epithelial cells (IBD), pancreatic beta cells (T1D), synovial fibroblasts (RA), and keratinocytes (psoriasis), are not represented in our analysis. Fourth, although colocalization is a powerful approach for determining whether two traits share the same regulatory variants or for identifying target genes by linking GWAS signals to eQTLs, the limited availability of summary statistics for some ADs restricted our ability to perform comprehensive colocalization analyses. Fifth, variations in sample size across study cohorts may affect statistical power, with fewer genetic signals identified for diseases with smaller cohorts than for diseases with much larger sample sizes. This disparity may limit the comparability of genetic signals across diseases. Sixth, most of the major GWAS we analyzed were conducted in European and East Asian cohorts. This ancestry predominance may limit the generalizability of our findings to underrepresented populations, where allele frequencies, LD structure, effect sizes, and environmental interactions can differ. As a result, some signals may be missed, and target gene, pathway, or cell type prioritizations may not fully translate across ancestries. Future work should incorporate larger, diverse-ancestry cohorts and population-specific functional genomics to improve transferability and ensure equitable relevance of the results. Finally, our findings from genetic associations should be considered as starting points that require further functional validation.

In conclusion, this study leveraged public datasets and integrated GWAS with comprehensive functional genomics data to identify relevant cell types and target genes at associated signals for 15 ADs. We explored both common and disease-specific functional modules by analyzing the PPI network, addressing the inherent complexity of these diseases. While we made significant efforts to identify shared and unique functional characteristics across various ADs, some challenges remain. Identifying shared major immune functional dysregulation is relatively straightforward, akin to recognizing the proverbial ‘elephant in the room’. However, detecting subtle differences that may be crucial to the specificities of different autoimmune disorders is considerably more challenging. Future research, encompassing genetic findings and genomic functional characterizations, may contribute to advances in this area. Despite these limitations, our study serves as a critical initial step toward understanding the insights offered by genetic findings through comprehensive genomic data analysis, particularly in the realm of autoimmune diseases.