Introduction

Inflammatory bowel disease (IBD) is characterized by chronic relapsing inflammation that can affect any segment of the digestive tract, as observed in Crohn’s disease (CD), or be limited to the colon, as seen in ulcerative colitis (UC). The prevalence of IBD is increasing, especially in Eastern countries. IBD arises in the context of intricate interplays between genome, epigenome, gut microbiota, immune dysregulation and the environment, the full understanding of which remains elusive. Large-scale genome-wide association studies have revealed over 200 disease-associated loci, yet the overall genetic contribution to IBD risk remains modest, estimated at 13.1% for CD and 8.2% for UC1,2,3. Recently, epigenetic mechanisms, including DNA methylation (DNAm), histone modifications, and miRNA synthesis, have been recognized as plausible mechanisms for both initiating and sustaining intestinal mucosal inflammation in human IBD4.

Much of the progress in understanding DNAm changes in IBD has followed technological innovations in genome-wide methylation assessment. For instance, a recent systematic review and meta-analysis of peripheral blood DNAm studies in IBD found that several differentially methylated positions (DMPs), such as VMP1/TMEM49/MIR21 and RPS6KA2, were consistently differentially methylated across all studies5. However, DNAm changes in peripheral blood cells are primarily associated with inflammatory status rather than disease status6. Methylation patterns in blood also tend to revert to “normal” following anti-inflammatory treatment, irrespective of the underlying disease state. Consequently, an increasing number of studies have focused on the DNA methylome of specific mucosal cell types in IBD, including epithelial cells7,8,9, adipose stem cells10, and CD4+ lymphocytes11. These studies have unveiled distinct DNAm patterns linked to inflammation and different disease subtypes. Additionally, investigations into genome-wide DNAm in the intestinal tissue of UC have identified specific DNAm alterations associated with genetic variations, disease status, severity, and clinical outcomes12,13,14. However, the differences in DNAm between inflamed and non-inflamed mucosa, as well as the relationship between DNAm and disease severity in treatment-naïve CD patients, remain unclear.

This study analyzed the DNAm profiles in the mucosa of treatment-naïve CD patients and examined the correlation between DNAm patterns and disease severity. It aimed to investigate the role of inflammation-associated DNAm in the immune signaling pathways of CD and to identify specific DNAm alterations that are significantly associated with disease severity, thereby informing the management of CD.

Methods

Patient enrollment and sample collection

All patient recruitment and sample collection were performed under full ethical approval from the Ruijin Hospital Ethics Committee, Shanghai Jiaotong University School of Medicine (2019-186). The study was conducted in accordance with the principles of the Declaration of Helsinki and all methods were performed according to the relevant guidelines and regulations15,16,17. This study enrolled 24 consecutive adult patients with treatment-naïve Crohn’s disease between January 2022 and May 2023. CD was diagnosed through a comprehensive evaluation of clinical manifestations, laboratory tests, endoscopic examinations and histopathological analyses in accordance with the guideline18. A paired tissue sample consisted of two specimens obtained from the same patient during a single colonoscopic examination: one from diseased tissue (inflamed regions) and the other from adjacent normal tissue. Typical preparatory protocols preceding a colonoscopy include dietary modifications, the administration of oral laxatives, and the suspension of certain medications, among other interventions19. All participants adhered to a standardized bowel preparation regimen and subsequently underwent a colonoscopic examination. The area surrounding an ulcer was regarded as the inflammation site, so biopsies were taken from this surrounding area rather than from the base of the ulcer. Biopsies were obtained from the intestinal segments exhibiting the most pronounced inflammation and ulceration. Four biopsies were collected from both inflamed and non-inflamed sites and stored at −80 °C. The key clinical characteristics, including age, gender, disease location (according to the Montreal classification), Simple Endoscopic Score for Crohn’s Disease (SES-CD)20, Crohn’s Disease Activity Index (CDAI)21,22, and behaviour, were collected (Table 1). Informed consent was obtained from all patients in this study.

Table 1 Basic characteristics of patients with CD in this study.

DNA extraction and RRBS

DNA was extracted from formalin-fixed paraffin-embedded (FFPE) samples from CD patients using the QIAamp DNA FFPE Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions23. Quality control measures were implemented to ensure the integrity of the extracted DNA samples. The methylation status of CpG sites was assessed using the reduced representation bisulfite sequencing (RRBS) method, as previously described24. A DNA input ranging from 50 ng to 100 ng was digested with the MspI enzyme prior to ligation with a methylated adaptor containing complementary sticky ends. Subsequently, the ligation products underwent bisulfite conversion using the Methylcode Bisulfite Conversion Kit (ThermoFisher, MECOV50), followed by purification and recovery steps25,26. The converted DNA was then amplified to introduce a barcode for Illumina sequencing. Finally, the libraries were sequenced on the Illumina HiSeq X10 platform27,28.

DNA methylation analysis

The Illumina bcl2fastq software was used to demultiplex the reads (https://support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software.html). FASTQ data were adapter-trimmed, with the first 2 bases removed from each end, using trim-galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore). Single-end reads were generated by merging the paired-end read FASTQ files. The single-end reads were aligned to the bisulfite-converted human reference genome (version hg19) using Bismark29 and Bowtie v.130, resulting in BAM files. The mapped BAM files were subsequently used for further analysis. CpG methylation rates were calculated with Bismark and used to identify differentially methylated CpG sites (DMCs) between inflamed and non-inflamed samples from CD patients. A minimum of five CpG sites was required to define a differentially methylated region (DMR). p-values were adjusted using the false discovery rate (FDR). The definitive DMCs were selected using the following criteria: an absolute methylation difference (|Δβ value|) of at least 10 and a corrected p-value (P.adjust) below 0.0001. Likewise, the definitive DMRs between inflamed and non-inflamed samples were determined by a threshold of |Δβ value| ≥ 10 coupled with a P.adjust < 0.01. For subgroup analyses stratified by SES-CD and CDAI, the DMR criteria were adjusted to a |Δβ value| ≥ 10 and a P.adjust < 0.05.
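The selection criteria above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: it assumes that the FDR adjustment is the Benjamini-Hochberg procedure and that Δβ is expressed in percentage points (inflamed minus non-inflamed), and the `Region` type, its field names, and the example values are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    name: str          # hypothetical region identifier
    delta_beta: float  # assumed: percentage points, inflamed minus non-inflamed
    p_value: float     # raw (unadjusted) p-value


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_regions(regions: List[Region],
                   min_abs_delta: float = 10.0,
                   max_p_adjust: float = 0.01) -> Dict[str, str]:
    """Keep regions with |delta_beta| >= min_abs_delta and adjusted p < max_p_adjust."""
    adjusted = benjamini_hochberg([r.p_value for r in regions])
    selected = {}
    for region, q in zip(regions, adjusted):
        if abs(region.delta_beta) >= min_abs_delta and q < max_p_adjust:
            selected[region.name] = "hyper" if region.delta_beta > 0 else "hypo"
    return selected


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    example = [
        Region("r1", 15.0, 0.0001),
        Region("r2", -12.0, 0.0002),
        Region("r3", 5.0, 0.0001),
        Region("r4", 20.0, 0.5),
    ]
    print(select_regions(example))
```

Under these assumptions, the DMC threshold corresponds to `max_p_adjust=0.0001` and the subgroup DMR threshold to `max_p_adjust=0.05`, with `min_abs_delta` left at 10.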

Statistical analysis

The volcano plot was generated using the ggplot2 package in R software (version 4.3.2, R Foundation for Statistical Computing, Vienna, Austria), and the heatmap was constructed with the pheatmap package in R31. Gene Ontology (GO) for biological process enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment (www.kegg.jp/kegg/kegg1.html)32,33,34 analyses were performed for genes annotated to DMRs using the Bioconductor R package clusterProfiler35,36. KEGG copyright permission is 251,684. The simplifyEnrichment R package was used to cluster the similarity matrices of the GO terms and to visualize, summarize, and compare the clusterings37. The VennDiagram R package was used to create a Venn diagram to illustrate the intersection of the DMRs. Two-tailed Mann–Whitney tests were used to compare distributions between two groups. A two-sided P value < 0.05 was considered statistically significant. Statistical analyses for the two groups in Figs. 1 and 3 were conducted using R version 4.3.2, while those for the two groups in Fig. 4 were performed with GraphPad Prism version 9.0.
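For readers unfamiliar with the two-tailed Mann–Whitney test, the following Python sketch shows how the U statistic and a two-sided p-value can be computed. It is an illustration only: the analyses in this study were run in R and GraphPad Prism, whereas this sketch uses the normal approximation with tie and continuity corrections, which can differ from exact p-values in small samples. The function name and example values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic of x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must contain at least one observation")
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign average ranks to tied values and accumulate the tie correction term.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference

    # Continuity-corrected z score; two-sided p = 2 * (1 - Phi(z)) = erfc(z / sqrt(2)).
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    return u_x, math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # Illustrative methylation levels only; these are not data from the study.
    u, p = mann_whitney_u([62.0, 70.5, 68.1, 75.3], [41.2, 55.0, 48.7, 60.9])
    print(f"U = {u}, two-sided p = {p:.4f}")
```

A result would be called significant under the criterion stated above when the returned p-value is below 0.05.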

Fig. 1

Differential methylation analysis of inflamed and non-inflamed mucosae from CD patients. (A) A volcano plot based on DMRs (P.adjust < 0.01, |Δβ value| ≥ 10). Blue represents hypomethylation sites, while red represents hypermethylation sites. (B) A total of 1,933 DMRs were hypermethylated (upper), while 754 DMRs were hypomethylated (lower) (p-values calculated using the Mann–Whitney U test). (C) Pie charts illustrate the functional genomic distribution of hypermethylated (upper) and hypomethylated (lower) DMRs. (D) A heatmap of DMRs identified in inflamed and non-inflamed mucosae. Each row represents a DMR-based tag. Each column represents a tissue specimen. Brown represents inflamed samples, and green represents non-inflamed samples. Yellow indicates increased DNA methylation in inflamed tissues compared to non-inflamed tissues; blue indicates decreased DNA methylation in inflamed tissues. ***P<0.001.

Results

Identification of inflammation-associated methylation signatures in CD

A total of 17,097 DMCs (Additional file 1) and 2,687 DMRs (Additional file 2) were identified between inflamed and non-inflamed mucosae. DMRs differed significantly between the two tissue types, with hypermethylated DMRs outnumbering hypomethylated DMRs in inflamed relative to non-inflamed regions (Fig. 1A-B). The majority of DMRs were in promoter regions (32%) and intron regions (31%). Specifically, hypermethylated DMRs were predominantly present in promoter regions (33%) and intron regions (30%), while hypomethylated DMRs showed a similar distribution in promoter regions (31%) and intron regions (32%) (Fig. 1C). The heatmap revealed a clear visual distinction in methylation features between the two groups (Fig. 1D). The top 10 DMRs are presented in Table 2.

Table 2 Top 10 DMRs associated with inflammation.

Immunological relevance of inflammation-associated DMRs in CD

GO analysis for biological process enrichment revealed that the set of genes (n = 2,028) annotated to our inflammation-associated DMRs was enriched in immune function, including changes in immune cell proliferation, activation, and differentiation (Additional file 3). After clustering the similarity matrices of the 221 biological process terms, 14 clusters were obtained, with 8 of these clusters containing at least five biological processes each (Fig. 2A). In alignment with our a priori hypotheses, two clusters, immune cell function and cell adhesion, were closely related to the immunology of IBD (Table 3). Intriguingly, several other clusters pointed towards a convergence with epithelial development and proliferation pathways, suggesting a broader impact of the identified DMRs on both immune regulation and tissue homeostasis. The cluster of immune cell functions comprises 14 biological processes and involves 89 genes, corresponding to 123 DMRs (Additional file 4). Similarly, the cluster of cell adhesion includes seven biological processes and 117 genes, accounting for 173 DMRs (Additional file 5). To prioritize the most significant DMRs, we ranked them based on P.adjust and listed them in Supplementary Table 1 for further examination. Among the genes annotated to these DMRs, the methylation of JAK3, SBNO2, LIMK1, CXCL5, and RUNX3 has been reported to be associated with IBD, consistent with the findings of previous studies38,39,40,41,42. The top 20 biological processes associated with immunology are presented in a bubble chart (Fig. 2B). KEGG enrichment analysis revealed nine significant pathways, including axon guidance, the Rap1 signaling pathway, regulation of actin cytoskeleton, Yersinia infection, the Notch signaling pathway, focal adhesion, Fc gamma R-mediated phagocytosis, bacterial invasion of epithelial cells, and pathogenic Escherichia coli infection (Fig. 2C). These signaling pathways play crucial roles in various biological processes, including immune regulation, inflammatory responses, cell migration, and cell adhesion, and are intimately linked to the pathological mechanisms underlying IBD.
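The enrichment results above rest on an over-representation test. As a hedged illustration of the underlying idea, the Python sketch below computes a one-sided hypergeometric p-value for a single term, the model generally used for this kind of analysis. It does not reproduce clusterProfiler's gene universe, annotation filtering, or multiple-testing adjustment, and the function name, parameters, and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(universe_size: int, term_size: int,
                       query_size: int, overlap: int) -> float:
    """One-sided hypergeometric p-value for over-representation.

    universe_size: number of annotated background genes
    term_size:     background genes annotated to the term
    query_size:    genes in the query list (e.g. genes annotated to DMRs)
    overlap:       query genes annotated to the term
    Returns P(X >= overlap) when query_size genes are drawn without replacement.
    """
    if not (0 <= term_size <= universe_size and 0 <= query_size <= universe_size):
        raise ValueError("term_size and query_size must lie within the universe")
    upper = min(term_size, query_size)
    total = comb(universe_size, query_size)
    # math.comb returns 0 when k > n, so impossible overlaps contribute nothing.
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, query_size - k)
        for k in range(max(overlap, 0), upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative counts only; these are not values from the study.
    print(enrichment_p_value(universe_size=10, term_size=4, query_size=3, overlap=2))
```

In practice one p-value is computed per GO term or KEGG pathway, and the resulting set of p-values is then adjusted for multiple testing before terms are called significant.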

Fig. 2

Biological relevance of the inflammation-associated DMRs in CD. (A) Similarity matrix of 221 biological process terms obtained from the enrichment analysis of genes mapped by the inflammation-associated DMRs. (B) The top 20 biological processes related to immunology, identified through GO enrichment analysis. (C) Enriched KEGG categories for DMRs.

Table 3 GO clusters associated with the immunology of IBD.

Methylation signatures were associated with disease severity in CD

As a proof of concept, we further investigated whether DNAm could distinguish patients with CD based on their disease severity. We compared the inflamed mucosae from 10 patients with mild CD (0 < SES-CD ≤ 3) to those from 12 patients with moderate CD (4 < SES-CD ≤ 6) and 2 patients with severe CD (SES-CD ≥ 7). The analysis revealed significant DNAm differences between patients with mild CD and those with moderate to severe (serious) CD. A total of 9,029 DMCs (Additional file 6) and 389 DMRs (Additional file 7) were identified based on a P.adjust < 0.05 (Fig. 3A). Among these DMRs, 291 were hypermethylated and 98 were hypomethylated (Fig. 3B). Most DMRs were located in promoter regions (33%) and intron regions (26%). Hyper- and hypomethylated DMRs were similarly distributed, with 32% and 36% in promoter regions and 27% and 20% in intron regions, respectively (Fig. 3C). Importantly, a heatmap of the top 100 DMRs ranked by P.adjust displayed a clear visual distinction between mild and moderate to severe mucosae from CD patients, indicating unique DNAm patterns associated with disease severity (Fig. 3D). DMRs associated with both SES-CD and inflammation are displayed in Table 4. Likewise, we compared the inflamed mucosae from 5 patients with mild CD (150 < CDAI < 220) to those from 16 patients with moderate CD (221 < CDAI < 450) and 3 patients with severe CD (CDAI > 450). A total of 9,268 DMCs (Additional file 8) and 327 DMRs (Additional file 9) were identified, with 201 being hypermethylated and 126 hypomethylated (Fig. 3E-F). The global distribution of hypermethylated and hypomethylated DMRs was similar to that observed in the SES-CD comparison (Fig. 3G). In contrast, a heatmap of the top 100 DMRs did not display a clear visual distinction between mild and moderate to severe mucosae from CD patients (Fig. 3H). We identified 18 DMRs associated with both CDAI and inflammation. Of these, only SLC25A10 has been reported to be associated with IBD (Table 5).

Fig. 3

Differential methylation analysis of inflamed mucosae categorized by SES-CD and CDAI scores. (A) A volcano plot based on DMRs (P.adjust < 0.05, |Δβ value| ≥ 10) between mild and moderate to severe (serious) inflamed mucosal samples categorized by SES-CD score. Blue represents hypomethylation sites, while red represents hypermethylation sites. (B) A total of 291 DMRs were hypermethylated (upper), while 98 DMRs were hypomethylated (lower). (C) Pie charts illustrate the functional genomic distribution of hypermethylated (upper) and hypomethylated (lower) DMRs. (D) A heatmap of DMRs identified in mild and serious inflamed mucosal samples from CD patients categorized by SES-CD score. (E) A volcano plot based on DMRs (P.adjust < 0.05, |Δβ value| ≥ 10) between mild and moderate to severe (serious) inflamed mucosal samples categorized by CDAI score. Blue represents hypomethylation sites, while red represents hypermethylation sites. (F) A total of 201 DMRs were hypermethylated (upper), while 126 DMRs were hypomethylated (lower). (G) Pie charts illustrate the proportion of genome-wide coverage of hypermethylated (upper) and hypomethylated (lower) DMRs. (H) A heatmap of the top 100 DMRs between mild and serious inflamed mucosal samples from CD patients categorized by CDAI score. Each row represents a DMR-based tag. Each column represents a tissue specimen. Brown represents mild samples, and green represents serious samples. Yellow indicates increased DNA methylation in serious tissues compared to mild tissues; blue indicates decreased DNA methylation in serious tissues. ***P<0.001.

Table 4 A total of 22 DMRs associated with Simple Endoscopic Score for Crohn’s Disease (SES-CD) and inflammation.
Table 5 A total of 18 DMRs associated with Crohn’s Disease Activity Index (CDAI) and inflammation.

Disease severity-specific inflammatory DMRs

To further select DNAm signatures specific to inflammation and disease severity in CD, we compared inflammation-associated DMRs with those identified in subgroups categorized by the SES-CD and CDAI. We found significant overlap within these groups of DMRs, particularly between inflammation-associated DMRs and those associated with severe disease status (Fig. 4A). Subsequently, we identified DMRs that were simultaneously hypermethylated or hypomethylated in both inflammation-associated and severe disease status-associated DMRs. This screen identified 36 shared DMRs between high SES-CD-associated and high CDAI-associated DMRs, of which 6 were also inflammation-associated (Fig. 4B). Of these six DMRs, KDM4B and CLDN15 showed the most significant differences between groups (P < 0.001).
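The intersection step described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used in the study: it assumes each DMR can be keyed by a single identifier (real DMRs are genomic intervals and may need coordinate-based matching), and the function name, region names, and directions are hypothetical.

```python
from typing import Dict


def concordant_overlap(*dmr_sets: Dict[str, str]) -> Dict[str, str]:
    """Return regions present in every set with the same direction in all of them.

    Each input maps a region identifier to "hyper" or "hypo".
    """
    if not dmr_sets:
        return {}
    first = dmr_sets[0]
    shared = set(first)
    for other in dmr_sets[1:]:
        shared &= set(other)
    return {
        name: first[name]
        for name in sorted(shared)
        if all(other[name] == first[name] for other in dmr_sets)
    }


if __name__ == "__main__":
    # Hypothetical identifiers and directions; these are not results from the study.
    inflammation = {"regionA": "hyper", "regionB": "hypo", "regionC": "hyper"}
    high_ses_cd = {"regionA": "hyper", "regionB": "hyper", "regionD": "hypo"}
    high_cdai = {"regionA": "hyper", "regionB": "hyper", "regionC": "hyper"}

    # Shared between the two severity comparisons, then also with inflammation.
    print(concordant_overlap(high_ses_cd, high_cdai))
    print(concordant_overlap(inflammation, high_ses_cd, high_cdai))
```

Requiring the same direction in every set is what separates a concordant signature from a region that merely appears in several lists.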

Fig. 4

Relationship between DNA methylation signatures of inflammation and disease severity. (A) Study design and the workflow for finding disease severity-specific inflammatory DMRs. Upon admission of the 24 patients, clinical information was collected, and paired tissues were obtained for RRBS. We compared inflammation-associated DMRs with those identified in subgroups categorized by the SES-CD and CDAI. A Venn diagram displays the intersection of DMRs associated with inflammation, SES-CD and CDAI. (B) Six shared DMRs exhibit a consistent pattern of hypermethylation across inflammation, high SES-CD, and high CDAI conditions.

Discussion

Over the past decades, extensive research has established a strong correlation between DNAm and the pathogenesis of IBD. Studies have shown that altered DNAm patterns in the circulation of IBD patients reflect inflammatory states and correlate with disease progression, treatment response, and genetic variants. Despite identifying methylation changes in disease tissues and specific cell types in IBD, the precise differences in DNAm between inflamed and non-inflamed mucosa, and their impact on disease severity and clinical outcomes in treatment-naïve CD patients, remain unclear. In this study, we utilized a treatment-naïve CD cohort to uncover distinct methylation patterns between inflamed and non-inflamed mucosae, as well as between inflamed mucosae of CD patients with varying disease severity.

We first demonstrated that significant DMRs exist between inflamed and non-inflamed mucosa in CD, most of them hypermethylated (1,933 hypermethylated DMRs and 754 hypomethylated DMRs). The majority of these DMRs were located in gene body regions, including exons and introns, and secondarily in promoter and distal intergenic regions. This localization pattern aligns well with the DMC outcomes reported in previous investigations, which compared colonoscopy samples of CD patients against healthy controls utilizing the HumanMethylation450K BeadChip platform43. Notably, no significant differences in the distribution patterns were discerned between hypermethylated and hypomethylated DMRs. Furthermore, several genes annotated to the most prominent DMRs have known disease associations. Among them, PKD144, RNF18645 and TPPP346 have been implicated in the pathogenesis of UC, while RUNX347 is a gene associated with IBD susceptibility. In addition, other notable DMRs have been shown to act as biomarkers of malignancy in a variety of cancers48,49,50,51,52,53,54,55,56, suggesting their potential role in CD. Notably, RNF186 and RUNX3 have been previously implicated in UC, highlighting the potential involvement of inflammation-associated DMRs in the underlying mechanisms of CD 42,57,58.

The intestinal immune system, composed of the intestinal epithelium, immune cells, and the gut microbiota, plays a crucial role in maintaining gut health. IBD arises from an atypical immune reaction to gut microorganisms in individuals with a genetic predisposition; however, the detailed mechanisms underlying this condition are still being elucidated. GO analysis revealed that the 2,028 genes associated with inflammation-related DMRs are significantly enriched in immune-related biological processes, including immune cell proliferation, activation, and differentiation. This enrichment underscores the pivotal role of epigenetic modifications in orchestrating immune responses central to CD pathology. Notably, the clustering of biological processes highlighted two major clusters, immune cell function and cell adhesion, that are intrinsically linked to IBD immunology. These findings align with established evidence that immune dysregulation and impaired cell adhesion are critical drivers of IBD pathogenesis. Furthermore, the identification of additional clusters related to epithelial development and proliferation suggests that the epigenetic alterations in CD extend beyond immune regulation to encompass epithelial integrity and tissue homeostasis. This dual impact is consistent with the multifactorial nature of CD, where both immune dysfunction and epithelial barrier defects contribute to disease progression. The interplay between immune cells and epithelial cells is crucial for maintaining intestinal homeostasis, and disruptions in this balance can lead to the chronic inflammation and tissue damage observed in CD patients. KEGG pathway enrichment analysis identified nine significant pathways. These pathways are integral to various biological processes such as immune regulation, inflammatory responses, cell migration, and adhesion, all of which are intimately linked to the pathological mechanisms of IBD. The identification of these pathways highlights the complex interplay between immune responses and epithelial cell functions in CD.

Upon conducting a more detailed analysis of the DNAm profiles in the mucosae of patient subgroups categorized by SES-CD and CDAI, we uncovered a substantial number of DMRs. This finding suggests that there are significant epigenetic variations in the mucosal tissues of patients with differing levels of disease activity and severity as measured by these clinical indices. However, the heatmap of the top 100 DMRs showed a distinct separation between the mild and moderate-to-severe mucosae of CD patients when categorized using the SES-CD, whereas no such clear distinction was observed when utilizing the CDAI. The clear distinction in DNAm patterns between mild and severe CD mucosae, as evidenced by heatmap analyses, indicates that DNA methylation profiles could serve as biomarkers for disease severity and progression. This prognostic potential is particularly valuable for tailoring therapeutic interventions and monitoring treatment efficacy. Additionally, our analysis revealed that several genes annotated to the top 10 DMRs, such as APBB1IP59, SLC25A1060, NLRP661,62 and NOTCH1 63,64, have been previously implicated in IBD. In addition, FMO565 and ECE166 have been implicated in intestinal inflammation. These associations further validate the relevance of our findings and highlight key players in the epigenetic regulation of CD. These observations contribute to the growing understanding of the role of epigenetics in the progression of IBD and suggest that endoscopic assessments like the SES-CD may be particularly valuable in identifying molecular signatures associated with disease activity. This knowledge could pave the way for the development of more targeted diagnostics and personalized treatment strategies tailored to the specific epigenetic profiles of individual patients.

In our final analysis, we explored the overlaps among DMRs linked to inflammation and disease severity in patients with CD. Our findings revealed that a significant proportion of these shared DMRs display consistent methylation patterns (either hypermethylation or hypomethylation) across inflammation, high SES-CD, and high CDAI-associated DMRs. This consistency suggests that there may be common epigenetic mechanisms underlying the inflammatory process and the severity of the disease as assessed by these clinical metrics. The identification of overlapping DMRs across different evaluations of disease severity highlights the intricate relationship between epigenetic modifications and the pathophysiology of CD. Specifically, the recurring methylation patterns indicate that certain epigenetic changes are central to both the initiation and exacerbation of inflammatory responses in CD patients. These shared epigenetic signatures may serve as critical biomarkers for disease monitoring and prognosis, offering insights into the molecular underpinnings that drive disease progression. Among the shared DMRs, KDM4B and CLDN15 emerged as particularly noteworthy genes. KDM4B is hypomethylated in patients with more favorable clinical status relative to those with high CDAI and SES-CD scores. This inverse relationship implies that distinct methylation patterns may be associated with better disease control and improved patient outcomes, as opposed to those indicative of active disease or higher disease activity. CLDN15 plays a key role in intestinal barrier function. CLDN15 methylation levels are also positively correlated with the occurrence and severity of inflammation, which may influence the permeability of intestinal epithelial cells and exacerbate intestinal inflammatory responses in patients with IBD.

Beyond KDM4B and CLDN15, the consistency of methylation patterns across various DMRs underscores the potential of DNA methylation profiling as a tool for personalized medicine in CD. Future research should focus on the functional validation of identified DMRs and associated genes to uncover their mechanistic roles in CD. Integrating genetic and epigenetic data could provide a more comprehensive understanding of CD susceptibility and progression, elucidating how genetic predispositions interact with epigenetic modifications to drive disease. Moreover, exploring the therapeutic potential of targeting specific epigenetic regulators, such as KDM4B, may pave the way for innovative interventions aimed at restoring immune balance and epithelial integrity in CD patients.

While our study provides significant insights into the epigenetic mechanisms underlying CD, several limitations must be acknowledged. First, the relatively small sample size, particularly within the severe CD subgroup, may limit the generalizability of our findings and necessitates validation in larger, independent cohorts. Second, the absence of a well-matched control group restricts our ability to fully contextualize the methylation changes observed in CD patients. Including a healthy control group in future studies would allow for a more comprehensive comparison and enhance the robustness of our conclusions. Additionally, the cross-sectional design of our study precludes establishing causal relationships between DNAm changes and disease progression. Longitudinal studies are essential to elucidate the temporal dynamics of epigenetic modifications and their causal roles in CD pathogenesis.

Conclusions

Our study identifies distinct DNAm patterns in inflamed mucosae in treatment-naïve CD patients. These DMRs are involved in immune cell function and cell adhesion, suggesting a potential role in immune modulation and tissue equilibrium in CD. Our analysis further suggests the potential relevance of six inflammation-associated markers, particularly those identified by SES-CD/CDAI, in understanding disease activity and progression.