Introduction

IgA nephropathy (IgAN) is a highly prevalent primary glomerulonephritis, with a significant proportion of patients progressing to end-stage kidney disease (ESKD)1. However, accurate assessment of disease severity and progression risk remains a major clinical challenge. Current prognostic markers, including proteinuria and histopathological findings, are limited in capturing patients with early disease dynamics before irreversible loss of kidney function occurs2,3. This delay in detection limits the opportunity for early intervention. Therefore, disease-specific and non-invasive biomarkers that reflect underlying immunopathology and can discriminate early-stage patients with progressive potential are needed to guide timely and personalized treatment.

Increasing evidence suggests that epigenetic dysregulation—including DNA methylation, histone modifications, and non-coding RNA activity—plays a critical role in IgAN pathogenesis. Altered expression of C1GALT1, a key enzyme for galactose-deficient IgA1, has been linked to microRNA upregulation4, while histone methylation changes have been associated with aberrant IgA receptor signaling5. Recent advances in chromatin accessibility profiling allow for cell-type–specific insights into transcriptional regulation and immune cell state dynamics in various diseases6,7,8,9. Since chromatin accessibility governs transcription factor binding and gene expression, this approach offers a powerful tool to elucidate disease-specific regulatory mechanisms and identify biomarkers that reflect disease heterogeneity, even among patients at similar clinical stages.10

Previous studies have highlighted the role of T cells in the pathogenesis of IgAN11,12. In particular, increased activated CD8⁺ T cells showed correlation with renal function deterioration12, while increased tubulointerstitial CD8⁺ T cell infiltration was associated with disease progression13. Recent single-cell analyses have identified specific CD8⁺ T cell subpopulations with distinct transcriptional signatures in IgAN patients, suggesting heterogeneous functional roles14. These findings collectively suggest that CD8⁺ T cells may contribute to both glomerular and interstitial injury through cytotoxic and pro-inflammatory functions. However, the precise immunological regulatory pathways of CD8⁺ T cells on pathogenesis and progression of IgAN require further investigation.

Thus, in this study, we applied the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-Seq) to sorted peripheral CD8⁺ T cells from patients with biopsy-proven IgAN, aiming to identify stage-specific chromatin accessibility profiles and to uncover novel epigenetic biomarkers that discriminate early IgAN. Furthermore, we evaluated the potential of these epigenetic markers as candidate biomarkers and explored their translational applicability in personalized therapeutic approaches.

Methods

Ethical statement

This study was approved by the Institutional Review Board (IRB) of Seoul National University Hospital (SNUH) (IRB No. 1903-128-1020). The use of peripheral blood samples stored in the SNUH Human Biobank was authorized with written informed consent. The authors adhered to the ethical principles of the latest version of the Declaration of Helsinki.

Study design and participants

We screened adult patients (≥ 20 years old) with biopsy-confirmed IgAN who donated kidney tissue, blood, and urine samples to the SNUH Human Biobank between 2006 and 2019. Patients were stratified into early- and late-stage groups based on eGFR and urine protein-to-creatinine ratio (UPCR) values measured at the time of peripheral blood sampling and during follow-up (median follow-up: 3.16 years, interquartile range: 2.71–3.35). Early-stage disease was defined as having an eGFR ≥ 60 mL/min/1.73 m² at the time of peripheral blood sampling, an average eGFR ≥ 30 mL/min/1.73 m² (calculated from serial measurements during the follow-up period after peripheral blood sampling), or UPCR < 3 g/g. Those who did not meet the criteria were classified into the late-stage group. In addition, those with a last follow-up eGFR ≤ 15 mL/min/1.73 m² were classified as late-stage irrespective of other criteria. In total, 17 patients were enrolled and stratified into an early-stage group (n = 11) and a late-stage group (n = 6).
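The stratification rule can be sketched as a small function. This is an illustrative reading only: the text joins the three early-stage criteria with "or" but then refers to patients who "did not meet the criteria", so the sketch exposes a hypothetical `require_all` flag rather than fixing one interpretation. The `Patient` field names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    egfr_at_sampling: float    # eGFR at blood sampling, mL/min/1.73 m^2
    mean_followup_egfr: float  # mean of serial eGFR values during follow-up
    upcr: float                # urine protein-to-creatinine ratio, g/g
    last_egfr: float           # eGFR at last follow-up


def classify_stage(p: Patient, require_all: bool = True) -> str:
    """Return "early" or "late" under the thresholds quoted in the text.

    require_all is a hypothetical switch: True demands every early-stage
    criterion, False accepts any single one.
    """
    # Override stated in the text: last follow-up eGFR <= 15 is always late.
    if p.last_egfr <= 15:
        return "late"
    criteria = [
        p.egfr_at_sampling >= 60,
        p.mean_followup_egfr >= 30,
        p.upcr < 3,
    ]
    met = all(criteria) if require_all else any(criteria)
    return "early" if met else "late"
```

The input values used to exercise such a function would be invented; they are not taken from the cohort.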

Cell sorting and sample preparation

Peripheral blood mononuclear cells (PBMCs) were isolated using a Ficoll–Histopaque gradient (GE Healthcare Life Sciences, Piscataway, NJ, USA). Isolated PBMCs were resuspended in a freezing medium (50% fetal bovine serum [FBS], 10% dimethyl sulfoxide [DMSO], and 40% RPMI-1640) (Invitrogen, Carlsbad, CA, USA) and stored in liquid nitrogen. For CD8+ T cell isolation, cryopreserved PBMCs were thawed, stained with fluorochrome-conjugated anti-human CD8 antibodies, and sorted using a FACSAria III cell sorter (BD Biosciences, San Jose, CA, USA).

ATAC library preparation and sequencing

ATAC libraries were prepared from 50,000 sorted viable CD8⁺ T cells per sample, as previously described15. Cells were washed with 1× cold Dulbecco’s phosphate-buffered saline (DPBS) at 4°C and pelleted by centrifugation at 500 × g for 5 min at 4°C. Pellets were resuspended in 50 μL of cold lysis buffer (10 mM Tris-HCl, pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40, 0.1% Tween-20, and 0.01% Digitonin) and incubated on ice for 3 min. Lysis was stopped with 1 mL of wash buffer (10 mM Tris-HCl, pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20), followed by centrifugation at 500 × g for 10 min at 4°C to collect nuclei.

Isolated nuclei were incubated in transposase reaction mix (25 μL 2× TD buffer, 16.5 μL 1× cold DPBS, 2.5 μL Tn5 transposase [Illumina, San Diego, CA, USA], 0.5 μL 0.1% Tween-20, 0.5 μL 0.01% Digitonin, and 5 μL nuclease-free water), at 37°C for 45 min on a thermomixer at 1,000 rpm. Transposed DNA was purified using a Qiagen PCR cleanup kit. Library amplification was performed with indexing primers from the Nextera kit (Illumina). The optimal number of PCR cycles for each sample was determined via qPCR to avoid over-amplification. Final libraries were quantified using a Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and verified with the KAPA Library Quantification Kit (Roche Applied Science, Basel, Switzerland). Indexed libraries were pooled and sequenced on an Illumina NextSeq6000 platform using a paired-end 150 bp sequencing strategy.

ATAC-Seq data processing and biomarker selection

Paired-end reads from the sequenced ATAC-Seq libraries were aligned to the human reference genome (hg19) using Bowtie2 (v2.4.1) software with the default parameter set and the “-X 2000” option to accommodate the expected fragment size range16. Aligned reads were converted to BED format using the “bamToBed” function in BEDtools17. To correct for Tn5 transposase insertion bias, the 5′ ends of reads were adjusted by +4 bp for the positive strand and −5 bp for the negative strand, accounting for the 9 bp duplication introduced during tagmentation. Prior to downstream analysis, library quality and complexity were assessed according to the ENCODE ATAC-seq quality control guidelines, using parameters such as PBC1, PBC2, NRF, FRiP, and TSSE (Supplementary Table S1). Genome browser tracks were generated for each sample using MACS3 (v3.0.0a5) software with the “callpeak --bdg” option18. Subsequently, the resultant bedGraph files were converted to bigWig format using the “bedGraphToBigWig” function in KentUtils19. To enable inter-sample comparison, genome browser tracks were normalized using a scaling factor derived from stable control ATAC-peak regions identified across all samples. Initially, 20 control loci were used as described previously20, and subsequent validation confirmed that the normalization factor converged with expansion to >200 invariant loci. In total, 44,298 raw ATAC-peaks were detected across all CD8⁺ T-cell samples (q < 0.001). After removing sample-specific and low-confidence peaks, we defined a consensus set of 22,149 peaks that were present in ≥ 80% of donors and had an average ATAC-peak area above the median. These peaks were used for downstream analyses to ensure reproducibility. We then identified candidate ATAC-peaks that differentiated between the early and late groups by selecting regions with a fold change > 2.0 in either direction and P < 0.05 using a t-test.
To maintain consistency with current genomic references, all genomic coordinates were converted from hg19 to hg38 using the UCSC liftOver tool. The complete list of liftover-converted differentially accessible regions (DARs) for the 279 ATAC-marker candidates is provided in Supplementary Table S2, and the statistical results from the negative-binomial generalized linear model (GLM) using edgeR21, incorporating age and sex as covariates (~ group + age + sex) are provided in Supplementary Table S3.
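As an illustration of the Tn5 offset correction described above, the following minimal sketch shifts the 5′ end of a single BED interval. It assumes BED coordinates (0-based, half-open), so the 5′ end is `start` on the positive strand and `end` on the negative strand. The function name `shift_tn5` is hypothetical and is not part of BEDtools or any pipeline named in the text.

```python
def shift_tn5(chrom: str, start: int, end: int, strand: str):
    """Shift the 5' end of one aligned read for the Tn5 insertion offset.

    Assumes BED coordinates (0-based start, exclusive end).
    Positive strand: 5' end is `start`, moved by +4 bp.
    Negative strand: 5' end is `end`, moved by -5 bp.
    """
    if strand == "+":
        start += 4
    elif strand == "-":
        end -= 5
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    return chrom, start, end, strand
```

For example, a positive-strand read at chr1:100-250 becomes chr1:104-250, and the same interval on the negative strand becomes chr1:100-245.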

Enrichment score estimation of CD8+ T-cell subsets

To more accurately reflect disease-associated alterations in CD8+ T-cell subsets, we established a disease-specific reference framework using a single-cell RNA-sequencing (scRNA-seq) dataset generated from the same IgAN patient cohort (GSE285335; data under review). CD8⁺ T cells were clustered using Seurat (v5.2.1), and four major subsets were identified based on IL-7Rα expression and canonical differentiation markers, corresponding broadly to naïve-like+central memory and progressively differentiated effector-memory populations.

Signature genes for all four subsets were determined by differential expression analysis (one subset versus all others) using edgeR with thresholds of fold change > 1.5 and P < 0.0521. Genes showing stable intra-cluster expression (coefficient of variation < 0.3) were retained, and genes shared between subsets were excluded. Promoter-associated ATAC-peaks in the bulk dataset were selected when they overlapped the promoter regions (core, proximal, and distal promoters) of the non-redundant signature genes.
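The two filtering steps (stability by coefficient of variation, then removal of genes shared between subsets) can be sketched as follows. This is a simplified illustration rather than the authors' code: the input layout, the use of the population standard deviation, and the function names are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population SD assumed)."""
    m = mean(values)
    return pstdev(values) / m if m else float("inf")


def nonredundant_signatures(candidates, expression, max_cv=0.3):
    """Keep stable genes per subset, then drop genes found in several subsets.

    candidates: subset name -> list of differentially expressed genes
    expression: subset name -> gene -> expression values within that cluster
    """
    stable = {
        subset: {
            g for g in genes
            if coefficient_of_variation(expression[subset][g]) < max_cv
        }
        for subset, genes in candidates.items()
    }
    # Count in how many subsets each stable gene appears.
    counts = Counter(g for genes in stable.values() for g in genes)
    return {
        subset: sorted(g for g in genes if counts[g] == 1)
        for subset, genes in stable.items()
    }
```

With toy inputs, a gene whose expression varies widely within its cluster is dropped at the first step, and a gene that is stable in two subsets is dropped from both at the second.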

The corresponding promoter peaks were assembled into a subset-specific signature matrix after Z-score normalization to minimize inter-sample variability. Subset enrichment within each bulk ATAC-seq sample was quantified using single-sample gene set enrichment analysis (ssGSEA) implemented in the GSVA R package (v2.0.4). Enrichment scores were derived from promoter accessibility and compared between early- and late-stage IgAN groups using Student’s t-test.
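The Z-score step can be illustrated with a short sketch. It assumes that each promoter peak is standardized across samples using the sample standard deviation; the text does not state the axis or the estimator, so both are assumptions, and the ssGSEA step itself is not reproduced here.

```python
from statistics import mean, stdev


def zscore_rows(matrix):
    """Standardize each peak's accessibility values across samples.

    matrix: peak ID -> list of accessibility values, one per sample.
    Peaks with zero variance are mapped to all zeros.
    """
    scaled = {}
    for peak, values in matrix.items():
        m, s = mean(values), stdev(values)
        scaled[peak] = [(v - m) / s if s else 0.0 for v in values]
    return scaled
```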

Calculation of weighted chromatin openness score

To quantify chromatin openness, a weighted composite score was calculated for each donor based on selected ATAC-based biomarkers. For each biomarker, the normalized peak area was binarized by assigning a value of 1 if it exceeded a predefined threshold, and 0 otherwise. Biomarkers were ranked by relative importance, and weights were assigned accordingly in descending order. The final weighted score for chromatin openness was obtained by multiplying each binary value by its corresponding weight and summing the products across all selected biomarkers.
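A minimal sketch of this calculation is given below. The text does not state the actual weight values, so the sketch assumes linearly descending weights (n for the top-ranked biomarker down to 1 for the last); the function and argument names are likewise hypothetical.

```python
def weighted_openness_score(areas, cutoffs, ranked_markers):
    """Sum of weight x binary call over biomarkers for one donor.

    areas: marker -> normalized peak area for this donor
    cutoffs: marker -> predefined threshold
    ranked_markers: markers ordered from most to least important
    """
    n = len(ranked_markers)
    score = 0
    for rank, marker in enumerate(ranked_markers):
        weight = n - rank  # assumed weights: n, n-1, ..., 1
        binary = 1 if areas[marker] > cutoffs[marker] else 0
        score += weight * binary
    return score
```

For three markers in which only the first and third exceed their thresholds, the score under these assumed weights is 3 + 1 = 4.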

Determination of cut-off for biomarkers and weighted score

To determine optimal cut-off values for predictive biomarkers based on normalized chromatin accessibility, the Cutoff Finder tool22 was used. The cut-off value was established by comparing the early- and late-stage groups and selecting the point that minimized the Euclidean distance to the top-left corner of the ROC curve to achieve optimal sensitivity and specificity. The determination process for each DAR cut-off value consisted of two steps. First, the predictive performance of each biomarker was evaluated by calculating the area under the ROC curve. Second, the discriminative ability of the cut-off value was assessed by calculating its sensitivity and specificity, ensuring robust separation between the early- and late-stage IgAN groups. These cut-off values were subsequently used for both individual biomarker analysis and the calculation of the weighted chromatin openness score.
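The distance criterion can be sketched as follows. This is an illustration of the general approach and not the Cutoff Finder implementation: it assumes that each observed value is tried as a candidate threshold and that values at or above the threshold are called positive.

```python
from math import hypot


def optimal_cutoff(values, labels):
    """Pick the threshold closest to the top-left corner of the ROC curve.

    values: normalized peak areas
    labels: True for the group expected to show higher values
    Returns (cutoff, sensitivity, specificity).
    """
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    best = None
    for cut in sorted(set(values)):
        sens = sum(v >= cut for v in pos) / len(pos)
        spec = sum(v < cut for v in neg) / len(neg)
        # Distance to the ideal point (sensitivity = 1, specificity = 1).
        dist = hypot(1 - sens, 1 - spec)
        if best is None or dist < best[0]:
            best = (dist, cut, sens, spec)
    return best[1], best[2], best[3]
```

Ties are resolved in favor of the lowest threshold here, which is an arbitrary choice of this sketch.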

Motif analysis and identification of downstream genes

Transcription factor motifs within DARs were identified using the “findKnownMotifs.pl” function in the HOMER suite23, which assigns known motifs and calculates their statistical significance. Significant motifs were ranked by ascending P-value, and the top five enriched motifs were selected independently for the early- and late-stage groups. Consensus motifs were then generated based on the frequency of actual sequence occurrences within assigned peaks. Consensus motifs were visualized using the “seqLogo” R package24, which provides a graphical representation of base composition across motif positions.

To explore the downstream regulatory effects of the identified transcription factors, Ingenuity Pathway Analysis (IPA)25 was used. IPA enabled the prediction of transcriptional regulatory networks and the identification of functionally relevant target genes. To assess the expression profiles of these downstream genes, we used publicly available RNA-seq data from defined immune cell subsets in peripheral blood of healthy individuals (GSE227743). Only genes with a minimum expression level greater than 20 were included for visualization.

Quantitative assessment of chromatin openness using ATAC-qPCR

Chromatin accessibility at selected genomic loci was quantitatively assessed using real-time PCR with primers targeting regions proximal to the summits of previously identified ATAC-peaks. Primer pairs were designed using Primer-BLAST to generate amplicons of 100–150 base pairs. Primers predicted to produce off-target products under 500 base pairs or containing mismatches within the 3′ terminal three bases were excluded to ensure specificity. A comprehensive list of validated primer sequences for all target and control loci is provided in Supplementary Table S4.

ATAC-qPCR was performed using a 1:40 dilution of the original ATAC library as the template. Reactions were run in technical triplicates on a QuantStudio™ 6 Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA) using SYBR GreenER™ qPCR SuperMix (Thermo Fisher Scientific) and 200 nM each of forward and reverse primer. The thermal cycling protocol included UDG activation (50°C for 2 min), polymerase activation (95°C for 2 min), followed by 40 cycles of denaturation (95°C for 15 sec) and annealing/extension (60°C for 1 min). All primer sets underwent pre-evaluation using 1 ng of genomic DNA isolated from pooled PBMCs of IgAN patients (QIAamp® DNA Mini Kit, Qiagen) under the same cycling conditions. Only primer pairs exhibiting efficient and specific amplification, without primer-dimer formation (confirmed by melting curve analysis), were selected for subsequent analysis. To determine normalized chromatin openness, mean Ct values were transformed using the 2^(−Ct) method. Relative chromatin openness was then calculated by normalizing these transformed values to the geometric mean of three control loci.
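The normalization described above can be written out as a short sketch. It assumes that technical replicates are averaged at the Ct level before the transformation, as the text indicates; the function name and input layout are hypothetical.

```python
from statistics import geometric_mean, mean


def relative_openness(target_cts, control_cts):
    """Relative chromatin openness of one target locus in one sample.

    target_cts: technical replicate Ct values for the target locus
    control_cts: one list of replicate Ct values per control locus
    """
    target = 2 ** (-mean(target_cts))
    controls = [2 ** (-mean(cts)) for cts in control_cts]
    # Normalize to the geometric mean of the transformed control values.
    return target / geometric_mean(controls)
```

Under this formula, a target locus that amplifies one cycle earlier than all three control loci has a relative openness of about 2.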

Statistical analyses

Data are presented as means ± standard deviation (SD) or means ± standard error of the mean (SEM). Group comparisons were performed using Student’s t-test. Correlations were assessed using Pearson’s correlation coefficient. All statistical analyses were conducted using GraphPad Prism 8.02 (GraphPad Software, La Jolla, CA, USA) and R statistical software (v4.4.1; R Core Team 2021). Biomarker cut-offs were derived using the Cutoff Finder algorithm22. ROC analysis was used to evaluate classifier performance. A two-tailed P < 0.05 was considered statistically significant. Data visualization was performed using GraphPad Prism 8.02 for volcano plots, dot plots, bar plots, pie charts, ROC curves, and waterfall plots. Genome browser tracks were visualized using the UCSC Genome Browser (https://genome.ucsc.edu/). Heatmaps were generated in R using the pheatmap package (v1.0.12). Regulatory network analyses were performed using Ingenuity Pathway Analysis (IPA, Qiagen).

Results

Clinical characteristics of IgAN patient cohort

We analyzed cryopreserved lymphocytes from the peripheral blood of 17 patients with biopsy-confirmed IgAN. The late-stage group showed a lower eGFR value at the time of kidney biopsy compared to the early-stage group (Supplementary Table S5).

Identification of differential chromatin landscapes and stage-specific biomarkers

We hypothesized that variations in chromatin accessibility in CD8⁺ T cells would reflect differences in disease severity. To explore this, we conducted ATAC-Seq to compare chromatin accessibility between the groups. CD8+ T cells were isolated from PBMCs using FACS, and ATAC libraries were prepared via Tn5 transposase-mediated tagmentation. Sequencing reads were mapped to the human reference genome (hg19) and ATAC-peak area values were normalized using control peaks for robust inter-sample comparison20 (Figure 1a).

Fig. 1

Workflow for ATAC-Seq library preparation, analysis, and biomarker selection in circulating CD8+ T cells from IgAN patients. a Schematic of the ATAC-Seq workflow. Circulating CD8+ T cells were isolated from peripheral blood mononuclear cells (PBMCs) of IgAN patients. Genomic DNA from open chromatin regions was fragmented via tagmentation using Tn5 transposase. The resulting ATAC libraries were sequenced and aligned to the human reference genome (hg19). Chromatin accessibility for each sample was normalized using a factor derived from 20 control peaks. Normalized enrichment values within ATAC-peaks were converted to area values. DARs were identified by comparing early- and late-stage groups using fold change and Student’s t-test. Schematic created with BioRender.com. b Biomarker selection process. A total of 279 DARs with fold change > 2 and P < 0.05 between the early-stage (blue) and late-stage (red) groups were identified (left, middle). Of these, 122 DARs with peak widths < 500 bp were selected as candidate biomarkers (right, top), including 104 peaks enriched in the early-stage group and 18 peaks enriched in the late-stage group. Biomarkers were ranked based on relative distance and mean area values in the non-enriched group. A heatmap shows normalized area values as Z-scores ranging from ‒2 (navy) to +2 (yellow) (right, bottom).

From these data, 279 differentially accessible regions (DARs) were identified based on a fold-change > 2 and P < 0.05 from 330 accessible regions satisfying fold-change > 2 (Figure 1b, left and middle and Supplementary Table S2). Among them, 122 peaks < 500 base pair (bp) in width—representing mono- and di-nucleosome-sized ATAC-peaks—were further classified as potential stage-specific biomarkers. These were ranked based on inter-group differences in normalized peak areas (Figure 1b, right). Notably, the top-ranked biomarkers were predominantly enriched in early-stage group, whereas late-stage-specific peaks became apparent at lower ranks (after rank 31), suggesting stage-dependent regulatory signatures in CD8+ T cells (Figure 1b, right).

Chromatin landscape analysis reveals stage-specific regulatory signatures

The chromatin profiles were characterized in more detail. A total of 234 DARs were enriched in early-stage group, while 45 peaks were enriched in late-stage group, indicating higher overall chromatin accessibility in early-stage group (Figure 2a). Additionally, the generalized linear model confirmed the robustness of the identified DARs (Supplementary Table S3). Clustering using t-distributed stochastic neighbor embedding (t-SNE) based on normalized ATAC-peak area values revealed clear segregation of early- and late-stage groups (Figure 2b). Genomic annotation of DARs showed that 81.7% were located within gene bodies or promoter regions, underscoring potential impacts on transcriptional regulation (Figure 2c). The findings suggest a more accessible epigenetic landscape in early-stage IgAN and highlight the potential of chromatin accessibility signatures to reflect early IgAN, offering insight into underlying immune mechanisms before overt renal function decline.

Fig. 2

Characterization of DARs in early and late-stage groups. a Volcano plot of 234 peaks enriched in the early-stage group (blue) and 45 peaks in the late-stage group (red) with fold change > 2 and P < 0.05. b t-SNE based on 279 DARs for 17 patients (early: blue, late: red). Each dot represents an individual patient. c Genomic annotation of the 279 DARs. Peaks were categorized into promoters, gene bodies, ncRNA, and intergenic regions. Promoters were further divided into distal (–30,000 to –10,000 bp from TSS), proximal (–10,000 to –500 bp), and core (–500 to +500 bp). Gene bodies included 5’-UTR, introns, exons, 3’-UTR, and TTS (+1,000 bp downstream from 3’-UTR). ATAC-peaks mapped to non-coding RNA regions were annotated as ncRNA. The other ATAC-peaks were grouped as intergenic regions.
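The promoter categories defined in this legend can be expressed as a simple distance rule. The sketch below is illustrative only: boundary handling (which category owns −500 and −10,000) and the use of the peak center are assumptions, and the function name is hypothetical.

```python
def promoter_class(peak_center: int, tss: int, strand: str = "+"):
    """Assign a promoter category from the signed distance to the TSS.

    Negative offsets are upstream of the TSS in the gene's orientation.
    Returns None when the peak lies outside all promoter windows.
    """
    offset = peak_center - tss if strand == "+" else tss - peak_center
    if -500 <= offset <= 500:
        return "core promoter"
    if -10_000 <= offset < -500:
        return "proximal promoter"
    if -30_000 <= offset < -10_000:
        return "distal promoter"
    return None
```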

Transcription factor motif analysis and CD8+ T-cell subset deconvolution

To investigate whether the observed DARs in CD8+ T cells reflected shifts in cellular composition and were regulated by distinct transcription factors governing differentiation, we re-estimated CD8+ T-cell subset composition using a disease-specific reference framework. This framework was derived from a single-cell RNA-sequencing dataset generated from the same IgAN patient cohort (GSE285335; data under review).

CD8+ T cells were clustered based on IL-7Rα expression and canonical differentiation markers, corresponding broadly to naïve-like + central memory (CM) and progressively differentiated effector-memory populations. Subset-specific signature genes were identified by differential expression analysis (one subset versus all others) and mapped to promoter-overlapping ATAC-peaks in the bulk dataset to construct a subset-specific signature matrix. Single-sample gene set enrichment analysis (ssGSEA) was then performed using promoter accessibility values to quantify subset enrichment in each sample (Figure 3a)26.

Fig. 3

Deconvolution of CD8⁺ T-cell subsets and transcription factor motif analysis. a Schematic overview of the integrative analysis framework. Single-cell RNA-sequencing (scRNA-seq) data from the same IgAN patient cohort (GSE285335; data under review) were used to define disease-specific CD8+ T-cell subsets based on IL-7Rα expression and canonical differentiation markers. Subset signature genes were identified by differential expression analysis and mapped to promoter regions overlapping ATAC-peaks in the bulk dataset. Promoter accessibility values were converted to Z-scores, and single-sample gene set enrichment analysis (ssGSEA) was performed to quantify subset-specific enrichment across individual ATAC-seq samples. Schematic created with BioRender.com. b Heatmap showing the scaled enrichment scores of four CD8+ T-cell subsets (Naïve-like+CM, IL-7Rαhigh EM, IL-7Rαmed EM, IL-7Rαlow EM) across early-stage and late-stage IgAN samples. c Comparison of ssGSEA enrichment scores for the Naïve-like+CM (top) and IL-7Rαlow EM (bottom) subsets between early-stage (blue) and late-stage (red) groups. Each point represents an individual patient; bars indicate mean ± SEM. Statistical significance was determined using two-tailed Student’s t-test. d Top 5 enriched HOMER transcription factor motifs in the early-stage (left) and late-stage (right) groups, with consensus sequences derived from motif-matched genomic regions of DARs.

The early-stage group showed enrichment of naïve-like+CM CD8⁺ T cells, while IL-7Rαlow EM CD8⁺ T cells were more prevalent in the late-stage group (Figures 3b and c). Given that IL-7Rα expression on CD8+ T cells is known to decrease with aging27, epigenetic changes associated with advanced IgAN may be particularly prominent within the terminally differentiated subset in the late stages of the disease. In contrast, the enrichment of naïve-like+CM CD8⁺ T cells in early-stage IgAN suggests a more immunologically active and developmentally plastic state, potentially reflecting ongoing immune activation and early pathogenic processes.

We then conducted transcription factor (TF) motif analysis on stage-enriched DARs. Both groups showed enrichment for RUNX1, a key regulator of T cell homeostasis28. In the early-stage group, motifs for ETS1, LEF1, FLI1, and RUNX2 were enriched—TFs associated with T cell survival, proliferation, and memory formation (Figure 3d, left). Downstream targets of these TFs included cytotoxic effectors (GZMB and PRF1), apoptotic regulators (MYC), and differentiation markers (IL7R and TBX21), indicating active immune regulation in early-stage patients. These transcriptomic analyses collectively suggest that the early-stage group exhibits a predominance of cytotoxic CD8+ T cells, characterized by enhanced apoptotic potential and differentiation capabilities (Supplementary Figure S1a).

In contrast, late-stage DARs were enriched for motifs of EOMES, TBX21, IRF4, and GABPA (Figure 3d, right). These TFs are associated with terminal differentiation and T cell exhaustion in chronic immune activation29,30,31,32. Downstream targets included PRF1, PDCD1, and IL2RB, suggesting effector and inhibitory signatures in late-stage CD8⁺ T cells. Together, these results highlight distinct transcriptional regulatory networks between disease stages, with the late stage characterized by terminal differentiation and activation-induced exhaustion (Supplementary Figure S1b).

Diagnostic performance of chromatin accessibility biomarkers

To evaluate the diagnostic potential of stage-specific DARs, receiver operating characteristic (ROC) curve analysis was conducted to determine optimal cut-off values based on normalized area distributions. Optimal thresholds were selected by minimizing the Euclidean distance from the ideal point (100% sensitivity, 100% specificity), ensuring balanced discrimination between early- and late-stage groups (Figure 4a).

Fig. 4

Evaluation of biomarker performance distinguishing early- and late-stage groups. a Schematic of biomarker assessment process. Area values representing chromatin openness were compared between groups. Cut-off values were determined using Euclidean distance in ROC analysis. Waterfall plots visualize area values after cut-off adjustment; filled and empty circles indicate patients above or below the cut-off, respectively. b, d Representative genome browser tracks of the top 10 biomarkers enriched in early-stage (b) and late-stage groups (d). c, e Classification of patients above (black circle) and below (open circle) the cut-off for each biomarker in the early-stage (c) and late-stage (e) groups. Sensitivity and specificity are color-coded from 50% (blue) to 100% (red).

The top 10 biomarkers from each group were selected. In the early-stage group, seven biomarkers were located in gene bodies, one in a promoter, and two in intergenic regions (Figure 4b), achieving an average sensitivity of 98.3% (range: 83.3–100.0%) and specificity of 93.6% (range: 81.8–100.0%) (Figure 4c, Supplementary Figure S2). In the late-stage group, five biomarkers each were located in gene bodies and intergenic regions (Figure 4d), with an average sensitivity of 85.0% (range: 66.7–100.0%) and specificity of 89.1% (range: 72.7–100.0%) (Figure 4e, Supplementary Figure S3). Collectively, these results underscore the robustness and discriminative potential of stage-specific chromatin accessibility biomarkers in circulating CD8⁺ T cells for distinguishing IgAN patients according to disease stage.

Composite biomarker scoring enhancing discriminatory power

To enhance stratification, we generated a composite weighted score by summing the scores of individual biomarkers exceeding their respective cut-off values, as previously described20 (Figures 4a and 5a). Combining the top 10 biomarkers from each group revealed statistically significant differences between the early- and late-stage groups (P < 0.001 for both composite scores). The composite scores distinguished early- and late-stage patients with AUROCs of 1.000 and 0.970 for the early- and late-stage biomarker sets, respectively. Sensitivity and specificity reached 100% and 100% for the early-stage biomarker set, and 100% and 90.9% for the late-stage biomarker set (Figures 5b and c). These results emphasize the value of integrating multiple chromatin biomarkers into a single composite score for sensitively distinguishing stage-specific disease severity based on CD8⁺ T-cell chromatin profiles.
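For reference, an AUROC of this kind can be computed directly from two groups of scores as the probability that a randomly chosen score from one group exceeds one from the other, with ties counted as one half. The sketch below shows that general rank-based definition; it is not the software routine used in the study, and the function name is hypothetical.

```python
def auroc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of two groups.

    pos_scores: composite scores of the group expected to score higher
    neg_scores: composite scores of the other group
    """
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))
```

Perfect separation of the two groups gives a value of 1.0, and complete overlap gives a value near 0.5.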

Fig. 5

Prediction of chromatin openness using composite biomarker. a Schematic of weighted openness score calculation. For each biomarker, normalized values above or below the cut-off were scored as 1 or 0, respectively. Weighted sums were calculated across markers. b, c Openness scores for early-stage (blue) and late-stage (red) groups are shown as dot plots (left), with mean ± SEM for each group (early: n = 11, late: n = 6). Statistical significance was determined by Student’s t-test. ROC curves (middle) and waterfall plots (right) visualize predictive performance and individual openness scores.

Exploratory ATAC-qPCR assessment of ATAC-Seq derived biomarkers

Given the practical challenges of implementing ATAC-Seq in clinical settings—including cost, time, and technical complexity—we developed ATAC-qPCR as a streamlined and feasible alternative for quantifying chromatin accessibility at selected biomarker loci. This assay targets amplicons centered at the summits of ATAC-Seq peaks, which are regions enriched in accessible chromatin, using primers designed to span these open chromatin regions (Figure 6a). Chromatin accessibility was quantified using the ΔCt method, with normalization to the geometric mean of three reference loci.

Fig. 6

Practical quantification of chromatin openness by ATAC-qPCR. a Schematic of the ATAC-qPCR workflow. Amplicons centered on ATAC-peak summits (gold) were designed for qPCR quantification of chromatin openness (sky blue). Forward and reverse primers span accessible regions. Normalized chromatin openness was quantified using the 2^(−Ct) method, normalized to the geometric mean of three control loci. Schematic created with BioRender.com. b Scatter plot showing correlation (Pearson’s correlation coefficient, R²) between ATAC-peak area values and normalized chromatin openness for early (blue) and late (red) biomarkers. Linear regression (green) with 95% confidence interval (dashed lines) visualizes the correlation strength.

The ATAC-qPCR results showed a significant correlation with ATAC-Seq peak intensities, as demonstrated by the relationship between ΔCt values and ATAC-Seq peak areas for both early- and late-stage biomarker groups (R² = 0.246–0.389) (Figure 6b). Notably, when assessed for clinical utility in discriminating disease stages, several early-stage biomarkers (n = 5) demonstrated good predictive accuracy (AUROC = 0.772–0.893), while late-stage biomarkers (n = 4) also exhibited high performance (AUROC = 0.848–0.924) (Supplementary Figures S4 and S5)33. These findings highlight the potential of ATAC-qPCR as a clinically translatable method for targeted assessment of chromatin accessibility at specific loci, offering a practical tool for evaluating the disease status of IgAN.

Discussion

In this study, we comprehensively profiled the chromatin accessibility landscapes of circulating CD8⁺ T cells in IgAN using ATAC-Seq and identified distinct epigenetic signatures associated with disease severity. Through comparative analysis of the early- and late-stage groups, we identified 279 DARs, which were further characterized by distinct patterns of chromatin openness, immune cell subset composition, and transcription factor motif enrichment. These stage-specific chromatin accessibility patterns were associated with disease severity and may serve as discriminatory tools for stratifying disease status.

Circulating CD8⁺ T cells play diverse pathogenic roles across autoimmune diseases through cytotoxic activity and aberrant differentiation34, as seen in Graves’ disease35, aplastic anemia, and systemic lupus erythematosus36,37. This functional heterogeneity is likely governed by disease-specific transcriptional and epigenetic regulation. In IgAN, our chromatin accessibility profiling of circulating CD8⁺ T cells revealed a phenotypic shift from naïve subsets in early-stage disease to terminally differentiated EMRA cells in late-stage disease. This phenotypic evolution is consistent with previous studies demonstrating that increased activated CD8⁺ T cell populations correlate with renal function deterioration12 and that enhanced tubulointerstitial CD8⁺ T cell infiltration is associated with disease progression13. The epigenetic profiles of circulating CD8⁺ T cells provide mechanistic insight into their transition from active effectors in early disease to dysfunctional or exhausted states in advanced IgAN. These dynamic changes support their potential as biomarkers for capturing immune state transitions and stratifying disease severity.

The early-stage group exhibited a greater number of DARs, predominantly located in promoters and gene bodies, suggesting heightened transcriptional activity and functional regulatory potential. Deconvolution analysis revealed a transition in subpopulations, from naïve cells in the early-stage group to EMRA cells in the late-stage group, indicative of immune quiescence or exhaustion. These cellular shifts were accompanied by distinct transcription factor motif enrichments: the early-stage group showed elevated activity of ETS1 and LEF1, regulators of memory formation, apoptosis, and proliferation38, whereas the late-stage group was enriched for EOMES and TBX21 motifs, which are associated with terminal differentiation and chronic activation. Correspondingly, ETS1 and LEF1 targeted genes involved in cytotoxic function and differentiation (e.g., MYC, GZMB, and PRDM1), while ETS1, RUNX1, and FLI1 co-regulated IL7R and ADGRG1, markers of naïve or cytotoxic potential. In contrast, EOMES, TBX21, IRF4, and GABPA coordinated the expression of effector, exhaustion, and metabolic genes (e.g., PDCD1, IL2RB, GATA3, and TCF7), underscoring the transcriptional reprogramming of CD8⁺ T cells during progression to advanced disease.

Importantly, ETS1 binding motifs were predominantly enriched in open chromatin regions of CD8⁺ T cells from the early-stage group, suggesting a potential regulatory role in T cell activation and differentiation during the early phase of IgAN. ETS1 has previously been identified as a key transcriptional regulator of IL-7Rα expression, which is critical for T cell homeostasis39. ETS1 has also emerged as a significant susceptibility locus in GWAS of IgAN, raising the possibility that ETS1 may contribute to disease pathogenesis through immune-mediated mechanisms beyond its known role in mesangial cells. While the precise cellular targets remain to be clarified, our findings support a potential role for ETS1 in shaping the chromatin landscape of peripheral CD8⁺ T cells. Further studies are warranted to evaluate ETS1 as a biomarker of disease activity and to explore its relevance as a therapeutic target for modulating CD8⁺ T cell-mediated kidney injury.

The practical applicability of our findings was demonstrated by the successful evaluation of selected DARs using ATAC-qPCR, a cost-effective and clinically feasible method. Several biomarkers showed strong discriminatory performance, and a composite scoring approach further enhanced classification accuracy. These results highlight the translational potential of integrating chromatin accessibility biomarkers into clinical risk assessment for IgAN. Moreover, our study underscores the broader utility of peripheral immune cell epigenetic profiling as a non-invasive tool for identifying patients at higher risk of progression to CKD. By distinguishing early-stage patients with progressive potential from those with stable disease, this approach may enable earlier, more targeted interventions to prevent irreversible kidney damage. Furthermore, the consistent identification of DARs with edgeR’s generalized linear modeling framework provides additional support for the robustness and reproducibility of our chromatin accessibility findings.

Several limitations should be considered. First, peripheral blood samples were collected at varying time points after kidney biopsy, which may introduce temporal variability in chromatin accessibility profiles and limit the ability to directly correlate pathological findings with epigenetic signatures. Second, the study cohort was composed of Korean patients, potentially limiting the generalizability of the findings to other ethnic populations. Third, although our study provides comprehensive chromatin accessibility data, it lacks matched transcriptomic information that could further validate the functional consequences of the observed epigenetic alterations. Integrating multi-omic data, including RNA sequencing and single-cell ATAC-seq, in future studies will be essential to delineate the transcriptional programs downstream of key regulatory loci and to confirm subset-specific chromatin dynamics at higher resolution. Finally, our analysis was restricted to circulating CD8⁺ T cells and therefore may not fully capture the immune cell interactions and heterogeneity within the renal microenvironment. Despite these limitations, our study provides valuable insights into the systemic immune alterations associated with disease severity in IgAN.

In conclusion, this study reveals distinct chromatin accessibility landscapes in circulating CD8⁺ T cells that are associated with different stages of IgAN and points to the potential involvement of CD8⁺ T cells in the pathogenesis of IgAN. These findings highlight the potential of chromatin accessibility biomarkers for non-invasive disease staging and personalized management. Future investigations of the functional implications of these epigenetic alterations and of their potential as therapeutic targets may pave the way for more tailored treatment strategies in IgAN.