Introduction

Deoxyribonucleic acid (DNA) methylation (DNAm) involves the addition of a methyl group to the cytosine base of cytosine-phosphate-guanine (CpG) sites in DNA and plays a crucial role in gene regulation1,2. Global DNA hypomethylation and local CpG island hypermethylation patterns are associated with ageing and replicative senescence3,4. DNAm clocks are algorithms developed to estimate DNAm age, an estimate of biological age, based on DNAm levels at sites distributed across the genome. Biological age refers to an individual’s age defined by the level of age-dependent biological changes, such as molecular and cellular damage accumulation5. DNAm clocks were developed using DNAm levels measured with the Illumina HumanMethylation27K (HM27K), Infinium HumanMethylation450K (HM450K) and Infinium MethylationEPICv1 (EPICv1) BeadChip arrays. Human Infinium BeadChip assays have progressed through several generations, beginning with HM27K, which focused on analysing promoter CpG methylation6. HM450K, its successor, expanded the number of CpG sites to include gene body CpG methylation, highlighting the role of gene body methylation in regulating gene expression6. EPICv1 further broadened the scope by focusing significantly on cis-regulatory elements, acknowledging the importance of tissue-specific methylation signatures in these enhancers6. EPICv1 has since been extensively utilised to profile DNA cytosine modifications owing to its cost-effectiveness, precise quantification, and user-friendly data analysis, which are particularly beneficial for large cohort studies6.

The recently launched Illumina InfiniumMethylationEPICv2 BeadChip (EPICv2) introduces several significant advancements over its predecessor, EPICv12,6, including a lower DNA input requirement and improved coverage of enhancers, super-enhancers and CTCF binding regions. EPICv2 retains 83% of the CpG sites from EPICv1 and 81% from HM450K6. The removal and addition of CpG sites and the deviation in signal background in EPICv2 may bias DNAm age estimation, because some clock CpG sites are absent and measured methylation levels may shift6. In addition, methylation levels differ between biological samples, leading to different DNAm age predictions7,8,9,10. For instance, DNAmGrimAge2 differs between blood and saliva samples from mothers (aged 39-42 years) who were free of major medical conditions and disabilities11. Previous studies examining DNAm age estimation across EPICv2 and EPICv1 were limited to cell line models6 and whole blood samples from 5-year-old children12. These sample choices constrain interpretation, as cell lines may not reflect DNAm patterns of human biological samples, and paediatric samples deviate from the adult blood samples typically used to train DNAm clocks12. Moreover, these studies evaluated only a subset of available DNAm clocks, notably omitting fourth-generation clocks that offer biological interpretability, such as YingCausAge, YingAdaptAge, and YingDamAge6,12.

The first aim of this study is to analyse the number of missing CpG sites in eleven human DNAm clocks across EPICv1 and EPICv2 arrays and to assess the enrichment of these missing CpG sites within specific genomic features and CpG contexts. The second aim is to evaluate the agreement of DNAm age estimations using the EPICv2 array compared to the EPICv1 array utilising buffy coat, peripheral blood mononuclear cells (PBMC) and saliva obtained from healthy middle-aged individuals.

Results

Participant characteristics

Participant characteristics are given in Table 1. The median age of the participants was 48 [43.5; 53.8] years, with 9 of 16 participants being female and 14 of 16 participants being Chinese. Thirteen of 16 participants self-reported as non-smokers. Most participants (11/16) self-reported alcohol consumption of less than once per week. The median body mass index (BMI), waist-to-hip ratio, brachial systolic pressure and brachial diastolic pressure of the participants were 22.2 [21.1; 23.3] kg/m², 0.79 [0.76; 0.82], 115 [109; 126] mmHg and 74.8 [69.4; 79.3] mmHg, respectively.

Table 1 Participant demographic characteristics (N = 16)

DNAm clock CpG coverage in EPICv1 and EPICv2 arrays

The number of CpG sites included in the 11 DNAm clocks and their coverage on EPICv1 and EPICv2 are shown in Table 2. The number of CpG sites utilised in the 11 DNAm clocks ranges between 71 and 1331. EPICv1 includes all CpG sites for most DNAm clocks except Horvath1, Hannum, DNAmFitAge and YingCausAge, with 5.38%, 8.45%, 1.44% and 28.2% missing CpG sites, respectively. Fewer than 10% of the CpG sites are missing in EPICv2 for most DNAm clocks, including Horvath1, Hannum, PhenoAge, Horvath2, YingAdaptAge and YingDamAge. However, for GrimAge, DNAmTL, DNAmFitAge, DunedinPACE and YingCausAge, 15.6% to 24.6% of the CpG sites are missing in EPICv2. The CpG loci identifiers of the missing CpG sites of each DNAm clock in EPICv1 and EPICv2 are given in Table S1A and Table S1B, respectively. For genomic features, the missing CpG sites of most DNAm clocks are enriched in the gene body and the intergenic region (Table S2). For CpG context, the missing CpG sites of most DNAm clocks are enriched in CpG islands and open sea (Table S3).

Table 2 Number of CpG sites of DNAm clocks and CpG sites not retained in EPICv1 and EPICv2

DNAm age correlation in EPICv1 and EPICv2 with the same biological sample types

Spearman correlations between DNAm ages assessed using the EPICv1 and EPICv2 arrays were consistently high across all biological samples for all PC clocks, YingCausAge and DNAmFitAge (ρ > 0.844) (Fig. 1). PC-based DNAm clocks showed higher correlation between EPICv1 and EPICv2 than non-PC-based DNAm clocks. YingDamAge and YingAdaptAge exhibited high Spearman correlations between EPICv1 and EPICv2 in buffy coat and PBMC samples (ρ > 0.815) but only moderate correlations in saliva samples (ρ = 0.629 and ρ = 0.553, respectively). DunedinPACE had high correlations between EPICv1 and EPICv2 in buffy coat and saliva samples (ρ > 0.774), and a moderate correlation in PBMC samples (ρ = 0.512).

Fig. 1: Heatmap of Spearman correlation between DNAm ages in EPICv1 and DNAm ages in EPICv2.

Each cell corresponds to correlation coefficient (ρ), coloured in blue or red for low Spearman correlation (ρ < 0.7) and high Spearman correlation (ρ > 0.7) respectively.

Agreement and difference of DNAm clocks between EPICv1 and EPICv2 across different biological sample types

The agreement of DNAm ages between EPICv1 and EPICv2 was evaluated using Bland-Altman plots, as shown in Fig. 2. The mean difference (SD), limits of agreement and mean absolute percentage error (MAPE) are summarised in Table S4. Across all biological samples, mean differences were positive for PCHorvath1 (range: 1.03 to 4.03 years), PCHorvath2 (range: 0.66 to 2.77 years), PCHannum (range: 0.27 to 1.20 years), PCDNAmTL (range: 0.002 to 0.009 kb), and DunedinPACE (range: 0.045 to 0.094). Mean differences were negative for PCPhenoAge (range: -1.96 to -0.82 years), PCGrimAge (range: -0.72 to -0.34 years), DNAmFitAge (range: -4.35 to -2.34 years), YingCausAge (range: -8.17 to -3.48 years), and YingDamAge (range: -6.27 to -3.60 years). For YingAdaptAge, the mean difference was negative in buffy coat (-4.55 years) and saliva (-4.67 years), but positive in PBMC (1.62 years). The smallest absolute mean difference between EPICv1 and EPICv2 was observed for PCHannum in buffy coat (mean difference ± SD: 0.27 ± 0.80 years) and the largest was observed for YingCausAge in PBMC (mean difference ± SD: -8.17 ± 3.12 years). MAPE between EPICv1 and EPICv2 was lower in PC-based DNAm clocks (MAPE: buffy coat: 0.118 to 4.28%; PBMC: 0.176 to 3.57%; saliva: 0.278 to 8.98%) than in non-PC-based DNAm clocks (MAPE: buffy coat: 8.27 to 15.0%; PBMC: 5.31 to 21.2%; saliva: 6.52 to 14.3%). Across all DNAm clocks, the 95% limits of agreement were consistently wider in saliva samples than in buffy coat and PBMC samples.

Fig. 2: Bland-Altman plots comparing DNAm age estimates from 11 different DNAm clocks between the EPICv1 and EPICv2 arrays, stratified by biological sample (buffy coat, PBMC, and saliva).

For each clock and each of the three biological samples, the black line indicates the mean difference, the red box marks the 95% limits of agreement, and the grey line indicates zero on the y-axis.

Differences in DNAm ages between EPICv1 and EPICv2 across 11 DNAm clocks in buffy coat, PBMC, and saliva were assessed using the Wilcoxon Signed-Rank test (Fig. 3 and Figure S1). Across all biological samples, significant differences in DNAm ages between EPICv1 and EPICv2 were observed for PCHorvath1, PCHorvath2, PCPhenoAge and DNAmFitAge (Fig. 3). In both PBMC and buffy coat, significant differences were observed for DunedinPACE, YingDamAge and YingCausAge. In buffy coat, significant differences were observed for PCGrimAge and YingAdaptAge. No significant difference was observed in any biological sample for PCHannum and PCDNAmTL. The smallest absolute median difference between EPICv1 and EPICv2 was observed in PCHannum (0.306 years) and the largest absolute median difference was observed in YingCausAge (8.43 years). The results of the Wilcoxon Signed-Rank test are summarised in Table 3. Individual-level comparisons of DNAm age estimates between EPICv1 and EPICv2 revealed a consistent increase or decrease of DNAm ages across all biological samples (Figure S1). Cell compositions estimated using DNAm data across all biological samples also showed high Spearman correlation and high agreement between EPICv1 and EPICv2 for most cell types, except for neutrophils and eosinophils in PBMC (Figure S1 and S2). EpiDISH did not estimate eosinophils in buffy coat, or eosinophils, CD8T and B cells in saliva.

Fig. 3: Box plots showing the differences in DNAm ages between EPICv2 and EPICv1 (V2 - V1) across various DNAm clocks and stratified by biological samples (Buffy Coat, PBMC, and Saliva).

Significance was tested using the Wilcoxon Signed-Rank test, where *, ** and *** indicate statistical significance after Bonferroni correction for 11 DNAm clocks, that is, unadjusted p-values compared against 0.05/11, 0.01/11 and 0.001/11. * indicates p-values < 0.00455. ** indicates p-values < 0.000909. *** indicates p-values < 0.0000909. Note: PCDNAmTL has a unit of kb and DunedinPACE has no unit, as it is defined as an individual's pace of ageing relative to the normative rate of one year of ageing per year of calendar time.

Table 3 DNAm age differences between EPICv1 and EPICv2 in buffy coat, PBMC, and saliva assessed using Wilcoxon Signed-Rank test

Discussion

DNAm clocks were highly correlated between EPICv1 and EPICv2, except for DunedinPACE in PBMC samples and YingDamAge and YingAdaptAge in saliva samples. However, statistically significant differences were observed in DNAm age derived from EPICv1 and EPICv2. Saliva samples displayed the highest variability of difference between EPICv1 and EPICv2 for DNAm ages as compared to buffy coat and PBMC samples. Cell compositions estimated using DNAm data across all biological samples also showed high Spearman correlation and high agreement.

The high correlation and high agreement of DNAm ages between EPICv1 and EPICv2 can be attributed to the preservation of most CpG sites from existing DNAm clocks in EPICv2 and the high correlation between the DNAm levels at shared CpG sites of EPICv1 and EPICv2 observed in four human cell lines (GM12878, LNCaP, K562, and HCT116)6. Nevertheless, statistically significant differences in DNAm ages between EPICv1 and EPICv2 were found for most clocks. The consistent direction of within-person DNAm age differences between EPICv1 and EPICv2 indicates that the observed discrepancies are due to systematic differences introduced by the arrays used, rather than random inter-individual variability. There is no distinct enrichment of missing CpG sites within genomic regions and CpG contexts in EPICv2 compared to EPICv1; thus, this systematic offset could be attributed to the greater number of clock CpG sites missing from EPICv2 than from EPICv1. Furthermore, imputation of missing CpG sites, for example with GLMNET, could not fully compensate for this discrepancy and might introduce noise for CpG sites that are completely missing from EPICv213,14,15. Previous studies have found that each clock contains CpG sites with unreliable methylation beta values, which can affect the reproducibility of DNAm ages16. Any cell composition effect should be minimal, as the biological samples used for both arrays are identical and the cell compositions predicted by EpiDISH exhibited a high Spearman correlation and agreement between the arrays. Therefore, the systematic differences observed between EPICv1 and EPICv2 suggest that direct comparisons of DNAm ages derived from different arrays should be interpreted with caution, especially for small clinical trials and population studies with small effect sizes17,18. Researchers and clinicians should account for these cross-array biases8,9, and the development of better imputation or cross-array calibration algorithms would be essential to enhance the reliability of existing DNAm clocks in EPICv219.

PC-based clocks exhibit lower mean differences and smaller variability of differences between EPICv1 and EPICv2 than non-PC-based clocks across biological samples. This is consistent with a previous study, which has shown that principal component analysis (PCA) enhances the reliability and reproducibility of DNAm age estimation20. This is achieved by aggregating information from many age-related CpGs to improve the signal quality and reduce the effect of inherent technical noise associated with individual CpGs20. Future construction of DNAm clocks should consider the incorporation of PCA to enhance the reliability and reproducibility of the DNAm clocks.

Saliva samples displayed the highest variability in DNAm age differences between EPICv1 and EPICv2. This can be attributed to the heterogeneous cell types in saliva, including epithelial cells and leukocytes, potentially leading to greater variations in DNAm ages20,21,22. Blood-based samples, buffy coat and PBMC, tend to yield higher quality DNA with less variability in stability, purity, and quantity compared to saliva, contributing to less variability in DNAm age estimation23,24. However, a saliva-specific cell-type deconvolution algorithm currently exists only for children and not for adults25. Modifying the array probes can impact cell composition estimates, as indicated by the moderate correlations observed in some cell composition predictions. Therefore, cell composition adjustment in saliva and development of cell-type deconvolution algorithms specific to saliva and the EPICv2 array would be essential for saliva to be used in cohort studies or commercial epigenetic tests24.

The Illumina EPICv2 array offers cost effective DNAm profiling for large cohort studies. However, this platform provides limited coverage of the whole methylome (937,690 out of 28 million CpGs, 3.5%)6,26. To address these limitations, whole-genome bisulphite sequencing (WGBS) has emerged as the gold standard for comprehensive methylation analysis, enabling the identification of complete mechanistic processes underlying DNAm levels across 28 million CpGs27. Traditional bulk DNAm measurements, whether from arrays or WGBS, represent the average DNAm levels in populations of cells, potentially diluting the tissue-specific and cell type-specific ageing signatures27. Single-cell WGBS allows high-resolution examination of DNAm patterns, revealing whether age-related DNAm changes are uniform across cells or emerge from population-level variations27. This insight can help explain the variability in DNAm age estimations between tissues27. Therefore, single-cell WGBS would be essential to understand the mechanism of epigenetic regulation in ageing and enable the construction of a DNAm clock with CpGs causally linked to ageing processes with minimal discrepancies in DNAm age estimations between biological samples27.

This study is the first comprehensive analysis of DNAm age difference between EPICv1 and EPICv2 in three commonly used biological samples: buffy coat, PBMC and saliva. The small sample size, the predominantly Chinese ethnicity of the participants and the age range of 40-60 years may limit the generalisability of the findings and the robustness of imputation.

This study underscores the necessity of platform-specific considerations in DNAm age estimation. The observed systematic offsets between EPICv1 and EPICv2, particularly in non-PC-based clocks, and the high variability of differences in saliva samples suggest that future research should incorporate PC analysis in the construction of DNAm clocks and develop a method to adjust for these biases. Small changes or differences observed in DNAm age measurements from EPICv2 should be interpreted cautiously, as they may reflect systematic platform differences rather than meaningful age-related changes, given that these clocks were not originally trained on EPICv2.

Methods

Study design

This study is a cross-sectional analysis nested within the randomised clinical trial Does Alpha-ketoglutarate supplementation lower BiologicaL agE in middle-aged adults (ABLE) study28, which evaluates the effect of Calcium Alpha-Ketoglutarate (Ca-AKG) on biomarkers of ageing in individuals 40-60 years of age. A total of 16 participants from the screening phase were included in this sub-study before randomisation. All participants provided written informed consent. The study has been approved by NUS IRB (NUS-IRB-2021-946) and registered at ClinicalTrials.gov (NCT05706389).

Participant characteristics

Demographic and lifestyle characteristics, including age, gender, race, education level, type of housing, smoking status and alcohol consumption status were collected from participants through participant survey after biological samples collection. Race is classified as Chinese and Caucasian. Education levels are classified as undergraduate and graduate. Housing types are classified as private housing and public housing. Smoking status is defined as non-smoker and ex-smoker. Alcohol consumption is self-reported and detailed in terms of the number of servings and the frequency of intake per week, in the past year. Alcohol consumption is categorised as never, less than once per week, and more than once per week within the past year. Height and weight were measured using a Seca 213 Stadiometer (Germany) and a Seca 813 digital floor scale with high capacity (Germany), respectively. The body mass index (BMI) was subsequently calculated as weight/height² (kg/m²). Waist and hip circumferences were measured using a Seca 201 Ergonomic circumference measuring tape, Germany, and the waist-to-hip ratio was subsequently calculated.

Biological samples collection and preparation

All blood and saliva were collected after a minimum of 8 hours of overnight fasting, in the morning, between 8.30 and 9.30 am28. A total of 8 mL of venous blood was collected, followed by collecting 4 mL of saliva through passive drooling into a saliva collection tube (Isohelix, Kent, UK) pre-filled with a guanidine-free DNA stabilisation buffer. Whole blood and saliva samples were transported at a temperature of 4 °C and processed within 1 hour after the sample collection. The saliva collection tube was transferred and stored in a -80 °C freezer.

The Anticoagulant Citrate Dextrose Solution, Solution A (ACD-A) blood tubes were gently mixed by inversion after collection, and then centrifuged at 2500 rpm for 15 min at 23 °C with brake off (acceleration: 9, deceleration: 0). After aliquoting the top layer of plasma, the buffy coat was drawn and stored in a -80 °C freezer. Whole blood was diluted with phosphate-buffered saline (PBS), layered over 15 mL of Ficoll media, and centrifuged at 400 g for 30 min at 23 °C with brake off (acceleration: 9, deceleration: 0). The PBMC layer was extracted and washed twice with PBS. The cell pellet was thoroughly resuspended after each wash. The PBMC were counted using the Countess II FL Automated Cell Counter and stored in a -80 °C freezer. The biological samples stored in the -80 °C freezer were delivered on dry ice for DNA extraction.

DNA from biological samples was extracted using the Qiagen QIAamp DNA Mini Kits and further processed in a single batch to minimise the batch effect. The quality of DNA samples was evaluated using the Agilent Tape Station (Agilent, Santa Clara, CA). DNA from study participants was quantified using the Qubit™ fluorometer (Thermo Fisher Scientific, Waltham, MA). The EPICv1 and EPICv2 were used for high-throughput measurement of DNA methylation on a genome-wide scale. For each methylation chip experiment, approximately 1 μg of genomic DNA from each study participant and from each biological specimen was bisulphite treated using the Zymo EZ DNA Methylation Kit (Zymo, Irvine, CA) to convert non-methylated cytosine nucleotides to uracil for subsequent methylation profiling. The bisulphite-treated DNA was denatured into single strands and hybridised to the Infinium EPIC BeadChip via allele-specific annealing to either the methylation-specific probe or the non-methylation probe. Hybridisation to the chip was followed by single-base extension with labelled di-deoxynucleotides. Hybridised BeadChips were subsequently stained, washed, and scanned to determine intensities of methylated and unmethylated bead types using Illumina’s iScan system.

DNAm analysis

Identical DNA samples from buffy coat, PBMC and saliva were processed in parallel on the EPICv1 and EPICv2 BeadChips (Illumina Inc., San Diego, CA). Each sample was prepared from a single DNA extraction and analysed on both array versions to ensure direct comparability of DNAm measurements. DNAm age was predicted using methylation data from EPICv1 and EPICv2 BeadChips across different biological samples (buffy coat, PBMC, and saliva). Noob normalisation was performed to correct for biases associated with type I and type II CpG sites29. Eleven human DNAm clocks were calculated: 6 principal component-based DNAm clocks (PC clocks), consisting of Horvath1 DNAm age (PCHorvath1)30, Horvath2 DNAm age (PCHorvath2)31, Hannum DNAm age (PCHannum)32, DNAm PhenoAge (PCPhenoAge)33, DNAm GrimAge (PCGrimAge)34, and DNAmTL (PCDNAmTL) (an estimator for telomere length in kilobases)35, and 5 non-PC clocks, namely DunedinPACE (pace of ageing)36, DNAmFitAge37, YingCausAge38, YingAdaptAge38 and YingDamAge38. The PC clocks were calculated in accordance with established protocols20. The units of measurement for the DNAm clocks utilised in this study are years, with the exception of two clocks: PCDNAmTL (unit in kilobase, kb)35 and DunedinPACE (no unit). DunedinPACE is defined as an individual's pace of ageing relative to the normative rate of one year of ageing per year of calendar time36. Cell compositions were predicted using EpiDISH for all biological samples including saliva39. Because EpiDISH was not specifically trained on saliva data, its estimates for saliva serve only as approximations. A regularised generalised linear regression model (GLMNET) was used to impute missing CpG sites used in the calculation of DNAm clocks and cell compositions in EPICv240. This imputation approach leverages the substantial overlap of CpG sites between EPICv1 and EPICv2 arrays, using these shared sites as predictors to estimate methylation values for missing sites. The GLMNET algorithm employs elastic net regularisation, which combines L1 and L2 penalties to handle high-dimensional data while preventing overfitting.
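As an illustration of the elastic net idea described above, the following is a minimal pure-Python sketch of coordinate descent for a single missing CpG site, with shared CpG sites as predictors. It is not the glmnet package and not the study's pipeline: the function names, the toy beta values, and the choices of `lam`, `alpha` and `n_iter` are all assumptions made for illustration.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def fit_elastic_net(X, y, lam=0.01, alpha=0.5, n_iter=200):
    """Coordinate descent for the objective
    (1 / 2n) * ||y - b0 - X b||^2 + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2).
    X is a list of rows (samples x shared CpGs); y holds the target CpG's values.
    Returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]  # centred predictors
    resid = [v - y_mean for v in y]  # residual while every coefficient is zero
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in Xc]
            # covariance of predictor j with the residual that excludes predictor j
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            denom = sum(c * c for c in col) / n + lam * (1 - alpha)
            new_b = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new_b - beta[j]
            if delta:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new_b
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_mean))
    return intercept, beta


def impute(intercept, beta, shared_values):
    """Predict the missing CpG's beta value from the shared CpG values."""
    return intercept + sum(b * x for b, x in zip(beta, shared_values))


# Hypothetical reference data: two shared CpGs (columns) and one target CpG.
reference_shared = [[0.1, 0.9], [0.2, 0.3], [0.4, 0.7], [0.6, 0.2], [0.8, 0.6], [0.9, 0.1]]
reference_target = [0.1 + 0.5 * a + 0.2 * b for a, b in reference_shared]

b0, coefs = fit_elastic_net(reference_shared, reference_target, lam=0.001, alpha=0.5)
print(impute(b0, coefs, [0.5, 0.5]))
```

In this sketch `alpha` mixes the two penalties (1 is pure L1, 0 is pure L2) and `lam` sets the overall strength; in practice both would be chosen by cross-validation on a reference dataset in which the target CpG is measured.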

Assessment of missing CpG sites and enrichment

This study assessed CpG site coverage for 11 human DNAm clocks across the EPICv1 and EPICv2 arrays, quantifying missing CpG sites and evaluating their enrichment within specific genomic regions and CpG contexts using the Bioconductor annotation packages IlluminaHumanMethylation450kanno.ilmn12.hg19, IlluminaHumanMethylationEPICanno.ilm10b4.hg19 and IlluminaHumanMethylationEPICv2anno.20a1.hg38, and clock coefficients from Biolearn41. The CpG loci identifiers, unique alphanumeric codes assigned by Illumina to specific CpG sites in the genome, were compiled for the missing sites. Genomic regions refer to distinct elements within the genome with specific biological functions or structural characteristics. These genomic regions include 1st Exon (first exon), 5’ UTR (5’ Untranslated Region), Body (coding region), IGR (intergenic region), TSS1500 (region within 1500 base pairs upstream of the transcription start site), TSS200 (region within 200 base pairs upstream of the transcription start site) and 3’ UTR (3’ Untranslated Region)42. CpG contexts refer to the occurrence and position of CpG within the genome. These CpG contexts consist of island, opensea, shelf and shore42.
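The coverage assessment described above amounts to a set difference between a clock's CpG list and an array manifest, followed by a tally of the missing sites per annotation category. A minimal Python sketch follows; the CpG identifiers, the annotation dictionary and the function names are hypothetical placeholders, and the sketch only counts sites per category rather than reproducing whatever enrichment statistic the study used.

```python
from collections import Counter


def missing_cpgs(clock_cpgs, array_cpgs):
    """CpG identifiers required by a clock but absent from the array manifest."""
    return sorted(set(clock_cpgs) - set(array_cpgs))


def percent_missing(clock_cpgs, array_cpgs):
    """Share of a clock's CpG sites that the array does not cover, in percent."""
    clock = set(clock_cpgs)
    return 100 * len(clock - set(array_cpgs)) / len(clock)


def tally_by_annotation(cpgs, annotation):
    """Count CpGs per annotation category (e.g. Body, IGR, island, opensea)."""
    return Counter(annotation.get(cpg, "unannotated") for cpg in cpgs)


# Hypothetical toy inputs; real identifiers would come from the clock
# coefficient files and the array annotation packages.
toy_clock = ["cgA", "cgB", "cgC", "cgD"]
toy_array_v2 = ["cgA", "cgC", "cgX"]
toy_region = {"cgA": "TSS200", "cgB": "Body", "cgC": "Body", "cgD": "IGR"}

lost = missing_cpgs(toy_clock, toy_array_v2)
print(lost, percent_missing(toy_clock, toy_array_v2))
print(tally_by_annotation(lost, toy_region))
```

Comparing the tally for the missing sites with the tally for the full clock gives a first impression of whether a category is over-represented among the missing sites.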

Statistics and reproducibility

Descriptive statistics were performed and data are presented as median and interquartile range (IQR) or number and percentage (%). DNAm age predictions from EPICv1 and EPICv2 were compared using Spearman correlation, and agreement between the two arrays was assessed using Bland-Altman plots. A Spearman correlation coefficient (ρ) greater than 0.7 was defined as good agreement, a ρ value between 0.5 and 0.7 as moderate agreement, and a ρ value below 0.5 as poor agreement between the two methods43. The mean difference of EPICv2 and EPICv1 was defined as the average of the paired differences EPICv2 - EPICv1. The standard deviation of the difference was calculated as well. The mean difference in Bland-Altman plots was interpreted as DNAm ages in years for all DNAm clocks except PCDNAmTL and DunedinPACE. For PCDNAmTL, the mean difference was interpreted as telomere length in kilobases (kb). For the DunedinPACE clock, the mean difference was interpreted as the mean difference in the pace of ageing. To standardise the difference across various DNAm clocks, the mean absolute percentage error (MAPE) was calculated44. Box plots were used to visualise the difference of DNAm ages between EPICv1 and EPICv2, and significant differences were tested using the Wilcoxon Signed-Rank test. Bonferroni multiple testing correction was applied by the number of DNAm clocks (n = 11). All statistical analyses were performed using R (version 4.4.1) or Python (3.12.0). All tests are two-sided with significance at p < 0.05, unless otherwise stated. Each comparison group consisted of 16 samples. Specifically, 16 buffy coat, 16 saliva, and 16 PBMC samples were analysed on both EPICv1 and EPICv2. All datasets used in this study are provided, together with links to the software used.
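The agreement measures above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the study's code: the limits of agreement use the conventional mean ± 1.96 SD, MAPE is computed with EPICv1 as the reference (the text does not specify the denominator), and the paired ages are made-up toy values.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def agreement_label(rho):
    """Cut-offs from the text: > 0.7 good, 0.5 to 0.7 moderate, < 0.5 poor."""
    if rho > 0.7:
        return "good"
    if rho >= 0.5:
        return "moderate"
    return "poor"


def bland_altman(v1, v2):
    """Mean of (v2 - v1), its SD, and the 95% limits of agreement."""
    diffs = [b - a for a, b in zip(v1, v2)]
    md, sd = mean(diffs), stdev(diffs)
    return md, sd, (md - 1.96 * sd, md + 1.96 * sd)


def mape(v1, v2):
    """Mean absolute percentage error with v1 as reference (assumption)."""
    return 100 * mean(abs(b - a) / abs(a) for a, b in zip(v1, v2))


def bonferroni_thresholds(n_tests, alphas=(0.05, 0.01, 0.001)):
    """Per-test p-value thresholds after Bonferroni correction."""
    return [a / n_tests for a in alphas]


# Hypothetical paired DNAm ages (years) for four participants.
epic_v1 = [45.0, 50.0, 55.0, 60.0]
epic_v2 = [46.0, 52.0, 56.0, 63.0]

rho = spearman_rho(epic_v1, epic_v2)
print(rho, agreement_label(rho))
print(bland_altman(epic_v1, epic_v2))
print(mape(epic_v1, epic_v2))
print(bonferroni_thresholds(11))
```

With 11 clocks, the Bonferroni thresholds are 0.05/11, 0.01/11 and 0.001/11, which round to the 0.00455, 0.000909 and 0.0000909 cut-offs quoted in the Fig. 3 legend.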

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.