Extended Data Fig. 1: Colon cancer associates with epigenetically drifted cells.

a, Correlation heatmap of the gene ontology categories obtained for the hypermethylated genes or the genes undergoing the Aging- and Colon Cancer-Associated epigenetic drift (the ACCA drift). b, Gene ontology analysis of the statistically correlated genes identified in Fig. 1c. Canonical pathway enrichment was performed using Ingenuity Pathway Analysis (Qiagen). Statistical significance of pathway enrichment was assessed using a right-tailed Fisher’s exact test. Reported values represent –log₁₀ of the Benjamini–Hochberg-adjusted p-value for each pathway. c, Box plot indicating the DNAm level of the promoters of the ACCA drift genes in healthy colon samples and in colon cancer samples subdivided into CIMP-negative, CIMP-low and CIMP-positive cancers. Number of samples: healthy (9–15 y) = 19; healthy (40–60 y) = 11; healthy (>60 y) = 34; (CIMP−) = 30; (CIMPlow) = 16; (CIMP+) = 9. Box plots represent the median (center line), the 25th and 75th percentiles (bounds of the box), and whiskers extending to the most extreme data points within 1.5× the interquartile range. d, Box plot indicating the DNAm level of the promoters of the ACCA drift genes in healthy colon samples, colorectal cancer (CRC) samples and patient-derived xenografts (PDX) derived from both primary CRCs and CRC liver metastases. Number of samples: 50 healthy, 50 tumor, 76 PDX. Box plots represent the interquartile range with min to max whiskers. e, Correlation heatmap of the gene ontology categories obtained for the correlated genes identified by the co-expression analysis in Fig. 1g. f, Dot plot indicating the DNAm level of the promoters of the DKK and SFRP family genes in healthy human colon samples at the indicated ages. Data are presented as mean values ± SEM (samples as in (a)). P-value was calculated using a one-tailed Welch’s t-test. Number of samples: (9–15 y) = 19; (40–60 y) = 11; (>60 y) = 34.
g, Box plot indicating normalized expression counts of the DKK and SFRP family genes in healthy human colon samples. For each gene, expression is shown separately for each age group. Box plots represent the interquartile range with 5th–95th percentile whiskers. P-values were computed using two-sided Wilcoxon rank-sum tests between (40–60 y) and (>60 y) samples. Number of samples: (40–60 y) = 12; (>60 y) = 39. h, Box plot indicating normalized expression counts of the Dkk and Sfrp family genes in mouse intestinal crypts. For each gene, expression is shown separately for each age group and sex. n = 4 mice per group were analyzed. Box plots represent the interquartile range with 5th–95th percentile whiskers. P-values were computed using two-sided Wilcoxon rank-sum tests between young and old or young and geriatric samples, respectively. i, Box plot indicating the DNAm level of the promoters of the DKK2 and SFRP1 genes in healthy and colon cancer samples at the indicated ages (samples as in (a)). Number of samples: healthy (9–15 y) = 19; healthy (40–60 y) = 11; healthy (>60 y) = 34; cancer (40–60 y) = 17; cancer (>60 y) = 37. Box plots represent the interquartile range with min to max whiskers.
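
For readers unfamiliar with the enrichment score reported in panel b, the following is a minimal sketch of how a right-tailed Fisher's exact test, Benjamini–Hochberg adjustment and –log₁₀ transformation can be combined. It is not the Ingenuity Pathway Analysis implementation, which is proprietary. The function names, pathway names and all gene counts below are hypothetical and chosen only for illustration.

```python
import math


def right_tailed_fisher(k, n, K, N):
    """Right-tailed Fisher's exact test p-value.

    k: genes in the query list that belong to the pathway
    n: size of the query list
    K: genes in the background that belong to the pathway
    N: size of the background
    Returns P(X >= k) under the hypergeometric null.
    """
    upper = min(n, K)
    total = math.comb(N, n)
    tail = sum(
        math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 100 query genes against a background of 20,000 genes.
# Each entry is (overlap with query list, pathway size in background).
pathways = {
    "pathway_A": (12, 150),
    "pathway_B": (3, 200),
    "pathway_C": (1, 90),
}
names = list(pathways)
raw = [right_tailed_fisher(k, 100, K, 20000) for k, K in pathways.values()]
adj = benjamini_hochberg(raw)
for name, p in zip(names, adj):
    print(name, -math.log10(p))
```

Larger printed values correspond to stronger enrichment, matching the orientation of the scores described for panel b.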