Extended Data Fig. 3: Functional annotations for CRE genes and gradually changed CRE genes.

a-c. GO analysis for CRE target genes in normal (a), advanced adenoma (b) and cancer (c) tissues. Significant terms in the biological process (BP), molecular function (MF) and cellular component (CC) categories are shown. d. Differentially expressed genes across the three stages were classified into nine patterns (C1-C9). The x-axis indicates the pattern, and the y-axis shows the number of genes in each pattern. e. GO analysis for the target genes of the 7,492 differential CREs. Significant BP, MF and CC terms are shown. f. DNA motifs of TFs obtained from the JASPAR website, and enrichment of each TF motif among gradually changed CREs. Two-sided P values were calculated with a hypergeometric test in HOMER. g. -log₁₀ P values (x-axis) and odds ratios (ORs; y-axis) from enrichment analysis of gradually changed CREs within the binding sites of each TF (N = 60) from the ENCODE database. The dashed blue lines indicate OR = 1 and P = 0.05/60 = 8.33 × 10⁻⁴ (Bonferroni-corrected P value threshold; binding sites for 60 TFs were tested). h. ORs for enrichment of gradually changed CREs among regulatory elements compared with non-CREs. i. KEGG pathway enrichment of target genes of gradually changed CREs. Circle color represents the significance of enrichment, and circle size denotes the number of CRE genes within each pathway. j. GO analysis for target genes of gradually changed CREs. Terms are ranked by P values from a two-sided hypergeometric test. k. The proportion of target genes of gradually changed CREs that were associated with immune cell infiltration, as estimated by EPIC in TIMER. l. Representative correlations between the expression of target genes of gradually changed CREs and immune cell infiltration, assessed by two-tailed Pearson correlation. Circle color represents the strength of the correlation and circle size represents the significance.
P values were calculated by two-tailed Fisher’s exact test (g, h). Each dot represents the OR and bars indicate 95% CIs (h).
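
The statistics reported for panels g and h can be illustrated with a minimal sketch in Python, using only the standard library. It computes an odds ratio, a two-tailed Fisher's exact P value and the Bonferroni threshold for 60 tests. The 2×2 counts are hypothetical and chosen only for illustration, and the Woolf (log-normal) interval is an assumption, since the legend does not state how the 95% CIs were derived. This is not the software used to produce the figure.

```python
from math import comb, exp, log, sqrt


def fisher_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def pmf(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    p_obs = pmf(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(pmf, range(lowest, highest + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf 95% CI (assumed method; requires non-zero cells)."""
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


# Hypothetical counts: rows are gradually changed CREs vs. other regions,
# columns are inside vs. outside the binding sites of one TF.
a, b, c, d = 40, 60, 10, 90

n_tests = 60                   # number of TFs tested in panel g
threshold = 0.05 / n_tests     # Bonferroni-corrected threshold, about 8.33e-4

odds_ratio, ci_low, ci_high = odds_ratio_ci(a, b, c, d)
p_value = fisher_two_sided(a, b, c, d)
significant = p_value < threshold
```

A TF would fall beyond both dashed lines in panel g when `odds_ratio` exceeds 1 and `p_value` is below `threshold`.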