Introduction

Aberrant DNA methylation1, chromosomal instability2, and microsatellite instability (MSI)3 are the major carcinogenic mechanisms underlying colorectal cancer (CRC). There are several classification methods for CRC DNA methylation status, including the CpG island methylator phenotype (CIMP)1 and genome-wide DNA methylation status (GWMS)4. CIMP is observed in approximately 20% of CRCs5,6 and has been identified as a prognostic factor for stage II–III CRC7,8,9,10. On the other hand, we have previously shown that GWMS, based on unsupervised hierarchical clustering of genome-wide DNA methylation probes, classifies metastatic CRC (mCRC) into high-methylated colorectal cancer (HMCC) and low-methylated colorectal cancer (LMCC). Classification with GWMS seems to identify a broader range of clinically distinct groups of mCRC patients as HMCC compared to CIMP1,4. Among RAS/BRAF wild-type (wt) mCRCs, HMCC has a worse prognosis than LMCC4,11,12,13 and exhibits resistance to anti-epidermal growth factor receptor (EGFR) antibodies in several reports4,11,13. However, it remains unclear whether HMCC is associated with shorter overall survival (OS) regardless of the coexistence of major drivers of CRC, such as RAS or BRAF mutations, and which biological pathways underlie the poor prognosis of HMCC.

To answer these questions, we utilized data obtained in the TRICOLORE study, an open-label, multicenter, randomized phase III study that examined the non-inferiority of bevacizumab + irinotecan (IRI) + S-1 combination therapy to bevacizumab + oxaliplatin (OX)-based combination therapy (mFOLFOX6 or CapeOX) in terms of progression-free survival (PFS) in patients with previously untreated mCRC. The non-inferiority of the former has been statistically confirmed in both initial and extended analyses14,15. Using data obtained from this study, we previously reported that the GWMS was not a treatment response predictor for either OX- or IRI-based combination therapy16 and that HMCC was associated with worse survival than LMCC in mCRC12.

The objective of this study is to determine whether the prognosis of mCRC across RAS/BRAF genotypes is influenced by GWMS and to elucidate the mechanism at the gene expression level.

Materials and methods

Patients and materials

The TRICOLORE study was conducted in accordance with the Declaration of Helsinki and the Japanese Guidelines for the Ethics of Clinical Research. The study was approved by the institutional review board of each participating facility16 and registered in UMIN-CTR (http://www.umin.ac.jp/ctr/) (UMIN000007834). Informed consent was obtained from all the study participants. Only patients who provided informed consent for the translational research (TR) before treatment initiation were included. The formalin-fixed, paraffin-embedded (FFPE) tissues were collected from surgical or biopsy specimens of primary CRC.

Gene mutational analysis

In this study, BRAF mutation refers exclusively to the BRAF V600E mutation. Mutational analyses of KRAS (codons 12, 13, 59, and 61), NRAS (codons 12, 13, 59, and 61), BRAF (codon 600), PIK3CA (exons 9 and 20), and AKT1 (codon 17) were conducted via direct DNA sequencing as described previously17,18. Genomic DNA was extracted from FFPE tissue slides or sections using the QIAamp DNA FFPE tissue kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. DNA sequences were analyzed using the automated CEQ2000XL DNA analysis system (Beckman Coulter, Fullerton, CA, USA).

Immunohistochemistry

Immunohistochemistry (IHC) for mismatch repair (MMR) proteins (MLH1, MSH2, MSH6, and PMS2) and phosphatase and tensin homolog (PTEN) was performed at the Department of Pathology, Tohoku University Hospital. Details of the IHC were described in a previous report16. The following antibodies were used for IHC: an anti-human mouse MLH-1 clone G168-15 (1/100; BD Biosciences, San Jose, CA, USA), an anti-MSH2 mouse mAb clone FE11 (1/200; Sigma-Aldrich, St. Louis, MO, USA), an anti-MSH6 mouse clone 44 (1/1000; BD Biosciences), an anti-PMS2 mouse clone 16-4 (1/100; BD Biosciences), and a PTEN clone 6H2.1 (1/100; DAKO, Carpinteria, CA, USA). Positivity for MMR-related proteins and PTEN was classified according to nuclear and cytoplasmic staining, respectively.

Comprehensive gene expression analysis

Comprehensive gene expression analysis was conducted using the Whole Human Genome 4 × 44K Microarray Kit (Agilent, Santa Clara, CA, USA) as previously described16,19. We collected tumor cells by macrodissection and performed comprehensive gene-expression analysis with GeneSpringGX ver. 14.5 (Agilent). In the advanced CRC subtype, tumors were classified into A1, A2, B1, and B2 using unsupervised hierarchical clustering based on the Pearson uncentered and complete linkage clustering algorithm19. The CMS classification was performed by the gene expression profile using the single-sample predictor installed in the R package “CMS classifier”20. The similarity of the expression profile to the four subtypes (CMS1 to CMS4) was calculated in each case as “nearest CMS”21.

Genome-wide DNA methylation analysis

Genome-wide DNA methylation analysis was conducted using the Infinium MethylationEPIC BeadChip (Illumina, San Diego, CA, USA) as previously described4. The BeadChip was scanned using the iScan, and the DNA methylation β-value was calculated as intensity of methylated probe / (intensity of methylated probe + intensity of unmethylated probe). After excluding probes for the X and Y chromosomes, those with ≥ 0.25 SD of the β-value in all CRC samples were selected for further analyses. Based on hierarchical clustering of the genome-wide DNA methylation profile, the tumors were classified into HMCC and LMCC. The resulting heatmap and dendrogram have been previously described12. Tumors with three or more methylation-positive regions among the five promoter regions, namely, CACNA1G, IGF2, NEUROG1, RUNX3, and SOCS1, were considered to be CIMP-positive1.
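As a minimal sketch of the preprocessing and the CIMP rule described above, the following Python snippet computes β-values from methylated and unmethylated intensities, keeps probes whose β-value SD across samples is at least 0.25, and calls CIMP from the five marker promoters. The function names, the toy values, and the choice of the sample SD (n - 1 denominator) are assumptions made for illustration; the hierarchical clustering step that defines HMCC and LMCC is not reproduced here.

```python
from statistics import stdev

CIMP_MARKERS = ("CACNA1G", "IGF2", "NEUROG1", "RUNX3", "SOCS1")


def beta_value(methylated, unmethylated):
    """Beta = M / (M + U), as written in the text (no offset term assumed)."""
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("both intensities are zero")
    return methylated / total


def variable_probes(betas_by_probe, min_sd=0.25):
    """Return probe IDs whose beta-value SD across samples is >= min_sd.

    Uses the sample SD; which SD definition the study used is an assumption.
    """
    return [probe for probe, betas in betas_by_probe.items()
            if len(betas) > 1 and stdev(betas) >= min_sd]


def is_cimp_positive(methylated_markers):
    """CIMP-positive if three or more of the five marker promoters are methylated."""
    hits = sum(1 for marker in CIMP_MARKERS if marker in methylated_markers)
    return hits >= 3


# Toy data: hypothetical probe IDs with beta-values for four samples.
toy = {
    "probe_a": [0.05, 0.90, 0.10, 0.85],  # highly variable across samples
    "probe_b": [0.50, 0.52, 0.49, 0.51],  # nearly constant
}
selected = variable_probes(toy)
```

In this toy input only the variable probe passes the SD filter, which mirrors the intent of restricting clustering to probes that differ between tumors.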

Gene set enrichment analysis

Gene set enrichment analysis (GSEA)22 was conducted using the GSEA software (a joint project of UC San Diego, San Diego, CA, and Broad Institute, Cambridge, MA; https://www.gsea-msigdb.org/gsea/index.jsp, RRID: SCR_003199). Permutation was performed 1,000 times for each phenotype. The C2.cgp gene set (c2.cgp.v2023.1.Hs.symbols.gmt) was obtained from MSigDB23, and the other gene sets were obtained from each report24,25,26. The chip platform was set to “Human_AGILENT_Array_MSigDB.v2023.1.Hs.chip,” which is contained in the GSEA software. A nominal P-value < 0.05 and a false discovery rate (FDR) Q-value < 0.25 were considered significant.
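The significance rule stated above can be sketched as a simple filter over GSEA output rows. This is an illustrative sketch only: the `GseaResult` record, its field names, and the example gene set names and scores are hypothetical and do not correspond to the GSEA software's own output format or to results from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GseaResult:
    gene_set: str     # gene set name (hypothetical in the example below)
    nes: float        # normalized enrichment score
    nominal_p: float  # nominal P-value
    fdr_q: float      # FDR Q-value


def is_significant(result, p_cutoff=0.05, q_cutoff=0.25):
    """Apply both thresholds from the text, using strict inequalities."""
    return result.nominal_p < p_cutoff and result.fdr_q < q_cutoff


results = [
    GseaResult("SET_A", 2.1, 0.001, 0.02),
    GseaResult("SET_B", 1.4, 0.04, 0.30),   # fails the FDR cutoff
    GseaResult("SET_C", -1.8, 0.06, 0.10),  # fails the nominal P cutoff
]
significant = [r.gene_set for r in results if is_significant(r)]
```

Requiring both conditions means a gene set with a small nominal P-value is still discarded when its FDR Q-value is 0.25 or higher.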

Outcome

Overall survival (OS) was defined as the time from the date of enrollment to the date of death from any cause.

Statistical analysis

The two-sided Fisher exact and Wilcoxon rank-sum (or the Kruskal–Wallis) tests were conducted to evaluate statistical significance in patient background. OS was estimated using the Kaplan–Meier method, and statistical significance was tested using the log-rank test. Hazard ratios and their 95% confidence intervals were calculated using the Cox proportional hazards model. P < 0.05 was considered to indicate statistical significance. All statistical analyses were conducted using JMP Pro, version 16.2.0 (SAS, Cary, NC, USA).
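For readers unfamiliar with the Kaplan–Meier method, the following is a minimal pure-Python sketch of the product-limit estimator and of how a median OS is read from the resulting curve. It is not the JMP Pro implementation used in the study; the function names and the toy follow-up data are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times  : follow-up times (e.g. months from enrollment)
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which the estimated survival is 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Toy data (hypothetical, not from the study).
toy_times = [5, 8, 12, 12, 20, 25, 30]
toy_events = [1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
```

Censored patients leave the risk set without lowering the survival estimate, which is why the median is taken from the step curve rather than from the raw follow-up times.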

Results

Patient characteristics

A total of 487 patients were enrolled in the TRICOLORE study. Patients without consent to participate in the TR (n = 119), without FFPE specimens (n = 33), or without data on the RAS/BRAF mutation status or GWMS (n = 109) were excluded (Fig. 1). Ultimately, 226 patients were included in the molecular analysis. The patients were classified into HMCC and LMCC according to the unsupervised clustering analysis of the data from their tumor samples.

Fig. 1

CONSORT diagram. From the initial cohort (n = 487), 368 (75.6%) provided consent for the translational research. After excluding patients without FFPE specimens (n = 33) and patients without RAS/BRAF mutation status or GWMS (n = 109), 226 patients were included in this study. FFPE, formalin-fixed, paraffin-embedded.

The patients’ tumor genotypes were classified as RAS/BRAF wt (n = 125), RAS mt (n = 87), and BRAF mt (n = 14) (Table 1). The number of patients with HMCC in each genotype was 22 (17.6%), 30 (34.5%), and 14 (100%), respectively. In the RAS/BRAF wt genotype, the HMCC group had significantly higher proportions of MMR-deficient and PTEN-negative tumors than the LMCC group. Furthermore, PIK3CA mt tended to be more common in the HMCC group than in the LMCC group, although the difference was not significant. In the RAS mt genotype, no significant difference in patient background was observed between the HMCC and LMCC groups. All tumors with BRAF mt were classified as HMCC. Regardless of the RAS and BRAF genotypes, CIMP-positive tumors significantly overlapped with HMCC, and most of the former were included in the latter.
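The patient flow and the genotype counts reported above can be checked with simple arithmetic. The snippet below is only a consistency check using the numbers given in the text; the variable names are chosen for illustration.

```python
enrolled = 487
no_tr_consent = 119
no_ffpe = 33
no_genotype_or_gwms = 109

consented = enrolled - no_tr_consent
analyzed = consented - no_ffpe - no_genotype_or_gwms

# (total patients, patients with HMCC) per genotype, as reported in the text.
genotypes = {
    "RAS/BRAF wt": (125, 22),
    "RAS mt": (87, 30),
    "BRAF mt": (14, 14),
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


hmcc_pct = {name: pct(h, n) for name, (n, h) in genotypes.items()}
total_by_genotype = sum(n for n, _ in genotypes.values())
```

The three genotype groups sum to the analyzed population, and the recomputed HMCC percentages agree with the reported 17.6%, 34.5%, and 100%.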

Table 1 Patient characteristics and the results of molecular marker analysis according to the RAS, BRAF, and genome-wide DNA methylation status.

OS of the HMCC was shorter than that of the LMCC in the RAS/BRAF wt group

First, OS was compared in each RAS/BRAF genotype and GWMS (Fig. 2). The OS of the RAS/BRAF wt group was significantly longer than that of the RAS mt group (42.1 months vs. 28.1 months; P = 0.0004) and BRAF mt group (42.1 months vs. 16.7 months; P = 0.0001) (Fig. 2a). The OS of the RAS mt group was numerically longer than that of the BRAF mt group, but it was not statistically significant (P = 0.08). As for GWMS, the OS of the HMCC group was significantly shorter than that of the LMCC group (median, 25.1 months vs. 40.1 months; P = 0.0001) (Fig. 2b).

Fig. 2

Kaplan–Meier survival curves for OS in each RAS/BRAF genotype and GWMS. (A) OS according to the RAS/BRAF genotype; (B) OS according to the GWMS. Abbreviations: OS, overall survival; wt, wild type; GWMS, genome-wide DNA methylation status; LMCC, low-methylated colorectal cancer; HMCC, high-methylated colorectal cancer; CI, confidence interval.

Subsequently, OS was compared between the GWMS groups in each RAS/BRAF genotype (Fig. 3). The OS of the HMCC group was significantly shorter than that of the LMCC group in the RAS/BRAF wt group (median, 25.3 months vs. 45.0 months; P = 0.006) but not in the RAS mt group (median, 25.4 months vs. 29.0 months; P = 0.51) (Fig. 3a, b). Notably, the OS of the RAS/BRAF wt HMCC group was comparable to that of the RAS mt group (median, 25.3 months vs. 28.1 months; P = 0.90) (Fig. 3c).

Fig. 3

Kaplan–Meier survival curves for OS between GWMS groups in each RAS/BRAF genotype. OS according to the GWMS (LMCC or HMCC) in the RAS/BRAF wt group (A), the RAS mt group (B) and the mCRC subtypes (RAS/BRAF wt LMCC, RAS/BRAF wt HMCC, RAS mt mCRC, and BRAF mt mCRC) (C). Abbreviations: OS, overall survival; mt, mutant; wt, wild type; GWMS, genome-wide DNA methylation status; LMCC, low-methylated colorectal cancer; HMCC, high-methylated colorectal cancer; CI, confidence interval; mCRC, metastatic colorectal cancer.

To identify prognostic factors in the RAS/BRAF wt group and the RAS mt group, univariate and multivariate analyses were performed. In the univariate analysis of the RAS/BRAF wt group, sex, CIMP, and GWMS were significantly associated with OS (Fig. 4). Among these, only GWMS remained significant in the multivariate analysis of the RAS/BRAF wt group (Fig. 4, Supplementary Table S1). There were no significant factors in the RAS mt group (Fig. 4, Supplementary Table S2). These results indicate that GWMS affects the prognosis of mCRC in the RAS/BRAF wt group.

Fig. 4

Univariate and multivariate analysis on OS in RAS/BRAF wt or mt patients. Forest plot of the HR on OS from univariate or multivariate analysis in the indicated subgroup. *P < 0.05. Abbreviations: PS, Eastern Cooperative Oncology Group performance status; MMR, mismatch repair; CIMP, CpG island methylator phenotype; GWMS, genome-wide DNA methylation status; OX, mFOLFOX6/CapeOX plus bevacizumab; IRI, S-1 and irinotecan plus bevacizumab; LMCC, low-methylated colorectal cancer; HMCC, high-methylated colorectal cancer; OS, overall survival; HR, hazard ratio; wt, wild type; mt, mutant.

HMCC in the RAS/BRAF wt group exhibits a gene expression pattern related to MSI-high, BRAF V600E, and anti-EGFR antibody resistance

To elucidate the molecular differences between HMCC and LMCC in the RAS/BRAF wt genotype, GSEA was conducted (n = 125) (Table 2). Analysis of the C2.cgp database, which represents the expression signatures of genetic and chemical perturbations, revealed that the gene set upregulated in microsatellite instability-high (MSI-H) CRC compared with microsatellite-stable (MSS) CRC27 was significantly enriched in RAS/BRAF wt HMCC, with the top score among the 3,405 gene sets in the database (Supplementary Table S3). Conversely, the gene set downregulated in MSI-H CRC compared with MSS CRC was significantly enriched in RAS/BRAF wt LMCC (Supplementary Table S4). Furthermore, GSEA using other gene sets altered in MSI-H CRC26 showed that RAS/BRAF wt HMCC and RAS/BRAF wt LMCC were enriched for genes up- and down-regulated in MSI-H tumors, respectively. The analyzed population included patients with mismatch repair-deficient (dMMR) CRC (n = 4). Similar findings were observed even after these patients were excluded (Supplementary Table S5).

Table 2 GSEA results with MSI-H gene sets comparing HMCC (n = 22) vs. LMCC (n = 103) in patients with RAS/BRAF wild-type metastatic colorectal cancer.

Both hypermethylated CRC and MSI-H CRC are correlated with BRAFV600E mutation1,12. Thus, we checked the correlation between HMCC in the RAS/BRAF wt group and CRC with BRAFV600E mutation via GSEA (Table 3). The gene set upregulated in BRAFV600E CRC compared with BRAF wt CRC25 was significantly enriched in RAS/BRAF wt HMCC.

Table 3 GSEA results with gene sets related to BRAF mutation and cetuximab resistance comparing HMCC (n = 22) vs. LMCC (n = 103) in patients with RAS/BRAF wild-type metastatic colorectal cancer.

GWMS is correlated with anti-EGFR antibody sensitivity4; therefore, GSEA was conducted in the RAS/BRAF wt group (n = 125) using gene sets related to cetuximab response24. The gene set upregulated in cetuximab-responding patient-derived xenografts (PDXs) in that report was significantly enriched in RAS/BRAF wt LMCC, whereas the gene set upregulated in nonresponding PDXs exhibited potential enrichment in RAS/BRAF wt HMCC (Table 3). A similar trend was observed in the total cohort, including patients with RAS mt and BRAF mt (n = 226; Supplementary Table S6).

Discussion

We analyzed the prognostic value of GWMS across RAS/BRAF genotypes in a prospective cohort of mCRC and found that GWMS was a prognostic factor only in RAS/BRAF wt mCRC. Notably, multivariate analyses of OS among patients with RAS/BRAF wt mCRC showed that HMCC was a significant prognostic factor, whereas CIMP-positivity was not, suggesting that GWMS captures a clinically aggressive epigenetic phenotype beyond conventional CIMP classification in this population. Moreover, GSEA revealed that the gene expression pattern of RAS/BRAF wt HMCC was associated with MSI-high status, BRAFV600E mutation, and resistance to anti-EGFR antibodies. These findings could partially explain the molecular mechanism of the poor prognosis of RAS/BRAF wt HMCC.

mCRC with high DNA methylation status is associated with poor prognosis7,8,9. The present study further showed that the poor prognosis of HMCC compared with LMCC was observed only in RAS/BRAF wt mCRC patients. Therefore, we focused on this group.

Metastatic and recurrent dMMR CRC has a poorer prognosis than MMR-proficient (pMMR) CRC28. In our cohort, the gene sets up- and down-regulated in MSI-H CRC were enriched in RAS/BRAF wt HMCC and RAS/BRAF wt LMCC, respectively. This symmetry suggests that the gene expression patterns associated with GWMS4 and MSI27 are similar. Thus, the poor prognosis of RAS/BRAF wt HMCC could be attributed to this similarity. Since HMCC exhibited gene expression patterns similar to those of MSI-H CRC even when patients with dMMR were excluded from the analysis, GWMS may identify a broader range of MSI-H CRC and MSI-H-like CRC than IHC of MMR-related proteins.

The similarity between RAS/BRAF wt HMCC and MSI-H CRC may also indicate the importance of GWMS in predicting the treatment response to immune checkpoint inhibitors (ICIs). In the KEYNOTE-177 trial, pembrolizumab or chemotherapy was administered to patients with metastatic MSI-H CRC29. In a multivariate analysis of PFS, pembrolizumab was preferred in the right-sided CRC subgroup but not in the left-sided CRC subgroup. As HMCC frequently occurs in the right-sided colon4, the improvement of PFS in the right-sided CRC subgroup may be attributable to HMCC. Thus, HMCC could be a positive predictor for the treatment effect of ICI in the future.

Moreover, the RAS/BRAF wt HMCC was similar to BRAFV600E CRC in the gene expression profiles. Because BRAFV600E is a well-known poor prognostic factor in CRC, this similarity in gene expression profile may partially explain the poor prognosis in this group.

The anti-EGFR antibody resistance of RAS/BRAF wt HMCC could be attributed to a gene expression pattern similar to that of BRAFV600E CRC. In this study, the gene set related to cetuximab response in CRC, the cetuximab signature24, was correlated with GWMS: genes for cetuximab responders were enriched in LMCC, and genes for non-responders were enriched in HMCC. The cetuximab signature was derived from PDXs, including RAS and BRAF mutants, grouped according to their response to cetuximab. These driver mutations were more frequent in PDXs that did not respond to cetuximab than in those that responded. Thus, the non-responder gene set in the cetuximab signature could include genes upregulated by RAS or BRAF mutation. Moreover, mCRC is known to be anti-EGFR antibody-resistant partially due to the aberrant activation of the mitogen-activated protein kinase (MAPK) pathway30. Therefore, RAS/BRAF wt HMCC was expected to have a gene expression pattern related to MAPK pathway activation. However, GSEA failed to detect such enrichment in HMCC using the gene sets related to the MAPK pathway and RAS-mutated CRC (data not shown). Thus, the type of alterations in gene expression that causes these similarities should be determined in the future.

As for the RAS/BRAF wt LMCC, the good OS in our cohort could be due to the therapeutic effect of anti-EGFR antibodies in the later line after the first-line regimens containing either oxaliplatin or irinotecan. Although RAS/BRAF wt HMCC is reportedly anti-EGFR antibody-resistant in the later line compared with RAS/BRAF wt LMCC and has a poor prognosis4,31, similar to RAS and BRAF mt mCRC32,33,34, some patients in our cohort received anti-EGFR antibody treatment regardless of their GWMS. Thus, the different anti-EGFR antibody responses across GWMS may have affected prognosis in the RAS/BRAF wt group. However, the number of patients who were administered anti-EGFR antibodies was too small to conduct a statistical analysis. Therefore, the speculation that differences in anti-EGFR antibody responses may have affected OS in our cohort remains a hypothesis (data not shown).

This study has some limitations. First, the population was limited to Japanese patients, and the number of patients was small. Additionally, the analyzed population comprised 226 of the 487 patients enrolled in the TRICOLORE trial. This could lead to selection bias influencing treatment outcomes and prognosis; for example, patients with resected samples tended to have good performance status, which made them more likely to undergo surgery and could result in better OS. Second, the molecular mechanism was deduced via GSEA based on statistical inference; therefore, more evidence, such as transcriptomic data from other cohorts, multiomics analysis, and in vitro experiments, is necessary to confirm these inferences. Third, patients with RAS mutations were treated as a single group due to the small sample size, although they should be analyzed in each specific mutation group, given the recent development of RAS inhibitors for mCRC treatment.

In conclusion, we demonstrated that HMCC is a poor prognostic factor in RAS/BRAF wt mCRC. Its gene expression pattern is associated with those of MSI-H CRC and BRAFV600E-mutated CRC, which could play a role in its cetuximab resistance.