Introduction

Colorectal cancer (CRC) is a major public health concern, ranking among the top causes of cancer-related morbidity and mortality worldwide1,2. In Saudi Arabia, CRC is the most commonly diagnosed cancer in men and ranks third among women, with an increasing incidence observed over recent decades3,4. Notably, a substantial proportion of CRC cases in Saudi Arabia are diagnosed at advanced stages and frequently present with regional or distant metastases5, highlighting the need for effective biomarkers to predict tumor aggressiveness and metastatic potential and to guide management.

A promising candidate is cluster of differentiation 44 (CD44), a transmembrane glycoprotein and the principal receptor for hyaluronic acid6. CD44 plays a crucial role in cell-cell and cell-matrix interactions and is widely recognized for its involvement in various cancer-related processes, including tumor initiation, epithelial-mesenchymal transition (EMT), invasion and metastasis7,8. Moreover, CD44 is considered a major marker of cancer stem cells (CSCs) in multiple solid tumors7,9,10. Its ability to maintain stemness, promote drug resistance and modulate the tumor microenvironment has made it an attractive target in cancer diagnostics and therapeutics8,11,12. In CRC specifically, CD44 has been associated with tumor progression and poor clinicopathological features, including high tumor grade, lymphovascular invasion and nodal involvement13,14. However, the prognostic significance of CD44 expression in CRC remains controversial, with some studies suggesting that it correlates with poor outcomes, while others find no significant survival impact14,15,16. These discrepancies may reflect population heterogeneity, differences in detection methodology, or the dynamic role of CD44 isoforms and co-activated signaling pathways.

Consequently, there is growing interest in exploring CD44 not as an isolated marker, but rather as part of a larger signaling network that drives tumor aggressiveness. One such network is the MAPK/ERK signaling pathway, a key regulator of cell proliferation, differentiation and survival17. ERK1 and ERK2 are activated by phosphorylation (p-ERK1/2) downstream of receptor tyrosine kinases and RAS-RAF signaling, and are known to promote tumor progression and metastasis in CRC18. Previous studies suggest that CD44 may facilitate ERK activation by acting as a co-receptor or scaffold, enhancing MAPK signaling in tumor cells19. Although the CD44-ERK interaction has been characterized in cell lines20,21, its clinical significance in CRC tissues remains poorly defined.

In this study, we aimed to investigate the expression profile of CD44 in a large, well-characterized Saudi cohort of more than 1100 CRC patients using immunohistochemistry. We then explored the relationship between CD44 and p-ERK1/2 expression as a potential predictor of patient prognosis and metastatic risk. Our findings offer a novel tissue-based biomarker signature with potential clinical relevance in CRC.

Results

Patient characteristics

The clinicopathological characteristics of the 1137 CRC patients are summarized in Table 1. The median age of the study cohort was 56.0 years (interquartile range [IQR], 47.0–68.0 years), with a male-to-female ratio of 1.1. Most tumors were located in the left colon (81.3%; 924/1137). Most patients (79.2%; 900/1137) had a moderately differentiated tumor, and 72.0% (819/1137) had stage II or stage III disease. Lymph node metastasis was noted in 50.3% (573/1137) of cases. Synchronous distant metastasis occurred in 13.0% (148/1137), whereas metachronous distant metastasis was seen in 15.9% (181/1137) of CRC cases. By immunohistochemistry, 9.1% (104/1137) of tumors were MMR deficient (Table 1).

Table 1 Clinicopathological variables for the patient cohort (n = 1137).

CD44 immuno-expression and its association with clinicopathological characteristics

CD44 protein expression was assessed immunohistochemically, with membrane staining considered for scoring (Fig. 1). CD44 overexpression was noted in 47.7% (542/1137) of CRC cases and was significantly associated with lymph node metastasis (p = 0.0042), stage III tumors (p = 0.0045), poorly differentiated (grade 3) tumors (p = 0.0111), dMMR status (p = 0.0007) and high Ki-67 proliferation index (p = 0.0051). Interestingly, a significant association was also noted between CD44 overexpression and p-ERK1/2 overexpression (p = 0.0012) (Table 2).

Fig. 1

Tissue microarray (TMA)-based immunohistochemistry analysis of CD44 and p-ERK1/2 in colorectal cancer (CRC) patients. CRC TMA spots showing overexpression of CD44 (A) and p-ERK1/2 (C). In contrast, another set of TMA spots shows reduced expression of CD44 (B) and p-ERK1/2 (D). Images were acquired with a 20×/0.70 objective on an Olympus BX 51 microscope (Olympus America Inc, Center Valley, PA, USA).

Table 2 Correlation of CD44 expression with clinicopathological parameters in colorectal carcinoma.

We next analyzed survival outcomes according to CD44 expression. CD44 expression was not associated with either overall survival (p = 0.7836) or disease-free survival (p = 0.2150) (Fig. 2).

Fig. 2

Survival analysis of CD44 protein expression. Kaplan-Meier survival plots showing no statistically significant difference in (A) overall survival (p = 0.7836) and (B) disease-free survival (p = 0.2150) between the CD44 overexpression and low expression groups.

p-ERK1/2 expression analysis

p-ERK1/2 overexpression was observed in 47.1% of cases (545/1157) and was significantly associated with lymph node involvement (p = 0.0001) and stage IV disease (p = 0.0001) (Table 3). High p-ERK1/2 expression was significantly associated with shorter overall survival in univariate analysis (p = 0.0156; Fig. 3A). However, it did not retain independent significance in the multivariate Cox regression model (HR = 1.32, 95% CI: 0.99–1.77, p = 0.0584; Table 4), likely due to collinearity with advanced tumor stage and nodal involvement, which are established predictors of outcome. No significant association was found with disease-free survival (Fig. 3B).

Table 3 Correlation of p-ERK1/2 expression with clinicopathological parameters in colorectal carcinoma.
Fig. 3

Survival analysis of p-ERK1/2 protein expression. Kaplan-Meier survival plots showing (A) shorter overall survival in the p-ERK1/2 high expression group (p = 0.0156) but (B) no significant difference in disease-free survival (p = 0.2318) between the p-ERK1/2 high and low expression groups.

Table 4 Univariate and multivariate analysis of clinicopathological characteristics predicting overall survival using Cox proportional hazards analysis.

CD44 and p-ERK1/2 co-expression predicts metachronous distant metastasis

To further delineate the clinical relevance of co-expression, we stratified cases into four groups based on CD44 and p-ERK1/2 status. Among the 1,120 cases with available dual-marker data, 284 (25.3%) were double-positive (CD44 high and p-ERK1/2 high), 256 (22.9%) were CD44 high and p-ERK1/2 low, 254 (22.7%) were CD44 low and p-ERK1/2 high, and 326 (29.1%) were double-negative. The double-positive group showed a significant association with advanced disease, including increased lymph node metastasis (p = 0.0117), metachronous distant metastasis (p = 0.0404), and higher tumor stage (III and IV, p = 0.0022), compared with the other subgroups (Table 5). However, CD44 and p-ERK1/2 co-expression was not associated with either overall survival or disease-free survival (Fig. 4).
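The four-group stratification can be illustrated with a minimal Python sketch. The function name and group labels are illustrative assumptions, not the authors' code; the counts are the ones reported in the paragraph above.

```python
def coexpression_group(cd44_high: bool, perk_high: bool) -> str:
    """Assign a case to one of four groups from its two dichotomized markers.

    Hypothetical helper: the labels are chosen for illustration only.
    """
    if cd44_high and perk_high:
        return "double-positive"
    if cd44_high:
        return "CD44 high / p-ERK1/2 low"
    if perk_high:
        return "CD44 low / p-ERK1/2 high"
    return "double-negative"


# Group sizes as reported in the text (cases with dual-marker data).
reported_counts = {
    "double-positive": 284,
    "CD44 high / p-ERK1/2 low": 256,
    "CD44 low / p-ERK1/2 high": 254,
    "double-negative": 326,
}

total_cases = sum(reported_counts.values())
# Share of each group as a percentage of all dual-marker cases.
group_shares = {g: 100 * n / total_cases for g, n in reported_counts.items()}

for group, share in group_shares.items():
    print(f"{group}: {reported_counts[group]} cases ({share:.1f}%)")
```

The four counts sum to the 1,120 cases with dual-marker data, which is a quick check that every case falls into exactly one group.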

Table 5 Correlation of CD44 and p-ERK1/2 co-expression with clinicopathological parameters in colorectal carcinoma.
Fig. 4

Survival analysis of CD44 and p-ERK1/2 co-expression. Kaplan-Meier survival plots showing no statistically significant difference in (A) overall survival (p = 0.2435) and (B) disease-free survival (p = 0.5449) for CD44 and p-ERK1/2 co-expression.

Considering the association of CD44 and p-ERK1/2 co-expression with metachronous distant metastasis, we sought to determine if co-expression could independently predict development of metachronous distant metastasis, using logistic regression analysis. Univariate analysis revealed T stage, N stage, MMR status and CD44/p-ERK1/2 co-expression as predictors of distant metastasis. On multivariate analysis, we found CD44 and p-ERK1/2 co-expression to be an independent predictor of metachronous distant metastasis (Odds ratio = 1.73; 95% confidence interval = 1.11–2.69; p = 0.0149) (Table 6).
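As a reader-side consistency check, the reported odds ratio and confidence interval can be related to the reported p value. The sketch below assumes a Wald-type interval that is symmetric on the log scale; the text does not state which interval type the software produced, so this is an assumption, and the function name is hypothetical.

```python
import math


def wald_from_ci(estimate: float, lower: float, upper: float,
                 z_crit: float = 1.959964):
    """Recover the log-scale coefficient, standard error, z and two-sided p
    from a ratio estimate (odds or hazard ratio) and its 95% CI.

    Assumes a Wald interval: exp(beta +/- z_crit * se).
    """
    beta = math.log(estimate)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    # Two-sided p value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_value


# Values reported for CD44/p-ERK1/2 co-expression in the multivariate model.
beta, se, z, p_value = wald_from_ci(1.73, 1.11, 2.69)
print(f"beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p_value:.4f}")
```

Because the published odds ratio and interval bounds are rounded to two decimals, the recovered p value is expected to be close to, but not identical with, the reported p = 0.0149.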

Table 6 Univariate and multivariate analysis of clinicopathological characteristics predicting metachronous distant metastasis using logistic regression analysis

Discussion

In this large-scale IHC study of 1,137 CRC cases from Saudi Arabia, we evaluated the clinicopathological and prognostic significance of CD44 and its potential synergy with p-ERK1/2. CD44 overexpression was observed in 47.7% of tumors and was significantly associated with lymph node metastasis, advanced tumor stage (stage III), poor differentiation (grade 3), dMMR status and elevated Ki-67 index, all features consistent with tumor aggressiveness. However, CD44 expression alone showed no significant impact on patient survival, which prompted further exploration of its co-expression with p-ERK1/2. Interestingly, we found that 25.3% of CRC cases co-expressed CD44 and p-ERK1/2, with this subset exhibiting a stronger association with advanced stage (III and IV), lymph node metastasis and, most notably, metachronous distant metastasis. Co-expression emerged as an independent predictor of metachronous distant metastasis in multivariate analysis, highlighting its clinical and biological relevance. Our findings build on and extend a growing body of literature implicating CD44 in tumor progression. CD44 is widely regarded as a key marker of cancer stem cells (CSCs), contributing to tumor initiation, invasion, EMT and resistance to chemotherapy22,23.

In CRC, CD44 overexpression has been linked to advanced disease stage and poor differentiation14,24. However, its role as a prognostic marker has remained unclear. Some studies have reported a survival disadvantage associated with CD44 positivity14,25,26, while others, in concordance with our own study, have found no significant association with overall or disease-free survival16,27. Although CD44 overexpression was significantly associated with adverse clinicopathological features, including lymph node metastasis, high tumor grade, and dMMR status, it did not independently predict survival in our cohort. This apparent disconnect aligns with prior literature suggesting that CD44 primarily facilitates local invasion and early metastatic steps, whereas its influence on long-term outcomes is modulated by other oncogenic pathways and tumor-intrinsic factors19,28,29,30. In our analysis, only co-expression of CD44 with p-ERK1/2, an activated effector of the MAPK signaling cascade, identified a biologically aggressive subgroup with increased risk of distant metastasis. These findings support the view that CD44 functions as a context-dependent facilitator of tumor progression, whose prognostic relevance is enhanced in the presence of concurrent MAPK pathway activation. Additionally, population-specific genetic backgrounds and treatment patterns may have attenuated the observable survival impact of CD44 alone in our Saudi cohort, underscoring the need for broader validation in diverse populations. Notably, despite the association between CD44/p-ERK1/2 co-expression and an elevated risk of distant metastasis, this did not translate into a significant survival difference in our cohort. This discrepancy may reflect variability in post-metastatic treatment regimens, or suggest that the CD44–ERK axis predominantly contributes to the initiation phase of metastatic spread. Its influence on long-term outcomes may be mitigated by effective systemic therapies following metastasis.
The inconsistencies in reported prognostic outcomes may also reflect methodological differences, population heterogeneity, or context-dependent expression of CD44, which exists in multiple isoforms and engages in dynamic signaling interactions14.

Notably, our study showed a synergistic interaction between CD44 and ERK signaling. The MAPK/ERK pathway plays a critical role in CRC biology, driving cell proliferation, differentiation and metastasis in response to upstream growth factor and oncogenic RAS/RAF signaling18,31. Our finding that p-ERK1/2 expression correlated with CD44 overexpression is supported by mechanistic studies in other tumor types19,32,33,34. CD44 can facilitate ERK activation by clustering with receptor kinases such as EGFR or c-MET and promoting downstream signal transduction8. Moreover, CD44 may function as a signaling scaffold, enhancing MAPK pathway activation and ERK nuclear translocation19,35.

In the context of cancer stemness, ERK activation has been shown to maintain the self-renewal capacity of CD44+ CSCs and promote metastatic dissemination19,36. This layered molecular interaction may explain why CD44 alone is insufficient to predict outcomes, whereas its co-activation with p-ERK1/2 defines a biologically aggressive subset of CRC. Overall, our study demonstrates a strong clinical synergy between CD44 and p-ERK1/2 in a large, well-characterized cohort. Importantly, while CD44 or ERK may variably impact prognosis, their co-expression defines a high-risk subset of CRC with enhanced metastatic potential, providing a more refined biomarker axis than either protein alone. This co-expression profile may help identify patients who are more likely to develop distant metastasis and who may benefit from more intensive follow-up or targeted interventions. While CD44/p-ERK1/2 co-expression correlated with an increased risk of metachronous distant metastasis, it was not associated with shorter disease-free survival. This distinction reflects the differing endpoints analyzed: metachronous metastasis represents the specific occurrence of distant spread, whereas disease-free survival encompasses all recurrence types and is influenced by adjuvant therapy and post-recurrence management. Hence, CD44/p-ERK1/2 co-expression may mark tumors with enhanced metastatic potential without necessarily predicting overall recurrence timing or survival outcome. Furthermore, the CD44–ERK axis represents a promising therapeutic target. Preclinical studies have suggested that MEK inhibitors may be more effective in CD44-expressing tumors, and that co-targeting CD44 and ERK may help overcome resistance in CRC models21,37.

While our findings suggest that CD44 and p-ERK1/2 co-expression marks a biologically aggressive subset of colorectal cancer, the therapeutic implications remain speculative. Although prior studies in CRC models have shown that CD44 can activate ERK signaling and promote invasion, EMT, and chemoresistance38,39, these mechanistic insights require broader validation across CRC systems. At this stage, our data support the co-expression pattern as a prognostic biomarker rather than an immediately targetable pathway.

The strength of this study is its focus on an under-represented Middle Eastern population. Most prior studies on CD44 in CRC have focused on Western or Asian populations, whereas this study draws entirely on Saudi patients, which may help in forming region-specific risk stratification strategies. Nevertheless, some limitations should be acknowledged, including the retrospective design of the study, the lack of isoform-specific CD44 staining (e.g., CD44v6) and the absence of functional assays to directly test pathway interaction. Another limitation is the incomplete documentation of treatment history in approximately 20% of cases, largely due to archival constraints from earlier years (1990–2011). Given the potential impact of therapy on biomarker expression, treatment variables were intentionally excluded from multivariate models to minimize the risk of bias. Our results therefore warrant validation in future cohorts with comprehensive treatment annotation. However, the clinical associations observed here are consistent with known mechanisms from in vitro studies, and our findings provide a potential foundation for future translational or therapeutic research.

In conclusion, our study provides novel evidence that co-expression of CD44 and p-ERK1/2 is a clinically meaningful biomarker for aggressive disease in CRC. While CD44 alone lacks prognostic value, its synergy with ERK activation identifies a high-risk subgroup with a propensity for distant metastasis. These findings have potential implications for biomarker-guided prognostic and therapeutic stratification, and for future drug development targeting the CD44-ERK axis.

Materials and methods

Sample selection and clinicopathological data

Archival samples from 1137 CRC patients diagnosed between 1990 and 2015 at King Faisal Specialist Hospital and Research Center (Riyadh, Saudi Arabia) were included in the study. Clinicopathological data were collected from patient medical records and are summarized in Table 1. Distant metastasis was divided into synchronous (detected within 6 months of diagnosis) and metachronous (detected more than 6 months after diagnosis) metastasis. Overall survival was defined as the length of time from the date of diagnosis during which the patient remained alive. Of the 1137 patients, 218 (19.2%) died and the remaining 919 (80.8%) were censored at the time of last follow-up. Disease-free survival was defined as the length of time after the patient's initial surgery during which the patient survived without any signs or symptoms of the cancer (such as local, regional or distant recurrence, or disease-related death). Of the 1137 patients, 352 (31.0%) progressed and the remaining 785 (69.0%) were censored at the time of last follow-up. The median follow-up duration for the entire cohort was 41.2 months.
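The way observed events and censored follow-up times combine into a survival curve can be sketched with the Kaplan-Meier (product-limit) estimator. This is a minimal pure-Python illustration, not the authors' JMP workflow; the function name and the five toy follow-up times are hypothetical and are not patient data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up durations (e.g., months)
    events : 1 if the event was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # each subject leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t are removed after time t.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: five subjects, two of them censored.
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored subjects contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why the proportion censored matters when reading the survival plots.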

The Institutional Review Board of King Faisal Specialist Hospital and Research Centre provided ethical approval for the current study. The Research Advisory Council (RAC) granted a waiver of informed consent for the use of retrospective patient case data and archival tissue samples under project RAC# 2190 016. All methods were carried out in accordance with the Declaration of Helsinki.

Tissue microarray construction & immunohistochemistry

A tissue microarray (TMA) format was used for immunohistochemical analysis of samples. For TMA construction, representative tumor regions from each donor tissue block were chosen, and tissue cylinders with a diameter of 0.6 mm were punched and transferred into a recipient paraffin block using a modified semiautomatic robotic precision instrument (Beecher Instruments, Woodland, WI, USA). Two spatially distinct tumor cores (0.6 mm each) were selected per case by two independent pathologists, with one core taken from the invasive front when identifiable. This approach has demonstrated reproducibility for CD44 immunohistochemical assessment in prior studies, including consistent inter-core and whole-section concordance40.

Tissue microarray slides were processed and stained manually as described previously41. Primary antibodies against CD44 and p-ERK1/2 were used, details of which are provided in Table 7. A normal colon tissue microarray was also stained to validate the antibodies. Normal tissues from different organ systems were also included in the TMA to serve as positive controls. Negative controls were performed by omitting the primary antibody. For CD44, membranous staining was considered for scoring. Positive staining in more than 10% of tumor cells was considered CD44 overexpression42. Nuclear staining was considered for scoring p-ERK1/2. The H score was used to analyze p-ERK1/2 staining43. Briefly, staining intensity in each TMA spot was scored from 0 to 3 (I0, I1–I3), and the proportion of tumor cells staining at each intensity was recorded in 5% increments over a range of 0–100 (P0, P1–P3). A final H score (range 0–300) was obtained by summing the products of each intensity and its corresponding proportion (H score = I1 × P1 + I2 × P2 + I3 × P3). p-ERK1/2 expression was dichotomized at the median H score, with H score ≤ 60 classified as low expression and H score > 60 classified as overexpression.
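The scoring rules above translate directly into a few lines of Python. This is a minimal sketch: the function and constant names are illustrative assumptions, while the formula and the cutoffs (H score > 60 for p-ERK1/2, more than 10% positive tumor cells for CD44) are taken from the text.

```python
PERK_H_SCORE_CUTOFF = 60      # median H score stated in the text
CD44_PERCENT_CUTOFF = 10      # percent of tumor cells with membranous staining


def h_score(p1: float, p2: float, p3: float) -> float:
    """H score = 1*P1 + 2*P2 + 3*P3, where Pk is the percentage (0-100)
    of tumor cells staining at intensity k. Result ranges from 0 to 300."""
    for p in (p1, p2, p3):
        if not 0 <= p <= 100:
            raise ValueError("each percentage must lie between 0 and 100")
    if p1 + p2 + p3 > 100:
        raise ValueError("stained percentages cannot sum to more than 100")
    return 1 * p1 + 2 * p2 + 3 * p3


def perk_status(score: float) -> str:
    """Dichotomize p-ERK1/2: H score > 60 is overexpression, otherwise low."""
    return "overexpression" if score > PERK_H_SCORE_CUTOFF else "low"


def cd44_status(percent_positive: float) -> str:
    """Dichotomize CD44: more than 10% positive tumor cells is overexpression."""
    return "overexpression" if percent_positive > CD44_PERCENT_CUTOFF else "low"


# Hypothetical spot: 20% weak, 10% moderate, 10% strong nuclear staining.
example_score = h_score(20, 10, 10)   # 20 + 20 + 30
print(example_score, perk_status(example_score))
```

Note that a spot scoring exactly 60 falls in the low group, and a tumor with exactly 10% CD44-positive cells is not counted as overexpressing, because both thresholds are strict.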

Evaluation of mismatch repair protein staining was performed as described previously44. Briefly, MMR protein expression was evaluated using MSH2, MSH6, MLH1 and PMS2 proteins. Details of the primary antibodies used are provided in Table 7. A tumor was classified as deficient MMR (dMMR) if any of the four proteins showed complete loss of staining in tumor cells with concurrent positive staining in the nuclei of normal epithelial cells. Otherwise, it was classified as proficient MMR (pMMR).
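The dMMR rule can be expressed as a short function. This sketch is an illustration under stated assumptions: the function name and input format are hypothetical, and the internal control (positive nuclear staining in normal epithelium) is assumed to have been checked before calling it.

```python
from typing import Mapping

MMR_PROTEINS = ("MLH1", "MSH2", "MSH6", "PMS2")


def mmr_status(tumor_staining_retained: Mapping[str, bool]) -> str:
    """Classify a tumor as dMMR or pMMR from four IHC readouts.

    tumor_staining_retained maps each protein name to True if nuclear
    staining is retained in tumor cells and False if it is completely lost.
    Assumes the internal positive control stained as expected.
    """
    missing = [p for p in MMR_PROTEINS if p not in tumor_staining_retained]
    if missing:
        raise ValueError(f"missing readout for: {', '.join(missing)}")
    any_lost = any(not tumor_staining_retained[p] for p in MMR_PROTEINS)
    return "dMMR" if any_lost else "pMMR"


# Hypothetical case with loss of MLH1 and PMS2.
example_case = {"MLH1": False, "MSH2": True, "MSH6": True, "PMS2": False}
print(mmr_status(example_case))
```

Loss of a single protein is enough for a dMMR call, so the function uses `any` rather than requiring loss of a specific pair.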

IHC scoring was done by two pathologists, blinded to the clinicopathological characteristics. Discordant scores were reviewed together to achieve agreement.

Table 7 Antibodies used for TMA IHC Analysis.

Statistical analysis

Associations between clinicopathological variables and protein expression were analyzed using contingency table analysis and the chi-square test. The Kaplan-Meier method was used to generate survival curves, and the Mantel-Cox log-rank test was used to evaluate significance. Univariate and multivariate analyses were performed using the Cox proportional hazards model to determine factors predicting overall survival and a logistic regression model to determine factors predicting metachronous distant metastasis. Two-sided tests were used for all calculations, and the limit of significance was defined as a p value of < 0.05 for all analyses. Data analyses were performed using the JMP 14.0 software package (SAS Institute, Inc., Cary, NC).
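A contingency-table test of this kind can be sketched in pure Python for the 2 × 2 case. The example below uses the four co-expression group counts given in the Results as a CD44-by-p-ERK1/2 table. It assumes an uncorrected Pearson chi-square; the text does not say whether the reported p values are Pearson or likelihood-ratio statistics, and the denominators in Table 2 may differ, so the output is not expected to reproduce the published p value exactly. The function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: CD44 high, CD44 low. Columns: p-ERK1/2 high, p-ERK1/2 low.
chi2_stat, chi2_p = chi_square_2x2(284, 256, 254, 326)
print(f"chi-square = {chi2_stat:.2f}, p = {chi2_p:.4f}")
```

The closed-form tail probability holds only for one degree of freedom; larger tables would need the general chi-square distribution from a statistics library.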