Introduction

Colorectal cancer (CRC) is a leading cause of morbidity and mortality globally. Notably, its incidence is on the rise in China, appearing increasingly in younger populations. In 2020, CRC ranked third in cancer occurrence and second in cancer-related fatalities worldwide, excluding non-melanoma cutaneous neoplasms1. The situation is particularly urgent in developing countries where limited awareness of early screening and insufficient health infrastructure contribute to high mortality rates. In these regions, CRC has become the second and third most lethal cancer for females and males, respectively. Lifestyle changes, environmental factors, and an aging population have elevated the risk of CRC in China, posing a significant health threat. Recent advances in genomics, including microarrays and deep sequencing, have led to the discovery of numerous molecular biomarkers over the last twenty years, enhancing our grasp of cancer mechanisms and facilitating early diagnosis and treatment prediction2,3,4,5.

ADME genes are those involved in the absorption, distribution, metabolism, and excretion of drugs. The Pharma ADME consortium identifies 32 primary and 266 supplementary genes within this category6,7,8. These genes fall into distinct classes, including phase I and II xenobiotic-metabolizing enzymes, transport proteins, and regulatory factors9,10,11. Research shows that polymorphisms in ADME genes contribute to inter-individual variability in drug response and cancer development12,13. Differences in ADME gene expression at the transcriptional, translational, and epigenetic levels reflect diverse responses within populations14,15,16.

Research highlights the pivotal role of ADME genes in pharmacokinetics, suggesting they may serve as potentially effective biomarkers for predicting treatment outcomes, adverse drug reactions, drug resistance, and overall survival (OS). For instance, CYP1B1 and ABCB1 have been identified as predictors of clinical response to paclitaxel in breast cancer17. Additionally, Sutsandiram et al. reported a correlation between ABCB1 and increased methotrexate-related adverse events in hematologic malignancies18. Zhang et al. demonstrated a connection between UGT1A1 levels and resistance to 5-fluorouracil in esophageal cancer19, while Hu et al. identified a set of key ADME genes that are prognostic of OS across various cancers20. Despite these findings, the prognostic value and biological roles of ADME genes in colon cancer remain underexplored.

In this study, we used gene expression profiles and clinical data from 430 patients in The Cancer Genome Atlas (TCGA) to identify ADME genes with altered expression in colon adenocarcinoma (COAD). Using Cox regression analysis and stepwise selection based on the Akaike information criterion (stepAIC), we developed a two-gene signature in the TCGA dataset that predicted survival outcomes. This model was further validated using external datasets from the Gene Expression Omnibus: GSE39582 (n = 562) and GSE17536 (n = 177). We then compared functional enrichment, immune cell infiltration, predicted immunotherapy response, and mutation profiles between high-risk (HR) and low-risk (LR) groups. The results suggest that the ADME-related gene signature is linked to immune cell infiltration and may help forecast both prognosis and treatment response in CRC. Figure 1 provides a workflow diagram of our study.

Fig. 1
Flowchart for identifying postoperative colon cancer patients at risk using ADME gene profiling.

Methods

Study design

This study retrieved mRNA expression data and associated clinical information of patients with CRC from public databases. We screened for ADME-related differentially expressed genes (DEGs) and conducted functional analysis. Subsequently, univariate Cox regression (P < 0.05) and stepAIC multivariate stepwise regression analyses were utilized to identify key genes. Using these key genes, we constructed an ADME-related risk score. The study also examined the differences in functional enrichment, immune infiltration, and tumor mutation burden (TMB) between high- and low-risk score groups (Fig. 1).

Data acquisition

Transcriptome data for COAD and corresponding normal samples were sourced from TCGA (https://portal.gdc.cancer.gov/) to form our training set16. The data, downloaded via TCGAbiolinks, were normalized to transcripts per million (TPM). For external validation, we gathered gene expression and survival information for patients from the GSE39582 (562 samples; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE39582) and GSE17536 (177 samples; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE17536) datasets. The data obtained via GEOquery were normalized with the robust multi-array average (RMA) method and then log-transformed. To maintain consistency, Ensembl gene IDs were converted to gene symbols using biomaRt. Genes expressed in fewer than 50% of samples were excluded. Single-cell RNA-seq data from GSE146771 were visualized using TISCH (http://tisch.comp-genomics.org/). Additionally, we compiled a list of 298 ADME-related genes from Pharma ADME (accessed August 2023) for further analysis14.
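The gene-filtering and log-transformation steps described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study. The function names, the toy matrix, the pseudocount, and the reading of "expressed" as a value greater than zero are assumptions, since the text does not state the expression threshold.

```python
import math

def filter_expressed_genes(expr, min_fraction=0.5, threshold=0.0):
    """Keep genes whose value exceeds `threshold` in at least `min_fraction` of samples.

    `expr` maps a gene symbol to a list of per-sample expression values.
    """
    kept = {}
    for gene, values in expr.items():
        n_expressed = sum(1 for v in values if v > threshold)
        if n_expressed / len(values) >= min_fraction:
            kept[gene] = values
    return kept

def log2_transform(expr, pseudocount=1.0):
    """Apply log2(x + pseudocount) to every value (pseudocount is an assumption)."""
    return {g: [math.log2(v + pseudocount) for v in vals] for g, vals in expr.items()}

# Invented toy values, for illustration only.
toy_expr = {
    "GPX3": [12.0, 0.0, 7.5, 3.1],    # expressed in 3 of 4 samples: kept
    "NAT1": [4.2, 5.0, 0.0, 0.0],     # expressed in exactly 50%: kept
    "GENE_X": [0.0, 0.0, 0.0, 1.3],   # expressed in 1 of 4 samples: dropped
}
filtered = filter_expressed_genes(toy_expr)
logged = log2_transform(filtered)
```

Whether a gene sitting exactly at the 50% boundary is kept depends on how the cutoff is read; here it is kept, matching "fewer than 50% ... excluded".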

Differential gene expression analysis and functional enrichment

Using the “limma” package, we identified genes that were differentially expressed between COAD and normal samples in the TCGA dataset, applying cutoffs of false discovery rate (FDR) < 0.05 and |log2 fold change (FC)| > 1. ADME-related DEGs were then obtained by intersecting these DEGs with the 298 ADME-related genes, and the overlap was displayed in a Venn diagram.
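As a rough sketch of the selection logic above, the following Python snippet applies a Benjamini-Hochberg correction, filters on the FDR and fold-change cutoffs, and intersects the result with an ADME gene list. It is not the limma workflow itself. The function names and all toy values are hypothetical, and the choice of Benjamini-Hochberg as the FDR procedure is an assumption, since the text does not name the adjustment method.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def select_adme_degs(results, adme_genes, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """`results` is a list of (gene, log2FC, raw p-value) tuples."""
    fdr = benjamini_hochberg([p for _, _, p in results])
    degs = {
        gene
        for (gene, lfc, _), q in zip(results, fdr)
        if q < fdr_cutoff and abs(lfc) > lfc_cutoff
    }
    return degs & set(adme_genes)

# Invented toy statistics, for illustration only.
toy_results = [
    ("GPX3", -2.4, 0.0001),
    ("NAT1", -1.6, 0.0020),
    ("GENE_A", 0.4, 0.0005),   # significant, but |log2FC| too small
    ("GENE_B", 3.0, 0.6000),   # large effect, but not significant
    ("GENE_C", 1.8, 0.0100),   # a DEG, but not in the ADME list
]
toy_adme = {"GPX3", "NAT1", "ABCB1"}
adme_degs = select_adme_degs(toy_results, toy_adme)
```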

To elucidate the biological roles of these identified genes, we conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses using the R packages “clusterProfiler”, “org.Hs.eg.db”, “enrichplot”, and “ggplot2”.

Development and validation of a prognostic model

We screened differentially expressed ADME-related genes through univariate Cox regression analysis (p < 0.05). To identify the most impactful genes for our predictive model, we implemented a stepAIC method with the MASS package. For each patient, we computed a risk score from the normalized expression level of each gene (\(\text{Exp}_i\)) and its associated regression coefficient (\(\text{Coe}_i\)), where \(N\) is the number of genes in the signature:

\(\text{Risk score} = \sum_{i=1}^{N} (\text{Exp}_i \times \text{Coe}_i)\)

Patients in the COAD dataset were divided into HR and LR groups according to the median risk score. Kaplan-Meier survival curves were used to compare survival times between these groups. The performance of the prognostic model was evaluated using receiver operating characteristic (ROC) curve analysis with the survivalROC package. The model’s validity was further assessed in the GSE39582 and GSE17536 datasets. Both univariate and multivariate Cox regression analyses were used to assess the prognostic independence of the ADME-related risk score alongside other clinical parameters in COAD patients. A nomogram incorporating significant risk factors was created to predict survival, with its accuracy evaluated through calibration curves and decision curve analysis (DCA).
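To make the Kaplan-Meier comparison more concrete, a minimal product-limit estimator is sketched below in Python. It is an illustration only and not the R code used in the study. The function name and the toy follow-up data are hypothetical, and confidence intervals and the log-rank test are omitted.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve

# Invented toy data (months of follow-up), for illustration only.
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

Running the estimator separately on the HR and LR groups would give the two curves that a log-rank test then compares.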

Genomic alterations analysis

We obtained somatic mutation data in mutation annotation format (MAF) from the TCGA database and used the “maftools” R package to examine and display mutation rates among different ADME risk groups, with TMB measured in mutations per megabase (mut/Mb). Additionally, we sourced copy number alteration (CNA) data for COAD patients from TCGA and identified significant genomic amplifications or deletions using GISTIC2.0. The CNA burden was calculated by tallying the total number of genes altered at both focal and chromosomal arm levels.
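The TMB definition above (mutations per megabase) can be illustrated with a short Python sketch. The set of variant classes counted as non-synonymous and the capture size are assumptions; the capture size in particular must match the sequencing design and is not stated in the text. The function name and toy variants are hypothetical.

```python
# MAF Variant_Classification values treated as non-synonymous (an assumption).
NONSYNONYMOUS = {
    "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
    "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
    "Splice_Site", "Translation_Start_Site",
}

def tmb_per_sample(variants, capture_size_mb):
    """Return {sample: mutations per megabase}.

    variants        : iterable of (sample_id, variant_classification)
    capture_size_mb : size of the sequenced region in megabases
    """
    counts = {}
    for sample, vclass in variants:
        counts.setdefault(sample, 0)  # keep samples with no qualifying variants
        if vclass in NONSYNONYMOUS:
            counts[sample] += 1
    return {sample: n / capture_size_mb for sample, n in counts.items()}

# Invented toy variants and an artificially small capture size, for illustration.
toy_variants = [
    ("S1", "Missense_Mutation"),
    ("S1", "Silent"),
    ("S1", "Nonsense_Mutation"),
    ("S2", "Silent"),
]
toy_tmb = tmb_per_sample(toy_variants, capture_size_mb=2.0)
```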

Tumor immune microenvironment and response to immunotherapy evaluation

The ESTIMATE algorithm, implemented in the “estimate” R package, provided immune, stromal, and ESTIMATE scores, along with tumor purity estimates, for COAD patients. The Tumor Immune Dysfunction and Exclusion (TIDE) algorithm (http://tide.dfci.harvard.edu/login/) was used to assess potential responses to immune checkpoint inhibitor (ICI) therapy; lower TIDE scores or higher microsatellite instability (MSI) scores were taken to indicate better predicted responsiveness.

Data interpretation and numerical evaluation

All statistical analyses were performed with R software (v4.3.1). We applied the Wilcoxon test for pairwise comparisons and the Kruskal–Wallis test for multi-group comparisons, with significance levels noted as (*p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001). Survival analysis utilized the Kaplan-Meier method and log-rank test. Participants were stratified into high-risk and low-risk groups based on the median risk score. Statistical significance was defined as p < 0.05.

Results

ADME-associated gene expression patterns in COAD

Differential analysis of RNA-seq data from COAD tumor and normal tissue samples identified 1,718 DEGs, displayed on a volcano plot (Fig. 2A), with 784 genes upregulated and 1,634 genes downregulated. A heatmap of these DEGs illustrated distinct expression differences between tumor and normal tissues (Fig. 2B). Intersecting the DEGs with the 298 ADME genes yielded 49 common genes. Analysis of these 49 ADME DEGs between COAD and normal groups revealed distinct transcriptomic signatures (Fig. 2C). Subsequently, a protein-protein interaction network was constructed, showing close interactions among the proteins encoded by these genes (Fig. 2D). An investigation into the molecular alteration landscape of these ADME-related DEGs revealed missense mutation as the predominant variant type (Fig. 2E), with ABCB1, ABCC1, SULF1, DPYD, and PDE3A being the top five mutated genes. Analysis of copy number variation (CNV) demonstrated significant alterations in the top 20 mutated ADME-related DEGs (Fig. 2F). To elucidate the regulatory mechanisms governing these DEGs, we performed GO and KEGG enrichment analyses. KEGG enrichment analysis showed that the DEGs were predominantly enriched in Retinol Metabolism, Drug Metabolism - Other Enzymes, Chemical Carcinogenesis - DNA Adducts, Drug Metabolism - Cytochrome P450, and Metabolism of Xenobiotics by Cytochrome P450 (Fig. 2G). GO analysis showed that the enriched processes were predominantly related to responses to xenobiotic stimuli, cellular responses to xenobiotic stimuli, xenobiotic metabolic processes, vascular processes in the circulatory system, and alcohol metabolic processes (Fig. 2H).

Fig. 2
Variant landscape of ADME-related genes in colon cancer. (A) Volcano plot of differentially expressed genes (DEGs) significantly upregulated (red) or downregulated (blue) in colon cancer versus normal tissue (FDR < 0.05, |log₂FC| > 1). Grey points indicate non-significant genes. (B) Venn diagram of ADME-related genes overlapping with colon cancer DEGs. (C) Heatmap comparing ADME-related DEG expression between colon adenocarcinoma (COAD) and normal tissues. (D) Protein–protein interaction (PPI) network of ADME-related DEGs from STRING database analysis. (E) Oncoplot of mutation frequencies in the top 20 most altered ADME-related DEGs in TCGA-COAD. (F) Bar plot of copy number variation (CNV) patterns (gain, loss, absence) in top 20 ADME-related DEGs. (G) Bar plot of KEGG pathways significantly enriched in ADME-related DEGs. (H) Dot plot of Gene Ontology (GO) enrichment for biological processes, molecular functions, and cellular components of ADME-related DEGs.

Construction of the ADME-related prognostic signature in COAD

To construct a gene-based predictive model, we first used univariate Cox regression analysis to identify ADME-related genes with significant prognostic potential. This analysis pinpointed four genes (GPX3, NAT1, NAT2, and SULT1B1) with notable prognostic value (p < 0.05) (Fig. 3A). Further refinement through stepAIC analysis reduced the model to two genes, resulting in the following prognostic formula: Risk score = (0.1644 × GPX3) + (-0.6311 × NAT1). Kaplan-Meier analysis showed that the high-risk group had poorer OS (Fig. 3B), and the distribution of risk scores and survival times indicated that OS improved as the risk score decreased (Fig. 3C). The model’s robustness was assessed in two independent cohorts, GSE39582 and GSE17536. In both cohorts, patients in the LR group showed longer OS than those in the HR group (GSE39582: median time = 145 months vs. 105 months, P = 0.015, Fig. 3D; GSE17536: median time = 49.7 months vs. 37.0 months, P = 0.0062, Fig. 3E). Risk score distributions and survival statuses for these cohorts are displayed in Figs. 3F-G. These consistent findings across multiple datasets support the ability of our two-gene prognostic model to forecast COAD patient outcomes.
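The risk score formula and the median split can be illustrated with a short Python sketch. The two coefficients are those reported above; the patient identifiers, expression values, function names, and the rule that a score exactly at the median is assigned to the LR group are assumptions made for illustration.

```python
from statistics import median

# Coefficients of the two-gene signature as reported in the text.
COEFFICIENTS = {"GPX3": 0.1644, "NAT1": -0.6311}

def risk_score(expression):
    """Sum of coefficient x normalized expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

def stratify_by_median(scores):
    """Label each patient HR (above the median score) or LR (otherwise)."""
    cutoff = median(scores.values())
    return {pid: ("HR" if s > cutoff else "LR") for pid, s in scores.items()}

# Invented normalized expression values, for illustration only.
toy_patients = {
    "P1": {"GPX3": 5.0, "NAT1": 1.0},
    "P2": {"GPX3": 2.0, "NAT1": 3.0},
    "P3": {"GPX3": 4.0, "NAT1": 2.0},
    "P4": {"GPX3": 1.0, "NAT1": 4.0},
}
scores = {pid: risk_score(expr) for pid, expr in toy_patients.items()}
groups = stratify_by_median(scores)
```

Because the NAT1 coefficient is negative, patients with high NAT1 and low GPX3 expression (P2 and P4 in this toy example) fall into the LR group.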

Fig. 3
Construction and validation of ADME-related prognostic signature. (A) Forest plot of ADME-related genes selected by multivariate Cox regression with hazard ratios and significance levels. (B, D, E) Kaplan–Meier curves showing worse overall survival in high-risk (red) versus low-risk (blue) patients across training and validation cohorts. (C, F, G) Distribution plots showing risk score correlation with patient mortality (deceased patients cluster at higher scores) and survival duration (longer survival associates with lower scores).

Development and assessment of the prognostic nomogram

To ascertain the prognostic impact of the ADME-related signature, we carried out both univariate and multivariate Cox regression analyses. Univariate analysis identified age, gender, stage, TNM classifications (T, N, M), MSS/MSI status, and the risk score as significant predictors of OS (Fig. 4A). The hazard ratio for the risk score was 2.718 (95% CI: 1.718–4.300, p < 0.001). Multivariate analysis confirmed that age, stage, TNM-T, TNM-N, TNM-M, MSS/MSI status, and the risk score remained independently associated with OS in COAD patients, with a hazard ratio of 2.032 for the risk score (95% CI: 1.198–3.448, p = 0.009) (Fig. 4B). A prognostic nomogram integrating significant clinicopathological features (age, gender, stage, T, M) with the ADME-related signature was developed to predict 1-, 3-, and 5-year survival probabilities (Fig. 4C). The nomogram demonstrated good discriminative performance, with area under the curve (AUC) values of 0.779, 0.788, and 0.765 for 1-, 3-, and 5-year survival, respectively (Fig. 4D). Calibration curves showed close agreement between predicted and observed outcomes (Fig. 4E), while decision curve analysis (DCA) supported clinical utility (Fig. 4F).
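For readers less familiar with Cox model output, the relationship between a regression coefficient, its standard error, and the reported hazard ratio with its 95% CI can be sketched as follows. The snippet assumes a Wald-type interval (coefficient ± 1.96 standard errors on the log scale), which is the usual default but is not stated in the text; the function names are hypothetical.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Convert a Cox coefficient and standard error to (HR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def beta_se_from_hr(hr, lower, upper, z=1.96):
    """Back-compute the coefficient and standard error from a reported HR and CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Univariate result for the risk score reported above: 2.718 (1.718 to 4.300).
beta, se = beta_se_from_hr(2.718, 1.718, 4.300)
hr, ci_low, ci_high = hazard_ratio_ci(beta, se)
```

Under that assumption, the reported univariate interval corresponds to a coefficient of about 1.0 with a standard error of about 0.23, and the round trip reproduces the published bounds to within rounding.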

Fig. 4
Clinical prognostic model development and validation. (A) Univariate analysis of clinicopathologic variables and risk score associated with survival. (B) Multivariate analysis confirming independent survival predictors. (C) Nomogram for individualized 1-, 3-, and 5-year survival prediction. (D) ROC curve evaluating nomogram discriminative performance. (E) Calibration plots assessing prediction accuracy. (F) Decision curve analysis quantifying clinical net benefit for survival-guided decision-making.

DCA indicated that applying the nomogram, which incorporates both critical clinicopathological features and the ADME-related risk score, in clinical decision-making provided greater net benefit across a wide range of threshold probabilities than treating all or no patients. This advantage was consistent across 1-, 3-, and 5-year predictions, supporting its value in guiding treatment intensity and surveillance strategies. Overall, the nomogram demonstrates robust predictive performance and practical clinical utility for prognostic assessment in COAD patients.

Tumor microenvironment and prognostic subgroups related to ADME

We next examined how the ADME prognostic signature correlates with immune status within the TCGA COAD cohort. The tumor microenvironment (TME) comprises primarily stromal and immune cells, which are crucial in cancer progression and signaling. The ESTIMATE algorithm reports an immune score, a stromal score, and an ESTIMATE score. The immune and stromal scores are derived from gene expression signatures and reflect the relative abundance of infiltrating immune cells and stromal cells, respectively; the ESTIMATE score is the sum of the two and is used to infer tumor purity. High tumor purity is associated with an unfavorable prognosis in patients21. Several studies have confirmed that these scores are associated with clinicopathological characteristics and chemotherapeutic drug resistance in various tumor types, and that ESTIMATE could be used as an indicator for patient prognosis assessment22,23,24.

Using the ESTIMATE algorithm, we analyzed the immune microenvironment to understand how the ADME-related risk score relates to immune response. Patients with elevated risk scores tended to have higher immune scores (Fig. 5A), but the difference between the high- and low-risk groups was not significant (P = 0.31); in contrast, stromal scores (P = 4.8 × 10⁻⁹) and ESTIMATE scores (P = 0.00013) differed significantly within the tumor tissue. In the training set, box plots of immune infiltration scores showed significant differences between the high- and low-risk score groups for multiple immune infiltrates (Fig. 5A), several of which were elevated in the high-risk score group.

Differences in the expression of immune checkpoint genes between the HR and LR groups suggest varying susceptibility to ICIs. For instance, CD44 and CD244 were more highly expressed in the LR group, while neuropilin 1 (NRP1) and TNFRSF4 showed higher expression in the HR group (Fig. 5B). Additionally, the TIDE score was significantly lower in LR patients, indicating potential for better immunotherapy efficacy (Fig. 5C). Analysis of cancer stem cell features revealed significant differences in stemness enrichment scores between risk groups, based on 26 stemness gene sets (Fig. 5D). Hypoxia-responsive gene expression was also higher in the LR group (Fig. 5E), suggesting different adaptive responses to tumor microenvironmental stress.

Fig. 5
Immune microenvironment characteristics in high-risk patients. (A) Higher immune/stromal infiltration (green) and lower tumor purity (orange) in high-risk patients. (B) Increased immune checkpoint gene expression in high-risk patients. (C) Higher TIDE scores indicating greater immune evasion potential. (D) Differential stemness pathway activity between risk groups (26 stemness pathways). (E) Elevated hypoxia scores in high-risk tumors.

Tumor genomic changes and their association with ADME risk scores

We examined the association between ADME risk scores and genomic changes, including CNV alterations and mutations. CNV analysis revealed that tumor aneuploidy scores and the fraction of genome altered (FGA) were significantly elevated in the HR group (Figs. 6A-B). Non-synonymous TMB in protein-coding regions was significantly higher in the LR group than in the HR group (Fig. 6C). Notably, mutation frequencies for PIK3CA and TP53 differed in opposite directions between the groups: PIK3CA mutations were present in 22% of HR and 35% of LR cases, whereas TP53 mutations were present in 65% of HR and 45% of LR cases (Figs. 6D-E).

Fig. 6
Genomic instability associated with ADME risk. (A) Higher fraction of genome altered (FGA) in high-risk tumors. (B) Increased aneuploidy in high-risk patients. (C) Higher tumor mutational burden (TMB) with ADME risk. (D–E) Oncoplot of mutation landscape showing frequent mutations (insertions/deletions/frameshifts) in high-risk tumors. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.

This exploration underscores the complex interplay between genetic variations and ADME risk scores, highlighting their potential impact on the genomic landscape of colon cancer patients.

Functional enrichment analysis of the ADME-associated prognostic model

Through differential analysis, we identified DEGs between the HR and LR groups using criteria of |logFC| > 0.5 and FDR < 0.05; the resulting 367 DEGs are shown in a volcano plot (Fig. 7A). These DEGs were subjected to GO analysis and Gene Set Enrichment Analysis (GSEA) of KEGG pathways using the “clusterProfiler” package. Significantly enriched pathways were defined by |NES| > 1, nominal p-value < 0.05, and q-value < 0.25. GSEA revealed that pathways such as Drug Metabolism - Other Enzymes, Fatty Acid Metabolism, Oxidative Phosphorylation, Fatty Acid Degradation, and Porphyrin Metabolism were notably enriched in the LR group. Conversely, the ECM-Receptor Interaction and Focal Adhesion pathways were significantly enriched in the HR group (Figs. 7B-C). GO analysis linked these genes to processes such as extracellular matrix (ECM) organization, extracellular structure organization, collagen fibril organization, and endodermal cell differentiation (Fig. 7D).
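The three GSEA significance criteria above can be applied as a simple filter, sketched here in Python. The row layout, function name, pathway statistics, and the sign convention (positive NES taken to mean enrichment in the HR group) are all assumptions for illustration; the actual sign depends on how the ranked gene list was built.

```python
def significant_pathways(rows, nes_cutoff=1.0, p_cutoff=0.05, q_cutoff=0.25):
    """Keep pathways passing |NES|, nominal p-value, and q-value cutoffs.

    rows: list of dicts with keys 'pathway', 'NES', 'pvalue', 'qvalue'.
    Returns the hits sorted from highest to lowest NES.
    """
    hits = [
        row for row in rows
        if abs(row["NES"]) > nes_cutoff
        and row["pvalue"] < p_cutoff
        and row["qvalue"] < q_cutoff
    ]
    return sorted(hits, key=lambda row: row["NES"], reverse=True)

# Invented toy statistics, for illustration only.
toy_gsea = [
    {"pathway": "Focal adhesion", "NES": 2.1, "pvalue": 0.001, "qvalue": 0.01},
    {"pathway": "Oxidative phosphorylation", "NES": -1.8, "pvalue": 0.004, "qvalue": 0.03},
    {"pathway": "Pathway X", "NES": 0.7, "pvalue": 0.010, "qvalue": 0.05},  # |NES| too small
    {"pathway": "Pathway Y", "NES": 1.5, "pvalue": 0.200, "qvalue": 0.40},  # not significant
]
enriched = significant_pathways(toy_gsea)
enriched_names = [row["pathway"] for row in enriched]
```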

Fig. 7
Biological drivers of ADME-linked prognostic risk. (A) Volcano plot showing upregulated (red) and downregulated (blue) DEGs between high- and low-risk patients. (B–C) KEGG enrichment revealing dysregulated metabolic and signaling pathways in high-risk tumors. (D) Dot plot mapping immune and cellular processes enriched in DEGs (dot size = gene count; color = significance).

Single-cell analysis of ADME-associated prognostic signatures

Using the colon cancer single-cell dataset GSE146771, we analyzed the expression of the ADME-associated prognostic signature genes within the TME. This dataset includes 24 cell populations across 9 major cell types, including B cells, CD4 T cells, CD8 T cells, mast cells, Mono/Macro, NK cells, plasma cells, and Tprolif cells. Mono/Macro and CD4 T cells were the most abundant (Figs. 8A-C). Notably, GPX3 (glutathione peroxidase 3) showed higher expression in Mono/Macro and mast cells, while NAT1 was primarily expressed in Tprolif cells, with notable expression also observed in mast cells (Figs. 8D-F).

Fig. 8
Single-cell validation of ADME genes in the tumor microenvironment. (A) UMAP/t-SNE plot showing 24 distinct cell clusters. (B) Annotation of nine major cell populations, including T cells and macrophages. (C) Bar plot showing cell type abundance across samples. (D–E) Feature plots mapping ADME gene expression across cell clusters. (F) Violin plots showing cell-type-specific ADME gene expression in tumor and immune compartments.

Discussion

CRC ranks among the most prevalent and lethal malignancies worldwide, characterized by high incidence and mortality rates as well as late detection and poor prognosis. The advent of precision medicine has intensified the search for novel tumor markers to aid in the diagnosis, treatment strategies, and outcome predictions of this disease.

ADME genes comprise a group of 298 genes involved in the absorption, distribution, metabolism, and excretion of drugs. Phase I drug-metabolizing enzymes include oxidases, dehydrogenases, hydrolases, and deaminases, primarily involved in functionalization reactions25. Phase II enzymes, mainly transferases such as UDP-glucuronosyltransferases and sulfotransferases, facilitate conjugation26. Drug transporters such as solute carrier transporters (SLC15A2, SLC22A2) and ATP-binding cassette transporters (ABCB1, ABCG2) play crucial roles in the cellular uptake and efflux of drugs27. ADME genes significantly influence the metabolism of both endogenous and exogenous substances, affecting cancer development and progression through mechanisms such as DNA damage and modulation of growth signaling pathways. Genetic polymorphisms in these genes, including single nucleotide polymorphisms (SNPs), have been linked to carcinogenesis and drug responses13,28,29. Recent studies indicate notable differences in the expression of ADME genes between lung cancer tissues and normal tissues, with some genes correlated with patient prognosis in lung cancer. This highlights the importance of ADME genes in cancer biology and treatment efficacy20.

This study pinpointed 49 ADME genes with differential expression in CRC, suggesting their significant roles in the disease’s development and progression.

Through univariate Cox regression and stepAIC multivariate stepwise regression analyses, we established a prognostic model comprising two genes, GPX3 and NAT1, both implicated in tumorigenesis. GPX3 is a key antioxidant enzyme that protects cells from oxidative damage by eliminating reactive oxygen species (ROS). Promoter hypermethylation of GPX3 is associated with reduced expression in various cancers, correlating with tumor initiation, proliferation, and migration due to enhanced oxidative stress and pro-tumorigenic redox signaling30. Reduced GPX3 expression is frequently associated with tumor progression, invasion, and poor prognosis, possibly because it compromises antioxidant defense and thereby promotes malignant phenotypes31. In this model, however, higher GPX3 expression correlated positively with the risk score, which contrasts with its reported role as a tumor suppressor; the basis for this discrepancy remains to be clarified. NAT1, a phase II metabolic enzyme, participates in the detoxification of drugs, including chemotherapeutic agents, and in carcinogen metabolism. Its prognostic impact is complex and tissue-specific32. In this model, higher NAT1 expression correlated significantly and negatively with the risk score, suggesting that increased NAT1 activity in COAD may enhance metabolism and clearance of toxic compounds, including endogenous carcinogens and exogenous drugs, thereby improving patient survival. These findings underscore the potential of ADME genes as biomarkers in the management of CRC and the importance of ongoing research in this area.

To investigate the biological roles of ADME genes in CRC, we conducted functional analyses to compare the DEGs between HR and LR groups. GO, KEGG, and GSEA showed enrichment of drug metabolism-related pathways in the LR group, while pathways involving ECM-receptor interaction and Focal adhesion were predominant in the HR group. Focal adhesion, an anchored junction located on the epithelial basement surface, primarily functions through integrins. These integrins form crucial bridges between the ECM and the actin cytoskeleton, playing a crucial role in various tumor cell functions such as proliferation, survival, migration, invasion, and maintaining stem cell characteristics. The integrin family, key components of focal adhesion, comprises 24 transmembrane αβ heterodimers formed by selective binding of 18 α and 8 β subunits, with integrin β3 often associated with tumor malignancy. The expression patterns of different integrin heterodimers influence cancer cell behavior, triggering various signaling pathways, including PI3K/AKT and MAPK, which promote tumor survival and progression33,34,35.

Analysis of the TME in the TCGA COAD cohort based on ADME-related prognostic subgroups provided insights into the associated immune status and biological roles. ESTIMATE analysis showed significant differences in stromal and ESTIMATE scores between the risk groups, whereas the difference in immune scores did not reach significance, indicating a potential link between ADME-related prognostic signatures and the composition of the tumor microenvironment.

Differential expression of immune checkpoint genes between the LR and HR groups suggests varied responses to ICIs, with significant differences in the expression of genes such as CD44, CD244, NRP1, and TNFRSF4. The lower TIDE scores observed in LR patients compared with HR patients suggest a microenvironment less conducive to immune escape and potentially more responsive to immune checkpoint inhibitors.

Research indicates that CD44 expressed in LS174T colon cancer cells has a tumor-suppressive effect, with silencing of CD44 markedly increasing metastatic potential36. This finding is consistent with our observation that CD44 is highly expressed in the LR group. NRP1, a single-pass transmembrane glycoprotein, functions as a coreceptor that interacts with various extracellular ligands such as semaphorins (SEMA), placental growth factor-2 (PLGF-2), hepatocyte growth factor (HGF), platelet-derived growth factor (PDGF), and transforming growth factor β (TGF-β). These interactions activate oncogenic signaling pathways that regulate HIF-1α protein stability and enhance glycolysis, promoting hepatocellular carcinoma (HCC) proliferation37,38,39,40. Our finding that NRP1 is highly expressed in the HR group is consistent with these reports. Tumor necrosis factor receptor superfamily member 4 (TNFRSF4) was also highly expressed in the HR group, suggesting a role in promoting CRC development and progression. TNFRSF4 plays a crucial role in shaping the immune microenvironment within tumors, enhancing T cell proliferation, survival, and migration through costimulatory signals. Its elevated expression is associated with increased immune infiltration and gene mutation frequency41, further corroborating the study findings.

Significant variations in the enrichment scores of stemness genesets between risk groups suggest a correlation between cancer stem cells and ADME-related prognostic signatures. The observed higher hypoxia scores in LR patients may indicate differences in the TME and cellular responses to hypoxia in these subgroups.

These findings elucidate the complex interactions among ADME-related prognostic signatures, immune responses, cancer stem cells, and hypoxia in the COAD TME. Understanding these dynamics could provide crucial insights into patient outcomes and inform the development of personalized treatment strategies based on immune status and microenvironment characteristics. Continued research could advance immunotherapy and precision medicine approaches for COAD patients.

However, this study has certain limitations. TIDE is a computational algorithm for inferring tumor immune evasion potential. Although the lower TIDE scores in the LR group suggest that low-risk patients may benefit more from immunotherapy, this remains speculative without clinical validation in ICI-treated cohorts. Prospective studies evaluating ICI treatment responses are needed to validate this hypothesis and confirm the model’s predictive value for immunotherapy outcomes.

In the era of high-throughput biology, bioinformatics has become a crucial discipline for interpreting large-scale datasets. It is often regarded as a practical field focused on developing databases and software tools to support other research areas, rather than a fundamental scientific discipline dedicated to uncovering biological principles42,43. Since the model was developed using retrospective data from public databases, large-scale, multi-ethnic prospective validation studies are essential to confirm its clinical applicability. Key ADME-related genes were identified through bioinformatics analyses, but their specific roles in COAD progression remain uncertain. These findings are hypothesis-generating and derived from in silico analyses; therefore, experimental validation in independent cohorts and mechanistic studies are required to confirm the clinical utility and biological mechanisms.

Future research should experimentally elucidate the functional mechanisms of these genes in COAD.

This study presents an in-depth examination of ADME-related genes in COAD, elucidating their influence on prognosis and the TME. Exploring differential expression, molecular alterations, prognostic signatures, and immune responses linked to ADME genes offers significant insights into the underlying processes of COAD development and progression. The construction of a prognostic model and nomogram derived from these genes enhances our understanding of patient outcomes and may support clinical decision-making.