Introduction

Head and neck squamous cell carcinoma (HNSCC) is a type of cancer that affects the upper part of the digestive system, impacting regions such as the mouth, lips, nasal passages, larynx, and pharynx1. Currently, HNSCC is identified as the sixth most prevalent form of cancer worldwide, with an annual report of more than 650,000 new cases and 90,000 deaths2. Estimates suggest that the number of head and neck cancer cases might rise to 856,000 by the year 2035, underscoring the urgent need to address and advance treatment methodologies in the coming years3. Immunotherapy has emerged as an advanced standard treatment for patients with HNSCC, showing significant efficacy. However, the challenge remains in accurately predicting the therapeutic outcomes of immune checkpoint inhibitors (ICIs) for patients at the same TNM staging due to the lack of effective biomarkers4,5.

Over the past decade, research has shown that treatment with ICIs delivers enduring clinical advantages to individuals with late-stage malignancies, such as HNSCC6,7. Specifically, inhibiting the PD-1 signaling receptor restores the cytotoxic activity of tumor-specific CD8+ T cells in the tumor microenvironment (TME), thus reducing the tumors’ ability to evade immune detection and slowing disease progression. As a crucial form of immunotherapy, ICIs have attracted extensive attention from researchers over the years. These inhibitors work by blocking immune checkpoints such as PD-L18, thereby lifting the suppression on T cells and enabling them to more effectively identify and attack cancer cells9. Immune checkpoints serve as safety mechanisms within the immune system, preventing it from mistakenly attacking normal cells. However, cancer cells can exploit these checkpoints to escape immune detection and destruction. Consequently, this therapeutic approach has become a primary treatment for HNSCC in recent years10,11,12.

ICIs offer a revolutionary approach to treating HNSCC. These compounds, especially those targeting the PD-1/PD-L1 signaling pathway, have shown effectiveness across multiple cancer types, such as kidney cancer, lung cancer, and HNSCC13. They have significantly improved one-year survival rates for patients with recurrent metastatic HNSCC to a range of 36% to 57%, and extended median survival time to between 7.7 and 13.0 months14,15. This therapeutic effect arises because ICIs reinvigorate T-cell-mediated immune responses against tumor cells by blocking inhibitory signaling pathways16. For example, two important trials, KEYNOTE-040 and CheckMate 141, studied the effects of pembrolizumab and nivolumab, respectively, in recurrent or metastatic HNSCC that had progressed after previous treatments. Both trials showed an improvement in overall survival (OS) and led to the approval of these therapies for recurrent or metastatic HNSCC17. Nevertheless, clinical trials indicate that only a minority of HNSCC patients (18%) benefit from ICI treatment18,19.

Consequently, there is a pressing need to develop models based on large-scale multi-omics data and advanced methods that can identify and predict biomarkers for patients likely to respond effectively to immunotherapy. In this context, we carried out the following work. First, by integrating machine learning with multi-omics approaches, we independently identified two molecular subtypes and, by analyzing their markers, discovered related immune-regulated genes (IRGs). Next, leveraging these IRGs, we used 303 combinations of machine learning algorithms to develop a consensus machine learning-driven prediction immunotherapy signature (CMPIS). Subsequently, the prognostic model was validated and compared against clinical characteristics, external datasets, and previously published models. Ultimately, we found that the low-CMPIS patients were more suitable for immunotherapy and chemotherapy, whereas patients with high CMPIS were better suited for radiation therapy and EGFR-targeted treatment. These findings indicate that patient stratification based on CMPIS can enhance the accuracy of treatment plan selection.

Materials and methods

Downloading and organizing of publicly available data

We first acquired multi-omics data on HNSCC from the TCGA database, including mRNA, lncRNA, miRNA, DNA methylation, somatic mutations, and clinical data (Supplementary Table S1). Meanwhile, we acquired mRNA and clinical data on four HNSCC cohorts from the GEO database, including GSE27020, GSE41613, GSE42743, and GSE65858 (Supplementary Table S2)20,21,22,23. Finally, mRNA and clinical data were obtained from the GEO database and the R package “IMvigor210CoreBiologies” for four immunotherapy cohorts (GSE78220, GSE91061, GSE135222, and IMvigor210) (Supplementary Tables S3-S6)24,25,26,27,28.

For data from microarrays, we performed a log2 transformation and normalized the expression values to achieve inter-array consistency29. For RNA-seq data obtained through high-throughput sequencing, we used TPM values, which offer a standardized measure of gene expression that closely resembles the output of microarray analysis and thereby facilitates comparisons across samples30. The GSE27020, GSE41613, GSE42743, and GSE65858 datasets were merged into the META dataset using the “ComBat” function, and the validity and stability of the merge were further verified using PCA31. The gene signatures for this study were acquired from the package “IOBR” and previously published literature19,32,33.
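
The two expression scales mentioned above can be illustrated with a minimal Python sketch. It is not the authors' R pipeline: the function names, the toy counts, the transcript lengths, and the pseudocount of 1 are all assumptions made for illustration.

```python
import math


def tpm(counts, lengths_kb):
    """Convert raw read counts for one sample to transcripts per million (TPM).

    counts and lengths_kb are parallel lists (hypothetical inputs):
    read count and transcript length in kilobases for each gene.
    """
    # Step 1: length-normalize each gene (reads per kilobase).
    rpk = [c / l for c, l in zip(counts, lengths_kb)]
    # Step 2: rescale so that the values of the sample sum to one million.
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]


def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumption, not from the paper."""
    return [math.log2(v + pseudocount) for v in values]


# Toy data for three hypothetical genes.
counts = [100, 200, 300]
lengths_kb = [1.0, 2.0, 3.0]
sample_tpm = tpm(counts, lengths_kb)
sample_log = log2_transform(sample_tpm)
```

Because TPM values sum to the same total in every sample, they are comparable across samples before any batch correction is applied.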

Identification of cancer subtypes by multi-omics consensus clustering analysis

We first obtained the expression matrix of immune checkpoint genes (ICGs) from mRNA data. For lncRNA, miRNA, and methylation data, the top 1500 genes with the highest degree of variation were filtered by the “getElites” function, and the prognostic genes were screened by univariate Cox analysis with clinical data (p < 0.05). For the mutation data, the top 5% of most frequently mutated genes were filtered by the “getElites” function. Next, the above multi-omics data were passed to the R package “MOVICS” for further analysis34. The optimal number of clusters was determined from the sum of the Clustering Prediction Index (CPI) and Gap statistics35. Finally, we used 10 clustering algorithms included in the “getMOIC” function to analyze the above multi-omics data and combined the results of the various algorithms with consensus clustering to enhance the robustness of the clusters34.
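
The variation-based filtering step can be sketched as follows. This is an illustrative Python stand-in, not the MOVICS implementation: the text only states that the most variable features were kept, so the use of the median absolute deviation (MAD) as the variability measure, the function name, and the toy feature names are assumptions.

```python
import statistics


def top_variable_features(matrix, n_top):
    """Return the n_top feature names with the largest variability.

    matrix maps a feature name to its values across samples.
    Variability is measured by the median absolute deviation (an assumption).
    """
    def mad(values):
        med = statistics.median(values)
        return statistics.median([abs(v - med) for v in values])

    ranked = sorted(matrix, key=lambda name: mad(matrix[name]), reverse=True)
    return ranked[:n_top]


# Toy matrix with three hypothetical lncRNAs measured in four samples.
expr = {
    "lncA": [1.0, 1.1, 0.9, 1.0],
    "lncB": [0.0, 5.0, 10.0, 15.0],
    "lncC": [2.0, 4.0, 2.0, 4.0],
}
selected = top_variable_features(expr, n_top=2)
```

In the study the cut-off was the top 1500 features per omics layer; the toy call keeps two of three only to show the ranking.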

Validation, molecular characterization, and immune landscape of cancer subtypes

Initially, to ensure the robustness of the subtype classification, we validated the subtypes in the META cohort using subtype-specific biomarkers. This step allowed us to confirm the reliability and stability of the clustering results. Subsequently, we assessed the concordance between the consensus clustering approach and the NTP and PAM classifiers, with the aim of evaluating the consistency and reproducibility of subtype assignments across analytical frameworks. PCA, t-SNE, and UMAP were also used to visualize the gene expression distribution of the cancer subtypes (CSs)36.

Then, we investigated the association between CSs and marker genes for clinical characteristics and subtypes. The GSVA algorithm was employed to assess the changes in the relevant signaling pathways in different CSs37. Transcriptional regulatory networks (regulons) were built using the package “RTN”. The analysis covered a network of 23 induced or repressed target-associated transcription factors (TFs) and 71 candidate regulators associated with cancerous chromatin remodeling38.

Finally, based on the R package “IOBR” and previously published literature, we gathered a variety of published signatures associated with the tumor microenvironment (TME), T cells, B cells, CAFs, CAFs at the single-cell level, immune suppression, immune exclusion, immune function, immunotherapy, and ICGs. We then used a standardized approach to compute the enrichment score of each signature for individual samples, allowing a thorough examination of the immunological differences between CSs32,39.

Construction of a consensus machine learning-driven prediction immunotherapy signature

To improve comparison across datasets, we standardized the data in all cohorts with Z scores. The TCGA-HNSCC dataset was regarded as the training set, while the GSE27020, GSE41613, GSE42743, and GSE65858 cohorts were regarded as the test sets. First, we took the intersection of the genes in these datasets to obtain the shared genes for subsequent analyses. Second, we conducted univariate Cox analysis in the TCGA-HNSCC, GSE27020, GSE41613, GSE42743, and GSE65858 cohorts. Genes with p < 0.001 in the TCGA-HNSCC cohort, p < 0.01 in the other cohorts, and the same HR direction were recognized as prognostic immune-regulated genes (PIRGs). We used 10 machine learning algorithms whose parameters were primarily referenced from previous high-quality literature40. These algorithms were then ranked and combined to form 303 model combinations. Each model was fitted on the training set, and the fitted model and its parameters were then applied to the test sets. Finally, we chose the algorithm combination with the highest average C-index across all cohorts as the final method for constructing the CMPIS.
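
The selection criterion used here, the concordance index (C-index), can be sketched in a few lines of Python. This is a simplified version of Harrell's C-index that ignores tied survival times, and it is not the code used in the study; the model names and all numbers below are hypothetical toy values, not results from the paper.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    A pair (i, j) is comparable when patient i had an event and a strictly
    shorter follow-up time than patient j. The pair is concordant when the
    patient with the shorter survival has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def best_model(cindex_by_model):
    """Pick the model with the highest mean C-index across cohorts."""
    means = {m: sum(v) / len(v) for m, v in cindex_by_model.items()}
    return max(means, key=means.get), means


# Toy cohort: higher risk score should mean shorter survival.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
c_good = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
c_bad = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])

# Hypothetical per-cohort C-index values for two candidate models.
winner, mean_cindex = best_model({
    "model_A": [0.60, 0.70],
    "model_B": [0.65, 0.61],
})
```

Averaging across cohorts rather than taking the best single-cohort value favors models that generalize, which matches the consensus idea behind the signature.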

Comparison of CMPIS with clinical features and published signatures

First, we obtained the risk score of each sample using the StepCox [forward] + Ridge algorithm. Then, we used the “surv_cutpoint” function from the R package “survminer” to determine the optimal cut point of this continuous score. Finally, this optimal cut point was used as a threshold to divide patients into low CMPIS and high CMPIS groups. C-index values were employed to compare the stability and accuracy of CMPIS with clinical features and published signatures. Univariate and multivariate Cox analyses were performed to confirm whether CMPIS was an independent prognostic factor. The time-dependent ROC curve was also employed to compare the stability and accuracy of CMPIS with clinical features. To obtain a comprehensive list of previously published signatures, we searched the PubMed database with the keywords “model” and “signature” combined with "head and neck squamous cell carcinoma", "laryngeal squamous cell carcinoma", "oral squamous cell carcinoma", or "tongue squamous cell carcinoma". All studies from 2018 to the present were searched, and after screening, a total of 94 previously published models were obtained41.
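
The idea of choosing a data-driven cut point can be sketched as below. Note the substitution: “surv_cutpoint” relies on maximally selected rank statistics, whereas this Python sketch simply scans candidate cut points and keeps the one with the largest two-group log-rank chi-square. The function names, the minimum group size, and the toy data are assumptions for illustration only.

```python
def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = [k for k in range(len(times)) if times[k] >= t]
        n = len(at_risk)
        n1 = sum(1 for k in at_risk if in_high[k])
        d = sum(1 for k in at_risk if times[k] == t and events[k] == 1)
        d1 = sum(1 for k in at_risk
                 if times[k] == t and events[k] == 1 and in_high[k])
        # Observed minus expected events in the high group at time t.
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutpoint(times, events, scores, min_group=2):
    """Scan observed scores as thresholds; 'high' means score > threshold."""
    best_cut, best_stat = None, -1.0
    for cut in sorted(set(scores)):
        in_high = [s > cut for s in scores]
        n_high = sum(in_high)
        if n_high < min_group or len(scores) - n_high < min_group:
            continue  # skip splits that leave a group too small
        stat = logrank_chi2(times, events, in_high)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat


# Toy data: higher scores go with shorter survival; all events observed.
scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
times = [30, 28, 25, 10, 8, 5]
events = [1, 1, 1, 1, 1, 1]
cut, stat = best_cutpoint(times, events, scores)
median_split_stat = logrank_chi2(times, events, [s > 0.3 for s in scores])
```

A cut point chosen this way is optimized on the data at hand, so the resulting p-value is optimistic unless it is confirmed in independent cohorts, which is why the external validation sets matter.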

Construction of a nomogram and implementation of enrichment analysis

To improve the clinical utility of the CMPIS, a nomogram incorporating both clinical features and the CMPIS was developed to predict the survival rates of HNSCC patients. To elucidate the differences in biological functions and signaling pathways among CMPIS groups, we obtained differentially expressed genes (DEGs) between the CMPIS groups (|log2FC| ≥ 1 and FDR < 0.05) and performed GO and KEGG enrichment analysis29,42,43.
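
The DEG thresholds can be made concrete with a short Python sketch. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, as are the function names, gene labels, and toy numbers.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, lfc_cut=1.0, fdr_cut=0.05):
    """Keep genes with |log2FC| >= lfc_cut and adjusted p-value < fdr_cut."""
    fdr = benjamini_hochberg(pvalues)
    return [g for g, l, q in zip(genes, log2fc, fdr)
            if abs(l) >= lfc_cut and q < fdr_cut]


# Toy input with four hypothetical genes.
genes = ["g1", "g2", "g3", "g4"]
log2fc = [2.0, -1.5, 0.3, 1.2]
pvalues = [0.001, 0.01, 0.02, 0.5]
degs = select_degs(genes, log2fc, pvalues)
```

Here g3 fails the fold-change threshold and g4 fails the FDR threshold, so only g1 and g2 are retained.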

Immunotherapy, radiotherapy, chemotherapy, and EGFR-targeted therapy responses

For the immunotherapy response, we first assessed the restricted mean survival (RMS) and long-term survival (LTS) of patients receiving immunotherapy in the IMvigor210 cohort. Similar evaluations were completed in the GSE78220, GSE135222, and GSE91061 cohorts. We also combined the Tracking Tumor Immunophenotype (TIP) method and the Tumor Immune Dysfunction and Exclusion (TIDE) method to evaluate immunotherapy response44,45. Alongside immunotherapy, radiotherapy, chemotherapy, and targeted therapy are standard modalities in current clinical practice. The ssGSEA algorithm was used to calculate two radiotherapy signatures (hypoxia and cell cycle) and EGFR-ligand signatures to predict response to radiotherapy and EGFR-targeted therapy. Finally, the package “oncoPredict” was used to predict the sensitivity of patients in different CMPIS groups to conventional chemotherapeutic drugs (Cisplatin, Docetaxel, 5-Fluorouracil, and Gemcitabine) and EGFR-targeted therapeutic agents (Osimertinib, Lapatinib, and Sapitinib)46.
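
Restricted mean survival is the area under the Kaplan–Meier curve up to a chosen horizon. The Python sketch below shows that calculation on toy data; it is an illustration under assumed inputs, not the authors' code, and the function names and follow-up values are hypothetical.

```python
def kaplan_meier(times, events):
    """Return (time, survival) steps at each distinct event time."""
    steps = []
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n_at_risk = sum(1 for x in times if x >= t)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        surv *= 1.0 - d / n_at_risk
        steps.append((t, surv))
    return steps


def restricted_mean_survival(times, events, tau):
    """Area under the Kaplan-Meier step curve from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to this event time
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # final rectangle up to the horizon
    return area


# Toy follow-up in months; event = 1 means death, 0 means censored.
times = [6, 12, 18, 24]
events = [1, 1, 0, 1]
rms_12 = restricted_mean_survival(times, events, tau=12)
rms_24 = restricted_mean_survival(times, events, tau=24)
```

Comparing this quantity between groups at fixed horizons such as 12 and 24 months does not require proportional hazards, which is useful when immunotherapy benefit appears late.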

Statistical analysis

For comparisons of two groups, normally distributed variables were tested by the unpaired Student’s t test, and non-normally distributed variables were tested by the Wilcoxon rank-sum test. A two-sided Fisher’s exact test was performed for contingency tables. The log-rank test was used to compare differences in Kaplan–Meier survival analysis between groups. All statistical analyses were performed and all figures were generated in R v.4.3.3.

Results

Identification of cancer subtypes by immune multi-omics consensus clustering analysis

The research workflow of this study is illustrated in Fig. 1. PCA performed before and after batch-effect correction was used to verify the validity and stability of the dataset integration (Figures S1A and S1B). By applying ten multi-omics clustering algorithms, we independently identified two distinct subtypes, with the number of subtypes guided by the Clustering Prediction Index (CPI), Gap statistics, and silhouette score (Fig. 2A and B). Subsequently, the clustering results were combined using a consensus ensemble approach, revealing distinct molecular expression patterns across transcriptomes (including ICGs, lncRNA, and miRNA), epigenetic methylation, and somatic mutations (Fig. 2C–E). Notably, CS1 demonstrated the most favorable OS (p < 0.001; Fig. 2F).

Fig. 1

The research workflow.

Fig. 2

Identification of cancer subtypes by immune multi-omics consensus clustering analysis. (A) The CPI and gap statistical analysis of the immune multi-omics clusters. (B) The sample similarity of each subgroup was assessed by calculating the silhouette score. (C) Visualization of immune multi-omics consensus clustering analysis. (D) Consensus heatmap for CSs. (E) Consensus clustering matrix based on the 10 multi-omics algorithms. (F) Kaplan–Meier analysis of CSs in the TCGA cohort.

Validation, molecular characterization, and immune landscape of cancer subtypes

Based on the marker genes for each CS, we identified 1500 genes that were specifically up-regulated in each CS as classifiers and immune-regulated genes (IRGs) (Supplementary Table S7). These genes were subsequently validated in the META cohort to reinforce the stability of the subtypes (Fig. 3A). In line with these findings, CS1 in the META cohort exhibited the most favorable prognosis (p < 0.001; Fig. 3B). Furthermore, the concordance of CSs with the NTP and PAM algorithms was evaluated (p < 0.001; Fig. 3C–E). The robustness of the classification was supported by PCA, tSNE, and UMAP, which revealed notable distinctions among the subtypes (Fig. 3F–H).

Fig. 3

Validation of cancer subtypes. (A) Molecular subtype validation in the META cohort based on IRGs by the NTP algorithm. (B) Kaplan–Meier analysis of CSs in the META cohort. (C) The consistency between CSs and NTP in the TCGA cohort. (D) The consistency between CSs and PAM in the TCGA cohort. (E) The consistency between NTP and PAM in the META cohort. (F–H) The PCA, tSNE, and UMAP methods unveiled notable distinctions among the CSs.

To investigate the association between CSs and clinical features, we generated a heat map, which showed that CS1 was associated with higher mortality, older age, and a higher clinicopathologic stage (Fig. 4A). To explore the differences in signaling pathways between CSs, we used the GSVA algorithm to analyze the KEGG pathways that differed in each CS (Supplementary Table S8). CS1 was notably enriched in pathways related to the extracellular matrix, coagulation system, and TGF-beta signaling pathway. CS2 was significantly enriched in pathways related to the cell cycle, immune-related signaling, and metabolism-related signaling. To investigate transcriptome differences, we analyzed underlying regulators related to TFs and chromatin remodeling in HNSCC (Fig. 4B). RXRA, EGFR, HIF1A, PGR, and RARA regulators were significantly activated in CS1, whereas ESR2, FOXA1, PPARG, RARB, RXRB, FGFR3, ERBB2, and ERBB3 were specifically enriched in CS2 (Fig. 4C). The diverse regulatory activities linked to oncogenic chromatin remodeling point to distinct patterns of differential regulation among the CSs. This observation suggests that epigenetically driven transcriptional networks could serve as crucial distinguishing factors for these molecular subtypes.

Fig. 4

Molecular characterization of cancer subtypes. (A) A heat map showed that CS1 was associated with higher mortality, older age, and a higher clinicopathologic stage. (B) CS1 was notably enriched in pathways related to the extracellular matrix, coagulation system, and TGF-beta signaling pathway. CS2 was significantly enriched in pathways related to the cell cycle, immune-related signaling, and metabolism-related signaling. (C) RXRA, EGFR, HIF1A, PGR, and RARA regulators were significantly activated in CS1, whereas ESR2, FOXA1, PPARG, RARB, RXRB, FGFR3, ERBB2, and ERBB3 were specifically enriched in CS2.

Using the R package “IOBR”, we found that T cells and B cells were significantly enriched in CS2, suggesting an immune-activated state, while CAFs were mainly enriched in CS1, suggesting an immune-suppressed state (Fig. 5A–E). Immune suppression signatures and immune exclusion signatures were significantly enriched in CS1, while immune function signatures, immunotherapy signatures, and ICGs were significantly enriched in CS2 (Fig. 6A–E). These results suggested that CS1 was more likely to be a “cold tumor” or “immunosuppressive tumor”, while CS2 was more probably classified as a “hot tumor” or "immune-activated tumor".

Fig. 5

Immune cell infiltration of cancer subtypes. (A-E) T cells and B cells were significantly enriched in CS2, suggesting an immune-activated state, while CAFs were mainly enriched in CS1, suggesting an immune-suppressed state.

Fig. 6

Immunotherapy of cancer subtypes. (A-E) Immune suppression signatures and immune exclusion signatures were significantly enriched in CS1, while immune function signatures, immunotherapy signatures, and ICGs were significantly enriched in CS2.

Development of a consensus machine learning-driven prediction immunotherapy signature

First, univariate Cox analysis was used to screen the IRGs in the TCGA and META sets for PIRGs whose expression was markedly associated with OS. Subsequently, models were constructed for each of the 303 algorithm combinations, and the average C-index of each model across all cohorts was computed to evaluate predictive power (Fig. 7A and Supplementary Table S9). The highest average C-index (0.666) was achieved by the StepCox [forward] + Ridge algorithm, which was therefore used to construct the CMPIS from 16 hub PIRGs (CD6, CD5, EPHX3, MAP4K1, CALML5, KLRB1, PDGFA, WNT7A, CHGB, CAMK2N1, DKK1, PTX3, STC2, HOXA1, EFNB2, and GSTO1). Figure 7B shows the coefficients of individual PIRGs in the CMPIS, and Fig. 7C shows the results of individual PIRGs in the univariate Cox analysis in the TCGA and META sets. Finally, we computed CMPIS scores for each sample in all cohorts and found that the high CMPIS patients had a worse clinical prognosis in the TCGA, GSE27020, GSE41613, GSE42743, GSE65858, and META cohorts (Fig. 7D).

Fig. 7

Construction of a consensus machine learning-driven prediction immunotherapy signature. (A) Models were constructed based on each of the 303 algorithms, and the average C-index value was calculated for each model across all cohorts. The highest average C-index (0.666) was achieved by the StepCox [forward] + Ridge algorithm. (B) The coefficients of individual PIRGs in the CMPIS. (C) The PIRGs in the univariate Cox analysis in the TCGA and META cohorts. (D) Survival analysis of different CMPIS groups in the TCGA, GSE27020, GSE41613, GSE42743, GSE65858, and META cohorts.

Comparison of CMPIS with clinical characteristics and published signatures

Traditional clinical features have historically served as predictors of patient prognosis. In our study, the predictive efficacy of these conventional clinical features was evaluated and compared with the CMPIS. The C-index confirmed the superior stability and accuracy of the CMPIS compared to clinical features (Fig. 8A). Considering the clinical prospect of CMPIS, we validated CMPIS as an independent prognostic factor by univariate and multivariate Cox analysis (Fig. 8B and C). Moreover, the Area Under ROC Curve (AUC) value for the CMPIS at 1, 3, 5, and 10 years was greater than 0.65, surpassing that of the clinical features (Fig. 8D). To compare the CMPIS thoroughly with other relevant signatures, we systematically reviewed literature from 2018 to the present (Supplementary Table S10). This search led to the inclusion of a total of 94 published signatures in our comparative evaluation. These signatures were associated with different biological functions and processes, including immune, immune cell, necroptosis, hypoxia, autophagy, pyroptosis, metastasis, metabolism, ferroptosis, stemness, cell component, cuproptosis, methylation, and alternative splicing. The results indicated that the CMPIS showed better C-index performance than almost all 94 models in the TCGA, GSE27020, GSE41613, GSE42743, GSE65858, and META cohorts (Fig. 9).

Fig. 8

Comparison of the CMPIS with clinical features. (A) The C-index of CMPIS was higher than the clinical features. (B and C) Univariate and multivariate Cox analyses for the CMPIS and clinical features. (D) The AUC values for CMPIS at 1, 3, 5, and 10 years were greater than 0.65, exceeding the clinical features.

Fig. 9

Comparison of CMPIS with published signatures. The CMPIS showed better C-index performance than almost all 94 models in the TCGA, GSE27020, GSE41613, GSE42743, GSE65858, and META cohorts.

Construction of a nomogram and implementation of enrichment analysis

To enhance the clinical utility of the CMPIS, we devised a nomogram that combines the CMPIS with pertinent clinical characteristics (Fig. 10A). This integration not only facilitates more precise prognostication but also supports informed decision-making in patient management. Furthermore, the calibration curve analysis illustrated that the nomogram’s predictive accuracy aligns closely with real-world outcomes (Fig. 10B). This validation underscored the reliability and practicality of the nomogram as a predictive tool in the clinical setting. In total, we obtained 759 DEGs by differential expression analysis, with 436 marker genes in the low CMPIS group and 323 marker genes in the high CMPIS group (Supplementary Table S11). The marker genes in the low CMPIS group were significantly associated with immune cell activation, immune cell differentiation, immune-related signaling pathways, and immune checkpoint pathways (Fig. 10C and D, Supplementary Tables S12 and S13), whereas the marker genes in the high CMPIS group were significantly associated with a variety of tumor-related pathways, cancer invasion and metastasis, and cytokine pathways (Fig. 10E and F, Supplementary Tables S14 and S15). Significant differences in biological functions and molecular pathways were found between different CMPIS groups, which might contribute to their different prognoses and treatment responses.

Fig. 10

Construction of a nomogram and implementation of enrichment analysis. (A) A nomogram combining the CMPIS with pertinent clinical characteristics to enhance the clinical utility of the CMPIS. (B) The calibration curve analysis illustrated that the nomogram’s predictive accuracy aligns closely with real-world outcomes. (C and D) The marker genes in the low CMPIS group were significantly associated with immune cell activation, immune cell differentiation, immune-related signaling pathways, and immune checkpoint pathways. (E and F) The marker genes in the high CMPIS group were significantly associated with a variety of tumor-related pathways, cancer invasion and metastasis, and cytokine pathways.

Different CMPIS patients are suited for different treatment approaches

To comprehensively evaluate the impact of the CMPIS in the context of HNSCC immunotherapy, a systematic analysis was undertaken. The study accounted for the delayed clinical effects often associated with immunotherapy by assessing the RMS at 12 and 24 months, along with examining LTS differences 6 months post-treatment in the IMvigor210 cohort (Fig. 11A and B). Notably, the group with lower CMPIS scores exhibited superior prognostic outcomes, suggesting a more pronounced benefit from immunotherapy. Furthermore, the distribution of CMPIS scores across patients with varying response levels revealed a significantly lower CMPIS score in the responder group compared to the nonresponder group (Fig. 11C). Subsequently, our results were validated across multiple immunotherapy cohorts with prognostic data. Consistently, lower CMPIS scores correlated with improved prognostic outcomes post-immunotherapy in the GSE78220 cohort (Fig. 11D) and GSE135222 cohort (Fig. 11E), as well as better immunotherapy responses in the GSE91061 cohort (Fig. 11F). Moreover, we explored the potential immunological mechanisms by calculating the TIP score, which revealed significant differences in priming and activation, trafficking of immune cells to tumors, and infiltration of immune cells into tumors among patients with low CMPIS scores (Fig. 11G). Additionally, the TIDE algorithm was employed to assess patient responsiveness to immunotherapy, demonstrating higher responsiveness in the low CMPIS group (Fig. 11H). Notably, the low CMPIS group exhibited higher microsatellite instability (MSI) scores and Dysfunction scores, while the high CMPIS group displayed elevated Exclusion scores (Fig. 11I–K).

Fig. 11

Evaluation of immunotherapy response. (A) The RMS time difference by 12 and 24 months after treatment. (B) The LTS difference after 6 months of treatment. (C) A significantly lower CMPIS score in the responder group compared to the nonresponder group. (D-F) Lower CMPIS scores correlated with improved prognostic outcomes post-immunotherapy in the GSE78220 cohort and GSE135222 cohort, as well as better immunotherapy responses in the GSE91061 cohort. (G) Significant differences in priming and activation, trafficking of immune cells to tumors, and infiltration of immune cells into tumors among patients with low CMPIS scores. (H) Higher responsiveness in the low CMPIS group. (I-K) The low CMPIS group exhibited higher MSI scores and Dysfunction scores, while the high CMPIS group displayed elevated Exclusion scores.

Alongside immunotherapy, radiotherapy, chemotherapy, and targeted therapy are standard modalities in current clinical practice. Two radiotherapy-related biomarkers (hypoxia and cell cycle) were significantly enriched in the high CMPIS patients, suggesting that these patients were more suitable for radiation therapy (Fig. 12A). The low CMPIS patients were predicted to be more sensitive to three commonly used chemotherapeutic drugs (Cisplatin, 5-Fluorouracil, and Gemcitabine), suggesting that they were better suited for chemotherapy (Fig. 12B). EGFR ligand scores were higher in the high CMPIS patients, and these patients were predicted to be more sensitive to the three EGFR-targeted therapeutic agents, suggesting that the high CMPIS patients were more suitable for EGFR-targeted therapy (Fig. 12C).

Fig. 12

Evaluation of radiotherapy, chemotherapy, and EGFR-targeted therapy responses. (A) Two radiotherapy-related biomarkers (hypoxia and cell cycle) were significantly enriched in the high CMPIS patients. (B) The low CMPIS patients were predicted to be more sensitive to three commonly used chemotherapeutic drugs (Cisplatin, 5-Fluorouracil, and Gemcitabine). (C) EGFR ligand scores were higher in the high CMPIS patients, and these patients were predicted to be more sensitive to the three EGFR-targeted therapeutic agents.

Our study indicated that the low CMPIS patients exhibited greater suitability for immunotherapy and chemotherapy. Conversely, those in the high CMPIS group demonstrated a higher suitability for radiotherapy and EGFR-targeted therapy. These findings suggested that stratifying patients based on CMPIS can enhance the precision of treatment selection, potentially leading to improved clinical outcomes.

Discussion

The treatment of HNSCC encompasses a variety of disciplines, including surgery, radiation therapy, and cytotoxic chemotherapy47,48. However, the recent emergence of immunotherapy has significantly changed both treatment procedures and outcomes. Known for its enduring efficacy, minimal side effects, and broad applicability, immunotherapy has shown outstanding performance in the treatment of HNSCC. Research has demonstrated that immunotherapy significantly improves the prognosis of recurrent metastatic HNSCC49. New immunotherapy-based treatments are also being applied to locally advanced cases. Furthermore, numerous ongoing trials are investigating the use of new immunotherapeutic agents, including durvalumab, atezolizumab, avelumab, tremelimumab, monalizumab, and others50. These findings underscore the efficacy of immunotherapy in the treatment of HNSCC. Looking ahead, it is highly probable that this therapeutic approach will become the predominant method of treatment.

Immunotherapy has been associated with more durable outcomes and fewer, less severe, or at least more manageable adverse effects compared to traditional cancer treatments. This approach leverages the body’s inherent immune response to identify and attack cancerous cells, thus enhancing the specificity and efficacy of treatment while decreasing collateral damage to healthy tissues50. Additionally, the combination of immunotherapy with other treatment approaches, including vaccines and chemotherapy, remains a subject of particular interest50. Cancer vaccines have the potential to boost the immune system’s capacity for identifying and combating cancerous cells, while chemotherapy can reduce tumor burden and alter the tumor milieu in ways that make it more susceptible to immune attack. The integration of these modalities aims to create a synergistic effect, thereby enhancing the overall therapeutic outcome51.

In the realm of HNSCC treatment, numerous clinical trials are exploring the integration7 of immunotherapy with other therapeutic modalities and pharmacotherapy to enhance treatment efficacy and safety52. For example, studies have demonstrated that integrating radiation therapy with immunotherapy offers a promising method for patients with locally advanced, recurrent, or metastatic HNSCC49. These studies offer mechanistic insights into the synergies and therapeutic benefits of such combinations. Several phase II/III clinical trials suggest that augmenting radiation therapy with immunotherapy is a safe option53. Researchers have also examined various immunotherapy regimens and alternatives for patients who are not eligible for immunotherapy. Discussions have focused on using alternative systemic treatments (such as EGFR inhibitors or different forms of immunotherapy) to reduce treatment intensity, including the possibility of excluding simultaneous chemotherapy for patients with a low risk51. Overall, a biological framework has been established to investigate the potential interactions between radiation therapy and systemic treatments, particularly emphasizing immunotherapy. A retrospective analysis of G/GEJ adenocarcinoma patients receiving a combination of PD-1 inhibitors and chemotherapy between October 2017 and May 2022 revealed significant findings. Specifically, patients with a PD-L1 CPS of ≥ 5 significantly benefited from this treatment, showing enhanced response rates and prolonged progression-free survival54. Therefore, we foresee that the coming decade will present numerous opportunities for the development of enhanced prognostic methods for personalized and concurrent treatments. This progress would facilitate the personalized choice of the best order and mix of cytotoxic chemotherapy, specific drugs, and immune therapy7, customized to enhance advantages and reduce adverse effects51.

In this study, we identified two distinct molecular subtypes using ten integrated multi-omics clustering algorithms. Analysis of the multi-omics data with these ten methods showed significant enrichment of immune suppression and immune rejection features in the CS1 subtype, whereas the CS2 subtype showed significant enrichment of immune function, immunotherapy features, and ICGs. This suggests that the CS1 subtype is more likely to be a "cold tumor" or "immune-suppressive tumor", while the CS2 subtype is more likely to be a "hot tumor" or "immune-activated tumor". Subsequently, by analyzing the markers of these subtypes, we identified their IRGs, which we refer to as key features. Based on these IRGs, we evaluated 303 machine learning algorithm combinations and used the StepCox [forward] + Ridge algorithm to construct a prognostic model named CMPIS, consisting of 16 key genes (CD6, CD5, EPHX3, MAP4K1, CALML5, KLRB1, PDGFA, WNT7A, CHGB, CAMK2N1, DKK1, PTX3, STC2, HOXA1, EFNB2, and GSTO1). Next, the CMPIS model was validated and compared using clinical characteristics, external datasets, and previously published models. Significant differences in biological and molecular pathways were observed among the CMPIS groups, which might explain their distinct prognostic and treatment response patterns. A systematic analysis of the impact of CMPIS on immunotherapy in HNSCC revealed that patients with low CMPIS had a better prognosis, indicating a greater benefit from immunotherapy. Taken together, these analyses indicate that patients with lower CMPIS scores are more suitable for immunotherapy and chemotherapy, while those with higher CMPIS scores are better suited for radiation therapy and EGFR-targeted treatment. These findings underscore the importance of patient stratification based on CMPIS in enhancing the accuracy of treatment selection and potentially improving clinical treatment outcomes.
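As a purely illustrative sketch of how a Cox-type signature such as CMPIS is typically applied, the score can be viewed as a weighted sum of gene expression values, with patients then split into low and high groups. Everything numeric below is an assumption made for illustration: the coefficients are hypothetical placeholders with arbitrary signs (the fitted weights are not given in this section), only three of the 16 genes are used, the patient values are invented toy inputs, and the median cutoff is an assumed convention rather than the rule used in this study.

```python
from statistics import median

# HYPOTHETICAL coefficients for three of the 16 CMPIS genes.
# These are placeholders, NOT the fitted model weights.
HYPOTHETICAL_COEFS = {"CD6": -0.12, "DKK1": 0.20, "STC2": 0.15}


def cmpis_score(expression, coefs):
    """Linear risk score: sum of coefficient * expression over the signature genes."""
    return sum(beta * expression[gene] for gene, beta in coefs.items())


def stratify(scores):
    """Assign 'high' to scores strictly above the cohort median, 'low' otherwise.

    The median split is an assumed cutoff chosen for this sketch.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Toy expression values (invented) for three hypothetical patients.
patients = {
    "P1": {"CD6": 2.0, "DKK1": 0.5, "STC2": 0.4},
    "P2": {"CD6": 0.3, "DKK1": 2.1, "STC2": 1.8},
    "P3": {"CD6": 1.0, "DKK1": 1.0, "STC2": 1.0},
}

scores = {pid: cmpis_score(expr, HYPOTHETICAL_COEFS) for pid, expr in patients.items()}
groups = stratify(scores)

for pid in patients:
    print(pid, round(scores[pid], 3), groups[pid])
```

In practice the weights would come from the fitted StepCox [forward] + Ridge model and the cutoff from the cohort under study; the sketch only shows the shape of the scoring and grouping step.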

Given the current challenges in the field of immunotherapy, including the lack of effective immune markers, poor clinical efficacy, and the inability to accurately determine patient sensitivity to treatment, there is an urgent need to develop models based on large-scale multi-omics data and advanced algorithms to identify biomarkers that predict response to immunotherapy. Numerous researchers have been dedicated to this goal, focusing on predicting and identifying key biomarkers. For instance, S. Trebeschi et al. used artificial intelligence to characterize lesions in preprocessed contrast-enhanced CT data, developing and validating a non-invasive machine learning biomarker that can distinguish between responders and non-responders to immunotherapy. Their study concluded that the radiological characteristics of lesions could serve as non-invasive biomarkers for assessing response to immunotherapy and might improve patient stratification in neoadjuvant and palliative treatments55. Additionally, other researchers have emphasized the importance of predictive biomarkers in guiding the choice between monotherapy and combination therapy and in developing alternative treatment strategies56. Michaël Duruisseaux et al. discussed currently available and promising biomarkers, highlighting the need for biomarkers that predict the efficacy of immune checkpoint inhibitors57. Furthermore, studies have recognized CCDC71L as an independent prognostic marker for HNSCC, with high expression related to ICGs, oncogene mutations, and markers of genomic heterogeneity, indicating its significant role in the progression of HNSCC58. These findings are consistent with our conclusions and underscore the critical role of immunotherapy biomarkers in the treatment of HNSCC. Comparison with this existing literature supports the reliability and relevance of our results.

While our findings hold promise, we acknowledge several limitations in our study. Firstly, the difficulty in obtaining fresh clinical samples impeded our ability to conduct functional validations with fresh tumor samples. Secondly, as our research was retrospective in nature, further validation through multicenter prospective cohort studies is essential to confirm our findings. Additionally, our dataset was relatively small and predominantly comprised non-metastatic patients, potentially introducing bias in our experimental outcomes. Our proposed model may be more suitable for predicting the prognosis and guiding treatment for non-metastatic patients, which could limit its generalizability. Furthermore, a notable limitation was the lack of basic experiments, including cell and animal studies, to explore the functional expression of genes.

Conclusion

In summary, we identified two CSs (CS1 and CS2) of HNSCC using multi-omics data and constructed the CMPIS model from IRGs and 303 machine learning algorithms to predict patient prognosis and treatment response. Patients with low CMPIS were more sensitive to immunotherapy and chemotherapy, while patients with high CMPIS were more sensitive to radiotherapy and EGFR-targeted therapies. These results underscore the importance of immunotherapy biomarkers in providing more targeted, precise, and personalized immunotherapy plans for HNSCC patients, and may contribute to the optimization of clinical treatment outcomes.