Abstract
Head and neck squamous cell carcinoma (HNSCC) is a highly aggressive malignancy with complex molecular underpinnings. Hodgkin lymphoma (HL), another distinct cancer type, shares several biological characteristics with HNSCC, particularly regarding immune system involvement. However, the molecular crosstalk between HNSCC and HL remains largely unexplored. This study aims to elucidate shared molecular mechanisms, identify potential diagnostic biomarkers, and uncover therapeutic targets through an integrative approach combining bioinformatics and machine learning techniques. Publicly available RNA sequencing datasets were utilized to identify differentially expressed genes (DEGs) in HNSCC, while weighted gene co-expression network analysis (WGCNA) was applied to uncover HL-associated gene modules. The intersection of HNSCC DEGs and HL-related modules was evaluated using protein–protein interaction (PPI) network analysis. Candidate hub genes were selected via machine learning algorithms, including LASSO regression, random forest, and support vector machine-recursive feature elimination (SVM-RFE). Prognostic and diagnostic values were assessed using survival analysis and ROC curves. Furthermore, scRNA-seq data were analyzed to assess gene expression in the tumor microenvironment, and drug sensitivity was evaluated to identify potential therapeutic agents. A total of 150 shared genes were identified at the intersection of HNSCC DEGs and HL-associated gene modules. PPI network analysis highlighted 16 candidate hub genes, among which IL6, CXCL13, and PLAU were prioritized through machine learning methods. Survival analysis revealed that high expression of CXCL13 and PLAU, and low expression of IL6, were significantly associated with poor prognosis in HNSCC patients. ROC curve analysis validated their diagnostic performance. 
Single-cell RNA-seq data confirmed the expression of these biomarkers in macrophages, epithelial cells, and fibroblasts within the tumor microenvironment. Drug sensitivity analysis identified Andrographolide, Rituximab, and Amiloride as potential therapeutic agents. This study identified IL6, CXCL13, and PLAU as critical biomarkers involved in immune regulation and tumor progression in both HNSCC and HL. These findings provide valuable insights into the shared molecular mechanisms and suggest novel therapeutic strategies for patients affected by these diseases.
Introduction
Head and neck squamous cell carcinoma (HNSCC) is a highly aggressive malignant tumor that originates in the mucosa of the hypopharynx, larynx, oral cavity and oropharynx and accounts for 90% of all head and neck cancers1. The relative 5-year survival rate of metastatic HNSCC is less than 38%, and for recurrent or metastatic stage IV disease it is only 4%2. Etiological research indicates that HNSCC is mostly caused by lifestyle factors, such as cigarette smoking, or by viruses, such as human papillomavirus (HPV) and Epstein–Barr virus (EBV)3,4, which are also risk factors for HL. HL is a unique hematopoietic neoplasm characterized by cancerous Reed–Sternberg cells in an inflammatory background. It is divided into nodular lymphocyte predominant Hodgkin lymphoma (NLPHL) and classic Hodgkin lymphoma (cHL). The lesion is composed of tumor cells (Hodgkin Reed–Sternberg cells, HRS cells) and non-neoplastic reactive cells5. Its etiology has not been fully elucidated and is likely multifactorial, involving down-regulation of the transcription factor network and loss of the B-cell phenotype, abnormal activation of NF-κB, EBV infection and the tumor microenvironment6.
Prior research suggests a potential relationship between HNSCC and HL7,8,9. First, in terms of risk factors, smoking and alcohol consumption are common to both HNSCC and HL. In addition, EBV is strongly associated with HL and may also increase the risk of HNSCC through abnormal cell proliferation and carcinogenesis, especially in nasopharyngeal carcinoma. EBV-encoded nuclear antigens (e.g. EBNA1, EBNA2, EBNA3 family) and membrane proteins (e.g. LMP1, LMP2A) play important roles in cell transformation and tumorigenesis. For example, LMP1 promotes cell proliferation and survival by activating the NF-κB signaling pathway, while EBNA2 promotes cell transformation by activating cellular and viral genes. A previous study also indicated that EBV induces NF-κB-mediated survival signals in T and NK cells and may therefore contribute to the lymphomagenesis of these cells10. In HNSCC, the EBV genome has been detected most frequently in tumor-involved lymph nodes (65%) and adjacent tissue (60%), followed by undifferentiated tumors (40%), well-differentiated tumors (35%) and the pharynx (35%)11. These results suggest that EBV may also promote lymph node metastasis. Similarly, raised antibody levels to EBV antigens have been detected in HL patients12. For example, in 1985 the EBNA1 protein (Epstein–Barr nuclear antigen 1) was detected in HRS cells13. Later, EBV DNA was detected in 20–25% of HL biopsies by Southern blotting, and in situ hybridization for EBV DNA confirmed the presence of the EBV genome in HRS cells14,15,16. Moreover, the detection of a single terminal repeat (TR) length in cHL showed that EBV persists in HRS cells throughout the course of disease and across multiple sites of disease17. In HRS cells, multiple cell signaling pathways are aberrantly activated, including NF-κB and JAK/STAT18,19,20,21, which are also aberrantly activated in HNSCC.
As a result, both HNSCC and HL are characterized by an inflammatory response22,23 and a distinctive immune microenvironment24,25, particularly involving T and B cells. EBV may therefore provide a link between HNSCC and HL.
Second, in terms of treatment, patients with HL who receive radiation or chemotherapy may have an increased risk of secondary HNSCC, and the survival rate of patients with second primary HNSCC is worse than that of patients with primary HNSCC, especially if the radiotherapy site is in the head and neck26. The risk of other cancers, such as lung cancer, breast cancer and gastrointestinal cancer, is also increased after treatment for HL27. In one reported cohort of patients treated for HL, 27.3% received radiotherapy alone, 12.1% chemotherapy alone, and 60.5% a combination of radiotherapy and chemotherapy. During follow-up, 908 patients were diagnosed with one second tumor, 130 patients with three types of cancer and 17 patients with four types of cancer. Overall, patients treated for HL have a higher risk of being diagnosed with a second tumor than the general population.
Third, in terms of immunosuppression, patients with HL are immunosuppressed due to the disease or its treatment, which weakens immune surveillance, makes abnormal cells more likely to survive and proliferate, and increases the risk of cancer28. Immunosuppression can also increase the risk of viral infection, for example with EBV and HPV, which in turn can promote the development of HNSCC. In addition, the immunosuppressed state may lead to chronic inflammation and promote carcinogenesis. Overall, weakened immune surveillance, a higher risk of viral infection, treatment-related injury, and chronic inflammation together increase the probability of HNSCC.
Finally, previous studies have also shown that while well-differentiated squamous cell carcinoma is observed in laryngeal histopathology, mixed cell subtypes of cHL can be observed in anatomical specimens of the neck, revealing a potential relationship between HNSCC and HL8.
Given the shared risk factors—particularly viral infections and inflammatory processes—there is growing interest in elucidating the molecular links between HNSCC and HL. Despite this, the precise signaling pathways and cellular interactions connecting these two malignancies remain poorly understood. This study seeks to bridge this knowledge gap by integrating bioinformatics and machine learning to investigate the molecular crosstalk between HNSCC and HL. Through the identification of shared biomarkers and immune-related pathways, our goal is to provide novel insights into the underlying tumor biology and suggest new therapeutic strategies.
Materials and methods
Data collection and data processing
Transcriptomic RNA and survival data for HNSCC were sourced from TCGA29, encompassing 546 samples (502 tumor and 44 normal samples). Similarly, the publicly accessible transcriptomic dataset GSE12453 (platform GPL570), consisting of 17 tumor and 25 normal samples for HL in primary human lymphoma cells, was retrieved from the GEO database30,31,32. To corroborate the findings, additional GEO datasets were integrated into the analysis: GSE41613 (GPL570), GSE65858 (GPL10558), GSE30784 (GPL570), GSE37991 (GPL6883), GSE9844 (GPL570), and GSE83519 (GPL4133)33,34,35,36,37,38,39,40. These datasets were used for comparative analysis and cross-validation of the study’s conclusions. The following databases were also employed to augment the bioinformatics analysis: STRING V12.0 (https://cn.string-db.org)41, GEPIA2 (http://gepia2.cancer-pku.cn)42, Human Protein Atlas (https://www.proteinatlas.org)43, NetworkAnalyst V3.0 (https://www.networkanalyst.ca)44, UALCAN (http://ualcan.path.uab.edu)45, CancerSEA (http://biocc.hrbmu.edu.cn/CancerSEA)46, GSCA (https://guolab.wchscu.cn/GSCA)47, DrugBank V6.0 (https://go.drugbank.com/drugs)48, DGIdb V5.0 (https://www.dgidb.org)49, CIBERSORT (https://cibersortx.stanford.edu)50, and TIMER V1.0 (http://timer2.compbio.cn/timer1/)51.
Identification of DEGs in HNSCC and key-module related genes in HL
The study design and data processing workflow are depicted in Fig. 1. Initially, the HNSCC dataset from TCGA was pre-processed and normalized using RStudio (v2023.03.0+386) and R (version 4.3.1). A log2 transformation was applied so that the data were scaled appropriately. Where multiple probes corresponded to the same gene symbol, the average expression value was used. Differential expression analysis between HNSCC and normal tissue samples was conducted using the “DESeq2” v1.42.1 R package, with significant DEGs defined by the thresholds |log2FC| > 1 and adjusted p-value (p-adj) < 0.05 (ref. 52).
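The probe-averaging and thresholding steps described above can be sketched in a few lines. The study itself used R and DESeq2; the Python below is only an illustrative stand-in using the standard library, and the function names (`collapse_probes`, `select_degs`) and all toy values are hypothetical, not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_rows):
    """Average the expression vectors of probes that share a gene symbol."""
    grouped = defaultdict(list)
    for symbol, values in probe_rows:
        grouped[symbol].append(values)
    # zip(*rows) walks the samples column by column across all probes of a gene.
    return {
        symbol: [mean(column) for column in zip(*rows)]
        for symbol, rows in grouped.items()
    }


def select_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return (up, down) gene sets with |log2FC| > lfc_cutoff and padj < padj_cutoff."""
    up, down = set(), set()
    for gene, (log2fc, padj) in results.items():
        # Genes without an adjusted p-value (None here) are skipped.
        if padj is None or padj >= padj_cutoff:
            continue
        if log2fc > lfc_cutoff:
            up.add(gene)
        elif log2fc < -lfc_cutoff:
            down.add(gene)
    return up, down


# Toy input, invented purely for illustration.
toy_probes = [
    ("GENE_A", [2.0, 4.0]),
    ("GENE_A", [4.0, 6.0]),
    ("GENE_B", [1.0, 1.0]),
]
toy_results = {
    "GENE_A": (2.3, 0.001),   # up-regulated and significant
    "GENE_B": (-1.7, 0.020),  # down-regulated and significant
    "GENE_C": (0.4, 0.0001),  # significant, but fold change too small
    "GENE_D": (3.1, 0.200),   # large fold change, but not significant
    "GENE_E": (1.5, None),    # no adjusted p-value available
}

collapsed = collapse_probes(toy_probes)
up_genes, down_genes = select_degs(toy_results)
print(collapsed["GENE_A"], sorted(up_genes), sorted(down_genes))
```

Both thresholds are strict inequalities, matching the stated criteria, so a gene sitting exactly on a cutoff is not counted as a DEG.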
Flowchart depicting the study design. DEGs were identified in HNSCC (TCGA) using DESeq2, while key-module genes were selected via WGCNA in HL (GSE12453). The intersection of DEGs and key-module genes underwent enrichment analysis (GO, DO, KEGG). Three machine learning algorithms (LASSO, RF, SVM-RFE) identified candidate genes, with final hub genes selected based on survival analysis and ROC. Validation was conducted using HPA, GTEx, and nomogram. Further analyses included pan-cancer, mutation, GSEA, immune infiltration, single-cell, TF regulation, and drug-target association (DrugBank).
In the HL dataset, the top 5000 genes ranked by median absolute deviation were selected for WGCNA. The goodSamplesGenes function was then used to remove genes and samples with excessive missing values before a scale-free co-expression network was constructed. A soft-thresholding power (β) was chosen so that the network approximated scale-free topology, and the gene co-expression network was transformed into a topological overlap matrix (TOM). The clustering tree was divided into modules using the dynamic tree cut method (minimum module size of 50 genes), with a merging threshold (cut height) of 0.25, so that modules with correlations higher than 0.75 were merged. Finally, the WGCNA-derived module genes associated with HL were intersected with the HNSCC DEGs.
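To make the gene pre-selection and soft-thresholding steps concrete, the following minimal sketch ranks genes by median absolute deviation and computes a soft-thresholded adjacency. It assumes an unsigned network, a_ij = |cor(x_i, x_j)|^β, which the text does not state explicitly. It is a pure-Python illustration rather than the WGCNA R implementation, and the gene names and expression values are invented.

```python
from math import sqrt
from statistics import median


def mad(values):
    """Median absolute deviation of one gene's expression values."""
    center = median(values)
    return median(abs(v - center) for v in values)


def top_genes_by_mad(expr, n):
    """Return the n genes with the largest MAD, most variable first."""
    return sorted(expr, key=lambda gene: mad(expr[gene]), reverse=True)[:n]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_adjacency(expr, genes, beta):
    """Unsigned adjacency |cor|**beta for every ordered pair of distinct genes."""
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Toy expression matrix (gene -> values across four samples), invented for illustration.
toy_expr = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],
    "G3": [5.0, 5.0, 5.0, 6.0],
    "G4": [1.0, 1.0, 2.0, 2.0],
}

selected = top_genes_by_mad(toy_expr, 3)
adjacency = soft_adjacency(toy_expr, selected, beta=20)
print(selected, adjacency[("G1", "G2")], adjacency[("G1", "G4")])
```

In this toy case G1 and G4 correlate at roughly 0.89, yet a power of 20 shrinks their adjacency to about 0.11, which illustrates how a high β suppresses all but the strongest co-expression links.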
Functional enrichment analysis and PPI analysis for overlapping genes
Functional enrichment analysis was conducted on the overlapping genes between the DEGs and key modules using the R packages “clusterProfiler”53 v4.10.1 and “DOSE” v3.28.2. This analysis identified significantly enriched categories from Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and Disease Ontology (DO), with a significance threshold of p < 0.05 (refs. 54,55). The top 10 most enriched terms were visualized. To explore the interactions between the shared genes, the STRING database was used to construct PPI networks. In the first selection phase, genes without connections to others were excluded. The remaining genes were then visualized using Cytoscape software (v3.10.1) for a second round of selection. Six topological algorithms (Degree, EcCentricity, MNC, EPC, Closeness and Radiality) from the CytoHubba plugin were applied, and the intersection of the top 30 genes from each algorithm was taken as the set of prospective genes.
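The consensus step, which ranks nodes by several topological scores and keeps those in every top-k list, can be illustrated with a short sketch. It computes only two scores, degree and closeness, using standard textbook definitions that may differ from the exact formulas implemented in CytoHubba. The edge list and node names are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph):
    """Number of direct interaction partners per node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def closeness(graph):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def top_k(scores, k):
    """Top-k nodes by score; ties are broken alphabetically for determinism."""
    return set(sorted(scores, key=lambda n: (-scores[n], n))[:k])


def consensus(score_tables, k):
    """Nodes that appear in the top-k list of every scoring method."""
    return set.intersection(*(top_k(scores, k) for scores in score_tables))


# Toy PPI edge list, invented for illustration.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
graph = build_graph(toy_edges)
hubs = consensus([degree(graph), closeness(graph)], k=2)
print(sorted(hubs))
```

Requiring agreement across methods is deliberately conservative: a node that scores highly on only one measure is dropped, which is why the intersection is usually much smaller than any single top-k list.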
Machine learning algorithms for recognition of hub genes
To identify potential hub genes, three distinct machine learning (ML) algorithms were applied to the prospective genes described above. First, least absolute shrinkage and selection operator (LASSO) logistic regression was conducted using the “glmnet” v4.1.8 R package, which selects genes through a penalty that limits overfitting. Second, the random forest (RF) algorithm was applied via the “randomForest” v4.7.1.1 R package, retaining features with an importance score greater than 1. Finally, support vector machine recursive feature elimination (SVM-RFE) was implemented using the “svmRFE” function together with the “e1071” v1.7.14 R package to eliminate non-essential features. The intersection of the genes retained by all three algorithms was selected as the final set of candidate genes for further analysis.
Survival analysis, ROC evaluation and nomogram construction for identification of biomarkers
Using clinical data from the HNSCC dataset, survival analysis was performed using the “survminer” v0.4.9 and “survival” v3.5.8 R packages, and the results were visualized through Kaplan–Meier (K–M) survival curves. These curves allowed for an intuitive representation of patient survival probabilities over time, stratified by key variables such as gene expression. For the HL dataset, where clinical data were more limited, a ROC curve was generated based on the gene expression profiles. The area under the ROC curve (AUC) served as a measure of the gene’s diagnostic performance, with an AUC value exceeding 0.8 indicating a high capacity to distinguish between tumor and normal tissue. Genes that met the thresholds in both survival and ROC analyses were identified as biomarkers.
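For readers less familiar with the two statistics used here, the sketch below shows how a Kaplan–Meier curve and an AUC can be computed from first principles. It is an illustrative stand-in for the “survival” and ROC tooling used in the study, not a reproduction of it. The AUC is computed through its rank interpretation, the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample, and all input values are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    events[i] is 1 if the event (death) was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Process all subjects sharing the same time point together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored subjects both leave the risk set
    return curve


def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positive_scores) * len(negative_scores))


# Toy follow-up data (months, event flag), invented for illustration.
toy_times = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)

# Toy expression values for one gene in tumor vs. normal samples.
tumor_expr = [5.1, 6.3, 4.8, 7.0]
normal_expr = [3.2, 4.9, 2.7]
auc = roc_auc(tumor_expr, normal_expr)
print(km_curve, round(auc, 3))
```

In this toy example 11 of the 12 tumor/normal pairs are ordered correctly, giving an AUC of about 0.92, which would pass the 0.8 cutoff described above.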
A co-expression network was constructed for these biomarkers using GeneMANIA, providing insights into the functional relationships and interactions between these genes. To evaluate the clinical utility of these hub genes, the “rms” v6.8.0 R package was employed to build a nomogram model. This model integrates the biomarkers to predict patient outcomes by calculating a total score that correlates with the probability of 1-, 2-, and 3-year survival. Additionally, protein expression data for three significant biomarkers in HNSCC were extracted from the Human Protein Atlas (HPA) database, providing further validation of their biological relevance. To complement this, the Genotype-Tissue Expression (GTEx) project was utilized through the GEPIA2 database, offering another context for the expression patterns of these biomarkers.
Immune cell infiltration analysis
The relative abundance of infiltrating immune cells in HNSCC patients was quantified using the CIBERSORT algorithm and visualized with the “ggplot2” v3.5.1 R package. The proportions of immune cell types across the samples were shown as bar plots. Correlation analysis between the expression levels of the biomarkers and immune cell infiltration in HNSCC was conducted using the TIMER database. To explore the relationship between biomarker expression and immune cell infiltration across different cancer types, lollipop charts were generated with the same R package.
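The correlation between a gene's expression and an immune-cell infiltration estimate can be sketched as follows. The text does not name the correlation statistic, so the use of a rank-based (Spearman) coefficient here is an assumption, and any adjustment for tumor purity is omitted. The function names and values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy per-sample values, invented for illustration.
gene_expression = [1.2, 3.4, 2.2, 5.0, 4.1]
macrophage_fraction = [0.10, 0.35, 0.25, 0.80, 0.30]
rho = spearman(gene_expression, macrophage_fraction)
print(round(rho, 3))
```

A rank-based coefficient is a common choice for this kind of comparison because infiltration estimates are bounded fractions and expression values are often skewed, so a monotonic rather than linear association is the quantity of interest.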
Single-cell RNA sequencing data analysis of cellular and molecular alterations in HNSCC lymphoid metastasis
The single-cell RNA sequencing (scRNA-seq) dataset GSE227156 was processed and analyzed using the R package “Seurat” v5.0.3 to explore differences between hypopharyngeal carcinoma and lymphoid metastases from five patients with hypopharyngeal squamous cell carcinoma (HSCC). Initially, gene expression data were normalized and scaled with the NormalizeData and ScaleData functions. The top 2000 most variable genes were selected with the FindVariableFeatures function (vst method) and used for Principal Component Analysis (PCA). Subsequently, cell distributions in two-dimensional space were visualized using the Uniform Manifold Approximation and Projection (UMAP) algorithm. To account for batch effects between different tissue samples, batch correction was performed using the IntegrateLayers function with the Canonical Correlation Analysis (CCA) algorithm. Cell-type annotation was conducted using the R package “SingleR” v2.4.1. This pipeline allowed the identification of cellular and molecular changes between normal lymphoid tissues and HSCC metastases.
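The normalization step can be made concrete with a small sketch of log-normalization as performed by Seurat's default NormalizeData method. Each count is divided by the cell's total count, multiplied by a scale factor, and transformed with log(1 + x). The text does not report the parameters used, so the default scale factor of 10,000 is an assumption, and the cell names and counts below are invented.

```python
from math import log1p


def log_normalize(counts, scale_factor=10_000):
    """Log-normalize raw counts per cell.

    counts maps cell -> {gene: raw count}. Each count is divided by the cell's
    total, multiplied by scale_factor, and passed through log(1 + x).
    """
    normalized = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        if total == 0:
            # An empty cell stays all-zero instead of dividing by zero.
            normalized[cell] = {gene: 0.0 for gene in genes}
            continue
        normalized[cell] = {
            gene: log1p(count / total * scale_factor)
            for gene, count in genes.items()
        }
    return normalized


# Toy raw counts for two cells, invented for illustration.
toy_counts = {
    "cell_1": {"IL6": 0, "PLAU": 5, "CXCL13": 15},
    "cell_2": {"IL6": 2, "PLAU": 2, "CXCL13": 0},
}
norm = log_normalize(toy_counts)
print(norm["cell_1"]["PLAU"], norm["cell_2"]["PLAU"])
```

Dividing by each cell's total removes differences in sequencing depth between cells, so the two PLAU values above become comparable even though the raw counts (5 and 2) come from cells with very different totals.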
Prediction of transcription factors and prospective drug screening
Potential transcription factors (TFs) of the biomarkers were predicted with the NetworkAnalyst database44. Among them, TFs with transcription factor binding sites (TFBS) for each hub gene were retained using the R package “JASPAR2020” v0.99.10 and visualized with Cytoscape. The correlation between TFs and hub genes was visualized with the R package “ggcorrplot” v0.1.4.1.
Based on these results, the GSCA, DrugBank and DGIdb databases were used to predict drugs related to the hub genes. The GSCA database was used to assess the correlation between gene expression and sensitivity to GDSC and CTRP drugs across cancers. The DrugBank and DGIdb databases were used to predict drugs relevant to the hub genes in HNSCC tissues56.
Results
Identification of key module-related genes and DEGs
Using WGCNA analysis, interconnected gene modules highly associated with HL were identified. No outliers were detected during the clustering process. The soft-thresholding power (β = 20) was selected based on scale independence and average connectivity, with the scale-free fit index set to 0.85 (Fig. 2a). After merging modules, six gene co-expression modules related to HL were identified, each represented by a different color (Fig. 2b, c). These colors indicate the relationship between the modules and HL, with the blue module showing the strongest positive correlation (1248 genes; correlation coefficient [CC] = 0.87; P = 4e−14) and the greenyellow module the strongest negative correlation (194 genes; CC = − 0.65; P = 3e−06). The modules were further analyzed for clinical relevance by calculating module membership and gene significance values. A notable correlation was found between the blue (ModuleTraitCor = 0.59) and greenyellow (ModuleTraitCor = 0.34) modules with HL (Fig. 2d, e). As a result, the 1248 genes within the blue module, which showed the strongest association with HL, were identified as critical candidates for further experimental investigation.
Identification of key gene modules related to HL and DEGs in HNSCC. (a) Left: Scale-free fit index; Right: Mean connectivity plot used to assess the network construction in WGCNA analysis. (b) Cluster dendrogram illustrating the co-expression gene modules identified by WGCNA. (c) Heatmap displaying the correlation between gene modules and HL, with the top-left corner showing the correlation coefficient and the bottom-right corner indicating the p-value. (d,e) Scatter plots representing the correlation between module membership and gene significance for specific modules. (f) PCA of HNSCC samples, as visualized in (h). (g) Volcano plot showing DEGs in HNSCC, with red indicating up-regulated and green indicating down-regulated genes. (h) Heatmap of the top 30 up- and down-regulated DEGs in HNSCC patients, where red grids represent up-regulated and blue grids represent down-regulated genes. (i) Venn diagram illustrating the intersection between WGCNA-identified gene modules related to HL and DEGs detected in HNSCC using DESeq2.
The identification of DEGs in patients with HNSCC began with a PCA of the samples (Fig. 2f). Differential expression analysis then identified 4055 DEGs, of which 2085 were downregulated and 1970 upregulated. A volcano plot was generated to illustrate the global distribution of DEGs (Fig. 2g). A heatmap was used to visualize the top 30 upregulated and downregulated genes (Fig. 2h). Additionally, 150 shared genes were identified at the intersection of the 1248 HL-associated module genes detected by WGCNA and the 4055 DEGs identified through DESeq2 analysis in HNSCC (Fig. 2i).
Enrichment analysis and selection of prospective genes via PPI network
To explore the underlying biological mechanisms, GO, KEGG, and DO enrichment analyses were conducted on the shared genes. GO analysis highlighted significant enrichment in biological processes such as cell killing, regulation of T cell activation, cell–cell adhesion regulation, leukocyte adhesion, and the positive regulation of cell–cell adhesion. The cellular components these genes are associated with include the collagen-containing extracellular matrix, tertiary granules, specific granules, the tertiary granule lumen, and the endoplasmic reticulum lumen. In terms of molecular functions, these genes are involved in cytokine receptor binding, chemokine receptor binding, chemokine activity, peptidase regulator activity, and cytokine activity (Supplemental Fig. 1a–c). KEGG pathway enrichment analysis further indicated that these genes play critical roles in cytokine-cytokine receptor interaction, fluid shear stress and atherosclerosis, rheumatoid arthritis, the IL-17 signaling pathway, and inflammatory bowel disease (Supplemental Fig. 1d). Disease Ontology (DO) analysis linked these genes to several diseases, including rheumatic disease, systemic scleroderma, hypersensitivity reactions, and type IV hypersensitivity (Supplemental Fig. 1e). Genes without connections were excluded, and the remaining 125 were used to construct a PPI network containing 1032 edges. Using six topological algorithms, the top 30 genes from each method were compared, and their intersection yielded 16 candidate hub genes (Fig. 3a, b).
Identification and selection of hub genes using multiple machine learning approaches and PPI network construction. (a) Intersection of shared genes through six topological algorithms. (b) Visualization of prospective genes based on the PPI network. (c) Top 10 most significant genes identified through RF model. (d) LASSO regression model for gene selection and reduction. (e) SVM-RFE analysis showing the lowest error rate. (f) Venn diagram showing the overlap of selected genes from the three ML models.
Machine learning algorithms for recognition of hub genes
In the RF analysis, 10 genes with importance scores greater than 1 were identified (Fig. 3c). LASSO regression selected 12 genes at the lowest binomial deviance among the 16 candidate hub genes (Fig. 3d). Additionally, the SVM-RFE method retained 13 genes at the lowest error rate and highest predictive accuracy (Fig. 3e). The genes identified through each approach, RF (n = 10), LASSO (n = 12), and SVM-RFE (n = 13), were then compared, resulting in eight overlapping genes, which were visualized in a Venn diagram (Fig. 3f). These genes included Interleukin-6 (IL-6), Matrix Metallopeptidase 9 (MMP9), C-X-C Motif Chemokine Ligand 13 (CXCL13), Collagen Type I Alpha 1 Chain (COL1A1), Interleukin 1 Receptor Antagonist (IL1RN), CD80 Molecule (CD80), Vascular Cell Adhesion Molecule 1 (VCAM1), and Plasminogen Activator, Urokinase (PLAU).
Assessment and validation of diagnostic value of hub genes and nomogram construction
Survival analysis was performed on the hub genes in HNSCC, dividing tumor samples into high and low expression groups based on their expression levels. Among these genes, IL-6 (log-rank p = 0.029), CXCL13 (log-rank p = 0.0032), and PLAU (log-rank p = 0.00049) showed significant associations with patient outcomes, suggesting their potential as prognostic indicators in HNSCC (Fig. 4a–c, left panels). These findings were further supported by analysis of the GSE41613 and GSE65858 datasets, where all three genes demonstrated notable differences in 5-year survival probabilities (Supplemental Fig. 2, Table 1).
Integrated analysis of biomarkers and development of a predictive model for HNSCC. This figure presents a comprehensive analysis of three key hub genes: (a) IL6, (b) CXCL13, and (c) PLAU. The left panels display K–M survival curves from the HNSCC dataset, providing insights into the relationship between gene expression levels and patient survival outcomes. In the middle panels, representative IHC images from HPA compare protein expression between normal tissues (left) and HNSCC tissues (right). The right panels highlight the differential expression of these genes by comparing HNSCC and normal tissues using data from the GTEx project. Additionally, a nomogram is constructed based on the three hub genes in panel (d), where each gene is assigned a score to predict the risk of developing HNSCC, especially in populations with HL. The model’s diagnostic performance is further validated using a ROC curve in panel (e), illustrating the model’s effectiveness in distinguishing between HNSCC cases and controls.
The diagnostic value of these hub genes was assessed through ROC curve analysis. PLAU demonstrated excellent predictive power with an AUC of 0.92, while CXCL13 achieved an AUC of 0.85, and IL-6 displayed a relatively lower diagnostic capability with an AUC of 0.61. When the three genes were combined, their diagnostic performance significantly improved, yielding an AUC of 0.96, underscoring their combined strength in accurately differentiating between HNSCC cases and controls (Fig. 4d). Consistent findings were observed in Supplemental Fig. 3, and further analysis across additional datasets, including GSE41613, GSE30784, GSE37991, GSE9844, and GSE83519, confirmed these results (Supplemental Fig. 2, Table 1).
IHC analysis revealed elevated protein expression levels of CXCL13 and PLAU in HNSCC tissues, while IL-6 expression was lower in comparison (Fig. 4a–c, middle panels). To validate these findings, gene expression levels were further analyzed using GTEx data (|FC| > 1.5, p < 0.05), which reinforced the differential expression patterns observed (Fig. 4a–c, right panels). A pan-cancer analysis of multiple cancer types in the TCGA dataset revealed that all three genes followed similar expression trends across various cancers (Supplemental Fig. 3).
A nomogram was subsequently constructed, incorporating IL-6, CXCL13, and PLAU as key predictive variables (Fig. 4e). Each gene’s expression level was assigned a corresponding score, and the total score was used to estimate the risk of HNSCC in patients, particularly in those with HL. Calibration curves for 1-year, 2-year, and 3-year disease-free survival (DFS) demonstrated that the nomogram-predicted probabilities closely aligned with the actual survival outcomes, indicating high accuracy in the model’s predictive capability (Supplemental Fig. 4).
Correlation of transcript levels of biomarkers with TNM classification and tumor stage in HNSCC
The expression levels of the biomarkers were analyzed in relation to TNM status in HNSCC. For IL6 (Fig. 5a), transcript levels were significantly higher in HNSCC samples compared to normal tissues, with a marked increase observed as the nodal metastasis status advanced. Specifically, IL6 expression was significantly elevated in N1 (p < 0.05), N2 (p < 0.01), and N3 (p < 0.01) stages, compared to normal tissues. CXCL13 (Fig. 5b) exhibited the most pronounced upregulation across all metastasis stages, with significant differences observed between normal tissues and N0, N1, N2, and N3 (p < 0.001), as well as between N0 and N3 (p < 0.01). Similarly, PLAU (Fig. 5c) showed progressively higher expression from N0 to N3, with significant differences between normal tissues and all metastasis stages (p < 0.001), as well as between N0 and N3 stages (p < 0.01). Survival analyses further underscored the prognostic value of these biomarkers. Patients with higher expression of IL6 (Fig. 5d) had significantly lower survival probabilities compared to those with lower expression (p < 0.0001). Similarly, elevated levels of CXCL13 (Fig. 5e) and PLAU (Fig. 5f) were associated with worse survival outcomes, with both showing significant p values (0.0058 and 0.0019, respectively).
Expression of biomarkers in relation to nodal metastasis and survival in HNSCC. (a) IL6 expression is significantly higher in HNSCC samples compared to normal tissues, with increasing levels as the nodal metastasis status progresses from N0 to N3. (b) CXCL13 shows pronounced upregulation across all metastasis stages, with significant differences between normal tissues and stages N0–N3, as well as between N0 and N3. (c) PLAU exhibits progressively higher expression levels from N0 to N3, with significant differences observed between normal tissues and metastasis stages. (d–f) K–M survival analysis reveals that patients with higher expression of IL6, CXCL13, and PLAU have significantly lower survival probabilities, underscoring the prognostic value of these genes in HNSCC.
The transcript levels of IL6, CXCL13, and PLAU were analyzed across different tumor stages in HNSCC patients. IL6 (Supplemental Fig. 5a) showed a significant increase in expression between normal tissues and Stage 2 (p < 0.05), with a trend toward higher expression as the tumor stage advanced, although the differences between later stages were not statistically significant. In contrast, CXCL13 (Supplemental Fig. 5b) exhibited a highly significant upregulation across all cancer stages compared to normal tissues (p < 0.001 for all stages), with progressively higher expression as the tumor stage increased, indicating its strong association with tumor progression. Similarly, PLAU (Supplemental Fig. 5c) was markedly upregulated in all cancer stages compared to normal tissues, with statistically significant differences observed between normal tissues and Stage 1 through Stage 4 (p < 0.001 for all comparisons), further highlighting its role in tumor development and progression. Survival analyses indicated that high expression levels of these genes were associated with poorer patient outcomes. IL6 (Supplemental Fig. 5d) showed a significant impact on survival, with high expression correlating with lower survival probability in Stage 4 patients (p = 0.014). PLAU (Supplemental Fig. 5e,f) had an even more pronounced effect on survival, with high expression significantly associated with reduced survival in both Stage 4 (p = 0.0039) and Stage 3 (p = 0.011) patients.
Correlation between immune cell infiltration and biomarker expression
The analysis of immune cell infiltration in HNSCC revealed distinct proportions of 22 different immune cell types. In particular, resting dendritic cells, macrophages, and activated mast cells made up significant portions of the immune microenvironment in HNSCC samples (Fig. 6a). Further investigation showed that the expression levels of IL-6, CXCL13, and PLAU were significantly correlated with the infiltration of CD8+ T cells, macrophages, neutrophils, and dendritic cells in HNSCC (Fig. 6b). Moreover, these genes were positively associated with increased macrophage and dendritic cell infiltration across various cancers (Fig. 6c–e), indicating their potential roles in modulating immune cell dynamics in both HNSCC and other malignancies. These findings suggest that IL-6, CXCL13, and PLAU may play crucial roles in shaping the tumor immune microenvironment and influencing cancer progression.
Immune cell infiltration and its correlation with hub gene expression. (a) Proportions of 22 different immune cell types in HNSCC samples, with significant contributions from resting dendritic cells, macrophages, and activated mast cells. (b) Correlation between the expression levels of IL6 (top), CXCL13 (middle), and PLAU (bottom) and the infiltration of various immune cells in HNSCC, highlighting significant associations with CD8+ T cells, macrophages, neutrophils, and dendritic cells. Correlation between the expression of IL6 (c), CXCL13 (d), and PLAU (e) and macrophage and dendritic cell infiltration in various cancers, with significant correlations indicated by darker colors.
Prediction of TFs and therapeutic drugs
A total of 21 potential TFs were identified and then filtered to those with transcription factor binding sites (TFBS). The key TFs were POU Class 2 Homeobox 2 (POU2F2), Signal Transducer and Activator of Transcription 3 (STAT3), cAMP Responsive Element Binding Protein 1 (CREB1), Forkhead Box C1 (FOXC1), GATA Binding Protein 2 (GATA2), Forkhead Box L1 (FOXL1), Nuclear Factor I C (NFIC), GATA Binding Protein 3 (GATA3), and TEA Domain Family Member 1 (TEAD1) (Fig. 7a). A PPI network was constructed for these TFs and the identified biomarkers. In this network, POU2F2 and TEAD1 did not interact with other genes, which contrasted with some other findings (Fig. 7b). Despite this, correlation analysis between TF and biomarker expression gave consistent results, supporting the predicted interaction network (Fig. 7c).
Network analysis of hub genes with TFs and their correlation with drug sensitivity. (a) Interaction network of TFs and hub genes IL6, CXCL13, and PLAU, showing the regulatory relationships among them. (b) Refined network showing key TFs such as STAT3, CREB1, and GATA2 with strong interactions regulating the hub genes. (c) Heatmap displaying the correlation between hub gene expression and TFs, where darker colors indicate stronger correlations. (d) Correlation between hub gene expression and GDSC drug sensitivity. CXCL13 shows a negative correlation with sensitivity to several drugs, while PLAU and IL6 have mixed effects. (e) Correlation between hub gene expression and drug sensitivity in the CTRP dataset, revealing that IL6 and PLAU are positively correlated with resistance to several drugs.
Furthermore, the correlation between the expression of the three biomarkers and drug sensitivity was analyzed using data from the GDSC dataset (Fig. 7d). CXCL13 demonstrated a strong negative correlation with several drugs, including TPCA-1, YK 4-279, Epolactaene B, KIN001-270, PD-0325901, Bleomycin, and MLN4924. This indicates that higher expression of CXCL13 is associated with increased sensitivity to these drugs, suggesting that CXCL13 may be a potential biomarker for predicting better therapeutic response in patients treated with these agents. In contrast, IL6 did not show significant correlations with the majority of the drugs analyzed, indicating that IL6 expression levels may have a limited impact on drug sensitivity or resistance in this context. PLAU displayed a varied response, with a positive correlation observed with TPCA-1, indicating that higher PLAU expression is associated with increased resistance to this drug. Conversely, PLAU showed negative correlations with Bleomycin and MLN4924, suggesting that elevated PLAU expression may enhance sensitivity to these agents.
Moreover, the correlation was also evaluated using data from the CTRP dataset (Fig. 7e). IL6 exhibited a strong positive correlation with multiple drugs, including AA-COCF3, ABT-199, alvocidib, apricitibib, arvincib, BRD-K91615587, dexamethasone, fluorouracil, dinaciclib, and others. This positive correlation suggests that higher IL6 expression may be associated with increased resistance to these drugs, indicating a potential role for IL6 in modulating drug resistance mechanisms. Similarly, PLAU also demonstrated positive correlations with several drugs, including apricitibib, BRD-K91615587, cytochalasin B, and dexamethasone, suggesting that elevated PLAU expression may be linked to greater resistance to these agents. In contrast, CXCL13 did not show significant correlations with most of the drugs analyzed, indicating that its expression level may have little impact on the sensitivity or resistance to these particular therapies. These results suggest that IL6 and PLAU may contribute to drug resistance in HNSCC, while CXCL13 may play a lesser role in drug sensitivity, warranting further investigation into their potential as predictive biomarkers for drug response in cancer therapy.
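The sign convention used above (negative correlation read as greater sensitivity, positive correlation as resistance) can be made concrete with a rank correlation between gene expression and a drug response measure. The sketch below is a minimal Python illustration only. It assumes that the response value behaves like an IC50, so that larger values mean lower sensitivity; this section does not state the measure used. The function names and the toy values are hypothetical and are not taken from GDSC or CTRP.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based ranks
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def direction(rho):
    """Interpret the sign, assuming larger response values mean lower sensitivity."""
    if rho > 0:
        return "higher expression, lower sensitivity (resistance)"
    if rho < 0:
        return "higher expression, higher sensitivity"
    return "no monotonic trend"


# Hypothetical toy values (not study data): one entry per cell line.
expression = [1.2, 3.4, 2.2, 5.1, 4.0, 6.3]
response_drug_a = [9.1, 6.0, 7.7, 3.2, 5.5, 2.0]  # falls as expression rises
response_drug_b = [2.0, 2.0, 5.0, 7.0, 7.0, 9.0]  # rises with expression, with ties

for name, response in (("drug A", response_drug_a), ("drug B", response_drug_b)):
    rho = spearman(expression, response)
    print(name, round(rho, 2), direction(rho))
```

A real analysis would also attach a p-value and correct for testing many drugs at once; both are left out of this sketch.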
Bioinformatics analysis on scRNA-seq data in lymphoid metastases of HNSCC
In the scRNA-seq analysis of lymphoid metastases in HNSCC, distinct differences in cellular composition and gene expression were observed between normal lymphoid tissues (left) and tumor samples (right). The UMAP clustering in Fig. 8a illustrates the distribution of various cell types, such as macrophages (MPs), epithelial cells, lymphocytes, fibroblasts, and tissue stem cells, with a marked increase in macrophages and epithelial cells in HNSCC compared to normal tissues. This shift is further quantified in the bar plot of Fig. 8b, where normal lymphoid tissue is primarily composed of lymphocytes, whereas HNSCC samples show an increase in epithelial cells and macrophages, reflecting the changes induced by the tumor microenvironment.
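The composition comparison summarized in the bar plot amounts to counting annotated cells per group and normalizing by the group total. The following minimal Python sketch shows that calculation under stated assumptions: each cell already carries a cell-type label, and the group names, labels, and counts below are hypothetical toy values rather than data from this study.

```python
from collections import Counter


def cell_type_proportions(labels):
    """Return {cell_type: fraction of cells} for one group of annotated cells."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Hypothetical annotations (not study data): one label per cell.
cells = {
    "LYMPH": ["lymphocyte"] * 8 + ["macrophage"] * 1 + ["fibroblast"] * 1,
    "HNSCC": ["epithelial"] * 5 + ["macrophage"] * 3
    + ["lymphocyte"] * 1 + ["fibroblast"] * 1,
}

proportions = {group: cell_type_proportions(labels) for group, labels in cells.items()}
for group, fractions in proportions.items():
    ranked = sorted(fractions.items(), key=lambda kv: kv[1], reverse=True)
    print(group, [(name, round(frac, 2)) for name, frac in ranked])
```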
Bioinformatics analysis on scRNA-seq data in lymphoid tissues and HNSCC microenvironments. (a) UMAP clustering of cell types from lymphoid tissue and HNSCC samples. (b) A bar plot displaying the relative proportions of different cell types in lymphoid tissues and HNSCC. HNSCC samples are dominated by epithelial cells and MPs, whereas lymphoid tissues are primarily composed of lymphocytes. (c) Dot plot representing the expression levels of hub genes across various cell types, comparing lymphoid tissues (LYMPH) and HNSCC samples. Epithelial cells, fibroblasts, and MPs in HNSCC show elevated levels of these genes. (d–f) Spatial gene expression patterns for hub genes across cell types, with significant upregulation in macrophages and epithelial cells in HNSCC compared to lymphoid tissues. (g) Interaction networks between different cell types in lymphoid tissues (left) and HNSCC (right). Tumor tissues exhibit stronger, more complex interactions, particularly between epithelial cells, fibroblasts, and MPs, highlighting the enhanced cellular communication that supports tumor growth and metastasis.
Gene expression data in Fig. 8c reveal elevated levels of CXCL13 and PLAU in fibroblasts and epithelial cells in HNSCC, whereas IL6 expression is reduced in HNSCC, particularly in MPs, as confirmed by Fig. 8d. These expression patterns highlight the distinct roles of these genes in modulating tumor progression and the immune response. Fig. 8e,f further verify the spatial expression of CXCL13 and PLAU, showing increased expression in macrophages, epithelial cells, and fibroblasts within the HNSCC tumor environment.
The interaction strength between different cell types, as shown in Fig. 8g, indicates weaker, lymphocyte-centered interactions in normal tissues, whereas HNSCC samples exhibit stronger and more complex interactions among macrophages, epithelial cells, and fibroblasts. This enhanced cellular communication, as also visualized in Supplemental Fig. 6, supports tumor growth, immune evasion, and metastasis. In HNSCC (right), chondrocytes, epithelial cells, and fibroblasts show notably stronger interactions compared to normal lymphoid tissues (left), reinforcing the critical role of altered cellular dynamics and gene expression in shaping the tumor microenvironment and promoting disease progression in HNSCC.
Discussion
Numerous studies have contributed to our understanding of HNSCC, and an increasing number of treatment options have been developed. The reported association between HNSCC and HL is attributed mainly to common risk factors (smoking, alcohol consumption, and viral infection), treatment side effects, immunosuppression, and genetic predisposition57,58. However, the molecular mechanisms driving HNSCC progression in patients with HL remain poorly understood. With advances in bioinformatics and ML, it is now possible to rapidly identify candidate therapeutic targets from large-scale sequencing datasets. In this context, our study offers detailed insights into the roles of three inflammatory and immune-related biomarkers (IL6, CXCL13, and PLAU) in HNSCC, particularly in their connection with HL. The identification of these biomarkers highlights their potential as pivotal contributors to both tumor progression and immune regulation.
First, IL6 is a protein-coding gene whose product is a cytokine that functions in inflammation and B-cell maturation and plays an important role in several regulatory mechanisms of cancer59,60,61. Previous studies have reported that IL-6-targeted therapies might provide significant benefits for cancer treatment. In addition, the DrugBank database indicated that drugs targeting IL-6 include Andrographolide, which has been validated in clinical trials62. Multiple basic studies have validated the involvement of IL-6 in regulating HNSCC development. For instance, Uz et al. demonstrated that IL-6 overexpression and elevated serum or saliva IL-6 concentrations in patients with HNSCC are related to poor survival and oncological outcomes63. Mechanistically, IL-6 transcriptionally activates xCT expression through the JAK2/STAT3 pathway. Furthermore, IL-6 reversed the ferroptosis and growth suppression induced by xCT knockdown or the ferroptosis inducer erastin, demonstrating the critical role of IL-6-induced ferroptosis resistance during HNSCC carcinogenesis64.
Second, CXCL13 is a B-cell chemokine. Its interaction with its receptor, C-X-C Motif Chemokine Receptor 5, forms a signaling network that regulates both normal physiology and the development of many diseases65,66,67. Recent studies have implicated this signaling network in the pathogenesis of several diseases, such as rheumatoid arthritis, multiple sclerosis, systemic lupus erythematosus, myasthenia gravis, and inflammatory bowel disease68, which is consistent with our enrichment analysis results. In the DGIdb database, Rituximab was found to target CXCL13 and has also been validated in clinical trials69.
Third, PLAU encodes a secreted serine protease that converts plasminogen to plasmin and promotes progression in many tumors70,71,72. Recent studies have found that silencing PLAU inhibits the proliferation, migration, and EMT of CAL27 cells, and that down-regulation of PLAU decreases the expression of TNFRSF12A, which also inhibits cell proliferation and migration. Furthermore, in vivo experiments indicated that PLAU inhibition could suppress tumor growth73. In the DrugBank results, Amiloride was identified as a drug targeting PLAU. Research on Amiloride in HNSCC treatment is lacking, so this drug warrants investigation as a potential treatment for HNSCC patients.
After identifying the biomarkers, a nomogram was constructed for further clinical use in this patient population. Calibration curves indicated that the nomogram-predicted probability of 2-year DFS was comparatively the most accurate, with good agreement between predicted and actual probabilities. In the subsequent pan-cancer analysis, IL6 expression in RCC, OV, LUAD, and GBM was significantly positively associated with inflammation and quiescence; CXCL13 expression in NSCLC, GBM, and BRCA was significantly negatively associated with apoptosis, DNA damage, DNA repair, EMT, invasion, and metastasis; and PLAU expression in RCC, OV, LUAD, and GBM was significantly positively associated with inflammation and quiescence.
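The idea behind a calibration check can be sketched as follows: patients are grouped by predicted probability, and the mean prediction in each group is compared with the observed outcome rate. The minimal Python example below is an illustration under simplifying assumptions, not the procedure used for the nomogram in this study. It ignores censoring (a full analysis would estimate the observed rate with a survival method), and the function name, the number of bins, and the toy values are hypothetical.

```python
def calibration_table(predicted, observed, n_bins=4):
    """Compare mean predicted probability with observed rate in equal-sized groups.

    predicted: predicted probability of being disease-free at the time horizon.
    observed: 1 if the patient was disease-free at that horizon, else 0.
    Returns a list of (mean_predicted, observed_rate, group_size) tuples.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, obs_rate, len(chunk)))
    return rows


# Hypothetical toy values (not study data).
predicted = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
observed = [0, 0, 0, 1, 1, 0, 1, 1]

rows = calibration_table(predicted, observed, n_bins=4)
for mean_pred, obs_rate, size in rows:
    print(f"predicted {mean_pred:.2f}  observed {obs_rate:.2f}  n={size}")
```

Points lying close to the diagonal (mean predicted probability equal to the observed rate) indicate good calibration.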
Next, a series of downstream analyses targeting these three genes was performed. First, GSEA detected common pathways, such as epithelial-mesenchymal transition and inflammatory response, which may be activated by the NF-κB signaling pathway (Supplemental Fig. 3). Previous studies have confirmed that the NF-κB pathway plays a key role in regulating many physiological processes, such as the immune response, the inflammatory response, and cell growth and death. Because the NF-κB signaling pathway can be aberrantly activated by EBV, this may provide evidence for a relationship between EBV and the biomarkers. Second, immune cell infiltration analysis revealed that the biomarkers were positively correlated with MP and dendritic cell infiltration not only in HNSCC but also in various other malignancies, suggesting that controlling macrophage and dendritic cell behavior may be essential for intervention in HNSCC patients with HL. Prior research also revealed that macrophages can directly present peptide antigens through major histocompatibility complex class II (MHCII) and activate helper T cells after ingesting pathogens by phagocytosis. In contrast to dendritic cells (DCs), which present antigens in lymph nodes and activate naive T cells, macrophages present antigens in tissues and cannot induce activation of naive T cells. In this process, macrophages use a system of pattern recognition receptors (PRRs), such as Toll-like receptors (TLRs), which specifically bind pathogen components such as bacterial lipopolysaccharides (LPS), RNA, DNA, or extracellular proteins, leading to the activation of signaling cascades and the production of inflammatory mediators. As a result, macrophages release soluble factors such as cytokines, enzymes, or metabolites that affect other immune cell types74. Third, scRNA-seq analysis revealed an uneven distribution of biomarker expression across cell types in hypopharyngeal and lymphatic tissues during lymph node metastasis.
The expression of IL6 was higher in hypopharyngeal MPs than in lymphatic MPs, and the expression of PLAU was higher in hypopharyngeal tissue stem cells, fibroblasts, and ECs than in lymphatic MPs. No significant difference in CXCL13 distribution across tissues was observed in our analysis, although it has been demonstrated that exhausted CD8 T cells with high CXCL13 expression strongly interact with tumor cells during lymph node metastasis in HNSCC. In addition, the TF network results showed that POU2F2 and TEAD1 did not interact with other genes, which contrasted with some other findings. Despite this, correlation analysis between TF and biomarker expression gave consistent results, supporting the interaction network.
Finally, Andrographolide, Rituximab, and Amiloride were selected from the drug analysis and might be used in combination to treat HNSCC patients with HL69. Meanwhile, the DrugBank results for IL6 pointed to a potential shared molecular mechanism between HNSCC and rheumatoid arthritis, in agreement with the DO enrichment analysis63. In summary, the IL6 and CXCL13 results suggest that Andrographolide and Rituximab might be used in combination to treat HNSCC patients with HL, and the PLAU results suggest that Amiloride may serve as a potential treatment for HNSCC patients.
Regarding the mechanisms by which Andrographolide, Rituximab, and Amiloride target IL6, CXCL13, and PLAU, prior research has shown that Andrographolide can inhibit the transcription of genes downstream of the NF-κB signaling pathway by blocking the binding of NF-κB to DNA75. The expression of iNOS, TNF-α, IL6, and other genes is regulated by the NF-κB signaling pathway. Therefore, the inhibitory effect of Andrographolide on the expression of inflammatory factors may be related to decreased activity of NF-κB signaling molecules. Meanwhile, CXCL13 is a chemokine mainly involved in regulating the immune system, especially B-cell chemotaxis and localization in lymphoid tissue. Rituximab, an antibody targeting B cells, is mainly used to treat B-cell lymphoma; it binds the CD20 antigen on B cells and triggers their lysis76. This therapeutic strategy directly affects the number and function of B cells, which may in turn affect CXCL13 expression and function. In addition, Amiloride and its analogues can induce non-infectious cell death by affecting intracellular calcium ion concentration and activating NADPH oxidase77. In some cases, this mechanism of action may be related to specific physiological responses of neutrophils, especially cellular calcium overload and reactive oxygen species (ROS) production78. In inflammatory and immune responses, the expression of CXCL13 may be regulated by various inflammatory factors and microbial products, thereby affecting the mobilization, recruitment, and activation of neutrophils and helping to regulate the intensity of these responses. This may be the key link between CXCL13 and Amiloride.
Despite providing new insights into the shared molecular pathways between HNSCC and HL, this study has certain limitations. First, all tumor samples from the TCGA dataset were used without additional quality control. Although the TCGA data have undergone standardized processing, implementing more rigorous quality control and selecting an optimized subset of samples could further enhance the reliability of our findings. Additionally, while the machine learning algorithms applied are widely used for feature selection, they are not specifically designed for biomedical data. Future studies could benefit from developing models tailored to biological characteristics to improve predictive accuracy. Another major limitation is that drug screening in this study was based entirely on in silico predictions and lacked experimental validation. Although our bioinformatics analysis suggests that Andrographolide, Rituximab, and Amiloride may target IL6, CXCL13, and PLAU, further experimental confirmation is required to verify their actual effects in HNSCC and HL. Therefore, in the near future, we plan to evaluate the expression levels of the three genes in HNSCC and HL samples using qPCR and Western blot, and to use flow cytometry or immunohistochemistry to validate the computational predictions related to immune cell infiltration.
Conclusion
In conclusion, this study identified three key diagnostic biomarkers (IL6, CXCL13, and PLAU) associated with the progression of HNSCC in patients with HL. These biomarkers were significantly enriched in pathways related to the epithelial-mesenchymal transition and inflammatory responses, both of which are critical in the progression and metastasis of HNSCC. Their correlation with immune cell infiltration, particularly with MPs and dendritic cells, suggests a potential role in modulating the tumor immune microenvironment, which is pivotal in tumor progression. Moreover, the findings from drug sensitivity analysis indicate that Andrographolide, Rituximab, and Amiloride may offer promising therapeutic options for HNSCC patients, particularly those with HL. However, the absence of clinical validation and the reliance on bioinformatics models highlight the need for future experimental and clinical studies to substantiate these results. Additionally, the nomogram incorporating these biomarkers provides a practical tool for predicting disease prognosis, which could enhance personalized treatment strategies in clinical settings. Overall, this research provides novel insights into the molecular mechanisms linking HNSCC and HL, offering potential avenues for therapeutic intervention.
Data availability
The datasets analyzed in this study are publicly available. The GSE datasets were obtained from the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov), while the head and neck squamous cell carcinoma dataset was retrieved from The Cancer Genome Atlas database (https://portal.gdc.cancer.gov).
Abbreviations
- HNSCC: Head and neck squamous cell carcinoma
- HL: Hodgkin lymphoma
- EBV: Epstein–Barr virus
- HPV: Human papillomavirus
- TCGA: The Cancer Genome Atlas
- GEO: Gene Expression Omnibus
- GTEx: Genotype-Tissue Expression
- HPA: Human Protein Atlas
- DEGs: Differentially Expressed Genes
- WGCNA: Weighted Gene Co-expression Network Analysis
- PPI: Protein–Protein Interaction
- GO: Gene Ontology
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- DO: Disease Ontology
- GSEA: Gene Set Enrichment Analysis
- LASSO: Least Absolute Shrinkage and Selection Operator
- RF: Random Forest
- SVM-RFE: Support Vector Machine Recursive Feature Elimination
- ROC: Receiver Operating Characteristic
- AUC: Area Under the Curve
- K–M: Kaplan–Meier Survival Analysis
- IL6: Interleukin 6
- CXCL13: C-X-C Motif Chemokine Ligand 13
- PLAU: Plasminogen Activator, Urokinase
- MMP9: Matrix Metallopeptidase 9
- COL1A1: Collagen Type I Alpha 1 Chain
- IL1RN: Interleukin 1 Receptor Antagonist
- CD80: CD80 Molecule
- VCAM1: Vascular Cell Adhesion Molecule 1
- scRNA-seq: Single-cell RNA Sequencing
- MPs: Macrophages
- TFs: Transcription Factors
- TFBS: Transcription Factor Binding Sites
- NF-κB: Nuclear Factor Kappa B
- STAT3: Signal Transducer and Activator of Transcription 3
- CREB1: cAMP Responsive Element Binding Protein 1
- STRING: Search Tool for the Retrieval of Interacting Genes/Proteins
- GEPIA2: Gene Expression Profiling Interactive Analysis 2
- UALCAN: University of Alabama Cancer Database
- CancerSEA: Cancer Single-cell State Atlas
- GSCA: Gene Set Cancer Analysis
- DrugBank: Drug-Target Database
- DGIdb: Drug-Gene Interaction Database
- CIBERSORT: Cell-type Identification by Estimating Relative Subsets of RNA Transcripts
- TIMER: Tumor Immune Estimation Resource
References
Sung, H. et al. Global Cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249. https://doi.org/10.3322/caac.21660 (2021).
Guo, Y. et al. First-line treatment with chemotherapy plus cetuximab in Chinese patients with recurrent and/or metastatic squamous cell carcinoma of the head and neck: Efficacy and safety results of the randomised, phase III CHANGE-2 trial. Eur. J. Cancer. 156, 35–45. https://doi.org/10.1016/j.ejca.2021.06.039 (2021).
Galizia, D. et al. How risk factors affect head and neck squamous cell carcinoma (HNSCC) tumor immune microenvironment (TIME): Their influence on immune escape mechanisms and immunotherapy strategy. Biomedicines https://doi.org/10.3390/biomedicines10102498 (2022).
Zheng, J., Li, X., Gong, X., Hu, Y. & Tang, M. Elucidating the molecular and immune interplay between head and neck squamous cell carcinoma and diffuse large B-cell lymphoma through bioinformatics and machine learning. Transl. Cancer Res. 13, 5725–5750. https://doi.org/10.21037/tcr-24-1064 (2024).
Mathas, S., Hartmann, S. & Kuppers, R. Hodgkin lymphoma: Pathology and biology. Semin. Hematol. 53, 139–147. https://doi.org/10.1053/j.seminhematol.2016.05.007 (2016).
Hu, J., Zhang, X., Tao, H. & Jia, Y. The prognostic value of Epstein–Barr virus infection in Hodgkin lymphoma: A systematic review and meta-analysis. Front. Oncol. 12, 1034398. https://doi.org/10.3389/fonc.2022.1034398 (2022).
Kim, S. H. et al. Differential diagnosis of sinonasal lymphoma and squamous cell carcinoma on CT, MRI, and PET/CT. Otolaryngol. Head Neck Surg. 159, 494–500. https://doi.org/10.1177/0194599818770621 (2018).
Joo, Y. H., Jung, C. K., Sun, D. I. & Kim, M. S. Synchronous laryngeal squamous cell carcinoma and hodgkin lymphoma of the head and neck region. Auris Nasus Larynx. 36, 501–504. https://doi.org/10.1016/j.anl.2008.10.012 (2009).
Moore, M. G. & Bhattacharyya, N. Survival outcomes for head and neck squamous cell carcinoma arising after lymphoma. Otolaryngol. Head Neck Surg. 138, 388–393. https://doi.org/10.1016/j.otohns.2007.11.026 (2008).
Takada, H. et al. EBV induces persistent NF-kappaB activation and contributes to survival of EBV-positive neoplastic T- or NK-cells. PLoS One. 12, e0174136. https://doi.org/10.1371/journal.pone.0174136 (2017).
Karbalaie Niya, M. H. et al. Epstein–Barr virus molecular epidemiology and variants identification in head and neck squamous cell carcinoma. Eur. J. Cancer Prev. 29, 523–530. https://doi.org/10.1097/CEJ.0000000000000554 (2020).
Levine, P. H. et al. Elevated antibody titers to Epstein–Barr virus in Hodgkin’s disease. Cancer 27, 416–421 (1971).
Poppema, S., van Imhoff, G., Torensma, R. & Smit, J. Lymphadenopathy morphologically consistent with Hodgkin’s disease associated with Epstein–Barr virus infection. Am. J. Clin. Pathol. 84, 385–390. https://doi.org/10.1093/ajcp/84.3.385 (1985).
Weiss, L. M., Strickler, J. G., Warnke, R. A., Purtilo, D. T. & Sklar, J. Epstein–Barr viral DNA in tissues of Hodgkin’s disease. Am. J. Pathol. 129, 86–91 (1987).
Anagnostopoulos, I., Herbst, H., Niedobitek, G. & Stein, H. Demonstration of monoclonal EBV genomes in Hodgkin’s disease and Ki-1-positive anaplastic large cell lymphoma by combined Southern blot and in situ hybridization. Blood 74, 810–816 (1989).
Weiss, L. M., Movahed, L. A., Warnke, R. A. & Sklar, J. Detection of Epstein–Barr viral genomes in Reed–Sternberg cells of Hodgkin’s disease. N. Engl. J. Med. 320, 502–506. https://doi.org/10.1056/nejm198902233200806 (1989).
Coates, P. J., Slavin, G. & D’Ardenne, A. J. Persistence of Epstein–Barr virus in Reed–Sternberg cells throughout the course of Hodgkin’s disease. J. Pathol. 164, 291–297. https://doi.org/10.1002/path.1711640404 (1991).
Bargou, R. C. et al. Constitutive nuclear factor-kappaB-RelA activation is required for proliferation and survival of Hodgkin’s disease tumor cells. J. Clin. Invest. 100, 2961–2969. https://doi.org/10.1172/jci119849 (1997).
Dutton, A., Reynolds, G. M., Dawson, C. W., Young, L. S. & Murray, P. G. Constitutive activation of phosphatidyl-inositide 3 kinase contributes to the survival of Hodgkin’s lymphoma cells through a mechanism involving Akt kinase and mTOR. J. Pathol. 205, 498–506. https://doi.org/10.1002/path.1725 (2005).
Heath, E. et al. Epstein–Barr virus infection of Naïve B cells in vitro frequently selects clones with mutated Immunoglobulin genotypes: Implications for virus biology. PLoS Pathog. 8, e1002697. https://doi.org/10.1371/journal.ppat.1002697 (2012).
Holtick, U. et al. STAT3 is essential for hodgkin lymphoma cell proliferation and is a target of tyrphostin AG17 which confers sensitization for apoptosis. Leukemia 19, 936–944. https://doi.org/10.1038/sj.leu.2403750 (2005).
Castellano, L. R. C. et al. Implications and emerging therapeutic avenues of inflammatory response in HPV + head and neck squamous cell carcinoma. Cancers (Basel). https://doi.org/10.3390/cancers14215406 (2022).
Shanbhag, S. & Ambinder, R. F. Hodgkin lymphoma: A review and update on recent progress. CA Cancer J. Clin. 68, 116–132. https://doi.org/10.3322/caac.21438 (2018).
Ruffin, A. T. et al. Improving head and neck cancer therapies by Immunomodulation of the tumour microenvironment. Nat. Rev. Cancer. 23, 173–188. https://doi.org/10.1038/s41568-022-00531-9 (2023).
Vardhana, S. & Younes, A. The immune microenvironment in hodgkin lymphoma: T cells, B cells, and immune checkpoints. Haematologica 101, 794–802. https://doi.org/10.3324/haematol.2015.132761 (2016).
Chowdhry, A. K. et al. Second primary head and neck cancer after hodgkin lymphoma: A population-based study of 44,879 survivors of hodgkin lymphoma. Cancer 121, 1436–1445. https://doi.org/10.1002/cncr.29231 (2015).
Schaapveld, M. et al. Second cancer risk up to 40 years after treatment for Hodgkin’s lymphoma. N. Engl. J. Med. 373, 2499–2511. https://doi.org/10.1056/NEJMoa1505949 (2015).
Aldinucci, D., Borghese, C. & Casagrande, N. Formation of the immunosuppressive microenvironment of classic hodgkin lymphoma and therapeutic approaches to counter it. Int. J. Mol. Sci. https://doi.org/10.3390/ijms20102416 (2019).
Tomczak, K., Czerwinska, P. & Wiznerowicz, M. The Cancer Genome Atlas (TCGA): An immeasurable source of knowledge. Contemp. Oncol. (Pozn). 19, A68–A77. https://doi.org/10.5114/wo.2014.47136 (2015).
Barrett, T. et al. NCBI GEO: Archive for functional genomics data sets—Update. Nucleic Acids Res. 41, D991–D995. https://doi.org/10.1093/nar/gks1193 (2013).
Brune, V. et al. Origin and pathogenesis of nodular lymphocyte-predominant hodgkin lymphoma as revealed by global gene expression analysis. J. Exp. Med. 205, 2251–2268. https://doi.org/10.1084/jem.20080809 (2008).
Weniger, M. A. et al. Human CD30+ B cells represent a unique subset related to hodgkin lymphoma cells. J. Clin. Investig. 128, 2996–3007. https://doi.org/10.1172/JCI95993 (2018).
Zhao, Y. et al. Comprehensive analysis of tumor immune microenvironment characteristics for the prognostic prediction and immunotherapy of oral squamous cell carcinoma. Front. Genet. 13, 788580. https://doi.org/10.3389/fgene.2022.788580 (2022).
Chen, C. et al. Gene expression profiling identifies genes predictive of oral squamous cell carcinoma. Cancer Epidemiol. Biomark. Prev. 17, 2152–2162. https://doi.org/10.1158/1055-9965.EPI-07-2893 (2008).
Lohavanichbutr, P. et al. A 13-gene signature prognostic of HPV-negative OSCC: Discovery and external validation. Clin. Cancer Res. 19, 1197–1203. https://doi.org/10.1158/1078-0432.CCR-12-2647 (2013).
Giefing, M. et al. Hodgkin–Reed–Sternberg cells in classical hodgkin lymphoma show alterations of genes encoding the NADPH oxidase complex and impaired reactive oxygen species synthesis capacity. PloS One. 8, e84928. https://doi.org/10.1371/journal.pone.0084928 (2013).
Xi, Y. et al. GOLM1 and FAM49B: Potential biomarkers in HNSCC based on bioinformatics and immunohistochemical analysis. Int. J. Mol. Sci. https://doi.org/10.3390/ijms232315433 (2022).
Ye, H. et al. Transcriptomic dissection of tongue squamous cell carcinoma. BMC Genom. 9, 69. https://doi.org/10.1186/1471-2164-9-69 (2008).
Lee, C. H. et al. IL-1beta promotes malignant transformation and tumor aggressiveness in oral cancer. J. Cell. Physiol. 230, 875–884. https://doi.org/10.1002/jcp.24816 (2015).
Sheu, J. J. et al. LRIG1 modulates aggressiveness of head and neck cancers by regulating EGFR-MAPK-SPHK1 signaling and extracellular matrix remodeling. Oncogene 33, 1375–1384. https://doi.org/10.1038/onc.2013.98 (2014).
Szklarczyk, D. et al. The STRING database in 2023: Protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 51, D638–D646. https://doi.org/10.1093/nar/gkac1000 (2023).
Tang, Z., Kang, B., Li, C., Chen, T. & Zhang, Z. GEPIA2: An enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res. 47, W556–W560. https://doi.org/10.1093/nar/gkz430 (2019).
Thul, P. J. & Lindskog, C. The human protein atlas: A spatial map of the human proteome. Protein Sci. 27, 233–244. https://doi.org/10.1002/pro.3307 (2018).
Zhou, G. et al. NetworkAnalyst 3.0: A visual analytics platform for comprehensive gene expression profiling and meta-analysis. Nucleic Acids Res. 47, W234–W241. https://doi.org/10.1093/nar/gkz240 (2019).
Chandrashekar, D. S. et al. UALCAN: A portal for facilitating tumor subgroup gene expression and survival analyses. Neoplasia 19, 649–658. https://doi.org/10.1016/j.neo.2017.05.002 (2017).
Yuan, H. et al. CancerSEA: A cancer single-cell state atlas. Nucleic Acids Res. 47, D900–D908. https://doi.org/10.1093/nar/gky939 (2019).
Liu, C. J. et al. GSCA: An integrated platform for gene set cancer analysis at genomic, pharmacogenomic and immunogenomic levels. Brief. Bioinform. https://doi.org/10.1093/bib/bbac558 (2023).
Knox, C. et al. DrugBank 6.0: The drugbank knowledgebase for 2024. Nucleic Acids Res. 52, D1265–D1275. https://doi.org/10.1093/nar/gkad976 (2024).
Cannon, M. et al. DGIdb 5.0: Rebuilding the drug-gene interaction database for precision medicine and drug discovery platforms. Nucleic Acids Res. 52, D1227–D1235. https://doi.org/10.1093/nar/gkad1040 (2024).
Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods. 12, 453–457. https://doi.org/10.1038/nmeth.3337 (2015).
Li, T. et al. A web server for comprehensive analysis of tumor-infiltrating immune cells. Cancer Res. 77, e108–e110. https://doi.org/10.1158/0008-5472.CAN-17-0307 (2017).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550. https://doi.org/10.1186/s13059-014-0550-8 (2014).
Yu, G., Wang, L. G., Han, Y. & He, Q. Y. ClusterProfiler: An R package for comparing biological themes among gene clusters. OMICS 16, 284–287. https://doi.org/10.1089/omi.2011.0118 (2012).
Kanehisa, M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 28, 1947–1951. https://doi.org/10.1002/pro.3715 (2019).
Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y. & Ishiguro-Watanabe, M. KEGG: Biological systems database as a model of the real world. Nucleic Acids Res. 53, D672–D677. https://doi.org/10.1093/nar/gkae909 (2025).
Wagner, A. H. et al. DGIdb 2.0: Mining clinically relevant drug-gene interactions. Nucleic Acids Res. 44, D1036–D1044. https://doi.org/10.1093/nar/gkv1165 (2016).
Swerdlow, A. J. Epidemiology of Hodgkin’s disease and non-Hodgkin’s lymphoma. Eur. J. Nucl. Med. Mol. Imaging. 30, S3–S12. https://doi.org/10.1007/s00259-003-1154-9 (2003).
van Monsjou, H. S., Wreesmann, V. B., van den Brekel, M. W. & Balm, A. J. Head and neck squamous cell carcinoma in young patients. Oral Oncol. 49, 1097–1102. https://doi.org/10.1016/j.oraloncology.2013.09.001 (2013).
Siersbæk, R. et al. IL6/STAT3 signaling hijacks estrogen receptor α enhancers to drive breast cancer metastasis. Cancer Cell 38, 412–423.e9. https://doi.org/10.1016/j.ccell.2020.06.007 (2020).
Hu, Z. et al. IL6-STAT3-C/EBPβ-IL6 positive feedback loop in tumor-associated macrophages promotes the EMT and metastasis of lung adenocarcinoma. J. Exp. Clin. Cancer Res. 43, 63. https://doi.org/10.1186/s13046-024-02989-x (2024).
Leng, S. et al. Radon exposure, IL-6 promoter variants, and lung squamous cell carcinoma in former uranium miners. Environ. Health Perspect. 124, 445–451. https://doi.org/10.1289/ehp.1409437 (2016).
Li, H. et al. Andrographolide-loaded solid lipid nanoparticles enhance anti-cancer activity against head and neck cancer and precancerous cells. Oral Dis. 28, 142–149. https://doi.org/10.1111/odi.13751 (2022).
Uz, U. & Eskiizmir, G. Association between interleukin-6 and head and neck squamous cell carcinoma: A systematic review. Clin. Exp. Otorhinolaryngol. 14, 50–60. https://doi.org/10.21053/ceo.2019.00906 (2021).
Li, M., Jin, S., Zhang, Z., Ma, H. & Yang, X. Interleukin-6 facilitates tumor progression by inducing ferroptosis resistance in head and neck squamous cell carcinoma. Cancer Lett. 527, 28–40. https://doi.org/10.1016/j.canlet.2021.12.011 (2022).
El-Haibi, C. P., Singh, R., Sharma, P. K., Singh, S. & Lillard, J. W. Jr. CXCL13 mediates prostate cancer cell proliferation through JNK signalling and invasion through ERK activation. Cell. Prolif. 44, 311–319. https://doi.org/10.1111/j.1365-2184.2011.00757.x (2011).
Vachani, A. et al. A 10-gene classifier for distinguishing head and neck squamous cell carcinoma and lung squamous cell carcinoma. Clin. Cancer Res. 13, 2905–2915. https://doi.org/10.1158/1078-0432.CCR-06-1670 (2007).
Tian, C. et al. Identification of CXCL13/CXCR5 Axis’s crucial and complex effect in human lung adenocarcinoma. Int. Immunopharmacol. 94, 107416. https://doi.org/10.1016/j.intimp.2021.107416 (2021).
Pan, Z., Zhu, T., Liu, Y. & Zhang, N. Role of the CXCL13/CXCR5 axis in autoimmune diseases. Front. Immunol. 13, 850998. https://doi.org/10.3389/fimmu.2022.850998 (2022).
Hsieh, C. Y. et al. Rituximab in combination with gemcitabine plus cisplatin in patients with recurrent and metastatic head and neck squamous cell carcinoma: A phase I trial. BMC Cancer 22, 169. https://doi.org/10.1186/s12885-022-09258-0 (2022).
Hu, M. D., Jia, L. H. & Wang, M. L. PLAU contributes to the development of cholangiocarcinoma via activating NF-κB signaling pathway. Cell. Biol. Int. 47, 1381–1391. https://doi.org/10.1002/cbin.12025 (2023).
Ai, C. et al. FOXM1 functions collaboratively with PLAU to promote gastric cancer progression. J. Cancer. 11, 788–794. https://doi.org/10.7150/jca.37323 (2020).
Gao, Y., Ma, X., Lu, H., Xu, P. & Xu, C. PLAU is associated with cell migration and invasion and is regulated by transcription factor YY1 in cervical cancer. Oncol. Rep. https://doi.org/10.3892/or.2022.8462 (2023).
Chen, G. et al. PLAU promotes cell proliferation and epithelial-mesenchymal transition in head and neck squamous cell carcinoma. Front. Genet. 12, 651882. https://doi.org/10.3389/fgene.2021.651882 (2021).
Christofides, A. et al. The complex role of tumor-infiltrating macrophages. Nat. Immunol. 23, 1148–1156. https://doi.org/10.1038/s41590-022-01267-2 (2022).
Hidalgo, M. A. et al. Andrographolide interferes with binding of nuclear factor-kappaB to DNA in HL-60-derived neutrophilic cells. Br. J. Pharmacol. 144, 680–686. https://doi.org/10.1038/sj.bjp.0706105 (2005).
Rosengren, S., Wei, N., Kalunian, K. C., Kavanaugh, A. & Boyle, D. L. CXCL13: A novel biomarker of B-cell return following rituximab treatment and synovitis in patients with rheumatoid arthritis. Rheumatology (Oxford) 50, 603–610. https://doi.org/10.1093/rheumatology/keq337 (2011).
Sparks, R. L., Pool, T. B., Smith, N. K. R. & Cameron, I. L. Effects of amiloride on tumor growth and intracellular element content of tumor cells in vivo. Cancer Res. 43, 73–77 (1983).
Ellson, C. D. et al. An integrated pharmacological, structural, and genetic analysis of extracellular versus intracellular ROS production in neutrophils. J. Mol. Biol. 434, 167533. https://doi.org/10.1016/j.jmb.2022.167533 (2022).
Wichmann, G. et al. The role of HPV RNA transcription, immune response-related gene expression and disruptive TP53 mutations in diagnostic and prognostic profiling of head and neck cancer. Int. J. Cancer. 137, 2846–2857 (2015).
Funding
This work was supported by the Natural Science Foundation of Hubei Province of China (No. 2021CFB553) and the National Natural Science Foundation of China (No. 82101207).
Author information
Contributions
M.Q. contributed to conceptualization, data curation, formal analysis, and methodology, and wrote the original draft. X.L. was responsible for data curation, software, and visualization, and reviewed and edited the manuscript. X.G. contributed to investigation and validation, and reviewed and edited the manuscript. Y.H. provided supervision and project administration, reviewed and edited the manuscript, and secured funding. M.T. contributed to conceptualization and supervision, reviewed and edited the manuscript, and provided resources.
Ethics declarations
Competing interests
The authors declare no competing interests.
Consent for publication
All authors confirm that they have thoroughly reviewed and approved the final manuscript for publication. As the study does not involve any personal or identifiable data, obtaining individual consent for publication is not applicable. The authors further declare that this work is entirely original and has not been previously published or submitted to any other journal.
Ethical approval and consent to participate
All datasets were downloaded without restriction from the public GEO and TCGA databases. Ethical approval and consent to participate are therefore not applicable.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Qin, M., Li, X., Gong, X. et al. Integrative bioinformatics and machine learning identify key crosstalk genes and immune interactions in head and neck cancer and Hodgkin lymphoma. Sci Rep 15, 15745 (2025). https://doi.org/10.1038/s41598-025-99017-5