Molecular correlates for HPV-negative head and neck cancer engraftment prognosticate patient outcomes

Waas, Matthew; Karamboulas, Christina; Wu, Benson Z.; Khan, Shahbaz; Poon, Stephanie; Meens, Jalna; Govindarajan, Meinusha; Khoo, Amanda; Mejia-Guerrero, Salvador; Ha, Annie; Liu, Lydia Y.; Nixon, Kevin C. J.; Walton, Joseph; Bratman, Scott V.; Huang, Shao Hui; Goldstein, David; Gaiti, Federico; Ailles, Laurie; Kislinger, Thomas

doi:10.1038/s41467-024-55203-z

Article
Open access
Published: 30 December 2024

Nature Communications volume 15, Article number: 10869 (2024)

Abstract

There is a pressing need to improve risk stratification and treatment selection for HPV-negative head and neck squamous cell carcinoma (HNSCC) due to the adverse side effects of treatment. One of the most important prognostic features is lymph node involvement. Previously, we demonstrated that tumor formation in patient-derived xenografts (i.e., engraftment) was associated with poor clinical outcomes in patients with HPV-negative HNSCC. However, assessing engraftment is challenging in clinical settings. Here, we perform transcriptomic and proteomic profiling of tumors from 88 HNSCC patients and find the relationship between engraftment and clinical outcomes is recapitulated by molecular phenotype. We identify LAMC2 and TGM3 as candidate prognostic biomarkers and validate their utility in an independent cohort containing 404 HPV-negative HNSCC patients. Strikingly, these markers significantly improve prediction of outcomes beyond nodal status alone and can significantly stratify patients without any nodal involvement. Overall, our study demonstrates how the molecular characteristics of engraftment can inform patient prognostication.

Introduction

Outcomes for patients with HPV-negative head and neck squamous cell carcinoma (HNSCC) remain relatively poor with 5-year overall survival (OS) rates of ~ 50%^1,2. Treatment for HPV-negative HNSCC patients depends on clinical presentation, with the principal options including surgical resection and radiotherapy with or without chemotherapy. Nodal involvement is currently one of the most important prognostic features for treatment selection. For patients with small tumors and involvement of ≤ 1 node, treatment with a single modality has been effective – however, there is still a subset of patients who experience tumor recurrence and eventually succumb to the disease³. In patients undergoing surgery for more advanced tumors, postoperative radiotherapy, with or without concurrent chemotherapy, can improve outcomes. While combination therapies can improve the survival of patients, HNSCC survivors often suffer from major complications⁴ and long-term reduction in quality of life due to the pervasive impact treatment can have on daily life - leading to a high suicide rate (63.4 cases per 100,000 individuals) among survivors⁵. Given these considerable consequences of over- or under-treatment, HNSCC care could benefit greatly from additional strategies to assist risk-tailored clinical decision-making.

Patient-derived xenograft (PDX) models are well suited to tumor biology and biomarker studies as they recapitulate the originating tumor in its molecular profile, histopathological features, and therapeutic sensitivity^6,7,8,9. Poor patient outcomes are associated with successful engraftment of surgically-resected HNSCC tumors into mice, akin to other cancers^10,11. Determining engraftment capacity requires extensive expertise, facilities, and time, which could limit its utility for clinical decision-making. A molecular biomarker for engraftment could eliminate the need for xenografts. Here, we integrate mass spectrometry (MS)-based proteomics with transcriptomics to investigate whether the molecular correlates of engraftment recapitulate the relationship to clinical outcomes in 88 HPV-negative HNSCC tumors.

From our molecular profiling, we identified LAMC2 and TGM3 as candidate prognostic biomarkers. We validated that LAMC2 and TGM3 protein expression correlates with clinical outcomes in an independent cohort of 404 HNSCC patients using immunohistochemistry. Critically, our findings show that LAMC2 and TGM3 can significantly stratify clinical outcomes beyond nodal status alone. These findings further demonstrate engraftment to be of clinical significance for HNSCC patients and that the molecular correlates of engraftment represent a resource from which biomarkers of clinical outcomes can be identified.

Results

Clinical outcomes of HPV-negative HNSCC patients are associated with engraftment

We subcutaneously implanted 273 surgically resected HPV-negative HNSCC specimens into NOD/SCID/IL2Rg^-/- (NSG) mice, which resulted in 180 successful engraftments (Fig. 1A). Compared with our prior study¹², this expanded cohort contains 30 additional patient samples and updated records for 44 patients; disease-specific survival (DSS) for the entire cohort was 71% at 3 years, and median follow-up was 27.4 months. Consistent with our prior findings, we observed that engraftment is associated with significantly worse clinical outcomes (Fig. 1B, Supplementary Data Table 1). Of the recorded clinical covariates, engraftment was significantly correlated with N category, surgical margin, and nodal extracapsular extension (ECE) (chi-square p-values of 0.022, 0.037, and 0.038, respectively) (Table 1). As expected, higher N category was significantly associated with worse clinical outcomes (Supplementary Fig. 1A). To interrogate the relationship between engraftment and N category on clinical outcomes, we first binned N category into two groups representative of their clinical outcomes and management: lower risk (N^low: N0, N1) and higher risk (N^high: N2, N3) (Fig. 1C and Supplementary Fig. 1A). Patients who had high-risk tumors that engrafted (E-N^high) had worse clinical outcomes than those with high-risk tumors that did not engraft (N-N^high), suggesting that engraftment and nodal status independently predict clinical outcomes (Fig. 1D and Supplementary Data Table 1). N^high patients that engraft (E-N^high) have significantly worse clinical outcomes compared to every other subset (i.e., N-N^high, E-N^low, and N-N^low) for each clinical outcome assessed, suggesting that engraftment provides additional risk stratification beyond assessing N category alone (Fig. 1D and Supplementary Fig. 1B). This observation is striking given the strong association between N category and HNSCC survival and reinforces the utility of considering engraftment for clinical risk stratification.
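
The grouping described above can be sketched in a few lines of Python. This is only an illustration of the binning logic (N0/N1 as lower risk, N2/N3 as higher risk, combined with engraftment status); the function names and the `N_low`/`N_high` labels are hypothetical and not taken from the study's analysis code.

```python
def n_risk_group(n_category: str) -> str:
    """Bin an N category into the two nodal risk groups described in the text."""
    if n_category in {"N0", "N1"}:
        return "N_low"
    if n_category in {"N2", "N3"}:
        return "N_high"
    raise ValueError(f"unexpected N category: {n_category!r}")


def combined_group(engrafted: bool, n_category: str) -> str:
    """Label a patient by engraftment (E = engrafter, N = non-engrafter) and nodal risk group."""
    prefix = "E" if engrafted else "N"
    return f"{prefix}-{n_risk_group(n_category)}"


# Engraftment rate for the cohort as reported: 180 of 273 implanted specimens.
engraftment_rate = 180 / 273

print(f"engraftment rate: {engraftment_rate:.1%}")
print(combined_group(True, "N2"))   # engrafter with higher-risk nodal status
print(combined_group(False, "N0"))  # non-engrafter with lower-risk nodal status
```

Keeping the binning in one function makes it easy to rerun the comparison with individual N categories instead of risk groups.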

**Fig. 1: Experimental workflow and clinical characteristics of cohort.**

Table 1 Comparison of clinical characteristics of cohort based on whether patient samples successfully formed xenografts (Engrafters) or not (Non-engrafters)

Engraftment is associated with diminished immune signaling and partial EMT

To explore the relationship between engraftment and clinical outcomes, we profiled a subset of our surgically resected, primary HPV-negative HNSCC tumor samples (n = 88) using mass spectrometry-based proteomics and RNA-Sequencing (Fig. 1A and Supplementary Data Table 1). There were no significant differences in the clinical covariates for the subset of the cohort for which we acquired molecular data compared to the complete cohort (Table 2 and Fig. 1E). Proteomic analysis detected a total of 9721 proteins. Median transcript and protein abundances were moderately positively correlated (Spearman’s correlation of 0.31, p-value < 2.2 × 10⁻¹⁶) (Fig. 2A). The RNA and protein fold changes between engrafters and non-engrafters were better correlated than the abundance measurements (Spearman’s correlation of 0.42, p-value < 2.2 × 10⁻¹⁶) (Fig. 2B). Samples did not cluster notably by any clinical covariates (Supplementary Fig. 2A, B).
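
For readers less familiar with the statistic, Spearman's correlation is the Pearson correlation of rank-transformed values. The sketch below is a minimal standard-library implementation applied to made-up numbers; the `rna` and `protein` lists are hypothetical and do not come from the study data, and the study's own analysis software is not specified here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical abundances for five genes (illustration only).
rna = [1.0, 2.0, 3.0, 4.0, 5.0]
protein = [2.1, 1.9, 3.5, 3.4, 6.0]
print(round(spearman(rna, protein), 3))
```

Because only ranks enter the calculation, the statistic is insensitive to the different dynamic ranges of transcript and protein measurements.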

Table 2 Comparison of clinical characteristics to determine whether subset of patients with molecular profile data is representative of the entire cohort

**Fig. 2: Engraftment is associated with expression of partial EMT signature.**

Pathway analysis of the proteomic data revealed that engraftment is positively associated with various developmental and morphogenic programs and terms related to extracellular matrix reorganization and adhesion (Fig. 2C and Supplementary Data Table 1). Cell type deconvolution of bulk transcriptomic data revealed increased immune cell content in non-engrafting tumors and increased fibroblast content in engrafting tumors, although the differences were not statistically significant (Supplementary Data Table 2 and Supplementary Fig. 2C). Most pathways enriched in non-engrafters were immune cell processes; correspondingly, most (75%) proteins associated with immune system processes were higher in non-engrafters (Fig. 2C and Supplementary Fig. 2D). Markers of natural killer (NK) cells¹³ were the most strongly associated with non-engrafters (Supplementary Fig. 2E), consistent with reports that infiltration of NK cells is associated with favorable prognosis^14,15. Engraftment-related fold changes were very weakly correlated with malignant transformation-associated fold changes (i.e., tumor/normal adjacent tissue) from Huang et al. (Spearman’s ρ of 0.06 and 0.18 for proteomic and transcriptomic data, respectively) (Supplementary Fig. 3A, B)¹⁶. Overall, these analyses suggest that the capacity of a tumor to form a xenograft involves diverse pathways that are distinct from malignant transformation and are partially associated with the modulation of multiple cell type-specific programs.

Next, we investigated the relationship between engraftment and established HPV-negative HNSCC subtypes to determine whether the features of the engraftment phenotype were distinct from previously reported molecular profiles^16,17. While our cohort contains samples of each HPV-negative bulk molecular HNSCC subtype (Supplementary Fig. 3C), we detected no correlation between engraftment and subtype (Supplementary Fig. 3D). Alternatively, Puram et al. recently identified six distinct expression profiles for malignant HNSCC cells by applying single-cell RNA Sequencing (scRNA-Seq) to dissect intra-tumoral heterogeneity¹⁸. Of these six expression profiles, only partial epithelial-mesenchymal transition (p-EMT) had significantly different scores between engrafters and non-engrafters (two-sided Wilcoxon rank-sum unadjusted p-value of 2.2 × 10⁻⁵) (Fig. 2D). Among the most differentially expressed p-EMT proteins were secreted structural and enzymatic components of the ECM, consistent with the pathway analysis in Fig. 2C, E. To corroborate that the p-EMT signature is indeed reflective of an epithelial cancer cell state, we performed transcriptomic analysis on paired HNSCC PDX-patient tumors (n = 41). Even with stromal depletion characteristic of PDX models, cancer cells robustly expressed p-EMT signatures, supporting that expression is driven by differences in epithelial cells and is not an artifact¹⁹ of stromal contamination (Supplementary Fig. 3E–H). Altogether, these results indicate that engraftment is independent of the HNSCC subtype, but instead is associated with an intratumoral pattern of p-EMT, characterized by co-expression of epithelial and mesenchymal genes.
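
To make the two-sided Wilcoxon rank-sum comparison concrete, the sketch below computes the equivalent Mann-Whitney U statistic and a p-value from the normal approximation, without tie or continuity correction. It is a simplified illustration on hypothetical signature scores, not the procedure or software used in the study; exact and tie-corrected variants will give somewhat different p-values for small samples.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (U statistic for x, approximate p-value). No tie or continuity
    correction is applied, so treat the p-value as a rough estimate.
    """
    pooled = sorted(list(x) + list(y))
    # Average 1-based rank for every distinct value in the pooled sample.
    rank_of = {}
    for v in set(pooled):
        first = pooled.index(v) + 1
        count = pooled.count(v)
        rank_of[v] = first + (count - 1) / 2
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(rank_of[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p_value


# Hypothetical signature scores (illustration only).
non_engrafters = [0.1, 0.2, 0.3]
engrafters = [0.4, 0.5, 0.6]
u_stat, p = rank_sum_test(non_engrafters, engrafters)
print(u_stat, round(p, 4))
```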

Engraftment is associated with a fibroblast and epithelial cell signaling network

As fibroblasts are the most abundant stromal cell type in our HNSCC samples (Supplementary Fig. 2C and Supplementary Data Table 2) – and have an established role in the proliferation and remodeling of HNSCC^20,21,22 – we sought to investigate engraftment through a framework of fibroblast-to-epithelial cell signaling networks. To assign the expression of the proteins in our bulk data to a cell type, we integrated scRNA-Seq of primary tumors¹⁸ with proteomic analysis of seven oral squamous epithelial cell lines and eight patient-derived fibroblast cultures²³ (Fig. 3A, Supplementary Fig. 4A, and Supplementary Data Table 3). Cell-type assignments were mostly (86%) concordant for genes detected in both datasets (Supplementary Fig. 4B). Compared to epithelial-assigned proteins, fibroblast-assigned proteins had significantly higher engraftment fold changes (Fig. 3B). While cell-cell communication with fibroblasts could be important for priming epithelial cells for engraftment, it may not be as consequential for nodal involvement (Supplementary Fig. 4C). These findings suggest that although human stroma is eventually replaced by mouse stroma during xenograft growth^24,25 – consistent with our analysis of PDX tumors (Supplementary Fig. 3E–G) – the extent of interactions between fibroblasts and epithelial cells at the time of implantation may support the capacity of the epithelial cells to engraft.

**Fig. 3: Signaling between fibroblasts and epithelial cells contributes to engraftment.**

We investigated putative interactions²⁶ for the subset of proteins that had both higher expression in engrafters and could be assigned a cell-type specific expression (Fig. 3C). By integrating these different analyses, we uncovered a network of potential engraftment-associated epithelial and fibroblast interactions (Fig. 3D). The majority (62%) of the interactions in this network are ligand-ligand interactions of ECM associated proteins (e.g., laminins, collagens) – consistent with the finding that ECM reorganization is enriched in engrafters (Fig. 3D and Supplementary Fig. 4D). Most (53%) of the interactions are between, not within, cell types (Fig. 3D and Supplementary Fig. 4E). Taken together, these observations suggest that epithelial cells and fibroblasts both contribute to the engraftment-enriched ECM.

Interestingly, six of the seven (85.7%) epithelial-assigned proteins in this network are p-EMT proteins (Supplementary Fig. 4F), revealing a possible relationship between fibroblasts and the epithelial p-EMT state – consistent with the reported localization of p-EMT expression at the leading edges of human tumors (i.e., proximal to stroma)^18,27. Indeed, a reanalysis of the spatial transcriptomic data from Arora et al.²⁷ reveals that, with few exceptions, p-EMT genes are enriched in leading edge (LE) compared to tumor core (TC) epithelial cells (Supplementary Fig. 4G). In fact, p-EMT expression increases stepwise from the TC, through the transitory region, to the LE (Supplementary Fig. 4H). We profiled how gene expression changes in HNSCC cell lines during co-culture with patient-derived cancer-associated fibroblasts (CAFs) and found that expression of p-EMT genes increases (Supplementary Fig. 4I, J). This provides experimental evidence that expression of p-EMT at the leading edge could be due to enhanced crosstalk between CAFs and epithelial cells. These findings suggest that engraftment capacity may be driven by interactions with the tumor microenvironment and provide a putative network of engraftment-associated interactions between cancer cells and CAFs.

Engraftment associated proteins LAMC2 and TGM3 stratify clinical outcomes

We next investigated whether there were molecular features that were associated with both engraftment and clinical outcome to identify biomarkers that could provide prognostic value. Each clinical outcome hazard ratio had positive correlations with engraftment fold changes (Spearman’s ρ ranging from 0.26–0.37 and 0.35–0.49 for protein and RNA, respectively) (Supplementary Fig. 5A). We filtered our dataset based on engraftment fold changes and hazard ratios to identify candidate biomarkers associated with clinical outcomes (Fig. 4A). We then ranked candidates according to median protein and mRNA expression, DSS hazard ratio from TCGA, and tumor/normal fold changes¹⁶ evaluating markers for both better and worse clinical outcomes separately (Fig. 4A and Supplementary Fig. 5B). The top-ranking candidate biomarkers were LAMC2 and TGM3 – where increased expression was associated with worse and better clinical outcomes, respectively.
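
The filtering step can be pictured as keeping features whose engraftment fold change and hazard ratio point in the same direction. The sketch below is a hypothetical illustration of that idea only: the `Feature` fields, the fold-change cutoff, and the placeholder gene names are all assumptions, and the actual thresholds and ranking criteria are those summarized in Fig. 4A.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    name: str
    log2_fc: float       # engrafter vs non-engrafter log2 fold change
    hazard_ratio: float  # hazard ratio for a clinical outcome


def candidate_direction(feature: Feature, min_abs_log2_fc: float = 1.0) -> Optional[str]:
    """Classify a feature as a candidate marker of worse or better outcome.

    The cutoff is an assumed, illustrative value rather than the study's threshold.
    """
    if feature.log2_fc >= min_abs_log2_fc and feature.hazard_ratio > 1:
        return "worse"   # higher in engrafters and associated with higher hazard
    if feature.log2_fc <= -min_abs_log2_fc and feature.hazard_ratio < 1:
        return "better"  # higher in non-engrafters and associated with lower hazard
    return None          # discordant or too small an effect


# Placeholder features with made-up values (not study data).
features = [
    Feature("GENE_A", 1.8, 2.4),
    Feature("GENE_B", -1.5, 0.5),
    Feature("GENE_C", 0.3, 1.9),
    Feature("GENE_D", 1.2, 0.8),
]
candidates = {f.name: candidate_direction(f) for f in features}
print(candidates)
```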

**Fig. 4: Identification and validation of LAMC2 and TGM3 as biomarkers.**

We investigated whether expression of LAMC2 and TGM3 was associated with survival in an independent cohort of HNSCC patients (n = 404; 3-year DSS of 76.7% with median follow-up 58.3 months) using immunohistochemistry (IHC) (Fig. 4A). Each tumor core was scored for the frequency of positive staining among cancer cells and for the intensity of staining, and a third score was generated by multiplying these two together. For TGM3, we noted two distinct staining patterns: one in which the staining was very similar to the staining of the suprabasal layers of the normal adjacent squamous epithelium (intense in both the nucleus and cytoplasm), and one in which staining was only cytoplasmic, less intense, and in some cases speckled. We therefore added an additional scoring category for TGM3 in which we scored the frequency of tumor cells with a staining pattern similar to that seen in normal epithelia, which we refer to as a “normal-like” pattern. Of the scored IHC features, LAMC2 intensity and TGM3 normal-like staining had significant associations with clinical outcomes (Supplementary Fig. 6A, example images in Fig. 4B). Staining patterns in matched PDX tumors resembled those directly from patients (Supplementary Fig. 6B).
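
As a small worked illustration of the third score, the sketch below multiplies a staining frequency by a staining intensity. The scales are assumptions made for the example (frequency as a percentage of positive cancer cells, intensity on an ordinal 0 to 3 scale); the scales actually used for scoring are not restated in this section.

```python
def combined_ihc_score(frequency_pct: float, intensity: int) -> float:
    """Multiply staining frequency by staining intensity for one tumor core.

    Assumed scales (illustrative only): frequency in percent (0-100),
    intensity as an ordinal grade from 0 (none) to 3 (strong).
    """
    if not 0 <= frequency_pct <= 100:
        raise ValueError("frequency_pct must be between 0 and 100")
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return frequency_pct * intensity


# A core with half of the cancer cells stained at moderate intensity.
print(combined_ihc_score(50, 2))
```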

Higher intensity of LAMC2 staining was significantly associated with worse clinical outcomes for each measure assessed in accordance with our expectations from candidate selection criteria (Supplementary Fig. 6A, Kaplan-Meier survival curve for DSS shown in Fig. 4C). Strikingly, the significant associations with worse outcomes were preserved for each clinical variable in multivariate survival analyses including controlling for N category (Supplementary Data Table 4). LAMC2 intensity was significantly associated with DSS and recurrence-free interval (RFI) for N^low patients and with overall survival (OS) and recurrence-free survival (RFS) for N^high patients (Supplementary Fig. 6C, Kaplan-Meier survival curve for DSS shown in Fig. 4D). Expression of TGM3 in a normal-like pattern (Fig. 4B) was significantly associated with improved clinical outcomes (Supplementary Fig. 6A, Kaplan-Meier survival curve for DSS shown in Fig. 4E). Multivariate survival analyses revealed a significant interaction between N category risk group and TGM3 normal-like expression (Supplementary Data Table 4). Pairwise survival analyses revealed that high normal-like TGM3 expression was significantly correlated with improved clinical outcomes in N^low but not N^high risk groups (Supplementary Fig. 6D, Kaplan-Meier survival curve for DSS shown in Fig. 4F). These findings are consistent with analyzing the effect of each marker within each N category (i.e., N0, N1, N2), instead of as nodal status groups (i.e., N^low, N^high) (Supplementary Data Table 4).
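
The Kaplan-Meier curves referenced above are product-limit estimates of survival. A minimal standard-library sketch of the estimator follows, applied to hypothetical follow-up times; it illustrates the general technique and makes no claim about the software used for the study's survival analyses.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: True if the event was observed at that time, False if censored
    Returns a list of (time, survival probability) at each time with an event.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up in months (illustration only).
times = [5, 8, 8, 12, 20]
events = [True, False, True, True, False]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Patients censored at the same time as an event are counted as at risk for that event, which is the usual convention.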

Combining LAMC2 and TGM3 stratifies patients without nodal involvement

To investigate whether there was a benefit to considering the combination of LAMC2 intensity, TGM3 staining, and N category risk groups, we combined all three variables in a multivariate Cox model. Based on similarities in clinical outcomes, we subset the cohort into four Survival Groups comprising different combinations of these three variables (SG1-4, where SG1 has the best outcomes and SG4 has the worst outcomes) (Fig. 5A and Supplementary Fig. 7A, Kaplan-Meier survival curve for DSS shown in Supplementary Fig. 7B). There are significant differences between Survival Groups for each clinical outcome (Fig. 5B and Supplementary Fig. 7C). For markers to be of clinical benefit for prognosis they must provide value beyond N category assignment, as that is part of standard-of-care. We observed different clinical outcomes within N category 0 (N0) patients, suggesting that combining LAMC2 intensity and TGM3 staining met this criterion (i.e., assigned Survival Groups were significantly associated with different clinical outcomes within N0 patients) (Fig. 5C and Supplementary Fig. 7D).

**Fig. 5: Combining LAMC2 and TGM3 stratifies clinical outcomes beyond N category.**

An important decision for the clinical management of HNSCC patients without nodal involvement (N0) is elective neck dissection. A commonly cited threshold is that for patients with > 20% probability of occult metastases, dissection is warranted²⁸. To investigate whether LAMC2 and TGM3 could be of clinical utility in this context, we assessed the relationship between assigned Survival Groups and the rates of local failure (i.e., disease persistence or reappearance at the primary tumor site) and regional/distant failure (i.e., metastases in lymph nodes / an organ outside the head or neck) (Supplementary Fig. 7E). Compared to N category alone, the incorporation of the IHC data on LAMC2 and TGM3 (i.e., Survival Groups) marks a patient subset with 42% of the failure rate of N0 category patients overall (Supplementary Fig. 7E, F). If patients assigned SG3 (which spans patients of all N categories (Fig. 5A)) are assessed, there is no statistical difference in the failure rate of N0 and N2 category patients (Supplementary Fig. 7G). Strikingly, the increased failure rate of N0 patients assigned SG3 is associated with an overall survival probability that is statistically indistinguishable from N2 category patients assigned SG3 (Supplementary Fig. 7H). The differences between Survival Groups were most pronounced for N0 category patients – where total failure rates increased from 9.7% in SG1 to 36.8% in SG3 (Supplementary Fig. 7I–L). In summary, the use of LAMC2 and TGM3 as markers allows us to pinpoint N0 patients that are performing statistically similarly to N2 patients in terms of failure rate and overall survival (Supplementary Fig. 7G, H).

There are important clinical covariates for HNSCC, such as extracapsular extension, surgical margins, perineural invasion, and lymphovascular invasion that can stratify outcomes and failure rate (Supplementary Fig. 8A–H). However, these effects appear to be driven by the relationship with N category, as the statistical significance mostly disappears for comparisons within N category groupings (Supplementary Fig. 8A–D, I–L). This highlights how exceptional the relationship between the Survival Group and outcomes within N0 category patients is, as this stratification splits N0 into three distinct prognostic groups. Recognizing that within N0 there remain some clinical covariates that can impact prognosis (e.g., T category, disease subsite, adjuvant treatment), many of which are associated with SG (Supplementary Fig. 8M and Supplementary Data Table 4), we accounted for these using a multivariate Cox model. SG2 and SG3 had worse clinical outcomes than SG1 for each outcome assessed; hazard ratios ranged from 1.6 to 6.5 and 2.4 to 10.3 for SG2 and SG3, respectively (Fig. 5D and Supplementary Data Table 4). Differences between SG1 and SG2-3 were similarly pronounced for N0 patients treated only with surgery; hazard ratios ranged from 2.3 to 11.4 for SG3 (Supplementary Data Table 4). In summary, we validated the relationship of LAMC2 and TGM3 to survival in an independent cohort and demonstrated that the markers could benefit prognosis by stratifying outcomes within patients with the same N category.

LAMC2 and TGM3 pathways correlate with clinical outcomes and engraftment

To understand why LAMC2 and TGM3 expression are associated with clinical outcomes, we examined the pathways that involve these proteins. Most LAMC2 pathways are associated with worse clinical outcomes, especially “epithelial-mesenchymal transition” and “ECM receptor interactions” (Supplementary Fig. 9A). Pathways involving TGM3, particularly “keratinization” and “keratinocyte differentiation”, are associated with favorable outcomes (Supplementary Fig. 9A). Enrichment analysis performed by splitting samples using either LAMC2 or TGM3 intensity (Supplementary Fig. 9B) reveals many of the same pathways associated with engraftment (Fig. 2C). LAMC2 and TGM3 are expressed in epithelial cells of different spatial compartments in the tumor (Supplementary Fig. 9C), consistent with participation in distinct cellular programs. These analyses suggest that LAMC2 and TGM3 are proteins whose expression in epithelial cells is correlated – in opposing directions – with engraftment, epithelial-mesenchymal transition, and clinical outcomes.

Finally, we integrated two scRNA-Seq datasets^29,30 to explore the molecular characteristics and stromal interactions of malignant epithelial cells with distinct LAMC2 and TGM3 expression profiles. While LAMC2 was robustly expressed, detection of TGM3 was too scarce (56/5614 malignant epithelial cells) for meaningful analysis. This was concordant with our analysis of co-cultures, which saw an increase of LAMC2 with co-culture but did not detect TGM3 (Supplementary Fig. 9D). Of the proteins associated with high LAMC2 expression, LAMA3 and LAMB3 were the most significant hits (Supplementary Fig. 10A) – consistent with the importance of the laminin 332 complex in EMT of squamous cell carcinomas and other cancers^31,32,33,34. Among the other differentially expressed proteins were molecules involved in our putative engraftment-associated fibroblast-epithelial signaling network, including ITGB4, ITGA6, and COL17A1 (Fig. 3C and Supplementary Fig. 10A). Enrichment analysis further validated that differences we observe in EMT signaling could be explained by differences in expression profiles of malignant epithelial cells (Supplementary Fig. 10B). LAMC2-high cells interact via extracellular matrix remodeling with all cell types associated with the HNSCC tumor microenvironment (Supplementary Fig. 10C). These findings are in agreement with our previous findings that reorganization of the extracellular matrix is associated with engraftment (Fig. 2C), includes p-EMT signature genes (Fig. 2E), and is a primary mechanism by which epithelial cells and CAFs communicate (Fig. 3D and Supplementary Fig. 4D). Overall, these findings elucidate how cellular crosstalk between CAFs and epithelial cells, driven through mutual extracellular matrix remodeling, contributes to the expression of the p-EMT program, the capacity to engraft, and worse clinical outcomes for patients.

Discussion

HPV-negative HNSCC treatment plans - guided by factors such as anatomical site and staging - are formulated to optimize the curative potential while preserving form and function. As the treatment modalities can have profoundly impactful sequelae (e.g., speech impairment and dysphagia), there is an acute interest in improving risk-tailored clinical decision-making. We previously reported that successful engraftment of surgically resected HPV-negative HNSCC tumor tissue into immune-compromised mice was significantly associated with poor patient outcomes¹². While engraftment could, in theory, be of benefit to clinical decision-making, there remain practical challenges for implementation³⁵. Molecular biomarkers have been used for myriad diseases for applications ranging from risk stratification to predicting response to therapeutic agents. In particular, protein biomarkers that can be detected by IHC are attractive because of the familiarity of this type of information to a pathologist and the importance of considering spatial information. Here, we interrogated the transcriptomic and proteomic characteristics of a cohort of 88 HNSCC tumors to investigate the molecular characteristics of engraftment and to determine whether the relationship between engraftment and clinical outcomes could be recapitulated with biomarkers.

From our molecular profiling data, we identified LAMC2 and TGM3 as top-ranking candidate markers for stratifying clinical outcomes. We validated the relationship between LAMC2 and TGM3 and clinical outcomes by IHC in an independent cohort containing samples from 404 HNSCC patients in a tissue microarray. Analogous to engraftment itself, LAMC2 and TGM3 proved to further stratify patient prognostication beyond the N category - the only currently accepted reliable predictor of patient outcome. We tested whether LAMC2 and TGM3 could be of value in patients without nodal involvement (N category of 0; N0). By combining LAMC2 and TGM3, we identified a subset of N0 patients (29 patients) with poor clinical outcomes – 58.6% and 73.6% 5-year overall and disease-specific survival, respectively. Correspondingly, we identify a substantial number of N0 patients (51 patients) with exceptional clinical outcomes – 87.2% and 100% 5-year overall and disease-specific survival, respectively. The hazard ratios associated with the combination of LAMC2 and TGM3 within N0 patients treated with surgery alone (2.3–11.4) are larger than the hazard ratios associated with the difference between N0 and N2 patients (2.7–5.2). Furthermore, we demonstrate that by considering LAMC2 and TGM3, we can identify subsets of patients with significantly different failure rates, an important consideration for elective neck dissection. These outcome differences are of sufficient magnitude to indicate escalation or de-escalation of treatment plans for patients within these subsets and warrant further investigation and confirmation as a source of information to support clinical decision making. While this is not the first report of the utility of LAMC2³⁶ or TGM3³⁷ as biomarkers for outcomes of HNSCC patients, this is, to our knowledge, the first demonstration that LAMC2 or TGM3 can significantly improve patient stratification beyond N category, and of the benefit of combining the markers. In addition, this represents the largest independent cohort reporting differences for LAMC2 and TGM3 (404 patients compared to 175 or 87 patients, respectively). Finally, we report that LAMC2 and TGM3 allow for patient stratification beyond that afforded by clinical covariates that prognosticate outcomes such as N category, extracapsular extension, and surgical margins. In summary, these analyses highlight LAMC2 and TGM3 as having potential utility to assist risk-tailored clinical decision-making for HNSCC treatment.

Beyond the identification of biomarkers, the relationship between engraftment and poor clinical outcomes suggests there may be therapeutic potential in targeting engraftment-associated pathways and signaling. Our study identified a potential relationship between engraftment and communication between cancer cells and fibroblasts. These findings are in line with previous reports of the significance of fibroblasts in HNSCC^21,23,38. There is an increasing recognition of differences between signaling at the leading edge of tumors compared to the core^18,27. Arora et al. hypothesized that directing the signaling of leading-edge cells towards a tumor core-like state could be the basis for effective anticancer therapeutics²⁷. Our analyses suggest that fibroblast-tumor crosstalk may contribute to the aggressiveness of the leading edge and modulation of this signaling axis represents an alluring avenue of therapeutic intervention in HNSCC.

In summary, we have established that the relationship between HNSCC engraftment and clinical outcomes is reflected by differences in the molecular profile of patient tumors. From the engraftment molecular profile, we identified and validated two protein biomarkers, LAMC2 and TGM3, associated with clinical outcomes and capable of adding to the standard-of-care information used for risk-tailored clinical decision-making. Expanding the cohort size will be imperative for translating the relationship between the markers and outcomes into the clinic, particularly as indicia for adjuvant assignment or neck dissection. Although previous studies have demonstrated similarities between different disease sites¹⁶, further investigation of new and independent HNSCC cohorts is warranted, as most of the patient tumor disease sites in this study were lip and oral cavity cancers. As there are other cancers for which a relationship between engraftment and clinical outcomes has been established, analogous and comparative analyses may reveal generalizable findings about engraftment and tumor aggressiveness.

Methods

Biological materials

Patient tumor samples: Fresh HNSCC tumor samples were collected from patients undergoing surgery at Toronto General Hospital, University Health Network. Informed consent was obtained from all patients in this study, and participants did not receive any form of compensation. All experiments were approved by the Research Ethics Board at University Health Network. Sex and/or gender were not considered for the study design, but self-reported gender was investigated as a covariate. Patient clinical data were extracted from the Anthology of Outcomes³⁹. A fragment of each patient sample was snap-frozen.

Mouse experiments: All animal experiments were performed with the approval of the University Health Network Animal Care Committee and adhered to the Canadian Council on Animal Care guidelines (protocol #1542). NOD.Cg-Prkdc^scidIl2rg^tm1Wjl/SzJ (NSG) mice were bred in-house at the University Health Network Animal Resources Center. The housing is on a 12 h light cycle, and the temperature is set to 21-22 °C with 45–60% relative humidity.

Cell culture: FaDu (pharynx SCC, HTB-43), Detroit-562 (pharynx SCC, CCL-138), SCC-4 (tongue SCC, CRL-1624), SCC-25 (tongue SCC, CRL-1628), and Cal27 (tongue SCC, CRL-2095) cells were from ATCC. Cal33 (tongue SCC, ACC-447) cells were from DSMZ. Normal Oral Epithelium (NOE) cells were from Celprogen (San Pedro, CA; 36063-01). SCC-8 (RRID:CVCL_7781) and SCC-42a (RRID:CVCL_7847), human laryngeal squamous cell cancer cell lines, were kind gifts from R. Grénman (Turku University Hospital, Turku, Finland). The cells were authenticated at the Center for Applied Genomics (Hospital for Sick Children, Toronto, Canada) using the AmpFLSTR Identifiler PCR Amplification Kit (Applied Biosystems) and routinely tested for mycoplasma contamination using the Mycoalert detection kit (Lonza Group Ltd). Cell lines were grown in IMDM supplemented with 10% FBS and PSG (penicillin (20 U/ml)/streptomycin (20 U/ml)/glutamine (60 µg/ml)). NOE cells were cultured in NOE-specific media according to the supplier’s instructions (pre-coated flasks with Human Oral Epithelial Primary Cell Culture Complete Extra-cellular Matrix Cat# E36063-01 and Media with Serum Cat# M36063-01S). All cell lines were maintained in a 5% CO2 environment at 37 °C.

Co-culture: Large numbers of cancer-cell-only and cancer + CAF co-culture spheroids were generated using AggreWell™ Microwell 24-well plates (StemCell Technologies) according to the manufacturer’s guidelines. Briefly, co-cultures were seeded using 600,000 CAFs and 300,000 GFP-labeled Cal33 or Cal27 cells (Creative Bioarray) per well, while monocultures were seeded with 900,000 cancer cells alone. Cells were seeded in IMDM with 10% fetal bovine serum and 1% Penicillin/Streptomycin (Wisent) per well. The plate was spun down at 100 x g for 3–5 minutes and placed in a humidified 37 °C (5% CO2) incubator for 72 h to allow spheroids to fully form and CAFs and cancer cells to interact. Spheroids were collected into 15 mL Falcon tubes using 37 µm reversible strainers (StemCell Technologies), washed twice with PBS, and enzymatically dissociated with 600 µL Accutase™ supplemented with 20 µL collagenase/hyaluronidase (StemCell Technologies). Tubes were warmed to 37 °C for a maximum of 1 h, with occasional flicking of the side of the tube to agitate the mixture. 2 mL of complete media was added to stop the reaction. The dissociated spheroid samples were washed with PBS and filtered through a 70 µm mesh strainer to remove small aggregates. GFP+ cancer cells were isolated using an Aria Fusion fluorescence-activated cell sorter (Becton Dickinson).

Tissue sample preparation

Snap-frozen patient OCT-embedded tissues were first sectioned (8 microns) onto a slide, and H&E staining was performed to assess the proportion of tumor cells present. Samples that contained < 25% tumor cells were excluded. The cohort of patient tumor samples consisted of 51 Engrafters and 37 Non-Engrafters. Serial shavings, 20–40, at 10 µm were cut and collected, alternating tubes for RNA and protein, from each snap-frozen OCT sample for downstream extractions.

Patient-derived xenografts

Patient-derived xenograft tumors were generated as previously described¹². Briefly, under sterile conditions, tumor samples were cut into small pieces (~ 1 mm³), and individual pieces were implanted subcutaneously into the flank of NSG mice. Mice were between six and twelve weeks old when implanted with patient tumors. Two tumor fragments were implanted per mouse (one in each flank) in up to 5 mice. Mice were then monitored weekly for tumor growth and the time of initial palpation was recorded. The maximal tumor burden permitted by the ethics committee was no more than 15 mm in diameter – this was not exceeded. Mice were euthanized when tumors reached 15 mm in diameter or after 6 months. If no tumor formed by 6 months, the patient tumor was defined as Non-engrafter (N).

Transcriptomics sample processing

RNA was extracted from tumor, PDX tissue, or cells using the RNeasy Minikit (Qiagen). After passing QC, the sample library was prepared using the Illumina TruSeq stranded total RNA sample preparation kit at the Princess Margaret Genomics Center. Sequencing used a 100-cycle paired read protocol and multiplexing to obtain ~ 75 million reads/sample on a Novaseq S4 flow cell using XP mode.

Proteomics sample preparation

OCT removal: All steps were performed at 4 °C using pre-chilled solutions unless otherwise noted. Shavings of OCT-embedded surgically resected tumor samples were depleted of OCT using a protocol adapted from published literature^40,41. Briefly, samples were washed two times with 1 mL of 70% ethanol, two times with 1 mL of deionized water, and one time with 1 mL of 50 mM Ammonium Bicarbonate. Samples were air-dried for 5 minutes at room temperature.

SP3 digest and peptide cleanup: 100 µL of SP3 lysis buffer⁴² (100 mM Ammonium Bicarbonate with 1% SDS, 1% TritonX100, 1% NP-40, 1% Tween, 1% Sodium Deoxycholate, 1% glycerol, 50 mM NaCl, and 1x Protease Inhibitor Cocktail) was added to each tube of tissue pieces. Samples were sonicated (VialTweeter; Hielscher Ultrasonics, Teltow, Germany) by three ten-second pulses, set on ice for one minute, and then sonicated by three ten-second pulses. Samples were heated at 95 °C for 5 min using a heat block and then stored at −80 °C until all samples were ready for proteomic digestion. All samples were retrieved from the freezer and thawed on ice. Samples were sonicated by three ten-second pulses, set on ice for one minute, and then again sonicated by three ten-second pulses. Samples were heated at 95 °C for 5 min using a heat block and chilled on ice. Protein concentration was estimated using the Pierce™ BCA Protein Assay Kit according to the manufacturer’s protocol. Samples were brought to 5 mM dithiothreitol (DTT) and reduced for 30 min at 60 °C in a water bath. Samples were brought to 14 mM iodoacetamide for alkylation in the dark for 30 min at room temperature. 50 µg of protein from each sample was transferred to a low attachment round-bottom plate. Volumes were normalized using deionized water, and 10 µL of prewashed magnetic particles (100 µg/µL of 1:1 SeraMag Hydrophilic:SeraMag Hydrophobic) were added to each well and mixed in with trituration. Samples were brought to 70% ethanol. The plate was transferred to a rocker and incubated for 18 min at room temperature. Magnetic particles were collected with a 96-well plate magnetic rack for 2 min, and supernatants were removed. Magnetic particles were washed 2x with 200 μL of 80% ethanol and 1x with 200 μL of acetonitrile.
Cleaned particles were resuspended by adding 100 µL of digest buffer (100 mM ammonium bicarbonate with 2 mM CaCl₂ and 20% (v/v) Invitrosol) and incubating for 10 min at room temperature in an ultrasonicator water bath to assist disaggregation. 2 µg of Trypsin/LysC was added to each well and samples were digested for 16 hr at 37 °C in a humidity chamber. After the digestion, the plate was centrifuged at 1000 × g at 24 °C for 2 min. Magnetic particles were collected with a 96-well plate magnetic rack for 2 min, and supernatants were transferred to low-binding tubes. 40 µL of digested peptides were cleaned using SP2⁴³ with 6 µL of prewashed magnetic particles (100 µg/µL of 1:1 SeraMag Hydrophilic:SeraMag Hydrophobic) and eluted in 60 µL of MS grade water. Peptide concentrations were estimated using the Pierce™ Quantitative Fluorometric Peptide Assay.

Cell line digestion: Each cell line was processed in triplicate. Cells were grown in 10 cm dishes to 80% confluency and washed three times with cold phosphate-buffered saline (PBS, pH 7.4) before cells were pelleted. The cell pellets were lysed in 500 μL of 50% (v/v) 2,2,2-Trifluoroethanol with 100 mM ammonium bicarbonate (pH 8) with repeated freeze-thaw cycles followed by five cycles of pulse sonication (10 s each). The disulphide bonds were reduced using 5 mM dithiothreitol for 30 min at 60 °C, and the reduced disulphide bridges were alkylated with 25 mM iodoacetamide for 30 min at room temperature in the dark. The samples were diluted 1:5 with 100 mM ammonium bicarbonate (pH 8.0) and supplemented with 2 mM CaCl₂. The proteins were digested overnight with 2 µg of trypsin/Lys-C enzyme mix (Promega, Cat# V5072) at 37 °C. The reaction was quenched with the addition of 1% formic acid and the peptides were desalted by C₁₈-based solid phase extraction, then lyophilized in a SpeedVac vacuum concentrator. The peptides were solubilized in mass spectrometer grade 0.1% formic acid in water.

Mass spectrometry data acquisition

LC-MS/MS analysis was performed on a Q Exactive HF (ThermoFisher) coupled to an EASY-nLC™ 1000 System (ThermoFisher). Peptides were loaded on a pre-column (Acclaim™ PepMap™ 100 C18, ThermoFisher) at 740 bar maximum pressure and separated using a 50 cm EASY-Spray column (ES903, ThermoFisher), ramping mobile phase B (0.1% FA in HPLC grade acetonitrile) from 0% to 6% in 5 min, 5% to 24% in 200 min, 24% to 48% in 40 min, interfaced online using an EASY-Spray™ source (ThermoFisher). The Q Exactive HF MS was operated in data-dependent acquisition mode using a loop count of 25 at a full MS resolution of 120,000 with a full scan range of 350–1800 m/z, full MS AGC at 3 × 10⁶, and maximum inject time at 240 ms. Ions for MS/MS were selected using a 1.4 Th isolation window with 0.2 Th offset, AGC at 2 × 10⁵, 55 ms maximum injection time, minimum AGC target of 100, positive charge states of 2–5, 60 s dynamic exclusion, and then fragmented using HCD with 27 NCE. MS/MS scans were collected at a resolution of 30,000 in profile mode.

Mass spectrometry raw data analysis

Raw files were analyzed using FragPipe (v.20.0) using MSFragger^44,45 (v.3.8) to search against a human proteome (Uniprot, 43,392 sequences, accessed 2023-02-08) – canonical plus isoforms. Default settings for LFQ workflow^46,47 were applied using IonQuant⁴⁸ (v.1.9.8) and Philosopher⁴⁹ (v.5.0.0) with the following modifications: Precursor and fragment mass tolerance were specified at − 50 to 50 ppm and 7 ppm, respectively; parameter optimization was disabled; Pyro-Glu or loss of ammonia at peptide N-terminal was included as a variable modification; MaxLFQ min ions was set to 1; MBR RT tolerance was set to 1 min, and MBR top runs was set to 10.

Tissue microarray – Immunostaining and image acquisition

A clinically annotated TMA consisting of over 600 oral cavity cancer tumor tissues (most patients are from 1994–2012) was used to assess the expression of LAMC2 and TGM3. TMAs were sectioned at 4 microns at the UHN-LMP-Pathology Research Program. After deparaffinization and antigen retrieval with Tris-EDTA buffer, pH 9.0, the TMA slides were stained with LAMC2 monoclonal antibody at 1:100 dilution (MA5-24646, ThermoFisher) or TGM3 polyclonal antibody at 1:100 dilution (HPA004728, Sigma Aldrich). The stained slides were scanned at 20X magnification using Aperio ScanScope AT2. The digital images were viewed using Aperio ImageScope.

TMA – Scoring

Each tumor core (duplicate per patient) was scored manually by three scientists. For both LAMC2 and TGM3, the following parameters were scored: (1) the percentage of tumor cells that are positively stained by the biomarker, with 5 bins (0 (no staining), 1 (1–25%), 2 (25–50%), 3 (50–75%), 4 (75–100%)); (2) the intensity of biomarker expression, with 4 bins (0 (no staining), 1 (low), 2 (medium), 3 (high)). An overall score (frequency × intensity) was determined by multiplying the values for each parameter. For TGM3, an additional score for the percentage of cells with a “normal-like” staining pattern (as described in Results) was determined with 4 bins (1 (0–25%), 2 (25–50%), 3 (50–75%), 4 (75–100%)).
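As an illustration of the overall score described above, the following Python sketch maps a percentage of positively stained tumor cells to its frequency bin and multiplies it by the intensity bin. The function names are hypothetical, and the handling of bin boundaries (upper bounds inclusive, any non-zero staining counted as bin 1) is an assumption, since the stated bins share their edges.

```python
def frequency_bin(percent_positive: float) -> int:
    """Map the percentage of positively stained tumor cells to a bin from 0 to 4.

    Assumption: upper bin edges are inclusive (25% -> bin 1, 50% -> bin 2).
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0  # no staining
    for upper_edge, bin_value in ((25, 1), (50, 2), (75, 3), (100, 4)):
        if percent_positive <= upper_edge:
            return bin_value


def overall_score(percent_positive: float, intensity: int) -> int:
    """Overall score = frequency bin (0-4) x intensity bin (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0 (none), 1 (low), 2 (medium) or 3 (high)")
    return frequency_bin(percent_positive) * intensity
```

Under these assumptions the score ranges from 0 to 12 per core; how duplicate cores and the three scorers were combined is not specified here and is left out of the sketch.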

Mass spectrometry statistical analysis

All analyses were performed using the R programming language (v.4.2.2) with the Tidyverse package (tidyverse_1.3.2) unless otherwise specified. All correlation estimates and p-values were calculated using the “cor.test” function. For all experiments, the “MaxLFQ Intensity” columns were extracted from the “combined_protein.tsv” output file from FragPipe (Supplementary Data Table 3 and 5).

Patient tissue cohort: First, the log₂-transformed protein intensities were adjusted to the median cellularity (estimated from H&E staining) using a linear model for proteins quantified in ≥ 3 samples (8384 proteins). Next, proteins were filtered for presence in > 75% of patient samples, resulting in 4382 proteins, and missing values were subsequently imputed with a random forest algorithm using the missForest package (missForest_1.5). Differential expression was estimated using the difference in group means (Engrafters v Non-engrafters), and significance was calculated using a two-sided Wilcoxon rank-sum test. GSEA was applied using a pre-ranked list of Gene Symbols sorted based on estimated fold changes against the Gene Ontology Biological Processes, KEGG Pathways, and Cancer Hallmark gene sets with minimum and maximum sizes of 25 and 200, respectively. Outputs from GSEA were used as inputs to Cytoscape (v.3.9.1). GSEA gene sets were visualized in EnrichmentMap (v.3.3.4) and AutoAnnotate (v.1.3.5).
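One plausible reading of the cellularity adjustment and presence filter described above is sketched below in Python. It fits a per-protein simple linear regression of log₂ intensity on cellularity and removes the fitted cellularity effect relative to the cohort median. The function names and the exact form of the adjustment are assumptions for illustration and are not taken from the study's R scripts.

```python
from statistics import mean, median


def adjust_to_median_cellularity(log2_intensity, cellularity):
    """Remove the fitted linear cellularity effect, re-centred on the median cellularity.

    log2_intensity: per-sample log2 values for one protein (None = not quantified).
    cellularity: per-sample tumor cellularity estimates, in the same order.
    """
    observed = [(c, y) for c, y in zip(cellularity, log2_intensity) if y is not None]
    if len(observed) < 3:
        raise ValueError("need at least 3 quantified samples")
    xs = [c for c, _ in observed]
    ys = [y for _, y in observed]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # Ordinary least-squares slope; zero if cellularity does not vary.
    slope = sum((x - x_bar) * (y - y_bar) for x, y in observed) / sxx if sxx > 0 else 0.0
    reference = median(cellularity)
    return [None if y is None else y - slope * (c - reference)
            for c, y in zip(cellularity, log2_intensity)]


def present_in_enough_samples(values, min_fraction=0.75):
    """True if the protein is quantified in more than min_fraction of samples."""
    quantified = sum(v is not None for v in values)
    return quantified / len(values) > min_fraction
```

After adjustment, each value estimates what the intensity would have been had the sample had the median cellularity, so group differences are less likely to reflect tumor content alone.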

Cell line analysis: First, proteins were filtered for presence in > 50% of lines for at least one cell type, and missing values were subsequently imputed with lower tail imputation (downshift of 1.8 s.d. and width of 0.3 s.d.)⁵⁰. Differential expression and significance were estimated using a linear model.
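Lower tail imputation with the stated parameters can be sketched as follows: missing values are drawn from a normal distribution whose mean is shifted 1.8 standard deviations below the mean of the observed values and whose standard deviation is 0.3 times the observed one. Whether the distribution is estimated per sample or across the whole matrix is not stated above, so the per-vector treatment here is an assumption, as are the function name and the fixed seed.

```python
import random
from statistics import mean, stdev


def impute_lower_tail(values, downshift=1.8, width=0.3, seed=0):
    """Replace None entries with draws from a down-shifted, narrowed normal distribution.

    values: log2 intensities for one sample (None = missing).
    """
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least 2 observed values to estimate the s.d.")
    mu, sd = mean(observed), stdev(observed)
    rng = random.Random(seed)  # fixed seed so the imputation is reproducible
    return [v if v is not None else rng.gauss(mu - downshift * sd, width * sd)
            for v in values]
```

The design intent of this kind of imputation is to treat missing values as signals near the detection limit rather than as values missing at random.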

RNA Sequencing

Tissue and co-culture: Transcript-level abundance was quantified using Kallisto⁵¹ (v.0.46.1) using Gencode.v33 transcript. Transcript abundances were then imported into R using tximport⁵², filtered for protein-coding genes with annotations from org.Hs.eg.db (v.3.16.0). Samples with total counts fewer than mean minus 2.5 s.d. were excluded and then genes with low transcript counts were excluded using filterByExpr function from DESeq2 (v.1.38.1). Using edgeR⁵³ (v. 3.40.0), normalization factors were calculated using calcNormFactors function and counts were converted to counts-per-million (CPM). The log₂-transformed CPM were adjusted to the median cellularity (estimated from H&E staining) using a linear model. Differential expression was estimated using the difference in group means (Engrafters v Non-engrafters), and significance was calculated using a two-sided Wilcoxon rank-sum test.

Paired PDX: Reads from PDX were classified to either mouse or human genomes using Xenome⁵⁴. Reads classified as human were aligned to GENCODE v38 human genome using STAR⁵⁵ (v. 2.7.9a) and counted to exonic features using featureCounts⁵⁶ (Subread v. 2.0.1). Reads from primary tissue were aligned to the human genome (gencode.v30) using STAR⁵⁵ (v. 2.6.1c) and filtered to remove reads that aligned to more than one locus. Reads were counted to exonic features using HTSeq-count⁵⁷ (v. 0.11.0). Samples with total counts fewer than mean minus 3 s.d. were excluded, and then genes with low transcript counts were excluded using filterByExpr function from DESeq2 (v.1.38.1). Using edgeR⁵³ (v. 3.40.0), normalization factors were calculated using calcNormFactors function and counts were converted to counts-per-million (CPM). The ESTIMATE algorithm⁵⁸ was used for the determination of tumor cellular composition. A linear model using log₂(CPM + 1) was used for gene fold change estimation.
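The sample exclusion rule (total counts below the mean minus 2.5 or 3 s.d.) and the log₂(CPM + 1) transform mentioned above can be illustrated with a short Python sketch. It is a simplification: it uses raw library sizes and omits the edgeR normalization factors, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def low_depth_samples(total_counts, n_sd=2.5):
    """Return sample IDs whose total counts fall below mean - n_sd * s.d.

    total_counts: dict mapping sample ID -> total read count.
    """
    totals = list(total_counts.values())
    cutoff = mean(totals) - n_sd * stdev(totals)
    return [sample for sample, total in total_counts.items() if total < cutoff]


def log2_cpm(counts, library_size, prior=1.0):
    """log2(CPM + prior) for one sample, without normalization factors."""
    return [math.log2(c * 1e6 / library_size + prior) for c in counts]
```

For example, with nine samples at 1000 total counts and one at 0, the empty sample is flagged at 2.5 s.d. but not at 3 s.d., which shows how the stricter threshold retains more samples.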

scRNA Seq for cell type assignment of engraftment associated proteins

Processed expression data from Puram et al.¹⁸ was downloaded from Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/) with accession number GSE103322. Data were imported into Seurat⁵⁹ (v.4.3.0). Cells annotated as “lymph” or without a specific cell type annotation were excluded. Data were scaled and normalized using ScaleData and NormalizeData functions, respectively. Differential expression testing was performed using the FindMarkers function.

scRNA-Seq - Data processing, normalization, and annotation

When available, we re-analyzed processed and annotated data provided by the authors of the respective studies. For the Kürten et al.²⁹ dataset, processed gene barcodes from CellRanger were provided. We processed this dataset following the methods described in their manuscript. Briefly, using Seurat (v4.3.0), cells with fewer than 200 genes expressed, more than 5000 genes expressed, or more than 10% mitochondrial genes were filtered out. Data were normalized by scaling expression to 10,000 counts per cell followed by log transformation. The top 2000 highly variable genes were selected and normalized expression values for these genes were scaled, regressing out mitochondrial gene percentage and number of molecules detected per cell. PCA was performed on the scaled data, and a shared nearest neighbors (SNN) graph was computed using the top 10 PCs. Clusters were computed on the SNN graph with resolution = 0.3. Clusters were annotated using reported cell type markers. Malignant cells were identified by performing CNV inference on epithelial cells using inferCNV (v1.17.0) with cells from the peripheral blood leukocyte samples as reference cells.
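The cell-level quality filter and log-normalization described above can be expressed compactly. The sketch below is a plain-Python stand-in for the corresponding Seurat steps, not the Seurat API itself; the function names are hypothetical, and treating the thresholds as inclusive bounds is an assumption.

```python
import math


def passes_qc(n_genes, pct_mito, min_genes=200, max_genes=5000, max_pct_mito=10.0):
    """Keep cells with 200-5000 detected genes and at most 10% mitochondrial reads."""
    return min_genes <= n_genes <= max_genes and pct_mito <= max_pct_mito


def log_normalize(counts, scale_factor=10_000):
    """Scale one cell's counts to scale_factor total counts, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [math.log1p(c * scale_factor / total) for c in counts]
```

Scaling every cell to the same total before the log transform makes expression values comparable between cells with different sequencing depth.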

scRNA-Seq - Pseudobulk differential gene expression analysis

To identify differentially expressed genes (DEGs), we first created pseudobulks by aggregating LAMC2-positive/-negative cells from each patient. Pseudobulk analysis mitigates type I error compared to traditional approaches at the single-cell level, which is important for deriving robust DEGs across patients⁶⁰. DEGs between LAMC2-positive and LAMC2-negative cells were identified using DESeq2⁶¹ (v1.32.0), considering patient identity as a fixed effect with design ‘~ Patient + CellType’. Statistical significance was computed using a Wald test. Log fold changes were shrunk using the ashr method in the lfcShrink function to control for noise in lowly detected genes. DEGs were considered significant if they met the criteria FDR < 0.05 and |log2(FoldChange)| > 1. The fgsea⁵³ package (v1.18.0) was used to perform gene set enrichment analysis with the fgseaMultiLevel function. A ranked list was computed from the results of the differential gene expression test described above. Genes were ranked by log2(FoldChange). The C2, CP, and C5 GO pathways collections were downloaded from MSigDB^62,63 and only pathways containing between 10 and 2000 genes were used.
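The pseudobulk step above amounts to summing raw counts over all cells that share a patient and a LAMC2 status. A minimal Python sketch follows; it assumes a cell is LAMC2-positive when its LAMC2 count is greater than zero, which is an illustrative definition not stated above, and the function names are hypothetical. The DESeq2 modelling itself is not reproduced.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum gene counts per (patient, LAMC2 status).

    cells: iterable of (patient_id, {gene: count}) tuples, one per cell.
    Assumption: a cell is LAMC2-positive if its LAMC2 count is > 0.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for patient, counts in cells:
        status = "LAMC2_pos" if counts.get("LAMC2", 0) > 0 else "LAMC2_neg"
        for gene, n in counts.items():
            totals[(patient, status)][gene] += n
    return {key: dict(genes) for key, genes in totals.items()}


def is_significant_deg(fdr, log2_fold_change, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Apply the stated thresholds: FDR < 0.05 and |log2(FoldChange)| > 1."""
    return fdr < fdr_cutoff and abs(log2_fold_change) > lfc_cutoff
```

Each resulting pseudobulk profile then acts as one observation per patient and group, so the patient term in the design accounts for the pairing.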

Cell-cell communication inference from scRNA-seq data

To achieve robust prediction of ligand-receptor pairs, we used a pipeline that integrates several published tools for scRNA-seq: LIANA⁶⁴, CellPhoneDB v5⁶⁵, Cell2Cell⁶⁶, and CellChat v2⁶⁷. In addition, the interaction databases from LIANA, CellPhoneDB, and CellChat were combined into a single database capturing 6938 interactions. Interactions were inferred on a sample basis, followed by consensus identification where an interaction had to be detected (p < 0.05) in at least three tools, of which one had to be LIANA. We employed two filtering steps to obtain a final list of interactions: (1) interactions detected in at least two patients were retained, and (2) on a sample level, a consensus rank (p-value) across tools was computed using Robust Rank Aggregation⁶⁸. Sample-level results were aggregated by combining p-values using Fisher’s method. The combined p-values were adjusted via the BH procedure. The final list of interactions was obtained by filtering for interactions that fulfill criteria for (1) and have an FDR < 0.05 from (2).
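Three of the statistical steps above (the tool consensus rule, Fisher's method for combining sample-level p-values, and the BH adjustment) can be sketched in plain Python. The Robust Rank Aggregation step is not reproduced; the sketch assumes one p-value per sample is already available, and the function names are hypothetical. Fisher's statistic follows a chi-square distribution with 2k degrees of freedom, whose survival function has a closed form for even degrees of freedom.

```python
import math


def is_consensus(pvalues_by_tool, alpha=0.05, min_tools=3, required_tool="LIANA"):
    """Interaction detected (p < alpha) by at least min_tools tools, including LIANA."""
    hits = {tool for tool, p in pvalues_by_tool.items() if p < alpha}
    return len(hits) >= min_tools and required_tool in hits


def fisher_combine(pvalues):
    """Combine k independent p-values with Fisher's method."""
    if not pvalues or any(not 0 < p <= 1 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)  # (chi-square statistic) / 2
    # Survival function of chi-square with 2k d.f.: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, len(pvalues)):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An interaction would then be retained if it meets the patient-count criterion and its adjusted combined p-value is below 0.05.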

Scoring pEMT signature in spatial transcriptomic data

Processed and annotated Seurat objects for 10x Visium data from the Arora 2023²⁷ study were used to analyze the expression of the pEMT signature reported by Puram et al.¹⁸ For each sample, the relative expression of the pEMT signature was scored using the AddModuleScore function in Seurat (v4.3.0) in spots annotated as SCC by the authors. The relative expression of the pEMT signature was compared across regions (edge, transitory, core), with statistical significance computed using a likelihood ratio test of linear mixed models (LMM) with/without sample ID as a random effect to account for the nested structure in the data. LMMs were implemented using the lme4 R package (v1.1-34), and p-values were calculated using analysis of variance (ANOVA) with a likelihood ratio test between the LMMs with/without sample ID using the Stats R package (v4.2.1). The cumulative log₂(FoldChange) expression of p-EMT genes between edge and core spots was visualized by computing the average expression of the p-EMT genes per sample, log-transforming the values, and subtracting the core value from the edge value to obtain the log₂(FoldChange) for each sample.
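The per-sample edge-versus-core fold change described at the end of the paragraph above reduces to a difference of log-transformed averages. The Python sketch below assumes strictly positive average expression values in both regions; the function name is hypothetical.

```python
import math
from statistics import mean


def edge_core_log2fc(edge_expression, core_expression):
    """log2(mean edge expression) - log2(mean core expression) for one sample.

    Each argument holds the p-EMT gene expression values for the spots
    (or genes) of one region; both averages must be positive.
    """
    edge_avg, core_avg = mean(edge_expression), mean(core_expression)
    if edge_avg <= 0 or core_avg <= 0:
        raise ValueError("average expression must be positive to take a logarithm")
    return math.log2(edge_avg) - math.log2(core_avg)
```

A positive value indicates higher average p-EMT expression at the leading edge than in the core for that sample.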

Marker prioritization

To meet “Lower in E” or “Higher in E” criteria, the mean log₂(E/N) fold change (considering RNA and protein) was < −0.5 or > 0.5, respectively. Hazard ratios were calculated using Cox proportional hazard models on median dichotomized RNA and protein data for four clinical outcomes – disease-specific survival, overall survival, recurrence-free interval, and recurrence-free survival. The “Better clinical outcomes” and “Worse clinical outcomes” criteria were met if > 4 of the eight possible hazard ratios were < 0.67 or > 1.5, respectively. Five ranking criteria were assessed for each candidate that met both filtering criteria – DSS hazard ratio of TCGA cohort (accessed using Xena⁶⁹), median protein intensity (current study), median RNA intensity (current study), Huang et al.¹⁶ HNSCC log₂(Tumor/normal adjacent tissue) RNA and protein. The z-score was calculated for each ranking criterion and then summed for candidate selection. Markers for “Worse survival” and “Better survival” prioritized different directionality for each ranking criterion apart from median intensities of RNA and protein.
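The two filtering criteria and the summed z-score ranking above can be sketched in Python as follows. The function names, the use of the sample standard deviation for z-scores, and the optional sign argument (to flip directionality for “Better survival” markers) are assumptions made for illustration.

```python
from statistics import mean, stdev


def engraftment_direction(log2fc_rna, log2fc_protein, cutoff=0.5):
    """Classify by the mean log2(E/N) fold change across RNA and protein."""
    avg = (log2fc_rna + log2fc_protein) / 2
    if avg > cutoff:
        return "Higher in E"
    if avg < -cutoff:
        return "Lower in E"
    return None


def outcome_class(hazard_ratios, low=0.67, high=1.5):
    """Classify by the eight hazard ratios (4 outcomes x RNA/protein): more than 4 must agree."""
    if sum(hr > high for hr in hazard_ratios) > 4:
        return "Worse clinical outcomes"
    if sum(hr < low for hr in hazard_ratios) > 4:
        return "Better clinical outcomes"
    return None


def summed_z_scores(criteria, signs=None):
    """Sum per-criterion z-scores for each candidate.

    criteria: dict mapping criterion name -> list of values, one per candidate.
    signs: optional dict mapping criterion name -> +1 or -1 to flip directionality.
    """
    n = len(next(iter(criteria.values())))
    totals = [0.0] * n
    for name, values in criteria.items():
        mu, sd = mean(values), stdev(values)
        sign = 1 if signs is None else signs.get(name, 1)
        for i, v in enumerate(values):
            totals[i] += sign * (v - mu) / sd if sd > 0 else 0.0
    return totals
```

Candidates would then be ordered by their summed score, so a marker has to rank well on several criteria at once rather than excel on a single one.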

Data visualization

Unless otherwise specified, plots were generated using R programming language (v.4.2.2) with Tidyverse package (tidyverse_1.3.2) with ggthemes_4.2.4, ggpubr_0.5.0, ggplot2_3.4.0, and ggbeeswarm_0.6.0 packages.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All mass spectrometry raw files and processed result files acquired in this study are publicly available in the MassIVE database: MSV000094507. Processed proteomics data are available in this paper’s Supplementary Tables 3 and 5. Patient RNA sequencing data have been deposited at the European Genome-phenome Archive (EGA): EGAS50000000714. The co-culture RNA sequencing data are deposited at GEO: GSE279481. The processed discovery cohort RNA-Seq data are available in Supplementary Table 5. Processed scRNA-seq data were retrieved from the Gene Expression Omnibus database: accession IDs GSE164690 (Kürten et al.²⁹) and GSE181919 (Choi et al.³⁰). Processed ST data from Arora et al.²⁷ were retrieved from https://doi.org/10.6084/m9.figshare.20304456.v1. Source data are provided in this paper.

Code availability

Analysis scripts and the cell-cell communication pipeline can be found at: https://github.com/Bensonwu02/hnscc_engraftment, https://github.com/GaitiLab/scrnaseq-cellcomm-pipeline.

References

Chaturvedi, A. K. et al. Human papillomavirus and rising oropharyngeal cancer incidence in the United States. J. Clin. Oncol. 29, 4294–4301 (2011).
Zhou, P. et al. Survival outcomes and treatment decision by human papillomavirus status among patients with stage IVC head and neck squamous cell carcinoma. Front. Oncol. 11, 668066 (2021).
Rogers, S. N. et al. Survival following primary surgery for oral cancer. Oral. Oncol. 45, 201–211 (2009).
Larson, D. L., Lindberg, R. D., Lane, E. & Goepfert, H. Major complications of radiotherapy in cancer of the oral cavity and oropharynx. Am. J. Surg. 146, 531–536 (1983).
Osazuwa‐Peters, N. et al. Suicide risk among cancer survivors: Head and neck versus other cancers. Cancer 124, 4072–4079 (2018).
Keysar, S. B. et al. A patient tumor transplant model of squamous cell cancer identifies PI3K inhibitors as candidate therapeutics in defined molecular bins. Mol. Oncol. 7, 776–790 (2013).
Gao, H. et al. High-throughput screening using patient-derived tumor xenografts to predict clinical trial drug response. Nat. Med. 21, 1318–1325 (2015).
Li, H. et al. Proteomic characterization of head and neck cancer patient-derived xenografts. Mol. Cancer Res. 14, 278–286 (2016).
Townsend, E. C. et al. The public repository of xenografts enables discovery and randomized phase II-like trials in mice. Cancer Cell 29, 574–586 (2016).
McAuliffe, P. F. et al. Ability to generate patient-derived breast cancer xenografts is enhanced in chemoresistant disease and predicts poor patient outcomes. PLoS ONE 10, e0136851 (2015).
John, T. et al. The ability to form primary tumor xenografts is predictive of increased risk of disease recurrence in early-stage non–small cell lung cancer. Clin. Cancer Res. 17, 134–141 (2011).
Karamboulas, C. et al. Patient-derived xenografts for prognostication and personalized treatment for head and neck squamous cell carcinoma. Cell Rep. 25, 1318–1331 (2018).
Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457 (2015).
Chi, H. et al. Natural killer cell-related prognosis signature characterizes immune landscape and predicts prognosis of HNSCC. Front. Immunol. 13, 1018685 (2022).
Charap, A. J. et al. Landscape of natural killer cell activity in head and neck squamous cell carcinoma. J. Immunother. Cancer 8, e001523 (2020).
Huang, C. et al. Proteogenomic insights into the biology and treatment of HPV-negative head and neck squamous cell carcinoma. Cancer Cell 39, 361–379 (2021).
Walter, V. et al. Molecular subtypes in head and neck cancer exhibit distinct patterns of chromosomal gain and loss of canonical cancer genes. PLoS ONE 8, e56823 (2013).
Puram, S. V. et al. Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer. Cell 171, 1611–1624 (2017).
Kreis, J., Aybey, B., Geist, F., Brors, B. & Staub, E. Stromal signals dominate gene expression signature scores that aim to describe cancer cell-intrinsic stemness or mesenchymality characteristics. Cancer Res. Commun. 4, 516–529 (2024).
Li, X., González-Maroto, C. & Tavassoli, M. Crosstalk between CAFs and tumour cells in head and neck cancer. Cell Death Discov. 10, 1–16 (2024).
Raudenska, M., Balvan, J., Hanelova, K., Bugajova, M. & Masarik, M. Cancer-associated fibroblasts: Mediators of head and neck tumor microenvironment remodeling. Biochim. Biophys. Acta Rev. Cancer 1878, 188940 (2023).
Prieto-Fernández, L. et al. Dissecting the functions of cancer-associated fibroblasts to therapeutically target head and neck cancer microenvironment. Biomed. Pharmacother. 161, 114502 (2023).
Principe, S. et al. Proteomic analysis of cancer-associated fibroblasts reveals a paracrine role for MFAP5 in human oral tongue squamous cell carcinoma. J. Proteome Res. 17, 2045–2059 (2018).
Qian, W. et al. Tumor purity in preclinical mouse tumor models. Cancer Res. Commun. 2, 353–365 (2022).
Hidalgo, M. et al. Patient-derived xenograft models: an emerging platform for translational cancer research. Cancer Discov. 4, 998–1013 (2014).
Isserlin, R., Voisin, V., Ailles, L. & Bader, G. D. Cell-Cell Interaction Database. Zenodo https://doi.org/10.5281/ZENODO.7589953 (2020).
Arora, R. et al. Spatial transcriptomics reveals distinct and conserved tumor core and edge architectures that predict survival and targeted therapy response. Nat. Commun. 14, 5029 (2023).
de Bree, R. et al. Elective neck dissection in oral squamous cell carcinoma: Past, present and future. Oral. Oncol. 90, 87–93 (2019).
Kürten, C. H. L. et al. Investigating immune and non-immune cell interactions in head and neck tumors by single-cell RNA sequencing. Nat. Commun. 12, 7338 (2021).
Choi, J.-H. et al. Single-cell transcriptome profiling of the stepwise progression of head and neck cancer. Nat. Commun. 14, 1055 (2023).
Banerjee, S. et al. Multiple roles for basement membrane proteins in cancer progression and EMT. Eur. J. Cell Biol. 101, 151220 (2022).
Rousselle, P. & Scoazec, J. Y. Laminin 332 in cancer: When the extracellular matrix turns signals from cell anchorage to cell movement. Semin. Cancer Biol. 62, 149–165 (2020).
Scanlon, C. S., Van Tubergen, E. A., Inglehart, R. C. & D’Silva, N. J. Biomarkers of epithelial-mesenchymal transition in squamous cell carcinoma. J. Dent. Res. 92, 114–121 (2013).
Marinkovich, M. P. Laminin 332 in squamous-cell carcinoma. Nat. Rev. Cancer 7, 370–380 (2007).
Karamboulas, C. & Ailles, L. Patient-derived xenografts: a promising resource for preclinical cancer research. Mol. Cell. Oncol. 6, 1558684 (2019).
Guo, Z. et al. PLAU and LAMC2 can predict a poor prognosis in patients with HNSCC. J. Cancer 14, 1660–1672 (2023).
Wu, X. et al. TGM3, a candidate tumor suppressor gene, contributes to human head and neck cancer. Mol. Cancer 12, 151 (2013).
Hu, C., Zhang, Y., Wu, C. & Huang, Q. Heterogeneity of cancer-associated fibroblasts in head and neck squamous cell carcinoma: opportunities and challenges. Cell Death Discov. 9, 1–10 (2023).
Wong, K. et al. Point-of-care outcome assessment in the cancer clinic: Audit of data quality. Radiother. Oncol. 95, 339–343 (2010).
Zhang, W., Sakashita, S., Taylor, P., Tsao, M. S. & Moran, M. F. Comprehensive proteome analysis of fresh frozen and optimal cutting temperature (OCT) embedded primary non-small cell lung carcinoma by LC–MS/MS. Methods 81, 50–55 (2015).
Holfeld, A., Valdés, A., Malmström, P.-U., Segersten, U. & Lind, S. B. Parallel proteomic workflow for mass spectrometric analysis of tissue samples preserved by different methods. Anal. Chem. 90, 5841–5849 (2018).
Hughes, C. S. et al. Single-pot, solid-phase-enhanced sample preparation for proteomics experiments. Nat. Protoc. 14, 68–85 (2019).
Waas, M., Pereckas, M., Jones Lipinski, R. A., Ashwood, C. & Gundry, R. L. SP2: Rapid and automatable contaminant removal from peptide samples for proteomic analyses. J. Proteome Res. 18, 1644–1656 (2019).
Kong, A. T., Leprevost, F. V., Avtonomov, D. M., Mellacheruvu, D. & Nesvizhskii, A. I. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics. Nat. Methods 14, 513–520 (2017).
Teo, G. C., Polasky, D. A., Yu, F. & Nesvizhskii, A. I. Fast deisotoping algorithm and its implementation in the MSFragger search engine. J. Proteome Res. 20, 498–505 (2021).
Nesvizhskii, A. I., Keller, A., Kolker, E. & Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646–4658 (2003).
Käll, L., Canterbury, J. D., Weston, J., Noble, W. S. & MacCoss, M. J. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat. Methods 4, 923–925 (2007).
Yu, F., Haynes, S. E. & Nesvizhskii, A. I. IonQuant enables accurate and sensitive label-free quantification with FDR-controlled match-between-runs. Mol. Cell. Proteom. 20, 100077 (2021).
da Veiga Leprevost, F. et al. Philosopher: a versatile toolkit for shotgun proteomics data analysis. Nat. Methods 17, 869–870 (2020).
Tyanova, S. et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 13, 731–740 (2016).
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).
Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res. 4, 1521 (2015).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Conway, T. et al. Xenome—a tool for classifying reads from xenograft samples. Bioinformatics 28, i172–i178 (2012).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Anders, S., Pyl, P. T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 4, 2612 (2013).
Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
Squair, J. W. et al. Confronting false discoveries in single-cell differential expression. Nat. Commun. 12, 5692 (2021).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 102, 15545–15550 (2005).
Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
Dimitrov, D. et al. Comparison of methods and resources for cell-cell communication inference from single-cell RNA-Seq data. Nat. Commun. 13, 3224 (2022).
Garcia-Alonso, L. et al. Single-cell roadmap of human gonadal development. Nature 607, 540–547 (2022).
Armingol, E. et al. Inferring a spatial code of cell-cell interactions across a whole animal body. PLoS Comput. Biol. 18, e1010715 (2022).
Jin, S., Plikus, M. V. & Nie, Q. CellChat for systematic analysis of cell-cell communication from single-cell and spatially resolved transcriptomics. Nat. Protoc. https://doi.org/10.1038/s41596-024-01045-4 (2023).
Kolde, R., Laur, S., Adler, P. & Vilo, J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics 28, 573–580 (2012).
Goldman, M. J. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol. 38, 675–678 (2020).

Acknowledgements

Work in the Kislinger and Ailles labs was supported by operating grants from the Canadian Institutes of Health Research (T.K.: PJT173487 and PJT162384; L.A.: PJT168946), the Ontario Institute for Cancer Research (L.A.: IA-016), and the Canadian Research Chair program. M.W. was supported by a CIHR Postdoctoral Fellowship (M.W.: 473129). We would like to acknowledge the Princess Margaret Cancer Centre Head & Neck Translational Research Program, supported by philanthropic funds from the Wharton Family, Joe’s Team, Gordon Tozer, the Reed Fund, the Riley Family, and the Elia Family.

Author information

Authors and Affiliations

Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
Matthew Waas, Christina Karamboulas, Benson Z. Wu, Shahbaz Khan, Stephanie Poon, Jalna Meens, Meinusha Govindarajan, Amanda Khoo, Salvador Mejia-Guerrero, Annie Ha, Lydia Y. Liu, Kevin C. J. Nixon, Joseph Walton, Federico Gaiti, Laurie Ailles & Thomas Kislinger
Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
Benson Z. Wu, Stephanie Poon, Meinusha Govindarajan, Amanda Khoo, Annie Ha, Lydia Y. Liu, Scott V. Bratman, Federico Gaiti, Laurie Ailles & Thomas Kislinger
Radiation Medicine Program, Princess Margaret Cancer Centre, and Department of Radiation Oncology, University of Toronto, Toronto, ON, Canada
Scott V. Bratman & Shao Hui Huang
Department of Otolaryngology-Head and Neck Surgery, Princess Margaret Cancer Centre, and University of Toronto, Toronto, ON, Canada
David Goldstein

Authors

Matthew Waas
Christina Karamboulas
Benson Z. Wu
Shahbaz Khan
Stephanie Poon
Jalna Meens
Meinusha Govindarajan
Amanda Khoo
Salvador Mejia-Guerrero
Annie Ha
Lydia Y. Liu
Kevin C. J. Nixon
Joseph Walton
Scott V. Bratman
Shao Hui Huang
David Goldstein
Federico Gaiti
Laurie Ailles
Thomas Kislinger

Contributions

Conceptualization, M.W., C.K., L.A., and T.K.; Methodology, M.W. and C.K.; Formal Analysis, M.W., B.Z.W., M.G., A.K., L.Y.L., and A.H.; Investigation, M.W., C.K., B.Z.W., and S.P.; Resources, M.W., C.K., and S.M.G.; Data Curation, M.W., C.K., S.K., K.C.J.N., J.M., J.W., and S.H.H.; Writing – Original Draft, M.W.; Writing – Review & Editing, all authors; Visualization, M.W., C.K., and B.Z.W.; Supervision, S.V.B., S.H.H., D.G., F.G., L.A., and T.K.; Project Administration, S.V.B., S.H.H., D.G., F.G., L.A., and T.K.; Funding Acquisition, L.A. and T.K.

Corresponding authors

Correspondence to Laurie Ailles or Thomas Kislinger.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Adel El-Naggar, Jochen Hess, Shawn Li, and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Reporting Summary

Transparent Peer Review file

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Waas, M., Karamboulas, C., Wu, B.Z. et al. Molecular correlates for HPV-negative head and neck cancer engraftment prognosticate patient outcomes. Nat Commun 15, 10869 (2024). https://doi.org/10.1038/s41467-024-55203-z

Received: 02 May 2024
Accepted: 29 November 2024
Published: 30 December 2024
DOI: https://doi.org/10.1038/s41467-024-55203-z