Introduction

Close to 30% of patients with renal cell carcinoma (RCC) have metastatic disease at time of diagnosis and up to 50% may develop metastases during follow-up or after primary resection1. Historically, patients who developed metastatic RCC had a dismal prognosis with a 5-year survival rate near 10%2. Fortunately, with the advent of targeted therapies such as tyrosine kinase inhibitors (TKIs) and immune checkpoint inhibitors, the outlook of these patients has improved considerably3. However, there remain several unanswered questions for patients with metastatic RCC. First, response rates to immunotherapy at the individual patient level vary, and there are currently no clinically validated genomic biomarkers to predict response4. Second, primary tumors often display a lower responsiveness to immunotherapy than their metastatic sites, with one report demonstrating a 33% partial response rate in the kidney versus a 91% overall response rate in metastatic tumors5,6. Third, the role of the once standard-of-care cytoreductive nephrectomy to reduce tumor burden and improve the effectiveness of systemic therapies remains controversial in the immunotherapy era7. Many of these questions remain unanswered as we have yet to fully characterize the complex tumor immune microenvironment of metastatic RCC. Extensive work from the TRACERx Renal Cohort has identified heterogeneity in driver mutations and copy number variations within not only primary tumors but also matched primary and metastatic tumors8,9. However, fewer studies have assessed the heterogeneity between matched primary and metastatic ccRCC in regard to their immune cell populations. Here, we explored the molecular differences in the tumor immune microenvironment of clear cell renal cell carcinoma by comparing whole transcriptome RNA sequencing of primary renal tumors to their matched metastatic lesions of varying sites.

Methods

Patients, samples, and follow-up

Archival samples were collected and experimental protocols were approved for genome-based studies by the University of Michigan Institutional Review Board (IRB). All experiments were performed in accordance with relevant guidelines and regulations. Informed consent was waived per the IRB as only de-identified archival samples were used. Patients included those with localized ccRCC (pT1-T3) who underwent both radical nephrectomy and either biopsy or resection of one or more metastatic lesions. Tissue was obtained prior to patients receiving any systemic treatment. Hematoxylin and eosin (H&E) slides as well as formalin-fixed paraffin-embedded (FFPE) specimens were reviewed by an anatomic pathologist with genitourinary oncology expertise (R.M.) to confirm stage and Fuhrman grade and to identify areas for molecular profiling. All samples were derived from FFPE diagnostic blocks. At our institution, bone specimens undergo EDTA-based decalcification rather than strong acid–based decalcification; EDTA-based decalcification is compatible with RNA preservation. Our analyses included comparing pooled primary tumors to pooled metastasis samples, comparing lung-only metastases to their matched primary tumors, and comparing bone-only metastases to their matched primary tumors.

Capture whole transcriptome sequencing

RNA was isolated from FFPE specimens as previously described10,11 using the Qiagen Allprep FFPE DNA/RNA kit (Qiagen, Valencia, CA). Total RNA was quantified using a Nanodrop spectrophotometer and RNA quality was assessed using a Bioanalyzer. We calculated DV200 and RNA integrity number (RIN) to measure the degree of RNA degradation in our samples (metrics are listed in the supplementary data table). Exome-capture RNA sequencing was performed as previously described12. Briefly, 1 to 6 µg of total RNA was used as input in a reverse transcription reaction, followed by second-strand DNA synthesis. Libraries were generated using the Sciclone G3 NGS workstation (Perkin Elmer) with the KAPA HT library preparation. Agilent SureSelectXT Human All Exon V5 + lncRNA probes were used for exon capture following the manufacturer’s protocol. Capture transcriptome libraries were analyzed on an Agilent 2100 Bioanalyzer for library size and concentration and sequenced on the Illumina HiSeq 2500 (2 × 126-nucleotide read length), with a sequencing coverage of 40 to 60 million paired reads. Reads that passed the chastity filter of Illumina BaseCall software were used for subsequent analysis.

Following sequencing and base calling, reads were aligned using STAR (2.4.0g1) to the GRCh38.p1 reference genome, using the “basic” version of Gencode 22 to construct the splice junction database. The total number of reads mapping to each locus was counted using featureCounts. The TMM method (default settings) was applied to reads mapped to diploid chromosomes to derive effective library size normalization factors that correct for differences in sequencing depth. Trimming was not necessary due to the lack of detectable contamination and the soft-clipping capability of STAR. Expression of protein-coding genes was quantified by counting the reads overlapping exons of annotated protein-coding genes in strand-specific mode.
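As an illustration of how an effective library size enters the normalized expression values, the following Python sketch converts raw counts to counts per million. It is a minimal sketch, not the authors' pipeline (which used R): the function name cpm, the toy matrix, and the default factors of 1.0 are assumptions, and the TMM factors themselves are not computed here.

```python
def cpm(counts, norm_factors=None):
    """Convert a genes-by-samples matrix of raw counts to counts per million.

    counts: list of rows, one row per gene, one column per sample.
    norm_factors: optional per-sample scaling factors (for example TMM
                  factors computed elsewhere); defaults to 1.0 per sample,
                  which reduces to plain library-size scaling.
    """
    n_samples = len(counts[0])
    if norm_factors is None:
        norm_factors = [1.0] * n_samples
    # Library size = total counts per sample (column sums).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    # Effective library size = library size x normalization factor.
    effective = [lib_sizes[j] * norm_factors[j] for j in range(n_samples)]
    return [[row[j] / effective[j] * 1e6 for j in range(n_samples)]
            for row in counts]


# Hypothetical toy data: three genes (rows) in two samples (columns).
toy_counts = [
    [10, 40],
    [30, 60],
    [60, 100],
]
toy_cpm = cpm(toy_counts)
```

With all factors equal to 1.0, each column of toy_cpm sums to one million; a factor above 1.0 would shrink every CPM value in that sample proportionally.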

Differential gene expression analyses

We used the calcNormFactors function from the R package edgeR13 to perform library size normalization and expressed the data as counts per million (CPM). Samples with a library size of fewer than 5 million reads and genes with normalized CPM less than 2 were excluded from downstream analyses to eliminate poor-quality samples and lowly expressed genes. Paired differential gene expression (DGE) analysis was performed using the R package limma14 with the formula ~ Patient.ID + Sample.type. Metastatic tissue site was added to the design matrix as a covariate to adjust for tissue heterogeneity. Genes with a Benjamini–Hochberg adjusted p-value < 0.05 and an absolute log2 fold change > 0.585 (1.5-fold in linear space) were considered statistically significant.
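The significance thresholds can be made concrete with a short Python sketch. It assumes a plain Benjamini–Hochberg step-up adjustment and shows that the fold-change cutoff of 0.585 is log2(1.5). The function names and toy values are hypothetical, and the limma model fit that produces the p-values is not reproduced.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# A 1.5-fold change in linear space corresponds to log2(1.5), about 0.585.
LFC_CUTOFF = math.log2(1.5)


def is_significant(log2_fc, adj_p, alpha=0.05, lfc_cutoff=LFC_CUTOFF):
    """Apply both thresholds: adjusted p-value and absolute log2 fold change."""
    return adj_p < alpha and abs(log2_fc) > lfc_cutoff


# Hypothetical per-gene p-values and log2 fold changes (not study data).
toy_p = [0.001, 0.04, 0.03, 0.20]
toy_lfc = [1.2, 0.3, -0.9, 2.0]
toy_adj = benjamini_hochberg(toy_p)
calls = [is_significant(lfc, q) for lfc, q in zip(toy_lfc, toy_adj)]
```

In this toy example only the first gene passes both filters: the third gene has a large fold change, but its adjusted p-value rises above 0.05 after correction.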

Gene set enrichment analysis

Gene Set Enrichment Analysis (GSEA)15 was performed using the R packages GSVA16 and fgsea17. We used MSigDB18 to identify Hallmark gene sets. Gene sets with a Benjamini–Hochberg false discovery rate (FDR) < 5% were considered statistically significant.

Immune cell deconvolution

Immune cell deconvolution was performed by feeding non-log transformed counts per million (CPM) normalized data as input for CIBERSORT19, an in-silico flow cytometry tool which estimates the proportions of immune cell types based on gene expression data. CIBERSORT has been validated against traditional laboratory-based assays with a high degree of accuracy19. Statistical significance between groups was assessed using the Wilcoxon rank-sum test.
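To show what the group comparison computes, here is a minimal Python sketch of a two-sided Wilcoxon rank-sum test. It assumes the normal approximation with a tie correction and no continuity correction, so its p-values can differ slightly from the exact or continuity-corrected values that statistical packages report for small samples. The function name and the cell-fraction values are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation with a tie-corrected variance.
    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mid-rank of the tied block (1-based)
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u_x, p_value


# Hypothetical estimated Treg fractions (not study data).
primary_treg = [0.10, 0.12, 0.15, 0.11, 0.14, 0.13]
metastasis_treg = [0.02, 0.03, 0.05, 0.01, 0.04, 0.06]
u_stat, p_val = rank_sum_test(primary_treg, metastasis_treg)
```

Because every hypothetical primary value exceeds every metastasis value, U reaches its maximum of n1 × n2 = 36.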

ESTIMATE

ESTIMATE20 scores, which estimate tumor purity and the presence of stromal and immune cells in tumor tissue, were calculated on non-log transformed CPM normalized data for each tumor sample. Statistical significance between groups was assessed using the Wilcoxon rank-sum test.

Molecular cluster assignments

To assess the concordance of previously identified molecular clusters between primary and metastatic RCC tumors, RNA clusters were assigned to our cohort per methods established on the IMmotion 151 (IMM151) dataset and further validated on the Javelin Renal 101 dataset21,22.

We identified shared genes within our dataset and the 10% most variable genes within IMM151 with ultimately 2731 shared genes identified and included. We then trained a random forest machine learning algorithm (R package, randomForest) on the IMM151 data limited to those shared genes. This achieved an out-of-bag error rate of 17.86%. We then used this model on our data to predict IMM151 molecular clusters for each sample. Given the absence of small nucleolar RNA in our dataset, no samples were assigned to cluster 7.
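The feature selection step can be sketched in Python as follows. This is a sketch under stated assumptions: the most variable genes are ranked by variance in the reference data and then intersected with the genes measured in the study cohort, which is one reading of the description above. The function names and toy data are hypothetical, and the random forest training itself is not shown.

```python
import statistics


def top_variable_genes(expr, fraction=0.10):
    """Return the top `fraction` of genes ranked by variance.

    expr: dict mapping gene name -> list of expression values across samples.
    """
    variances = {gene: statistics.pvariance(vals) for gene, vals in expr.items()}
    n_keep = max(1, int(len(variances) * fraction))
    ranked = sorted(variances, key=variances.get, reverse=True)
    return set(ranked[:n_keep])


def shared_feature_genes(reference_expr, cohort_genes, fraction=0.10):
    """Genes that are highly variable in the reference AND measured in the cohort."""
    return sorted(top_variable_genes(reference_expr, fraction) & set(cohort_genes))


# Hypothetical reference data: 20 genes whose variance grows with the index.
reference = {f"G{i}": [0.0, float(i)] for i in range(1, 21)}
# Hypothetical list of genes measured in the study cohort.
cohort_genes = ["G20", "G5", "G1"]
features = shared_feature_genes(reference, cohort_genes)
```

Here the top 10% of 20 reference genes is two genes (G20 and G19), and only G20 is also present in the cohort, so the classifier would be trained on that single shared feature.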

Survival analysis

We investigated the prognostic impact of CD8+ T cell and T regulatory cell infiltration on ccRCC survival outcomes using three large publicly available ccRCC gene expression data sets: the Clinical Proteomics Tumor Analysis Consortium26 ccRCC cohort (CPTAC; n = 191; Cohort A), The Cancer Genome Atlas (TCGA)27 (Cohort B), and the Seishi Ogawa Japanese ccRCC cohort25 (n = 87; Cohort C). Cohorts A and B are RNA-seq studies, whereas Cohort C is a microarray-based study; we therefore derived CIBERSORT immune deconvolution scores for each cohort separately. Given the heterogeneous nature of the clinical and gene expression data, we also performed survival analyses for each cohort separately. We stratified patients into high and low groups based on the upper quartile (Q3) values of the CD8+ and Treg scores within each cohort. First, we evaluated the prognostic impact of Tregs and CD8+ cells independently by fitting separate Cox proportional hazards models of the form: Surv(time, event) ~ age + sex + grade + stage + cell_type, for each cell type. However, in both cases, the inclusion of either Tregs or CD8+ cells alone did not yield statistically significant associations with overall survival or progression-free survival. Then, we examined the combined effect of CD8+ and Treg abundances (stratified into low-low, low–high, high-low, and high-high groups). We used the R package survival23 to create Kaplan–Meier (KM) curves and to fit multivariable Cox proportional hazards models evaluating the independent prognostic impact of the combined CD8+ and Treg groups on progression-free survival (PFS), disease-specific survival (DSS), and overall survival (OS). Hazard ratios are reported with 95% confidence intervals (CI), and p < 0.05 was considered statistically significant. All downstream data analyses were performed using R statistical software (v 4.1.0, R Core Team 2021).
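The quartile-based stratification can be illustrated with a brief Python sketch. It assumes that a score strictly above the within-cohort Q3 counts as high and that Q3 is obtained by linear interpolation; neither detail is specified above, and the scores are hypothetical. The Cox and Kaplan–Meier fits are not reproduced.

```python
import statistics


def upper_quartile(values):
    """Q3 by linear interpolation between order statistics (an assumption)."""
    return statistics.quantiles(values, n=4, method="inclusive")[2]


def stratify(cd8_scores, treg_scores):
    """Label each patient by CD8 and Treg status relative to cohort Q3."""
    cd8_q3 = upper_quartile(cd8_scores)
    treg_q3 = upper_quartile(treg_scores)
    groups = []
    for cd8, treg in zip(cd8_scores, treg_scores):
        cd8_label = "high" if cd8 > cd8_q3 else "low"
        treg_label = "high" if treg > treg_q3 else "low"
        groups.append(f"CD8 {cd8_label} / Treg {treg_label}")
    return groups


# Hypothetical deconvolution scores for eight patients in one cohort.
cd8 = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40]
treg = [0.08, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]
groups = stratify(cd8, treg)
```

Each patient falls into exactly one of the four combined groups, which can then serve as a categorical covariate in the survival models. Because the cut-off is recomputed within each cohort, the same absolute score may be labeled differently across cohorts.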

Results

Primary ccRCC exhibits a distinct molecular profile compared to matched asynchronous metastases

A total of 42 tumor samples from 19 patients (19 primary tumors with 23 matched metastases) were analyzed (Fig. 1A). Metastasis sites included lung (n = 6), bone (n = 6), adrenal (n = 4), liver (n = 2), lymph node (n = 2), and soft tissue (n = 3). All patients underwent radical nephrectomy; 48% had pT3 disease, 52% had Fuhrman grade 3 tumors, 30% had necrosis, and 17% had angiolymphatic invasion. Additional clinical and pathological information is summarized in Table 1.

Fig. 1

Whole transcriptome analyses of primary and asynchronous metastases in patients with clear cell renal cell carcinoma (ccRCC). A. Capture whole transcriptome sequencing was performed on primary nephrectomy specimens from 19 patients and matched asynchronous metastasis specimens from multiple sites comprising adrenal, bone, liver, lung, lymph node, and soft tissue. B. Principal Component Analysis (PCA) plot demonstrating separation of primary and metastatic tumors based on gene expression variation along the PC1 axis. C. Volcano plot demonstrating the top significantly differentially expressed genes (FDR < 5%) between primary and metastatic ccRCC from differential gene expression (DGE) analysis. D. MSigDB Cancer Hallmark pathways.

Table 1 Clinicopathologic characteristics of the ccRCC patient cohort (n = 19).

We performed capture whole transcriptome sequencing of all primary and metastatic tumors. Principal component analysis (PCA) was performed on 14,971 genes (Fig. 1B) and showed that primary tumors clustered together and separately from metastases of all tissue types, indicating that primary tumors were transcriptomically more similar to one another than to their matched metastases. DGE analysis of 14,971 genes identified 3,440 upregulated and 3,188 downregulated genes (false discovery rate (FDR) < 5%; absolute log2 fold change > 0.585) between primary and metastatic ccRCC tumors (Fig. 1C).

We observed substantial heterogeneity in molecular cluster assignments between primary tumors and matched metastases. Among 24 metastases, only 7 were congruent with their matched primary (three in Angiogenic: Cluster 2, two in Complement/Omega-oxidation: Cluster 3, one in Angiogenic/Stromal: Cluster 1, and one in Proliferative: Cluster 5). This did not seem to be driven by the time to metastasis, which ranged from 1 to 7 years among the seven patients with congruent primary/metastasis pairs and from 1 to 8 years among the remaining patients with discordant pairs. Results of the molecular cluster analysis are shown in Supplementary Fig. 2.

ccRCC metastases harbor genomic features of proliferative disease compared to matched primary tumors

Differential gene expression analyses revealed overexpression of genes linked with aggressive disease in metastases, including DNAJA1 and HRNR. Gene Set Enrichment Analysis (GSEA) showed that metastases were enriched in hallmark gene sets associated with proliferative disease biology compared with primary tumors (Fig. 1D). For example, the G2M checkpoint, E2F targets, protein secretion, and mitotic spindle gene sets were enriched in metastatic tumors compared to primary tumors. The WNT beta catenin, reactive oxygen species, P53, hypoxia, and TNFA signaling gene sets were enriched in primary tumors compared to metastases.

Primary tumors contain a significantly higher proportion of T regulatory cells compared to metastases

We performed immune cell deconvolution using CIBERSORT. ESTIMATE20 scores, which estimate tumor purity and the presence of stromal and immune cells in tumor tissue, were calculated for each tumor type. There were no differences between primary and metastatic sites in stromal, immune, or tumor purity ESTIMATE scores.

The differences in immune cell composition between primary tumors and their matched metastases are displayed in Fig. 2. Primary tumors displayed a significantly higher proportion of immunosuppressive T regulatory cells (Treg) than metastases (p < 0.0001), and this was consistent across all metastatic sites (Supplementary Fig. 1). Primary tumors also had higher proportions of resting dendritic cells, monocytes, resting natural killer (NK) cells, and CD8+ T cells. Metastasis samples displayed higher proportions of M2 macrophages (p = 0.003), with the highest proportions in bone and soft tissue (Supplementary Fig. 1), as well as plasma cells and activated dendritic cells.

Fig. 2

CIBERSORT deconvolution of the tumor immune microenvironment (TIME) differences between primary vs. metastatic ccRCC (all sites). Wilcoxon rank-sum test was used to calculate the p-values displayed.

ccRCC lung metastases

We compared lung-only metastases (n = 6) to their matched primary tumors using DGE analysis, GSEA, and CIBERSORT as described above. Renal primary tumors displayed a significantly higher proportion of T regulatory cells than their matched lung metastases (p = 0.0072; Fig. 3). One of the most significantly enriched genes in lung metastases was the gene encoding the protein hornerin (HRNR), which has been identified as an angiogenesis-promoting protein involved in cancer24 (Fig. 4A). Hallmark gene sets enriched in lung metastases included the G2M checkpoint and E2F targets (Fig. 4B). The paired primary tumors from patients with lung metastases were enriched in hypoxia gene sets.

Fig. 3

CIBERSORT deconvolution of the tumor immune microenvironment (TIME) differences between primary vs. bone metastases and primary vs. lung metastases. Wilcoxon rank-sum test was used to calculate the p-values displayed.

Fig. 4

Comparison of the transcriptomic and tumor immune microenvironment (TIME) differences between paired primary ccRCC tumors and bone or lung metastases. A. Volcano plot reveals differentially expressed genes between paired primary tumors and lung metastases. Genes to the left are enriched in primary tumors and genes to the right are enriched in lung metastases. B. Hallmark pathway analysis demonstrating pathways enriched in either site. C. Volcano plot of differentially expressed genes between paired primary tumors and bone metastases. Genes to the left are enriched in primary tumors and genes to the right are enriched in bone metastases. D. Hallmark pathway analysis demonstrating gene sets enriched in either site.

ccRCC bone metastases

We then compared bone metastases (n = 6) to their matched primary tumors using DGE analysis, GSEA, and CIBERSORT. Bone metastases displayed a higher proportion of M2 macrophages than their paired primary tumors (p = 0.024, Fig. 3).

A notable gene enriched in bone metastases is GPX8 (Fig. 4C). The primary renal tumors from which bone metastases arose displayed high expression of the gene HHLA2 (Fig. 4C). Epithelial-mesenchymal transition was the most significantly enriched gene set in bone metastases (Fig. 4D). There were no significant differences between the primary tumors that metastasized to lung and those that metastasized to bone, and the primary tumors remained more similar to other renal primaries than to their matched metastases, despite metastasizing to different sites (Fig. 3).

A high Treg and low CD8+ T cell ratio is associated with reduced progression-free survival

We investigated the prognostic impact of CD8+ and T regulatory cell infiltration on ccRCC survival outcomes using three large publicly available ccRCC gene expression data sets as described above. In cohort A, Kaplan–Meier survival analysis demonstrated worse PFS in patients with low CD8+/high Treg scores, and after adjusting for clinicopathologic variables, a high Treg score was significantly associated with worse PFS (Fig. 5A-B). However, lower CD8+ or Treg infiltration was not associated with OS (Fig. 5C-D). In cohort B, the combination of low CD8+ and high Treg infiltration was associated with reduced disease-specific survival (Figure S3C), but individually, neither CD8+ nor Treg infiltration was independently associated with survival outcomes (Figure S3B,D). In cohort C, we also observed worse OS among low CD8+/high Treg patients (Figure S4C), and a high Treg score was independently associated with worse OS after adjusting for clinicopathologic variables (Figure S4D).

Fig. 5

Testing the impact of immune cell infiltration in CPTAC ccRCC (n = 191). We utilized RNA-seq data with oncologic outcomes from the Clinical Proteomics Tumor Analysis Consortium ccRCC cohort (CPTAC; n = 191). Upper quartile values (Q3) were used as cut-offs for CIBERSORT scores (CD8 high vs low; Tregs high vs low). A, B Progression Free Survival (PFS). Kaplan–Meier survival curve analysis demonstrated worse PFS in patients with low CD8 and high Treg scores. Multivariable Cox proportional hazard analyses adjusting for clinicopathologic variables showed that high Tregs were independently associated with worse PFS. C, D Overall Survival (OS). Kaplan–Meier survival curve analysis did not demonstrate a significant association of CD8 or Tregs with OS. In multivariable Cox proportional hazard analyses adjusting for clinicopathologic variables, only tumor stage was associated with OS.

Discussion

Characterization of the tumor immune microenvironment is an important and evolving area of research in RCC. Here, we performed whole transcriptome RNA sequencing of a cohort of patients with matched primary and metastatic ccRCC and demonstrated heterogeneity in the immune microenvironment not only between primary and metastatic tumors but also across different sites of metastases. In this cohort, ccRCC metastases harbored transcriptomic features of aggressive disease compared to matched primary tumors. However, primary tumors displayed a potentially more immunosuppressive microenvironment than metastases, including high Treg and low CD8+ T cell infiltration, which were independently associated with reduced survival in publicly available genomic datasets. Additionally, we found significant differences in gene expression, gene set enrichment, molecular clusters, and immune cell composition between metastatic sites and primary ccRCC.

Differential gene expression analyses showed greater expression of genes linked to aggressive disease in our metastasis samples. For example, DNAJA1, a member of the heat shock protein 40 family, has been shown to prevent proteasomal degradation of mutant TP53 protein28. TP53 is the most commonly mutated gene in human cancer, and mutant TP53 can promote tumor progression through a variety of mechanisms, including facilitation of a pro-oncogenic tumor microenvironment by altering the secretion of pro-inflammatory cytokines29. HRNR is involved in vascular invasion through angiogenesis and in poor tumor differentiation30. It is possible that these genes could influence development of RCC metastases and serve as potential therapeutic targets.

Evasion of antitumor immunity, or immune escape, is purported to be a key driver of primary tumor growth. Thus, our finding of a more immunosuppressive primary tumor microenvironment is consistent with a loss of immunosurveillance that can lead to tumor proliferation. The most striking difference in the tumor microenvironment supporting this finding is the higher concentration of T regulatory cells in primary ccRCC compared to metastases across all tissue types. T regulatory cells play a vital role in the immune system with regard to tolerance to self-antigens, preventing autoimmunity31. This mechanism, however, can similarly lead to tolerance of tumor antigens, suppressing innate antitumor immunity and potentially dampening immunotherapeutic response32. Emerging research also suggests the role of tumor infiltrating Tregs in responsiveness to immune checkpoint inhibitor (ICI) therapy. As Tregs often express CTLA-4 and PD-1 receptors, depletion of immunosuppressive Tregs may account for part of the therapeutic mechanism of these agents33. Our finding of increased intratumoral Tregs in primary vs metastatic tumors could provide a mechanism for reduced immunotherapy responsiveness in primary ccRCC compared to metastatic sites.

When further evaluating the most common metastatic sites in renal cell carcinoma, lung and bone, we found further deviation from their matched primary tumors. Lung metastases displayed gene set enrichment in the G2M checkpoint and targets of E2F transcription factors. The G2M checkpoint is important for DNA repair and is dysregulated in a variety of cancers34. High levels of E2F transcription factors have been associated with tumor aggressiveness, and similarly, in breast cancer, higher levels of E2F-associated genes were seen in metastases than in primary tumors35. Renal primary tumors displayed a significantly higher proportion of T regulatory cells than their matched lung metastases, and bone metastases had a higher proportion of immunosuppressive M2 macrophages than their matched primary renal tumors. These differences in the immune microenvironment may account for some of the clinical findings in patients with metastases to either site. It has been previously shown that regardless of their primary tumor of origin, lung metastases display a consistently higher immunogenic score36. This is in stark contrast to bone, which tends to be a more immunocompromised microenvironment due to the presence of immature and inhibitory immune cell types37. Clinically, patients with ccRCC metastatic to the bone tend to have worse cancer outcomes38. In addition, the response of bone metastases to immune checkpoint therapy is mixed, with one report showing site-specific overall response rates to nivolumab in metastatic RCC of 36% in lung compared to only 5% in bone39. The primary renal tumors from which bone metastases arose also displayed high expression of the gene HHLA2. This gene is not generally highly expressed in normal kidney tissue40. In cancers, it is a newly discovered immune checkpoint that has both immunostimulatory and immunosuppressive functions41. Further insights into the TIME of RCC metastatic to bone may help direct appropriate therapy and improve outcomes for these patients.

Prior studies have demonstrated worse survival in ccRCC patient tumors with a lower CD8+ T cell/Treg ratio42. Here, we also found worse PFS and DSS in patients with low CD8+/high Treg scores and this association was independent of clinicopathologic variables in our cohorts A and C, respectively. However, tumor stage remained much more prognostic for survival outcomes in TCGA (Cohort B). Further research is necessary to determine how additional insights into the tumor immune microenvironment of RCC can influence prognostic modeling, independent of common clinicopathologic variables.

A limitation of our study is inherent in the nature of bulk tumor sequencing, which analyzes a mixture of tumor cells, normal cells, and stromal cells present in a tissue sample. Thus, it can be difficult to distinguish gene expression in normal tissues from changes specific to the tumor. To mitigate this, we added the metastatic tissue site as a covariate in our analyses. Also, while we hypothesize that differences in the TIME may influence responsiveness to therapy, this study did not include tumors treated with immunotherapy, and this will require further investigation. Despite these limitations, our study has several potential clinical implications. First, separate biopsies of primary and metastatic tumors may be needed to capture the overall disease genomic landscape given tumor heterogeneity. Second, an enriched immunosuppressive T regulatory cell microenvironment in primary ccRCC could provide a biologic rationale for the reduced responsiveness to immunotherapy compared to metastases and a role for cytoreductive nephrectomy. Finally, the heterogeneous tumor immune microenvironment across different metastatic sites suggests that a multimodal or combinatorial treatment approach may be warranted in metastatic ccRCC.

Conclusions

Our study provides insights into the heterogeneity between primary ccRCC and their matched metastases, including the observation of an immunosuppressive T regulatory cell-enriched TIME in primary tumors. These findings highlight the need to develop diagnostic tools and treatment paradigms that overcome tumor heterogeneity in metastatic renal cell carcinoma.