Introduction

Bladder cancer is among the most common malignancies globally, with non-muscle invasive bladder cancer (NMIBC) accounting for the majority of new diagnoses. Intravesical Bacillus Calmette-Guérin (BCG) remains the gold standard for reducing recurrence and progression in intermediate- to high-risk NMIBC patients. Despite its clinical utility, up to 50% of patients experience disease recurrence, highlighting the need for better treatment strategies and a clearer understanding of resistance mechanisms1,2,3.

Prior bulk genomic and transcriptomic studies have provided insights into mutations4, copy-number variations5, and transcriptional signatures6 that may contribute to BCG resistance. However, these methods lack the resolution to elucidate cell type-specific transcriptomic changes and cell-cell interactions leading to recurrence after BCG. In this study, we employed single-cell RNA-sequencing (scRNA-seq) and whole exome sequencing on pre- and post-BCG tumor specimens to investigate the tumor microenvironment and immunologic shifts that drive recurrence.

Our results revealed that although genomic alterations and cellular composition were similar between BCG naïve and recurrent tumors, recurrent samples showed enhanced CD6/ALCAM signaling between T cells and urothelial cells. CD6-high T cells exhibited a gene expression signature of impaired activation, suggesting a mechanism of immune evasion. The significance of this pathway was validated in a large independent cohort, which further implicates CD6/ALCAM as a potential therapeutic target in NMIBC.

Results

Cell type composition of BCG naïve and recurrent tumors

To understand the genomic and molecular drivers of BCG resistance, we implemented a protocol for rapid dissociation and scRNA-seq from cystoscopically resected NMIBC specimens along with parallel whole exome sequencing. Our final cohort consisted of 27 samples from 23 patients, with 14 samples collected prior to BCG treatment (“naïve” samples; n = 13 patients) and 13 samples collected following BCG treatment (“recurrent” samples; n = 12 patients; Table 1, Supplementary Fig. S1). The median follow-up for the naïve group without recurrence was 37.7 months (range 27.6–43.3 months). For the recurrent group, the median time to event was 5.8 months (range 1.8–13.2 months) from last BCG instillation. Within the BCG naïve group, 3 of the 13 patients eventually recurred. One patient was found to have muscle invasive bladder cancer on restaging transurethral resection and was not treated with BCG. Nonetheless, this patient was included in the BCG naïve group to enhance cell type mapping, as scRNA-seq was limited to the intraluminal portion of the tumor. In the other 2 cases, we were able to obtain matched naïve and recurrent specimens, enabling limited longitudinal analysis. Initial somatic profiling of our study cohort (n = 23) revealed strong similarities between BCG naïve and recurrent tumors, including known mutations in FGFR3, PIK3CA, KMT2D, KDM6A, KMT2C, and ARID1A (Fig. 1A)7. CDKN2A loss was frequently identified, affecting 70% of all tumors, and mutational signature analysis showed known signatures of NMIBC including APOBEC (SBS13 and SBS2) (Fig. 1A). Looking further into the genomic alterations, we searched for copy number changes that might be driving recurrence and noted no significant changes between naïve and recurrent samples (all q-values > 0.05, GISTIC); however, commonly reported NMIBC alterations like gain of chromosome 1q and loss of chromosome 9p and 9q were evident (Fig. 1B).

Fig. 1: Genomic and single-cell transcriptional profiling of NMIBC.

A Oncoplot summary of top NMIBC-related genomic alterations separated by two primary groups of interest, BCG naïve (orange and green) and BCG-treated samples acquired after recurrence (recurrent, red). Clinical follow-up further identified two groups within the BCG naïve cohort, those patients who were later treated with BCG but did not recur at the time of follow-up (naïve_wo, green), and those who recurred post-BCG (naïve_w, orange). B Summary of copy number alterations across chromosome arms per sample. Common gain (1q) and loss (9p and 9q) regions for bladder cancer are labeled with red/blue row names. Fraction of copy number alterations, loss of heterozygosity, tumor purity, and tumor ploidy are also presented above the copy number alteration heatmap, with the legend for each unit to the right. C UMAP containing 229,558 cells from 10 benign bladder samples partitioned into three major compartments for reference mapping of NMIBC samples (stromal, immune, and urothelial). Suspected doublets were classified as other. The proportions of each cell class in an individual sample are summarized in a histogram (right). D UMAP containing 250,229 cells from 27 bladder cancer samples partitioned into stromal, immune, urothelial, and proliferating urothelial compartments using the benign bladder as a reference. E Fraction of major cell compartments across NMIBC samples annotated with their BCG treatment status. F–H Re-clustering and detailed naming of the stromal (F), immune (G), and urothelial (H) compartments within NMIBC samples along with representative feature plots (legend colors = average expression).

Table 1 Clinical cohort characteristics

Given the absence of dramatic differences in somatic events between BCG naïve and recurrent samples, we hypothesized that disease recurrence may be transcriptionally regulated, and that single-cell RNA-sequencing (scRNA-seq) may help elucidate the cellular state changes underlying BCG resistance. As NMIBC cells likely harbor aberrant transcriptional profiles depending on their degree of differentiation, we performed scRNA-seq on 10 freshly isolated bladder biopsies away from tumor with normal appearing urothelium and used these samples as a reference atlas to assist with cell type annotation (Fig. 1C, Supplementary Data S1). Within these benign bladder samples, most contained predominantly immune or urothelial cells, with smaller contributions from stromal cell types (Fig. 1C). Detailed naming of each compartment was performed (Supplementary Fig. S2), and then we mapped our cancer cell dataset to corresponding categories of Urothelial, Immune, and Stromal, along with a minor component of Proliferating Urothelial (Fig. 1D). Of these, the urothelial compartment was typically predominant, as expected by tumor sampling procedure which involves biopsy of their intraluminal aspect (Fig. 1E). Next, we isolated and re-clustered each compartment to better resolve individual cell types (Fig. 1F-H). Stromal cells showed expected fibroblast and endothelial cell subtypes, including myofibroblasts and WNT high periurothelial fibroblasts; however, given their limited representation in most samples, these cell populations were generally underpowered for more detailed analysis (Fig. 1E). For the immune compartment, a combination of reference mapping, AI-based naming, and expert curation was used to ensure rigorous classification of a total of 20 immune cell types across 27 samples (Supplementary Fig. S3). 
Lastly, naming of the urothelial compartment was performed by reference mapping to normal urothelial populations and assessment of basal cell and umbrella cell markers such as KRT15, KRT5, UPK2, and UPK1A (Fig. 1H). As expected with the variable differentiation state of NMIBC, we observed multiple clusters of intermediate cell types likely in transitory states between basal and umbrella cell identities (Fig. 1H).

Interferon signaling is broadly activated across cell types after BCG treatment

After harmonizing single cell populations across patients, we sought to identify changes in cell-type abundance that might exist between naïve (n = 14) and recurrent samples (n = 13). When represented as a percentage of their respective compartments, we saw no statistically significant differences in cell type abundance between the naïve and recurrent specimens, with or without FDR correction (Fig. 2A). Without strong evidence of compositional changes after BCG treatment, we reasoned that BCG resistance may be driven by more nuanced changes in the cellular state of individual sub-populations. We performed cell-type specific differential expression to identify genes enriched in recurrent samples relative to naïve samples (Supplementary Data S2). Gene set enrichment analysis of these genes using the Hallmark pathways showed surprisingly broad activation of inflammation-related pathways across compartments, such as tumor necrosis factor alpha (TNFA) signaling, interferon gamma (IFNG) signaling, and interferon alpha (IFNA) signaling (Fig. 2B, Supplementary Data S3). These results suggest that the increased inflammation described in bulk RNA-seq studies of BCG resistance may reflect contributions from a broad collection of cell types.

Fig. 2: Differential cell abundance and differential gene expression between BCG naïve and BCG recurrent NMIBC.

A Differential abundance analysis was performed using the proportion of cells per sample, comparing naïve and recurrent samples (Wilcoxon test; no p-values met the significance threshold of p < 0.05). B Heatmap of enriched Hallmark pathways from GSEA of differentially expressed genes per cell type between naïve and recurrent samples. The color scale represents the normalized enrichment score (NES) for pathways that met the p-value threshold of p < 0.05. Non-significant NES values were not plotted (white). Positive NES values (red) indicate pathway enrichment in recurrent samples. C Log2 fold change of leading-edge genes for the interferon gamma (IFNG) pathway (red = enriched in recurrent, blue = enriched in naïve).

Next, we isolated the leading-edge genes from the IFNG pathway and saw a common activation of antigen presentation genes such as HLA-A, HLA-B, and B2M (Fig. 2C). Many of these genes overlapped with leading-edge genes of the IFNA pathway (Supplementary Fig. S4A), while those in the TNFA pathway appeared to be more closely related to cytokine response and proliferation, such as FOS, JUNB, JUN, and CD69 (Supplementary Fig. S4B). To further resolve how specific IFNG pathway activation is to recurrence, we utilized the 2 patients who had matched samples from the BCG naïve and recurrent settings. For patient 1, who had one naïve sample and two recurrent samples, we observed an increase in interferon pathway gene expression (based on the log2 ratio of IFNG signature genes relative to the naïve baseline) that paralleled the increase observed at the cohort-wide level (Supplementary Fig. S4C). Similarly, in patient 2, who had two naïve samples and one recurrent sample, we saw relatively lower IFNG pathway expression prior to treatment (Supplementary Fig. S4D). Taken together, observations from both unmatched and matched samples showed that BCG recurrence is enriched for immune activation signatures of the IFNG pathway across all cellular compartments.

CD6/ALCAM T cell-urothelial cell interactions are enriched in BCG recurrent samples and associated with T cell inhibition

While the above pathway enrichment analysis identified increased inflammatory signatures during recurrence, it was unclear how NMIBC evades this enhanced immune activation. To surface more nuanced mechanisms of immune resistance related to specific cellular interactions in the microenvironment, we performed cell-cell communication analyses using CellChat8. For an initial overview of interaction changes, we grouped cell types into major categories, Urothelial, Stromal, T cells, B cells, NK cells, Myeloid cells, and Dendritic cells. Within these major groups, we summarized the overall cell-cell interactions by comparing communication patterns in naïve and recurrent samples. We found that recurrent samples had increased Urothelial-to-Urothelial (n = 451), Urothelial-to-T cells (n = 175), and Stromal-to-Urothelial (n = 177) interactions (Fig. 3A). We next focused specifically on outgoing and incoming interactions between T cells and urothelial cells to precisely understand potential alterations in immune recognition and clearance. Across all comparisons in the naïve samples, we saw enrichment for typical immune trafficking and surveillance interactions such as those between MHC class II genes and CD4 as well as a variety of CD44 ligands and CD44, a marker of basal bladder cancer cells that correlates with recurrence, invasion, and poor prognosis9 (Supplementary Fig. S5A). There was also evidence of some immune suppressive interactions between T cells and urothelial cells like TIGIT and NECTIN2. Within the recurrent tumors, we saw high enrichment of ADGRE5 on T cells interacting with CD55 on urothelial cells10, a pathway that has been shown to be associated with tumor progression and stemness across various cancer types. 
Increased communication between urothelial MDK and T-cell NCL was also identified in the recurrent tumors, and this intercellular signaling pathway has been associated with immunosuppressed populations that are PD1 high in other cancers such as triple negative breast cancer11. Despite seeing some mild enrichment of immune exhaustion related pathways, we saw no major expression changes in well described immune checkpoints (e.g., PD1, PDL1, CTLA4, TIGIT) to explain recurrence after BCG treatment (Supplementary Fig. S5B).

Fig. 3: Comparative analysis of cell communication networks in BCG naïve and recurrent samples.

A The differences in the numbers of cell-to-cell interactions predicted by CellChat for naïve and recurrent samples are plotted. Red arrows indicate increased interactions in recurrent samples. Blue arrows indicate decreased interactions in recurrent samples. The thickness of the connecting lines relates to the number of differential interactions, with the largest changes in either direction labeled with exact numbers. B Summary plot of enriched interactions between T cells and urothelial cells (aggregate of T cells-to-Urothelial and Urothelial-to-T cells interactions from Supplementary Fig. S4A) in recurrent samples compared to naïve. C Heatmaps of CD6-ALCAM communication probability score between cell types in naïve (left) and recurrent (right) samples. D Fraction of CD6hi cells in each T cell population in naïve (green) and recurrent (red) samples. The asterisk indicates statistical significance (Mann Whitney U test, Benjamini-Hochberg-adjusted p = 0.018). E Time to disease recurrence post-BCG treatment for the Jong et al. NMIBC cohort (n = 282), stratified by median CD6/ALCAM dual gene signature score. The p-value was derived from a log-rank test.

To better gauge alterations in shared interactions between T cells and urothelial cells, we summed the total interactions that were enriched during recurrence across T cells-to-Urothelial and Urothelial-to-T cells (Fig. 3B). The most upregulated interaction was between CD6 on T cells and ALCAM (CD166) on urothelial cells. In BCG naïve samples, we observed a limited number of strong CD6 and ALCAM interactions, primarily linking CD4 T-cell populations to antigen presenting cells including macrophages, CCL17+ dendritic cells, and Type 2 conventional dendritic cells (left panel in Fig. 3C). In contrast, recurrent tumors exhibited a pronounced shift in the CD6/ALCAM interaction landscape, characterized by diminished T cell-antigen presenting cell interactions and a notable expansion of interactions between multiple T cell subsets (CD4 central memory, CD8 effectors, and CD8 memory T cells) and various urothelial cell populations (right panel in Fig. 3C). NK cells (CD56 dim) also showed increased CD6 interactions broadly across groups. We next sought to determine if the abundance of CD6 expressing T cells changed following recurrence. Classifying cells as CD6-high (CD6hi) vs. CD6-low (CD6lo) based on median expression, we compared naïve and recurrent samples based on the abundance of CD6hi cells within each T cell subset (Fig. 3D). CD4 effector cells showed a significant increase in the proportion of CD6hi cells following recurrence (Mann Whitney U test, Benjamini-Hochberg-adjusted p = 0.018), with proliferating lymphocytes also showing a trend in this direction (Fig. 3D). In addition to changes in abundance, we assessed whether CD6hi vs. CD6lo cells may have differences in gene expression. Notably, CD6hi cells demonstrated a broad loss of activation markers including IL2RB, IRF1, IL2RG, TIGIT, TOX, JAK1, TOX4, and TNFAIP3 (Supplementary Fig. S5C).

High CD6/ALCAM signaling predicts recurrence after BCG

To further understand how the identification of CD6/ALCAM from our single-cell interaction studies might help inform response to BCG, we identified a bulk RNA-seq study6 with n = 282 non-muscle invasive bladder tumors that included time to BCG recurrence. With the hypothesis that CD6/ALCAM high tumors are immunosuppressed via T cell inactivation and are more likely to recur after BCG treatment, we scored each sample for its expression of CD6, ALCAM, or the combined CD6/ALCAM gene set for comparison (Supplementary Data S4). Using individual gene expression of CD6 (Supplementary Fig. S5D) or ALCAM (Supplementary Fig. S5E) with a median cutoff, we observed significantly shorter times to high grade recurrence post-BCG treatment in the high-expression groups (log rank test, p = 0.014 and p = 0.04, respectively). When we stratified the cohort using the combination gene set of CD6/ALCAM, an even stronger effect emerged, with CD6/ALCAM high samples showing significantly shorter time to BCG recurrence than CD6/ALCAM low samples (Fig. 3E, log rank test, p = 0.00059).

Discussion

In this study, we aimed to elucidate the molecular drivers of recurrence after Bacillus Calmette-Guérin (BCG) treatment in non-muscle invasive bladder cancer (NMIBC) through a comprehensive scRNA-seq analysis of freshly isolated tumor specimens collected before and after BCG treatment. In an era of widespread BCG shortage, identifying mechanisms responsible for disease recurrence has become even more critical, so that we can develop therapeutic strategies to augment BCG efficacy or judiciously offer clinical trial enrollment. While there have been multiple attempts to identify mechanisms of recurrence using bulk RNA-seq6,12 and liquid biopsies13,14, we used single-cell analysis of BCG naïve and recurrent NMIBC samples to elucidate potential differences in cell proportion, cell state, and intercellular communication. We identified broad activation of inflammation-related pathways in BCG recurrent samples, particularly interferon gamma (IFNG) signaling, across multiple cellular compartments. The interferon signaling pathway and antigen presentation machinery were both upregulated, suggesting a heightened immune response associated with exposure to BCG. Longitudinal analysis of matched samples further supported the consistent immune activation and interferon type II response in BCG-treated cancers. While it seems counterintuitive to see increased inflammation in recurrent tumors, a prior bulk RNA-seq study also identified increased inflammatory signatures, including IFNG, IL-2, and MHC class I genes, to correlate with BCG resistant samples6. The authors further defined this subset as BRS3 subtype and hypothesized that immune exhaustion may explain their lack of response; but they did not find striking signals of standard immune checkpoints like PD-16. Due to the limitations of bulk RNA-seq, the authors could not resolve cell populations to understand how these inflamed tumors remained resistant to BCG. 
Here we expand on the mechanisms of immune evasion during this broad inflammatory state.

Cell-to-cell communication analysis revealed increased interactions between T cells and urothelial cells in recurrent samples mediated by CD6/ALCAM. We further identified that T cells with high CD6 expression had a reduced activation capacity. CD6 is a transmembrane glycoprotein primarily expressed on the surface of T cells, where it plays a crucial role in modulating T cell activation and function during immune responses and tolerance15,16,17. CD6 interacts with multiple ligands, including ALCAM (CD166), CD44, and CDCP1 (CD318), through its extracellular domain and signals through its cytoplasmic tail by modulating Ras18. Importantly, studies have shown that CD6 signaling reduces immune activation and mutations of its intracellular signaling domain enhanced activation18. The CD6/ALCAM interaction is also involved in T cell adhesion to antigen-presenting cells (APCs) and the formation of immunological synapses, thereby regulating T cell activation and immune responses. The dysregulation of CD6-mediated signaling has been associated with autoimmune diseases, including rheumatoid arthritis, multiple sclerosis, and systemic lupus erythematosus19,20,21,22. CD6 has been proposed as a potential immunotherapeutic target for breast cancer, lung cancer, and prostate cancer23, and CD6+/CD8+/PD-1+ T cells have been shown to be deficient in granzyme, perforin, and interferon gamma production compared to CD6-/CD8+/PD-1+ T-cells24.

Validation of the CD6/ALCAM signature was performed using an independent bulk RNA-seq dataset. Elevated expression of both CD6 and ALCAM was detectable prior to BCG treatment and was significantly associated with worse recurrence-free survival. These findings suggest that T cell dysfunction may be pre-established in the tumor microenvironment and are consistent with prior studies showing an association between an inflamed state prior to BCG and treatment resistance6,25,26,27. While this contrasts with observations in other cancers where inflamed tumors predict better responses, BCG uniquely relies on live bacterial instillation to trigger an innate immune cascade followed by adaptive immunity28. Together with prior evidence implicating CD6 in immune regulation, our results support further investigation of the CD6/ALCAM axis as a potential therapeutic target to improve NMIBC outcomes29.

Another study showed that intravesical BCG exerts a potent systemic effect by reprogramming hematopoietic stem and progenitor cells in both mice and humans30. Notably, their findings, which include upregulation of interferon response genes and enhanced antigen presentation, align with key immune pathways observed in our study. However, while these systemic changes may potentiate anti-tumor immunity, our data suggest that local immunoregulatory mechanisms, such as CD6/ALCAM-mediated T cell suppression, may counteract these beneficial effects and contribute to treatment failure. Further mechanistic studies are warranted to elucidate the complex interplay between BCG-induced central innate immune memory and local immune regulation in determining BCG response.

In addition to T cell–urothelial interactions, stromal elements have also been implicated in bladder tumor immunity. Wang et al. reported that high stromal and immune infiltration scores were associated with worse overall survival in muscle-invasive bladder cancers and in metastatic bladder cancers treated with checkpoint inhibition31. Notably, tumors with high scores also exhibited upregulation of interferon gene expression. Although we observed a prominent IFNG signature in BCG recurrent NMIBC, our ability to assess the contribution of the stromal compartment to this signal was limited by the low representation of fibroblasts and related cell types in our dataset.

While our study provides novel insights into the role of CD6/ALCAM signaling in BCG resistance, several limitations should be acknowledged. First, our single-cell RNA-sequencing analysis was conducted on a modest cohort (n = 23 patients), which may limit the generalizability of our findings. However, the consistency of CD6/ALCAM upregulation across recurrent tumors and its correlation with worse recurrence-free survival in an independent cohort of 282 patients support the robustness of this observation. Second, only two patients had both pre- and post-treatment tumors available for matched longitudinal analysis, which limits our ability to determine whether CD6/ALCAM signaling is both pre-existing and inducible by BCG treatment. Future prospective studies that collect tumors at multiple time points before and after treatment would enable tracking of immune priming, resistance emergence, and treatment-induced remodeling of the tumor microenvironment. This approach could also help clarify whether immune suppression is already present prior to BCG exposure or develops in response to treatment. In addition, integrating blood and urine-based analyses could help identify non-invasive biomarkers predictive of BCG response and allow for real-time immune monitoring throughout the course of therapy. Finally, although we identified associations between CD6hi T cells and BCG treatment failure, our study was not designed to provide conclusions about causality. Future mechanistic studies using in vitro and in vivo models will be essential to establish whether CD6/ALCAM signaling directly contributes to BCG resistance or represents a consequence of other immunosuppressive mechanisms.

In conclusion, our findings shed light on the immune cell interactions potentially driving BCG resistance and highlight a need to further investigate the role of CD6/ALCAM signaling within NMIBC. While cellular interactions were estimated in silico using CellChat8, the bulk RNA-seq analysis strongly indicates a role for this signaling axis in mediating recurrence after BCG treatment. Furthermore, other interactions identified here, such as MDK/NCL32,33 and TIGIT/NECTIN234,35, have been reported in NMIBC and/or pursued as therapeutic targets. Future studies are needed to elucidate how CD6/ALCAM signaling mediates immune suppression in NMIBC and to determine whether disruption of this interaction can effectively reduce recurrence rates and enhance long-term response to BCG.

Methods

Patients, sample collection, treatment, and follow-up

Patients with a new bladder tumor diagnosed by axial imaging and/or cystoscopy were consented for specimen collection and molecular analysis under an Institutional Review Board-approved protocol (14-1222/CASE2815) adhering to U.S. Common Rule guidelines and in accordance with the Declaration of Helsinki at Cleveland Clinic. Tumors were identified with a cystoscope, and a cold cup forceps was used to biopsy the intra-luminal portion. Control specimens were biopsied in a similar fashion from areas with normal appearing urothelium at least 5 cm away from any tumor site or suspicious lesion. Tissue pieces were either snap frozen immediately in liquid nitrogen or dissociated for 10x Genomics 3’ single-cell RNA-seq library preparation as previously described in ref. 36. All patients with T1 HG or multifocal Ta HG disease underwent repeat transurethral resection prior to BCG treatment. All patients received six instillations of BCG (TICE), and surveillance was performed every 3 months with white light cystoscopy and urine cytology.

Whole exome sequencing

Genomic DNA was extracted from frozen tumor tissues and quantified using NanoDrop. The sequencing library was prepared using the SureSelectV5 library kit following the manufacturer’s protocol and sequenced using paired-end 150 bp reads to an average coverage of 187x (range 151x–263x). The sequencing reads were aligned to GRCh38 using STAR. A combination of five mutation callers (Mutect2, SomaticSniper 1.0.4, VarScan v2.4.3, Strelka v2.9.10, and Platypus 0.8.1) was used to identify SNVs. Small insertions and deletions (indels) were determined using Mutect2, VarScan v2.4.3, Strelka v2.9.10, and Platypus 0.8.1. We used the SnpEff variant annotation and effect prediction tool to determine the effect of called variants. Our variant reporting criteria were as follows:

  • Tcov > 10 & Taf ≥ 0.04 & Ncov > 7 & Naf ≤ 0.01 & Tac > 4 were set to pass.

  • Common SNPs were eliminated by comparison to snp142.vcf.

  • Rare variants found in dbSNP were kept if Naf = 0.

  • Variants with Tcov < 20 or Tac < 4 were marked as low confidence.

  • Only variants called by more than 1 caller were reported.
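The reporting criteria above can be sketched as a small classification function. This is an illustrative sketch only, not the pipeline used in the study: the field names, the reading of Tcov/Taf/Ncov/Naf/Tac as tumor coverage, tumor allele fraction, normal coverage, normal allele fraction, and tumor alternate-allele count, and the order in which the rules are applied are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Field meanings are assumed from the abbreviations in the criteria list.
    tcov: int        # Tcov: read coverage in the tumor
    taf: float       # Taf: alternate-allele fraction in the tumor
    ncov: int        # Ncov: read coverage in the matched normal
    naf: float       # Naf: alternate-allele fraction in the normal
    tac: int         # Tac: alternate-allele read count in the tumor
    n_callers: int   # number of callers that reported the variant
    in_dbsnp: bool = False       # hypothetical flag: rare variant present in dbSNP
    is_common_snp: bool = False  # hypothetical flag: listed among common SNPs


def classify(v: Variant) -> str:
    """Return 'reject', 'low_confidence', or 'pass' (rule order is an assumption)."""
    meets_thresholds = (
        v.tcov > 10 and v.taf >= 0.04 and v.ncov > 7 and v.naf <= 0.01 and v.tac > 4
    )
    if not meets_thresholds:
        return "reject"
    if v.is_common_snp:
        return "reject"
    # Rare dbSNP variants are kept only when absent from the normal sample.
    if v.in_dbsnp and v.naf != 0:
        return "reject"
    # Only variants reported by more than one caller are kept.
    if v.n_callers <= 1:
        return "reject"
    # Applied literally; after the Tac > 4 check only the Tcov clause can trigger.
    if v.tcov < 20 or v.tac < 4:
        return "low_confidence"
    return "pass"


print(classify(Variant(tcov=50, taf=0.20, ncov=30, naf=0.0, tac=10, n_callers=3)))
print(classify(Variant(tcov=15, taf=0.20, ncov=30, naf=0.0, tac=10, n_callers=3)))
```

Keeping the low-confidence label separate from rejection lets downstream analyses decide whether to include variants with marginal tumor coverage.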

Common variants in gnomAD v2.1.1 (March 2019 release; gnomAD also incorporates the Exome Aggregation Consortium (ExAC)) were excluded. Additional optimization and filtering were applied for insertions/deletions (indels). Indels in blacklisted regions (https://www.encodeproject.org/annotations/ENCSR636HFF/) and low mappability regions were excluded.

Mutation load was determined as the total number of non-synonymous mutations passing filters. Previously described signatures of mutational processes were determined in each sample using non-negative least-squares regression as provided by the R package deconstructSigs v1.8.0, using the COSMIC signatures as the mutational signature matrix. Mutational profiles were analyzed from MAF files using the R package maftools version 2.12.0. GISTIC2.0 was used to analyze copy number changes from FACETS segment values.

Single-cell RNA-seq mapping and cell naming

FASTQ files were mapped to the GRCh38 reference human genome using Cellranger (v5.0.0). Cells containing fewer than 600 genes and/or more than 30% mitochondrial and ribosomal genes were removed. Sample-specific Seurat objects were created using Seurat (v4.3.0), then normalized using Seurat’s SCTransform. Samples were then split into benign bladder (10 samples) and cancer (27 samples) sets. Each set was integrated independently, based on variable features, using Seurat’s IntegrateData function. Cells from the benign bladder set were then divided into Immune, Stromal, and Urothelial compartments, re-clustered, and named based on marker genes. We also identified and named clusters containing doublets (cell clusters that contain cells with an abnormal number of genes or markers characteristic of multiple different cell types).
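The cell-level quality filter described above can be illustrated with a minimal Python sketch. It is not the Seurat code used in the study: the dictionary keys and example barcodes are hypothetical, and the 30% cutoff is assumed to apply to the combined mitochondrial and ribosomal fraction.

```python
MIN_GENES = 600            # cells with fewer detected genes are removed
MAX_PCT_MITO_RIBO = 30.0   # cells above this percentage are removed (assumed combined)


def passes_qc(cell: dict) -> bool:
    """Keep a cell only if it clears both thresholds."""
    return (
        cell["n_genes"] >= MIN_GENES
        and cell["pct_mito_ribo"] <= MAX_PCT_MITO_RIBO
    )


def filter_cells(cells: list) -> list:
    """Return the subset of cells that pass quality control."""
    return [c for c in cells if passes_qc(c)]


# Toy input with hypothetical barcodes and metrics.
cells = [
    {"barcode": "c1", "n_genes": 1500, "pct_mito_ribo": 12.0},
    {"barcode": "c2", "n_genes": 450, "pct_mito_ribo": 8.0},    # too few genes
    {"barcode": "c3", "n_genes": 2200, "pct_mito_ribo": 41.5},  # too much mito/ribo
    {"barcode": "c4", "n_genes": 600, "pct_mito_ribo": 30.0},   # exactly at both limits
]
kept = filter_cells(cells)
print([c["barcode"] for c in kept])
```

Cells sitting exactly on a threshold are retained here because the text removes only cells with fewer than 600 genes or more than 30%.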

To help name cancer cells, we used Seurat’s label transfer method37 to map the cancer set to the benign bladder set. Provisional names were then used to identify Immune, Stromal, and Urothelial compartments. Proliferating Urothelial cells were assigned to a separate compartment based on the presence of cell proliferation markers. Low quality cells in the Urothelial compartment were further filtered out by removing all cells that contained fewer than 2000 nFeatures per cell. Cells in each compartment were then named in an iterative process that involved manually examining cell markers for each cluster, naming clusters that consisted of a single cell type, and re-clustering the remaining cells using Harmony (v0.1.0).

Differential gene expression, differential abundance, GSEA, and CellChat analyses

Differential gene expression was assessed with Seurat’s FindMarkers function using a negative binomial generalized linear model (test.use = ‘negbinom’) with patient as a covariate to account for multiple samples that came from the same individual. The logfc.threshold parameter was set to 0.05. Other parameters were left at their defaults. Differential expression (DE) analysis was performed only for comparisons in which more than 100 cells were available for analysis in the recurrent and naïve groups.

The output of DE analysis was the input for gene set enrichment analysis, performed using the GSEA function from clusterProfiler (v4.4.4) and HALLMARK gene sets. All genes, regardless of p-value, were sorted from highest to lowest log2 fold change and used for the analysis. Longitudinal analysis of two individuals with tumors taken at multiple time points was used to investigate the change in IFNG related gene sets using ssGSEA scores. Differential abundance analysis was performed by comparing fractions of different cells between groups using the Wilcoxon rank sum test as implemented in R. Cell types with more than 100 cells per group were considered for this analysis. Heat maps were generated using the ComplexHeatmap and circlize packages. To group cells into CD6 (ALCAM) high and CD6 (ALCAM) low groups, we first split cells into those that expressed the gene (count > 0) and those that did not, and filtered out all cell types for which fewer than 100 cells expressed the gene. Next, we found the mean log-normalized gene expression of gene-expressing cells and used that value to separate low from high groups. Cell-to-cell interaction analysis was performed using CellChat 1.6.18. Only cell types present in at least 5 samples and with >100 total cells were included in the comparisons. We summed the significant ligand-receptor pairs for naïve and recurrent samples between Urothelial cells, Urothelial cells and T cells, and Stromal cells and Urothelial cells. Then, we subtracted the naïve sum from the recurrent sum to identify alterations in cellular communications between groups.
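The high/low grouping described above can be sketched as follows. This is an illustrative sketch following the Methods description (threshold at the mean log-normalized expression of expressing cells), not the analysis code itself; the function name, the input dictionaries, the strict "greater than" comparison, and the choice to label non-expressing cells as low are assumptions.

```python
from statistics import mean


def split_high_low(log_expr: dict, raw_counts: dict, min_expressing: int = 100):
    """Label each cell of one cell type 'high' or 'low' for a single gene.

    log_expr   -- cell id -> log-normalized expression of the gene
    raw_counts -- cell id -> raw count of the gene
    Returns None when too few cells express the gene (cell type is filtered out).
    """
    expressing = [cell for cell, count in raw_counts.items() if count > 0]
    if len(expressing) < min_expressing:
        return None
    # Threshold: mean log-normalized expression among expressing cells only.
    threshold = mean(log_expr[cell] for cell in expressing)
    # Assumption: non-expressing cells fall into the 'low' group.
    return {
        cell: "high" if log_expr[cell] > threshold else "low"
        for cell in log_expr
    }


# Toy example with a lowered minimum so that three cells are enough.
log_expr = {"a": 0.0, "b": 1.0, "c": 3.0}
raw_counts = {"a": 0, "b": 1, "c": 5}
print(split_high_low(log_expr, raw_counts, min_expressing=2))
```

In the toy example the threshold is 2.0, the mean of the two expressing cells, so only cell "c" is labeled high.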

Bulk RNA-seq analysis

Validation of the association between the CD6/ALCAM signature and BCG response was performed using the bulk RNA-seq data from Jong et al.6, where gene expression data for primary samples (n = 282) were extracted from Supplemental Data S1 and combined with metadata from Supplemental Data S2. For single gene analysis of CD6 and ALCAM, the median expression value was used to stratify high/low groups. Recurrence-free survival analysis was performed using time to recurrence (“Time_to_HG_recur_or_FUend”) and the censor (“HG_recur_BCG_failure”) with the log-rank method in the R package survminer. The dual signature of CD6/ALCAM was scored using the R package GSVA with the ssGSEA method. The median ssGSEA score was then used to categorize samples into high/low groups for recurrence-free survival analysis.
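The median stratification and log-rank comparison described above can be illustrated with a short, dependency-free Python sketch. It is a stand-in for the R workflow, not a reproduction of it: the function names and toy data are hypothetical, samples exactly at the median are assumed to fall in the low group, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math
from statistics import median


def median_split(scores):
    """True for samples above the median score (ties at the median count as low)."""
    cut = median(scores)
    return [s > cut for s in scores]


def logrank(times, events, is_high):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times   -- follow-up time per sample
    events  -- 1 if recurrence was observed, 0 if censored
    is_high -- group membership per sample
    """
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n_all = sum(1 for ti in times if ti >= t)                       # at risk
        n_high = sum(1 for ti, h in zip(times, is_high) if ti >= t and h)
        d_all = sum(1 for ti, e in zip(times, events) if ti == t and e)  # events at t
        d_high = sum(
            1 for ti, e, h in zip(times, events, is_high) if ti == t and e and h
        )
        frac_high = n_high / n_all
        obs_minus_exp += d_high - d_all * frac_high
        if n_all > 1:
            variance += d_all * frac_high * (1 - frac_high) * (n_all - d_all) / (n_all - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of chi-square with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Toy data: the three highest-scoring samples recur earliest.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = median_split(scores)
chi2, p = logrank(times, events, groups)
print(groups, round(chi2, 3), round(p, 4))
```

For real cohorts an established implementation should be preferred; the sketch only makes the arithmetic behind the group comparison explicit.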