Fig. 1: Overall characteristics of the PURE-01 pre-treatment cohort.

a Heatmap showing five unsupervised consensus clusters and 1005 differentially expressed genes satisfying p(adj) < 10⁻⁴ and |log2(FC)| > 2. Covariate tracks show response (CR complete response, PR partial response, and NR non-response), PD-L1 (+/−) status from a Dako 22C3 combined positive score (CPS) assay (+ corresponds to ≥10%), and gender. P-values are from two-sided Fisher exact tests and are uncorrected for multiple hypothesis testing. b Fraction of PURE-01 samples in each consensus subtype that had a CR, PR, or NR. The stacked bar at the right shows the overall responses for the cohort. c Kaplan–Meier plot of recurrence for the five PURE-01 MIBC expression subtypes, censored at 24 months, with a log-rank p-value. d PD-L1 +/− status, shown as fractions of samples in each subtype. e Predicted Lund, TCGA, consensusMIBC, and MD Anderson subtypes for the PURE-01 expression subtypes (n = 82). P-values for the covariate tracks are from Fisher exact tests and were Bonferroni-corrected for multiple hypothesis testing (×4). f, g Dot representation of GSEA areas under the curve (AUCs) for selected f MSigDB Hallmark gene sets and g Mariathasan et al. 2018 gene sets for the five subtypes. Enriched (vs. repressed) gene sets are shown as red (vs. blue) discs, with disc areas proportional to the AUCs of the CERNO test results. h For the PURE-01 subtypes, CMap v1.0 connectivity score-rank distributions, with a binary heatmap showing the chemical perturbagens with the most negative scores. The dotted box highlights perturbagens that have large negative connectivity scores.