Fig. 3

Prediction of Cluster B using binomial logistic regression. a Using binomial logistic regression penalized by the lasso method, we trained a model on 4546 samples to predict Cluster B. The ROC curve assesses how well the lasso output (the weighted gene sets in Supplementary Data 1) discriminates Cluster B samples from all other samples. b For the 4546 samples in the training set, the heatmap shows whether each sample is assigned to Cluster B (light blue) or to Clusters A and C (purple) by the clustering and by the lasso method. c–e The lasso prediction of the clusters was tested on five cohorts that were not included in the training phase. Three are presented here: c STAM (n = 856), d MAINZ (n = 200), and e UPSA (n = 289); the other two cohorts, CAL and PNC, are presented in the Supplementary Figures. The association between predicted clusters and survival was tested using Kaplan–Meier survival curves for predicted Cluster B (light blue) and predicted Clusters A and C (purple). p values are from log-rank tests. The Kaplan–Meier curves display relapse-free survival for STAM, distant metastasis-free survival for MAINZ, and overall survival for UPSA.
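
For readers unfamiliar with the log-rank test behind the p values in panels c–e, the following is a minimal sketch of a two-group log-rank test in plain Python. It is not the authors' code: the software used for the figure is not stated here, and the function name `logrank_test`, its arguments, and the toy data in the usage note are assumptions made for illustration only.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (hypothetical helper, not from the paper).

    times_*  : follow-up times for each sample in the group
    events_* : 1 if the event (e.g. relapse, death) was observed, 0 if censored
    Returns (chi2, p_value) for a chi-square statistic with 1 degree of freedom.
    """
    # Pool all observations as (time, event, group), with group 1 = group A.
    data = [(t, e, 1) for t, e in zip(times_a, events_a)]
    data += [(t, e, 0) for t, e in zip(times_b, events_b)]

    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # sum over event times of (observed - expected) in A
    variance = 0.0       # hypergeometric variance of that sum
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)                # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 1)   # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)          # events, both groups
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0  # no information to separate the groups

    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

As a usage example under the same assumptions, `logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)` compares a toy group in which every event occurs early against one in which every event occurs late. In the figure, the two groups would be the samples predicted as Cluster B and those predicted as Clusters A and C within one validation cohort.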