Fig. 1

Immune clusters are associated with total immune infiltration. a, d Gene expression was measured in 95 FFPE MicMa (a) and 1904 fresh frozen METABRIC (d) samples. Unsupervised clustering of the correlation matrix, using correlation distance and ward.D linkage, shows the relation between patients according to the expression of the genes on the PanCancer Immune Profiling array. All 760 genes on the array were used for clustering the MicMa cohort, while 509 genes, corresponding to the genes (out of the 760) found in all datasets, were used to cluster the METABRIC cohort. Annotations at the top of the heatmap indicate histopathological features of the samples (PAM50 subtype and ER status) as well as the three clusters identified by the cutree method. b, e In MicMa (b) and METABRIC (e), lymphoid scores quantify lymphoid infiltration and were calculated from a set of lymphocyte marker genes as defined by the Nanodissect algorithm. Lymphoid scores are shown as boxplots by immune cluster, with Kruskal–Wallis test p values. c, f H&E-stained tumor tissue samples (c, MicMa, n = 50; f, METABRIC, n = 1904) were categorized by an experienced pathologist according to the level of tumor-infiltrating immune cells. Boxplots represent the average lymphocyte score from Nanodissect according to the pathologists’ classifications. Kruskal–Wallis test p values are denoted. The line within each box represents the median. The upper and lower edges of each box represent the 75th and 25th percentiles, respectively. The whiskers extend to the lowest datum still within 1.5 × (75th − 25th percentile) of the lower quartile and to the highest datum still within 1.5 × (75th − 25th percentile) of the upper quartile.
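
The boxplot convention described in this legend (median line, box edges at the 25th and 75th percentiles, whiskers at the most extreme data within 1.5 × the interquartile range of the box) can be illustrated with a minimal Python sketch. This is not the code used to draw the figure: the function name `tukey_box_stats`, the example scores, and the choice of quartile interpolation (`method="inclusive"`) are assumptions made only for illustration, and plotting software may compute quartiles slightly differently.

```python
from statistics import median, quantiles


def tukey_box_stats(values):
    """Return the median, box edges, whisker ends and outliers for one group.

    Hypothetical helper: quartiles use the 'inclusive' interpolation method,
    which is an assumption and may differ from the original plotting software.
    """
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # 75th minus 25th percentile

    # Fences sit 1.5 x IQR beyond the box edges.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    # Whiskers stop at the most extreme data points still inside the fences.
    lower_whisker = min(v for v in data if v >= lower_fence)
    upper_whisker = max(v for v in data if v <= upper_fence)

    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Made-up lymphoid scores for a single group (illustrative values only).
example = tukey_box_stats([1, 2, 3, 4, 5, 6, 7, 8, 50])
print(example)
```

In this made-up example the value 50 lies beyond the upper fence, so the upper whisker stops at 8 and 50 would be drawn as an individual point.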