Fig. 1: Pipeline for automated image analysis and subsequent clustering of H&E slides.

a Overview of the cohorts used and schematic illustration of the deep neural network-based pipeline for processing FFPE H&E slides. b Schematic examples of tumors with high KL-divergences (restricted patterns) and low KL-divergences (mixed patterns). c Example of an FFPE H&E image and the output of cell detection and classification. Scale bar indicates 5 mm. d Cell density distribution plots of the H&E image shown in c. e Explanatory illustration of the three fractional levels of cell types, namely the percentages of TILs, fibroblasts and tumor cells, and of the three measures that describe the mixing or restriction of the three cell type distributions. The resulting six variables were used to cluster patients based on their H&E slides. f Hierarchical clustering of the patients in the ICGC cohort and the METABRIC cohort, performed separately. g Downstream analyses of molecular characteristics and survival, restricted to ER+HER2− patients.
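
As an illustration of the idea in panel b, the sketch below computes a KL-divergence between the spatial density distributions of two cell types. It is a minimal example under stated assumptions, not the pipeline's actual implementation: it assumes that each slide has been divided into a grid of tiles, that per-tile cell counts are available as flat lists, and that a small smoothing constant is added to avoid division by zero. The function name `kl_divergence`, the parameter `eps`, and the toy counts are hypothetical.

```python
import math


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """Return KL(P || Q) for two lists of per-tile cell counts.

    Assumption: both lists cover the same tiles in the same order.
    Counts are smoothed by `eps` and normalized to probabilities.
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("both count lists must cover the same tiles")
    n = len(p_counts)
    p_total = sum(p_counts) + eps * n
    q_total = sum(q_counts) + eps * n
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        kl += p * math.log(p / q)
    return kl


# Toy 2x2 grids flattened to four tiles (hypothetical counts).
# Mixed pattern: both cell types are spread over the same tiles.
tumor_mixed = [10, 10, 10, 10]
tils_mixed = [5, 5, 5, 5]

# Restricted pattern: the two cell types occupy different tiles.
tumor_restricted = [20, 20, 0, 0]
tils_restricted = [0, 0, 10, 10]

kl_mixed = kl_divergence(tumor_mixed, tils_mixed)
kl_restricted = kl_divergence(tumor_restricted, tils_restricted)
```

In this toy case the mixed pattern gives a divergence close to zero and the restricted pattern gives a much larger value, matching the qualitative distinction drawn in panel b. The absolute values depend on the smoothing constant and the tile size, so only the ordering should be read from this example.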