Fig. 1

History of the molecular classification of breast cancer. Names are shown at the chronological level at which they were introduced. The Her2+ breast tumors were already well known in the 1990s for their highly favorable response to the drug trastuzumab (Herceptin), which was approved by the FDA for metastatic Her2+ breast cancer in 1998. The hierarchical clustering of Perou et al. [4] used genes whose expression differentiates between samples from different tumors better than between samples from the same tumor, finding four main classes: ERBB2+ (or Her2+), Basal, Luminal, and Normal breast-like. Sorlie et al. [26] explicitly incorporated clinically relevant outcome data such as overall survival, uncovering three Luminal subtypes: Luminal A, B, and C. Overall survival is highest for Luminal A, intermediate for Luminal B, and lowest for Luminal C. Later investigators found only two Luminal subtypes to be sufficiently robust. Parker et al. [27] introduced the 50-gene set that became known as the PAM50 (Prediction Analysis of Microarray), together with a straightforward centroid-based classifier that assigns breast tumor RNA expression patterns on the PAM50 genes to one of five classes: Basal, Her2, Luminal A, Luminal B, and Normal. The authors used this classification as a key component in the model that became the Prosigna predictor of Risk Of Relapse (ROR). Prat and Perou [9] introduced the Claudin-low subtype, carved largely out of the Basal group. The authors found that the Claudin-low subtype has a poorer prognosis than Luminal A, but one no worse than that of the other subtypes. The Topological Data Analysis of Nicolau et al. [17] confirmed the distinction between more luminal, more basal, and more normal-like subtypes along branches of a graph structure modeling the distribution of breast tumor samples. They found a subgroup of patients exhibiting a very high survival rate, largely characterized by expression of MYB.
Our proposed classification uses the method of Nicolau et al. [17] and incorporates gene sets and priors (e.g., the basal-to-luminal stratification) known to be relevant to breast cancer biology. (Below right) Our proposed system, with seven classes defined by four elementary phenotypes (see also Figs. 2 and 7).