Extended Data Fig. 5: Characterization of the variation in SNV allelic fraction and cellular prevalence across 27 MCF7 strains and their single-cell-derived clones.
From: Genetic and transcriptional evolution alters cancer cell line drug response

a, Top, unsupervised hierarchical clustering of 27 MCF7 strains, based on the allelic fractions of all their SNVs. Groups of strains expected to cluster together based on their evolutionary history are highlighted, as in Fig. 1. Bottom, a corresponding heat map, showing the allelic fractions of all mutations across the 27 MCF7 strains. Mutations that were identified in only a subset of the strains are shown. The presence of a mutation is shown in colour according to its allelic fraction. b, The allelic fractions of an activating PIK3CA mutation (top) and an inactivating TP53 mutation (bottom) across strains. c, Top, unsupervised hierarchical clustering of 27 MCF7 strains based on their SNV cellular prevalence. Groups of strains expected to cluster together based on their evolutionary history are highlighted, as in Fig. 1. Bottom, a corresponding heat map, showing the cellular prevalence of all mutations across the 27 MCF7 strains. Mutations that were identified in only a subset of the strains are shown. The presence of a mutation is shown in colour according to its cellular prevalence. d, The distribution of the maximal differences in cellular prevalence (CP) of non-silent mutations, across 27 MCF7 strains. The peak at maximum ΔCP = 1 represents SNVs that are clonal in at least one strain but are nearly or completely absent in at least one other strain; the peak at maximum ΔCP = 0 represents SNVs that are detected at similar prevalence across all 27 strains; and the peak at maximum ΔCP ≈ 0.1 represents a group of SNVs present at CP ≈ 0.1 only in strain M. e, Description of the MCF7 single-cell-derived clones included in this study, including their parental cell line, genetic manipulations and relationship to one another. f, A heat map showing the allelic fractions of non-silent mutations in three wild-type single-cell-derived MCF7 clones (scWT3–scWT5) and the parental population. The presence of a mutation is shown in colour according to its allelic fraction.
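The maximum ΔCP plotted in panel d can be read as, for each SNV, the largest cellular prevalence observed in any strain minus the smallest. The following is a minimal sketch of that reading, not the authors' pipeline; the function name `max_delta_cp`, the SNV labels and all CP values are hypothetical and chosen only to reproduce the three kinds of peak described in the legend.

```python
def max_delta_cp(cp_by_strain):
    """Return the largest minus the smallest cellular prevalence for one SNV."""
    values = list(cp_by_strain.values())
    return max(values) - min(values)


# Hypothetical CP values for three SNVs across three strains (illustration only).
snvs = {
    "snv_clonal_vs_absent": {"A": 1.0, "B": 0.0, "M": 1.0},  # clonal in some, absent in another
    "snv_shared": {"A": 0.5, "B": 0.5, "M": 0.5},            # similar prevalence everywhere
    "snv_strain_M_only": {"A": 0.0, "B": 0.0, "M": 0.1},     # low prevalence, strain M only
}

# One maximum ΔCP value per SNV; a histogram of these values gives the distribution.
deltas = {name: max_delta_cp(cp) for name, cp in snvs.items()}
```

Under this reading, an SNV contributes to the ΔCP = 1 peak when it is clonal in one strain and absent in another, and to the ΔCP = 0 peak when its prevalence is the same in every strain.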
g, A heat map showing the allelic fractions of non-silent mutations in five genetically manipulated single-cell-derived MCF7 clones. For two of the clones, samples were passaged for a prolonged time and sequenced at multiple time points. The presence of a mutation is shown in colour according to its allelic fraction. h, Comparison of the karyotypic variation between parental and single-cell-derived cell populations. Histograms show the distribution of chromosome numbers from the parental (light grey) and single-cell-derived (dark grey) populations. P values indicate the significance of the differences between the variations (rather than the means) of the populations using a one-tailed Levene’s test (n = 50 metaphases per group). i, Two representative karyotypes of each sample. Note that all single-cell-derived clones are karyotypically heterogeneous. Marker chromosomes are not shown. Arrows point to partially aberrant chromosomes. Images are representative of 50 metaphases counted per sample. j, Two representative karyotypes from two cell populations of the same single-cell-derived clone, separated by six months of culture propagation. Marker chromosomes are not shown. Arrows point to partially aberrant chromosomes. Images are representative of 50 metaphases counted per sample. k, Comparison of the karyotypic variation between two cell populations of the same single-cell-derived clone, separated by six months of culture propagation. Histograms show the distribution of chromosome numbers from the early (light grey) and late (dark grey) populations. Per sample, 50 metaphases were counted. The P value indicates the significance of the difference between the means of the populations using a two-tailed Wilcoxon rank-sum test.
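Panel h compares the spread of chromosome numbers rather than their means. As an illustration of what a Levene-type comparison measures, the sketch below computes the Levene W statistic from absolute deviations around each group mean. It is an assumption-laden example, not the authors' analysis: the function name `levene_w` and the chromosome counts are hypothetical, the choice of mean centring is an assumption, and the P value (which would come from an F distribution with k − 1 and N − k degrees of freedom) is deliberately left out.

```python
from statistics import mean


def levene_w(*groups):
    """Levene W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    # Spread of the group-level mean deviations around the grand mean deviation.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    # Spread of individual deviations around their group mean deviation.
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical chromosome counts per metaphase (illustration only).
parental = [58, 61, 64, 66, 70, 73]
clone = [63, 64, 65, 65, 66, 67]

w = levene_w(parental, clone)  # larger W means more dissimilar spreads
```

Two groups with identical spread give W = 0 even when their means differ, which is why this kind of test suits a comparison of karyotypic variation rather than of average chromosome number.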