Abstract
Single-cell sequencing has characterized cell state heterogeneity across diverse healthy and malignant tissues. However, the plasticity or heritability of these cell states remains largely unknown. To address this, we introduce PATH (phylogenetic analysis of trait heritability), a framework to quantify cell state heritability versus plasticity and infer cell state transition and proliferation dynamics from single-cell lineage tracing data. Applying PATH to a mouse model of pancreatic cancer, we observed heritability at the ends of the epithelial-to-mesenchymal transition spectrum, with higher plasticity at more intermediate states. In primary glioblastoma, we identified bidirectional transitions between stem- and mesenchymal-like cells, which use the astrocyte-like state as an intermediary. Finally, we reconstructed a phylogeny from single-cell whole-genome sequencing in B cell acute lymphoblastic leukemia and delineated the heritability of B cell differentiation states linked with genetic drivers. Altogether, PATH replaces qualitative conceptions of plasticity with quantitative measures, offering a framework to study somatic evolution.
Data availability
The published pancreatic cancer mouse data were downloaded from the NCBI Gene Expression Omnibus (GEO) accession GSE173958. The published GBM data are available from GEO under accession number GSE151506. The gliomasphere scRNAseq data are available from GEO under accession number GSE273357. For the B-ALL data, all SNV and CNV variant calls, as well as all trees, can be found at https://doi.org/10.5281/zenodo.13143937. To protect pediatric patient privacy, unprocessed single-cell whole-genome sequences are available to qualified researchers upon request to D.A.L.
Code availability
The code used to measure phylogenetic correlations and to infer cell state transitions is available as part of our PATH R software package at https://github.com/landau-lab/PATH and archived at https://doi.org/10.5281/zenodo.13144052 (ref. 93). Scripts for simulations, benchmarking and data analysis are available at https://github.com/landau-lab/PATHpaper and are archived at https://doi.org/10.5281/zenodo.13143937 (ref. 94).
References
Wu, F. et al. Single-cell profiling of tumor heterogeneity and the microenvironment in advanced non-small cell lung cancer. Nat. Commun. 12, 2540 (2021).
Neftel, C. et al. An integrative model of cellular states, plasticity and genetics for glioblastoma. Cell 178, 835–849 (2019).
Chaligne, R. et al. Epigenetic encoding, heritability and plasticity of glioma transcriptional cell states. Nat. Genet. 53, 1469–1479 (2021).
Barkley, D. et al. Cancer cell states recur across tumor types and form specific interactions with the tumor microenvironment. Nat. Genet. 54, 1192–1201 (2022).
Gavish, A. et al. Hallmarks of transcriptional intratumour heterogeneity across a thousand tumours. Nature 618, 598–606 (2023).
Jan, M. et al. Clonal evolution of preleukemic hematopoietic stem cells precedes human acute myeloid leukemia. Sci. Transl. Med. 4, 149ra118 (2012).
Miles, L. A. et al. Single-cell mutation analysis of clonal evolution in myeloid malignancies. Nature 587, 477–482 (2020).
Zeng, A. G. X. et al. A cellular hierarchy framework for understanding heterogeneity and predicting drug response in acute myeloid leukemia. Nat. Med. 28, 1212–1223 (2022).
Simeonov, K. P. et al. Single-cell lineage tracing of metastatic cancer reveals selection of hybrid EMT states. Cancer Cell 39, 1150–1162 (2021).
Shaffer, S. M. et al. Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance. Nature 546, 431–435 (2017).
Goyal, Y. et al. Diverse clonal fates emerge upon drug treatment of homogeneous cancer cells. Nature 620, 651–659 (2023).
Baron, C. S. & van Oudenaarden, A. Unravelling cellular relationships during development and regeneration using genetic lineage tracing. Nat. Rev. Mol. Cell Biol. 20, 753–765 (2019).
Sankaran, V. G., Weissman, J. S. & Zon, L. I. Cellular barcoding to decipher clonal dynamics in disease. Science 378, eabm5874 (2022).
Raj, B. et al. Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain. Nat. Biotechnol. 36, 442–450 (2018).
Spanjaard, B. et al. Simultaneous lineage tracing and cell-type identification using CRISPR-Cas9-induced genetic scars. Nat. Biotechnol. 36, 469–473 (2018).
Pei, W. et al. Resolving fates and single-cell transcriptomes of hematopoietic stem cell clones by PolyloxExpress barcoding. Cell Stem Cell 27, 383–395 (2020).
Rodriguez-Fraticelli, A. E. et al. Single-cell lineage tracing unveils a role for TCF15 in haematopoiesis. Nature 583, 585–589 (2020).
Wang, F. et al. MEDALT: single-cell copy number lineage tracing enabling gene discovery. Genome Biol. 22, 70 (2021).
Salehi, S. et al. Cancer phylogenetic tree inference at scale from 1000s of single cell genomes. Peer Community J. 3, e63 (2023).
Lodato, M. A. et al. Somatic mutation in single human neurons tracks developmental and transcriptional history. Science 350, 94–98 (2015).
Ludwig, L. S. et al. Lineage tracing in humans enabled by mitochondrial mutations and single-cell genomics. Cell 176, 1325–1339 (2019).
Gaiti, F. et al. Epigenetic evolution and lineage histories of chronic lymphocytic leukaemia. Nature 569, 576–580 (2019).
DeTomaso, D. & Yosef, N. Hotspot identifies informative gene modules across modalities of single-cell genomics. Cell Syst. 12, 446–456 (2021).
Minkina, A., Cao, J. & Shendure, J. Tethering distinct molecular profiles of single cells by their lineage histories to investigate sources of cell state heterogeneity. Preprint at bioRxiv https://doi.org/10.1101/2022.05.12.491602 (2022).
Jones, M. G., Rosen, Y. & Yosef, N. Interactive, integrated analysis of single-cell transcriptomic and phylogenetic data with PhyloVision. Cell Rep. Methods 2, 100200 (2022).
Yang, D. et al. Lineage tracing reveals the phylodynamics, plasticity and paths of tumor evolution. Cell 185, 1905–1923 (2022).
Fang, W. et al. Quantitative fate mapping: A general framework for analyzing progenitor state dynamics via retrospective lineage barcoding. Cell 185, 4604–4620 (2022).
Wang, S.-W., Herriges, M. J., Hurley, K., Kotton, D. N. & Klein, A. M. CoSpar identifies early cell fate biases from single-cell transcriptomic and lineage information. Nat. Biotechnol. 40, 1066–1074 (2022).
Gillespie, J. H. Population Genetics (Johns Hopkins University Press, 2004).
Blomberg, S. P. & Garland, T. Jr. Tempo and mode in evolution: phylogenetic inertia, adaptation and comparative methods. J. Evol. Biol. 15, 899–910 (2002).
Pagel, M. Inferring the historical patterns of biological evolution. Nature 401, 877–884 (1999).
Househam, J. et al. Phenotypic plasticity and genetic control in colorectal cancer evolution. Nature 611, 744–753 (2022).
Blomberg, S. P., Garland, T. Jr. & Ives, A. R. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 57, 717–745 (2003).
Gittleman, J. L. & Kot, M. Adaptation: statistics and a null model for estimating phylogenetic effects. Syst. Zool. 39, 227 (1990).
Chan, M. M. et al. Molecular recording of mammalian embryogenesis. Nature 570, 77–82 (2019).
Feng, J. et al. Estimation of cell lineage trees by maximum-likelihood phylogenetics. Ann. Appl. Stat. 15, 343–362 (2021).
Louca, S. & Doebeli, M. Efficient comparative phylogenetics on large trees. Bioinformatics 34, 1053–1055 (2018).
Hermisson, J., Redner, O., Wagner, H. & Baake, E. Mutation-selection balance: ancestry, load and maximum principle. Theor. Popul. Biol. 62, 9–46 (2002).
Baake, E. & Georgii, H.-O. Mutation, selection and ancestry in branching models: a variational approach. J. Math. Biol. 54, 257–303 (2007).
Maddison, W. P., Midford, P. E. & Otto, S. P. Estimating a binary character’s effect on speciation and extinction. Syst. Biol. 56, 701–710 (2007).
Louca, S. & Pennell, M. W. A general and efficient algorithm for the likelihood of diversification and discrete-trait evolutionary models. Syst. Biol. 69, 545–556 (2020).
Celentano, M., DeWitt, W. S., Prillo, S. & Song, Y. S. Exact and efficient phylodynamic simulation from arbitrarily large populations. Preprint at https://arxiv.org/abs/2402.17153 (2024).
Shibue, T. & Weinberg, R. A. EMT, CSCs and drug resistance: the mechanistic link and clinical implications. Nat. Rev. Clin. Oncol. 14, 611–629 (2017).
Dongre, A. & Weinberg, R. A. New insights into the mechanisms of epithelial–mesenchymal transition and implications for cancer. Nat. Rev. Mol. Cell Biol. 20, 69–84 (2019).
Lüönd, F. et al. Distinct contributions of partial and full EMT to breast cancer malignancy. Dev. Cell 56, 3203–3221 (2021).
Thiery, J. P. Epithelial–mesenchymal transitions in tumour progression. Nat. Rev. Cancer 2, 442–454 (2002).
Lambert, A. W., Pattabiraman, D. R. & Weinberg, R. A. Emerging biological principles of metastasis. Cell 168, 670–691 (2017).
van Dijk, D. et al. Recovering gene interactions from single-cell data using data diffusion. Cell 174, 716–729 (2018).
Pastushenko, I. & Blanpain, C. EMT transition states during tumor progression and metastasis. Trends Cell Biol. 29, 212–226 (2019).
McFaline-Figueroa, J. L. et al. A pooled single-cell genetic screen identifies regulatory checkpoints in the continuum of the epithelial-to-mesenchymal transition. Nat. Genet. 51, 1389–1398 (2019).
Puram, S. V. et al. Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer. Cell 171, 1611–1624 (2017).
Yang, J. et al. Guidelines and definitions for research on epithelial–mesenchymal transition. Nat. Rev. Mol. Cell Biol. 21, 341–352 (2020).
Pastushenko, I. et al. Identification of the tumour transition states occurring during EMT. Nature 556, 463–468 (2018).
Brown, M. S. et al. Phenotypic heterogeneity driven by plasticity of the intermediate EMT state governs disease progression and metastasis in breast cancer. Sci. Adv. 8, eabj8002 (2022).
Fustaino, V. et al. Characterization of epithelial–mesenchymal transition intermediate/hybrid phenotypes associated to resistance to EGFR inhibitors in non-small cell lung cancer cell lines. Oncotarget 8, 103340–103363 (2017).
Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature 472, 90–94 (2011).
Gundem, G. et al. The evolutionary history of lethal metastatic prostate cancer. Nature 520, 353–357 (2015).
El-Kebir, M., Satas, G. & Raphael, B. J. Inferring parsimonious migration histories for metastatic cancers. Nat. Genet. 50, 718–726 (2018).
Turajlic, S., Sottoriva, A., Graham, T. & Swanton, C. Resolving genetic heterogeneity in cancer. Nat. Rev. Genet. 20, 404–416 (2019).
Hu, Z. et al. Quantitative evidence for early metastatic seeding in colorectal cancer. Nat. Genet. 51, 1113–1122 (2019).
Nicholson, J. G. & Fine, H. A. Diffuse glioma heterogeneity and its therapeutic implications. Cancer Discov. 11, 575–590 (2021).
Hara, T. et al. Interactions between cancer cells and immune cells drive transitions to mesenchymal-like states in glioblastoma. Cancer Cell 39, 779–792 (2021).
Verhaak, R. G. W. et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR and NF1. Cancer Cell 17, 98–110 (2010).
Boyer, L. A. et al. Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature 441, 349–353 (2006).
Natsume, A. et al. Chromatin regulator PRC2 is a key regulator of epigenetic plasticity in glioblastoma. Cancer Res. 73, 4559–4570 (2013).
Suvà, M.-L. et al. EZH2 is essential for glioblastoma cancer stem cell maintenance. Cancer Res. 69, 9211–9218 (2009).
Chanoch-Myers, R., Wider, A., Suva, M. L. & Tirosh, I. Elucidating the diversity of malignant mesenchymal states in glioblastoma by integrative analysis. Genome Med. 14, 106 (2022).
Wakimoto, H. et al. Maintenance of primary tumor phenotype and genotype in glioblastoma stem cells. Neuro. Oncol. 14, 132–144 (2012).
Laks, D. R. et al. Large-scale assessment of the gliomasphere model system. Neuro. Oncol. 18, 1367–1378 (2016).
Lee-Six, H. et al. Population dynamics of normal human blood inferred from somatic mutations. Nature 561, 473–478 (2018).
Gonzalez-Pena, V. et al. Accurate genomic variant detection in single cells with primary template-directed amplification. Proc. Natl Acad. Sci. USA 118, e2024176118 (2021).
Brady, S. W. et al. The genomic landscape of pediatric acute lymphoblastic leukemia. Nat. Genet. 54, 1376–1389 (2022).
Roberts, K. G. & Mullighan, C. G. The biology of B-progenitor acute lymphoblastic leukemia. Cold Spring Harb. Perspect. Med. 10, a034835 (2020).
Iacobucci, I., Witkowski, M. T. & Mullighan, C. G. Single-cell analysis of acute lymphoblastic and lineage-ambiguous leukemia: approaches and molecular insights. Blood 141, 356–368 (2023).
Welner, R. S., Pelayo, R. & Kincade, P. W. Evolving views on the genealogy of B cells. Nat. Rev. Immunol. 8, 95–106 (2008).
Yusufova, N. et al. Histone H1 loss drives lymphoma by disrupting 3D chromatin architecture. Nature 589, 299–305 (2021).
Chapuy, B. et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat. Med. 24, 679–690 (2018).
Bell, C. C. et al. Targeting enhancer switching overcomes non-genetic drug resistance in acute myeloid leukaemia. Nat. Commun. 10, 2723 (2019).
Shaffer, S. M. et al. Memory sequencing reveals heritable single-cell gene expression programs associated with distinct cellular behaviors. Cell 182, 947–959 (2020).
Oren, Y. et al. Cycling cancer persister cells arise from lineages with distinct programs. Nature 596, 576–582 (2021).
Fennell, K. A. et al. Non-genetic determinants of malignant clonal fitness at single-cell resolution. Nature 601, 125–131 (2022).
Halley-Stott, R. P. & Gurdon, J. B. Epigenetic memory in the context of nuclear reprogramming and cancer. Brief. Funct. Genomics 12, 164–173 (2013).
Bintu, L. et al. Dynamics of epigenetic regulation at the single-cell level. Science 351, 720–724 (2016).
Marjanovic, N. D. et al. Emergence of a high-plasticity cell state during lung cancer evolution. Cancer Cell 38, 229–246.e13 (2020).
McGranahan, N. & Swanton, C. Clonal heterogeneity and tumor evolution: past, present and the future. Cell 168, 613–628 (2017).
Marine, J.-C., Dawson, S.-J. & Dawson, M. A. Non-genetic mechanisms of therapeutic resistance in cancer. Nat. Rev. Cancer 20, 743–756 (2020).
Chapman, M. S. et al. Lineage tracing of human development through somatic mutations. Nature 595, 85–90 (2021).
Salehi, S. et al. Clonal fitness inferred from time-series modelling of single-cell cancer genomes. Nature 595, 585–590 (2021).
Moran, P. A. P. Notes on continuous stochastic phenomena. Biometrika 37, 17–23 (1950).
Wartenberg, D. Multivariate spatial correlation: a method for exploratory geographical analysis. Geogr. Anal. 17, 263–283 (1985).
Chen, Y. A new methodology of spatial cross-correlation analysis. PLoS ONE 10, e0126158 (2015).
Czaplewski, R. L. & Reich, R. M. Expected Value and Variance of Moran’s Bivariate Spatial Autocorrelation Statistic for a Permutation Test (US Department of Agriculture, 1993).
Schiffman, J. S. Landau-Lab/PATH: V1.0. Preprint at Zenodo https://doi.org/10.5281/zenodo.13144052 (2024).
Schiffman, J. S., Prieto, T. & D’Avino, A. R. Landau-Lab/PATHpaper: V1.0. Preprint at Zenodo https://doi.org/10.5281/zenodo.13143937 (2024).
Hormoz, S. et al. Inferring cell-state transition dynamics from lineage trees and endpoint single-cell measurements. Cell Syst. 3, 419–433 (2016).
Münkemüller, T. et al. How to measure and test phylogenetic signal. Methods Ecol. Evol. 3, 743–756 (2012).
Hansen, T. F. & Martins, E. P. Translating between microevolutionary process and macroevolutionary patterns: the correlation structure of interspecific data. Evolution 50, 1404–1417 (1996).
Yang, Z. & Kumar, S. Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites. Mol. Biol. Evol. 13, 650–659 (1996).
Higham, N. J. & Lin, L. On pth roots of stochastic matrices. Linear Algebra Appl. 435, 448–463 (2011).
R Core Team. R: A Language and Environment for Statistical Computing (2023); https://www.R-project.org/
Acknowledgements
We thank members of the Landau laboratory and Norbert Fehér for thoughtful discussions throughout the development of this work. We thank N. Yosef for critical comments on the manuscript. We thank A. McKenna (Dartmouth College) and B. Raj (University of Pennsylvania Perelman School of Medicine) for sharing data and code related to the scGESTALT phylogenies. We thank A. Meissner’s (Max Planck Institute for Molecular Genetics) group and the authors of Chan et al. 2019, including M. Chan (Princeton University), for sharing their cell-type assignment and code. A.R.D. is supported by a Medical Scientist Training Program grant from the National Institute of General Medical Sciences of the National Institutes of Health under award number T32GM007739 to the Weill Cornell/Rockefeller/Sloan Kettering Tri-Institutional MD-PhD Program. C.G. is supported by a Burroughs Wellcome Fund Career Award for Medical Scientists, National Institutes of Health Director’s New Innovator Award (DP2-CA239145) and Chan Zuckerberg Investigator Award. D.A.L. is supported by the Burroughs Wellcome Fund Career Award for Medical Scientists, the Valle Scholar Award, the William Rhodes and Louise Tilzer-Rhodes Center for Glioblastoma at New York-Presbyterian Hospital (NYPH 203205-01), the Sontag Foundation (Distinguished Scientist Award, SFI 203261-01), Leukemia Lymphoma Scholar Award and the Mark Foundation Emerging Leader Award. This work was supported by the National Heart Lung and Blood Institute (R01HL157387-01A1), National Cancer Institute (R33CA267219), a Tri-Institutional Stem Cell Initiative award, the National Institutes of Health Common Fund Somatic Mosaicism Across Human Tissues (UG3NS132139-01) and the National Human Genome Research Institute, Center of Excellence in Genomic Science (RM1HG011014). D.A.L. and M.L.S. are jointly supported by NCI R01CA258763 and a grant from the STARR Cancer Consortium. 
This work was made possible by the MacMillan Family Foundation and the MacMillan Center for the Study of the Non-Coding Cancer Genome at the New York Genome Center.
Author information
Contributions
J.S.S. and D.A.L. conceived the project and designed the study. J.S.S. developed PATH and PATHpro and performed simulations. J.S.S., A.R.D., T.P. and S.R. performed analyses. Y.F. and T.H. generated the gliomasphere data. Y.P. and C.G. generated the single-cell PTA data. J.S.S., A.R.D., T.P., C.P., M.L.S., C.G. and D.A.L. helped interpret the results. M.L.S., Y.F., C.G. and T.H. provided critical comments on the paper. J.S.S., A.R.D., T.P., C.P. and D.A.L. wrote the paper. All authors reviewed and approved the paper.
Ethics declarations
Competing interests
M.L.S. is equity holder, scientific co-founder and advisory board member of Immunitas Therapeutics. C.G. is a co-founder, equity holder and board member of BioSkryb Genomics. D.A.L. has served as a consultant for Abbvie, AstraZeneca and Illumina and is on the Scientific Advisory Board of Mission Bio, Pangea, Alethiomics, Montage Bio and Veracyte; D.A.L. has received prior research funding from BMS, 10x Genomics, Ultima Genomics and Illumina unrelated to the current manuscript. A.R.D. is a consultant for Montage Bio. The other authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Verena Körber, Arjun Raj and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Cell state transition dynamics and phylogenetic correlations.
Cell state transition dynamics can be linked with phylogenetic correlations using mathematical modeling (Methods).
Extended Data Fig. 2 Cell state transition dynamics predict phylogenetic correlations.
a, Simulated idealized phylogeny containing \(2^{6}=64\) cells (Supplementary Note) in which cells can transition between three possible cell states. Cell state transitions are represented as a discrete-time Markov chain (Supplementary Note). b, Simulated cell state transition dynamics and measured phylogenetic autocorrelations (Methods) for the first cell state for 1,000 independent simulations on idealized phylogenies, containing \(2^{10}\) cells, in which state transition probabilities were randomly generated for each trial. Phylogenetic correlations were computed using a weighting function that included only sibling cells (one-node only, as described in Methods). LOESS regression line (blue) with 95% confidence interval (light gray) is shown. c, (Left) Simulated versus PATH-inferred cell state self-transition (that is, stability) probabilities (Methods), by transforming the phylogenetic autocorrelations measured in b. (Right) Simulated versus PATH-inferred cell state transition probabilities from state 1 to 2, on idealized phylogenies (Supplementary Note). Dashed red lines both have slope 1 and pass through the origin. Linear regression lines (blue) with 95% confidence intervals (light gray) are shown for both plots. Spearman’s correlation coefficients shown in panels b and c had two-sided P values < 2.2e-16, the limit of precision in R.
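The setup in panels a and b can be illustrated with a minimal Python sketch. It assumes that each daughter cell draws its state independently from the parent's row of the transition matrix at every division, and it uses a Moran's I-style statistic with sibling-only weights as a stand-in for the phylogenetic autocorrelation. The function names, the example matrix `P` and the normalization are illustrative assumptions, not the PATH package's API or its exact definition.

```python
import random

def simulate_leaf_states(P, generations, root_state, rng):
    """Evolve states down a perfectly balanced binary tree.

    Returns leaf states ordered so that leaves 2k and 2k+1 are siblings.
    Assumption: each daughter draws its state from row `parent` of P.
    """
    states = [root_state]
    for _ in range(generations):
        next_states = []
        for parent in states:
            for _ in range(2):  # two daughters per division
                daughter = rng.choices(range(len(P)), weights=P[parent])[0]
                next_states.append(daughter)
        states = next_states
    return states

def sibling_autocorrelation(leaf_states, state):
    """Moran's I-style autocorrelation of a state indicator, sibling weights only."""
    x = [1.0 if s == state else 0.0 for s in leaf_states]
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    if denom == 0:
        return float("nan")  # statistic is undefined when the state does not vary
    # Each sibling pair contributes twice (w_ab = w_ba = 1); total weight is n,
    # so the usual n / W prefactor equals 1.
    num = sum(2 * (x[k] - mean) * (x[k + 1] - mean) for k in range(0, n, 2))
    return num / denom

# Hypothetical three-state transition matrix (rows sum to 1).
P = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]
leaves = simulate_leaf_states(P, generations=6, root_state=0, rng=random.Random(1))
print(len(leaves), sibling_autocorrelation(leaves, state=0))
```

Under these assumptions, a state with a high self-transition probability tends to give sibling pairs that agree, which pushes the statistic toward 1; rapidly switching states push it toward 0.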
Extended Data Fig. 3 PATH inferences and simulations of somatic evolution.
a, Same as Fig. 2b but for systems where heritability is not detectable for all cell states (at least one state phylogenetic autocorrelation z score ≤ 2). b, Comparing the state transition dynamic inference accuracy of PATH using phylogenetic correlations measured at different node depths. Transition inference error is measured as the Euclidean distance between simulated and inferred transition probabilities, and the number of possible states in a simulated system is shown on the x-axis. 1,000 phylogenies were simulated for each system and sampled at a rate of \(10^{-2}\). c, Mean path length distance, phylogenetic correlation difference, and Robinson-Foulds distance between simulated true and reconstructed phylogenies (Supplementary Note). Phylogenies were simulated 1,000 times for each barcode length (x-axis). d, Expected pendant edge lengths for a sampled somatic evolutionary process, as a function of birth, death, and sampling rates (Supplementary Note). e, Transition inference error (Euclidean distance between inferred and true transition probabilities) using PATH or MLE for 3, 4, or 5 cell states in a phylogeny composed of either 100, 500, or 1,000 cells, representing a sample of \(10^{-6}\) or \(10^{-3}\) of the total population. Each parameter combination was simulated 1,000 times. f, Run times corresponding to simulations sampled at a rate of \(10^{-6}\) depicted in e. Box plots represent median, bottom and upper quartiles; whiskers correspond to 1.5 times interquartile range.
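The error metric used in panels b and e can be sketched as follows. This assumes the Euclidean distance is taken over all entries of the two transition matrices (equivalently, the Frobenius norm of their difference); the function name and example matrices are hypothetical.

```python
import math

def transition_inference_error(P_true, P_inferred):
    """Euclidean distance between two transition matrices, over all entries."""
    return math.sqrt(sum(
        (a - b) ** 2
        for row_true, row_inf in zip(P_true, P_inferred)
        for a, b in zip(row_true, row_inf)
    ))

# Hypothetical two-state example: the inferred matrix misses on the first row only.
P_true = [[0.9, 0.1], [0.2, 0.8]]
P_inferred = [[0.8, 0.2], [0.2, 0.8]]
print(transition_inference_error(P_true, P_inferred))
```

Because larger systems have more matrix entries, errors computed this way are not directly comparable across different numbers of states without further normalization.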
Extended Data Fig. 4 PATHpro inferences from simulations of somatic evolution with cell state-specific proliferation rates.
a, Comparison of PATH versus PATHpro transition inference errors (Euclidean distance between inferred and simulated transition probabilities) on simulated data when proliferation rates depend on cell state (Methods). Phylogenies were simulated in forward time, starting with one cell and ending after reaching \(10^{4}\) cells, and then subsampled to \(10^{3}\) cells. The state of the first cell was randomly chosen, and \({\gamma }_{1}\), the proliferation rate of state 1, is shown on the x-axis, and \({\gamma }_{2}\), the proliferation rate of state 2, was fixed at 1. Cell state transitions between states were symmetric with \({P}_{12}={P}_{21}=0.1\). 100 phylogenies were simulated for each proliferation rate. b, Diagrams of cell state transition probabilities used for benchmarking PATHpro (Supplementary Note). c, Comparison of transition probability inferences between PATHpro and SSE-MLE for systems shown in b, when the true proliferation rates are known. Inference error is the Euclidean distance between inferred and true transition probabilities. d, Comparison of transition probability and proliferation rate inferences for networks shown in b when proliferation rates are not known. Inference error is measured as the Euclidean distance between true/simulated and inferred transition probabilities or proliferation rates and values reported are rounded to the nearest thousandth. e, Comparison of PATHpro and SSE-MLE run times for the inferences shown in d, with the y-axis scaled as \(\log_{10}\)(seconds). For all box plots, boxes represent the interquartile range (IQR); the center line represents the median; minima and maxima shown represent 1.5⋅IQR. For violin plots, white lines correspond to the median, and dashed lines represent the mean. Box and violin plots represent 100 simulated phylogeny replicates each.
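The forward-time setup in panel a can be approximated with a short Python sketch. It tracks only the number of cells in each state, not the phylogeny itself, and assumes a pure-birth process in which the next cell to divide is chosen with probability proportional to its state's proliferation rate and each daughter draws its state from the parent's row of the transition matrix. The function name and these modeling choices are assumptions for illustration and are not taken from the PATHpro code.

```python
import random

def simulate_state_counts(gamma, P, target_size, rng):
    """Grow a population from one cell to `target_size`, tracking state counts only.

    gamma[i] is the proliferation rate of state i; P[i][j] is the probability
    that a daughter of a state-i cell is in state j (assumed to act at division).
    """
    n_states = len(gamma)
    counts = [0] * n_states
    counts[rng.randrange(n_states)] = 1  # founder state chosen at random
    while sum(counts) < target_size:
        # Total division rate contributed by each state.
        rates = [g * c for g, c in zip(gamma, counts)]
        parent = rng.choices(range(n_states), weights=rates)[0]
        counts[parent] -= 1  # the parent is replaced by two daughters
        for _ in range(2):
            daughter = rng.choices(range(n_states), weights=P[parent])[0]
            counts[daughter] += 1
    return counts

# Symmetric transitions with P12 = P21 = 0.1, as in the caption; gamma_2 fixed at 1.
P = [[0.9, 0.1],
     [0.1, 0.9]]
counts = simulate_state_counts(gamma=[2.0, 1.0], P=P, target_size=10_000,
                               rng=random.Random(0))
print(counts)
```

With \({\gamma }_{1}\) above 1, state 1 tends to be over-represented at the end of growth even though transitions are symmetric, which is the kind of bias that motivates modeling proliferation and transitions jointly.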
Extended Data Fig. 5 Quantifying the heritability versus plasticity of EMT transcriptional states.
a, Tumor cell harvest site phylogenetic correlations. b, EMT bin phylogenetic correlations (z scores). Colors represent putative states. Full table of EMT bin phylogenetic correlations can be found in Supplementary Table 1. c, Single-cell phylogeny from Mouse 1, Clone 1 from Simeonov et al. (ref. 9), containing 700 of 7,968 randomly chosen cells for visualization. Each leaf represents a single cell. Cells are colored by PATH-defined states (T1, T2, T3, M). d, EMT bin phylogenetic correlation (z score) heat maps using different bin sizes (0.5, 2, 3). e, Box and whisker plot of EMT bin phylogenetic correlations (z scores) across phylogenies that contain cells from only one harvest site. Dots correspond to EMT bins. (T1, n = 7 bins; T2, n = 11 bins; T3, n = 6 bins; M, n = 1 bin). Bins are grouped and colored by transition state membership. f, PATHpro-inferred transition probabilities between states (T1, T2, T3, M) for branch lengths imputed using a death rate of 0, 0.01, or 0.02. Transition probabilities were inferred for 100 iterations of subsampling 5,248 out of 10,495 cells for each imputation (Supplementary Note). For all box plots, boxes represent the interquartile range (IQR); the center line represents the median; minima and maxima shown represent 1.5⋅IQR.
Extended Data Fig. 6 PATH-inferred cell state transitions and gene set enrichment in human glioblastoma.
a, Heat map of the phylogeny-replicate (n = 9) mean phylogenetic correlations (Methods) for the top 100 most heritable genes (determined by phylogeny-replicate mean gene phylogenetic autocorrelation z scores) in MGH115. Over-representation analysis (ORA) was performed separately on the genes in each of the two clusters, defined by hierarchical clustering using Ward’s method. Phylogenetic correlations were computed using an inverse-node-distance weighting (Methods). Only select gene sets are depicted for Cluster 2; remaining significantly enriched gene sets are in Supplementary Table 4. b, PATHpro-inferred transition probabilities for MGH115. Transitions and proliferations, for \(t=\tau /2\), depicted correspond to the median ± MAD from ML phylogeny replicates, with low transition probabilities (\(P_{ij}<0.05\)) omitted. c, PATH-inferred transition probabilities \({\widehat{\bf{P}}\left(\tau \right)}\) (Methods) from neurodevelopmental-like (Stem-/AC-like) cell states to the MES-like cell state in human patient-derived GBM samples MGH115 and MGH122. Points correspond to PATH inferences for each phylogeny ML replicate (n = 9) per sample. Significance determined by two-sided t-test for Stem-like vs AC-like in MGH115 and MGH122, respectively. Colors correspond to cell state. Box plots represent median, bottom and upper quartiles; whiskers correspond to 1.5 times the interquartile range.
Extended Data Fig. 7 Quantifying cell state heterogeneity in B-ALL using single-cell whole-genome sequencing.
Genome-wide copy-number deletion annotations projected onto the B-ALL single-cell phylogeny from Fig. 5a. 25-kb copy-number events detected in at least 5 cells were concatenated and displayed next to the phylogeny. Single-cell CNV events were colored based on chromosomes. chr4.q31.21, chr6.q16.3-chr6.q22.33, chr11.p11.2 and chr16.p13.12 represent the largest deletions.
Supplementary information
Supplementary Information
Supplementary Note
Supplementary Table
Supplementary Tables 1–7
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Schiffman, J.S., D’Avino, A.R., Prieto, T. et al. Defining heritability, plasticity, and transition dynamics of cellular phenotypes in somatic evolution. Nat Genet 56, 2174–2184 (2024). https://doi.org/10.1038/s41588-024-01920-6
DOI: https://doi.org/10.1038/s41588-024-01920-6