Abstract
A distinct model of neutral evolution of rare cancer mutations is described and contrasted with models relying on the infinite sites approximation (that a specific mutation arises in only one cell at any instant). An explosion of genetic diversity is predicted at clinical cell numbers and may explain the progressive refractoriness of cancers during a clinical course. The widely used infinite sites assumption may not be applicable for clinical cancers.
This perspective describes a novel approach to analysis of neutral mutations within an exponentially growing cancer prior to therapy. The majority of mutations occurring at variant allele fractions (VAF) of 0.1 or less evolve neutrally before therapy1,2,3,4,5,6. In contrast, mutations occurring at higher VAFs are more likely to be drivers that are positively selected. We have hypothesized that moderate to late term relapse in cancer is often due to outgrowth of rare subclones containing resistance mutations7. Prior to therapy, these subclones are evolving neutrally.
The model does not rely on the infinite sites approximation (that a specific mutation newly arises in only one cell at any instant) in contrast to related models. The results become increasingly important in cancers large enough to be clinically detectable, especially concerning rare resistant subclones in these cancers that may be selected by therapy. Notably, the approach herein is similar in many respects to that of Williams et al., which relies on the infinite sites assumption2. Our model was independently developed in 2015 without using the infinite sites assumption, and subsequently published when confirmed by highly accurate ultradeep sequencing1. We note that other formulations of the infinite sites assumption are possible, but the definition given herein is most pertinent to the current results.
The next section describes the model for neutral evolution. The “Similarities and differences with related work” section compares and contrasts the methodology with that of Williams et al.2. The “Comparison of predictions of models” section compares the predictions of the two models. The “Potential significance of rare subclones” section briefly discusses the hypothesized importance of rare subclones in resistance. We conclude by discussing potential implications for mathematical oncology and precision cancer medicine. The infinite sites approximation, underlying many contributions to mathematical oncology, does not apply for cancers large enough to be clinically diagnosable, and the model described is unique in not relying on it.
The text is written for an interdisciplinary audience. Additional mathematical details are provided in the Supplementary Notes.
Model for neutral evolution
As it is not possible to sequence every genome present in a cancer, rare mutations (i.e., mutations present in one cell or a small number of cells) are infrequently sampled. Due to the branching nature of evolution, the somatic mutations of the founder cell constitute the trunk of the evolutionary tree and the earliest mutational events near the trunk are scored in the majority of cells, and can be detected at low sequencing depth. As we increase sequencing depth, additional mutations that occurred later in carcinogenesis and are present in a smaller fraction of the cells are also detected. As the number of nucleotides sequenced at a given genomic position is nearly always less than the number of cells in the cancer, we are unlikely to detect evidence of recent mutational events late in the cancer evolution. Thus, the full mutation burden in the cancer cannot be directly determined, and the estimate of mutation rates is based on an incomplete dataset. We developed a method to estimate the mutation rate and the full mutation burden in the cancer by comparing measurements at several different sequencing depths1.
Specifically, we sequenced 13 kb of sequences not thought to be selected early in cancer evolution, at multiple different sequencing depths up to 20,000 using duplex sequencing, a highly accurate sequencing technique that sequences both DNA strands and calls a mutation only if complementary changes are present on both strands, providing unparalleled accuracy and sensitivity for detection of rare mutations8,9,10. This high accuracy allows us to sequence very deeply and reliably detect rare mutations that would be hidden in noise if conventional sequencing techniques were used, for either bulk or single cell techniques.
At each depth D, where D is the number of DNA duplexes (not strands) sequenced, we determined the fraction of the base loci at which no mutation was observable by duplex sequencing, the “apparently unmutated fraction”. The word “apparently” refers to the fact that some “unmutated” sites actually have mutations, but at a VAF below the limit of detection. As D increases, lower VAFs are detectable. As discussed below, the duplex sequencing experiments determine the increasing number of unique mutations that are observable in the 13 kb sequence with increasing sequencing depth, and from this curve we can determine both the overall mutation rate and the fact that we are observing neutral evolution at this depth. Although our duplex sequencing to depth of 20,000 directly interrogates VAFs of approximately 1/20,000 and higher (corresponding to a cell number of approximately 20,000), the mutation rate thus obtained, combined with the theoretical treatment, allows us to infer the behavior of the system for much rarer mutations that occur at clinically relevant cell numbers (Fig. 2). To directly observe this portion of the curve would require duplex sequencing to a depth of 1 million, technically but not economically feasible.
In order to model the fraction of apparently unmutated single base loci, we integrate over the entire temporal history of the cancer, determine the fraction of apparently unmutated single base loci for daughter cells born at different times, and obtain the average fraction of apparently unmutated single base loci, weighted by the number of daughter cells born at different times. We assume the total cell populations are growing exponentially, according to a net growth rate that is the difference of proliferation and death rates. Moreover, we assume that at a particular time t, all of the cells have the same net growth rate. In reality, cells are not dividing exactly synchronously, but the cell generation typically takes place over several days. Because the integral is taken over a clinical period of interest of months to years, we approximate this several day period as “instantaneous” and treat these instants as non-overlapping. The fraction of apparently unmutated single base loci is independent of time, making the average simple to calculate. At early timepoints, when there are few cells, it is less likely that a mutation will arise at a given locus because there are fewer cells dividing at that time. But, if a mutation arises at this early time, it will be present in a larger fraction of the cells in the final cancer, because it is closer to the trunk of the evolutionary tree, and therefore will be more likely to be detectable. These two factors (the lower likelihood of a mutation at earlier times but greater likelihood of detecting a mutation which does occur at an earlier time) exactly counterbalance each other to give a constant number of expected detectable mutations arising at any time. The derivation is given in Supplementary Note 1.
In order to connect the number of expected detectable mutations at a given base locus to the probability of observing no mutations at that base, we assume a Poisson distribution of detectable mutations with the expectation value as the mean. While the Luria-Delbrück distribution11 has been employed in this setting12, the Luria-Delbrück paper states that this more complex distribution is necessary only because the number of observations in their experiments is smaller than the reciprocal of the mutation rate, meaning in most experiments no mutation is detected. This coarse digitization results in “jackpots” (experiments with a much higher apparent mutation rate) occurring sporadically if a mutation occurs, and a high ratio of variance in VAF to mean VAF. In our case, there are 2.6 × 10⁸ observations (number of bases sequenced (13,000) × depth (20,000)), which greatly exceeds the reciprocal of the mutation rate that was determined experimentally, finely digitizing rare events and justifying the use of the Poisson distribution. Indeed, the ratio of variance in VAF to mean VAF across the sequence in our data was close to 1, as expected for the Poisson distribution, but much lower than expected for the Luria-Delbrück distribution11.
By using the expected number of detectable mutations and the Poisson statistical distribution, we can determine the expected proportion of sites at which no mutation will be detectable at sequencing depth D. We note that the model is continuous, not discrete, and thus the proportion of such sites need not be a ratio of integers.
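As a brief worked illustration of this step (a sketch, not the derivation of Supplementary Note 1), write λ for the expected number of detectable mutations at a base locus. The symbol λ and the numerical value in the comment are introduced here for illustration only.

```latex
P(m \mid \lambda) = \frac{\lambda^{m} e^{-\lambda}}{m!}, \qquad
P(0 \mid \lambda) = e^{-\lambda}
% Illustrative value only: \lambda = 0.013 gives P(0) = e^{-0.013} \approx 0.987
```

Because λ is an expectation value it need not be an integer, which is why the resulting proportion of apparently unmutated sites need not be a ratio of integers.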
As shown in Supplementary Note 1, the proportion of apparently unmutated loci as a function of sequencing depth D is given by:
\({F}_{{apparent}-{unmutated}}={e}^{-{k}_{{mut}-{eff}}D}\) (1)
where \({F}_{{apparent}-{unmutated}}\) is the proportion of bases where no read shows a mutation at depth D, and \({k}_{{mut}-{eff}}\) is an “effective” mutation rate constant; namely, the expected number of new mutations at a given base locus when the number of cells in the cancer is increased by one. Whenever the number of cells in the cancer has been increased by one, we call it an “effective cell division”, and the “effective mutation rate” is the mutation rate per effective cell division. It may take more than one actual cell division to increase the number of cells by one if the cellular birth and death rates are closely matched. In such a case, the effective mutation rate would be higher than the mutation rate per cell division1. While the number of cell divisions in the history of a cancer is not observable, the model is parameterized based on total cell number in the cancer, which is estimable in principle. Thus, the model inputs and outputs reflect quantities that can be estimated.
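The distinction between actual and effective cell divisions can be illustrated with a minimal sketch. It assumes simple deterministic birth-death growth, in which a net gain of one cell requires on average b/(b − d) divisions for birth rate b and death rate d. That ratio, the function names, and the per-division mutation rate of 1e-7 are assumptions made for illustration, not quantities taken from the source. The birth and death rates are those quoted for Fig. 2.

```python
def divisions_per_net_new_cell(birth_rate, death_rate):
    """Average number of actual divisions per net cell gained (assumed b / (b - d))."""
    if birth_rate <= death_rate:
        raise ValueError("population must be growing (birth_rate > death_rate)")
    return birth_rate / (birth_rate - death_rate)


def effective_mutation_rate(per_division_rate, birth_rate, death_rate):
    """Mutation rate per effective cell division under the assumption above."""
    return per_division_rate * divisions_per_net_new_cell(birth_rate, death_rate)


# Birth and death rates quoted for Fig. 2 (per day).
ratio = divisions_per_net_new_cell(0.25, 0.18)
print(f"divisions per net new cell: {ratio:.2f}")

# Hypothetical per-division rate, chosen only to show the direction of the effect.
print(f"effective rate: {effective_mutation_rate(1e-7, 0.25, 0.18):.2e}")
```

Under these assumptions the effective rate exceeds the per-division rate by the same factor, and the factor grows as the birth and death rates approach each other.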
From Eq. 1, we see that the natural logarithm of \({F}_{{apparent}-{unmutated}}\), when plotted vs D, gives a straight line with a slope of -\({k}_{{mut}-{eff}}\). In colorectal cancer patients at diagnosis, the effective mutation rate constant per base per new cell added to the cancer was 6–7 × 10⁻⁷ as shown by duplex sequencing1 (Fig. 1), higher than previously estimated. The plots fit straight lines extremely well, which is indicative of neutral evolution and constitutes experimental validation of Eq. 1 using a sequencing technique with unprecedented sensitivity and accuracy. By contrast, positive selection would be expected to result in curvature in this plot, because positively selected mutations ultimately increase in VAF during cancer evolution, leading to increased absolute slope of the line at high VAFs1.
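The slope-based estimate described above can be sketched as an ordinary least-squares fit of ln(F) against D. This is a minimal illustration on synthetic, noise-free data, not the analysis pipeline of reference 1. The function name, the chosen depths, and the value of `true_k` are assumptions made for the example.

```python
import math


def fit_k_mut_eff(depths, unmutated_fractions):
    """Least-squares line through (D, ln F); returns (-slope, intercept)."""
    ys = [math.log(f) for f in unmutated_fractions]
    n = len(depths)
    mean_x = sum(depths) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in depths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(depths, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


# Synthetic data generated from F = exp(-k * D); values are illustrative only.
true_k = 6.5e-7
depths = [2500, 5000, 10000, 20000]
fractions = [math.exp(-true_k * d) for d in depths]

k_est, intercept = fit_k_mut_eff(depths, fractions)
print(f"estimated k_mut_eff: {k_est:.2e}, intercept: {intercept:.1e}")
```

With real data the points scatter around the line. Systematic curvature, rather than scatter, is what the text associates with positive selection.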
Fig. 1: ln(Fapparent unmutated) versus sequencing depth for 5 colorectal cancer tumors at diagnosis biopsied by colonoscopy (a–e). Black symbols represent experimental data. Red symbols represent subsampled data, in which lower depth sequencing was simulated from sub-samples of the higher depth data. The line represents the linear regression line through points shown. The shaded area is the 2-sided 95% confidence interval of the experimental data as determined by the Student’s t test with 2 degrees of freedom. The linear correlation coefficients for these 5 linear regressions range from 0.953 to 0.999, indicating excellent model fit to the data. The resultant values of the effective mutation rate constant kmut-eff range from 2.8 × 10⁻⁷ to 1.5 × 10⁻⁶, mean ± S.D. of (7.1 ± 4.4) × 10⁻⁷. Reproduced from supplemental material in reference 1 with permission.
We can use Eq. 1 to estimate the fraction of apparently unmutated sites that would be present if we had sequenced every cell in the cancer, for a cancer with N cells, by assuming D = N:
\({F}_{{apparent}-{unmutated}}={e}^{-{k}_{{mut}-{eff}}N}\) (2)
Given the effective mutation rate described above, and the fact that a 1 cm³ lesion, approximately the detection limit on CT scan, contains approximately 10⁸–10⁹ cells, it is apparent from Eq. 2 that the fraction of bases at which no mutation is detectable is near zero in a diagnosable cancer, and thus a rare subclone corresponding to a mutation at every neutrally evolving base in the genome already exists at diagnosis.
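The arithmetic behind this statement can be checked with a short sketch. It takes Eq. 2 to be F = exp(−k·N), the form implied by the log-linear relationship between F and D with D = N. The function name is hypothetical, and the rate and cell-number values are the ranges quoted in the text.

```python
import math


def apparent_unmutated_fraction(k_mut_eff, n_cells):
    """Fraction of loci with no mutation anywhere in n_cells, taking F = exp(-k * N)."""
    return math.exp(-k_mut_eff * n_cells)


for k in (6e-7, 7e-7):          # effective mutation rate range quoted in the text
    for n in (2e4, 1e8, 1e9):   # sequencing-scale vs clinically detectable cell numbers
        f = apparent_unmutated_fraction(k, n)
        print(f"k={k:.0e}  N={n:.0e}  F={f:.3g}")
```

At N = 20,000 nearly all loci remain apparently unmutated, whereas at 10⁸ cells the exponent is already about −60, so the fraction is indistinguishable from zero for practical purposes.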
This neutral evolution model is based on a continuum approximation with non-integral probabilities and expectation values and fractions that are not required to be ratios of integers. It is not designed to accurately reflect the earliest events in cancer evolution where the number of cells is small, and small chance fluctuations are important. To query these early events, a discrete treatment using an exact binomial distribution as opposed to a Poisson approximation would be required, and has been described in refs. 13,14. These early events likely involve drivers and high VAF passenger mutations that are outside the scope of this analysis. The continuum approximation is expected to be highly accurate for mutations at low VAF, which arise later in cancer evolution, in smaller branches of the evolutionary tree, where the number of events is very large.
Similarities and differences with related work2
There have been many approaches to modelling cancer evolution. The field has generally modelled carcinogenesis with selection operating only on dominant and recessive oncogenes, and neutral evolution proceeding only after the formation of the founder cell15,16. Sottoriva et al.17 investigated in depth the similar hypothesis that colorectal cancers grow as a single expansion with most selected mutations occurring early, a model supported by multiparameter profiling of individual clones from diverse locations within these cancers. Williams et al.2 analyzed the TCGA database and found remarkable subclonal diversity and evidence for neutral evolution. Martincorena et al.3 subsequently found strong support for neutral tumor evolution in a comprehensive analysis of the ratios of synonymous to nonsynonymous mutations.
Herein, we compare our approach with that of Williams et al.2. Both approaches use quantitative analysis of rare mutations to support the idea that most bases evolve neutrally, i.e. without positive or negative selection, and parameterize the model in terms of observables such as “effective cell divisions”, rather than utilizing parameters that require knowledge of the cancer’s early history to be estimated. Both models analyse the pattern of increased detection of diversity with more sensitive methods to infer the mutation rate and to determine whether evolution is neutral or selected. Both models conclude that neutral evolution predominates and that significant subclonal diversity exists and may be a source of pre-existing drug resistance. Both are continuum models with many of the same underlying assumptions.
There are, however, several important differences between our approach and that of Williams et al.2. Williams et al.2 examined the TCGA database, which contains sequences obtained mostly at low depth utilizing a variety of techniques with varying accuracy, and described the statistical distribution of mutation frequencies in whole exome sequencing within cancer types. In contrast, we isolated DNA from surgical specimens at diagnosis, and sequenced the fresh tissue over a small segment of the genome using duplex sequencing to a depth of 20,000 at very high accuracy (one technical error in 10⁷ bases sequenced, or less).
Critically, Williams et al.2 utilized the infinite sites approximation18, a very widely used assumption stating that any new mutation occurring at time t occurs in only one copy, rather than arising anew in multiple copies in different cells. This assumption underlies the following statement: “for a new mutation occurring at any time t, its allelic frequency (relative fraction) f must be the inverse of the number of alleles in the population,” which in turn determines the rest of the analysis. The assertion is valid for sequencing depths less than the reciprocal of the mutation rate, i.e. the subject of the Williams et al. analysis and our experimental work. However, we estimate the expected number of new mutations occurring at the same base simultaneously within different cells as the product of an effective mutation rate per base per cell division and the number of cells dividing simultaneously. This “expectation value” is an average and does not have to be an integer like 0 or 1 (Eq. 2). Significantly, given that a minimally diagnosable cancer has a cell number of 10⁸–10⁹, exceeding the reciprocal of the effective mutation rate we have determined by one to three orders of magnitude1,19, many cells with identical mutations in the same base would be expected with each cell doubling. The probability that there will be any given integral number of DNA molecules with the variant of interest arising during this doubling is then given by the Poisson distribution defined by the expectation value. In the “Comparison of predictions of models” section, we compare the predicted mutant allele frequency as a function of the number of alleles in the population for our model, which does not rely on the infinite sites approximation, with the predictions of the Williams et al. model2 and of the model of Bozic et al.13,14, a neutral evolution model that relies on the infinite sites approximation but does not use the continuum approximation.
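The expectation value and the associated Poisson probabilities can be illustrated with a minimal sketch. It uses the effective mutation rate quoted for Fig. 2 (6.1 × 10⁻⁷). The function name and the particular cell numbers are chosen here for illustration and are not taken from the source.

```python
import math


def poisson_pmf(m, lam):
    """Poisson probability of exactly m events with mean lam (lam > 0)."""
    return math.exp(-lam + m * math.log(lam) - math.lgamma(m + 1))


k_mut_eff = 6.1e-7  # effective mutation rate per base per new cell (Fig. 2 value)

for dividing_cells in (2e4, 1.5e6, 1e8, 1e9):
    lam = k_mut_eff * dividing_cells      # expected simultaneous new mutations at one base
    p_none = poisson_pmf(0, lam)
    p_one = poisson_pmf(1, lam)
    p_multi = max(0.0, 1.0 - p_none - p_one)  # two or more cells mutate at the same base
    print(f"cells={dividing_cells:.1e}  expected={lam:.3g}  "
          f"P(0)={p_none:.3g}  P(1)={p_one:.3g}  P(>=2)={p_multi:.3g}")
```

At 20,000 cells the probability of two or more simultaneous mutations at the same base is negligible, consistent with the infinite sites approximation holding in that range. At 10⁸ cells the expected number is about 61, so multiple simultaneous occurrences become the norm.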
Comparison of predictions of models
Figure 2 summarizes the predictions of our model (gold line) as compared to two models that rely on the infinite sites assumption: the Williams et al. model (red line)2 that, like ours, is a continuum model; and the Bozic et al.13,14 model (blue line), that utilizes an exact binomial treatment to more precisely deal with chance events involving small numbers of cells early in cancer evolution. The x axis shows the total number of cells in the cancer on a logarithmic scale. The y axis shows the predicted VAF (also on a logarithmic scale) for cells that are formed at the time t in the cancer’s history when the total number of cells is n(t).
Fig. 2: Average mutant allele frequency (MAF), a synonym for VAF, for a given mutation vs N(t), the number of cells at the time it is formed, for the stochastic model with the infinite sites assumption13,14 (blue), the deterministic model with the infinite sites assumption2 (red), and without the infinite sites assumption1 (gold). Sequencing to a depth D queries, on average, mutational events occurring when N(t) = D. Thus, our duplex sequencing experiments query mutational events up to N(t) ≈ 20,000. The portions of the gold curve beyond 20,000 are inferred from the mathematical model using the mutation rate determined from the duplex sequencing experiments, but are not directly observed. For the other models, the lower sequencing depth implies direct observation of less of the curves. For the range of the graph covered by direct observation, all three models give a straight line with downward slope, although the slope (indicative of the mutation rate) differs in the current work, which found a higher mutation rate. In the figure, all three models are nonetheless graphed using the experimental mutation rate determined in reference 1 by duplex sequencing, and thus coincide in that range. Stochastic models consider random fluctuations from the expected average evolutionary trajectory and are more accurate at early times when the tumor is small compared to deterministic models that are based on the expected average evolutionary trajectory. The infinite sites assumption states that a mutation in a particular base will occur uniquely in one cell at any instant. Models without the infinite sites assumption are more accurate than those with it for larger cancer burdens in which the number of cells approaches or exceeds the reciprocal of the effective mutation rate, since the expected number of instances of mutations at a particular base is equal to the product of the effective mutation rate and the number of cells dividing.
Parameters are cell birth rate = 0.25/day, cell death rate = 0.18/day, effective mutation rate per base per new cell added to the tumor = 6.1 × 10⁻⁷. Reproduced from reference 1 with permission.
For the models utilizing the infinite sites approximation, the VAF must always be inversely proportional to the number of cells in the cancer, resulting in a straight line on a log-log plot, with a negative slope, as shown. However, our model predicts that the VAF will reach a limit (an asymptote) as the cancer gets significantly larger than the reciprocal of the effective mutation rate, i.e. well beyond approximately 1.5 million cells. The limiting VAF is equal to the effective mutation rate. In short, the VAF cannot go to zero because it is limited by the accuracy of DNA replication. In fact, in cancers with far greater than 1.5 million cells, the predictions of the models utilizing the infinite sites assumption implicitly require progressive evolution of more accurate DNA replication machinery as the cancer burden increases.
Our model predicts a transition between the region with a straight downward slope of VAF, and the region where the VAF curve levels off at the effective mutation rate. The shape of the entire curve is given by Eq. 3:
\({VAF}=\frac{{k}_{{mut}-{eff}}}{1-{e}^{-{k}_{{mut}-{eff}}n(t)}}\) (3)
The derivation is given in Supplementary Note 2. When n(t) is much greater than the reciprocal of the effective mutation rate, the exponential term is near zero, and the VAF approaches the effective mutation rate.
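The two regimes can be explored numerically with a minimal sketch. It takes Eq. 3 in the form VAF = k / (1 − exp(−k·n(t))), a form assumed here because it reproduces both limits described in the text. The function names are hypothetical, and the rate is the value quoted for Fig. 2.

```python
import math


def expected_vaf(n_cells, k_mut_eff):
    """VAF without the infinite sites assumption, assuming VAF = k / (1 - exp(-k n))."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is very small.
    return k_mut_eff / -math.expm1(-k_mut_eff * n_cells)


def infinite_sites_vaf(n_cells):
    """VAF under the infinite sites assumption: one new mutant copy among n cells."""
    return 1.0 / n_cells


k = 6.1e-7  # effective mutation rate used for Fig. 2

for n in (1e2, 2e4, 1.5e6, 1e8, 1e9):
    ours = expected_vaf(n, k)
    theirs = infinite_sites_vaf(n)
    print(f"n={n:.1e}  this model={ours:.3e}  infinite sites={theirs:.3e}  "
          f"ratio={ours / theirs:.3g}")
```

The ratio stays close to 1 up to the sequencing-scale cell numbers and then grows roughly in proportion to n once n exceeds the reciprocal of the effective mutation rate, which is the divergence shown in Fig. 2.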
Moreover, we can show that if n(t) (or equivalently the sequencing depth D, which can be substituted in the expression as it corresponds to the number of cells sequenced) is much less than the reciprocal of the effective mutation rate, the expected VAF closely approximates 1/n(t) (or 1/D) as asserted by the models relying on the infinite sites approximation. This is because when the number of cells is much smaller than the reciprocal of the effective mutation rate, the chance of getting a mutation at a given locus in even one cell is low, since kmut-eff n(t) ≪ 1. The chance that more than one cell will have the same mutation in such a small number of cell divisions is extremely low, making the infinite sites approximation highly accurate. The derivation is given in Supplementary Note 3.
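The two limits can be seen from a short series expansion. This is a sketch under the assumption that Eq. 3 has the form shown in the first line, with k standing for the effective mutation rate and n for n(t). It is not a reproduction of Supplementary Notes 2 and 3.

```latex
\begin{aligned}
\mathrm{VAF}(n) &= \frac{k}{1 - e^{-kn}} \\
1 - e^{-kn} &= kn - \tfrac{1}{2}(kn)^{2} + \dots = kn\left(1 - \tfrac{1}{2}kn + \dots\right) \\
\mathrm{VAF}(n) &\approx \frac{1}{n}\left(1 + \tfrac{1}{2}kn\right) \quad \text{for } kn \ll 1 \\
\mathrm{VAF}(n) &\to k \quad \text{for } kn \gg 1
\end{aligned}
```

For n = 20,000 and k = 6.1 × 10⁻⁷, kn ≈ 0.012, so the leading correction to 1/n is about 0.6%, which illustrates why the models agree over the range covered by direct observation.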
As a further point of clarification, Williams et al.2 developed their mathematical approach so that the number of observed unique subclones is plotted on the y axis to determine the effective mutation rate, and the slope of their line giving the effective mutation rate is positive. In contrast, in our model it is more precise to evaluate the logarithm of the fraction of apparently unmutated bases, and the slope of our line giving the effective mutation rate is negative. We show at cell numbers below the reciprocal of the effective mutation rate that these approaches are equivalent (Supplementary Note 4), whereas at high cell number our approach is more precise and the plot stays linear.
Thus, all three models give similar predictions early in cancer evolution, except for the correction applied in the Bozic et al. model13,14 for stochastic extinction of small subclones. That is, in the Williams et al.2 model and our model1 that use the continuum approximation, subclonal dynamics are described by their average behavior, and subclones will always increase in number if their net growth rate (proliferation rate minus death rate) is positive. In contrast, the Bozic et al. model13,14 uses an exact binomial expression that describes all possible outcomes and their probabilities, and from this we see that a very small number of cells may become extinct by the play of chance. This leads to the divergence of the Bozic et al.13,14 model from the continuum models at very low cell numbers, and this exact stochastic treatment is more accurate in this range, corresponding to the earliest events in carcinogenesis. However, at larger cell numbers that are of interest in the present work, it is very unlikely that a minor subclone undergoing exponential growth will become extinct (unless due to competition with other subclones for limited resources, see the “Potential significance of rare subclones” section below), and mathematically the continuum approximation is extremely close to the exact binomial. Thus, in the middle of Fig. 2, all three models coincide. Our model begins to diverge from the other two models in the neighbourhood of 1.5 million total cells. The difference in mutational diversity between our model and the lesser diversity predicted by the other models is a function of the area between the two curves, and it increases as the cancer burden increases. This novel prediction affects all clinical scenarios, with increasing significance as the cancer grows and spreads. Another group has explored a different discrete model with the infinite sites assumption and accounting for extinction, and derived complex predictions that depend on whether the cancer is evaluated at a fixed time or fixed size20.
If an exact binomial model were created that did not rely on the infinite sites assumption, such a model would be expected to behave like ours at high cell number, and would be accurate in all three regions of Fig. 2.
Potential significance of rare subclones
The significance of rare subclones in cancer precision medicine remains a subject of debate7. A sequencing study of primary tumors and a small number of their associated metastases showed that clonal driver mutations are relatively similar21. However, genetic differences arising from the metastasis of a rare subclone would have been present only in a rare metastasis, and would have required sequencing of many metastases to detect.
Many authors have pointed to the role of competition between subclones in large lesions that have exceeded their carrying capacity, and pointed out that rare subclones may be driven to extinction by either stochastic effects or competition for nutrients. However, extinction of subclones by stochastic effects may be less relevant when the number of cells dividing is far greater than the reciprocal of the effective mutation rate per base. Each cell generation would be expected to lead to the rebirth of many new cells with the same mutation. This may alter concepts of extinction and fixation when applied to rare subclones in clinical cancers.
Competition within subclones may be important in large lesions. However, clinically most patients do not perish from large lesions, but from widespread metastases infiltrating organs22. The total cancer burden is thus spread over many sites, and in many cases micrometastases are so small that nutrients can diffuse in, and angiogenesis is not even necessary23. In these scenarios subclones may grow independently and exponentially. The total cancer burden over many of these metastases may be substantial. In other scenarios, subclones may cooperate rather than compete24,25.
Discussion
The infinite sites approximation is very widely used in mathematical oncology. We have shown that it may not be applicable when cancers exceed 1.5 million cells. Since sequencing to the appropriate depth is rarely done, and phylogenies are rarely determined so far out on the evolutionary tree, discrepancies between models using the infinite sites assumption and the ground truth, if present, may not be detectable at this time. Moreover, computational agent-based models, in which each virtual cell is represented as an “agent”, rarely contain more than one million cells. Few laboratory experiments in which subclones are tracked have been done with a sufficient number of cells to capture rare events. However, bioreactors that can grow barcoded subclones in large numbers for long periods of time, encompassing sufficient cell divisions to observe rare mutational events, are a promising emerging technology26. While it is beyond the scope of this article, we recommend that many of the computational and mathematical methods in the field be carefully evaluated to determine if they are applicable to the true cell numbers present in clinical cancers.
All of the neutral evolution models compared in this perspective are parsimonious in that they are tumor-centric, neglecting the tumor microenvironment, and grounded in genomics, neglecting non-genetic mechanisms of resistance. Parsimonious conceptual models and more detailed descriptive models each have important roles in mathematical oncology22. Notwithstanding, even the parsimonious model discussed herein has uncovered exponentially increasing genetic diversity that may in part explain the increasing difficulty in treating cancers during a clinical course, where at each relapse the patient and physician are confronted with a cancer that is increasingly refractory, and consequent diminishing returns from therapy.
Currently, oncology is driven by achieving higher response rates and more durable responses. Nonetheless, we believe that as these goals are achieved, addressing moderate to late term relapse resulting from the outgrowth of rare subclones will be required for further improvements and ultimately, cures. Simulations of therapy based on the continuum approximation allow large numbers of cells to be considered and suggest the potential importance of rare subclones in optimizing therapy in earlier lines of treatment, particularly if these subclones have mutations in proteins responsible for genome integrity, thus leading to more rapid evolution27,28.
Rare subclones are projected to be an immense reservoir of diversity from which resistance may arise. The resistance mutations may have little impact on fitness during carcinogenesis, but once therapy is initiated the mutations will promote survival and eventual outgrowth of these subclones. Consideration of these rare subclones proactively in earlier lines of therapy may not lead to improved immediate shrinkage of large lesions, but may prevent or delay moderate to late term relapse and improve survival, especially if a personalized approach balancing immediate cytoreduction and resistance prevention is undertaken, based on an understanding of evolutionary dynamics27,28. In clinical oncology, it is commonly observed that benefits from targeted therapy are progressively reduced with each relapse. We believe this phenomenon is, to a significant degree, due to progressive selection of multiply resistant subclones, often with enhanced genetic instability. In some cases, resistance mutations associated with relapse that were thought to be new have been shown by deeper sequencing to be pre-existing at diagnosis29.
We note that a more exact version of our model would also account for the fact that, as resistant subclones grow, a small fraction of their cells will revert the mutation to wild type, regaining sensitivity. However, this fraction is small and does not meaningfully affect our estimate of the number of resistant cells, and it is the resistant cells that are likely to represent a clinically significant problem. Thus, we have opted for a simpler version of the model.
We propose a continuum model of evolution of neutral (unselected) sites in cancer evolution. This model makes very different predictions from related models when the number of cells is comparable to a clinical case. Yet, for the small cell numbers directly addressable by current sequencing techniques and agent-based computational models, all of the models agree in their predictions. The differences between this model and related models are subtle, but critical, and the implications for clinical care may be important. We suggest the other models (which are widely used) do not fully apply in the clinical setting due to the behavior of the system when the number of cells equals that in a real cancer.
Data availability
All data generated or analysed during this study are included in the published article and supplementary notes.
References
Loeb, L. A. et al. Extensive subclonal mutation diversity in human colorectal cancer and its significance. Proc. Natl. Acad. Sci. USA 116, 26863–26972 (2019).
Williams, M. J., Werner, B., Barnes, C. P., Graham, T. A. & Sottoriva, A. Identification of neutral tumor evolution across cancer types. Nat. Genet. 48, 238–244 (2016).
Martincorena, I. et al. Universal patterns of selection in cancer and somatic tissues. Cell 171, 1029–1041 (2017).
Ling, S. et al. Extremely high genetic diversity in a single tumor points to prevalence of non-Darwinian cell evolution. Proc. Natl. Acad. Sci. USA 112, E6496–E6505 (2015).
Zhou, Z. et al. Mutation-profile-based methods for understanding selection forces in cancer somatic mutations: a comparative analysis. Oncotarget 29, 58835–58846 (2017).
Bozic, I. et al. Accumulation of driver and passenger mutations during tumor progression. Proc. Natl. Acad. Sci. USA 107, 18545–18550 (2010).
Beckman, R. A. & Loeb, L. A. Rare mutations in cancer drug resistance and implications for therapy. Clin. Pharm. Ther. 108, 437–439 (2020).
Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl. Acad. Sci. USA 109, 14508–14513 (2012).
Kennedy, S. R. et al. Detecting ultralow-frequency mutations by Duplex Sequencing. Nat. Protoc. 9, 2586–2606 (2014).
Schmitt, M. W. et al. Sequencing small genomic targets with high efficiency and extreme accuracy. Nat. Methods 12, 423–425 (2015).
Luria, S. E. & Delbrück, M. Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28, 491–511 (1943).
Cheek, D. & Antal, T. Genetic composition of an exponentially growing cell population. Preprint at arXiv:1905.12355v3 (2020).
Bozic, I., Gerold, J. M. & Nowak, M. A. Quantifying clonal and subclonal passenger mutations in cancer evolution. PLoS Comput. Biol. 12, e1004731, https://doi.org/10.1371/journal.pcbi.1004731 (2016).
Bozic, I. et al. Evolutionary dynamics of cancer in response to targeted combination therapy. Elife 2, e00747 (2013).
Beckman, R. A. & Loeb, L. A. Efficiency of carcinogenesis with and without a mutator mutation. Proc. Natl. Acad. Sci. USA 103, 14140–14145 (2006).
Beckman, R. A. Mutator mutations enhance tumorigenic efficiency across fitness landscapes. PLoS One 4, e5860 (2009).
Sottoriva, A. et al. A big bang model of human colorectal tumor growth. Nat. Genet. 47, 209–216 (2015).
Kimura, M. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics 61, 893–903 (1969).
Del Monte, U. Does the cell number 10^9 still really fit one gram of tumor tissue? Cell Cycle 8, 505–506 (2009).
Gunnarsson, E. B., Leder, K. & Foo, J. Exact site frequency spectra of neutrally evolving tumor: a transition between power laws reveals a signature of cell viability. Theor. Popul. Biol. 142, 67–90 (2021).
Reiter, J. G. et al. An analysis of genetic heterogeneity in untreated cancers. Nat. Rev. Cancer 19, 639–650 (2019).
Beckman, R. A., Kareva, I. & Adler, F. R. How should cancer models be constructed? Cancer Control 27, 1–12 (2020).
Deshpande, N., Ren, Y., Foygel, K., Rosenberg, J. & Willmann, J. K. Tumor angiogenic marker expression levels during tumour growth: longitudinal assessment with molecularly targeted microbubbles and US imaging. Radiology 258, 804–811 (2011).
Marusyk, A. et al. Non-cell-autonomous driving of tumour growth supports sub-clonal heterogeneity. Nature 514, 54–58 (2014).
Vietsch, E. E. et al. De-convoluting therapeutic resistance in a pancreatic cancer model: Pharmacogenomic evaluation of intratumoral clonal heterogeneity. Cancer Res. 75, A73 (2015).
Makohon-Moore, A. P. Transcriptional and metabolic dynamics of cancer cells under nutrient deprivation. Cancer Res. 83, NG08 (2023).
Beckman, R. A., Schemmann, G. S. & Yeang, C. H. Impact of genetic dynamics and single-cell heterogeneity on development of nonstandard personalized medicine strategies for cancer. Proc. Natl. Acad. Sci. USA 109, 14586–14591 (2012).
Beckman, R. A. & Loeb, L. A. Evolutionary dynamics and significance of multiple subclonal mutations in cancer. DNA Repair 56, 7–15 (2017).
Schmitt, M. W., Loeb, L. A. & Salk, J. J. The influence of subclonal resistance mutations on targeted cancer therapy. Nat. Rev. Clin. Oncol. 13, 335–347 (2016).
Acknowledgements
This study was funded by Department of Defense Breakthrough Award #W81XWH-20-1-0760. The funder played no role in study design, data collection, analysis and interpretation of data, or the writing of this manuscript. The author gratefully acknowledges helpful discussions with Drs. Alvin Makohon-Moore (Hackensack Meridian Health Center for Discovery and Innovation), Rebecca Riggins (Lombardi Comprehensive Cancer Center, Georgetown University Medical Center), and Ryan Sullivan (Massachusetts General Hospital, Harvard Medical School).
Author information
Contributions
R.A.B. contributed to the conception and design of this study and to the analysis and interpretation of data. He drafted the manuscript, approved the final version, and assumes accountability for all aspects of the work.
Ethics declarations
Competing interests
RAB is the Chief Scientific Officer of Onco-Mind, LLC, which holds patents on dynamic precision medicine, a novel approach to precision cancer medicine. He is uncompensated in this role and does not anticipate any patent royalties at this time. RAB has also recently consulted for AstraZeneca and Boehringer-Ingelheim in the area of oncology clinical trial design.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Beckman, R.A. Neutral evolution of rare cancer mutations in the computer and the clinic. npj Syst Biol Appl 10, 110 (2024). https://doi.org/10.1038/s41540-024-00436-3
DOI: https://doi.org/10.1038/s41540-024-00436-3