Introduction

The ability of cells and tissues to alter their response to chemical and radiological agents is a major impediment to the success of therapy. Such altered response to treatment occurs in infections by bacteria, viruses, fungi, and other pathogens, as well as in cancer1,2,3,4. Despite advances in recent decades, this reduction in the effectiveness of treatments, broadly termed drug resistance, remains poorly understood, and in some circumstances is thought to be inevitable5.

One of the most clinically important examples of drug resistance is that involving the treatment of cancer via chemotherapy and targeted therapy. Drug resistance is a primary cause of treatment failure, with a variety of molecular and microenvironmental causes already identified6. For example, the upregulation of drug efflux transporters, enhanced DNA repair mechanisms, modification of drug targets, stem cells, irregular tumor vasculature, environmental pH, immune cell infiltration and activation, and hypoxia have all been identified as mechanisms which may inhibit treatment efficacy4,6,7,8,9,10,11,12,13,14. A vast amount of experimental and mathematical research continues to improve our understanding of drug resistance. An excellent “roadmap” article (which also includes references to a number of relevant mathematical models) summarizes state-of-the-art approaches to understanding the role of both non-genetic plasticity and genetic mutations in cancer evolution and treatment response15, while also highlighting current and future challenges.

Aside from understanding the mechanisms by which resistance to therapies may manifest, a fundamental question is when resistance arises. With respect to the initiation of therapy, resistance can be classified as either pre-existing or acquired6. The term pre-existing (also known as intrinsic) drug resistance is reserved for the case when the organism contains a sub-population (or a tumor contains a sub-clone) which resists treatment prior to the application of the external agent. Examples of the presence of extant resistance inhibiting treatments are abundant in bacteria and cancer, including genes originating from phyla Bacteroidetes and Firmicutes in the human gut biome16, BCR-ABL kinase domain mutations in chronic myeloid leukemia17,18, and MEK1 mutations in melanoma19. Conversely, acquired resistance describes the phenomenon in which resistance first arises during the course of therapy from an initially drug-sensitive population. The question of whether resistance is pre-existing or acquired is a classical one in the context of bacterial resistance to a phage20.

The study of acquired resistance is complicated by the question of how resistance emerges. Resistance can be spontaneously (also called randomly) acquired during treatment as a result of random genetic mutations or stochastic non-genetic phenotype switching. These cells can then be selected for in a classic Darwinian fashion21. Resistance can also be induced (or caused) by the presence of the drug21,22,23,24,25,26,27. These cells are often referred to as “drug tolerant persisters” in the literature28. That is, the drug itself may promote, in a “Lamarckian sense”, the formation of resistant cancer cells so that treatment has contradictory effects: it eliminates cells while simultaneously upregulating the resistant phenotype, often from the same initially-sensitive (wild-type) cells.

The fundamentally distinct scenarios in which resistance may arise are illustrated in Fig. 1. Note that in each of the three cases (pre-existing, spontaneous acquired, and induced acquired), the post-treatment results are identical: the resistant phenotype dominates. However, the manner by which resistance is generated in each case is fundamentally different, and the transient dynamics (both drug-independent and drug-dependent) may vary drastically in each scenario. As a result, the same therapeutic design may result in different outcomes depending on the (dominant) mode of resistance. Although there is experimental evidence for these three forms of drug resistance, differentiating them experimentally is non-trivial. For example, what appears to be drug-induced acquired resistance may simply be the rapid selection of a very small number of pre-existing resistant cells, or the selection of cells that spontaneously acquired resistance21,29. It is in this realm that mathematics can provide invaluable assistance. Formulating and analyzing precise mathematical models describing the previously-mentioned origins of drug resistance can lead to novel conclusions that may be difficult, or even impossible, given current technology, to determine utilizing experimental methods alone.

Fig. 1: Three distinct ways in which drug resistance may be generated.

In the first row, resistance is pre-existing, so that resistant clones are selected during therapy. In spontaneously acquired resistance, random genetic and epigenetic modifications produce a resistant phenotype during therapy, where again selection dominates. In the case of induced acquired resistance, the application of the drug causes the resistant cell to arise, so that the mechanism is not simply random variation (e.g. mutations) followed by classical Darwinian selection. Instead, there is a “Lamarckian” component (gray arrow). Note that any combination of these three mechanisms may be causing resistance in a single patient.

The main goal of this work is to develop a validated mathematical model of drug-induced resistance, and to explore the implications for protocol design30,31. An outline of the paper is as follows. In the “Results”, we start by describing an experimental dataset which demonstrates the induction of resistance to BRAF inhibitors in vitro in a melanoma cell line that harbors a BRAF mutation32,33. We next demonstrate that a modified version of our previously-developed induced resistance model31 gives excellent fits to the data across a range of drug doses on a training dataset, and predicts well the time-course data for doses in a validation dataset. The outcomes of an optimal control study are also presented in this section. The “Discussion” then explores limitations of our modeling study and directions for future work. We conclude with details of the mathematical model, computational methods, identifiability analysis, and the formulation of the optimal control problem in “Methods”.

Results

Melanomas harboring mutations in BRAF, which result in excessive growth signaling, are commonly treated with small-molecule BRAF inhibitors such as vemurafenib, dabrafenib, and encorafenib34. These drugs prevent the over-activation of pathways controlled by BRAF, including several types of MAPK pathways. As with other therapeutics, resistance to BRAF inhibitors poses a significant barrier to treatment success. Drug-induced reprogramming by vemurafenib has been well-studied experimentally35.

In an effort to better understand resistance to vemurafenib, Fallahi-Sichani et al. subjected COLO858 melanoma cells (which harbor a BRAFV600E mutation) in vitro to vemurafenib at various drug concentrations before imaging the cells over the course of 84 h32. Single cell imaging during vemurafenib exposure revealed three possible outcomes: death, quiescence (cells survive but do not divide), and resistance (cells survive and divide). Interestingly, if the COLO858 cells were pre-treated overnight with a low dose of vemurafenib before their entrance into the higher-dosed assay conditions, greatly increased cell viability was observed32. This finding indicates that a lower dose of drug may “teach” tumor cells the mechanism of resistance, which then allows them to grow under treated conditions that they would not have otherwise survived. They also found that this vemurafenib resistance was reversible when non-sensitive cells were plated in fresh drug media. Upon the return of vemurafenib to these sorted cells’ culture conditions, resistance was re-induced, indicating that this mechanism of resistance is both reversible and persistent32. In follow-up work, COLO858 cells were studied in vitro under varying vemurafenib concentrations33. An algorithm was used to count the number of cells, divisions, and death events over this time period. As in Fallahi-Sichani et al.32, cells were observed to be in either a sensitive, quiescent, or resistant state.

Herein, we consider cell count data from these experiments over 100 h for COLO858 cultured at the following doses of vemurafenib: 0.032, 0.1, 0.32, 1, and 3.2 μM. The mathematical model we first utilize to describe this data is a two-population model of the number of sensitive cells S and the number of resistant cells R under exposure to vemurafenib:

$$\dot{S}={r}_{S}S-{d}_{S}(1-{e}^{-{\gamma }_{1}t})S-\alpha (1-{e}^{-{\gamma }_{2}t})S$$
(1)
$$\dot{R}={r}_{R}R+\alpha (1-{e}^{-{\gamma }_{2}t})S-{d}_{R}(1-{e}^{-{\gamma }_{1}t})R.$$
(2)

In this model, rS and rR are the growth rates of the sensitive and resistant population, respectively, and we assume that rS ≥ rR. The parameters dS and dR are the corresponding drug-induced kill rates, with dR ≤ dS by the definition of resistance. Drug-induced resistance occurs at the rate α, and γi represents the delay in drug action. Unless otherwise indicated, we study this model subject to the condition that initially all cells are sensitive. More details can be found in “Mathematical models”.
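To make the structure of Eqs. (1, 2) concrete, the following is a minimal simulation sketch, not the implementation used in this work. It integrates the two equations with a hand-written classical Runge-Kutta (RK4) scheme using only the Python standard library. The parameter values are illustrative placeholders chosen to respect rS ≥ rR and dR ≤ dS; they are not the best-fit values of Table 1, and the function names, step size, and time horizon are assumptions.

```python
import math

def two_population_rhs(t, S, R, p):
    """Right-hand side of Eqs. (1)-(2) for parameter dictionary p."""
    kill_delay = 1.0 - math.exp(-p["gamma1"] * t)    # delayed onset of drug kill
    induce_delay = 1.0 - math.exp(-p["gamma2"] * t)  # delayed onset of induction
    dS = p["rS"] * S - p["dS"] * kill_delay * S - p["alpha"] * induce_delay * S
    dR = p["rR"] * R + p["alpha"] * induce_delay * S - p["dR"] * kill_delay * R
    return dS, dR

def simulate(p, S0=1.0, R0=0.0, t_end=100.0, dt=0.01):
    """Integrate Eqs. (1)-(2) with classical RK4; returns lists (t, S, R)."""
    n = int(round(t_end / dt))
    S, R = S0, R0
    ts, Ss, Rs = [0.0], [S], [R]
    for i in range(n):
        t = i * dt
        k1 = two_population_rhs(t, S, R, p)
        k2 = two_population_rhs(t + dt / 2, S + dt / 2 * k1[0], R + dt / 2 * k1[1], p)
        k3 = two_population_rhs(t + dt / 2, S + dt / 2 * k2[0], R + dt / 2 * k2[1], p)
        k4 = two_population_rhs(t + dt, S + dt * k3[0], R + dt * k3[1], p)
        S += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        R += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        ts.append((i + 1) * dt)
        Ss.append(S)
        Rs.append(R)
    return ts, Ss, Rs

# Placeholder parameters (per hour); NOT the fitted values from Table 1.
example_params = dict(rS=0.05, dS=0.10, alpha=0.02,
                      rR=0.04, dR=0.06, gamma1=0.1, gamma2=0.1)
ts, Ss, Rs = simulate(example_params)   # all cells initially sensitive
N_final = Ss[-1] + Rs[-1]               # normalized total count at t = 100 h
print(f"N(100 h) = {N_final:.4f}")
```

Because the initial condition is S(0) = 1 and R(0) = 0, the output is directly comparable to cell counts normalized by the initial count, which is how the data are treated below.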

As the experiment in ref. 33 was repeated over four replicates, we consider the mean cell count across these replicate experiments, normalized by the mean initial cell count (see Fig. 2a), to parameterize our model. A multi-start fitting algorithm is used to independently find the best-fit parameter values at the training doses of 0.032, 0.32, and 3.2 μM. Doses of 0.1 and 1 μM were withheld for validation purposes. The cost function minimized to find the best-fit parameter values is the sum of the absolute differences:

$$\zeta := \mathop{\sum }\limits_{t=1}^{T}\left\vert {N}_{{\rm{model}}}(t)-{N}_{{\rm{data}}}(t)\right\vert ,$$
(3)

where Nmodel(t) = S(t) + R(t) is the model-predicted normalized cell count at time t, and Ndata(t) is the normalized experimental mean (across four replicates) cell count at time t. “Mathematical models” provides more details on the fitting algorithm.
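As an illustration of Eq. (3), the sketch below evaluates the sum of absolute differences between model and data at matched time points. The function name and the list-based inputs are assumptions made for this example only.

```python
def cost_l1(model_counts, data_counts):
    """Eq. (3): sum over time points of |N_model(t) - N_data(t)|."""
    if len(model_counts) != len(data_counts):
        raise ValueError("model and data must be sampled at the same time points")
    return sum(abs(m - d) for m, d in zip(model_counts, data_counts))

# Toy values for illustration only (not experimental data).
zeta = cost_l1([1.0, 1.5, 2.0], [1.0, 1.0, 2.5])
```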

Fig. 2: Data and fits for the two-population model.

a Mean across four replicates of the normalized cell count data from ref. 33 and model fits. Best model fits are shown at a drug concentration of (b) 0.032 μM, (c) 0.32 μM, and (d) 3.2 μM. The estimates of the standard error of the sample mean (SEM) across four replicates are shown in blue in (b–d). The relative value of the cost function is ζnorm = 11.43 for the dose of 0.032 μM, ζnorm = 14.42 for the dose of 0.32 μM, and ζnorm = 21.56 for the dose of 3.2 μM. The fitting process results in time course predictions for the respective sizes of the sensitive (S shown in the dashed green curve) and resistant (R shown in the dotted black curve) cell subpopulations.

Two-population model adequately captures dynamics over a range of doses

Here we employ the fitting methodology described in detail in “Computational methods” to independently fit the mean normalized cell count of COLO858 cells at the training doses of 0.032, 0.32, and 3.2 μM. The results of this fitting can be seen in Fig. 2b–d. We observe an excellent agreement between the best-fit model solution and the three datasets. The optimal parameter values are provided in Table 1.

Table 1 Definition and best-fit value of parameters for the model in Eqs. (1, 2) over three drug concentrations

Confidence in model parameterization is quantified in Fig. 3. As detailed in “Computational methods”, model fitting is attempted for M = 1000 sets of initial parameter guesses. If realization i of the fitting algorithm converges to a parameterization for which the cost function value ζi is within 5% of the optimal cost function value ζopt, the corresponding best-fit parameters are classified as near-optimal. In other words, a near-optimal parameterization is one in which the cost function value is within this predefined percent from the optimal value of the cost function; it is not a parameterization that deviates a fixed percentage from the parameter’s optimal value. The threshold 5% was chosen by visual inspection (not shown), as all such parameterizations seemed to provide a good fit to the data.
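The near-optimal classification described above can be sketched as a simple filter over the converged multi-start runs. This is an illustrative sketch under stated assumptions: each run is represented as a hypothetical `(cost, params)` tuple, and "within 5%" is read as ζi ≤ 1.05 ζopt.

```python
def near_optimal(fits, tol=0.05):
    """Keep runs whose cost is within a relative tolerance of the best cost.

    fits: list of (cost, params) tuples, one per converged multi-start run.
    """
    zeta_opt = min(cost for cost, _ in fits)
    return [(cost, params) for cost, params in fits
            if cost <= (1.0 + tol) * zeta_opt]

# Toy costs for illustration only.
runs = [(10.0, {"alpha": 0.25}), (10.4, {"alpha": 0.26}),
        (10.6, {"alpha": 0.31}), (12.0, {"alpha": 0.40})]
kept = near_optimal(runs)   # the runs with cost 10.0 and 10.4 are retained
```

Note that the filter acts on the cost, not on the parameters, so the spread of the retained parameter values is an outcome of the analysis rather than something imposed by it.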

Fig. 3: Practical identifiability of parameters in two-population model.

Distribution of parameters that yield model fits within 5% of the optimal parameterization for three doses: 0.032 μM in blue, 0.32 μM in red, and 3.2 μM in yellow. Below each distribution, the mean is provided, along with the coefficient of variation (CV). Plots are shown for (a) rS, (b) dS, (c) α, (d) rR, and (e) dR. Further details about the near-optimal parameters are found in Supplementary Table S1.

Figure 3 displays the distribution of the near-optimal parameters for drug concentrations of 0.032, 0.32, and 3.2 μM. We find that, for each model parameter, the near-optimal values are tightly distributed about the optimal value across the three doses, which can be interpreted as a measure of practical identifiability of model parameters; structural identifiability is justified in “Structural identifiability analysis”. Further, each parameter behaves monotonically with respect to dose. We now discuss the behavior of each model parameter as a function of dose in detail.

The growth rate rS of sensitive cells is observed to monotonically increase as a function of drug dose, with no overlap in the near-optimal distributions across the three doses. This counterintuitive observation may be a consequence of a relatively fast cellular response to stress (therapy), which prompts a release of growth signals via paracrine signaling; this has been observed in the case of pancreatic ductal adenocarcinoma upon radiotherapy36. Other possible mechanisms might include one or more of the following when a targeted inhibitor such as vemurafenib is applied: (1) an increase in growth rate as the cell is relieved from its feedback regulation of cell growth37 or (2) activation of compensatory metabolic programs that are more efficient, allowing energy or other resources to be redirected toward growth38.

As expected, the drug-induced death rate of sensitive cells dS monotonically increases as a function of drug concentration. Again we find no overlap in the near-optimal distributions across the three doses, indicating that drug-induced cell kill behaves in a dose-dependent manner.

The rate of drug-induced resistance α is found to monotonically decrease as a function of drug concentration. As opposed to the sensitive population growth and induced death rates rS and dS, a slight overlap exists in the distribution of α over the lower two doses. In particular, the maximal near-optimal value of α found at a dose of 0.32 μM is 0.2518, whereas the minimal value found at a dose of 0.032 μM is 0.2461. This overlap is visually represented in Fig. 3 by the absence of the hatch marks between the distributions. We had originally hypothesized that the relationship between α and drug dose would be the reverse of what is observed; that is, we had hypothesized that higher drug doses would induce resistance at a quicker rate. A possible explanation for this behavior is that higher doses may kill sensitive cells (as a result of the increased value of dS) before they are able to transition to a resistant state.

The distributions of the resistant subpopulation parameters tell an interesting story about the behavior of drug-resistant cells that form at higher drug doses. Focusing first on the growth rate rR, we observe the expected relationship: this rate decreases as a function of drug dose. Interestingly, the order of magnitude of the near-optimal growth rates is the same at the two lower doses of the drug. In particular, the near-optimal values at the dose of 0.032 μM are in the range [0.0498, 0.0546]. For the dose of 0.32 μM, the near-optimal parameter range is [0.0353, 0.0445]. Although a dose-dependence between these lower doses exists, the rR parameter is always on the order of 10⁻². We observe a different phenomenon at the highest dose of 3.2 μM, where >98% of the near-optimal parameters are on the order of 10⁻³. Similarly, near-optimal values of the drug-induced death rate dR lie in the range [0.0001, 0.0417] at the highest drug concentration, as compared to [0.0581, 0.0752] at the intermediate concentration and [0.0691, 0.0779] at the lowest concentration. Taken together, we find that resistant cells that were induced by the highest drug concentration grow, and are killed by the drug, at significantly lower rates than at intermediate and low drug concentrations. This result is worth comparing to a finding in ref. 33 that COLO858 cells treated at a dose of 3.2 μM exhibit one of three responses: rapid death, quiescence (the cells were observed to survive, but not divide, in the experimental time window), and resistant-like behavior where the surviving cells divide at much lower rates than observed in the absence of drug. The fact that, only at the highest dose in our model, the resistant subpopulation takes on quiescent-like behavior (very slow growth and death) seems consistent with the response to high drug doses in ref. 33.

We also observe an interesting relationship between the net resistant growth rate, rR − dR, and dose, which is provided in the last row of Table 1. As this rate is always negative, we see that all of the applied doses are able to asymptotically eliminate the resistant population. However, this net rate is most negative at the intermediate dose of 0.32 μM, suggesting that intermediate dosing regimens may be more effective than a maximally tolerated dose (MTD) inspired strategy. This is investigated more thoroughly in “Optimal control results”, where we solve an optimal control problem to determine the effectiveness of different treatment protocols. Note that rR − dR represents the asymptotic net growth rate of the resistant population, as the sensitive population is always eliminated (rS − dS is negative for all doses in Table 1).
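The asymptotic interpretation of the net rate follows directly from Eqs. (1, 2). The short derivation below is a sketch that assumes positive delay parameters, so that both delay factors tend to 1 at large times. Equation (1) then approaches

$$\dot{S}\approx ({r}_{S}-{d}_{S}-\alpha )S,$$

and since α ≥ 0, a negative value of rS − dS is sufficient for S(t) to decay to zero. With the induction term in Eq. (2) vanishing along with S, the resistant population satisfies

$$\dot{R}\approx ({r}_{R}-{d}_{R})R\quad \Rightarrow \quad R(t) \sim C{e}^{({r}_{R}-{d}_{R})t},$$

for some constant C > 0 determined by the transient dynamics. A negative net rate rR − dR therefore implies eventual decay of R, and a more negative value implies faster decay.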

Taken together, the tight distribution of the best-fit parameter values across three drug concentrations, as well as the monotonicity of the parameter distributions as a function of drug concentration, lends support to the sufficiency of the proposed two-compartment model of drug-induced resistance in Eqs. (1, 2). In other words, the dynamics observed in the data require only the existence of a sensitive and a resistant population, which is an experimentally testable hypothesis. For instance, gene expression data could be analyzed through clustering methods to see if two subgroups are sufficient to classify the data. Although these results all assumed no pre-existing resistance, the optimal values of the parameters and cost function are effectively unchanged if we instead assume that 1% of the initial population is intrinsically resistant to the drug (see Supplementary Table S3).

A finer-grained three-population model does not improve fits to data

Given that experimental data indicates the presence of a subpopulation of quiescent cells at the highest drug dose that is distinct from the resistant subpopulation, we next explored whether a three population model that accounts for both a quiescent and a resistant subpopulation would better describe the experimental data. The three-population model we consider is a direct extension of our two-population model:

$$\dot{S}={r}_{S}S-{d}_{S}(1-{e}^{-{\gamma }_{1}t})S-q(1-{e}^{-{\gamma }_{2}t})S$$
(4)
$$\dot{Q}=q(1-{e}^{-{\gamma }_{2}t})S-\beta Q$$
(5)
$$\dot{R}=\beta Q+{r}_{R}R-{d}_{R}(1-{e}^{-{\gamma }_{1}t})R.$$
(6)

In this model, Q represents the normalized number of quiescent cells, and it is assumed that cells must pass through the quiescent state before becoming resistant. The drug-induced transition of sensitive cells to quiescent cells occurs at rate q, and quiescent cells transition to resistant at rate β. More details are provided in “Mathematical models”.
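The three-population system in Eqs. (4)–(6) can be simulated in the same way as the two-population model. The sketch below is illustrative only: it uses a generic hand-written RK4 stepper, and the parameter values are placeholders rather than fitted values. The names `three_population_rhs` and `rk4` are assumptions of this example.

```python
import math

def three_population_rhs(t, y, p):
    """Right-hand side of Eqs. (4)-(6); y = (S, Q, R)."""
    S, Q, R = y
    kill_delay = 1.0 - math.exp(-p["gamma1"] * t)    # delayed drug kill
    induce_delay = 1.0 - math.exp(-p["gamma2"] * t)  # delayed entry into quiescence
    dS = p["rS"] * S - p["dS"] * kill_delay * S - p["q"] * induce_delay * S
    dQ = p["q"] * induce_delay * S - p["beta"] * Q
    dR = p["beta"] * Q + p["rR"] * R - p["dR"] * kill_delay * R
    return (dS, dQ, dR)

def rk4(rhs, y0, p, t_end=100.0, dt=0.01):
    """Classical RK4 for a tuple-valued state; returns the state at t_end."""
    y = tuple(y0)
    for i in range(int(round(t_end / dt))):
        t = i * dt
        k1 = rhs(t, y, p)
        k2 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k1)), p)
        k3 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k2)), p)
        k4 = rhs(t + dt, tuple(a + dt * b for a, b in zip(y, k3)), p)
        y = tuple(a + dt / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
    return y

# Placeholder parameters (per hour); NOT fitted values.
params3 = dict(rS=0.05, dS=0.10, q=0.02, beta=0.05,
               rR=0.04, dR=0.06, gamma1=0.1, gamma2=0.1)
S_end, Q_end, R_end = rk4(three_population_rhs, (1.0, 0.0, 0.0), params3)
N_end = S_end + Q_end + R_end   # total normalized count compared with data
```

A useful sanity check on any implementation is that, with all growth and kill rates set to zero, the transitions S → Q → R only move cells between compartments, so the total S + Q + R must remain constant.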

The best fits of this three-population model to the experimental cell counts at doses of 0.032, 0.32 and 3.2 μM are shown in Fig. 4a–c, under the assumption that all cells are initially-sensitive. Although excellent fits to the data are obtained, the fits do not show any noticeable improvements from the two-population model (see Fig. 2). This visual observation is confirmed by comparing the value of the normalized goodness-of-fit function:

$${\zeta }_{norm}:= \mathop{\sum }\limits_{t=1}^{T}\frac{\left\vert {N}_{{\rm{model}}}(t)-{N}_{{\rm{data}}}(t)\right\vert }{{N}_{{\rm{data}}}(t)}$$
(7)

using the two- and three-population model (Fig. 4d).
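For reference, the normalized goodness-of-fit in Eq. (7) divides each absolute residual by the corresponding data value before summing. The sketch below is a minimal illustration; the function name and the positivity check on the data are assumptions of this example.

```python
def cost_normalized(model_counts, data_counts):
    """Eq. (7): sum over time points of |N_model(t) - N_data(t)| / N_data(t)."""
    if len(model_counts) != len(data_counts):
        raise ValueError("model and data must be sampled at the same time points")
    if any(d <= 0 for d in data_counts):
        raise ValueError("normalized cell counts must be positive")
    return sum(abs(m - d) / d for m, d in zip(model_counts, data_counts))

# Toy values for illustration only (not experimental data).
zeta_norm = cost_normalized([1.0, 1.5, 2.0], [1.0, 1.0, 2.5])
```

Because each term is a relative error, ζnorm grows with the number of time points T, so values are comparable across doses and models only when the sampling is the same.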

Fig. 4: Model fits for the three-population model.

Best model fits are shown at a drug concentration of (a) 0.032 μM, (b) 0.32 μM, and (c) 3.2 μM. The estimates of the standard error of the sample mean (SEM) across four replicates are shown in blue in (a–c). d The relative value of the cost function ζnorm when data from each dose is fit using either the two-population (Eqs. (1, 2), shown in blue) or the three-population (Eqs. (4)–(6), shown in red) model.

Plots of the near-optimal parameter distributions for the three-population model (Fig. 5) further highlight the lack of necessity of a three-population model, as many of the model parameters are non-identifiable. The exceptions are the parameters describing the sensitive cells. Not only do rS and dS have tight distributions in the three-population model, but the distributions are also very similar to the two-population case in Fig. 3. The lack of identifiability is demonstrated quite clearly for the resistant cell parameters at the dose of 3.2 μM. While the resistant cell growth rate rR behaves quite similarly in the two- and three-population models at lower doses, at the dose of 3.2 μM the near-optimal parameters for the three-population model extend over a very wide range of values, indicating that we do not have sufficient data to identify the value of the parameter in this case. Interestingly, the drug-induced death rate dR of resistant cells differs across doses when comparing the two- and three-population models. However, as with rR, the loss of identifiability is most apparent at the highest drug dose. The transition parameters q and β preserve their identifiability at higher drug doses, but lose identifiability at the lowest dose of 0.032 μM. Taken together, this analysis indicates that a number of model parameters cannot be adequately identified given the available data when using this three-population model. The absence of improvement in model fits, coupled with the lack of identifiability of model parameters, lends strong support to the sufficiency of the two-population model.

Fig. 5: Practical identifiability of parameters in the three-population model.

Distribution of parameters that give fits within 5% of the optimal parameterization for three-population model at three doses: 0.032 μM in blue, 0.32 μM in red, and 3.2 μM in yellow. Plots are shown for (a) rS, (b) dS, (c) q, (d) β, (e) rR, and (f) dR. Further details about the near-optimal parameters are found in Supplementary Table S2.

We also considered a three-population model that instead assumes quiescence is a terminal state in its own right, rather than a transient state en route to resistance (see SI S.4). In this revised three-population model, the drug induces a fraction π of sensitive cells into a quiescent state and 1 − π into a resistant state. Quite interestingly, the optimal parameterization of this model reveals that, at low doses, the model effectively reduces to a two-population model with only sensitive and resistant cells, as the transition rate into quiescence is very small. At the highest dose, the resistant population grows so slowly and is killed so rapidly that it effectively behaves like a quiescent population. In other words, the three-compartment model again effectively behaves like a two-compartment model.

Predicting population response at other doses

Given our choice to proceed with the two-population model, we next turned to validation of that model. We approximated the value of the parameters for the non-fit doses of 0.1 and 1 μM via the piecewise linear interpolation process described in “Computational methods”. The parameters that result from this interpolation process are shown in Fig. 6.
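The interpolation step can be sketched as follows. This is a hedged illustration, not the procedure of “Computational methods”: in particular, whether the interpolation is linear in dose or in log-dose is not stated in this section, so the sketch exposes that choice as a hypothetical `log_dose` flag. The parameter values used in the example are placeholders.

```python
import math

def interpolate_parameter(dose, train_doses, train_values, log_dose=False):
    """Piecewise linear interpolation of one parameter between training doses.

    train_doses must be sorted in increasing order. If log_dose is True, the
    interpolation is linear in log10(dose); this choice is an assumption.
    """
    xs = [math.log10(d) for d in train_doses] if log_dose else list(train_doses)
    x = math.log10(dose) if log_dose else dose
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("dose lies outside the range of the training doses")
    for x0, x1, y0, y1 in zip(xs, xs[1:], train_values, train_values[1:]):
        if x0 <= x <= x1:
            weight = (x - x0) / (x1 - x0)
            return y0 + weight * (y1 - y0)

train_doses = [0.032, 0.32, 3.2]      # training doses (μM) from the text
placeholder = [0.30, 0.20, 0.10]      # illustrative values, NOT Table 1
p_01 = interpolate_parameter(0.1, train_doses, placeholder)
p_1 = interpolate_parameter(1.0, train_doses, placeholder)
```

Since the training doses are spaced by a factor of ten, the two conventions weight the neighboring doses quite differently, so the choice should be checked against the methods before reusing this sketch.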

Fig. 6: Parameter predictions.

Piecewise linear interpolation (dashed blue line) of best-fit parameter values at drug concentrations 0.032, 0.32 and 3.2 μM to predict parameter value at withheld concentrations of 0.1 and 1 μM.

In Fig. 7, we illustrate the goodness-of-fit of the model solved at the interpolated parameter sets to the in vitro cell count data at the doses of 0.1 and 1 μM. We observe that the model captures well the qualitative dynamics of tumor growth at the two doses that were withheld for validation purposes. Quantitatively, the fits are quite good, although they are inferior to the fits for the training datasets (Fig. 4d). Specifically, the worst fit to the training data is observed at the dose of 3.2 μM, with this case having a relative cost ζnorm of 21.56 in the two-population model; the value of the relative cost for the dose of 0.1 μM is ~2.5 times higher. This occurs because, at this dose, model predictions are at the upper bound of the experimental data. Similarly, the value of the relative cost for the dose of 1 μM is approximately double the worst fit in the training data. This occurs because, at this dose, the model predicts that the normalized cell count achieves its maximum several hours earlier than is observed in the data. Further, the interpolated parameters also predict a local minimum around hour 40 that is not seen in the in vitro data.

Fig. 7: Data and model predictions using interpolated parameters.

Predictions are shown at a drug concentration of (a) 0.1 μM and (b) 1 μM. The estimates of the standard error of the sample mean (SEM) across four replicates are shown in blue. The relative value of the cost function ζnorm at the dose of 0.1 μM is 53.79, and at the dose of 1 μM is 44.42.
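For comparison with the training fits, the relative costs reported here can be expressed as multiples of the worst training cost (ζnorm = 21.56 at 3.2 μM, Fig. 2):

$$\frac{{\zeta }_{norm}(0.1\,\mu {\rm{M}})}{{\zeta }_{norm}(3.2\,\mu {\rm{M}})}=\frac{53.79}{21.56}\approx 2.49,\qquad \frac{{\zeta }_{norm}(1\,\mu {\rm{M}})}{{\zeta }_{norm}(3.2\,\mu {\rm{M}})}=\frac{44.42}{21.56}\approx 2.06.$$

That is, the prediction at 0.1 μM has roughly 2.5 times, and the prediction at 1 μM roughly twice, the relative cost of the worst training fit.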

Nonetheless, given that parameters at both doses were obtained by a simple piecewise linear interpolation of parameter values obtained at other doses, the quality of these predictions does indicate that our model is capable of describing COLO858 response to a wide range of vemurafenib doses.

Optimal control results

The parameterization of the two-compartment model in Table 1 suggests that high drug doses may not be optimal. While the induction rate α decreases as a function of dose, the drug-induced resistant cells are actually most targetable at an intermediate drug dose. This can be seen in the values of the resistant cell net growth rate inclusive of drug effect, rR − dR, in Table 1. The larger the value of rR − dR, the “more resistant” the drug-induced phenotype is. As this combined parameter is minimized at the intermediate dose of 0.32 μM, and the induced resistance rate is not maximal at this dose, together this indicates that an intermediate dose of drug may be optimal.

As our mathematical model has been calibrated and validated on a number of doses, we can use it to determine theoretical optimal dosing strategies, as well as how close “standard” dosing protocols are to being optimal. Here, we explore solutions to the optimal control problem posed in “Optimal control formulation”. In particular, we numerically seek to minimize the final cancer cell number (Eq. (23)) subject to a total dosage constraint (Eq. (25)) using CasADi. As different doses in the experimental data utilized different values for the total applied dose, we also explore the impact of varying the total dosage M in Fig. S4.
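Equations (23) and (25) are given in “Optimal control formulation” and are not reproduced in this section. As a reading aid only, the problem described in words above can be paraphrased schematically as follows, where the final time tf and the pointwise bound umax are notation introduced for this sketch and the dose-dependence of the model parameters is left implicit:

$$\mathop{\min }\limits_{u(\cdot )}\ N({t}_{f})=S({t}_{f})+R({t}_{f})\quad {\rm{subject\ to}}\quad \int_{0}^{{t}_{f}}u(t)\,dt\le M,\quad 0\le u(t)\le {u}_{\max }.$$

This schematic should not be taken as the exact formulation; the precise objective, constraints, and state equations are those stated in the Methods.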

The computed optimal control, as well as the population response, is provided in Fig. 8. We also include a comparison with a number of other dosing strategies, including a constant intermediate dose, two maximally tolerated dose (MTD) strategies (one where the dose is applied at the beginning of the treatment, and the other where it is applied at the end), as well as a periodic “bang-bang” switching strategy. Note that all applied controls, including the optimal control, reach the upper bound on the total applied dosage M, and thus are all comparable with respect to the total amount of applied drug. Although not shown, all M corresponding to the constant dosages considered in the experimental data yield similar optimal control strategies.

Fig. 8: Response of cell population to applied controls.

Control strategies considered appear on the left, and include the numerically computed optimal control (black), a constant intermediate dose (blue), a maximal tolerated dose applied at the onset of therapy (green), a maximal tolerated dose applied at the end of therapy (gold), and a pulsed “bang-bang” strategy with four “on” cycles (red, dotted). All strategies obtain the same total applied dosage of M = 100 μM hours. Total cell population (N = S + R) responses are provided in the right figure. Note that the numeric optimal control yields a smaller final tumor size compared to the other strategies considered, but is comparable in value to the constant intermediate (i.e. metronomic) therapy.

The structure of the computed optimal control confirms the prediction discussed in “Two-population model adequately captures dynamics over a range of doses” regarding the optimality of intermediate doses: the optimal dosing regimen remains well below the upper bound of 3.2 μM. Moreover, while this control may be difficult or even impossible to implement, we observe that it is well approximated by the constant intermediate dosing strategy u(t) ≡ 1 μM. This is especially clear when we compare the final cancer cell populations, with the optimal control only slightly outperforming the constant strategy. As the constant therapy is an idealization of metronomic therapy (i.e. low dose, frequently or continuously administered chemotherapy39), our results imply that such therapy is near optimal. Comparing the population responses to the MTD therapies, we see that this is even more apparent, as this approximate metronomic therapy results in a substantially smaller relative number of cancer cells (N(tf) ≈ 1.61, 4.89, 2.83 for constant, initial MTD, and ending MTD, respectively). For this cell population, it thus appears that MTD treatments are far from optimal, which is supported by a number of recent experimental and theoretical studies40,41,42,43,44,45,46. Furthermore, although the optimal control yields a slightly smaller final cancer size compared to constant therapy, it only achieves this result at the very end of the therapeutic window, and for most of the experiment, constant therapy results in a smaller cancer burden (e.g. the constant treatment has a smaller L1 norm).
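To illustrate how the comparison protocols can share the same total dose, the sketch below builds piecewise-constant schedules and checks that each integrates to M = 100 μM hours. Several details are assumptions made for this example rather than statements from the source: the horizon of 100 h (chosen so that u ≡ 1 μM delivers M), the use of the 3.2 μM upper bound as the MTD and pulse level, and the even spacing of the four pulses.

```python
def constant(level, t_f):
    """Constant dose over the whole horizon; segments are (start, end, dose)."""
    return [(0.0, t_f, level)]

def mtd_front(u_max, M, t_f):
    """Maximal dose from t = 0 until the dose budget M is spent, then no drug."""
    on = M / u_max
    return [(0.0, on, u_max), (on, t_f, 0.0)]

def mtd_back(u_max, M, t_f):
    """No drug initially, then maximal dose ending exactly at t_f."""
    on = M / u_max
    return [(0.0, t_f - on, 0.0), (t_f - on, t_f, u_max)]

def pulsed(u_max, M, t_f, cycles=4):
    """Evenly spaced on/off pulses at u_max with total dose M (assumed layout)."""
    period = t_f / cycles
    on = M / (u_max * cycles)
    segments = []
    for k in range(cycles):
        start = k * period
        segments.append((start, start + on, u_max))
        segments.append((start + on, start + period, 0.0))
    return segments

def total_dose(segments):
    """Exact integral of a piecewise-constant schedule, in μM hours."""
    return sum((end - start) * dose for start, end, dose in segments)

M, T_F, U_MAX = 100.0, 100.0, 3.2   # T_F is an assumed horizon
schedules = {
    "constant": constant(1.0, T_F),
    "mtd_front": mtd_front(U_MAX, M, T_F),
    "mtd_back": mtd_back(U_MAX, M, T_F),
    "pulsed": pulsed(U_MAX, M, T_F),
}
doses = {name: total_dose(s) for name, s in schedules.items()}
```

Under these assumptions, each MTD schedule applies drug for 100 / 3.2 = 31.25 h, which makes clear that the protocols differ in timing and intensity rather than in total drug delivered.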

Discussion

Drug-induced resistance is a major impediment to the success of targeted therapy. Several deterministic and stochastic mathematical models of induced resistance have been recently proposed47,48,49,50,51,52,53, and are briefly discussed in SI S.5. None of these models are simple and interpretable while at the same time providing good fits to tumor cell count (or volume) data when resistance is known to be induced by the drug. This paper describes a minimal mathematical model of drug-induced resistance which is a modification of a previous model proposed by us31. Here, we demonstrate that this model is able to provide excellent fits to time-resolved in vitro experimental data, allowing one to consider both induced resistance and delays in drug action. Moreover, using only bulk data on total cell numbers, the model allows one to separate the contributions of sensitive and resistant subpopulations and describes their dynamics. A theoretical identifiability analysis demonstrated that the parameters of the proposed model are, in principle, uniquely obtainable from data. A practical identifiability analysis showed tight confidence bounds on all parameters given the available data, provided we only consider a sensitive and resistant subpopulation of cells; practical identifiability is lost in finer-grained models that also include a quiescent subpopulation. To determine the predictive power of the model, we assessed its ability to predict data at two untrained drug doses. Excellent qualitative, and reasonable quantitative, predictions were made by the model at both untrained doses. Using the validated model, we numerically explored an optimal control problem, with the goal of minimizing the final cell count subject to an applied dosage constraint. While we did identify a non-standard dosing strategy as the optimal control, the optimal control only slightly outperformed a metronomic-like protocol of administering a constant, lower dose of drug. On the other hand, maximal tolerated dose-like protocols proved to be far inferior.

Despite the model’s ability to describe the training data very well, and the validation data reasonably well, several shortcomings of the model should be noted. First, asymptotically, the model predicts that both the sensitive and resistant populations tend to extinction. This is an artifact of assuming exponential net growth, where both the growth rate and drug-induced death rate are proportional to the subpopulation size. The only two outcomes of such a model, once the drug has eliminated the sensitive cells, are that the resistant population is eventually eradicated or that the resistant population grows unbounded. Since the net growth terms that result from model fitting (see Table 1) are all negative, extrapolating far beyond the data fitting window would indeed eventually result in the extinction of the resistant population. While a more complicated growth or drug-induced death term could remedy this, the data were not available to parametrize a model with more complexity. It is important to note, however, that the model only captures subpopulation dynamics subject to treatment. While it is certainly possible that, in vitro, indefinite administration of therapy could eradicate both sensitive and resistant cells, due to toxicity concerns such a scenario is not realistic in vivo. Given this, we were careful in this work to not extrapolate predictions outside the 100-h window of treatment.

Another limitation is that the model fits to the validation doses of 0.1 and 1 μM are inferior to the fits on the training doses. In particular, at the dose of 0.1 μM, the model predictions are consistently at the upper bound of the experimental data, and at a dose of 1 μM, the timing of the peak cell count in the model is shifted relative to the data. These discrepancies could be indicative of an inadequacy in the model structure or of a limitation of the simplifying assumption that parameter values at intermediate doses can be determined from a piecewise linear interpolation of neighboring doses. It would be desirable to carry out validation experiments at a finer grid of concentrations, to further restrict the model.

Finally, our model makes one counterintuitive prediction that requires deeper study. In particular, we observe that the growth rate of sensitive cells increases as a function of dose. We speculate that this may be due to the cells’ intrinsic response to stress (in days)54, which happens at a faster time scale than the emergence of the resistant population arising from slower mechanisms such as epigenetic reprogramming (in weeks)35. Identifying such mechanisms and then blocking their action may elucidate the relation between these two dose-dependent effects.

This study suggests a variety of experimental follow-ups. One of the most important of these is the validation of our predictions, at least qualitatively if not quantitatively, of the respective time-resolved contributions of resistant and sensitive populations. Our mathematically derived S and R populations suggest the possibility of grouping the distributions of mRNA, protein, or epigenetic dynamic signatures of individual cells into just two major clusters with separate drug sensitivity and growth and death rates. These signatures might reflect resistance markers, such as the increased expression of membrane transporters that enhance the efflux of drugs, as in ref. 21. Experimental tools that could be utilized to investigate this include single-cell clonally-resolved transcriptome datasets (scRNA-seq), as in ref. 55.

Another direction for work is the comparison of drug dosing strategies, for the same in vitro cell lines and drugs, as investigated in the mathematically derived optimal control analysis (see “Two-population model adequately captures dynamics over a range of doses”). Many natural questions exist. For example, is a continuous-dose or a constant “metronomic” strategy actually superior to an MTD approach? Are other periodic strategies superior in terms of the final population size at a specific time horizon? Do pulsed strategies not afford any advantages, as with our example (red dashed curve) in Fig. 8?

Of course, the development of similar models for other rich time-resolved data sources would also be of interest. Different cell lines and different drug choices will affect the parameters differently, through interference with metabolic processes or disruption of regulatory feedback loops.

Bridging the gap between preclinical drug response evaluation and clinical testing remains a major challenge. One key step to address this gap, however, is to understand the mechanisms by which tumor cells develop drug resistance, since targeted therapies often yield a substantial initial response in patients before tumor relapse. In this work, we use mathematical modeling to dissect the population dynamics in tumor cell adaptation to drugs, a process that ultimately seeds future tumor relapse. In particular, we leverage a unique dataset, where individual BRAF mutant melanoma cells (a prototypical system for assessing drug adaptation to targeted therapy) were tracked over 5 days under variable clinically relevant doses of Vemurafenib, which is also an FDA-approved drug for BRAF-mutant melanoma56. Importantly, at this time scale, subpopulations of tumor cells often are able to adapt and escape therapy32,54, constituting a reservoir of tumor cells that later develop more stable resistance. Hence, quantitatively understanding the process of drug adaptation in this system can yield important insights into drug response that can significantly impact clinical outcomes.

Methods

Mathematical models

A large number of mathematical models of drug-induced resistance have been proposed of late48,49,50,51,52,53,57,58,59,60. (Several of them are reviewed in SI S.5.) Many of these model the transition of cancer cells from a sensitive compartment to a resistant one in the presence of therapy (although, in general, not distinguishing random from induced effects). The precise functional form for this transition, along with the functional forms for the growth and tumor-kill terms, differs from model to model. Thus, the field is at the stage where these modeling assumptions must be interrogated against datasets with demonstrated resistance induction.

In an effort to better understand the role of selection and induction in the evolution to drug resistance, in ref. 31 we introduced a simple, phenomenological mathematical framework to distinguish between spontaneous and induced resistance. This system of ordinary differential equations divided cells into three subpopulations: sensitive cells, non-reversible resistant cells, and reversibly resistant cells. Each population was assumed to grow logistically and to be killed by the drug at a dose-dependent rate. Spontaneous, drug-independent resistance causes sensitive cells to transition to the non-reversible resistant compartment, while a drug-dependent term results in sensitive cells transitioning to the reversibly resistant compartment.

We have previously analyzed in detail this three-compartment model subject to several simplifying assumptions. The first assumption is that there is only a single resistant subpopulation that is induced by the drug, while the second is that the reversion to the sensitive phenotype in the presence of drug is negligible on the time scales considered21,31. Thus, this simplified model considers both pre-existing resistance and spontaneous drug-induced resistance (without reversibility), but not randomly acquired drug resistance (see Fig. 1). Under these assumptions, the dynamics of the number of sensitive cells S and the number of resistant cells R are described as follows:

$$\dot{S}={r}_{S}\left(1-\frac{S+R}{K}\right)S-\,\alpha u(t)S-{d}_{S}u(t)S,$$
(8)
$$\dot{R}={r}_{R}\left(1-\frac{S+R}{K}\right)R+\alpha u(t)S-{d}_{R}u(t)R.$$
(9)

Here rS and rR are the growth rates of the sensitive and resistant population, respectively, and dS and dR are the corresponding drug-induced kill rates. The function u(t) represents the effective applied drug dose at time t, and is thought of as a control input; dots denote derivatives with respect to time in all equations in the manuscript. By definition, the drug is less effective against resistant cells, so we assume that dR ≤ dS. We also assume that rS ≥ rR, as experimental evidence supports that resistant cells grow slower than nonresistant cells61,62. Drug-induced resistance occurs at the rate α. Our analysis of this model31 indicated that, provided there was a minimal amount of pre-existing resistance, treatment response was controlled by the extent of drug-induced resistance α. Hence we focused our prior analysis on the impact of induced resistance by assuming R(0) = 0. We found a qualitative distinction in tumor response to two standard treatment schedules (continuous versus pulsed) when the drug has the same cytotoxic potential but different levels of induction. A rigorous analysis of the optimal control structure (both theoretical and numerical) can also be found in ref. 30.

The model we utilize in this work has the same overall structure as the two-compartment model in Eqs. (8, 9), albeit with two notable modifications. The first is that, because we fit to in vitro data, we converted the logistic growth terms to exponential growth. The second change is motivated by the fact that, in the experimental data33, there is a notable delay in drug response. This can be seen in Fig. 2a, wherein the mean cell count at all doses follows the same trajectory until around 18 h. Only at that time do the dose-specific trajectories diverge from one another. Biologically, this delay in drug effect may be due to various factors such as drug uptake and engagement of the target63, initiation of the apoptotic pathway64, or cell-cycle inhibition65. Note that model (8, 9) does not include any such delay, and thus any applied difference in dosage will result in an immediate response by the cell population.

Thus, in the current work, we consider the revised model of the normalized cell counts (relative to the initial count) S and R given in Eqs. (1, 2). Rather than explicitly including the control term u(t) in this model, we instead consider the growth rates rS and rR, the induction parameter α, and the drug-kill parameters dS and dR to be dose-dependent, and thus fit separately to time-course data from experiments utilizing different doses. However, structurally Eqs. (1, 2) model drug-kill and resistance induction in the same way as Eqs. (8, 9). That is, the delay in drug response is accounted for by modulating the kill and induction response by terms of the form \((1-{e}^{-{\gamma }_{i}t})\), so that the system of differential equations is explicitly time-varying with respect to these rates. A more mechanistic derivation of the model (1, 2), that also permits the consideration of time-varying dosing strategies in an optimal control problem, is presented in the next subsection.

Given that the experimental data indicated the presence of a subpopulation of quiescent cells that is distinct from the resistant subpopulation at the highest drug dose33, we also consider a three-population extension of the model in Eqs. (1, 2) that explicitly accounts for this quiescent subpopulation (see Eqs. (4–6)). In this model, we assume that cells must pass through the quiescent state (at rate q) before becoming resistant (at rate β). The definition of all other parameters in this three-population model is the same as the definition in the two-population model in Eqs. (1, 2).

Details on derivation of mathematical model

To incorporate the delay on both drug-induced cell kill and drug-induced resistance, we differentiate between the applied dosage u and the effective dosage v. Here, u represents the drug concentration of vemurafenib in the in vitro experiment, while v corresponds to the phenotypic effect of the dosage on cell kill and induction rates. We assume that the dynamics on the rate parameters can be described via classical input (u) tracking66, with constant rates γ1 and γ2 determining the delay timescales for the drug-induced cell death and resistance-induction rates, respectively. The model we consider takes the following form:

$$\dot{S}={r}_{S}(u(t))S-{v}_{d,S}(t)S-{v}_{\alpha }(t)S$$
(10)
$$\dot{R}={r}_{R}(u(t))R+{v}_{\alpha }(t)S-{v}_{d,R}(t)R$$
(11)
$${\dot{v}}_{d,S}={\gamma }_{1}({d}_{S}(u(t))-{v}_{d,S})$$
(12)
$${\dot{v}}_{d,R}={\gamma }_{1}({d}_{R}(u(t))-{v}_{d,R})$$
(13)
$${\dot{v}}_{\alpha }={\gamma }_{2}(\alpha (u(t))-{v}_{\alpha })$$
(14)

As in Eqs. (8, 9), state S represents the normalized number of sensitive cells and R represents the normalized number of cells that have acquired an “inducible drug resistant” state. The parameters rS and rR represent growth rates of the drug sensitive and drug resistant populations, respectively, as a function of the applied dose u(t).

As opposed to parameters, we now consider states vd,S and vd,R, which represent each subpopulation’s drug-induced death rate, and vα, the rate at which drug sensitive cells are induced to transition to the resistant phenotype. Note that division rates rS and rR are assumed to respond to instantaneous changes in the applied dosage u, while the effective apoptosis rates vd,S and vd,R and the effective induction rate vα approach their asymptotic values at rates determined by γ1 and γ2, respectively. This tracking dynamic thus models the observed delay in the onset of observable effects of the applied drug (see Fig. 2a). We emphasize that all rate parameters (rS, rR, dS, dR and α) are dependent on the applied dose u(t), and we denote this via expressions of the form rS(u(t)), rR(u(t)), etc. As discussed in “Computational methods”, these rates will be estimated from the normalized cell count data for different constant doses u(t) ≡ uconstant.

As an example of the dynamic introduced by rate tracking in cell death and induction, consider Eq. (12) describing the effective apoptosis rate for the sensitive population:

$${\dot{v}}_{d,S}={\gamma }_{1}({d}_{S}(u(t))-{v}_{d,S}).$$

Assuming vd,S(0) = v0, the solution of this linear ODE is given by the variation of parameters formula

$${v}_{d,S}(t)={v}_{0}{e}^{-{\gamma }_{1}t}+{\gamma }_{1}\mathop{\int}\nolimits_{0}^{t}{e}^{-{\gamma }_{1}(t-\sigma )}{d}_{S}(u(\sigma ))\,{\rm{d}}\sigma .$$
(15)

Similar statements hold for vd,R and vα. Hence, Eqs. (10–14) represent a time-varying two-dimensional system in the S and R populations.

The experimental data from ref. 33 utilized in this work consider constant applied doses u, so that when fitting model (10–14) to experimental data, we assume that u(t) ≡ u, a constant dose. Thus, we suppress the explicit applied dosage dependence of the rate parameters, and write rS := rS(u), rR := rR(u), dS := dS(u), dR := dR(u), and α := α(u), as these are constants (but are dose-dependent). The expressions in Eq. (15) thus simplify as follows. Assuming that the effective dosages for the cell kill and induction rates are zero initially (i.e. vd,S(0) = vd,R(0) = vα(0) = 0), we see from Eq. (15) that the solutions of Eqs. (12–14) take the form

$${v}_{d,S}(t)={d}_{S}\left(1-{e}^{-{\gamma }_{1}t}\right)$$
(16)
$${v}_{d,R}(t)={d}_{R}\left(1-{e}^{-{\gamma }_{1}t}\right)$$
(17)
$${v}_{\alpha }(t)=\alpha \left(1-{e}^{-{\gamma }_{2}t}\right).$$
(18)
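For completeness, the step from Eq. (15) to Eq. (16) can be written out explicitly. This is only a worked version of the calculation already implied by the text, under the stated assumptions of a constant dose (so that dS is constant) and v0 = 0:

$${v}_{d,S}(t)={\gamma }_{1}{d}_{S}\mathop{\int}\nolimits_{0}^{t}{e}^{-{\gamma }_{1}(t-\sigma )}\,{\rm{d}}\sigma ={\gamma }_{1}{d}_{S}\,{e}^{-{\gamma }_{1}t}\,\frac{{e}^{{\gamma }_{1}t}-1}{{\gamma }_{1}}={d}_{S}\left(1-{e}^{-{\gamma }_{1}t}\right).$$

The same computation with dR in place of dS, or with α and γ2 in place of dS and γ1, gives Eqs. (17) and (18).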

Using the above, system (10–14) reduces to the model presented in Eqs. (1, 2):

$$\begin{array}{ll}\dot{S}\,=\,{r}_{S}S-{d}_{S}(1-{e}^{-{\gamma }_{1}t})S-\alpha (1-{e}^{-{\gamma }_{2}t})S\\ \dot{R}\,=\,{r}_{R}R+\alpha (1-{e}^{-{\gamma }_{2}t})S-{d}_{R}(1-{e}^{-{\gamma }_{1}t})R.\end{array}$$

The above system is what is used to determine the rate parameters as a function of (constant) dose u. Note that we will later consider time-varying doses u = u(t) when formulating and analyzing an optimal control problem (see “Optimal control formulation” and “Two-population model adequately captures dynamics over a range of doses”); in this case, we will consider the model as formulated in Eqs. (10–14).
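The equivalence between the five-state tracking model and the reduced two-state model at constant dose can be checked numerically. The following is a minimal sketch rather than the authors' code: the rate values in `PARAMS` are hypothetical placeholders (not the fitted values), and a fixed-step fourth-order Runge-Kutta integrator is used here in place of the MATLAB solver mentioned in “Computational methods”.

```python
import math

# Hypothetical rate constants for one constant dose (illustrative only,
# NOT the fitted values reported in the paper).
PARAMS = {"rS": 0.03, "rR": 0.01, "dS": 0.05, "dR": 0.02, "alpha": 0.01}
GAMMA1 = GAMMA2 = 0.01  # delay rates as fixed in the text


def full_rhs(t, x, p):
    """Five-state tracking model at constant dose: (S, R, v_dS, v_dR, v_alpha)."""
    S, R, v_dS, v_dR, v_a = x
    return [
        p["rS"] * S - v_dS * S - v_a * S,
        p["rR"] * R + v_a * S - v_dR * R,
        GAMMA1 * (p["dS"] - v_dS),
        GAMMA1 * (p["dR"] - v_dR),
        GAMMA2 * (p["alpha"] - v_a),
    ]


def reduced_rhs(t, x, p):
    """Reduced two-state model with explicit (1 - exp(-gamma * t)) factors."""
    S, R = x
    kill = 1.0 - math.exp(-GAMMA1 * t)
    induce = 1.0 - math.exp(-GAMMA2 * t)
    return [
        p["rS"] * S - p["dS"] * kill * S - p["alpha"] * induce * S,
        p["rR"] * R + p["alpha"] * induce * S - p["dR"] * kill * R,
    ]


def rk4(rhs, x0, p, t_final, n_steps):
    """Classical fixed-step fourth-order Runge-Kutta integration from t = 0."""
    h = t_final / n_steps
    t, x = 0.0, list(x0)
    for _ in range(n_steps):
        k1 = rhs(t, x, p)
        k2 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k1)], p)
        k3 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k2)], p)
        k4 = rhs(t + h, [xi + h * k for xi, k in zip(x, k3)], p)
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x


# Entirely sensitive population and zero effective dose at t = 0.
full = rk4(full_rhs, [1.0, 0.0, 0.0, 0.0, 0.0], PARAMS, 100.0, 2000)
reduced = rk4(reduced_rhs, [1.0, 0.0], PARAMS, 100.0, 2000)
print("N(100 h), tracking model:", full[0] + full[1])
print("N(100 h), reduced model: ", reduced[0] + reduced[1])
```

Under these assumptions the two total cell counts should agree to within integration error, which is the content of the reduction described above.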

Computational methods

The in vitro data from ref. 33 includes five different dosing regimes covering a wide range of drug concentrations. Three of those doses were selected to parameterize the model in Eqs. (1, 2): 0.032 μM, 0.32 μM, and 3.2 μM. The doses of 0.1 μM and 1 μM are withheld for validation purposes. All model parameters are fit to this data except the delay terms γ1 and γ2. Figure 1C in ref. 32 showed that at around 24 h after initiation of vemurafenib (1 μM) treatment, COLO858 cells begin to die and most division events stop. Additionally, although the experiments in ref. 67 are for a different cell line, and were only performed for 24 h of treatment before being stimulated with EGF for 8 h, it is interesting that the data there also suggest that the onset of drug-adapted conditions for the A375 BRAF V600E melanoma cell line is observed at around 24 h. In another study68, cytotoxic effects reached full effect between 120 and 130 h. Although not all of these studies were specifically related to COLO858 cells, and the exact times are approximations, to decrease the complexity of the model (and specifically parameter estimation), we decided to fix γ1 = γ2 = 0.01. This corresponds to both terms achieving roughly 50% of their maximal effect at 72 h. We found that this assumption, which helps in practical identifiability and hence in parameter estimation, is not critical, as fitting for all parameters as well as the γi resulted in parameters γi of the same order of magnitude (not shown). Similar delay terms have been studied in the context of doxorubicin treatment69, and it has been remarked that the incorporation of delays is necessary for identifiability70. It has also been suggested that delays can be implemented in a model by inserting a series of intermediate compartments71.
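The “roughly 50% at 72 h” statement can be checked directly from the delay factor 1 − e^(−γt). The short sketch below assumes that γ is expressed in units of 1/h, which is consistent with the time scales quoted above but is not stated explicitly.

```python
import math

gamma = 0.01    # fixed delay rate from the text; units of 1/h are assumed
t_hours = 72.0

# Fraction of the asymptotic kill/induction rate reached at time t.
fraction_at_72h = 1.0 - math.exp(-gamma * t_hours)

# Time at which exactly half of the asymptotic rate is reached: ln(2) / gamma.
half_time = math.log(2.0) / gamma

print(f"fraction of maximal effect at 72 h: {fraction_at_72h:.3f}")
print(f"time to 50% of maximal effect: {half_time:.1f} h")
```

With γ = 0.01 the factor at 72 h is about 0.51, and the exact half-effect time is about 69 h.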

Given that we assume that all resistance is drug-induced, and that S and R represent normalized cell counts, the model is solved using the initial conditions of S(0) = 1 and R(0) = 0, unless otherwise specified.

The above assumptions result in needing to fit five model parameters per drug dose: the sensitive cell growth rate rS, the sensitive cell drug-induced death rate dS, the drug-induced resistance rate α, the resistant cell growth rate rR (assumed to be ≤rS), and the resistant cell drug-induced death rate dR (assumed to be ≤dS). We find the best fit value of these parameters by minimizing the sum of the absolute differences between the model and the data, as defined in Eq. (3). We normalized the experimental data by the initial tumor cell count so as to standardize cell counts between different replicates.
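As an illustration of the fitting objective, the sketch below computes a sum of absolute differences between model output and data, together with the two inequality constraints named above. It is a simplified stand-in: the exact form of Eq. (3) (for example, any weighting or averaging over replicates) is not reproduced here, and the function names are hypothetical.

```python
def cost(model_values, data_values):
    """Sum of absolute differences between model predictions and data
    evaluated at the same time points (assumed form of the objective)."""
    if len(model_values) != len(data_values):
        raise ValueError("model and data must share the same time points")
    return sum(abs(m - d) for m, d in zip(model_values, data_values))


def satisfies_constraints(p):
    """Inequality constraints from the text: rR <= rS and dR <= dS."""
    return p["rR"] <= p["rS"] and p["dR"] <= p["dS"]


# Toy example with made-up normalized cell counts.
example_cost = cost([1.0, 1.2, 1.5], [1.0, 1.1, 1.7])
print(example_cost)
```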

In order to identify the parameter set, per dose, that minimizes ζ, we utilize MATLAB® and the built-in function fmincon. This executes an interior point algorithm to solve the posed constrained optimization problem. Beyond the previously-mentioned inequality constraints posed on the growth and drug-induced death rate of resistant cells, we limited the maximum function evaluations to 30,000. Further, to avoid numerical integration issues, all parameters were assumed to have a lower bound of \({10}^{-6}\), with the exception of rS which had a lower bound of \({10}^{-3}\). To avoid evaluating the model at biologically unrealistic parameter values, an upper bound was specified for each parameter. To evaluate the value of the cost function ζ at a specified model parameterization, the system of equations in (1, 2) is numerically solved in MATLAB® using ode23s, a stiff solver that implements a modified Rosenbrock formula of order two.

To mitigate the risk of fmincon converging to a local minimum, we implement a multi-start fitting algorithm. That is, rather than calling fmincon a single time, with a single starting guess for the parameters, we repeat the optimization process K = 1000 times, each time starting with a different initial guess for the parameters. Each starting parameter guess is obtained by using a quasi-Monte Carlo method to randomly sample the parameter space. The lower and upper bound for each parameter was set to the constraints imposed on the optimization problem itself. We used Sobol’s low-discrepancy sequences, which have uniformity properties other sampling techniques lack while being computationally efficient72. The multi-start fitting algorithm thus begins by randomly sampling K Sobol points of the form (p1, …, pn), where n is the number of parameters. As each coordinate pi in a sampled point is in the range [0, 1], we then scale the values of each pi to be in the range defined by its lower and upper bound. For each such parameterization, fmincon is implemented using that parameter set as the starting guess.
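The sampling-and-scaling step of the multi-start procedure can be sketched as follows. Note the substitution: the text uses Sobol sequences, whereas this standard-library sketch uses a Halton sequence, a different low-discrepancy construction, purely to illustrate how unit-cube points are mapped to parameter bounds. The upper bounds are hypothetical, and the parameter ordering (rS, dS, α, rR, dR) is an assumption.

```python
def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


PRIMES = [2, 3, 5, 7, 11]  # one base per parameter


def halton_points(n_points, n_dims):
    """First n_points of a Halton sequence in the unit cube [0, 1)^n_dims."""
    return [[van_der_corput(i, PRIMES[d]) for d in range(n_dims)]
            for i in range(1, n_points + 1)]


def scale(point, lower, upper):
    """Map a unit-cube point onto the box defined by lower/upper bounds."""
    return [lo + p * (hi - lo) for p, lo, hi in zip(point, lower, upper)]


# Lower bounds quoted in the text (1e-3 for rS, 1e-6 otherwise).
lower = [1e-3, 1e-6, 1e-6, 1e-6, 1e-6]
# Hypothetical upper bounds; the values used in the paper are not given here.
upper = [1.0, 1.0, 1.0, 1.0, 1.0]

K = 1000  # number of starting guesses, as in the text
starts = [scale(pt, lower, upper) for pt in halton_points(K, 5)]
print(starts[0])
```

Each entry of `starts` would then serve as the initial guess for one call to the constrained optimizer.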

The end result of this multi-start algorithm is a set of K “best-fit” parameterizations of the model. The parameterization with the lowest cost function value ζopt is selected as the optimal model parameterization. Any other “best-fit” parameterization for which ζ varies from ζopt by 5% or less is considered a “near-optimal” parameter set. We will use these near-optimal parameter sets to study the sensitivity and identifiability of the model parameters. This multi-start parameter fitting algorithm was applied to the mean of the aggregate cell count data for doses of 0.032, 0.32 and 3.2 μM. All code for fitting the parameters to the proposed model, and identifying the optimal and near-optimal parameter sets, is available at https://github.com/sontaglab/Induced_Resistance.
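The selection rule for optimal and near-optimal parameter sets can be expressed in a few lines. This is a sketch of the rule as described in the text, not the code in the linked repository; the function name and the data layout (a list of cost/parameter pairs) are assumptions.

```python
def select_near_optimal(fits, tolerance=0.05):
    """Given (cost, params) pairs from the multi-start runs, return the best
    fit and every other fit whose cost is within `tolerance` (relative)
    of the best cost."""
    best_index = min(range(len(fits)), key=lambda i: fits[i][0])
    best_cost, best_params = fits[best_index]
    near = [fits[i] for i in range(len(fits))
            if i != best_index and fits[i][0] <= best_cost * (1.0 + tolerance)]
    return (best_cost, best_params), near


# Toy example with made-up costs and labels in place of parameter vectors.
fits = [(1.20, "run A"), (1.00, "run B"), (1.04, "run C"), (1.30, "run D")]
best, near_optimal = select_near_optimal(fits)
print(best, near_optimal)
```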

Data corresponding to the experimental doses of 0.1 and 1 μM were withheld for model validation purposes. For each parameter pi, we predict the value of pi at these held-out doses by performing a piecewise linear interpolation of the optimal value of pi at the fit doses of 0.032, 0.32 and 3.2 μM. This piecewise linear interpolation is done on a log base 10 scale, as the experimental doses are uniformly spaced on this scale. A piecewise linear interpolation is chosen because it avoids the complication of choosing a precise functional form for how each parameter changes as a function of dose, and does not assume that a single equation can describe the behavior of the parameter across all of the doses considered.
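The interpolation on a log10 dose scale can be sketched as below. The parameter values at the three fit doses are hypothetical placeholders, and the function name is an assumption; only the interpolation rule itself is taken from the text.

```python
import math


def interp_log10(dose, fit_doses, fit_values):
    """Piecewise linear interpolation of a parameter in log10(dose).

    fit_doses must be increasing; doses outside the fitted range are
    rejected rather than extrapolated."""
    xs = [math.log10(d) for d in fit_doses]
    x = math.log10(dose)
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("dose outside the fitted range")
    for x0, x1, y0, y1 in zip(xs, xs[1:], fit_values, fit_values[1:]):
        if x0 <= x <= x1:
            w = (x - x0) / (x1 - x0)  # position within the segment
            return (1.0 - w) * y0 + w * y1


FIT_DOSES = [0.032, 0.32, 3.2]       # training doses (micromolar)
ALPHA_FIT = [0.002, 0.010, 0.006]    # hypothetical fitted values of one parameter

alpha_at_0p1 = interp_log10(0.1, FIT_DOSES, ALPHA_FIT)
alpha_at_1 = interp_log10(1.0, FIT_DOSES, ALPHA_FIT)
print(alpha_at_0p1, alpha_at_1)
```

Because 0.1 μM lies almost exactly halfway between 0.032 and 0.32 μM on the log scale, the interpolated value is close to the average of the two neighboring fitted values.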

Structural identifiability analysis

Consider the set \({\mathcal{P}}\) of all possible parameters θ of the form

$$\theta =\left({r}_{S},{d}_{S},\alpha ,{r}_{R},{d}_{R}\right).$$

We examine the local structural identifiability of our model (1, 2) with fixed γ1 = γ2 and initial conditions of S(0) = 1, R(0) = 0 (normalized cell counts). Local structural identifiability is a “well-posedness” property: it asks if different (but close by) parameters θ1 and θ2 should give different total cell counts S(t) + R(t). In other words, from (perfect) time-resolved measurements of S(t) + R(t), one should in theory be able to recover all parameters in a neighborhood of a given one. Structural identifiability analysis deals with the ideal case in which there are no measurement errors or replication variability. Although this is not a realistic assumption in practice, the goal of this analysis is to determine if there are parameter combinations that are redundant, in which case numerical fitting methods would not provide interpretable parameters due to non-uniqueness.

In the field of structural identifiability, one often looks for generic results. (Genericity is used in order to ignore accidental equalities. For example, if we are given the expression px and we wish to determine x, this can always be done unless p happens to be zero, which is a very special parameter value.) Mathematically, we use the terminology in ref. 73: genericity means that there exists a “Zariski open”74 subset \({{\mathcal{P}}}_{0}\) of \({\mathcal{P}}\) so that the identifiability property holds for pairs of vectors θ1 and θ2 in \({{\mathcal{P}}}_{0}\). Here, a Zariski open set is the set of points at which some nonzero polynomial does not vanish, that is, the complement of the set of zeros of that polynomial. Such a set is “generic” in the sense that its complement has (Lebesgue) measure zero, so with “probability 1” a randomly chosen set of parameters will be identifiable. We assume that γ1 = γ2 = γ is fixed and known. For showing identifiability, we can then take without loss of generality γ = 1 (the general case can be reduced to this one by a time-reparameterization, rescaling all parameters by the known γ). To simplify, we introduce new parameters

$$a:= {r}_{S}-{d}_{S}-\alpha ,\,b:= {d}_{S}+\alpha ,\,c:= \alpha ,\,d:= {d}_{R},\,f:= {r}_{R}-{d}_{R}.$$

Identifiability with these new parameters is equivalent to the original ones, since the two sets are related by an invertible linear transformation. Our equations (1, 2) become, with these notations:

$$\begin{array}{ll}\dot{S}\,=\, \left(a+b{e}^{-t}\right)S\\ \dot{R}\,=\, \left(c-c{e}^{-t}\right)S+\left(f+d{e}^{-t}\right)R.\end{array}$$
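As a check, the reparameterized form follows by substituting γ1 = γ2 = 1 into Eqs. (1, 2) and collecting the constant and \({e}^{-t}\) terms (a worked version of the step stated above, using only the definitions of a, b, c, d, f):

$$\begin{array}{ll}\dot{S}\,=\,\left[({r}_{S}-{d}_{S}-\alpha )+({d}_{S}+\alpha ){e}^{-t}\right]S\,=\,\left(a+b{e}^{-t}\right)S\\ \dot{R}\,=\,\alpha \left(1-{e}^{-t}\right)S+\left[({r}_{R}-{d}_{R})+{d}_{R}{e}^{-t}\right]R\,=\,\left(c-c{e}^{-t}\right)S+\left(f+d{e}^{-t}\right)R.\end{array}$$

The transformation is inverted by \(\alpha =c\), \({d}_{S}=b-c\), \({r}_{S}=a+b\), \({d}_{R}=d\), and \({r}_{R}=f+d\), which makes the invertibility explicit.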

We view y(t) = S(t) + R(t) as the “measured output” of this system. To show that the parameters can be recovered from the output y(t), it is enough to show that they can be recovered from the derivatives of y(t) at any fixed time (for example, t = 0), because these derivatives in turn are theoretically obtainable from the function itself.

Let us therefore compute the first five derivatives at time t = 0: \({y}^{{\prime} }(0)\) to \({y}^{(5)}(0)\). These derivatives are polynomial functions of the parameters. These computations can be done in closed form, using the product rule. For example, \({S}^{{\prime} }(0)=a+b\) and \({R}^{{\prime} }(0)=0\) (recall that S(0) = 1 and R(0) = 0). Similarly, \({S}^{{\prime\prime} }(t)={[(a+b{e}^{-t})S(t)]}^{{\prime} }=-b{e}^{-t}S(t)+(a+b{e}^{-t}){S}^{{\prime} }(t)=-b{e}^{-t}S(t)+{(a+b{e}^{-t})}^{2}S(t)\), and hence \({S}^{{\prime\prime} }(0)=-b+{(a+b)}^{2}\). In the same way, differentiating the equation for R and using \(R(0)={R}^{{\prime} }(0)=0\) gives \({R}^{{\prime\prime} }(0)=c\). One has \({y}^{{\prime} }(0)=a+b\), \({y}^{{\prime\prime} }(0)=c-b+{(a+b)}^{2}\), and so forth. Let us call F the mapping from the 5-vector of parameters (a, b, c, d, f) into the 5-vector of output derivatives at time 0, \(({y}^{{\prime} }(0),{y}^{{\prime\prime} }(0),{y}^{(3)}(0),{y}^{(4)}(0),{y}^{(5)}(0))\). The Jacobian of F (that is, the matrix of partial derivatives of F with respect to the parameters) is a 5 × 5 matrix whose entries are polynomials in the parameters, and it is nonsingular except at those parameter vectors where its determinant D is zero. Since D is a polynomial function of the parameters, nonsingularity holds on the Zariski open subset consisting of parameters for which D(a, b, c, d, f) is nonzero. The Implicit Mapping theorem75 then guarantees that this mapping is locally invertible around such points, meaning that one has at least identifiability in a local sense (any two distinct but close enough parameters give distinct outputs). It remains to actually verify that D is a nonzero polynomial. This can be done explicitly. For example, when evaluated at a = 2 and b = c = d = f = 0, one computes D(2, 0, 0, 0, 0) = 576, which is nonzero. (Computations not shown.) This completes the proof of local generic identifiability.

It is also possible to go further, and conclude global generic identifiability as well. To do this, we may apply a computational package such as the Structural Identifiability Toolbox (SIAN)76. In order to apply this software, we need to transform our system into one given by a polynomial system of equations. Thus we study instead the following system:

$$\begin{array}{ll}\dot{X}= -X\\ \dot{S}\,=\, aS+bXS\\ \dot{R}\,=\, cS+fR-cXS+dXR\end{array}$$

with outputs S + R and X. The software package SIAN (available online at https://maple.cloud/app/6509768948056064) confirms global (generic) identifiability of this system, even when initial states are unknown and need to be identified simultaneously with the parameters. We also verified the local identifiability property using an alternative package, STRIKE-GOLDD77, confirming the result.
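The polynomial system reproduces the time-varying one when X(0) = 1, since then X(t) = e^(−t). The sketch below checks this numerically for one arbitrary choice of the transformed parameters; the parameter values are hypothetical and the fixed-step Runge-Kutta integrator is a convenience, unrelated to how SIAN or STRIKE-GOLDD operate.

```python
import math

# Hypothetical values of the transformed parameters (a, b, c, d, f).
a, b, c, d, f = -0.5, 0.8, 0.3, 0.2, -0.4


def time_varying(t, x):
    """Transformed model with explicit exp(-t) terms; state is (S, R)."""
    S, R = x
    e = math.exp(-t)
    return [(a + b * e) * S, (c - c * e) * S + (f + d * e) * R]


def polynomial(t, x):
    """Autonomous polynomial system; state is (X, S, R) with X(0) = 1."""
    X, S, R = x
    return [-X, a * S + b * X * S, c * S + f * R - c * X * S + d * X * R]


def rk4(rhs, x0, t_final, n_steps):
    """Classical fixed-step fourth-order Runge-Kutta integration from t = 0."""
    h = t_final / n_steps
    t, x = 0.0, list(x0)
    for _ in range(n_steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k1)])
        k3 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k2)])
        k4 = rhs(t + h, [xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h / 6 * (p + 2 * q + 2 * r + s)
             for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]
        t += h
    return x


T = 5.0
tv = rk4(time_varying, [1.0, 0.0], T, 5000)
poly = rk4(polynomial, [1.0, 1.0, 0.0], T, 5000)

y_time_varying = tv[0] + tv[1]   # output S + R of the time-varying model
y_polynomial = poly[1] + poly[2]  # output S + R of the polynomial model
print(y_time_varying, y_polynomial, poly[0], math.exp(-T))
```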

It is interesting to note that the delay terms are essential for structural identifiability. If γ = 0 (no delay), the equations would become:

$$\begin{array}{ll}\dot{S}\,=\, (a+b)S\\ \dot{R}\,=\, (f+d)R.\end{array}$$

In this case, it is impossible to distinguish pairs (a, b) with the same sum a + b, and pairs (d, f) with the same d + f.
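This can be seen directly from the closed-form solution of the system above, which is a pair of decoupled linear equations:

$$S(t)=S(0)\,{e}^{(a+b)t},\qquad R(t)=R(0)\,{e}^{(f+d)t},\qquad y(t)=S(0)\,{e}^{(a+b)t}+R(0)\,{e}^{(f+d)t}.$$

The output therefore depends on the parameters only through the sums a + b and f + d. With the normalized initial conditions S(0) = 1 and R(0) = 0 used throughout, R vanishes identically and the output reduces to \(y(t)={e}^{(a+b)t}\).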

Optimal control formulation

Although the proposed model was calibrated to data in which cells were exposed to a constant concentration of drug, we next sought to utilize the model to identify (possibly non-constant) dosing strategies that minimize the cancer cell population at the end of the experiment. That is, we seek to identify the optimal dosing strategies u(t) which minimize S(tf) + R(tf), where tf is a fixed final time, subject to a maximum total applied dosage. The precise formulation of this optimal control problem is provided in this section.

As described in “Computational methods”, we obtain values for the parameters θ(u) := (rS(u), dS(u), α(u), rR(u), dR(u)) at five doses. The values for the training doses u = 0.032 μM, 0.32 μM, 3.2 μM were arrived at by minimizing the cost function ζ in Eq. (3). The values for validation doses u = 0.1, 1 μM were determined using piecewise linear interpolation of the parameters determined for the training doses. The optimal control problem is formulated utilizing these expressions for θ(u).

As discussed in “Mathematical models”, for time-varying dosing strategies, we must consider the more general model (10–14) presented in “Details on derivation of mathematical model”. Denote by \(x:= (S,R,{v}_{d,S},{v}_{d,R},{v}_{\alpha })\in {{\mathbb{R}}}^{5}\) the state of model (10–14), where vd,S and vd,R are the state variables describing the phenotypic effect of dose on cell death for sensitive and resistant cells, respectively, and vα is the state variable describing the phenotypic effect of dose on the resistance induction rate. The system then takes the general form \(\dot{x}(t)=f(x(t),\theta (u(t)))\), with

$$f(x,\theta (u)):= \left(\begin{array}{c}{r}_{S}(u(t))S-{v}_{d,S}(t)S-{v}_{\alpha }(t)S\\ {r}_{R}(u(t))R+{v}_{\alpha }(t)S-{v}_{d,R}(t)R\\ {\gamma }_{1}({d}_{S}(u(t))-{v}_{d,S})\\ {\gamma }_{1}({d}_{R}(u(t))-{v}_{d,R})\\ {\gamma }_{2}(\alpha (u(t))-{v}_{\alpha })\end{array}\right).$$
(19)

These equations indicate that growth rates respond instantaneously to the presence of drug, whereas the death and induction rates have a delayed response controlled by γ1 and γ2. See “Details on derivation of mathematical model” for more details.
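To make the delayed response concrete, the vector field in Eq. (19) can be written out directly. The sketch below assumes a constant dose, so that θ is fixed, and uses hypothetical values for the rates and for γ1 and γ2. For a constant dose and zero initial effective dose, each lagged variable has the closed form v(t) = d(1 − e^{−γt}), which the helper `lagged_rate` (a name introduced here for illustration) evaluates.

```python
import math


def f(x, theta, gamma1, gamma2):
    """Right-hand side of Eq. (19) for x = (S, R, v_dS, v_dR, v_alpha)."""
    S, R, v_dS, v_dR, v_a = x
    return (
        theta["r_S"] * S - v_dS * S - v_a * S,   # sensitive cells
        theta["r_R"] * R + v_a * S - v_dR * R,   # resistant cells
        gamma1 * (theta["d_S"] - v_dS),          # lagged death rate, sensitive
        gamma1 * (theta["d_R"] - v_dR),          # lagged death rate, resistant
        gamma2 * (theta["alpha"] - v_a),         # lagged induction rate
    )


def lagged_rate(target, gamma, t):
    """Solution of dv/dt = gamma * (target - v), v(0) = 0, at constant dose."""
    return target * (1.0 - math.exp(-gamma * t))


# HYPOTHETICAL rates and relaxation constants (per hour), for illustration only.
theta = {"r_S": 0.03, "d_S": 0.04, "alpha": 0.02, "r_R": 0.02, "d_R": 0.005}
gamma1, gamma2 = 0.1, 0.05

x0 = (1.0, 0.0, 0.0, 0.0, 0.0)   # all cells sensitive, zero effective dose
dx0 = f(x0, theta, gamma1, gamma2)

# Fraction of the full death rate d_S felt by sensitive cells after 10 h and 50 h.
frac_10h = lagged_rate(theta["d_S"], gamma1, 10.0) / theta["d_S"]
frac_50h = lagged_rate(theta["d_S"], gamma1, 50.0) / theta["d_S"]
```

At the initial condition, only growth acts on S, since all three lagged variables start at zero. With the placeholder value γ1 = 0.1 per hour, the effective death rate reaches 1 − e^{−1} (about 63%) of its target after 10 h.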

The optimal control problem is then defined on a finite-time horizon via the initial-value problem \(\dot{x}=f(x,\theta (u)),\,x(0)={x}_{0}\) on [0, tf], with tf fixed during each optimization. As in most simulations in this work (see “Computational methods”), we fix the initial conditions as an entirely sensitive cell population and zero effective dose. Thus, x0 = (1, 0, 0, 0, 0).

We define our objective as the final cancer cell population size at tf,

$$J(u):= {x}_{1}({t}_{f})+{x}_{2}({t}_{f})=S({t}_{f})+R({t}_{f}),$$
(20)

and we determine the dosing strategy u* which minimizes J. Since our rate functions θ(u) are calibrated only for doses between 0.032 μM and 3.2 μM (see (S2)), we restrict our control set U to the convex, compact interval U := [0.032, 3.2]. The set of admissible controls then takes the form \({\mathcal{U}}:= \{u:[0,{t}_{f}]\to {\mathbb{R}}| u\,\text{is Lebesgue measurable}\,\}\). To guarantee the existence of a minimizing function u* of J, we need to define the set of admissible controls as measurable (see, for example, ref. 78). However, this is not restrictive for the problem considered in this work, because all optimal controls appear to be piecewise continuous (see, for example, Fig. 8).

To incorporate toxicity constraints, we bound the total applied dosage by a constant M:

$$\mathop{\int}\nolimits_{0}^{{t}_{f}}u(t)\,{\rm{d}}t\le M.$$
(21)

This constraint can be incorporated as an auxiliary state variable

$$z(t):= \mathop{\int}\nolimits_{0}^{t}u(s)\,{\rm{d}}s,$$
(22)

so that (21) becomes z(tf) ≤ M. The extended state of the original system defined by the vector field in Eq. (19) is thus defined as \(\tilde{x}:= (x,z)\in {{\mathbb{R}}}^{6}\), and the extended vector field is \(\tilde{f}(\tilde{x},u):= (f(x,\theta (u)),u)\). Since z(0) = 0, the initial conditions for \(\tilde{x}\) become \(\tilde{x}(0)=(1,0,0,0,0,0)\). The control problem then incorporates an additional constraint on the last component of \(\tilde{x}\) at the final time \({t}_{f}:{\tilde{x}}_{6}({t}_{f})=z({t}_{f})\le M\).

In summary, the optimal control problem we analyze is to minimize the final cancer cell population at final time tf,

$$\mathop{\min }\limits_{u\in {\mathcal{U}}}\,J(u),$$
(23)

subject to the following initial-value problem in \({{\mathbb{R}}}^{6}\)

$$\dot{\tilde{x}}=\tilde{f}({\tilde{x}},u),\quad{\tilde{x}}(0)={\tilde{x}}_{0},$$
(24)

together with the additional final time constraint

$${\tilde{x}}_{6}({t}_{f})\le M.$$
(25)

Lastly, we note that by construction, the rate functions θ(u) depend on the logarithm of the dose, i.e. \(\theta (u)=\theta ({\log }_{10}(u))\). Numerically, we thus reformulate the control with respect to the logarithm of the dose, so that we solve this system with respect to \(\tilde{u}(t):= {\log }_{10}u(t)\). Effectively, this redefines the control set as \(\tilde{{\mathcal{U}}}:= [{\log }_{10}(0.032),{\log }_{10}(3.2)]\), as well as the dynamics on the auxiliary variable z as \(\dot{z}(t)=1{0}^{\tilde{u}(t)}\). Thus, the vector field numerically integrated takes the form \(\tilde{f}(\tilde{x},\tilde{u})=(f(x,\theta (\tilde{u})),1{0}^{\tilde{u}})\).
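The change of variables can be illustrated with a short sketch of the auxiliary dynamics alone. It assumes a piecewise-constant log-dose on a uniform time grid, which is a discretization chosen here for illustration. The function name `total_dose` is hypothetical.

```python
import math

U_MIN, U_MAX = 0.032, 3.2                 # calibrated dose range (μM)
UT_MIN, UT_MAX = math.log10(U_MIN), math.log10(U_MAX)


def total_dose(u_tilde, dt):
    """z(t_f) for a piecewise-constant log-dose, using dz/dt = 10**u_tilde."""
    if any(not UT_MIN <= ut <= UT_MAX for ut in u_tilde):
        raise ValueError("log-dose outside the admissible interval")
    return sum(10.0 ** ut * dt for ut in u_tilde)


t_f, n = 100.0, 100
dt = t_f / n
z_const = total_dose([0.0] * n, dt)       # u = 1 μM throughout, i.e. u_tilde = 0
z_low = total_dose([UT_MIN] * n, dt)      # lowest admissible constant dose
z_high = total_dose([UT_MAX] * n, dt)     # highest admissible constant dose
```

Over a 100 h horizon, the lowest and highest admissible constant doses accumulate roughly 3.2 and 320 μM h, respectively, while the constant log-dose of zero (u ≡ 1 μM) accumulates 100 μM h.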

To solve the posed optimal control problem, we numerically solve (23) subject to (24) and (25) using the open-source framework CasADi v3.6.7, which was used via a Python 3.8.1 wrapper. CasADi is a numerical optimal control tool for nonlinear optimization and algorithmic differentiation79. Specifically, we converted the above optimal control problem to a nonlinear programming problem, and utilized IPOPT to solve the latter in the CasADi framework. As our model is only validated on the experimental time window t ∈ [0, 100] hours, we fix tf = 100 h. We also assume an intermediate total applied dosage of M = 100 μM hours, which corresponds to the constant dose of u ≡ 1 μM applied throughout [0, tf]. In order to solve our optimal control problem in CasADi, we had to construct a smooth approximation to the rate function θ(u). Details of the construction of this smooth approximation are provided in SI S.1.
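The conversion to a nonlinear program can be sketched in plain Python. The text does not specify which transcription was used, so the sketch below assumes direct single shooting with a piecewise-constant log-dose and a fixed-step Runge-Kutta integrator, and it stops short of calling a solver. The rate functions in `theta`, the constants `GAMMA1` and `GAMMA2`, and the function `transcribe` are hypothetical stand-ins, not the smooth approximation of SI S.1.

```python
import math

T_F, M = 100.0, 100.0                      # final time (h) and dose budget (μM h)
UT_MIN, UT_MAX = math.log10(0.032), math.log10(3.2)
GAMMA1, GAMMA2 = 0.1, 0.05                 # HYPOTHETICAL relaxation rates (1/h)


def theta(ut):
    """HYPOTHETICAL smooth rate functions of log-dose; not the paper's fit."""
    s = (ut - UT_MIN) / (UT_MAX - UT_MIN)  # 0 at lowest dose, 1 at highest
    return {"r_S": 0.03 - 0.01 * s, "d_S": 0.04 * s, "alpha": 0.02 * s,
            "r_R": 0.02, "d_R": 0.005 * s}


def f_tilde(xt, ut):
    """Extended vector field: Eq. (19) plus dz/dt = 10**ut."""
    S, R, v_dS, v_dR, v_a, _z = xt
    th = theta(ut)
    return (th["r_S"] * S - v_dS * S - v_a * S,
            th["r_R"] * R + v_a * S - v_dR * R,
            GAMMA1 * (th["d_S"] - v_dS),
            GAMMA1 * (th["d_R"] - v_dR),
            GAMMA2 * (th["alpha"] - v_a),
            10.0 ** ut)


def rk4_step(xt, ut, h):
    """One classical Runge-Kutta step with the log-dose held constant."""
    def shift(a, k, c):
        return tuple(ai + c * ki for ai, ki in zip(a, k))
    k1 = f_tilde(xt, ut)
    k2 = f_tilde(shift(xt, k1, h / 2), ut)
    k3 = f_tilde(shift(xt, k2, h / 2), ut)
    k4 = f_tilde(shift(xt, k3, h), ut)
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(xt, k1, k2, k3, k4))


def transcribe(u_tilde, substeps=10):
    """Map decision variables (one log-dose per interval) to (J, g).

    J is the objective S(t_f) + R(t_f); feasibility requires g = z(t_f) - M <= 0.
    """
    h = T_F / (len(u_tilde) * substeps)
    xt = (1.0, 0.0, 0.0, 0.0, 0.0, 0.0)    # extended initial condition
    for ut in u_tilde:
        for _ in range(substeps):
            xt = rk4_step(xt, ut, h)
    return xt[0] + xt[1], xt[5] - M


bounds = [(UT_MIN, UT_MAX)] * 20           # box constraints on each variable
J_const, g_const = transcribe([0.0] * 20)  # constant 1 μM dose
```

A nonlinear programming solver would then minimize the first output over the 20 bounded decision variables subject to the second output being non-positive. The constant 1 μM schedule sits exactly on the budget boundary, whereas the maximal constant dose violates it.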