arising from X. Zhang et al. Nature Communications https://doi.org/10.1038/s41467-022-34027-9 (2022)

In a recent paper, Zhang et al.1 elegantly incorporate within-host evolution into an epidemiological model. They show that this leads to substantial changes in the system dynamics and, in particular, that evolution can “rescue” the pathogen population if the mutation rate is high enough. However, their results rest on the assumption that substitutions (i.e. mutations that have become established in the population2) that affect transmission are neutral on average. Here we show that under less restrictive assumptions concerning the fitness distribution of substitutions, the effect can easily disappear, and higher mutation rates in fact reduce the likelihood of a pandemic.

Zhang et al.1 present an analytical epidemiological model, backed by numerical simulations, that takes into account the evolution of intra- and inter-host virus fitness. Their results show that in situations with R0 < 1, which would classically be assumed to lead to the eventual disappearance of the pathogen, evolution under high mutation rates can increase R0 quickly enough to prevent extinction. These results rest on the assumption that mutations affecting inter-host fitness (transmission) (1) are as likely to have positive as negative fitness effects and (2) do not affect intra-host fitness. If this is the case, their accumulated effect over the time span between first infection and transmission can be modelled as a random walk, leading at the point of transmission to a normal distribution of inter-host fitness with the mean at 1 and the standard deviation representing the substitution rate.

As a general rule, however, adaptive mutations are rare compared to those with no or negative effects on fitness3, a pattern that seems to hold in virus populations as well4,5. Under the above-mentioned assumption (2) we would expect the distribution of inter-host fitness effects of substitutions to reflect that of mutations. With that in mind, we replicated the study’s results using a more mechanistic model of the pathogen’s fitness that allows us to adjust the distribution of the effect of mutations on inter-host fitness.

We are able to replicate the results of Zhang et al. when we incorporate the same assumption of net neutral fitness effects (Fig. 1a). However, when we assume that the effect of mutations is more often negative than positive, pandemic behaviour arises only for high R0 and low substitution rate. In particular, the evolutionary “rescue effect” that occurs in the original scenario disappears nearly entirely, and higher substitution rates prevent rather than trigger pandemic behaviour (Fig. 1b). We note that the distribution of fitness effects we assume in this case is much closer to neutrality than the distributions found empirically4.

Fig. 1: No rescue effect under biased mutations.

Phase diagrams for (a) unbiased (μ = 1.0) and (b) slightly biased (μ = 0.975) mutation effects as a function of substitution rate and reproduction number. Shading represents mean incidence over 30 replicates after 3000 time steps.

Zhang et al. provide a simple and elegant illustration of a mechanism by which a sub-pandemic pathogen may ‘break through’ into a pandemic phase driven by mutation. Our replication demonstrates that a mechanistic approach can reproduce the results of the original model, but also that these results might only apply under specific assumptions. The scenarios presented in the supplementary material of the original paper suggest a similar conclusion. There, assuming a correlation between inter- and intra-host fitness leads to similarly large-scale changes in the system behaviour (as shown by the phase diagrams in Fig. S4). However, in contrast to our results, in that case higher mutation rates increase rather than reduce the probability of pandemic behaviour.

It therefore seems that, while allowing for pathogen evolution can indeed strongly affect the circumstances under which a pandemic can occur, how exactly this effect plays out is highly sensitive to the assumptions made about the fitness effects of mutations. Future work could build on these foundational results by incorporating the evolution of both intra- and inter-host fitness based on realistic assumptions relevant to specific pathogens. We propose that such models could provide more insightful long-term forecasting of the epidemiological dynamics of evolving pathogens.

Methods

We implemented an agent-based model in Julia that replicates the original agent-based version of the model (see section “Methods” in the original paper) where possible, in particular with respect to the number of agents, the network topology and the initial number of infected agents.

In their implementation Zhang et al. model the effects of neutral genetic drift on inter-host fitness as an unbiased random walk by adding a number drawn from a normal distribution to a single fitness value per agent in each time step. We replaced this part of the model with a simple mechanistic model of the occurrence of mutations and their effect on fitness.
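For orientation, the additive random walk described above can be sketched as follows. This is a minimal illustration in Python (not the Julia code used for the study, and not the original authors' code); the function name `additive_random_walk` and the parameter `step_sd` (standard deviation of the per-step increment) are hypothetical names introduced here.

```python
import random

def additive_random_walk(fitness0, step_sd, steps, rng):
    """Unbiased additive walk: add a zero-mean normal increment each time step.

    `step_sd` is an illustrative parameter; the source does not give its value.
    """
    fitness = fitness0
    path = [fitness]
    for _ in range(steps):
        fitness += rng.gauss(0.0, step_sd)  # positive and negative steps are equally likely
        path.append(fitness)
    return path

rng = random.Random(1)
path = additive_random_walk(1.0, 0.05, 100, rng)
```

Because the increments have zero mean, the expected fitness stays at its starting value for all time steps, which is the net-neutrality assumption discussed in the main text.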

It should be noted that strictly speaking “mutations” in both versions of the model are in fact substitutions2. Under assumption (2) (see above), however, substitutions that affect inter-host fitness are an unbiased subset of all mutations.

We assume that there are only point mutations, that they happen with a fixed probability per base pair and time step, and that the same base pair never mutates twice. We can then approximate the number of mutations occurring in a given time step by drawing from a Poisson distribution with parameter \(\lambda = p_{\mathrm{mut}}\,n_{\mathrm{bpair}}\). With the fitness effect of a single mutation \({\psi }_{i} \sim {{\mathcal{N}}}(\mu,\sigma )\), the overall fitness after a number of time steps is then simply the product of the initial fitness and the fitness effects of all mutations that have occurred since then:

$$\psi (t)={\psi }_{0}\prod {\psi }_{i}$$
(1)
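The mutation process leading to Eq. (1) can be sketched for a single lineage as below. This is a minimal Python illustration under stated assumptions, not the Julia implementation used for the results: the names `poisson` and `simulate_fitness` are hypothetical, the Poisson sampler uses Knuth's multiplication method (adequate for small λ), and normal draws are used as they come, since the text does not say how negative draws of ψi are treated.

```python
import math
import random

def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_fitness(psi0, lam, mu, sigma, steps, rng):
    """Return [psi(0), ..., psi(steps)] for one lineage following Eq. (1)."""
    psi = psi0
    trajectory = [psi]
    for _ in range(steps):
        # Number of new mutations in this time step
        for _ in range(poisson(lam, rng)):
            # Each mutation multiplies fitness by an effect drawn from N(mu, sigma);
            # assumption: no truncation of negative draws (not specified in the text)
            psi *= rng.gauss(mu, sigma)
        trajectory.append(psi)
    return trajectory

rng = random.Random(42)
trajectory = simulate_fitness(1.0, 0.5, 0.975, 0.05, 50, rng)
```

If mutation effects are independent of each other and of the mutation counts, the mean of this process has the closed form \(E\psi (t)={\psi }_{0}\,{e}^{\lambda t(\mu -1)}\), which can serve as a check on simulated populations: it stays at ψ0 for μ = 1 and decays for any μ < 1.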

By simulating a large “population” of virus particles over a sufficiently long time span in the way described above, we can numerically approximate the probability density function of inter-host fitness at each time step. We then use the expected value of inter-host fitness for a given host (determined by the host’s time since infection t and the fitness of the original, infecting pathogen) to calculate that host’s infection probability:

$${p}_{\inf }=1-{(1-{p}_{\inf,0})}^{E\psi (t)}$$
(2)
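As a small worked illustration of Eq. (2), the following Python sketch estimates Eψ(t) as the mean of a simulated sample and converts it into an infection probability. The function names and the example numbers are illustrative assumptions, not values from the study.

```python
def expected_fitness(sample):
    """Monte Carlo estimate of E[psi(t)] from simulated fitness values."""
    return sum(sample) / len(sample)

def infection_probability(p_inf0, expected_psi):
    """Eq. (2): p_inf = 1 - (1 - p_inf0) ** E[psi(t)]."""
    return 1.0 - (1.0 - p_inf0) ** expected_psi

# Illustrative values only: baseline infection probability 0.1,
# a made-up sample of fitness values with mean 2.0
sample = [1.5, 2.0, 2.5]
p_inf = infection_probability(0.1, expected_fitness(sample))
```

One way to read Eq. (2) is that Eψ(t) acts like a number of independent transmission attempts, each succeeding with the baseline probability: a fitness of 1 recovers the baseline, a fitness of 0 makes transmission impossible, and a fitness of 2 gives 1 − 0.9² = 0.19 for a baseline of 0.1.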

When determining the fitness value for the newly infected host \(\psi^{\prime}\) we assume that the distribution of ψ(t) (Eq. (1)) describes the infecting host’s virus population and that the probability for a given virus particle to cause an infection is proportional to its fitness:

$$P(\psi^{\prime}_{\!0} )\propto \psi (t)P(\psi (t))$$
(3)
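The fitness-proportional choice in Eq. (3) amounts to weighted sampling from the infecting host's simulated virus population. A minimal Python sketch, assuming the population is represented as a list of non-negative fitness values with at least one positive entry (the function name `sample_transmitted_fitness` is hypothetical):

```python
import random

def sample_transmitted_fitness(population, rng):
    """Pick one fitness value with probability proportional to the value itself (Eq. (3)).

    Assumes all values are non-negative and at least one is positive;
    random.choices raises ValueError if the weights sum to zero.
    """
    return rng.choices(population, weights=population, k=1)[0]

rng = random.Random(0)
# Toy population: the value 3.0 should be transmitted about three times as often as 1.0
draws = [sample_transmitted_fitness([1.0, 3.0], rng) for _ in range(4000)]
share_of_fitter = draws.count(3.0) / len(draws)
```

Under this weighting the mean transmitted fitness is E[ψ²]/E[ψ], which is never smaller than the mean E[ψ] of the population itself, so transmission acts as a selection step favouring fitter variants.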