Abstract
Deciding on when to initiate or relax an intervention in response to an emerging infectious disease is both difficult and important. Uncertainties from noise in epidemiological surveillance data must be hedged against the potentially unknown and variable costs of false alarms and delayed actions. Here, we clarify and quantify how case under-reporting and latencies in case ascertainment, which are predominant surveillance noise sources, can restrict the timeliness of decision-making. Decisions are modelled as binary choices between responding or not that are informed by reported case curves or transmissibility estimates from those curves. Optimal responses are triggered by thresholds on case numbers or estimated confidence levels, with thresholds set by the costs of the various choices. We show that, for growing epidemics, both noise sources induce additive delays on hitting any case-based thresholds and multiplicative reductions in our confidence in estimated reproduction numbers or growth rates. However, for declining epidemics, these noise sources have counteracting effects on case data and limited cumulative impact on transmissibility estimates. We find that this asymmetry persists even if more sophisticated feedback control algorithms that consider the longer-term effects of interventions are employed. Standard surveillance data, therefore, provide substantially weaker support for deciding when to initiate a control action or intervention than for determining when to relax it. This information bottleneck during epidemic growth may justify proactive intervention choices.
Introduction
Rapidly and reliably detecting the key dynamics of infectious diseases is a recurrent challenge in epidemiology with critical consequences and trade-offs1,2,3. Early warnings of upcoming epidemic growth or decline can provide valuable signals for mobilising or relaxing interventions4,5,6 and even enhance the effectiveness of decisions to act7. However, epidemic data are often scarce or uncertain in early periods8,9, and the economic and other costs of premature action (i.e., false alarms) may be detrimental. Choosing to wait for data to accumulate and improve the evidence base for decision-making can also be costly, as the resulting delays in intervening (i.e., missed actions) may mean lost opportunities to minimise epidemic burden or relax interventions when infections are, respectively, rising or waning10,11,12.
Although the decision to act or not at any time commonly involves many external factors (e.g., political and sociocultural precedents)13,14, at the core of these choices is the complex but fundamental trade-off between uncertainty in epidemic dynamics and uncertainty in the costs of deciding to intervene (and hence modify those dynamics). While integrated epidemiological and economic modelling frameworks15,16,17,18 have been proposed to balance uncertainties and optimise decision-making, these approaches require knowledge of, or at least assumptions about, these uncertainties, which can be difficult to quantify accurately. Even if this uncertainty is well-described, recommended actions may be sensitive to parameter, model structure, and implementation choices7,19. The complexity of integrated models may further constrain the ability to validate the robustness of outputs, precluding generalised insights9,20,21.
Given these challenges, a complementary set of studies has instead aimed to use simplified models, which are easier to parametrise and validate, to uncover the fundamental limits that uncertainties impose on public health decision-making22. While these studies have yielded valuable insights into the robustness and timeliness of intervention choices and into how we can optimally detect shifts in epidemic dynamics6,7,23,24,25, few works have directly coupled these to costs of missed actions and false alarms (though they do examine proxies, e.g., the times over which interventions are sustained) or characterised how noise in surveillance data impacts detection and hence decision timepoints, which then feed back onto those costs. Here, we aim to resolve these gaps by adapting optimal Bayesian detection theory26,27, with the goal of extracting general guidelines about how noise constrains cost-optimal action points.
We show that the costs of missed actions and false alarms, even when not known accurately, imply a threshold criterion that we must compare with our evidence to act. We then describe how the information supporting this decision to act accumulates above this threshold with time. The costs of erroneous actions are therefore directly balanced by the aggregate of information that supports action, i.e., larger costs require more evidence and longer wait-times to act are justifiable26. For growing and waning epidemics, this evidence, respectively, informs the initiating and relaxing of interventions. Our central contribution is to reformulate this decision-cost framework to understand how under-reporting of infections and delays in reporting modify the optimal thresholds for supporting action. Interestingly, we find these ubiquitous sources of noise intrinsically but asymmetrically limit our capacity to respond to epidemics.
When an epidemic is growing, delays and under-reporting introduce additive lags on achieving any threshold based on the incidence of infections or related curves such as those of cases and deaths, and multiplicative reductions in information available for estimating transmissibility, i.e., uncertainties around estimates of reproduction numbers and growth rates are amplified by both noise sources. However, when an epidemic is waning, the noise sources counteract, with reporting delays offsetting the impact of under-reporting when hitting thresholds, and both have minimal effect on transmissibility estimate uncertainties. This asymmetry results from the directionality of delays and under-reporting, is practically magnified because data are often scarcer during early epidemic growth stages23, and persists even if sophisticated decision algorithms28 that optimise intervention timings and their longer-term impact are applied. We argue that markedly more restrictive bottlenecks to optimal decision-making during emergent epidemics fundamentally support proactive outbreak intervention choices9.
Results
Detection thresholds and the costs of intervening
We start by adapting Bayesian decision theory to better understand what a decision involves and how the impact of costs (even if not accurately known) can be included29. Given some information at time \(t\), denoted \({X}_{t}\), on the dynamics of an infectious disease, we must decide if it is optimal to act now or wait for future information. We use \({H}_{1}\) to represent the hypothesis that we should act now and \({H}_{0}\) as the null hypothesis that we should not act. For a growing epidemic, this action may be to initiate non-pharmaceutical interventions (NPIs), while for a declining one, this may involve relaxing any existing NPIs. To determine what optimal means, we must attach costs to actions. We explore the mathematical details of this approach in the Methods, but here outline the essential components and results.
A missed action represents when we act too slowly and is a false negative (FN) with cost \({c}_{{FN}}\). A false alarm describes a premature action and is classed as a false positive (FP) with cost \({c}_{{FP}}\). A true positive (TP) and true negative (TN), respectively, indicate when action (or inaction) is appropriately taken, with associated costs of \({c}_{{TP}}\) and \({c}_{{TN}}\). Applying the principles of decision theory we can express the optimal binary decision at time \(t\), \({i}_{t}^{* }\), as in Eq. (1) below.
Here \({{{\mathbf{1}}}}({{{\boldsymbol{.}}}})\) is an indicator function. When its condition is satisfied, \({i}_{t}^{* }=1\) and we should act. The optimal time of action is \({t}^{{{{\boldsymbol{* }}}}}\), and \({{{\rm{\eta }}}}\) is a threshold that depends only on relative costs.
This expression reveals that our decision to initiate (or relax) an intervention in the face of a potentially growing (or waning) epidemic rests on comparing the posterior evidence for the need to act, \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\), against the proportion of the total excess cost of making incorrect choices that results from false alarms. The time at which we are statistically justified in acting is \({t}^{{{{\boldsymbol{* }}}}}\) i.e., the first time when \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\) exceeds \({{{\rm{\eta }}}}\). Larger \({c}_{{FP}}-{c}_{{TN}}\) is only counterbalanced by stronger evidence for the need to act and increases \({t}^{{{{\boldsymbol{* }}}}}\). Equation (1) is noteworthy because, whatever the costs of action or inaction, optimal decision-making ultimately involves testing a threshold \({{{\rm{\eta }}}}\) against our evidence, which (assuming no change in the epidemic state) accumulates with time. When false alarms (or missed actions) are prohibitively more expensive, \({{{\rm{\eta }}}}\) rises to 1 (or falls to 0) and the optimal time to act \({t}^{{{{\boldsymbol{* }}}}}\) becomes infinite (or 0).
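Eq. (1) itself is not rendered in this extraction; consistent with the definitions above, it takes the standard Bayesian decision-theoretic form (a reconstruction from the surrounding text):

```latex
i_{t}^{*} = {{\mathbf{1}}}\left( {{\bf{P}}}\left({H}_{1}|{X}_{t}\right) \ge \eta \right),
\qquad
\eta = \frac{{c}_{FP}-{c}_{TN}}{\left({c}_{FP}-{c}_{TN}\right)+\left({c}_{FN}-{c}_{TP}\right)},
\qquad
{t}^{*} = \min \left\{ t : {{\bf{P}}}\left({H}_{1}|{X}_{t}\right) \ge \eta \right\},
```

where \(\eta\) is the fraction of the total excess cost of incorrect choices attributable to false alarms, matching the description in the text.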
Since we cannot control or sometimes even know (as quantifying the costs of interventions is non-trivial) the value of \({{{\rm{\eta }}}}\), we focus on understanding the factors that regulate the posterior evidence. If \({H}_{1}\) is true, then we expect that any threshold will eventually be crossed. However, for fixed costs, the lag to crossing is regulated by at least two key factors. The first is the choice of information source \({X}_{t}\). In the next section, we consider two widely used sources – the reported count of symptomatic cases over time and the inferred transmissibility parameters underpinning those cases. Second, the evidence from these sources for initiating (or relaxing) an intervention will deteriorate with the level of noise in the epidemic surveillance data. We examine and quantify, for the first time to our knowledge, how common types of surveillance noise reduce the information in these sources for guiding intervention decisions.
We further interpret this Bayesian decision problem using information theory. Effectively, we want to communicate a 1 or 0 to a policymaker to indicate the evidence, respectively, for action or inaction. When \({H}_{1}\) is true, then \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)={p}_{t}\) accrues with time and is the probability that we should act given the epidemic state (which we estimate from \({X}_{t}\)). In Eq. (6) of the “Methods” we find Eq. (1) implies that \({i}_{t}^{* }={{{\boldsymbol{1}}}}\left(-\frac{d{{{{\mathcal{H}}}}}_{{p}_{t}}}{d{p}_{t}}\ge {\log }_{2}\frac{\eta }{1-\eta }\right)\) with \(-\frac{d{{{{\mathcal{H}}}}}_{{p}_{t}}}{d{p}_{t}}\) as the loss in entropy as \({p}_{t}\) increases. Our decision-making process compares this loss or equally the increase in certainty that \({H}_{1}\) is true, against a cost-based threshold. The optimal time \({t}^{{{{\boldsymbol{* }}}}}\) that a policymaker should wait before acting is controlled by how \({p}_{t}\) rises with time30. For any cost structure, this time depends on the surveillance noise, which reduces \({p}_{t}\) (when \({H}_{1}\) is true) and hence the optimality and speed of possible intervention decisions. We explore these effects next.
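Here \({{{{\mathcal{H}}}}}_{{p}_{t}}\) is the binary entropy of \({p}_{t}\), so the entropy loss equals the log-odds of acting. The equivalence between the entropy-loss test and the posterior threshold test of Eq. (1) can be checked numerically; this is a minimal sketch (not code from the paper):

```python
import math

def neg_entropy_gradient(p):
    """-dH/dp for binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p),
    which evaluates analytically to the log-odds log2(p / (1 - p))."""
    return math.log2(p / (1 - p))

eta = 0.7  # illustrative cost-based threshold
log_odds_threshold = math.log2(eta / (1 - eta))
for p in [0.1, 0.5, 0.69, 0.71, 0.9]:
    # the entropy-loss test and the posterior test p >= eta always agree,
    # because the log-odds is monotonically increasing in p
    assert (neg_entropy_gradient(p) >= log_odds_threshold) == (p >= eta)
print("entropy-loss test matches posterior threshold test")
```

The agreement holds for any \(\eta\in(0,1)\) since the log-odds transform is strictly increasing.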
Responding to epidemic growth and decline using reported incidence
The most timely and visible indicator of the likely state of an epidemic is the reported incidence of symptomatic cases, \({C}_{t}\) at time \(t\). This serves as a proxy for the count of new infections, \({I}_{t}\), which is rarely observable. Case incidence also measures potential epidemic burden because the incidence of hospitalisations or deaths may be described as delayed and scaled versions of \({C}_{t}\)31,32\(.\) A common approach to epidemic detection involves sequentially comparing the reported incidence against some baseline threshold6,25,33,34. We denote such a baseline by \(a\) and define our decision problem with the posterior evidence of \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\equiv {{{\bf{P}}}}\left({I}_{t}\ge {a|}{C}_{1}^{t}\right)\) for a growing epidemic and \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\equiv {{{\bf{P}}}}\left({I}_{t}\le {a|}{C}_{1}^{t}\right)\) for a waning or controlled epidemic. We use \({Y}_{i}^{j}\) for some variable \(Y\) to represent the time series or vector \(\{{Y}_{i},{Y}_{i+1},\ldots ,{Y}_{j-1},{Y}_{j}\}\). Our evidence is based on the likely number of infections given observed cases.
We start by deriving analytic insight from a related deterministic model of epidemic spread. As cases suffer from under-reporting and delays in reporting (relative to the underlying infections) we define a fixed reporting fraction \(\rho\) (the proportion of infections reported as cases) and delay \(\tau\) (the lag before an infection is reported as a case). We can then describe the true infection incidence \({I}_{t}\) and the reported case incidence \({C}_{t}\) as in Eq. (2) with exponential dynamics. Here \(r\) is the effective growth rate, which corresponds to an effective reproduction number \(R\) and \({I}_{{{{\rm{ic}}}}}\) is the initial condition on the count of infections at time \({t}_{{{{\rm{ic}}}}}\). If we are early in the epidemic, where susceptible depletion is negligible and with \({t}_{{{{\rm{ic}}}}}=0\), then \(R={R}_{0}\) and \(r={r}_{0}\) i.e., they become the basic reproduction number and intrinsic growth rate, respectively.
Here ρ encodes effects such as under-ascertainment and asymptomatic spread, while τ describes latencies due to surveillance limitations and lags in diagnosing symptomatic patients. The last relation in Eq. (2) specialises this growth process to the dynamics of a classical susceptible-infected-recovered (SIR) or susceptible-exposed-infected-recovered (SEIR) model when the fraction of susceptible individuals is constant, with \(\gamma\) as the duration of infectiousness. Equation (2) also approximates epidemic dynamics more generally as if the growth rate is time-varying (e.g., during declining epidemic stages), we can interpret \(r\) as the mean \(\frac{1}{t-{t}_{{{{\rm{ic}}}}}}{\int }_{{t}_{{{{\rm{ic}}}}}}^{t}r\left(s\right){{{\rm{d}}}}s\) and apply similar expressions to obtain averaged (approximate) reproduction numbers35,36.
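Eq. (2) is not reproduced in this extraction; from the stated definitions of \(\rho\), \(\tau\), \(r\) and \(\gamma\), it should read (a reconstruction, with the final relation stated for the SIR case in which \(\gamma\) is the mean infectious duration):

```latex
{I}_{t} = {I}_{{\rm{ic}}}\,{e}^{r\left(t-{t}_{{\rm{ic}}}\right)},
\qquad
{C}_{t} = \rho\,{I}_{t-\tau} = \rho\,{I}_{{\rm{ic}}}\,{e}^{r\left(t-{t}_{{\rm{ic}}}-\tau\right)},
\qquad
R = 1 + r\gamma .
```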
We assume that neither the reporting fraction nor the delay is known. This is common during the early stages of an emerging outbreak but may remain problematic into the later epidemic stages due to interventions or new pathogen variants changing the value of these and related epidemiological variables37. Consequently, we may often have to make decisions by simply choosing some alert threshold \(a\) and finding the first time that this value is exceeded by the epidemic curve. Such exceedance approaches have been practically used6 but, as far as we can tell, the heterogeneous impacts of major surveillance noise sources on these decision-making problems have not been quantified analytically. We consider both growing and waning epidemics with \({\Delta }_{{{{\rm{grow}}}}}={t}_{C\ge a}-{t}_{I\ge a}\) as the lag in exceeding our alert threshold for a growing epidemic and \({\Delta }_{{{{\rm{wane}}}}}={t}_{I\le a}-{t}_{C\le a}\) as the related time lag for a waning epidemic.
We calculate these crossing times by equating expressions from Eq. (2) with the alert value i.e., we solve for \(a={I}_{{{\rm{ic}}}}{e}^{r({t}_{I\ge a}-{t}_{{{\rm{ic}}}})}={\rho I}_{{{\rm{ic}}}}{e}^{r({t}_{C\ge a}-{t}_{{{\rm{ic}}}}-\tau )}\). This yields Eq. (3) with \(r > 0\) (\(r < 0\)) for growing (waning) epidemics and with \({\tau }_{\rho }\) defined as an effective lag resulting from \(\rho\).
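Solving the threshold-crossing relation above gives Eq. (3); the display is missing from this extraction, so the following is a reconstruction consistent with the in-text description (the waning lag is the offset of the two terms, with its sign convention following the text):

```latex
{\Delta }_{{\rm{grow}}} = \tau + {\tau }_{\rho },
\qquad
{\Delta }_{{\rm{wane}}} = \tau - {\tau }_{\rho },
\qquad
{\tau }_{\rho } = \frac{1}{\left|r\right|}\log \frac{1}{\rho }.
```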
While Eq. (3) is trivial to derive, it embodies some key insights. First, we can treat the influence of incomplete reporting as an effective shift of \({\tau }_{\rho }\) time units. Second, \({\tau }_{\rho }\) adds to the reporting delay \(\tau\) for growing epidemics but subtracts from it when aiming to detect waning epidemics. Accordingly, early detection of emerging epidemics is substantially more difficult than the early detection of waning epidemics. For example, if \({\tau }_{\rho }\approx \tau\), then \({\Delta }_{{{{\rm{grow}}}}}\approx 2\tau\) but \({\Delta }_{{{{\rm{wane}}}}}\approx 0\). This occurs when \(\rho \approx {e}^{-\left|r\right|\tau }\), which yields \(\rho =\frac{1}{2}\) if \(\tau =\frac{\log 2}{{|r|}}\) equals the doubling (halving) time.
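The worked example above can be verified directly; this is a minimal numerical sketch of the deterministic model (the growth rate value is an arbitrary illustration):

```python
import math

def crossing_lags(rho, tau, r):
    """Threshold-crossing lags of reported cases relative to true infections
    under the deterministic model C_t = rho * I_{t - tau}.
    tau_rho is the effective lag induced by under-reporting alone."""
    tau_rho = math.log(1.0 / rho) / abs(r)
    grow = tau + tau_rho        # the two noise sources add during growth
    wane = abs(tau - tau_rho)   # and offset each other during decline
    return grow, wane

r = 0.1                  # daily growth rate magnitude (assumed)
tau = math.log(2) / r    # reporting delay equal to one doubling time
rho = 0.5                # half of all infections reported as cases
grow, wane = crossing_lags(rho, tau, r)
print(round(grow / tau, 6), round(wane, 6))  # → 2.0 0.0
```

This reproduces the text's example: with \(\rho = \frac{1}{2}\) and \(\tau\) equal to the doubling time, the growth-phase lag doubles to \(2\tau\) while the waning-phase lag vanishes.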
The asymmetry in Eq. (3) is striking and implies that a substantial bottleneck exists when responding to growing epidemics versus waning ones. Importantly, we find that this insight holds for more realistic models. We simulate epidemic case incidence according to the noisy renewal model from Eq. (8) of the Methods, which includes stochasticity both in the incidence generation and the reporting and delay distributions. Using parameters for COVID-19 we showcase the expected asymmetry for hitting alert thresholds in Fig. 1. Our results do not depend on \(a\) being constant so we can also allow it to vary with time. In this setting, it can theoretically encode stochastic, time-varying and seasonal baselines for exceedance alerts.
We plot the lag in crossing alert threshold \(a\) for 1000 epidemics simulated with COVID-19 parameters under a renewal process (see Methods). a Upward crossing lag times \({\varDelta }_{{\rm{grow}}}={t}_{C\ge a}-{t}_{I\ge a}\) during the growth stage of the simulated epidemics. b Downward crossing lags \({\varDelta }_{{\rm{wane}}}={t}_{I\le a}-{t}_{C\le a}\) for waning epidemics. We plot lags for cases corrupted by only under-reporting (red), delays (green) and then for both noise sources combined (blue).
The histograms of Fig. 1 summarise the lags in times at which cases \({C}_{t}\) exceed \(a\) relative to infections \({I}_{t}\) i.e., \({\Delta }_{{{{\rm{grow}}}}}\) and \({\Delta }_{{{{\rm{wane}}}}}\). These inform our evidence criteria \({{{\bf{P}}}}\left({I}_{t}\ge {a|}{C}_{1}^{t}\right)\ge \eta\) for growing epidemics and \({{{\bf{P}}}}\left({I}_{t}\le {a|}{C}_{1}^{t}\right)\ge \eta\) for declining ones. We directly use case counts instead of hypothesis probabilities as cases provide the clearest extrinsic signals that may be used to decide whether to act or not3. Note that we can infer infections from cases and other proxies by simply upsizing the case counts by the noise probabilities (under a popular binomial model)32,38. Our central insight is that, irrespective of the values of \(\eta\) and \(a\), practical noise sources induce an important asymmetry that makes timely interventions in response to resurgences markedly more difficult e.g., in Fig. 1 at every \(a\) we find that \({{{\bf{E}}}}[{\Delta }_{{{\rm{grow}}}}] > 6{{{\bf{E}}}}[{\Delta }_{{{\rm{wane}}}}]\). Next, we test if these asymmetries hold when intrinsic transmissibility signals are used instead.
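The binomial upsizing mentioned above (inferring infections by rescaling case counts by the reporting probability) can be illustrated with a quick simulation; the parameter values are arbitrary, not those of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.4                           # reporting probability (illustrative)
I = rng.poisson(200, size=5000)     # true daily infections
C = rng.binomial(I, rho)            # binomial under-reporting of cases
I_hat = C / rho                     # upsized (mean-unbiased) infection estimate
print(I.mean(), I_hat.mean())       # the two sample means agree closely
```

Since \(\mathbf{E}[C_t | I_t] = \rho I_t\) under the binomial model, dividing by \(\rho\) recovers infections on average, though the variance of the estimate grows as \(\rho\) shrinks.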
Responding to epidemic growth and decline using transmissibility estimates
The effective reproduction number, \(R\), is a popular metric of infectious disease transmissibility that is frequently used to inform public health policymaking39. The value of \(R\) is compared to a threshold of 1 to indicate whether the epidemic will grow or wane. Because \(R\) is a latent parameter of the epidemic, it needs to be estimated from observed data. The incidence of new infections, \({I}_{t}\), contains the most information for inferring \(R\), but practical surveillance biases mean that we frequently can only observe new cases, \({C}_{t}\). This necessarily results in a loss in information about \(R\), which leads to an increase in our estimate uncertainty. We can quantify these losses by computing the Fisher information of \(R\) from the incidence of infections \({{{\bf{I}}}}\left({R|}{I}_{1}^{t}\right)\) and cases \({{{\bf{I}}}}\left({R|}{C}_{1}^{t}\right)\)32,40\(.\) The Fisher information is an important measure because it defines the smallest asymptotic uncertainty achievable by any unbiased estimator27.
We may use estimates of \(R\) to inform actions by setting our decision problem according to the posterior evidence \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\equiv {{{\bf{P}}}}\left(R\ge {b|}{C}_{1}^{t}\right)\) and \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\equiv {{{\bf{P}}}}\left(R\le {b|}{C}_{1}^{t}\right)\) for growing and waning epidemics respectively, with some threshold \(b\). As we aim to derive analytic insights about our ability to solve these decision problems generally, we consider asymptotic limits at which the Fisher information measures the uncertainty around our \(R\) estimates. At these limits the posterior distribution of \(R\) is Gaussian with mean at its maximum likelihood estimate (which converges to the true \(R\)) and variance \({\sigma }^{2}\) inversely proportional to the Fisher information. The \(b\) that we should act on, given \(\eta\), relates to the quantile function \(\sqrt{2}\,\sigma \,{{{\rm{erf}}}}^{-1}(2\eta -1)\) (directly for \({{{\bf{P}}}}\left(R\le {b|}{C}_{1}^{t}\right)\) or its complement for \({{{\bf{P}}}}\left(R\ge {b|}{C}_{1}^{t}\right)\)), with \({{{\rm{erf}}}}\) as the error function.
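In this Gaussian limit, the action threshold \(b\) follows from the posterior quantile function; a sketch using the standard-library normal inverse CDF (equivalent to the displaced \(\sqrt{2}\,\sigma\,{\mathrm{erf}}^{-1}(2\eta-1)\) form above; all numbers are illustrative):

```python
from statistics import NormalDist

def action_threshold(r_mle, sigma, eta, waning=True):
    """Smallest b with P(R <= b | data) >= eta (waning case), or largest b
    with P(R >= b | data) >= eta (growth case), for a Gaussian posterior
    N(r_mle, sigma^2). inv_cdf(eta) = r_mle + sqrt(2)*sigma*erfinv(2*eta-1)."""
    post = NormalDist(mu=r_mle, sigma=sigma)
    return post.inv_cdf(eta) if waning else post.inv_cdf(1.0 - eta)

# tighter posteriors (larger Fisher information) support action thresholds
# closer to the critical boundary R = 1
print(round(action_threshold(0.9, 0.20, 0.95), 3))  # ≈ 1.229: cannot yet conclude R < 1
print(round(action_threshold(0.9, 0.05, 0.95), 3))  # ≈ 0.982: can conclude R < 1
```

The example shows why the Fisher information matters for decisions: for the same point estimate, only the lower-variance posterior places the 95% quantile below 1, justifying relaxation.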
Consequently, for any decision cost-threshold \(\eta\) the Fisher information plays a central role. We model epidemics using renewal processes as in the Methods and Fig. 1 and assume that cases derive from infections with a reporting proportion of \({\rho }_{s}\) at time \(s\) and cumulative delay probability \({F}_{t-s}\) for reports delayed by at most \(t-s\) time units. We compute the Fisher information about \(R\) from case data \({{{\bf{I}}}}\left({R|}{C}_{1}^{t}\right)\) as on the left of Eq. (4) with \({\Lambda }_{s}\stackrel{\scriptscriptstyle{{\mathrm{def}}}}{=}{\sum }_{x=1}^{s-1}{w}_{s-x}{I}_{x}\) as the total infectiousness of the disease and \({w}_{s-x}\) as the probability of an infection being transmitted in \(s-x\) time units. Equation (4) follows from the Methods and the framework in refs. 32,41 and involves summing along the period over which \(R\) is (approximately) constant.
If we set \({\rho }_{s}=1\) for all \(s\) (perfect reporting) and \({F}_{0}=1\) (no delays) then we recover the Fisher information from the infection counts \({{{\bf{I}}}}\left({R|}{I}_{1}^{t}\right)\). This measures the intrinsic uncertainty from the stochasticity of transmission. For example, if infections are few, then \({\Lambda }_{s}\) and \({{{\bf{I}}}}\left({R|}{I}_{1}^{t}\right)\) are small, reflecting inherent limitations to inferring spread in these settings. We cannot compute \({{{\bf{I}}}}\left({R|}{C}_{1}^{t}\right)\) without knowledge of the noise sources corrupting surveillance data, but we can extract key insights by taking the ratio of the case to infection values, \(\frac{{{{\bf{I}}}}\left({R|}{C}_{1}^{t}\right)}{{{{\bf{I}}}}\left({R|}{I}_{1}^{t}\right)}\), as on the right of Eq. (4). This reveals that the losses in information about \(R\) arising from the noise in the cases depend multiplicatively on noise terms \({\rho }_{s}{F}_{t-s}\) weighted by \({\Lambda }_{s}{\left({\sum }_{u=1}^{t}{\Lambda }_{u}\right)}^{-1}\) across time.
As delays are always forward in time and induce more information loss towards the present \(t\) (i.e., \({F}_{0}=\min {F}_{t-s}\) as it is cumulative) we find an inherent asymmetry in our ability to make decisions for growing versus waning epidemics. Growing epidemics have an increasing \({\Lambda }_{s}\) so \({\Lambda }_{s}{\left({\sum }_{u=1}^{t}{\Lambda }_{u}\right)}^{-1}\) rises as \(s\to t\), contributing more to the sum in Eq. (4). Consequently, the noise from the delay has a magnified effect on the overall estimate uncertainty, limiting our evidence \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\). The converse occurs for declining epidemics. In Fig. 2 we plot the Fisher information ratios from the simulations underlying Fig. 1 (which use realistic noise parameters) and expose this performance gap, with growing epidemics having ratios that are approximately 3 times smaller. Note that as we shrink the noise towards perfect reporting (not shown), both our ratios tend to 1 (as expected) but the convergence during the growth phase is slower.
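The weighting argument above can be demonstrated numerically: with identical noise, a rising \(\Lambda_s\) places more weight on recent, delay-censored terms and so shrinks the information ratio. A minimal sketch (the delay distribution and all parameter values are assumptions, not the paper's):

```python
import numpy as np

def info_ratio(lam, rho, F):
    """Right side of Eq. (4): sum_s rho * F_{t-s} * Lam_s / sum_s Lam_s,
    for a constant reporting fraction rho and cumulative delay distribution F,
    where F[d] = P(report delayed by at most d time units)."""
    t = len(lam)
    s = np.arange(1, t + 1)
    return np.sum(rho * F[t - s] * lam) / np.sum(lam)

t, rho = 30, 0.5
mean_delay = 5.0                              # geometric reporting delay (assumed)
F = 1.0 - (1.0 - 1.0 / mean_delay) ** (np.arange(t) + 1.0)
lam_grow = np.exp(0.1 * np.arange(t))         # rising total infectiousness
lam_wane = lam_grow[::-1]                     # falling total infectiousness
print(info_ratio(lam_grow, rho, F) < info_ratio(lam_wane, rho, F))  # → True
```

The growing-phase ratio is strictly smaller because its largest weights coincide with the smallest cumulative delay probabilities near the present.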
a Presents ratios of the Fisher information for cases \({{{\bf{I}}}}\left({R|}{C}_{1}^{t}\right)\) to that from the true infections \({{{\bf{I}}}}\left({R|}{I}_{1}^{t}\right)\) for the growing phase (red, the true reproduction number \(R > 1\) of the epidemic is given in red in the inset) and waning phase (blue, true reproduction number \(R < 1\) also in blue in the inset) of an epidemic. Solid lines show mean values while the shaded ribbons cover the full range of the information ratios derived from all case curves in (b). b Shows epidemic trajectories simulated using COVID-19 parameters from renewal models (see Methods) but subject to under-reporting and delays. We plot 1000 case \({C}_{t}\) curves, in various colours against the true infections \({I}_{t}\) (black). The inset histograms plot differences in the times at which the posterior evidence for \(R\ge b\) (\({\varDelta }_{{\rm{grow}}}\), red) and \(R\le b\) (\({\varDelta }_{{\rm{wane}}}\), blue) exceeds a threshold \(\eta\) (see main text), when inferred from \({C}_{1}^{t}\) versus \({I}_{1}^{t}\).
We also consider one example of the practical consequences of this asymmetry in the Fisher information ratios in the inset of Fig. 2. There we apply an estimation method (EpiFilter42) to generate posterior distributions \({{{\bf{P}}}}\left(R\ge {b|}{I}_{1}^{t}\right)\) and \({{{\bf{P}}}}\left(R\ge {b|}{C}_{1}^{t}\right)\) for the growing portion of the epidemic and \({{{\bf{P}}}}\left(R\le b|{I}_{1}^{t}\right)\) and \({{{\bf{P}}}}\left(R\le b|{C}_{1}^{t}\right)\) for the waning period and record differences in times for when these measures cross a threshold \(\eta\). We again find appreciable asymmetry but note that the discrepancies are not as large as in Fig. 1. This is unsurprising because the two approaches are not directly comparable and the posterior estimates of \(R\) also depend on factors such as the assumed prior distributions, generation time distribution choices and the smoothing method used. This is why we examined the Fisher information. Nevertheless, the asymmetry is apparent.
The main alternative to \(R\) is the epidemic growth rate \(r\), which has similar properties39,43 and indicates resurgence or control based on whether it is positive or negative. The growth rate is sometimes preferred to \(R\) because \(r\) offers a measure of the speed or timing needed from an intervention44 and is more robust to generation time distribution misspecification. These benefits come at the expense of some loss in mechanistic interpretability and in making smoothing assumptions that may induce other biases43. We can broadly relate \(R\) and \(r\) for a given generation time distribution via the Euler-Lotka equation36. We make the common assumption of a gamma generation time distribution with shape \(\alpha\) and scale \({\beta }^{-1}\) parameters and apply Eq. (10) of the Methods to derive the Fisher information on the left of Eq. (5).
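Eq. (5) is not reproduced in this extraction, but its structure follows from two standard relations, stated here for completeness: the Euler-Lotka equation for a gamma generation time with shape \(\alpha\) and scale \({\beta}^{-1}\), and the reparameterisation rule for Fisher information,

```latex
R = {\left(1 + \frac{r}{\beta }\right)}^{\alpha },
\qquad
{{\bf{I}}}\left(r|{C}_{1}^{t}\right) = {\left(\frac{\partial R}{\partial r}\right)}^{2}\,{{\bf{I}}}\left(R|{C}_{1}^{t}\right),
```

so the noise terms \({\rho}_{s}{F}_{t-s}\) enter multiplicatively exactly as in Eq. (4) and cancel from the case-to-infection information ratio.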
Interestingly, we observe that the noise terms appear in the same way as Eq. (4) and that the remaining terms are independent of the noise. As a result, when we take the ratio of case to infection data Fisher information values, as in the right of Eq. (5), we obtain precisely the same result as in Eq. (4), despite the Fisher information itself being different for \(r\). The noise-induced asymmetry observed for \(R\) therefore also affects \(r\) and likely cannot be overcome by changing our measure of epidemic transmissibility. We confirm this further by noting that recent metrics from ref. 45 that reformulate \({\Lambda }_{s}\) to reduce its dependence on generation times have similar Fisher information formulae that will also show this skew. Consequently, optimal decision thresholds for growing epidemics are substantially more difficult to resolve than equivalent thresholds for waning epidemics and are unlikely to improve by changing how policymakers track spread.
Asymmetry persists under complex (feedback) decision frameworks
In previous sections we demonstrated that surveillance noise induces asymmetric limits to optimal decision-making when comparing growing and waning epidemics. Since we aimed to extract generalisable insights, the models we investigated, while commonly used to study real epidemics39, were somewhat simplified. Here we confirm that the asymmetry we uncovered remains an intrinsic barrier to performance even when cost-optimal decision frameworks that leverage feedback control theory46,47 and Bayesian optimisation are employed. Using the same parameters from Fig. 2 (see Methods), we simulated COVID-19 epidemics and applied the model predictive control algorithm from ref. 28 to optimise the timing of decisions. This algorithm aligns with the framework of Eq. (1) but explicitly evaluates and propagates the costs of both action and inaction at every decision time (weekly).
We detail these costs in Eq. (11) of the Methods, including terms that account for the disruption (both economic and other types) of initiating an NPI and penalties for the trajectory of the epidemic (e.g., peak and endemic infection loads induce costs linked to healthcare burdens) expected to occur due to our NPI decisions. We model a lockdown as a multiplicative reduction in \(R\), using parameters consistent with ref. 48. We find the optimal time for initiating (relaxing) a lockdown for a growing (waning) epidemic, by minimising Eq. (11) across a projected horizon that considers longer-term feedback or rebound effects resulting from action or inaction. This procedure uses case thresholds, considers uncertainties in \(R\) (which are inferred as part of the algorithm) and applies explicit costs to actions, uniting all the previous Results sections.
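To make the decision loop concrete, the following is a deliberately minimal sketch of a single act-versus-wait comparison: it is not the MPC algorithm of ref. 28, and the projection model, cost terms and parameter values are all illustrative assumptions rather than those of Eq. (11):

```python
import numpy as np

def project(I0, R, horizon):
    """Crude deterministic projection of infections (geometric proxy for
    a renewal process with fixed reproduction number R)."""
    return I0 * R ** np.arange(1, horizon + 1)

def act_or_wait(I0, R_hat, horizon=8, lockdown_effect=0.5,
                c_npi=1.0, c_burden=0.01):
    """Choose act (1) or wait (0) by minimising projected cost over the
    horizon. Costs combine NPI disruption (c_npi per unit time under
    lockdown) and infection burden (c_burden per projected infection);
    both terms are illustrative assumptions."""
    costs = {}
    for act in (0, 1):
        R_eff = R_hat * (lockdown_effect if act else 1.0)
        traj = project(I0, R_eff, horizon)
        costs[act] = act * c_npi * horizon + c_burden * traj.sum()
    return min(costs, key=costs.get)

print(act_or_wait(100, 1.5))  # growing epidemic: acting is cheaper → 1
print(act_or_wait(100, 0.8))  # waning epidemic: waiting is cheaper → 0
```

A full MPC implementation would re-solve this minimisation at every weekly decision time over a receding horizon, propagate the inferred uncertainty in \(R\), and account for rebound effects after relaxation.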
We present results of this algorithm in Fig. 3 for simulated COVID-19 epidemics. We compute optimal decision times for the perfect (noiseless) case and for scenarios featuring realistic surveillance noise (under-reporting and delays in line with previous sections). Analysing the start, end and length of lockdowns (histograms in Fig. 3), we observe that noisy surveillance causes lags of 2–4 weeks when initiating an NPI. However, this same noise induces delays of under 1 week when relaxing that NPI. The growth-waning asymmetry is well-defined and persistent even under sophisticated decision algorithms. In Fig. 4 we show equivalent analyses for epidemics simulated under Ebola virus parameters (see Methods for details), verifying the consistency of our claims. Noise causes delays of 4–5 weeks when initiating lockdowns but only 1–2 weeks when relaxing them. Absolute times are longer for Ebola virus due to its slower timescale of spread (its mean generation time49 is more than double that of COVID-19). As a result, the bottleneck we have discovered is resistant to sophisticated predictive algorithms and suggests that proactive policymaking action is necessary during epidemic growth.
a, b Present simulated epidemics under COVID-19 parameters (see Methods) and renewal models, with interventions implemented by the model predictive control (MPC) algorithm from ref. 28. This projects the costs of action or inaction forward in time together with the outcomes of those choices (e.g., larger epidemic peaks or endemic infection loads) and optimises timing based on minimising costs over the projections (see Methods). We plot multiple stochastic epidemic realisations in light shades with one trajectory highlighted. There is no noise, so we observe the true infections. a Focuses on growing epidemics and the optimal initiation of the first lockdown. In some cases, multiple lockdown actions are visible because the MPC algorithm continually optimises costs, but we focus only on the statistics of the first decision (in line with Figs. 1–2). b Examines scenarios where no interventions were made for the first nine weeks, after which the MPC algorithm initiated a lockdown, and focuses on the waning components of this epidemic and optimally releasing that lockdown. c, d Repeat the simulations of a, b, but now we cannot access the true infections (shown in black) and must decide optimal actions based on case curves subject to surveillance noise (delays and under-reporting). Again, multiple stochastic epidemic realisations are in light shades with one trajectory highlighted. e, f Plot histograms of the times for lockdown start, end and duration for the perfect (shaded) and noisy (unshaded) scenarios. e Focuses on differences in the control problems of a and c, while f considers differences from b and d. Note that because for waning epidemics the beginning of the lockdown is the same both with and without noise, outcomes only deviate afterwards. The difference in the corresponding histograms shows the impact of the surveillance noise. Our results indicate that optimally relaxing a lockdown or another related NPI is minimally affected by imperfect surveillance.
The related histograms also show smaller discordance (the shaded and unshaded ones are closer together).
a–f Repeat the simulations and analyses of Fig. 3 but under Ebola virus transmission and surveillance noise parameters (see “Methods” for details). a and c Provide the optimal control performance for initiating a lockdown as epidemics grow under no noise and practical surveillance, respectively. e Shows the histograms of lockdown start, end and duration times for the perfect (shaded) and noisy (unshaded) scenarios. b and d Present the optimal MPC performance for lifting a lockdown as epidemics wane under no noise and practical surveillance, respectively. f Plots corresponding histograms of timing and duration performance. We find that the asymmetry in Fig. 3 therefore persists.
Discussion
Policymakers often have to make critical public health decisions from uncertain and unreliable data. While research has shown how this uncertainty can markedly impact (often in unintuitive ways) the timing, selection and success of interventions7,12,16,18,28, much is still unknown about how real-time decision-making can best incorporate or mitigate this uncertainty. Recent studies have proposed integrating uncertainty within rigorous decision theory frameworks50,51,52. Here, we support this proposal, but instead of applying complex modelling to guide specific action or inaction, we concentrated on extracting generalisable insights about how uncertainty fundamentally limits decision-making. We tackled this problem by exploring how predominant forms of surveillance uncertainty or noise, relating to under-reporting and delays in reporting cases, present bottlenecks for real-time decisions based on two of the most common outbreak indicators – the incidence of cases and the reproduction number (or related growth rate).
We discovered that there is a surprising, intrinsic and important asymmetry induced by these sources of noise that results in substantially reduced performance of the solutions to optimal decision problems for growing epidemics, relative to the performance on equivalent problems for waning epidemics. This asymmetry remains irrespective of whether we base decisions on extrinsic proxies of infection incidence (Fig. 1) or intrinsic estimates of transmissibility (Fig. 2). We found theoretical justification for this asymmetry using information and Bayesian decision theory and then confirmed this asymmetry remained when complex algorithms were used to explicitly optimise decisions according to the costs of action as well as the opportunity costs of inaction (Figs. 3 and 4). While surveillance noise is expected to restrict actionable information, it is not obvious that the bottleneck it imposes should be notably more restrictive during growth.
There are several important ramifications of this asymmetry. First, as this innate performance restriction on responding to epidemic growth is likely further compounded by scarcer data and additional unknowns (e.g., transmissibility can be poorly specified during epidemic emergence12), proactive interventions may be necessary to achieve timely control7,23. In contrast, a more reactive approach to relaxing interventions is likely sufficient. Proactive policymaking may require reducing the threshold for action (potentially elevating the likelihood of false alarms) in response to routine data or leveraging other data streams to extract early-warning signals of resurgence (potentially increasing disease monitoring costs or infrastructure requirements) that can guide timelier decisions. Sentinel surveillance of species that may be sources of zoonotic spillovers or digital monitoring of online search and sentiments may offer such signals53,54.
Second, if surveillance noise can be reduced by reallocating resources (e.g., rapid diagnostic tests, data sharing agreements) or relying on alternate data (e.g., deaths and hospitalisations may be more reliable than cases early on, though potentially more delayed) then deployment should prioritise growing stages of outbreaks. More localised surveillance (e.g., gathering data at community versus regional levels or by age group23,55) may also help mitigate inherent asymmetry in detecting resurgence because emerging infections appear first at small scales (though data are scarcer at these scales) before the epidemic becomes widely established. Third, when NPI effectiveness is retrospectively assessed it may be valuable to contextualise performance against the limits imposed by the bottlenecks we have explored to gain a more objective evaluation. This provides a clearer reference for effectiveness, ensuring we are not comparing against unrealistic ideals or infeasible counterfactual scenarios.
Although the asymmetry we uncovered is fundamental to noisy epidemic surveillance, there are limitations to our analyses. We only considered homogeneous (well-mixed) transmission models. The impact of realistic heterogeneities in spread due to geography, demography and other characteristics may influence what an optimal decision should be and hence the level of asymmetry. This is an open question that we aim to explore in the future. Additionally, we have not examined how emerging epidemic data could help us overcome these bottlenecks. Recent initiatives56 have proposed integrating wastewater surveys and genomic sequencing within standard surveillance schemes. While promising, the potential of these data for enhancing early warnings of outbreaks is still being assessed. Should these data feature some latency and under-ascertainment32, we expect decision-making asymmetries to re-emerge.
Lastly, we constrained our analyses to binary decisions between action and inaction based on data available in real time. This was both to embody the common situation where policymakers must make choices from the latest (updating) data and to allow for analytic and generalisable insights to be extracted. However, this framework neglects that optimal decisions could involve more complex trade-offs among multiple, simultaneous NPI options with differing benefits and costs informed by both real-time and historical data from other outbreaks or past experience with those NPIs. Although some complex decision problems can be reduced to generalisations of the binary case we study, yielding multiple decision thresholds26,57, it remains unknown if and when these complexities could modify the performance asymmetry that we uncovered, particularly when combined with the heterogeneities and auxiliary data above.
These limitations notwithstanding, the consistent asymmetry we found, using both elementary mathematical arguments and sophisticated (predictive) algorithms, underscores a meaningful directionality in how surveillance imperfections shape and bound decision-making in the face of uncertainty. Knowing these asymmetric performance bounds is not only helpful for deciding when interventions need to be proactive or reactive, but also for diagnosing how our responses to outbreaks may fail and for justifying why faster and more stringent measures are necessary. With mounting calls for better integration of formal decision frameworks16,18,47,50 together with enhanced and multimodal surveillance data37,56, epidemic models will only become more complex and difficult to interrogate. Having generalisable insights into performance limits can help anticipate failures in surveillance-driven decisions and improve their robustness.
Methods
Bayesian decisions from epidemiological data
Bayesian decision theory offers a formalised and rigorous way of informing decisions with data and under uncertainty26,27. Here, we rework some classical results to gain insight into how we can optimise intervention decisions. We examine a binary decision or hypothesis testing problem where, given information \({X}_{t}\) on the dynamics of an infectious disease up to time \(t\) (e.g., cases or estimates of transmissibility), we want to decide to act now (at time \(t\)) or to wait for more information to accrue. We let \({H}_{0}\) be our null hypothesis, which defines the situation that we should do nothing. \({H}_{1}\) is the alternative hypothesis that we should act now.
These hypotheses are reassessed sequentially in time and delineate optimal decision points for action. While we consider binary decision-making only, the expressions below are known to generalise to complex decision problems involving multiple hypotheses, and solutions to those problems are qualitatively similar57. We define \({c}_{{ij}}\) as the cost of acting according to \({H}_{i}\) when \({H}_{j}\) is true. A false positive (FP) action (or false alarm) occurs when we act before we should and has cost \({c}_{{FP}}={c}_{10}\). A false negative (FN) or missed action has cost \({c}_{{FN}}={c}_{01}\). True positive (TP) and true negative (TN), i.e., correct, actions have respective costs \({c}_{{TP}}={c}_{11}\) and \({c}_{{TN}}={c}_{00}\). The expected cost of a decision \(i\), \({{{\bf{E}}}}\left[{c}_{i}\right]\), and the associated optimal action \({i}_{t}^{* }=\mathop{\arg \min }_{i\in \{{{\mathrm{0,1}}}\}}{{{\bf{E}}}}\left[{c}_{i}\right]\), follow as in Eq. (6). There \({{{\bf{P}}}}\left({H}_{j}|{X}_{t}\right)\) is the posterior distribution of the evidence for hypothesis \(j\in \{{{\mathrm{0,1}}}\}\) given available information \({X}_{t}\)57\(.\)
Equation (6) states that the optimal action depends on a threshold or decision boundary \(\epsilon\) determined solely by the costs. This standard result from decision theory is already interesting because, even when we do not know these costs accurately, we can still assess their likely effects via different choices of \(\epsilon\). Moreover, if we include prior distributions on each hypothesis, Eq. (1) defines a likelihood ratio test. Equation (6) also generalises to multiple hypotheses29, where we would find different thresholds for every hypothesis. We can rearrange our decision rule from Eq. (6) by recognising that \({{{\bf{P}}}}\left({H}_{0}|{X}_{t}\right)=1-{{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)\) to obtain Eq. (1) of the main text, with \(\eta =\frac{\epsilon }{1+\epsilon }\).
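For readers wishing to experiment, this decision rule can be sketched numerically. The sketch below assumes the standard Bayesian odds threshold \(\epsilon =({c}_{{FP}}-{c}_{{TN}})/({c}_{{FN}}-{c}_{{TP}})\); the function names and example costs are illustrative and not taken from our released code.

```python
def decision_threshold(c_fp, c_fn, c_tp=0.0, c_tn=0.0):
    # Standard Bayesian odds threshold: act if P(H1|X_t)/P(H0|X_t) > eps.
    # This eps form is the textbook result; we assume c_fn > c_tp.
    return (c_fp - c_tn) / (c_fn - c_tp)

def optimal_action(p1, c_fp, c_fn, c_tp=0.0, c_tn=0.0):
    # p1 = P(H1|X_t). Returns 1 (act now) or 0 (wait), using the
    # rearranged rule of Eq. (1) with eta = eps / (1 + eps).
    eps = decision_threshold(c_fp, c_fn, c_tp, c_tn)
    eta = eps / (1.0 + eps)
    return int(p1 > eta)
```

With symmetric error costs (\({c}_{{FP}}={c}_{{FN}}\)) the rule reduces to acting once the posterior evidence exceeds one half; costlier false alarms raise the evidence bar before action is triggered.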
We can also apply information theory to derive insights into this decision problem. We wish to communicate one bit of information to a decision maker, i.e., a 1 or 0 to indicate that the evidence supports action or inaction, respectively. If \({H}_{1}\) is true, then the evidence for action \({{{\bf{P}}}}\left({H}_{1}|{X}_{t}\right)={p}_{t}\) will accumulate. The Shannon entropy of a Bernoulli process with success probability \({p}_{t}\) is \({{{{\mathcal{H}}}}}_{{p}_{t}}={p}_{t}{\log }_{2}\frac{1}{{p}_{t}}+\left(1-{p}_{t}\right){\log }_{2}\frac{1}{1-{p}_{t}}\) and defines the uncertainty of a distribution over two choices with probabilities \({p}_{t}\) and \(1-{p}_{t}\)40\(.\) Logarithms are to base 2, so \(0\le {{{{\mathcal{H}}}}}_{{p}_{t}}\le 1\) is in bits. In Eq. (7) we take the negative derivative of this function with respect to \({p}_{t}\), showing how this uncertainty reduces as we accrue evidence to act.
Consequently, optimal decision-making compares the loss of randomness or rise in certainty that \({H}_{1}\) is true with the logarithm of the cost-based threshold from Eq. (6). This is characterised by \(-\frac{d{{{{\mathcal{H}}}}}_{{p}_{t}}}{d{p}_{t}}\), which is a logit function. The speed at which we cross our decision threshold, i.e., the cost-adjusted amount of time \({t}^{{{{\boldsymbol{* }}}}}\) that we should wait before acting, is controlled by how quickly \({p}_{t}\) grows with time to surpass our evidence-based threshold30.
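The entropy and its negative derivative (a base-2 logit) are simple to compute. The short sketch below is for illustration only; differentiating \({{{{\mathcal{H}}}}}_{{p}_{t}}\) gives \(-\frac{d{{{{\mathcal{H}}}}}_{{p}_{t}}}{d{p}_{t}}={\log }_{2}\frac{{p}_{t}}{1-{p}_{t}}\), which is the quantity compared against the logarithm of the cost-based threshold.

```python
import math

def bernoulli_entropy(p):
    # Shannon entropy H(p) in bits of a Bernoulli(p) choice; maximal
    # (1 bit) at p = 1/2 and zero at p in {0, 1}.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def certainty_gain(p):
    # -dH/dp = log2(p / (1 - p)): the base-2 logit, which rises as the
    # evidence p_t for acting accumulates beyond 1/2.
    return math.log2(p / (1 - p))
```

The logit is zero at \({p}_{t}=\frac{1}{2}\) (maximal uncertainty) and diverges as \({p}_{t}\to 1\), mirroring how certainty about \({H}_{1}\) sharpens as evidence accrues.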
Renewal models with practical surveillance
The renewal branching process58 is widely used to model infectious disease epidemics of COVID-19, Ebola virus, pandemic influenza and many others59. It simulates the incidence of new infections \({I}_{t}\) at time \(t\) in terms of the effective reproduction number \(R\) and past incidence \({I}_{1}^{t-1}\equiv \{{I}_{1},{I}_{2},\ldots ,{I}_{t-1}\}\), as in the left of Eq. (8), with \({{{\bf{Pois}}}}\) as Poisson noise. Here, \({\Lambda }_{t}\) is the total infectiousness and encodes the impact of past incidence, i.e., \({\Lambda }_{t}\stackrel{\scriptscriptstyle{{\mathrm{def}}}}{=}{\sum }_{x=1}^{t-1}{w}_{t-x}{I}_{x}\). The \({w}_{t-x}\) terms give the probability of an infection being transmitted in \(t-x\) time units and define the generation time distribution of the disease, with \({\sum }_{u=1}^{\infty }{w}_{u}=1\)36\(.\)
Normally, infections are not observable, so the standard renewal model is modified to describe the incidence of symptomatic cases, \({C}_{t}\) at time \(t\), instead. This requires including noise terms such as the reporting fraction \({\rho }_{s}\) at time \(s\) and the probability \({\delta }_{t-s}\) of a reporting delay of \(t-s\) time units, leading to the right side of Eq. (8)32,38. In all of these equations, we assume a constant \(R\) until time \(t\), but note that they remain valid for time-varying reproduction numbers (in which case \(R\) becomes an approximate mean). If we have no delay and perfect reporting, then \({\rho }_{s}=1\), \({\delta }_{0}=1\), \({\delta }_{x > 0}=0\) and hence \({C}_{t}={I}_{t}\), recovering the original renewal model.
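A minimal simulation of this noisy renewal process can be sketched as follows. This is an illustrative sketch, not our released MATLAB/R code: it assumes a constant (rather than time-varying) reporting fraction and distributes each day's reports over a finite delay distribution via multinomial thinning.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_renewal(R, w, T, I0=10):
    # Left side of Eq. (8): I_t ~ Pois(R * Lambda_t), where
    # Lambda_t = sum_{x<t} w_{t-x} I_x and w[0] holds w_1.
    I = np.zeros(T, dtype=int)
    I[0] = I0
    for t in range(1, T):
        lam = sum(w[t - x - 1] * I[x] for x in range(max(0, t - len(w)), t))
        I[t] = rng.poisson(R * lam)
    return I

def observe_cases(I, rho, delta):
    # Right side of Eq. (8): thin infections by the reporting fraction
    # rho, then spread the reports over the delay pmf delta.
    T = len(I)
    C = np.zeros(T, dtype=int)
    for s in range(T):
        reported = rng.binomial(I[s], rho)
        if reported:
            for d, n in enumerate(rng.multinomial(reported, delta)):
                if s + d < T:
                    C[s + d] += n
    return C
```

Setting \(\rho =1\) and a point-mass delay at zero recovers \({C}_{t}={I}_{t}\), matching the perfect-reporting limit described above.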
In Figs. 1–3, we simulate under established COVID-19 transmission and surveillance noise parameters from the literature. We use the generation time distribution from ref. 10 (a gamma distribution with a mean of 6.5 days) and model delays in reporting, as done in ref. 60 (a negative binomial distribution with a mean of 10.8 days), with the fraction of infections reported as cases based on ref. 61 (a beta distribution with a mean of 0.38, implying under-reporting of mean 0.62). In Fig. 4 we perform equivalent simulations to Fig. 3 but now under Ebola virus parameters. We apply the generation time distribution from ref. 49 (a gamma distribution with a mean of 15.3 days) and reporting noise from refs. 62,63 (delays follow a negative binomial distribution with a mean of 11.8 days and under-reporting conforms to a beta distribution with a mean of 0.4). For the precise distributions underlying both our COVID-19 and Ebola virus analyses (as well as code to reproduce these figures) see the Data and code availability section.
Fisher information given noisy observations
The minimum (asymptotic) uncertainty around estimates of \(R\) derived from the renewal model of Eq. (8) can be quantified using the Fisher information \({{{\bf{I}}}}({{{\boldsymbol{.}}}})\). This can be computed from the expected curvature of the log-likelihood of the statistical models from Eq. (8). This leads to the relations in Eq. (9), which provide the information from infections and cases respectively, and are adapted from the frameworks introduced in refs. 32,41. Here \({F}_{t-s}={\sum }_{x=0}^{t-s}{\delta }_{x}\) describes the cumulative reporting delay probability and \({{{\bf{I}}}}\left({R|}{I}_{1}^{t}\right)\ge {{{\bf{I}}}}\left({R|}{C}_{1}^{t}\right)\) (noise reduces information).
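The ratio of these two information terms can be computed directly. The sketch below encodes one consistent reading of Eq. (9), under which \({{{\bf{I}}}}\left({R|}{I}_{1}^{t}\right)={R}^{-1}{\sum }_{s}{\Lambda }_{s}\) and \({{{\bf{I}}}}\left({R|}{C}_{1}^{t}\right)={R}^{-1}{\sum }_{s}{\rho }_{s}{F}_{t-s}{\Lambda }_{s}\); this specific form is our assumption here (the \({R}^{-1}\) factors cancel in the ratio), so treat it as illustrative.

```python
import numpy as np

def info_ratio(Lam, rho, F):
    # Ratio I(R|C_1^t) / I(R|I_1^t) under the Poisson renewal likelihood.
    # Lam[s] is total infectiousness, rho[s] the reporting fraction and
    # F[s] the cumulative delay probability aligned to each time point.
    # Since 0 <= rho, F <= 1, the ratio is at most 1: noise loses information.
    return float((rho * F * Lam).sum() / Lam.sum())
```

Perfect reporting (all \({\rho }_{s}=1\)) with no delay (all \({F}_{t-s}=1\)) yields a ratio of exactly one, while under-reporting and delay each scale the ratio down multiplicatively.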
In the main text, we examine ratios of these information terms and also derive related ratios for epidemic growth rates \(r\). These growth rates have a mapping to \(R\) described by the Euler-Lotka equation36. This equation depends on the generation time distribution of the disease. Commonly, a gamma-distributed generation time distribution is assumed43,59 i.e., the \({w}_{u}\) probabilities describe a shape-scale \({{{\bf{Gam}}}}\left(\alpha ,{\beta }^{-1}\right)\) distribution. Under this setting the Euler-Lotka relationship \(f(.)\) takes the form of the left side of Eq. (10).
For a given generation time distribution, \(R\) is a smooth function of \(r\), \(f(r)\), and we may apply the Fisher information change of variables formula on the right side of Eq. (10) to convert the expressions for \(R\) in Eq. (9) to ones for the growth rate. Here \({X}_{1}^{t}\) may be the time series of cases \({C}_{1}^{t}\) or infections \({I}_{1}^{t}\) as required. In the main text, we use the above equations to derive Fisher information ratios for both reproduction numbers and growth rates.
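The gamma-case mapping and the change of variables can be made concrete. The sketch below assumes the standard Wallinga–Lipsitch form \(R={(1+r/\beta )}^{\alpha }\) for a shape-\(\alpha\), rate-\(\beta\) (scale \({\beta }^{-1}\)) gamma generation time; whether this matches the exact parameterisation of Eq. (10) should be checked against the paper's notation.

```python
def euler_lotka_gamma(r, alpha, beta):
    # Euler-Lotka mapping R = f(r) = (1 + r/beta)**alpha for a gamma
    # generation time with shape alpha and rate beta (scale 1/beta).
    return (1 + r / beta) ** alpha

def info_r_from_info_R(info_R, r, alpha, beta):
    # Fisher information change of variables: I(r) = I(R) * (df/dr)^2,
    # with df/dr = (alpha/beta) * (1 + r/beta)**(alpha - 1).
    dfdr = (alpha / beta) * (1 + r / beta) ** (alpha - 1)
    return info_R * dfdr ** 2
```

At \(r=0\) the mapping returns \(R=1\), as required, since a zero growth rate corresponds to a critical epidemic.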
Cost-optimal feedback control of epidemics
We have focused on extracting general insights about the asymmetries in response times to growing or waning epidemics under arbitrary cost thresholds. This required tractable modelling approaches that facilitate analytic results. However, it is possible to instead construct complex algorithms15 that directly integrate competing costs from interventions and disease burden (e.g., epidemic peaks and infection endemic levels), as well as consider longer-term, feedback effects from chosen interventions47. We adopt such an algorithm from28, which uses model predictive control (MPC)46 to balance the component costs of \(\psi (t)\) in Eq. (11).
This MPC algorithm considers costs due to (i) the difference between infections and a practical endemic goal \({I}_{{{{\rm{end}}}}}\) (weighted by \(\alpha\)), (ii) exceeding a peak value \({I}_{{{{\rm{pk}}}}}\), which, for example, models the level of infections beyond which healthcare resources are overrun (weighted by \(\beta\)) and (iii) the actual economic or other penalties from enforcing an NPI (wrapped in a flexible function \(\phi (.)\)). Heuristically, applying a more stringent NPI increases (iii) but can decrease (i) and (ii). Hence, we penalise both action and inaction. This aligns with and expands on the framework of Eq. (1) and implies a threshold for action based on how components (i)–(iii) balance. The MPC approach of ref. 28 finds the optimal time to initiate or relax an NPI by minimising the long-term costs of those actions over a horizon \(h\), i.e., \({\sum }_{s=0}^{h}{\gamma }^{s}\psi (t+s)\), with \(\gamma\) as a discount factor.
This is done by projecting the consequences of those choices using a generalised form of Eq. (8) and then applying Bayesian optimisation to select the cost-optimal choice. In computing projections, the algorithm infers the effective reproduction number, incorporating uncertainty from transmissibility. See ref. 28 for complete details of this procedure. In our analyses, \({{{\rm{NP}}}}{{{{\rm{I}}}}}_{t}\) is 0 when inactive at time \(t\); when active it is 1 and models a lockdown or stay-at-home order as a multiplicative reduction in \(R\) (though it can easily be modified to model other intervention types). The cost \(\phi \left(1\right)\) associated with this action is set based on analyses in ref. 48. Late NPI relaxation increases (iii), while late NPI initiation increases (i) and (ii). When surveillance is noisy, we replace \({I}_{t}\) with cases \({C}_{t}\) and apply appropriate delay and under-reporting parameters from the literature. Projections using \({I}_{t}\) consider the stochasticity of transmission, while those using \({C}_{t}\) factor in the additional noise from the surveillance imperfections.
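The structure of the MPC objective can be illustrated with a short sketch. The functional forms, weights and NPI cost below are assumptions chosen for illustration; they follow the spirit of Eq. (11) but are not the exact specification of ref. 28.

```python
def stage_cost(I_t, npi, alpha=1.0, beta=10.0, I_end=50, I_pk=500,
               phi=lambda u: 100.0 * u):
    # Illustrative stage cost psi(t) with the three components of Eq. (11):
    endemic = alpha * abs(I_t - I_end)     # (i) distance from endemic goal
    peak = beta * max(0.0, I_t - I_pk)     # (ii) penalty for breaching peak
    return endemic + peak + phi(npi)       # (iii) cost of enforcing the NPI

def horizon_cost(infections, npis, gamma=0.95, **kw):
    # Discounted MPC objective: sum_{s=0}^{h} gamma^s psi(t+s), evaluated
    # over a projected infection trajectory and candidate NPI schedule.
    return sum(gamma ** s * stage_cost(I, u, **kw)
               for s, (I, u) in enumerate(zip(infections, npis)))
```

In an MPC loop, `horizon_cost` would be evaluated over projected trajectories for each candidate NPI schedule, and the schedule with the smallest discounted cost would determine the action taken at the current time.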
Data availability
All relevant data are available from the authors upon request.
Code availability
All code to reproduce the analyses and figures of this work is freely available (MATLAB and R) at: https://github.com/kpzoo/asymmetricDetection with release https://doi.org/10.5281/zenodo.17184842.
References
Wagner, M. M. et al. The emerging science of very early detection of disease outbreaks. J. Public Health Manag Pr. 7, 51–59 (2001).
Buckee, C. Improving epidemic surveillance and response: big data is dead, long live big data. Lancet Digit Health 2, e218–e220 (2020).
Jackson, M. L., Baer, A., Painter, I. & Duchin, J. A simulation study comparing aberration detection algorithms for syndromic surveillance. BMC Med Inf. Decis. Mak. 7, 6 (2007).
Brett T. S., Drake J. M., Rohani P. Anticipating the emergence of infectious diseases. J. R. Soc. Interface. 14. https://doi.org/10.1098/rsif.2017.0115.
Yuan, M., Boston-Fisher, N., Luo, Y., Verma, A. & Buckeridge, D. L. A systematic review of aberration detection algorithms used in public health surveillance. J. Biomed. Inf. 94, 103181 (2019).
Sonesson, C. & Bock, D. A review and discussion of prospective statistical surveillance in public health. J. R. Stat. Soc. A 166, 5–21 (2003).
Morris, D. H., Rossine, F. W., Plotkin, J. B. & Levin, S. A. Optimal, near-optimal, and robust epidemic control. Commun. Phys. 4, 78 (2021).
Metcalf, C. J. E. et al. Challenges in evaluating risks and policy options around endemic establishment or elimination of novel pathogens. Epidemics 37, 100507 (2021).
Metcalf, C. J. E. & Lessler, J. Opportunities and challenges in modeling emerging infectious diseases. Science 357, 149–152 (2017).
Ferguson N. et al. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. Imperial College London. https://doi.org/10.25561/77482.
Karatayev, V. A., Anand, M. & Bauch, C. T. Local lockdowns outperform global lockdown on the far side of the COVID-19 epidemic curve. Proc. Natl. Acad. Sci. USA 117, 24575–24580 (2020).
Probert, W. J. M. et al. Real-time decision-making during emergency disease outbreaks. PLoS Comput. Biol. 14, e1006202 (2018).
Greer, S. L., King, E. J., da Fonseca, E. M. & Peralta-Santos, A. The comparative politics of COVID-19: The need to understand government responses. Glob. Public Health 15, 1413–1416 (2020).
Sebhatu, A., Wennberg, K., Arora-Jonsson, S. & Lindberg, S. I. Explaining the homogeneous diffusion of COVID-19 nonpharmaceutical interventions across heterogeneous countries. Proc. Natl. Acad. Sci. USA 117, 21201–21208 (2020).
Haw D. J. et al. Data needs for integrated economic-epidemiological models of pandemic mitigation policies. Epidemics 41, 100644 (2022).
Barnett, M., Buchak, G. & Yannelis, C. Epidemic responses under uncertainty. Proc. Natl. Acad. Sci. USA 120, e2208111120 (2023).
Keogh-Brown, M. R., Jensen, H. T., Edmunds, W. J. & Smith, R. D. The impact of Covid-19, associated behaviours and policies on the UK economy: A computable general equilibrium model. SSM Popul. Health 12, 100651 (2020).
Shea, K. et al. Multiple models for outbreak decision support in the face of uncertainty. Proc. Natl. Acad. Sci. USA 120, e2207537120 (2023).
Li, S.-L. et al. Essential information: Uncertainty and optimal control of Ebola outbreaks. Proc. Natl. Acad. Sci. USA 114, 5659–5664 (2017).
Ferguson, N. M. et al. Planning for smallpox outbreaks. Nature 425, 681–685 (2003).
Carrasco, L. R. et al. Trends in parameterization, economics and host behaviour in influenza pandemic modelling: a review and reporting protocol. Emerg. Themes Epidemiol. 10, 3 (2013).
Fraser, C., Riley, S., Anderson, R. M. & Ferguson, N. M. Factors that make an infectious disease outbreak controllable. Proc. Natl. Acad. Sci. USA 101, 6146–6151 (2004).
Parag, K. V. & Donnelly, C. A. Fundamental limits on inferring epidemic resurgence in real time using effective reproduction numbers. PLoS Comput. Biol. 18, e1010004 (2022).
Dehning J. et al. Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Science 369. https://doi.org/10.1126/science.abb9789.
Unkel, S., Farrington, C. P., Garthwaite, P. H., Robertson, C. & Andrews, N. Statistical methods for the prospective detection of infectious disease outbreaks: a review. J. R. Stat. Soc. A 175, 49–82 (2012).
Johnson P., Moriarty J., Peskir G. Detecting changes in real-time data: a user’s guide to optimal detection. Philos Trans A Math Phys Eng Sci. 375. https://doi.org/10.1098/rsta.2016.0298.
Kay S. M. Fundamentals of Statistical Signal Processing: Detection theory. reprint. Prentice-Hall PTR; 1998.
Beregi, S. & Parag, K. V. Optimal algorithms for controlling infectious diseases in real time using noisy infection data. PLoS Comput. Biol. 21, e1013426 (2025).
Berger J. O. Statistical decision theory and Bayesian analysis. New York, NY: Springer New York; 1985. https://doi.org/10.1007/978-1-4757-4286-2.
Gabriele, T. Information criteria for threshold determination (Corresp.). IEEE Trans. Inf. Theory 12, 484–486 (1966).
Goldstein, E. et al. Reconstructing influenza incidence by deconvolution of daily mortality time series. Proc. Natl. Acad. Sci. USA 106, 21825–21829 (2009).
Parag, K. V., Donnelly, C. A. & Zarebski, A. E. Quantifying the information in noisy epidemic curves. Nat. Comput Sci. 2, 584–594 (2022).
Salmon, M., Schumacher, D. & Stark, K. Others. Bayesian outbreak detection in the presence of reporting delays. Biometr. J. 57, 1051–1067 (2015).
Southall, E., Brett, T. S., Tildesley, M. J. & Dyson, L. Early warning signals of infectious disease transitions: a review. J. R. Soc. Interface 18, 20210555 (2021).
Bettencourt, L. M. A. & Ribeiro, R. M. Real time bayesian estimation of the epidemic potential of emerging infectious diseases. PLoS One 3, e2185 (2008).
Wallinga, J. & Lipsitch, M. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc. R. Soc. B 274, 599–604 (2007).
Kraemer, M. U. G. et al. Monitoring key epidemiological parameters of SARS-CoV-2 transmission. Nat. Med 27, 1854–1855 (2021).
Azmon, A., Faes, C. & Hens, N. On the estimation of the reproduction number based on misreported epidemic data. Stat. Med 33, 1176–1192 (2014).
Anderson, R. et al. Reproduction number (R) and growth rate (r) of the COVID-19 epidemic in the UK: methods of estimation, data sources, causes of heterogeneity, and use as a guide in policy formulation. The Royal Society (2020).
Cover T., Thomas J. Elements of Information Theory Second Edition. John Wiley and Sons. (2006).
Parag, K. V. & Donnelly, C. A. Adaptive estimation for epidemic renewal and phylogenetic skyline models. Syst. Biol. 69, 1163–1179 (2020).
Parag, K. V. Improved estimation of time-varying reproduction numbers at low case incidence and between epidemic waves. PLoS Comput. Biol. 17, e1009347 (2021).
Parag K. V., Thompson R. N., Donnelly C. A. Are epidemic growth rates more informative than reproduction numbers? J. Royal Stat. Soc. A. 185, S5–S15 (2022).
Dushoff, J. & Park, S. W. Speed and strength of an epidemic intervention. Proc. Biol. Sci. 288, 20201556 (2021).
Parag K. V., Cowling B. J., Lambert B. C. Angular reproduction numbers improve estimates of transmissibility when disease generation times are misspecified or time-varying. Proc. Royal Soc. B: Biol. Sci. 290, 20231664 (2023).
Schwenzer, M., Ay, M., Bergs, T. & Abel, D. Review on model predictive control: an engineering perspective. Int J. Adv. Manuf. Technol. 117, 1327–1349 (2021).
Parag, K. V. How to measure the controllability of an infectious disease? Phys. Rev. X. 14, 031041 (2024).
Haw, D. J. et al. Optimizing social and economic activity while containing SARS-CoV-2 transmission using DAEDALUS. Nat. Comput. Sci. 2, 223–233 (2022).
Van Kerkhove, M. D., Bento, A. I., Mills, H. L., Ferguson, N. M. & Donnelly, C. A. A review of epidemiological parameters from Ebola outbreaks to inform early public health decision-making. Sci. Data 2, 150019 (2015).
Runge, M. C. et al. Scenario design for infectious disease projections: Integrating concepts from decision analysis and experimental design. Epidemics 47, 100775 (2024).
Shearer, F. M., Moss, R., McVernon, J., Ross, J. V. & McCaw, J. M. Infectious disease pandemic planning and response: Incorporating decision analysis. PLoS Med. 17, e1003018 (2020).
Swallow, B. et al. Challenges in estimation, uncertainty quantification and elicitation for pandemic modelling. Epidemics 38, 100547 (2022).
Stolerman, L. M. et al. Using digital traces to build prospective and real-time county-level early warning systems to anticipate COVID-19 outbreaks in the United States. Sci. Adv. 9, eabq0199 (2023).
Sharan, M., Vijay, D., Yadav, J. P., Bedi, J. S. & Dhaka, P. Surveillance and response strategies for zoonotic diseases: a comprehensive review. Sci. One Health 2, 100050 (2023).
Monod M. et al. Age groups that sustain resurging COVID-19 epidemics in the United States. Science 371, https://doi.org/10.1126/science.abe837 (2021).
Cori, A., Lassmann, B. & Nouvellet, P. Data needs for better surveillance and response to infectious disease threats. Epidemics 43, 100685 (2023).
Hardt M., Recht B. [2102.05242] Patterns, predictions, and actions: A story about machine learning. arXiv. (2021).
Fraser, C. Estimating individual and household reproduction numbers in an emerging epidemic. PLoS One 2, e758 (2007).
Cori, A., Ferguson, N. M., Fraser, C. & Cauchemez, S. A new framework and software to estimate time-varying reproduction numbers during epidemics. Am. J. Epidemiol. 178, 1505–1512 (2013).
Irons N. J., Raftery A. E. Estimating SARS-CoV-2 infections from deaths, confirmed cases, tests, and random surveys. Proc. Natl Acad. Sci. USA. 118, e2103272118 (2021).
Pullano, G. et al. Underdetection of cases of COVID-19 in France threatens epidemic control. Nature 590, 134–139 (2021).
Dalziel, B. D. et al. Unreported cases in the 2014-2016 Ebola epidemic: Spatiotemporal variation, and implications for estimating transmission. PLoS Negl. Trop. Dis. 12, e0006161 (2018).
WHO Ebola Response Team. Ebola virus disease in West Africa – the first 9 months of the epidemic and forward projections. N. Engl. J. Med. 371, 1481–1495 (2014).
Acknowledgements
KVP and SB acknowledge support (Reference No. MR/X020258/1) from the MRC Centre for Global Infectious Disease Analysis funded by the UK Medical Research Council. This UK-funded grant is carried out in the frame of the Global Health EDCTP3 Joint Undertaking. The funders played no role in study design, data collection and analysis, decision to publish, or manuscript preparation.
Author information
Authors and Affiliations
Contributions
Conceptualization, investigation, formal analysis, writing (original draft preparation), funding acquisition, supervision: K.V.P. Software, visualisation: KVP and SB. Validation, methodology, writing (review and editing): K.V.P., B.L., C.A.D., S.B.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Physics thanks Sung-mok Jung and David Soriano-Paños for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Parag, K.V., Lambert, B., Donnelly, C.A. et al. Asymmetric limits on timely interventions from noisy epidemic data. Commun Phys 8, 450 (2025). https://doi.org/10.1038/s42005-025-02358-w