The Big Bang of an epidemic: a metapopulation approach to identify the spatiotemporal origin of contagious diseases and their universal spreading pattern

Babazadeh Maghsoodlo, Yazdan; Safaeesirat, Amin; Ghanbarnejad, Fakhteh

doi:10.1038/s41598-025-85232-7

Download PDF

Article
Open access
Published: 17 February 2025

The Big Bang of an epidemic: a metapopulation approach to identify the spatiotemporal origin of contagious diseases and their universal spreading pattern

Scientific Reports volume 15, Article number: 5809 (2025) Cite this article

4248 Accesses
2 Citations
1 Altmetric
Metrics details

Subjects

Abstract

In this paper, we propose a mathematical framework that governs the evolution of epidemic dynamics, encompassing both intra-population dynamics and inter-population mobility within a meta-population network. By linearizing this dynamical system, we can identify the spatial starting point(s), namely the source(s) and the initiation time of the epidemic, which we refer to as the “Big Bang” of the epidemic. Furthermore, we introduce a novel concept of effective distance to track disease spread within the network. Our analysis reveals that the contagion geometry can be represented as a line with a universal slope, for any disease type (R0) or mobility network configuration. The mathematical derivations presented in this framework are corroborated by empirical data, including observations from the COVID-19 pandemic in Iran and the US and the H1N1 outbreak worldwide. Within this framework, to detect the Big Bang of an epidemic we require two types of data: (1) A snapshot of the active infected cases in each subpopulation during the linear phase. (2) A coarse-grained representation of inter-population mobility. Also even with access to only the first type of data, we can still demonstrate the universal contagion geometric pattern. Additionally, we can estimate errors and assess the precision of the estimations. This comprehensive approach enhances our understanding of when and where epidemics began and how they spread. It equips us with valuable insights for developing effective public health policies and mitigating the impact of infectious diseases on populations worldwide.

Mathematical model of the feedback between global supply chain disruption and COVID-19 dynamics

Article Open access 29 July 2021

A rigorous theoretical and numerical analysis of a nonlinear reaction-diffusion epidemic model pertaining dynamics of COVID-19

Article Open access 04 April 2024

Emergence of universality in the transmission dynamics of COVID-19

Article Open access 23 September 2021

Introduction

Throughout history, infectious disease outbreaks have significantly impacted human life all over the world^1,2, causing many deaths³. For instance, the COVID-19 pandemic⁴, affected people of all countries^5,6, mentally^7,8,9, financially¹⁰, and beyond. Human mobility is a key factor in this regard^{11,12,13,14,15,16}, accounting for the spatial spread of diseases, facilitating their propagation through the densely connected networks of global, national, local displacement^17,18,19. The complex network of travel routes²⁰ provides numerous direct and indirect pathways for disease transmission at various scales, with air travel playing a crucial role in the rapid spread of viruses including SARS-CoV-2^{21,22,23,24,25,26} at the macroscopic level, given their potential to connect distant locations²⁷. This underscores the importance of taking immediate non-pharmacological interventions²⁸, such as air travel restrictions^21,29, to control disease spread. Understanding the underlying transmission mechanisms of disease at a coarse-grained level, where the mobility network can be considered as a meta-population network^30,31–which consists of nodes, each representing a distinct patch or sub-population.– is essential for developing effective strategies to control pandemics and save lives.

More specifically, answering questions like “Where is (are) the source (sources) of an epidemic?”, “When did it begin?”, which we refer to as the Big Bang of the epidemic, and “How does the outbreak spread through other sub-populations?” after the Big Bang, is of paramount importance.

In recent studies, a variety of mathematical models³², ranging from stochastic models^33,34,35, spatio-temporal spreading models^36,37, reaction-diffusion models³⁸, to agent-based models^{39,40,41,42,43} and network^27,44,45,46 models and meta-population models^{34,41,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61} have been developed to investigate various aspects of disease-spread phenomena. More specifically, effective distance (ED) can effectively address the issues stated above^{13,23,62,63,64,64,65}.

The effective distance between two sub-populations is determined by the probability of travel along direct or indirect routes connecting them. In this context, the most likely path between two sub-populations plays a key role. In 2007, Gautreau et al.⁶² introduced an approach for calculating this effective distance. Later, in 2013, Brockmann and Helbing¹³ refined this approach, proposing an ansatz to quantify effective distance based on travel probabilities. Their method demonstrated a strong correlation between effective distance and the first arrival time of a disease, particularly when applied to the global air transportation network.

The methodology was later improved by adding the effects of all possible paths⁶⁶, resulting in a substantial increase in the correlation between the first arrival time and ED. Also this approach has been validated with empirical data of the 2003 SARS and the 2009 H1N1 pandemics¹³. Other successful modifications on ED have also been reported^23,63,64. For instance, Zhang et al introduced Country Distancing⁶⁴, which is similar to the equivalent resistance defined for parallel resistors in electrical circuits. The idea of ED was also successfully tested for the COVID-19 pandemic on the world air traffic data⁶⁵. This correlation between effective distance and arrival times not only suggests that effective distance can be a useful predictor of disease spread but also provides a means to trace the origin of an epidemic, helping identify the source of the outbreak¹³.

There are also many other methods developed to identify the source of an outbreak in a variety of networks such as a meta-population network or a network of individuals, and in different contexts like disease spreading^{67,68,69,70,71,72,73}, information spreading^74,75,76, food contamination^77,78,79, rumors^{80,80,81,82,83,84,85}, diffusion processes on networks^{86,87,88,89,90,91}, etc. Typically, the aim of these studies is to identify the source of spread from a “snapshot”, for example number of infected ones, which is the state of the system after the start of the spread.

Despite the success of ED methods, there are some limitations. Some examples follow. First, their definitions often rely on intuition rather than being grounded in comprehensive mathematical models, which may hinder their clarity and interpretability, as well as impede a deep understanding of the contagious dynamics they aim to describe. Second, it is usually assumed that the disease originates from a specific location in the network (the source) and contaminates other nodes over time. However, this assumption is not necessarily true. For instance, on a country scale, the disease can reach different nodes (states or provinces) from outside the network during its spread, effectively acting as multi-sources within the country. Finally, the methods lack any correction of time to detect the beginning of the epidemic in data or any error analysis to check the validity of the spatial and temporal source estimations.

In this paper, we address these gaps by first developing a mathematical framework based on intra-population SIR model dynamics and inter-population mobility within a meta-population network. We then derive expressions to identify the source or sources of any given epidemic, as well as its starting time, given the number of infected individuals and coarse-grained mobility data at a specific time. Additionally, we propose a new definition for effective distance, which universally relates the overtaking time of nodes to their distance from the epidemic’s source. We validate this method with real-world data from the COVID-19 pandemic in Iran and the US, as well as the global H1N1 pandemic.

Our mathematical framework

We aim to identify the ’Big Bang’ of a given epidemic’s dynamics and, specifically, to understand how the disease outbreak spreads following this initial event. In doing so, the first step is to propose our general mathematical framework which the spreading dynamics adhere to.

When studying the spread of infectious diseases at a given coarse-grained scale, the disease can be considered to spread through a meta population network. In this network, the number of infected and infectious individuals at time t and for subpopulation (node) i, $I_i(t)$, can be put into components of a vector we call $\vec {I}(t)$:

$$\begin{aligned} \vec {I}(t)=(I_{1}(t), I_{2}(t), ... , I_{n}(t)). \end{aligned}$$

(1)

Here, n is the number of nodes. In general, during epidemic dynamics, there is no specific pattern in the number of infected people across all nodes. However, typically, the number initially increases to reach a peak, followed by a subsequent decline, see Fig. 1A. In our following framework, our primary objective is to demonstrate the evolution of $\vec {I}(t)$ and subsequently discern a straightforward geometric pattern that illustrates the progression of the outbreak within nodes of the meta-population network. Therefore we show that vector $\vec {I}(t)$ evolves in time as follows, see Fig. 1B and the Supplementary Material (S.M.), Section 1 for more details:

$$\begin{aligned} \vec {I}(t) = e^{\hat{B}pt} \vec {I}(0). \end{aligned}$$

(2)

In this equation $\hat{B}$ is a matrix that describes the evolution of $\vec {I}(t)$ and includes $q_{i}-1$ in diagonal components, which represent the growth of the disease in each node and probability matrix $P_{ij}$ in non-diagonal components. The above equation is the solution of the following equation, assuming $S_i \approx N_i$ for the early stages of the dynamic, see S.M. 1.4 for more details:

$$\begin{aligned} \frac{dI_i}{dt} = [\beta _i {N_i} I_i -\gamma _i I_i ] + [p\sum _{j} P_{ji} I_j -p I_i]. \end{aligned}$$

(3)

While this equation represents the dynamic of vector $\vec {I}(t)$ and has two parts:

1.
spreading within the population, namely intra population (First Bracket).
2.
mobility between subpopulations, namely inter population (Second Bracket).

We use a simple Susceptible-Infected-Recovered model (SIR) with a well-mixed assumption for the first part of the equation in which $N_i$ is the population of node i, $\beta$ is the transmission rate, and $\gamma$ is the recovery rate. Moreover, in the second bracket, we connect all nodes, i.e. each population via a meta-population network. $P_{ji}$ in Eq. (3) represents the probability of traveling from node i to node j. We assume that the probability of traveling between all nodes is equal to the average probability of travel in the network, denoted as p. A detailed explanation of how the intra-population term in Equation 3 is derived can be found in S.M. 1.2 and S.M. 1.3. It’s worth noting that in this model we have not considered any immigration to the network, which is another simplifying assumption.

In Eq. (2), matrix $\hat{B}$ (We refer to this operator as Astwihad or the Black Div, who is known as the demon of death in Iranian mythology^92,93.) keeps all of the information regarding the properties of the dynamics (Fig. 1C, center). The diagonal components ($q_i$) represent the properties of internal growth of the disease (Intra Population Dynamics). The population of infected people in each node shows exponential behavior in the early stages. So, in the plot of $\log (I)$ vs time, there is a linear pattern for each node at the beginning of the outbreak called the “linear phase”. In this study, we specifically focus on this part of the dynamics. As it can be seen in Fig. 1C, left, the slope of this line for node i is $q_i$, which sits into the $i_{th}$ diagonal component of matrix $\hat{B}$ (Fig. 1C, center), see more details in S.M. 1.5.

Other components of matrix $\hat{B}$ are called $P_{ij}$, which represent the probability of traveling from node i to j. For calculating the value of $P_{ij}$ we use the flow matrix, $F_{ij}$, which keeps the number of people who travel from node i to j in a specific period. (Fig. 1C, Right). Further details about probability and flow matrices are available in the S.M. 1.2.

Now we can expand the solution (Eq. 2) and write it down as:

$$\begin{aligned} \vec {I}(t)= \left( \hat{1} + \hat{B}pt + \frac{(\hat{B}pt)^2}{2} + ...\right) \vec {I}(0) \end{aligned}$$

(4)

This expansion contains different powers of matrix $\hat{B}$. Since $\hat{B}$ has $P_{ij}$ on its non-diagonal components, different powers of $\hat{B}$ generate different powers of matrix P. As matrix P describes the probability of transition directly from node i to j, higher powers of P describe the probability of indirect transition from node i to j through a specific number of intermediary nodes in between. For instance, we can define $P_{ij}^1 = \sum _k P_{ik}P_{kj}$ as the probability of mobility from node i to j through one intermediary node in the network. This is a new finding we call Intermediate Probability. As can be seen in Fig. 1D, the $k_{th}$ component of the expansion holds the intermediate probability of using $\kappa -1$ intermediary nodes (S.M. 1.6).

The expanded solution can get simpler even further by focusing on the early stage of the dynamic, i.e. the linear phase, and because it takes more time to transit by indirect paths via intermediate nodes. This means that in the early stages of the dynamics, as we go further into the expansion, terms become smaller. Therefore, we can simplify the dynamic by cutting the expansion up to a certain term, keeping only the first terms. This defines the time scales for our model framework. We need to be in a time range in which the assumption of linear evolution for SIR and expansion’s cut work together, or $\tau =$ ${\textit{min}}$ $\{\frac{1}{\beta - \gamma }, \frac{1}{p}\}$, which constrains the time scale.

As explained, we aim to focus on the very beginning of the spread process, when everything just started and almost no one was infected, and study the expansion of the number of infected individuals in a specific order from the start, like the idea of the Big Bang in cosmology. In the following section, we will introduce several algorithms to solve the challenges and find the starting time and place as well as the hidden spread mechanism of the disease.

Derivations, algorithms and results

In this section, using our mathematical framework, we introduce algorithms that reveal where and when the outbreak began and how it spread further using the snapshots of the disease and the flow matrix. In the first algorithm, we detect the potential sources of the disease. In the second one, we estimate the starting time of the spread, then we introduce an algorithm to illustrate a geometric pattern for the spread of the disease. There are different sources of error in the estimations such as approximating the outbreak with SIR model, inaccurate measurement of the number of infected people, and the flow matrix, which is challenging to take into account. Therefore, we only report the theoretical error caused by the cut-off in the expansion of $\exp {\hat{B}pt}$ operator, see details in S.M. Sections 2.5 and 2.8. And the role of estimating epidemiological parameters in error analysis is discussed in the supplementary text (S.M., Section 5). For the value of $R_0$ we have used the method in S.M. 1.7. Given that these algorithms rely on empirical data as input, to distinguish empirical data from theoretical variables, we denote empirical data using the subscript or superscript e. For instance, $\vec {I}(t)$ is a vector containing the number of infected people in our mathematical framework, and $\vec {I}^e(t)$ is the same vector but contains the empirical data of infected people coming from official reports and announcements, see S.M. Sec. 4.

Where did it start?

Here, we aim to find the potential sources, where the dynamic began, having $\vec {I}^e(t)$ as empirical data and (Eq. 2) as theoretical formalism. We first develop the theoretical basis of the algorithm and then discuss how to apply it to COVID-19 data of Iran and the USA.

In a network with n nodes, there is a n-dimensional vector space whose i-th basis represents the node i. We define the basis ${\hat{i}}$ as

$$\begin{aligned} {\hat{i}} = (0,0,...,1,..0), \end{aligned}$$

(5)

in which its i-th component is 1 and others are zeros. Now we can expand vector $\vec {I}(t)$ in this space using these bases. The value of the component of this vector on each basis indicates the contribution of that basis to the spread of the disease. As can be seen in Eq. (2), the vector $\vec {I}(t)$ evolves in time and we have to evolve the bases in time as well, to rewind the dynamic to the origin of the time (see S.M. 2.1). Therefore, we redefine the bases as follows:

$$\begin{aligned} \hat{i'} = e^{\hat{B}pt} {\hat{i}}. \end{aligned}$$

(6)

Now, we define the “weight of a node” as:

$$\begin{aligned} W_i = \frac{\hat{i'}.\vec {I}(t)}{\sum _{\hat{i'}}{\hat{i'}.\vec {I}(t)}}, \end{aligned}$$

(7)

where “.” represents the inner product between each two vectors. The reason for this definition is to introduce $W_i$ as a number between 0 and 1 that shows the impact and contribution of the node i on the spreading and the initial prevalence distribution. If t is measured exactly from the origin of time of the disease, then $W_i$ represents the spatial source of the disease, which can be a single or multiple source. For example, for a given network if $W_i=1$, it means that node i was the only source of the network. For empirical data we use the vector $\vec {I^e}(t_e)$ in Eq. (7), instead of vector $\vec {I}(t)$. $t_e$ is time, measured from the officially reported temporal origin of the disease. Therefore

$$\begin{aligned} W^e_i = \frac{\vec {i'}.\vec {I^e}(t_e)}{\sum _{\vec {i'}}{\vec {i'}.\vec {I^e}(t_e)}} , \end{aligned}$$

(8)

It is important to note that $W^e_i$ calculation is independent of the structure of the network as there is no constraint (such as topology of the network) on the flow matrix used in the calculation. An algorithmic description of the where algorithm is provided in section 2.2.

In Fig. 2, we implement this method for the empirical COVID-19 data of Iran and the US (Panel A and A’) and illustrate the results. Also the mobility probability matrix are plotted respectively in panels B and B’. In the case of Iran (Panel C and D), we observe that the values of weights differ by implementing different snapshots. In the first scenario, Qom has the highest weight value, but when considering data from later days, Tehran (the capital province of Iran) surpasses it. Since the largest international airport in Iran (Imam Khomeini) is located between Qom and Tehran provinces, and other important nodes like Gilan and Mazandaran (which also have high values of W) are geographically close to this airport, we can conclude that it is more probable that the very first seed of the disease came from this airport to the country and Tehran and Qom are the most probable sources of the disease, which is consistent with official reports. For the case of the US, (C’ and D’) Washington state has the highest value of weight in the first plot, but over time, based on the values of W, and error bars, other states like Michigan, California, and New York could also be considered important nodes. So, based on this figure, our model predicts that the states of Washington, Michigan, and New York were the most probable sources of the pandemic in The US. We should mention that importation and lockdown have not been included in our analysis because we are focused on the early stage of the disease and also t=0 in this analysis regards the time when the first seed is already transported to the network (see discussion section).

The reasons for these fluctuations can be the uncertainty in empirical data (error in testing, organization error, and other sources of errors), deviation from the assumption of our model (see discussion section), and errors in calibration ($q_i$ and p).

To calculate the error of the Eqs. (7) and (8) we consider the value of cut-off error since only the first terms of the Taylor expansion of $e^{Bpt}$ is used in the calculations. If $W_i^e$ is calculated using the first $\kappa$ terms of the $e^{Bpt}$ expansion, the error of $W_i^e$ can be calculated using $\kappa$+1 term as

$$\begin{aligned} \delta W^e_i = \frac{(\frac{t^{\kappa +1}\hat{B}^{\kappa +1}}{\kappa !}\vec {i'}).\vec {I^e}(t_e)}{\sum _{\vec {i'}}\vec {i'} \cdot \vec {I^e}(t_e)} , \end{aligned}$$

(9)

in which “.” represents the inner product between each two vectors. The value of error depends both on $\kappa$ and the value of $t_e$. By increasing $\kappa$ or decreasing $t_e$, we expect to get a smaller value of error. To get a better idea from the value of cut-off error in a whole network, we define the error of cut-off vector as:

$$\begin{aligned} |\vec {\delta W^e}| = \frac{\sqrt{\sum _i{(\delta W^e_i)^2}}}{n}, \end{aligned}$$

(10)

which is shown versus $\kappa$ and $t_e$ for Iran (panel E) and the US (panel E’).

When did it start?

In this section, our goal is to estimate the temporal origin of the outbreak. As we already mentioned, the real origin of time may differ from the one that is officially reported.

Assume that the source is known by the Where algorithm or any other method. To find the temporal origin we compare the estimated ($\vec {I}(t)$) and reported ($\vec {I^e}(t_e)$) number of infected people at the time t and $t_e$, respectively. We find the temporal origin so that it minimizes the mean squared error (MSE) between $\vec {I}(t)$ and $\vec {I^e}(t_e)$,

$$\begin{aligned} \Delta _i (t)=\frac{\sum _{j=1}^n (I_j(t) - I_j^e(t_e))^2}{n}. \end{aligned}$$

(11)

To estimate the number of infected people, $\vec {I}(t)$, we assume the source is the node i found by the Where algorithm, Eq. (11) has a unique minimum at time $t^*_i$ (see S.M. 2.3 and (Fig. 3)) :

$$\begin{aligned} t^*_i= \frac{1}{i_0} \frac{-\eta (i_0 - I_i^e(t_e)) + p \sum _j(P_{ij}I_{j}^e(t_e))}{\eta ^2 + p^2 \sum _j P_{ij}^2}, \end{aligned}$$

(12)

in which $\eta = (\beta _i N_i - \gamma _i - p)$, in which $\beta$ is the transmission rate, $N_i$ is the population of node i, $\gamma$ is the recovery rate and p is the average probability of travelling., $i_0$ is the initial number of infected people in the source, $P_{ij}$ is the probability matrix. For the error, we consider the cut-off error by adding the third term of the Taylor expansion as the error-making term to our calculation which shows how the predictions degrade (S.M. 2.5). An algorithmic description of the when algorithm is provided in section 8.4.

Figure 3 demonstrates the result of our algorithm applied to the empirical data of Iran and the US. In panel A and C, $\Delta$ (Eq. 11) is shown versus time for Iran and the US, respectively. A minimum exists in both cases, as we showed in Eq. (66). By adding the third term of the Taylor expansion (Eq. 4) to the calculation, the corrected MSE (specified with different colors) shifts to a new curve and the minimum moves a bit (around two days for Iran and less than a day for the US). To understand the accuracy of the algorithm, we illustrate $t^*_i$ versus $t_e$ for Iran (panel B) and the US (panel D) using snapshots from various days. A linear behavior can be observed up to a certain point in both cases, which indicates the linear range of the dynamics. It is worth noting again that $t^*_i$ describes the starting time of the disease from the epidemiological point of view, while $t_e$ refers to the starting time the official reports claim. So, a difference between these two origins of time is expected. By utilizing these specific snapshots, the onset of the COVID-19 pandemic is estimated to be 8 February 2020 for Iran and 12 February 2020 for the US, marking the commencement of the widespread outbreak in Iran and the US.

How does it spread? The universal pattern of any outbreaks

In previous sections, we estimated the origin of the disease, trying to answer when and where it began. In this section, we aim to illustrate the simple geometric patterns behind the dynamics.

For the first step, let us define the overtaking time in our mathematical formalism. This is the time when enough number of infected passengers arrive in a susceptible node so that we can consider this node as infected. In other words, when the intra-population spreading in this node (the first bracket in Eq. 3) equals the inter-population spreading (the second bracket in Eq. 3), which means

$$\begin{aligned} (N_j\beta _j - \gamma _j)I_j = p \left( \sum P_{kj}I_k - I_j\right) . \end{aligned}$$

(13)

Using the above condition and the dynamic of our model given by Eq. (4), one can show that the overtaking time is (see S.M. Sections 2.6, 2.7 and 2.8 for more details)

$$\begin{aligned} t_{O}^{j} = \frac{1}{p} \frac{1}{(2+q_j-q_i)-\frac{P_{ij}^1}{P_{ij}}}, \end{aligned}$$

(14)

in which the overtaking time, $t_O^j$, has been calculated for the node j, given the single source, node i, in the network.

To detect a simple geometric pattern of the spread dynamic, as shown in Fig. 1.b, we define an effective distance between a non-origin node j and the single origin node i so that there is a linear relation between overtaking time (Eq. 87). Therefore, our effective distance is defined as

$$\begin{aligned} D_{ij} = \frac{1}{(2+q_j-q_i)-\frac{P_{ij}^1}{P_{ij}}}, \end{aligned}$$

(15)

where

$$\begin{aligned} D_{ij} = pt_O^{j}. \end{aligned}$$

(16)

Our innovative approach to defining effective distance distinguishes itself from previous methods^13,62,66. While maintaining a similar geometric pattern of spread, our method uniquely utilizes overtaking time rather than arrival time, which is when the first infected patient arrives. Remarkably, we reveal a universal behavior, characterized by a consistent slope of one when plotting $D_{ij}$ against $pt^j_O$, regardless of network or disease characteristics. In the above equation, p represents the inter-population speed of the disease spread among the nodes in a network.

Our proposed effective distance becomes simpler for some special cases. For example, if all nodes have the same value of q, the effective distance simplifies as

$$\begin{aligned} D_{ij} = \frac{1}{2-\frac{P_{ij}^1}{P_{ij}}}, \end{aligned}$$

(17)

which is independent of the disease properties and only depends on the the travel flow. The defined effective distance only includes those nodes that satisfy $\frac{P_{ij}^1}{P_{ij}}<2$ as effective distance and the overtaking times should be positive for the non-origin nodes. When this condition is not met, an alternative approach may be necessary in order to include other nodes. Further studies could investigate such adaptations.

Figure 4 shows the result of the effective distance analysis in two panels. In both panels, the spreading of the disease has been simulated with the SIR model for meta-population networks using the empirical mobility data of Iran (panel A) and the empirical mobility network of the US (panel B). In each panel, the simulation has been repeated twice (Green and Gray in the left panel, Blue and Red in the right panel), each time with a different source node. The Where algorithm is used in the specific choice of the source nodes. As shown, there is a linear relation between the defined effective distance and pt, with the universal slope of one. Changing the source node does not change the linear relation and the value of the slope.

Implementing effective distance analysis with empirical data can pose several challenges. Some of these are outlined below. First, what is reported as the arrival time in official data is not necessarily the same as what we defined as overtaking time in Eq. (87), even though they are close. Second, measuring the exact value of the mobility probability matrix is difficult, especially due to the intervention policy in each region at the beginning. Finally, the initial number of infected people ($i_0$) is not necessarily known.

It is possible to overcome the challenges stated above by estimating the effective distance of the node j from the source using the number of infected people in that node ($I^e_j(t_e)$). Using the mathematical framework, one can show that $I^e_j(t_e)$ can be estimated by a parabola in a short enough time after the arrival of the disease to the node.

$$\begin{aligned} I_j=Q_j+A_j T + B_j T^2 \end{aligned}$$

(18)

One can show that using the coefficients of this equation and the mathematical framework of this model, we can rewrite the effective distance as:

$$\begin{aligned} D_{ij} = \frac{1}{q_j-\frac{B_j}{A_j}} \end{aligned}$$

(19)

Also estimate the overtaking time by assuming $I_j = 0$ and solve the equation for T:

$$\begin{aligned} T= \frac{-A_j+\sqrt{A_j^2-4Q_jB_j}}{2A_j}. \end{aligned}$$

(20)

See S.M. section 2.9 for a detailed proofs of these equations. Figure 5 shows the estimated effective distance and overtaking time for the empirical data of the COVID-19 pandemic in Iran (panel A), the US (panel B), and the H1N1 pandemic in 2009 in the meta-population of the world. As shown, there is a linear relation between effective distance and overtaking time in all instances, with the universal slope close to 1. This result demonstrates that the linear relation with a slope of one remains robust for any disease type or any size and structure of the meta-population network in which the disease spreads.

Concluding remarks

In summary, we introduced a mathematical framework based on the SIR model for meta-population networks, incorporating inter-population mobility. We derived a compact equation (Eq. 2) that represents the time evolution of the number of infected individuals using the mathematical operator $e^{\hat{B}pt}$. We showed how different terms in the Taylor expansion of the operator represent possible transmission paths with different number of intermediary nodes. Based on this general mathematical framework and the provided data, we were able to determine where and when the outbreak began, as well as how it spread within the meta-network.

Firstly, we derived a measure indicating the contribution of each node to disease spread, whether in single-source or multi-source pandemics. Our analysis of COVID-19 revealed that Qom, Tehran, Gilan, and Mazandaran carry the greatest weight in Iran, indicating these provinces as probable sources of the pandemic. This observation aligns with the proximity of these provinces to Imam Khomeini International Airport and the relatively high volume of travel to these areas. Likewise, Washington, Michigan, New York, and California were identified as likely sources of the pandemic in the US.

Secondly, we derived an expression to find the temporal origin of a pandemic. Thus, we estimated the beginning date of the COVID-19 pandemic in Iran and the US is Feb. 8, 2020, and Feb. 12, 2020, respectively. These dates precede the officially announced start dates in both countries, suggesting that the pandemic may have begun earlier than previously thought.

Thirdly, we introduced a novel definition for effective distance and demonstrated that the effective distance of a node from the source exhibits a linear relationship with the scaled overtaking time (pt), characterized by a universal slope of one. Importantly, this relationship remains robust for any epidemiological parameters of the disease and characteristics of the meta-population network, such as the number of passengers and network structure. This assertion is supported by our simulation results for Iran and the US. Finally, we showed how the effective distance can be estimated only with the data of the number of infected ones in the network. We applied this method to the data from the COVID-19 pandemic in Iran and the US, as well as the 2009 H1N1 pandemic. Our analysis confirmed the existence of a linear relationship with the universal slope of one.

Combining all reported observations, our analysis underscores the following practical implications: Given that the speed of disease propagation in the network is directly proportional to travel probability, p, this emphasizes the crucial role of implementing travel restrictions during the early stages of a pandemic. Additionally, our findings highlight the importance of predicting more accurately when and how diseases reach the next node. This insight provides policymakers with a better understanding of the optimal strategies for implementing lockdowns or travel restrictions, thereby effectively mitigating the spread of infectious diseases.

Our work presents several theoretical implications and prospects for the research community. Firstly, unlike similar studies^13,62,66, our definition of effective distance in this paper is directly derived from the mathematical model that describes the phenomenon, rather than relying solely on intuition or data analysis. Additionally, our analysis reveals that effective distance exhibits a universal geometric pattern, contributing to a deeper understanding of epidemic dynamics across different contexts. Secondly, our study addresses fundamental questions such as where and when pandemics begin within a coherent mathematical framework, shedding light on essential aspects of disease spread. However, our method has limitations stemming from the simplifying assumptions we made. Firstly, we utilized the SIR model for meta-population networks, which limited the scope of our study to epidemics describable by the SIR model in their early stages. However, the study can be extended by incorporating more complex epidemiological models. see S.M. Section 3 as an example. Secondly, we treated certain parameters as fixed, which may not always hold true. For instance, we assumed that the number of susceptible individuals remains constant and equal to the node’s population at the early stage of the dynamic. Additionally, we supposed that $\gamma$ is constant across nodes and that the flow matrix remains fixed over time. While these assumptions are reasonable in many cases, they may not accurately reflect reality in all scenarios. Furthermore, as demonstrated, systematic errors can arise from ignoring higher-order terms of the Taylor expansion (Eq. 4). Therefore, our algorithms and results can be enhanced by avoiding mathematical simplifications and improving data quality. Finally, we did not incorporate the effects of importation and control parameters in our model, nor did we consider various types of noise in both data sources-mobility and the number of infected cases-which could potentially influence the robustness of the results. Each of these aspects, as well as further questions such as the detection of the Big Bang for syndemic scenarios^94,95, warrants further investigation in future studies.

Data availability

The datasets used in the current study are openly available; see the Supplementary Material, Section 4.

References

Madhav, N., Oppenheim, B., Gallivan, M., Mulembakani, P., Rubin, E. & Wolfe, N. Pandemics: Risks, impacts, and mitigation. In Disease Control Priorities: Improving Health and Reducing Poverty. 3rd edition (2017).
Huremović, D. Brief history of pandemics (pandemics throughout history). In Psychiatry of pandemics 7–35. (Springer, 2019).
World Health Organization et al. The Global Burden of Disease: 2004 Update. (World Health Organization, 2008).
Fan, W. et al. A new coronavirus associated with human respiratory disease in China. Nature 579(7798), 265–269 (2020).
Article ADS MATH Google Scholar
Msemburi, W. et al. The who estimates of excess mortality associated with the covid-19 pandemic. Nature 613(7942), 130–137 (2023).
Article ADS CAS PubMed MATH Google Scholar
Wang, H. et al. Estimating excess mortality due to the covid-19 pandemic: A systematic analysis of covid-19-related mortality, 2020–21. Lancet 399(10334), 1513–1536 (2022).
Article CAS Google Scholar
Pfefferbaum, B. & North, C. S. Mental health and the covid-19 pandemic. N. Engl. J. Med. 383(6), 510–512 (2020).
Article CAS PubMed MATH Google Scholar
Galea, S., Merchant, R. M. & Lurie, N. The mental health consequences of covid-19 and physical distancing: the need for prevention and early intervention. JAMA Intern. Med. 180(6), 817–818 (2020).
Article CAS PubMed Google Scholar
Talevi, D. et al. Mental health outcomes of the covid-19 pandemic. Riv. Psichiatr. 55(3), 137–144 (2020).
PubMed Google Scholar
Ashraf, B. N. Economic impact of government interventions during the covid-19 pandemic: International evidence from financial markets. J. Behav. Exp. Financ. 27, 100371 (2020).
Article MATH Google Scholar
Dalziel, B. D., Pourbohloul, B. & Ellner, S. P. Human mobility patterns predict divergent epidemic dynamics among cities. Proc. Roy. Soc. B Biol. Sci. 280(1766), 20130763 (2013).
Google Scholar
Belik, V., Geisel, T. & Brockmann, D. Natural human mobility patterns and spatial spread of infectious diseases. Phys. Rev. X 1(1), 011001 (2011).
MATH Google Scholar
Brockmann, D. & Helbing, D. The hidden geometry of complex, network-driven contagion phenomena. Science 342(6164), 1337–1342 (2013).
Article ADS CAS PubMed MATH Google Scholar
Balcan, D. et al. Multiscale mobility networks and the spatial spreading of infectious diseases. Proc. Natl. Acad. Sci. 106(51), 21484–21489 (2009).
Article ADS CAS PubMed PubMed Central MATH Google Scholar
Barbosa, H. et al. Human mobility: Models and applications. Phys. Rep. 734, 1–74 (2018).
Article ADS MathSciNet MATH Google Scholar
Anderson, R. M. & May, R. M. Infectious Diseases of Humans. Dynamics and Control (Oxford University Press, 1991).
Book MATH Google Scholar
Antràs, P., Redding, S. J., & Rossi-Hansberg, E. Globalization and pandemics. Technical report, National Bureau of Economic Research (2020).
Vespignani, A. Multiscale mobility networks and the large scale spreading of infectious diseases. APS March Meeting Abstracts 2010, A4-002 (2010).
MATH Google Scholar
Gómez-Gardeñes, J., Soriano-Panos, D. & Arenas, A. Critical regimes driven by recurrent mobility patterns of reaction–diffusion processes in networks. Nat. Phys. 14(4), 391–395 (2018).
Article MATH Google Scholar
Barabási, A.-L., Albert, R. & Jeong, H. Mean-field theory for scale-free random networks. Physica A 272(1–2), 173–187 (1999).
Article ADS MATH Google Scholar
Adiga, A., Venkatramanan, S., Schlitt, J., Peddireddy, A., Dickerman, A., Bura, A., Warren, A., Klahn, B. D., Mao, C, Xie, D. et al. Evaluating the impact of international airline suspensions on the early global spread of covid-19. medRxiv, (2020).
Colizza, V., Barrat, A., Barthélemy, M. & Vespignani, A. The role of the airline transportation network in the prediction and predictability of global epidemics. Proc. Natl. Acad. Sci. 103(7), 2015–2020 (2006).
Article ADS CAS PubMed PubMed Central MATH Google Scholar
Lawyer, G. Measuring the potential of individual airports for pandemic spread over the world airline network. BMC Infect. Dis. 16(1), 1–10 (2015).
Article Google Scholar
Anderson, R. M., Heesterbeek, H., Klinkenberg, D. & Hollingsworth, T. D. How will country-based mitigation measures influence the course of the covid-19 epidemic?. Lancet 395(10228), 931–934 (2020).
Article CAS PubMed PubMed Central Google Scholar
Meidan, D. et al. Alternating quarantine for sustainable epidemic mitigation. Nat. Commun. 12(1), 1–12 (2021).
Article ADS Google Scholar
Luo, X.-F. et al. Nonpharmaceutical interventions contribute to the control of covid-19 in china based on a pairwise model. Infect. Dis. Model. 6, 643–663 (2021).
PubMed PubMed Central MATH Google Scholar
Arino, J. Describing, modelling and forecasting the spatial and temporal spread of covid-19: A short review. In Mathematics of Public Health: Proceedings of the Seminar on the Mathematical Modelling of COVID-19 25–51. (Springer, 2021).
Liu, Y., Eggo, R. M. & Kucharski, A. J. Secondary attack rate and superspreading events for sars-cov-2. Lancet 395(10227), e47 (2020).
Article CAS PubMed PubMed Central MATH Google Scholar
Arenas, A. et al. Modeling the spatiotemporal epidemic spreading of covid-19 and the impact of mobility and social distancing interventions. Phys. Rev. X 10(4), 041055 (2020).
CAS MATH Google Scholar
Arino, J. Spatio-temporal spread of infectious pathogens of humans. Infect. Dis. Model. 2(2), 218–228 (2017).
PubMed PubMed Central MATH Google Scholar
Bichara, D. & Iggidr, A. Multi-patch and multi-group epidemic models: A new framework. J. Math. Biol. 77(1), 107–134 (2018).
Article MathSciNet PubMed MATH Google Scholar
Pastor-Satorras, R., Castellano, C., Van Mieghem, P. & Vespignani, A. Epidemic processes in complex networks. Rev. Mod. Phys. 87(3), 925 (2015).
Article ADS MathSciNet MATH Google Scholar
Mollison, D. Spatial contact models for ecological and epidemic spread. J. Roy. Stat. Soc. Ser. B (Methodol.) 39(3), 283–313 (1977).
Article MathSciNet MATH Google Scholar
Colizza, V. & Vespignani, A. Epidemic modeling in metapopulation systems with heterogeneous coupling pattern: Theory and simulations. J. Theor. Biol. 251(3), 450–467 (2008).
Article ADS MathSciNet PubMed MATH Google Scholar
de Souza, D. B. et al. Fock-space approach to stochastic susceptible-infected-recovered models. Phys. Rev. E 106(1), 014136 (2022).
Article ADS MathSciNet PubMed Google Scholar
Robertson, D. A. Spatial transmission models: A taxonomy and framework. Risk Anal. 39(1), 225–243 (2019).
Article MathSciNet PubMed MATH Google Scholar
Mollison, D. et al. Spatial epidemic models: Theory and simulations. Popul. Dyn. Rabies Wildlife 8, 291–309 (1985).
MATH Google Scholar
Rass, L, Lifshits, M. A., & Radcliffe, J. Spatial Deterministic Epidemics. (American Mathematical Soc., 2003).
Grenfell, B. T., Bjørnstad, O. N. & Kappey, J. Travelling waves and spatial hierarchies in measles epidemics. Nature 414(6865), 716–723 (2001).
Article ADS CAS PubMed MATH Google Scholar
Merler, S. & Ajelli, M. The role of population heterogeneity and human mobility in the spread of pandemic influenza. Proc. Roy. Soc. B Biol. Sci. 277(1681), 557–565 (2010).
MATH Google Scholar
Balcan, D. et al. Seasonal transmission potential and activity peaks of the new influenza a (h1n1): A Monte Carlo likelihood analysis based on human mobility. BMC Med. 7(1), 1–12 (2009).
Article Google Scholar
Eubank, S. et al. Modelling disease outbreaks in realistic urban social networks. Nature 429(6988), 180–184 (2004).
Article ADS CAS PubMed MATH Google Scholar
Fumanelli, L., Ajelli, M., Manfredi, P., Vespignani, A. & Merler, S. Inferring the structure of social contacts from demographic data in the analysis of infectious diseases spread. PLoS Comput. Biol. 8(9), e1002673 (2012).
Article ADS MathSciNet CAS PubMed PubMed Central MATH Google Scholar
Keeling, M. J. & Eames, K. T. D. Networks and epidemic models. J. R. Soc. Interface 2(4), 295–307 (2005).
Article PubMed PubMed Central MATH Google Scholar
Paré, P. E., Beck, C. L. & Başar, T. Modeling, estimation, and analysis of epidemics over networks: An overview. Annu. Rev. Control. 50, 345–360 (2020).
Article MathSciNet MATH Google Scholar
Luo, X.-F., Jin, Z., He, D. & Li, L. The impact of contact patterns of sexual networks on zika virus spread: A case study in costa rica. Appl. Math. Comput. 393, 125765 (2021).
MathSciNet MATH Google Scholar
Bajardi, P. et al. Human mobility networks, travel restrictions, and the global spread of 2009 h1n1 pandemic. PLoS ONE 6(1), e16591 (2011).
Article ADS CAS PubMed PubMed Central MATH Google Scholar
Viboud, C. et al. Synchrony, waves, and spatial hierarchies in the spread of influenza. Science 312(5772), 447–451 (2006).
Article ADS CAS PubMed Google Scholar
Balcan, D. & Vespignani, A. Invasion threshold in structured populations with recurrent mobility patterns. J. Theor. Biol. 293, 87–100 (2012).
Article ADS MathSciNet PubMed MATH Google Scholar
Colizza, V. & Vespignani, A. Invasion threshold in heterogeneous metapopulation networks. Phys. Rev. Lett. 99(14), 148701 (2007).
Article ADS PubMed MATH Google Scholar
Arino, J. & Portet, S. Epidemiological implications of mobility between a large urban centre and smaller satellite cities. J. Math. Biol. 71(5), 1243–1265 (2015).
Article MathSciNet PubMed PubMed Central MATH Google Scholar
Arino, J. & Van den Driessche, P. A multi-city epidemic model. Math. Popul. Stud. 10(3), 175–193 (2003).
Article MathSciNet MATH Google Scholar
Arino, J. et al. A multi-species epidemic model with spatial dynamics. Math. Med. Biol. 22(2), 129–142 (2005).
Article PubMed MATH Google Scholar
Bock, W. & Jayathunga, Y. Optimal control of a multi-patch dengue model under the influence of Wolbachia bacterium. Math. Biosci. 315, 108219 (2019).
Article MathSciNet PubMed MATH Google Scholar
Gaythorpe, K. & Adams, B. Disease and disaster: Optimal deployment of epidemic control facilities in a spatially heterogeneous population with changing behaviour. J. Theor. Biol. 397, 169–178 (2016).
Article ADS PubMed MATH Google Scholar
Glass, K. & Barnes, B. Eliminating infectious diseases of livestock: A metapopulation model of infection control. Theor. Popul. Biol. 85, 63–72 (2013).
Article PubMed MATH Google Scholar
Harvim, P., Zhang, H., Georgescu, P. & Zhang, L. Transmission dynamics and control mechanisms of vector-borne diseases with active and passive movements between urban and satellite cities. Bull. Math. Biol. 81(11), 4518–4563 (2019).
Article MathSciNet PubMed MATH Google Scholar
Kim, J. E., Lee, H., Lee, C. H. & Lee, S. Assessment of optimal strategies in a two-patch dengue transmission model with seasonality. PLoS ONE 12(3), e0173673 (2017).
Article PubMed PubMed Central MATH Google Scholar
Lee, S. & Castillo-Chavez, C. The role of residence times in two-patch dengue transmission dynamics and optimal strategies. J. Theor. Biol. 374, 152–164 (2015).
Article MathSciNet PubMed MATH Google Scholar
Matthews, L. et al. Neighbourhood control policies and the spread of infectious diseases. Proc. R. Soc. Lond. B 270(1525), 1659–1666 (2003).
Article CAS MATH Google Scholar
Arino, J., Portet, S., Bajeux, N., & Ciupeanu, A. S. Investigation of global and local covid-19 importation risks. Report to the Public Health Risk Science division of the Public Health Agency of Canada, (2020).
Gautreau, A., Barrat, A. & Barthélemy, M. Arrival time statistics in global disease spread. J. Stat. Mech. Theory Exp. 2007(09), L09001 (2007).
Article MATH Google Scholar
Iannelli, F., Koher, A., Brockmann, D., Hövel, P. & Sokolov, I. M. Effective distances for epidemics spreading on complex networks. Phys. Rev. E 95(1), 012313 (2017).
Article ADS MathSciNet PubMed PubMed Central MATH Google Scholar
Zhong, L., Diagne, M., Wang, W. & Gao, J. Country distancing reveals the effectiveness of travel restrictions during covid-19. medRxiv, (2020).
Aleta, A. & Moreno, Y. Evaluation of the potential incidence of covid-19 and effectiveness of containment measures in Spain: A data-driven approach. BMC Med. 18, 1–12 (2020).
Article MATH Google Scholar
Gautreau, A., Barrat, A. & Barthelemy, M. Global disease spread: Statistics and estimation of arrival times. J. Theor. Biol. 251(3), 509–522 (2008).
Article ADS MathSciNet PubMed MATH Google Scholar
Tang, W., Ji, F. & Tay, W. P. Estimating infection sources in networks using partial timestamps. IEEE Trans. Inf. Forensics Secur. 13(12), 3035–3049 (2018).
Article MATH Google Scholar
Li, X., Wang, X., Zhao, C., Zhang, X. & Yi, D. Locating the epidemic source in complex networks with sparse observers. Appl. Sci. 9(18), 3644 (2019).
Article MATH Google Scholar
Antulov-Fantulin, N., Lančić, A., Šmuc, T., Štefančić, H. & Šikić, M. Identification of patient zero in static and temporal networks: Robustness and limitations. Phys. Rev. Lett. 114(24), 248701 (2015).
Article ADS PubMed MATH Google Scholar
Wang, H.-J. & Sun, K.-J. Locating source of heterogeneous propagation model by universal algorithm. EPL (Europhysics Letters) 131(4), 48001 (2020).
Article ADS CAS MATH Google Scholar
Lokhov, A. Y., Mézard, M., Ohta, H. & Zdeborová, L. Inferring the origin of an epidemic with a dynamic message-passing algorithm. Phys. Rev. E 90(1), 012801 (2014).
Article ADS MATH Google Scholar
Aditya Prakash, B., Vreeken, J. & Faloutsos, C. Efficiently spotting the starting points of an epidemic in a large graph. Knowl. Inf. Syst. 38(1), 35–59 (2014).
Article MATH Google Scholar
Choi, J. Epidemic source detection over dynamic networks. Electronics 9(6), 1018 (2020).
Article MATH Google Scholar
Louni, A. & Subbalakshmi, K. P. A two-stage algorithm to estimate the source of information diffusion in social media networks. In 2014 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) 329–333. (IEEE, 2014).
Lind, P. G., Da Silva, L. R., Andrade Jr, J. S. & Herrmann, H. J. Spreading gossip in social networks. Phys. Rev. E 76(3), 036117 (2007).
Article ADS MATH Google Scholar
Altarelli, F., Braunstein, A., Dall’Asta, L., Lage-Castellanos, A. & Zecchina, R. Bayesian inference of epidemics on networks via belief propagation. Phys. Rev. Lett. 112(11), 118701 (2014).
Article ADS PubMed MATH Google Scholar
Horn, A. L. & Friedrich, H. Locating the source of large-scale outbreaks of foodborne disease. J. R. Soc. Interface 16(151), 20180624 (2019).
Article PubMed PubMed Central MATH Google Scholar
Horn, A. L., & Friedrich, H. The network source location problem in the context of foodborne disease outbreaks. In Dynamics on and of Complex Networks 151–165. (Springer, 2017).
Schlaich, T., Horn, A. L., Fuhrmann, M. & Friedrich, H. A gravity-based food flow model to identify the source of foodborne disease outbreaks. Int. J. Environ. Res. Public Health 17(2), 444 (2020).
Article PubMed PubMed Central MATH Google Scholar
Ji, F., Tay, W. P. & Varshney, L. R. An algorithmic framework for estimating rumor sources with different start times. IEEE Trans. Signal Process. 65(10), 2517–2530 (2017).
Article ADS MathSciNet MATH Google Scholar
Shelke, S. & Attar, V. Source detection of rumor in social network—A review. Online Social Networks Media 9, 30–42 (2019).
Article MATH Google Scholar
Seo, E., Mohapatra, P., & Abdelzaher, T. Identifying rumors and their sources in social networks. In Ground/Air Multisensor Interoperability, Integration, and Networking for Persistent ISR III, volume 8389, page 83891I. (International Society for Optics and Photonics, 2012).
Wang, Z., Dong, W., Zhang, W. & Tan, C. W. Rumor source detection with multiple observations: Fundamental limits and algorithms. ACM SIGMETRICS Perform. Evaluat. Rev. 42(1), 1–13 (2014).
Article MATH Google Scholar
Jiang, J., Wen, S., Yu, S., Xiang, Y. & Zhou, W. Rumor source identification in social networks with time-varying topology. IEEE Trans. Depend. Secure Comput. 15(1), 166–179 (2016).
Article MATH Google Scholar
Shah, D. & Zaman, T. Rumors in a network: Who’s the culprit?. IEEE Trans. Inf. Theory 57(8), 5163–5181 (2011).
Article MathSciNet MATH Google Scholar
Hu, Z.-L. et al. Locating multiple diffusion sources in time varying networks from sparse observations. Sci. Rep. 8(1), 1–9 (2018).
ADS MATH Google Scholar
Hu, Z.-L., Shen, Z., Tang, C.-B., Xie, B.-B. & Jian-Feng, L. Localization of diffusion sources in complex networks with sparse observations. Phys. Lett. A 382(14), 931–937 (2018).
Article ADS MathSciNet CAS MATH Google Scholar
Hu, Z. L., Wang, L. & Tang, C. B. Locating the source node of diffusion process in cyber-physical networks via minimum observers. Chaos Interdiscip. J. Nonlinear Sci. 29(6), 063117 (2019).
Article CAS MATH Google Scholar
Shen, Z., Cao, S., Wang, W.-X., Di, Z. & Stanley, H. E. Locating the source of diffusion in complex networks by time-reversal backward spreading. Phys. Rev. E 93(3), 032301 (2016).
Article ADS MathSciNet PubMed MATH Google Scholar
Paluch, R., Lu, X., Suchecki, K., Szymański, B. K. & Hołyst, J. A. Fast and accurate detection of spread source in large complex networks. Sci. Rep. 8(1), 1–10 (2018).
Article CAS MATH Google Scholar
Comin, C. H. & da Fontoura Costa, L. Identifying the starting point of a spreading process in complex networks. Phys. Rev. E 84(5), 056105 (2011).
Article ADS MATH Google Scholar
Wikipedia. Div (mythology) - wikipedia, Accessed 2024. [Online; accessed 13-May-2024].
Encyclopædia Iranica. Astwihad, Accessed 2024. [Online; accessed 13-May-2024].
Cai, W., Chen, L., Ghanbarnejad, F. & Grassberger, P. Avalanche outbreaks emerging in cooperative contagions. Nat. Phys. 11(11), 936–940 (2015).
Article CAS MATH Google Scholar
Chen, L., Ghanbarnejad, F. & Brockmann, D. Fundamental properties of cooperative contagion processes. New J. Phys. 19(10), 103041 (2017).
Article ADS MATH Google Scholar

Download references

Acknowledgements

The authors would like to acknowledge Yamir Moreno for his valuable comments, Hossein Afshin for providing Iran’s passenger flow data, and Yasaman Asgari for collecting and analyzing some data and creating some visualizations.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Department of Applied Mathematics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Yazdan Babazadeh Maghsoodlo
Department of Physics, Simon Fraser University, Burnaby, Canada
Amin Safaeesirat
Potsdam Institute for Climate Impact Research (PIK), Member of the Leibniz Association, P.O. Box 601203, 14412, Potsdam, Germany
Fakhteh Ghanbarnejad
School of Technology and Architecture, SRH University of Applied Sciences Heidelberg, Campus Leipzig, Prager Str. 40, 04317, Leipzig, Germany
Fakhteh Ghanbarnejad

Authors

Yazdan Babazadeh Maghsoodlo
View author publications
Search author on:PubMed Google Scholar
Amin Safaeesirat
View author publications
Search author on:PubMed Google Scholar
Fakhteh Ghanbarnejad
View author publications
Search author on:PubMed Google Scholar

Contributions

F. G. conceived and designed the research and led the team. Y. B. established the mathematical framework, which was improved through further discussions with A. S., and F. G.. Y. B. and A. S. carried out numerical experiments and performed simulations. Y. B., A. S., and F. G. contributed to analyzing and discussing intermediate and final results, and writing the paper.

Corresponding author

Correspondence to Fakhteh Ghanbarnejad.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Babazadeh Maghsoodlo, Y., Safaeesirat, A. & Ghanbarnejad, F. The Big Bang of an epidemic: a metapopulation approach to identify the spatiotemporal origin of contagious diseases and their universal spreading pattern. Sci Rep 15, 5809 (2025). https://doi.org/10.1038/s41598-025-85232-7

Download citation

Received: 15 July 2024
Accepted: 01 January 2025
Published: 17 February 2025
Version of record: 17 February 2025
DOI: https://doi.org/10.1038/s41598-025-85232-7