Introduction

Dengue fever is an infectious disease caused by a vector-borne virus (serotypes DENV-1 to DENV-4) transmitted by mosquitoes of the genus Aedes1. It is a global health issue, with an estimated 100 million cases annually across the globe. Dengue cases have increased 30-fold over the past 50 years, with a 50% increase in the past decade alone, and a growing number of countries are reporting outbreaks2,3,4,5. In 2020 alone, over 2.5 million cases of dengue were reported worldwide with 1,018 (0.04%) deaths6. The World Health Organization (WHO) has declared dengue fever endemic in over 100 countries, most of which are in tropical and subtropical regions of the world, including Southeast Asia, South America, and parts of Africa. The dengue threat is highest in Asia and the Americas, with Southeast Asia and the Western Pacific region accounting for over 70% of the global dengue statistics. In Sri Lanka, the dengue threat is highest in the Western Province, which reports frequent outbreaks7. The most recent outbreaks were reported in 2017 and 2019. According to the Epidemiology Unit, Ministry of Health, Sri Lanka, over 51,000 cases with over 90 deaths (0.17%) were reported during the 2017 outbreak, and 186,101 cases with over 440 deaths (0.23%) were reported during the 2019 outbreak. In both of these instances, the Sri Lankan health system collapsed due to unpreparedness and a lack of optimal resource allocation. The Sri Lankan government has implemented various health and social measures to prevent and control the spread of the disease, including public awareness campaigns, clean-up programs to remove mosquito breeding sites, insecticide fumigation, and increased use of mosquito repellents7,8. However, like other tropical countries, Sri Lanka has failed to eradicate or successfully control the emergence of the disease. 
Given the heightened risk of infectious diseases in the future as a consequence of climate change, it is essential to anticipate forthcoming dengue threats. Doing so allows adequate time for bolstering preparedness and strengthening the healthcare system.

The infected human and vector population densities are intricately linked to the intensity of virus transmission9. The Aedes vectors thrive in warm and humid environments commonly found in tropical and subtropical regions. Aedes mosquitoes pass through four life-cycle stages: egg, larva, pupa, and adult. They lay their eggs in stagnant clean water such as uncovered containers, discarded tires, and other water-filled areas. With an abundant supply of breeding sites and favorable climatic conditions, the vector population can grow rapidly. A high vector population density amplifies the likelihood of dengue transmission for two reasons: (1) a larger vector population means a greater chance of female vectors being infected with the dengue virus, and (2) more mosquitoes increase the probability of contact between infected vectors and susceptible individuals, enabling the virus to be transmitted more rapidly10. Therefore, the vector population significantly impacts dengue transmission within human populations. Environmental conditions directly affect the mosquito population density, and dengue transmission is therefore highly sensitive to them7,11,12,13,14,15,19. Warm and humid environments, typically with temperatures of 26−30 °C (approximately 79−86 °F) and humidity levels above 60%, are highly favorable for Aedes mosquitoes and lead to rapid transmission16,17,18. Sri Lanka experiences an average temperature between 26−30 °C, with coastal regions being slightly warmer than inland areas. The humidity level remains consistently above 70% throughout the year, reaching nearly 90% during the monsoon season in June. Monsoon rainfalls in Sri Lanka directly impact the dengue emergence patterns, as these rains create ideal breeding conditions for Aedes mosquitoes19 given the favorability of the other environmental conditions20,21,22. 
Specifically, Sri Lanka falls within the highest-risk category for epidemic potential under projected climate change11. Apart from favorable and consistent temperature and humidity conditions throughout the year, rainfall from the two monsoons experienced by Sri Lanka adds a further level of favorability, generating a high risk of dengue23,24,25. In addition to the influence of rainfall, the increased density of hosts and vectors, along with their mobility, contributes to the heightened transmission rate.

Given the severity of the disease, timely forecasting of its future risk is critically important to alleviate the stress on the Sri Lankan health system. Moreover, control measures such as early detection of rapid development, diagnostics, and surveillance are vital for better handling the dengue threat26. Because external factors influence vector populations and disease transmission, disease control strategies may not be universally applicable. Tailoring approaches to specific geographic locations and timeframes will be crucial for achieving effective and efficient outcomes19. Numerous studies have modeled and validated the intricate relationships between weather patterns and the emergence of dengue, employing various methodologies. In particular, models connecting weather and dengue in Sri Lanka concentrate on establishing the correlation between these variables, followed by regression or time-series models27,28,29,30,31,32. Additionally, the intricate dynamics of disease transmission between vectors and humans are extensively explored through compartmental models. These models describe the vector-host relationship with ordinary differential equations that characterize the population dynamics of both entities33,34,35,36,37. Typically, the human population is compartmentalized into three categories: susceptible (\(S_h\)), infected (\(I_h\)), and recovered (\(R_h\)), while the vector population is often divided into susceptible (\(S_v\)) and infected (\(I_v\)). In these models, the disparity between vector and human population dynamics over time is notable. While human populations may change relatively gradually, vector populations can fluctuate drastically within short time intervals because of environmental conditions combined with their short life span. 
However, these models assume the infection dynamics progress more rapidly for humans than for the vector, which is achieved by approximating the system of equations at the quasi-equilibrium of the vectors (i.e., \(\dot{I_v}=0\)). While this simplification results in a more straightforward model, it overlooks the potential impact, if any, of changes in the vector population on the infection dynamics of humans38,39. Consequently, vector population variations in response to external factors, particularly weather, are often overlooked. This oversight can significantly limit the estimation and simulation processes, as the models fail to capture the comprehensive picture of the epidemic. Neglecting the variations in vector populations with respect to weather data leaves a gap in understanding the full impact of environmental factors on disease transmission dynamics. As a result, the insights gained from these models may not accurately reflect the complexities inherent in the epidemic. To address the existing gap in estimating dengue risk based on fluctuating vector populations in the Colombo district, our study aims to establish a quantitative relationship between vector density and a key environmental factor specific to the study region. Specifically, we do not assume a constant vector population in the ordinary differential equation system. Instead, we introduce an indirect optimization approach to estimate vector density in the Colombo district. This method provides a more targeted and accurate estimation of vector populations, leading to improved predictive capabilities.

While several experimental studies have investigated vector populations across different regions of Sri Lanka, the findings from these studies are not universally applicable due to the varying environmental conditions, socio-economic factors, and unique characteristics of each region. As a result, these experimental data, although valuable, are highly case-sensitive and cannot be generalized across the country. Moreover, conducting such experimental studies is often costly and time-consuming, limiting their feasibility on a broader scale. In contrast, our study develops a method that estimates vector densities indirectly by relating them to rainfall, offering a more efficient alternative. Accordingly, our study aims to close these gaps related to vector population by establishing a measurable link between weather patterns and the dynamics of vector populations, thereby improving our comprehension of the factors influencing the emergence of dengue. As outlined in literature models for Sri Lanka, rainfall significantly influences the occurrence of dengue cases. When examining the impacts of minimum temperature, maximum temperature, humidity, and rainfall on dengue, principal component regression analysis has demonstrated that rainfall accounts for over 95% of the influence on dengue cases in the Colombo district. In existing literature, environmental variables are typically used directly to estimate vector populations through regression-type functions. We instead introduce a novel approach that uses optimization via Markov chain Monte Carlo (MCMC) to infer vector populations by optimizing the impact of rainfall, leveraging existing data on human infection incidences. In particular, we developed a model based on a probabilistic and dynamical approach that connects rainfall to the per-capita vector density in an SIR-SI (Susceptible, Infected, Recovered for human - Susceptible, Infected for vector) model of differential equations. 
Moreover, given the compelling evidence of rainfall’s impact on dengue, estimations are performed within identified seasons to align with the monsoon-induced rainfall patterns. Therefore, the estimation is demonstrated in an iterative process for each season in a feedback loop. This estimation algorithm based on Bayesian methods enabled uncertainty quantification using credible intervals for each season. Conducted in the hotspot of dengue emergence, which reports the highest number of dengue cases in Sri Lanka, this study provides critical insights for health authorities to enhance prevention and mitigation strategies.

Method and analysis

The densely populated Colombo district consistently accounts for the highest number of reported dengue cases in the country. Within this district, the Colombo Municipal Council (CMC) area has the highest population density, with 24,857 residents per \(\text{km}^2\), making it the most densely populated region in the Colombo district40,41. The monsoons, namely the southwest monsoon from May to September with an average total rainfall of 2500−4000 mm and the northeast monsoon from December to February with an average total rainfall of 1000−2000 mm, are among the major sources of rainfall in Sri Lanka20,42. The Colombo district, situated in an agricultural wet zone, benefits more from the southwest monsoon than from the northeast monsoon, primarily because of its geographical position (see Fig. 1(b)). Dengue incidence in Sri Lanka is found to be governed by the seasonal rainfall variations experienced throughout the year (see Fig. 1). Based on these rainfall variations, dengue emergence seasons can also be identified by calculating the periodicities of the reported dengue incidences. These periodicities of dengue and rainfall are quantified with the Fast Fourier Transform (FFT). Fig. 2, which depicts the FFT amplitudes, indicates that both rainfall and dengue peak at 6-month (26-week) cycles (see Supplementary Material for numerical results). Given these 6-month periodicities, a year is divided into 4 segments, which are referred to as seasons throughout the paper. The time span of each season was identified from the increasing and decreasing trends in the reported rainfall, and a total of \(s=54\) rainfall seasons were identified between January 2009 and December 2022. Consequently, the number of weeks in each season may differ, resulting in distinct durations. Moreover, the impact of rainfall on dengue emergence is studied through wavelet analysis25,46,47. 
In particular, a 10-week time lag has been identified between rainfall and dengue emergence in the CMC area.
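The periodicity analysis described above can be illustrated with a minimal sketch. It uses a direct discrete Fourier transform in plain Python rather than an FFT library (the amplitudes are the same, only the computation is slower), and it runs on a synthetic weekly series, not on the NDCU or NASA data. The function name `dominant_period` and the synthetic signal are assumptions made for illustration only.

```python
import cmath
import math


def dominant_period(series):
    """Return (period_in_samples, amplitude) of the strongest non-zero
    frequency in a real-valued, evenly sampled series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the zero-frequency term

    best_k, best_amp = 1, -1.0
    # Only frequencies 1..n//2 are distinct for a real-valued signal.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        amp = abs(coeff)
        if amp > best_amp:
            best_k, best_amp = k, amp
    return n / best_k, best_amp


# Synthetic weekly series (hypothetical): four years containing a strong
# 26-week cycle and a weaker 52-week cycle.
synthetic = [
    10 + 5 * math.sin(2 * math.pi * t / 26) + 2 * math.sin(2 * math.pi * t / 52)
    for t in range(208)
]
period, amplitude = dominant_period(synthetic)
print(f"dominant period: {period:.1f} weeks")
```

Applied to the weekly dengue and rainfall series, the same calculation would be expected to return the 26-week peak reported in Fig. 2, although the amplitude values depend on the normalization convention of the transform.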

Fig. 1

Variations of dengue infections and rainfall. Weekly variation of (a) the dengue infected human population (confirmed dengue patient count obtained from the National Dengue Control Unit (NDCU), Ministry of Health, Sri Lanka) and (b) rainfall in the CMC area from 2009−2022 (data obtained from the NASA POWER Data Access Viewer48).

Fig. 2

Periodicity of dengue and rainfall variations. Amplitudes are highest at a period of 26 weeks for both (a) dengue (amplitude 8133.34) and (b) rainfall (amplitude 33192.65).

This study uses an SIR-SI compartmental model in which the human population has three epidemiological states, susceptible (\(S_h\)), infected (\(I_h\)), and recovered (\(R_h\)), and the vector population has two such states, susceptible (\(S_v\)) and infected (\(I_v\)) (see Fig. 3). Vectors have no recovered state since they do not survive long enough to recover from the infection. It is assumed that the human and vector compartments each form a closed system, i.e. the populations are constant (\(N_h={S}_h+{I}_h+{R}_h\) and \(N_v={S}_v+{I}_v\)). One limitation of this study is the lack of data on the distributions of Aedes species and virus serotypes. As a result, to set up our model we assumed a homogeneous distribution for both serotype and species. The system of ordinary differential equations (ODEs) for the five state variables is given in Supplementary Material. This five-state model is reduced to three state variables, \({I}_h\), \({R}_h\), and \({I}_v\) (3-D), using the assumption of closed populations (see Appendix 1 in Supplementary Material). Further, this 3-D population model is converted to a 3-D density model by normalizing the state variables, i.e. \({I}={{I}_h}/{N_h}\), \({R}={{R}_h}/N_h\), \({V}={{{I}_v}}/{N_v}\). The reduced ODE system in I, R, and V is given in model (1). The definitions of the parameters of model (1) are given in Table 1.

Fig. 3

Schematic diagram of transmission of the dengue virus between human (host) and mosquito (vector) populations. \(S_h\), \(I_h\), and \(R_h\) are the susceptible, infected, and recovered host populations respectively. \(S_v\) and \(I_v\) are the susceptible and infected vector populations respectively.

Table 1 Variables and parameter descriptions of model (1).
$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{d{I}}{dt}& =\beta _h {z} {V}(1-{I}-{R)}-(\mu _h+\gamma _h){I},\\ \dfrac{d{R}}{dt}& =\gamma _h {I}-\mu _h {R}, \\ \dfrac{d{V}}{dt}& =\beta _vV{I}-\mu _v {V}. \end{array}\right. } \end{aligned}$$
(1)
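As a minimal sketch of how model (1) can be integrated numerically, the following code implements the three equations exactly as printed and advances them with a classical fourth-order Runge-Kutta scheme, holding the per-capita vector density z constant within each week. The parameter values are those reported later in the paper, with the assumption that all rates are expressed per week (52 weeks per year). The initial densities, the weekly z values, and the function names are hypothetical and serve only to make the example runnable.

```python
def model_rhs(state, z, p):
    """Right-hand side of model (1) for state = (I, R, V)."""
    i_h, r_h, v = state
    d_i = p["beta_h"] * z * v * (1.0 - i_h - r_h) - (p["mu_h"] + p["gamma_h"]) * i_h
    d_r = p["gamma_h"] * i_h - p["mu_h"] * r_h
    d_v = p["beta_v"] * v * i_h - p["mu_v"] * v  # as printed in model (1)
    return (d_i, d_r, d_v)


def rk4_step(state, z, p, h):
    """One classical fourth-order Runge-Kutta step of size h (in weeks)."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))

    k1 = model_rhs(state, z, p)
    k2 = model_rhs(shifted(state, k1, h / 2), z, p)
    k3 = model_rhs(shifted(state, k2, h / 2), z, p)
    k4 = model_rhs(shifted(state, k3, h), z, p)
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(y0, z_weekly, p, substeps=10):
    """Integrate week by week; returns the state at the start and at the
    end of every week, so the result has len(z_weekly) + 1 entries."""
    state, path = tuple(y0), [tuple(y0)]
    for z in z_weekly:
        for _ in range(substeps):
            state = rk4_step(state, z, p, 1.0 / substeps)
        path.append(state)
    return path


params = {
    "beta_h": 0.75,
    "mu_h": 1.0 / (75 * 52),  # 1/(75 years), converted to per week (assumption)
    "beta_v": 0.375,
    "gamma_h": 1.0 / 2,       # 1/(2 weeks)
    "mu_v": 1.0 / 6,          # 1/(6 weeks)
}
y0 = (1e-4, 0.0, 1e-3)        # hypothetical initial densities (I, R, V)
z_weekly = [5.0] * 10         # hypothetical per-capita vector density per week
path = simulate(y0, z_weekly, params)
print(f"I after 10 weeks: {path[-1][0]:.5f}")
```

A fixed-step integrator is used here only to keep the sketch dependency-free; an adaptive ODE solver would be the more usual choice in practice.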

Of the parameters in model (1), the ratio \({N_v}/{N_h}\) is defined as the per-capita vector density (z), which denotes the total number of vectors available per human. The density z depends strongly on the abundance of vector breeding sites after rainfall21,22,25. Hence, we assume the vector population is directly proportional to rainfall (see Eq. (2)), since other environmental variables such as temperature and humidity are favorable to vector breeding throughout the year19,21,22. We presume a linear relationship because there is no evidence that warrants the added complexity of a nonlinear model. Given the rainfall seasonality originating from the monsoons, we establish the relation in Eq. (2) seasonally. From this point onward, bold symbols such as \(\textbf{E}\) represent mathematical vectors (arrays of weekly values) and non-bold symbols represent scalars. These arrays may vary in length for each season (see Supplementary Material).

$$\begin{aligned} \textbf{z}_i=a_i {\textbf{E}}_i , \end{aligned}$$
(2)

where, for the \(i^{th}\) season, \(\textbf{E}_i\) is the reported weekly total rainfall, \(a_i\) is the rainfall coefficient, and \(\textbf{z}_i\) is the per-capita vector density. Simulating model (1) for a season requires estimating \(\textbf{z}\) from rainfall, which in turn requires estimating a. We infer this parameter a for each season in an iterative process. In a Bayesian framework, the estimation procedure yields,

$$\begin{aligned} p\left( \theta |x\right)&=\dfrac{p\left( \theta \right) p\left( x|\theta \right) }{\int p\left( \theta \right) p\left( x|\theta \right) d\theta }, \end{aligned}$$
(3a)
$$\begin{aligned}&=\dfrac{1}{K} p\left( \theta \right) p\left( x|\theta \right) , \end{aligned}$$
(3b)

where \(\theta\) and x denote the parameters and the data, respectively, and K is the normalization constant of the posterior probability distribution. In this problem, the parameters (unknowns to be estimated) and the data (observed information) are identified in Table 2. In the parameter estimation, the observed dengue density data (\(\mathbf {I_{obs}}\)) are compared against the solutions (\(\hat{\textbf{I}}\)) of the ODE system in model (1). Weekly reported dengue cases in the CMC area were obtained from the National Dengue Control Unit (NDCU) of Sri Lanka. We acknowledge that reported dengue incidences may not represent the full spectrum of total infections. However, the consistency and rigor with which these reports are compiled by the NDCU provide a strong foundation for our analysis. The rainfall data required for this study were extracted from the NASA POWER Data Access Viewer48. The remaining parameters in the model (\(\textbf{P}\)) were taken from the literature and assumed constant for every season49,50 (see Table 2); their values are provided in the Supplementary Material. We used literature values because experimental data specific to the Colombo district are not available. Furthermore, we assumed that these values do not change significantly during the study period, which allows us to use fixed values throughout the analysis.

Table 2 Observables and unknown parameters in model setup applicable to a season. The definitions of the terms can be found in Table 1. We assume the values in \(\textbf{P}\) remain constant across all seasons, while all other variables/parameters vary depending on the chosen season. The standard deviation of seasonal dengue data is represented by sd.

Next, we demonstrate the parameter estimation setup for a single season; for brevity, we drop the subscript i, which denotes the season. Applying the parameters and data to Eq. (3) yields,

$$\begin{aligned} p\left( a | \mathbf {I_{obs}},\textbf{E}, \textbf{P},\mathbf {Y_0},\sigma , \mathscr {I} \right)&\propto p\left( a| \textbf{E},\mathscr {I}\right) p\left( \textbf{E}|\sigma ,\mathscr {I}\right) p\left( \mathbf {I_{obs}}|a, \textbf{E}, \textbf{P},\mathbf {Y_0}, s,\sigma , \mathscr {I}\right) , \end{aligned}$$
(4)

where \(\mathscr {I}\) denotes any background information, following the standard Bayesian setup. The error variance \(\sigma\) is sampled separately from the Bayesian setup using an inverse gamma distribution, as explained in the next section. No prior knowledge exists for the rainfall coefficient or the rainfall, and thus we assume uniform priors for \(p\left( a|\textbf{E}, \mathscr {I}\right)\) and \(p\left( \textbf{E}|\sigma ,\mathscr {I}\right)\). Assuming that the errors in the observed data follow a Gaussian distribution, the log-likelihood function is

$$\begin{aligned} \log p\left( \mathbf {I_{obs}} | a,\textbf{E}, \textbf{P},\mathbf {Y_0},s,\mathbf {\sigma } ,\mathscr {I}\right) = \dfrac{n}{2} \log \left( \dfrac{1}{2\pi \sigma ^2}\right) - \dfrac{1}{2\sigma ^2} \Vert {\mathbf {I_{obs}} - \mathbf {\hat{I}}}\Vert ^2, \end{aligned}$$
(5)

where \(\Vert \cdot \Vert ^2\) denotes the squared L2 norm of the errors, n is the number of observations in the season, and \(\hat{\textbf{I}}=\displaystyle \int \dfrac{d\textbf{I}}{dt}dt\). Finally, the log posterior distribution becomes

$$\begin{aligned} \log p\left( a | \mathbf {I_{obs}},\textbf{E}, \textbf{P},\mathbf {Y_0}, \sigma , \mathscr {I}\right) \propto \dfrac{n}{2} \log \left( \dfrac{1}{2\pi \sigma ^2}\right) - \dfrac{1}{2\sigma ^2} \Vert {\mathbf {I_{obs}} - \mathbf {\hat{I}}}\Vert ^2. \end{aligned}$$
(6)
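The Gaussian log-likelihood in Eq. (5), which with uniform priors also gives the log posterior in Eq. (6) up to an additive constant, can be written as a short function. This is a minimal sketch: the function name and the numerical values of the observed and simulated densities are hypothetical and are not taken from the study data.

```python
import math


def gaussian_log_likelihood(i_obs, i_hat, sigma):
    """Eq. (5): log-likelihood of observed densities given model output."""
    if len(i_obs) != len(i_hat):
        raise ValueError("observed and simulated series must have equal length")
    n = len(i_obs)
    sse = sum((o - m) ** 2 for o, m in zip(i_obs, i_hat))  # squared L2 norm
    return 0.5 * n * math.log(1.0 / (2.0 * math.pi * sigma**2)) - sse / (2.0 * sigma**2)


# Hypothetical weekly infected densities and two candidate model outputs.
i_obs = [0.0010, 0.0014, 0.0019, 0.0016]
close_fit = [0.0011, 0.0013, 0.0018, 0.0017]
poor_fit = [0.0030, 0.0030, 0.0030, 0.0030]
sigma = 0.0005

ll_close = gaussian_log_likelihood(i_obs, close_fit, sigma)
ll_poor = gaussian_log_likelihood(i_obs, poor_fit, sigma)
print(ll_close > ll_poor)
```

For a fixed \(\sigma\), the first term is constant, so a larger log-likelihood corresponds directly to a smaller squared error between observed and simulated densities.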

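We then encapsulated this Bayesian posterior simulation process in Algorithm 1, which shows the iterative procedure of parameter estimation for the s seasons. The corresponding optimization problem is \(\max _{a\in \mathbb {R}} \log p\left( a | \mathbf {I_{obs}},\textbf{E}, \textbf{P},\mathbf {Y_0},\sigma , \mathscr {I}\right)\); for a fixed \(\sigma\), the value of a that maximizes the log posterior is the one that minimizes the squared error between \(\mathbf {I_{obs}}\) and \(\mathbf {\hat{I}}\).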

Algorithm 1

Estimating rainfall coefficients, \(a_i,\quad i=1,\cdots ,s.\)
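The per-season sampling step can be illustrated with a plain random-walk Metropolis sampler for a single scalar parameter. This is a deliberately simplified stand-in for the DRAM sampler used in the study (it has no delayed rejection and no proposal adaptation), and the target below is a hypothetical Gaussian-shaped log posterior rather than the ODE-based posterior of Eq. (6). All names and tuning values are assumptions chosen for illustration.

```python
import math
import random


def metropolis(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a single scalar parameter."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    chain = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def toy_log_posterior(a):
    """Hypothetical stand-in for Eq. (6): a Gaussian bump centred at a = 2."""
    return -0.5 * ((a - 2.0) / 0.5) ** 2


chain = metropolis(toy_log_posterior, start=0.0, step=0.5, n_samples=20000)
second_half = chain[len(chain) // 2:]
post_mean = sum(second_half) / len(second_half)
print(f"posterior mean of a: {post_mean:.2f}")
```

In the seasonal setting, `toy_log_posterior` would be replaced by a function that simulates model (1) for a candidate rainfall coefficient and evaluates the log posterior against that season's reported densities.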

The marginal posterior distributions obtained for each seasonal rainfall coefficient enable us to calculate 95% credible intervals. These intervals provide a statistical range for each parameter, reflecting the uncertainty in our dengue incidence estimates. The sensitivity of the rainfall coefficient estimates to the dengue incidence estimates is given by the credible intervals calculated from the uncertainties of the rainfall coefficients and the dengue densities estimated from the ODE model (1). The sensitivity of the parameters allows us to better understand the variability and reliability of our seasonal estimations and predictions. The model can be validated via estimates of dengue incidences during specific seasons, for example outbreak periods or festive seasons.

Results and discussion

The posterior distribution in Eq. (6) was simulated using the Delayed Rejection Adaptive Metropolis (DRAM) algorithm of the MCMC toolbox in MATLAB 2024a51,52. The MCMC algorithm systematically draws sample values of the rainfall coefficient \(a_i\) from a chosen probability distribution, which results in a series of random samples. For each of these sampled values of \(a_i\), the algorithm computes the corresponding output \(\mathbf {\hat{I}}\) and compares it to the observed dengue incidences (\(\mathbf {I_{obs}}\)). We then choose as the optimum parameter value the sampled \(a_i\) that results in the least error between \(\mathbf {\hat{I}}\) and \(\mathbf {I_{obs}}\). This optimum \(a_i\) is then used to obtain the simulated \(\mathbf {\hat{I}}\). In the simulations of model (1), the initial values \(\mathbf {Y_0}\) were chosen to be [\(I_0,0,V_0, 10\)]. Here \(I_0\) is the initial value of the reported dengue infected human population density, and \(V_0\) is calculated assuming the vector population is at its quasi-equilibrium. The values extracted for \(\textbf{P}\) were \(\beta _h= 0.75, \mu _h= \frac{1}{75\hspace{0.1cm} \text {years}}, \beta _v=0.375, \gamma _h = \frac{1}{2\hspace{0.1cm} \text {weeks}}\) and \(\mu _v = \frac{1}{6 \hspace{0.1cm}\text {weeks}}\)49,50. The MCMC toolbox, DRAM, samples the error variance \(\sigma\) from an inverse gamma distribution with parameters (r, q) for each season (via the option updatesigma \(=1\)51),

$$\begin{aligned} p(\sigma |r,q) = \dfrac{q^{-r}}{\Gamma (r)}{\sigma }^{(r-1)}\exp (-{\sigma }/q), \end{aligned}$$
(7)

where \(r=(1+N)/2\) and \(q=2/(sd+SSE)\), with N and SSE denoting the sample size and \(\Vert {\mathbf {I_{obs}} - \mathbf {\hat{I}}}\Vert ^2\), respectively. Starting from the second season, the initial values \(I_0,R_0,V_0\) were taken from the estimated solutions of the preceding season. Thus, the iterative estimations for all 54 seasons were performed by solving the optimization problem in Algorithm 1 using MCMC simulations. For each season, \(N_d=5\times 10^4\) simulations were sufficient to observe the convergence of the samples. From these samples, the initial \(30\%\) were discarded as burn-in, and the marginal probability distributions were obtained for each seasonal parameter. The marginal distributions obtained for seasons 32, 50, 22, and 27 are depicted in Fig. 4. The best point estimate of \(a_i\) is taken as the mean of the marginal distribution,

$$\begin{aligned} E(a_i) = \dfrac{\sum _{j=1}^{N_d}a_i^{(j)}}{N_d}, \quad i=1,\cdots , s. \end{aligned}$$
(8)
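The post-processing of a sampled chain (discarding the initial 30% of samples, taking the mean as the point estimate, and reading off a 95% credible interval) can be sketched as follows. The equal-tailed interval with linearly interpolated quantiles is an assumption, since the paper does not state which interval construction was used, and here the mean is taken over the retained samples only. The function names and the toy chain are hypothetical.

```python
import math


def quantile(ordered, q):
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


def summarize_chain(chain, burn_in_fraction=0.3, level=0.95):
    """Drop the burn-in, then return the mean and an equal-tailed interval."""
    kept = chain[int(len(chain) * burn_in_fraction):]
    ordered = sorted(kept)
    tail = (1.0 - level) / 2.0
    return {
        "mean": sum(kept) / len(kept),
        "lower": quantile(ordered, tail),
        "upper": quantile(ordered, 1.0 - tail),
    }


# Toy chain (hypothetical): the values 0, 1, ..., 99 in order.
toy_chain = [float(v) for v in range(100)]
summary = summarize_chain(toy_chain)
print(summary)
```

The width `summary["upper"] - summary["lower"]` corresponds to the credible-interval width used to compare uncertainties across seasons.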
Fig. 4

Marginal distribution simulations of rainfall coefficients. Histograms of the marginal distributions obtained for 4 seasons are given: (a) 2017 January-April, (b) 2021 April-July, (c) 2009 July-October, (d) 2017 April-July. The red vertical lines indicate the means of the marginal distributions. The black dashed lines indicate the 95% credible intervals of each marginal distribution.

Fig. 5

Dengue infection density estimates of first two seasons in (a) \(s_1\) and (b) \(s_2\). The black circles indicate the reported dengue infected human population density and the red line indicates the estimated line. The fitted curve in (a) is obtained by taking the actual initial values at the beginning of the time period. The parameter value, \(a_1\) obtained for season 1 is 10.79. The fitted curve in (b) is obtained by taking the final values of the fitted curve in (a) as the initial values for season in (b). The parameter value, \(a_2\) obtained for season 2 is 2.13.

Figure 5 shows the solution curves of \(\mathbf {\hat{I}}\) against the observed data (\(\mathbf {I_{obs}}\)) for the first two seasons. The uncertainties of these estimates were obtained by calculating the 95% credible intervals from the simulated marginal distributions of the rainfall coefficients (\(a_i\)) (see Fig. 4). The estimates of the rainfall coefficient (\(a_i\)) for all seasons are provided in Supplementary Material. The uncertainties of the rainfall coefficients allowed us to obtain the uncertainties of the dengue densities, acknowledging the potential impact of undisclosed external variables. Figure 6 depicts the inferred infected population density (\(\mathbf {\hat{I}}\)) (red line) with its uncertainties (gray shades) for the whole study period. The results accurately captured the dengue outbreaks that took place during 2017 (seasons 32−35) and 2019 (seasons 40−43) (see Fig. 6). In outbreak years the uncertainties are minimal, underscoring the precision of the estimates and the model’s ability to capture extreme cases when supported by evidence. These relatively narrow uncertainties in outbreak seasons compared to other seasons suggest that rainfall is likely one of the primary factors contributing to these outbreaks. Our results effectively capture both the trends and the seasonality in dengue incidences, revealing a strong correlation between dengue incidence patterns, fluctuations in vector density, and rainfall patterns. This highlights the significant impact of monsoon rainfall and its seasonality on the spread of dengue in the Colombo district. Consequently, seasonal variations in monsoon rainfall are likely to forecast corresponding seasonal variations in dengue incidences.

However, the estimated results show large uncertainties in specific seasons, which require further investigation. The highest uncertainties (width of the credible interval \(>10\)) are observed in season 21 (April) and season 31 (December). The marginal distributions of these seasons suggest a possible Gaussian mixture distribution, indicating that some external factors may overpower the impact of rainfall (Fig. 4(c)). Especially during the festive seasons in April and December, disease transmission rises due to increased human mobility and population gatherings. Moderately high uncertainties (width of the credible interval between 3 and 6) are observed in seasons 7, 8, 11, 12, 14−16, 22, 24, 28, 32, 36, 38, 42, 45, 47, 48, and 53. In addition, unexpected above-average rainfall may also contribute to the differences seen in the marginal distributions. Using the initial values, per-capita vector density, and rainfall data specific to a particular season, the algorithm consistently forecasted dengue cases.

By utilizing these parameters, one can forecast upcoming dengue emergence using the rainfall data. Thus in this study, an out-of-sample forecast was employed where dengue emergence is forecasted for a period of 10 upcoming weeks using rainfall data. The uncertainty of forecast is inferred in two ways (Fig. 6 green box): the initial values \(\mathbf {Y_0}\) for the algorithm are chosen (a) as the point estimates from the last season (the inner shaded area in Fig. 6) and (b) as the credible interval (lower and upper bound values) of the last season (the outer shaded area in Fig. 6).

Fig. 6

Estimated dengue infected human population density in the CMC area. The red line shows the estimated results (\(\mathbf {\hat{I}}\)) (see Supplementary Material). The gray circles show the reported infected human population density values calculated from the reported patient count data obtained from the NDCU (\(\mathbf {I_{obs}}\)). Uncertainties of the results at the 95% level are shown by the shaded gray bands. The forecast of dengue emergence for one upcoming season is given in the green box. The forecast was obtained in two different ways. The inner band (solid line) shows the uncertainty of the forecast based on the end point of the estimated dengue density in the previous season. The outer band (dashed line) shows the uncertainty of the forecast based on the end points of the uncertainties (lower and upper) of the dengue estimates in the previous season.

Given the ongoing challenges in dengue eradication, particularly due to the absence of a cure or widely available vaccine, understanding the emergence patterns and timing of dengue outbreaks becomes a cornerstone of effective control and prevention strategies. The insights gained from this study in understanding the dengue emergence patterns are indispensable for enabling public health authorities to implement timely interventions, such as fogging with insecticides or the release of Wolbachia-infected mosquitoes, both of which play crucial roles in reducing vector populations and interrupting the transmission cycle of the virus. Additionally, understanding the timing and intensity of rainfall is pivotal in issuing early warnings to the public, empowering households to take necessary actions to eliminate potential vector breeding sites. This is especially critical in areas like the CMC, where high human mobility and dense populations create a fertile environment for rapid dengue transmission. In such settings, the interplay between rainfall conditions and vector activity becomes a vital factor in forecasting dengue outbreaks, thus enabling timely decision-making and the deployment of targeted control measures. The implications of these findings underscore the need for a nuanced, context-specific approach to dengue control, one that integrates environmental monitoring with public health strategies to mitigate the impact of this persistent threat.

The findings of this study provide a foundation for developing a comprehensive early warning system for dengue emergence in the hotspots of Sri Lanka. By leveraging the observed correlations between rainfall patterns and vector population dynamics, an automated system could be designed to predict periods of increased vector activity and, consequently, heightened disease risk. This system would analyze rainfall data in real-time, using predictive models to forecast vector emergence and potential dengue outbreaks. Such a proactive approach would allow public health authorities to issue timely warnings and implement preventive measures well in advance, significantly mitigating the threat of dengue. For instance, our study’s findings could inform the development of an efficient vector control system by determining the optimal timing for cleaning and insecticide application in relation to rainfall patterns. By anticipating risk and acting before the disease spreads, this system could play a crucial role in reducing the incidence of dengue and protecting vulnerable communities across the country.

This study operates under several key assumptions regarding the emergence of dengue fever within the CMC area. First, the study and its findings are specific to the Colombo district in Sri Lanka, which limits how far they can be generalized to other regions. However, the methodology can be adapted to areas with their own distinct weather patterns. Given these region-specific findings, we outline the inherent limitations of this study, which can be addressed in future research. We assume that the primary vectors responsible for dengue transmission, Aedes aegypti and Aedes albopictus, and the virus serotypes are homogeneously distributed throughout the region. Because these data are not available for the Colombo district, our estimates may be inaccurate, and this inaccuracy is possibly reflected in the estimated uncertainties. Furthermore, our estimates do not resolve differences in case intensity by vector species or virus serotype. The emergence of dengue in this area is hypothesized to be driven predominantly by rainfall, given that other environmental conditions, such as temperature and humidity, are consistently favorable for the breeding and survival of these vector species. Thus, any dengue incidence caused by unexpected fluctuations in environmental conditions other than rainfall is not reflected in our results. In the ODE system, both human and vector populations are assumed to be closed: the total numbers of humans and vectors remain constant throughout the study, with no significant external migration or population shifts. This assumption could influence our results, especially given Colombo’s status as a commercial hub with high human mobility. However, the estimated uncertainties account for this effect during the Christmas/New Year period.
Additionally, the study assumes that parameters such as transmission rates, drawn from existing literature, remain constant over time and does not consider potential seasonal variations, which could influence our estimates. Another limitation is the lack of experimental data for the model parameters. Using measured values that reflect dengue transmission in the Colombo district could improve the accuracy of our results. However, errors of this type are usually captured by the uncertainty of the results, and therefore our estimates account for possible errors in the chosen parameter values. Finally, while we configured our model with fixed parameter values, it would be advantageous to employ dynamic parameters for factors such as transmission rates, though this would add complexity.
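
To make the closed-population and fixed-parameter assumptions discussed above concrete, the following is a minimal sketch of a generic host-vector compartmental model (SIR for humans, SI for vectors) integrated with forward Euler steps. It is not the calibrated ODE system of this study: the compartment structure, the function names step and simulate, and every parameter value (beta_h, beta_v, gamma, mu_v, and the per-capita vector density m) are hypothetical placeholders chosen only for illustration. Here m stands in for the rainfall-dependent per-capita vector density and is simply held fixed for the whole run.

```python
# Illustrative host-vector model (SIR humans, SI vectors).
# ASSUMPTION: this is NOT the study's calibrated system; the structure and
# every parameter value below are hypothetical placeholders.

def step(state, m, dt, beta_h=0.3, beta_v=0.3, gamma=1 / 7, mu_v=1 / 14):
    """Advance the model by one forward-Euler step of length dt (days).

    state holds population fractions (s_h, i_h, r_h, s_v, i_v).
    m is the per-capita vector density (vectors per human), held constant.
    """
    s_h, i_h, r_h, s_v, i_v = state
    new_h = beta_h * m * s_h * i_v   # humans infected by infectious vectors
    new_v = beta_v * s_v * i_h       # vectors infected by infectious humans

    d_s_h = -new_h
    d_i_h = new_h - gamma * i_h
    d_r_h = gamma * i_h
    # Vector births mu_v * (s_v + i_v) balance deaths, so s_v + i_v is constant.
    d_s_v = mu_v * i_v - new_v
    d_i_v = new_v - mu_v * i_v

    return (
        s_h + dt * d_s_h,
        i_h + dt * d_i_h,
        r_h + dt * d_r_h,
        s_v + dt * d_s_v,
        i_v + dt * d_i_v,
    )


def simulate(m, days=200, dt=0.1, i_h0=0.001):
    """Return the list of states visited, starting from a small infected seed."""
    state = (1.0 - i_h0, i_h0, 0.0, 1.0, 0.0)
    trajectory = [state]
    for _ in range(int(round(days / dt))):
        state = step(state, m, dt)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    for m in (0.5, 2.0):
        peak = max(s[1] for s in simulate(m))
        print(f"m={m}: peak infected human fraction = {peak:.3f}")
```

Because the human derivatives sum to zero and the vector derivatives sum to zero, both totals stay fixed over the run, which is exactly the closed-system assumption. Relaxing it would mean adding migration terms, and relaxing the fixed-parameter assumption would mean letting beta_h, beta_v, or m vary with time.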

Conclusions

In conclusion, the findings of this study underscore the significant impact of weather variability, particularly monsoon rainfall patterns, on the ecology of vectors and the transmission dynamics of dengue in Sri Lanka. By integrating real-time rainfall data into a mathematical disease model, our study introduced an algorithm for forecasting dengue cases that takes into account changes in vector populations. With this approach, we were able to estimate dynamic surges in vector abundance and, consequently, the dynamics of infected humans. Through this study, we established a robust relationship between vector population dynamics and rainfall patterns. By elucidating how fluctuations in rainfall influence the breeding habitats of disease vectors, we gained valuable insights into the mechanisms driving dengue transmission. Leveraging this understanding, we developed a predictive model that not only quantified changes in vector populations but also enabled us to forecast dengue occurrences. Our study focuses on the Colombo Municipal Council (CMC) area, known for its high population density and significant dengue burden compared to other regions in the country. Our algorithm closely mirrored recorded dengue cases, accurately capturing outbreaks, particularly those in 2017 and 2019. Using Bayesian methods, we quantified the uncertainty of our estimates (Fig. 6). Outbreaks exhibited low uncertainty, indicating accurate predictions and the model’s capacity to capture extreme cases when supported by evidence, whereas some seasons displayed large uncertainties, suggesting the influence of unidentified external factors other than rainfall.

On closer inspection, the high inferred uncertainties occur in many of the seasons following the Christmas/New Year period. Because the Colombo Municipal Council (CMC) area is the commercial hub of the country, Christmas and New Year make it the busiest and most crowded city during these festive times. Given the high population density and the frequent inbound and outbound mobility during the festivities, the city’s environmental pollution is expected to rise significantly. Thus, during these seasons the impact of rainfall is overpowered, which is consistent with the high uncertainty of the marginal distribution. Other seasons with large uncertainties follow seasons with above-average rainfall. Hence, it can be concluded that external factors may dominate the impact of rainfall and may produce uncertainties in the forecast.

In summary, the iterative algorithm introduced in this study aims to estimate seasonal per-capita vector density and dengue cases using a probabilistic and dynamic modeling approach. By validating against real data and quantifying uncertainty, our approach demonstrates its effectiveness in precisely identifying outbreak years. Moreover, the uncertainty quantification reveals further insights into the elevated risk of dengue, emphasizing the necessity for continued exploration in future studies.