Introduction

Epidemiological modeling is a powerful tool for understanding the dynamics of infectious diseases and guiding public health decisions and policies1,2,3,4,5. Mechanistic models, grounded in the known governing laws and physical principles of disease transmission, have been widely used to investigate various infectious diseases, including respiratory infections6,7,8, sexually transmitted diseases9,10, and vector-borne diseases11,12. Unlike empirical models, which primarily focus on data fitting without necessarily incorporating the underlying causes of observed patterns, mechanistic models aim to explain how and why epidemics unfold.

Despite their utility for predicting and controlling the spread of infectious diseases, traditional mechanistic models, such as the classical susceptible-infected-recovered (SIR) structure, face several challenges. First, the reliability of these models depends heavily on the accuracy of estimated parameters governing transmission dynamics5,13,14. However, current models are often constrained by simplifications and data availability. For example, disease transmissibility, though modeled as dynamic, is frequently calibrated using lagged and potentially incomplete death or hospitalization data. Similarly, human contact patterns, crucial for understanding transmission, are often assumed to be static due to limited access to high-quality, real-time data. Furthermore, the impact of interventions is typically modeled using linear terms, failing to fully capture the complex interplay between public responses and pathogen evolution. Second, despite the wealth of epidemiological knowledge encoded in unstructured and multimodal data sources (e.g., satellite imagery, social media, electronic health records), their incorporation into mechanistic models has largely relied on manual feature extraction, hindering the effective utilization of the richness of these data15,16,17. Third, the rise of big data18,19 has spurred the development of more complex mechanistic models that offer granular and detailed descriptions of disease dynamics20,21, but also increase the computational resources required for model calibration and validation, epidemic simulation, and optimization.

Recent advances in artificial intelligence (AI), especially machine learning (ML) and deep learning (DL), offer promising solutions to overcome the challenges and limitations of traditional epidemiological modeling using mechanistic models22,23,24,25,26. AI techniques demonstrate exceptional capabilities in predicting future outcomes, processing diverse databases, and extracting nuanced patterns and insights from big data. Various AI-based approaches have been successfully deployed for healthcare applications27,28,29,30, including medical image analysis, drug discovery, clinical outcome prediction, and treatment optimization. The potential of AI to transform epidemiological modeling has been actively explored across disciplines31,32,33,34. One line of research focuses on purely AI-driven predictive models35,36,37 as alternatives to traditional mechanistic models. Although these predictive models may perform well in short-term epidemic forecasting, their lack of underlying mechanisms limits their utility for long-term planning and scenario analysis. Integrated models, which combine the data-mining capabilities of AI techniques with the explanatory power of mechanistic models, are gaining significant attention. Despite the wide spectrum of AI methods, current integrations with mechanistic epidemiological models are predominantly limited to traditional statistical models, particularly for parameter inference and model calibration38,39,40,41. Explorations of emerging ML and DL techniques, though promising and rapidly expanding, remain fragmented due to the complexity of these techniques and interdisciplinary communication challenges. Bridging this gap is crucial to fully harnessing the power of AI to advance epidemiological modeling.

Existing reviews on emerging AI applications in infectious disease management have primarily focused on clinical aspects (e.g., diagnosis and treatment), drug discovery, and purely AI-driven predictive models18,42,43,44,45. Some reviews have provided overviews of AI applications in infectious disease surveillance46,47,48, offering glimpses of the integration between AI and mechanistic models; however, a comprehensive review dedicated specifically to this integration is lacking. This scoping review aims to address this gap by systematically synthesizing the literature in this emerging field. We identify solutions from various disciplines with the potential to address immediate needs in epidemiological modeling, outline the gaps between research and real-world applications, and highlight promising research directions for utilizing integrated models to provide data-driven policy guidance.

Results

Study selection and characteristics

Our search produced 15,460 studies (15,422 from database search, 17 through backward citation search, and 21 through manual search of relevant journals and conference proceedings). After eliminating 6267 duplicates, 9193 studies were screened. Of these, 807 studies advanced to the full-text review, and 245 studies were ultimately included in this scoping review (Methods, Fig. 1). The characteristics of these studies are provided in Supplementary Appendix 5.

Fig. 1: PRISMA flowchart detailing the search strategy, screening process, and article selection.

The studies spanned various application areas of integrated models for diverse infectious diseases. Overall, 26 infectious diseases were investigated using integrated models (Supplementary Appendix 6). The majority of these studies focused on COVID-19 (148 studies, 60%), followed by influenza (18 studies, 7%), dengue (4 studies, 2%), and HIV (3 studies, 1%). Additionally, 56 studies (23%) used hypothetical disease scenarios to demonstrate method applicability rather than investigating specific diseases. The recent surge in COVID-19 research has notably increased the volume of studies integrating AI with epidemiological models, with 217 (89%) of the included studies published between 2020 and 2023. Despite the increase in research study volume, the distribution of application areas for integrated models remained consistent over time (Fig. 2). We grouped the application areas into six primary categories (Fig. 2, Box 1, Supplementary Appendix 6): infectious disease forecasting (86 studies, 35%), model parameterization and calibration (77 studies, 31%), and disease intervention assessment and optimization (72 studies, 29%), followed by retrospective epidemic course analysis (16 studies, 7%), transmission inference (9 studies, 4%), and outbreak detection (7 studies, 3%). These categories are not mutually exclusive, indicating that a single integrated model can serve multiple application areas.

Fig. 2: Number of studies satisfying the inclusion criteria, stratified by infectious diseases investigated, application areas, and year of publication.

A list of investigated infectious diseases can be found in Supplementary Appendix 6.

Infectious disease forecasting

Among the included studies, 86 reported on the use of integrated models for infectious disease forecasting (Supplementary Appendix 7). Nearly all of these studies validated their proposed forecasting frameworks with real-world datasets, and 76 studies (88%) used COVID-19 datasets.

One study used a hierarchical clustering approach to group regions with similar disease activity patterns, partially determined by epidemiological models49. This approach allowed for the identification of regions with synchronized disease activity and the generation of cluster-based predictions. One study predicted case numbers using tree-based models, with input features informed by an epidemiological model50. Two studies leveraged tree-based methods, trained on synthetic datasets generated by epidemiological models, to discern the relationship between early-phase outbreak situation metrics and future epidemic outcomes51,52. Six studies employed ensemble learning frameworks that combined forecasts from AI and epidemiological models to improve forecasting performance53,54,55,56,57,58. Long short-term memory (LSTM) networks, the most frequently used method, are adept at learning temporal dependencies from time-series data, thereby complementing mechanistic models in generating robust forecasts. Of these six studies, four used weighted averaging to combine forecasts based on historical model performance53,54,55,57; one utilized stacking, where an LSTM-based meta-model was trained to learn the optimal way to integrate forecasts from epidemiological models56; and one employed boosting, where a neural network learned to correct errors in the epidemiological model’s forecasts58.
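
As a concrete illustration of the weighted-averaging variant, the sketch below combines a mechanistic forecast with an LSTM-style forecast using inverse-mean-absolute-error weights; the weighting scheme and the toy numbers are illustrative assumptions, not taken from any specific reviewed study.

```python
import numpy as np

def inverse_error_weights(past_forecasts, past_truth):
    """Weight each model by the inverse of its historical mean absolute error."""
    # past_forecasts: (n_models, n_past_weeks); past_truth: (n_past_weeks,)
    mae = np.mean(np.abs(past_forecasts - past_truth), axis=1)
    weights = 1.0 / (mae + 1e-8)
    return weights / weights.sum()

def ensemble_forecast(new_forecasts, weights):
    """Combine current forecasts (n_models, horizon) into one weighted forecast."""
    return weights @ new_forecasts

# Toy example: a mechanistic (e.g., SEIR) forecast and an LSTM forecast of weekly cases
seir_past  = np.array([120, 150, 180, 210.])
lstm_past  = np.array([110, 160, 175, 220.])
truth_past = np.array([115, 155, 185, 215.])

w = inverse_error_weights(np.vstack([seir_past, lstm_past]), truth_past)
combined = ensemble_forecast(np.vstack([[230, 250, 270.], [240, 245, 280.]]), w)
print(w, combined)
```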

Twenty-nine studies forecasted epidemic trajectories based on physics-informed neural networks (PINNs; n = 9)32,59,60,61,62,63,64,65,66, epidemiology-aware AI models (EAAMs; n = 11)67,68,69,70,71,72,73,74,75,76,77, and synthetically-trained AI models (n = 9)78,79,80,81,82,83,84,85,86. PINNs represented state variables and other time-varying parameters as neural networks that take time \(t\) as input. The loss function of PINNs consists of two components: (i) the data loss, reflecting the disparity between neural network outputs and actual data, and (ii) the residual loss, ensuring adherence to disease transmission mechanisms represented by differential equations. By incorporating epidemiological knowledge into neural networks through the residual loss, PINNs exhibit enhanced performance in parameter inference and disease forecasting. PINNs extrapolated future state variables using time steps over the forecast period as input. In contrast, EAAMs offered more adaptable model structures and knowledge-infusing frameworks that extended standard AI models, such as recurrent neural networks (RNNs) and graph neural networks (GNNs), by assimilating epidemiological knowledge into the architectures, loss functions, and training processes of AI models. Synthetically-trained AI models employed time series or spatiotemporal forecasting models, such as LSTM networks and GNNs, for infectious disease forecasting. These models acquire epidemiological insights by learning transmission mechanisms from synthetic datasets generated by epidemiological models. Such integrated approaches surmount the limitations of methodological frameworks that rely solely on mechanistic or AI models, which may be computationally impractical when surveillance data are noisy or sparse.
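
To make the two-part loss concrete, the sketch below implements a PINN for a basic SIR system in PyTorch; the softmax output used to keep compartments normalized, the network sizes, and the toy observations are illustrative assumptions rather than the design of any reviewed study.

```python
import torch
import torch.nn as nn

class SIRPinn(nn.Module):
    """Neural network mapping time t to (S, I, R) fractions; beta and gamma are trainable."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(),
                                 nn.Linear(64, 64), nn.Tanh(),
                                 nn.Linear(64, 3), nn.Softmax(dim=-1))  # assumed normalization trick
        self.log_beta = nn.Parameter(torch.tensor(-1.0))   # transmission rate (log scale)
        self.log_gamma = nn.Parameter(torch.tensor(-2.0))  # recovery rate (log scale)

    def forward(self, t):
        return self.net(t)

def pinn_loss(model, t_data, i_data, t_colloc):
    # (i) data loss: match the observed infectious fraction
    s_i_r = model(t_data)
    data_loss = torch.mean((s_i_r[:, 1] - i_data) ** 2)
    # (ii) residual loss: enforce dS/dt = -beta*S*I and dI/dt = beta*S*I - gamma*I
    t = t_colloc.clone().requires_grad_(True)
    out = model(t)
    S, I = out[:, 0], out[:, 1]
    dS = torch.autograd.grad(S.sum(), t, create_graph=True)[0].squeeze(-1)
    dI = torch.autograd.grad(I.sum(), t, create_graph=True)[0].squeeze(-1)
    beta, gamma = model.log_beta.exp(), model.log_gamma.exp()
    residual_loss = ((dS + beta * S * I) ** 2 + (dI - beta * S * I + gamma * I) ** 2).mean()
    return data_loss + residual_loss

# Toy usage with placeholder observations of the infectious fraction
t_data = torch.linspace(0, 1, 30).unsqueeze(-1)
i_data = 0.05 + 0.1 * torch.sin(3.14 * t_data.squeeze())   # placeholder data, not real surveillance
t_col = torch.rand(200, 1)                                 # collocation points for the residual loss

model = SIRPinn()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = pinn_loss(model, t_data, i_data, t_col)
    loss.backward()
    opt.step()
```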

Forty-seven studies adopted AI-augmented epidemiological models, which replaced parts of epidemiological models (e.g., model parameters, derivatives, or derivative orders) with AI components. These AI components were employed to predict future values of the replaced model parts, which were then fed back into the epidemiological models for forecasting. Among these, eight studies employed end-to-end training of AI-augmented epidemiological models87,88,89,90,91,92,93,94. By inserting the model parts directly or indirectly approximated by AI models into numerical solvers, AI-augmented epidemiological models can produce estimated observational data; the AI components were then trained by minimizing a loss function based on the difference between actual and estimated observations. Thirty-seven studies forecasted unknown components of epidemiological models using supervised learning frameworks, where AI models (primarily RNNs) were trained on synthetic data generated by epidemiological models or on historical component values95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131. Two studies did not specify their component learning frameworks132,133.
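
A minimal sketch of end-to-end training, assuming a neural network that outputs a time-varying transmission rate inside a differentiable forward-Euler SIR step, fitted to hypothetical incidence data (both the architecture and the data are illustrative assumptions).

```python
import torch
import torch.nn as nn

class TransmissionRateNet(nn.Module):
    """Small network approximating a time-varying transmission rate beta(t)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1), nn.Softplus())

    def forward(self, t):
        return self.net(t)

def simulate_sir(beta_net, gamma, s0, i0, n_steps, dt=1.0):
    """Differentiable forward-Euler integration of an SIR model with an NN-parameterized beta."""
    S, I = s0, i0
    incidence = []
    for k in range(n_steps):
        t = torch.tensor([[k * dt]], dtype=torch.float32)
        beta = beta_net(t).squeeze()
        new_inf = beta * S * I * dt           # new infections in this step
        S = S - new_inf
        I = I + new_inf - gamma * I * dt
        incidence.append(new_inf)
    return torch.stack(incidence)

# End-to-end training against hypothetical observed daily incidence (fractions of the population)
obs = torch.tensor([0.002, 0.003, 0.005, 0.008, 0.011, 0.013, 0.012, 0.010])
beta_net = TransmissionRateNet()
opt = torch.optim.Adam(beta_net.parameters(), lr=1e-2)
gamma = torch.tensor(0.2)

for _ in range(1000):
    opt.zero_grad()
    sim = simulate_sir(beta_net, gamma, s0=torch.tensor(0.99), i0=torch.tensor(0.01), n_steps=len(obs))
    loss = torch.mean((sim - obs) ** 2)       # loss on the gap between simulated and observed data
    loss.backward()                           # gradients flow through the numerical solver
    opt.step()
```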

Model parameterization and calibration

Seventy-seven studies explored the use of integrated models for parameterization or calibration of epidemiological models (Supplementary Appendix 8). Among these, four studies employed AI techniques to improve observational data by extracting auxiliary information from non-traditional surveillance sources, such as social media content and search trend data134,135,136,137. The improved observational data were then used for precise parameterization and calibration of epidemiological models for diseases including COVID-19 and influenza. Two of these studies utilized support vector machines (SVMs) or tree-based methods to generate disease activity data by inferring individuals’ health status from social media content134,136. Another study employed tree-based methods to refine observed data and estimate unobserved data by supplementing traditional surveillance data (specifically, laboratory-confirmed influenza hospitalizations) with search trend data137. The final study used tree-based methods to determine the relative importance of various non-pharmacological interventions in modifying the transmission rate within the epidemiological model135.

The remaining 73 studies implemented AI-enhanced calibration methods using three main approaches: surrogate modeling (n = 13), synthetically-trained scenario classifiers (n = 5), and direct parameter calibration (n = 56). In surrogate modeling-based calibration methods, lightweight AI-based surrogates of epidemiological models were integrated into Bayesian138,139,140 or simulation-based optimization frameworks86,141,142 to achieve efficient parameter inference, thereby replacing computationally intensive processes. Trained on datasets generated by epidemiological models with varying input parameters, these surrogates learned the relationships between inputs and simulated outputs, and thus accelerated model fitting. Most of these studies (9 out of 13) developed surrogates for agent/individual-based models, primarily due to their high computational demands. Neural networks were the most frequently used surrogates (10 of 13 studies). Five studies evaluated the performance of AI-based surrogates using simulation datasets generated by disease-specific models. Scenario classifier-based methods reframed parameter inference as a classification task: classifiers were trained on synthetic data from epidemiological models to predict epidemic scenarios characterized by sets of parameters143,144,145,146,147. This approach has been applied to calibrate epidemiological models for diseases such as influenza and tomato spotted wilt virus infection. Tree-based methods (4 of 5 studies) and SVMs (2 of 5) were the most commonly used classifiers.
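
A minimal sketch of surrogate-based calibration, assuming a cheap stand-in for an expensive simulator, a scikit-learn MLP surrogate, and random-search calibration (Bayesian or more sophisticated optimization would typically be used instead); all names and parameter ranges are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def expensive_epi_model(beta, gamma, days=60):
    """Stand-in for a computationally costly simulator (e.g., an agent-based model)."""
    S, I = 0.99, 0.01
    curve = []
    for _ in range(days):
        new_inf = beta * S * I
        S, I = S - new_inf, I + new_inf - gamma * I
        curve.append(I)
    return np.array(curve)

# 1. Generate a training set of (parameters -> simulated epidemic curve) pairs
rng = np.random.default_rng(0)
params = rng.uniform([0.1, 0.05], [0.8, 0.4], size=(500, 2))   # sampled (beta, gamma)
curves = np.array([expensive_epi_model(b, g) for b, g in params])

# 2. Fit a lightweight neural-network surrogate of the simulator
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
surrogate.fit(params, curves)

# 3. Calibrate: search the parameter space with the cheap surrogate instead of the simulator
observed = expensive_epi_model(0.45, 0.15) + rng.normal(0, 0.002, 60)   # pseudo-observations
candidates = rng.uniform([0.1, 0.05], [0.8, 0.4], size=(20000, 2))
errors = np.mean((surrogate.predict(candidates) - observed) ** 2, axis=1)
best_beta, best_gamma = candidates[np.argmin(errors)]
print(f"calibrated beta={best_beta:.3f}, gamma={best_gamma:.3f}")
```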

Of the 56 studies employing direct parameter calibration methods, 29 leveraged PINNs32,59,60,61,62,63,64,65,66,131,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166, 3 used EAAMs73,74,167, 18 adopted AI-augmented epidemiological models87,89,91,92,93,94,168,169,170,171,172,173,174,175,176,177,178,179, 4 employed Bayesian neural networks180,181,182,183, and 2 used synthetically-trained neural networks95,139. In PINN or EAAM-based methods, parameters in epidemiological models were either represented by AI models or set as trainable weights, enabling parameter estimations to be updated during training. Among studies employing AI-augmented epidemiological models, 17 utilized neural networks to estimate model parameters87,89,91,92,93,168,169,170,171,172,173,174,175,176,177,178,179, while one study utilized a regression method to predict model parameters with unobserved features estimated by tree-based methods94. These AI-augmented models underwent end-to-end training, similar to the approach discussed in the Infectious Disease Forecasting section. Bayesian neural networks leverage the capabilities of neural networks in handling high-dimensional data to enhance parameter inference in Bayesian approaches, including variational and simulation-based inference. These approaches attempt to approximate the posterior distribution of model parameters given observed data, especially when faced with intractable likelihood or marginal likelihood. One study employed variational inference to jointly infer unknown parameters and latent diffusion processes in the epidemiological model181. An RNN formed part of the variational approximation of the joint posterior distribution of the parameters and diffusion processes, conditional on the observed data. This approximation was optimized by maximizing the evidence lower bound (ELBO). Three studies utilized simulation-based inference techniques to approximate the posterior when the likelihood, implicitly defined by epidemiological models, was intractable180,182,183. These techniques include neural density estimation methods and neural network-based approximate Bayesian computation184. Methods utilizing synthetically-trained neural networks, by contrast, involved training on labeled datasets generated by epidemiological models to directly predict parameter values from observational data.
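
The last approach, synthetically-trained neural networks that predict parameter values directly from observations, essentially inverts the surrogate mapping sketched above; here is a minimal sketch under the same toy SIR assumptions (the simulator, noise level, and network size are illustrative).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def sir_curve(beta, gamma, days=60):
    S, I = 0.99, 0.01
    out = []
    for _ in range(days):
        new_inf = beta * S * I
        S, I = S - new_inf, I + new_inf - gamma * I
        out.append(I)
    return np.array(out)

# Features = simulated (noisy) epidemic curves, labels = the parameters that generated them
rng = np.random.default_rng(1)
theta = rng.uniform([0.1, 0.05], [0.8, 0.4], size=(2000, 2))
X = np.array([sir_curve(b, g) + rng.normal(0, 0.002, 60) for b, g in theta])

estimator = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=2000, random_state=1)
estimator.fit(X, theta)

# Apply to a new (here synthetic) observation to read off parameter estimates directly
obs = sir_curve(0.5, 0.2) + rng.normal(0, 0.002, 60)
print(estimator.predict(obs.reshape(1, -1)))   # approximate [beta, gamma]
```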

Among the 77 studies that focused on model parameterization and calibration, 28 used well-calibrated models for retrospective disease intervention assessment or future projections32,59,62,64,87,92,134,135,136,137,147,148,150,151,152,156,160,161,163,164,165,167,168,169,170,174,176,177. Retrospective assessments were performed by analyzing the fitted values of parameters affected by interventions. For projections, parameter values over the projection horizon were held at their final values from the training window.

Disease intervention assessment and optimization

Seventy-two studies leveraged integrated models to assess (n = 13) or optimize (n = 59) the impact of interventions (Supplementary Appendix 9). To accelerate the estimation of intervention effectiveness, seven of these studies constructed neural network-based or tree-based surrogates of epidemiological models185,186,187,188,189,190,191. In four studies, AI-augmented epidemiological models were utilized to establish relationships between control measures and epidemiological parameters. AI models, including tree-based methods, SVMs, and neural networks, were trained for this purpose31,192,193,194. The impact of control strategies was then assessed by incorporating the parameter values estimated by these AI models into epidemiological models. Additionally, one study employed a cluster-based framework to assess the effectiveness of large-scale interventions. K-means clustering was used to identify representative areas with distinct archetypes195, after which an agent-based model was used to evaluate the impact of interventions on these areas. Three studies utilized a game-theoretic approach to assess196 or optimize197,198 control measures using agent-based models. In these studies, the Nash equilibrium was derived using a neural network approach. Of the 13 studies utilizing integrated models for intervention assessment, six investigated COVID-19, and one each studied malaria, plague, dengue, and HIV. Three studies did not investigate any specific disease.

In addition to the game theory-based optimization frameworks197,198, intervention optimization in integrated models also used reinforcement learning (RL) (38 studies)34,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235, key node finding (10 studies)236,237,238,239,240,241,242,243,244,245, optimal control theory (6 studies)93,212,246,247,248,249, Markov decision process (MDP) (1 study)250, and surrogate modeling (3 studies)251,252,253 frameworks. In RL-based frameworks, RL environments were constructed based on epidemiological models to assess the impact of different intervention strategies. Through interactions with these environments, RL agents learned the optimal intervention strategy. In key node finding-based frameworks, AI models were utilized to identify the optimal set of individuals for interventions based on their attributes. With the exception of one study that used an AI-augmented epidemiological model to derive node importance236, all other studies employed synthetically-trained AI models, primarily GNNs (7 out of 9 studies). These models were trained on datasets generated by epidemiological models, featuring various source node sets and transmission network structures, to identify key nodes in unobserved scenarios.
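
To illustrate the RL setup, the following sketch builds a toy environment from a discrete-time SIR model and trains a tabular Q-learning agent to choose weekly intervention intensities; the action set, reward trade-off, and state discretization are illustrative assumptions (the reviewed studies typically used deep RL agents and richer epidemiological models).

```python
import numpy as np

class SIREnv:
    """Toy RL environment: weekly choice of intervention intensity in a discrete-time SIR model."""
    ACTIONS = [0.0, 0.3, 0.6]          # fraction by which transmission is reduced

    def __init__(self, beta=0.4, gamma=0.1, cost_weight=0.5):
        self.beta, self.gamma, self.cost_weight = beta, gamma, cost_weight

    def reset(self):
        self.S, self.I, self.week = 0.99, 0.01, 0
        return self._state()

    def _state(self):
        return min(int(self.I * 20), 9)           # discretize infectious fraction into 10 bins

    def step(self, action):
        reduction = self.ACTIONS[action]
        for _ in range(7):                         # simulate one week of daily dynamics
            new_inf = self.beta * (1 - reduction) * self.S * self.I
            self.S -= new_inf
            self.I += new_inf - self.gamma * self.I
        self.week += 1
        reward = -(self.I + self.cost_weight * reduction)   # penalize infections and intervention cost
        return self._state(), reward, self.week >= 26

# Tabular Q-learning agent interacting with the epidemiological environment
env, Q = SIREnv(), np.zeros((10, 3))
rng = np.random.default_rng(0)
for episode in range(2000):
    s, done = env.reset(), False
    while not done:
        a = rng.integers(3) if rng.random() < 0.1 else int(np.argmax(Q[s]))  # epsilon-greedy
        s2, r, done = env.step(a)
        Q[s, a] += 0.1 * (r + 0.95 * np.max(Q[s2]) - Q[s, a])
        s = s2
print(np.argmax(Q, axis=1))   # learned intervention level per infection-prevalence bin
```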

In optimal control problem frameworks applied to intervention optimization, obtaining optimal control signals is typically challenging due to complexities arising from the underlying transmission dynamics. To address this challenge, one study developed a reduced model for an agent-based model, where neural networks were employed to approximate transition rates among individuals in different compartments246. Other studies used neural networks to approximate decision variables, thereby transforming the optimal control problem into a parameter learning problem. Identifying the optimal control strategy then amounts to training the neural networks to minimize a loss function derived from the objective of the intervention optimization problem.
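
A minimal sketch of this reformulation, assuming a small policy network whose sigmoid output scales down transmission in a toy SIR model and an objective that penalizes both infections and control effort (the model, cost weights, and horizon are illustrative assumptions).

```python
import torch
import torch.nn as nn

# Control u(t) in [0, 1] reduces transmission; the objective trades off infections vs control cost.
policy = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1), nn.Sigmoid())
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
beta, gamma, cost_weight, horizon = 0.4, 0.1, 0.3, 120

for _ in range(500):
    S, I = torch.tensor(0.99), torch.tensor(0.01)
    total_infections, total_control = 0.0, 0.0
    for day in range(horizon):
        u = policy(torch.tensor([[day / horizon]])).squeeze()   # NN-approximated decision variable
        new_inf = beta * (1 - u) * S * I
        S, I = S - new_inf, I + new_inf - gamma * I
        total_infections = total_infections + new_inf
        total_control = total_control + u
    # The control objective becomes the training loss of the policy network
    loss = total_infections + cost_weight * total_control / horizon
    opt.zero_grad()
    loss.backward()
    opt.step()
```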

In MDP-based frameworks, the optimal intervention problem was translated into a discrete-time MDP, where neural networks approximated time-dependent control strategies as a function of current states, similar to the training strategy in the optimal control theory-based frameworks. In surrogate frameworks for intervention optimization, AI models were trained to improve the computational efficiency of identifying optimal control strategies. These AI models, often tree-based methods, learned decision rules from computationally intensive non-AI techniques such as search-based optimization methods.

Among the 59 studies utilizing integrated models for intervention optimization, 25 investigated COVID-19, and one each studied malaria, foot-and-mouth disease, influenza, HIV, porcine reproductive and respiratory syndrome, and Zymoseptoria tritici infection. Additionally, 28 studies proposed general methodological frameworks without investigating any specific disease.

Retrospective epidemic course analysis

Sixteen studies leveraged integrated models to retrospectively analyze past epidemics using surrogate modeling frameworks (Supplementary Appendix 10). In 14 of these studies, surrogate models were used to identify key factors influencing transmission dynamics. This was done by training the models to recognize dependencies between various input factors and the corresponding simulation outputs254,255,256,257,258,259,260,261,262,263,264,265,266,267. Two studies used surrogate models to examine how individual characteristics and behaviors are related during epidemics268,269. Most studies (13 out of 16) adopted tree-based models. This framework was applied to investigate COVID-19 (4 studies), influenza (3 studies), dengue (1 study), enterovirus infection (1 study), brucellosis (1 study), foot-and-mouth disease (1 study), smallpox (1 study), pertussis (1 study), SARS (1 study), and varicella zoster virus infection (1 study).
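
The sketch below illustrates the surrogate idea for retrospective analysis: a random forest is trained on (input factors, simulated outcome) pairs from a stand-in simulator, and its feature importances indicate which factors most influence the simulated attack rate. The simulator, factor names, and outcome definition are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def abm_like_simulator(contact_rate, vaccination, mask_adherence, rng):
    """Stand-in for an expensive simulation; returns a rough final attack rate."""
    r_eff = 2.5 * contact_rate * (1 - 0.7 * vaccination) * (1 - 0.3 * mask_adherence)
    attack = 1 - 1 / r_eff if r_eff > 1 else 0.0
    return attack + rng.normal(0, 0.01)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(1000, 3))            # sampled input factors
y = np.array([abm_like_simulator(c, v, m, rng) for c, v, m in X])

surrogate = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in zip(["contact_rate", "vaccination", "mask_adherence"],
                     surrogate.feature_importances_):
    print(f"{name}: {imp:.2f}")                  # relative influence on the simulated outcome
```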

Transmission inference

Nine studies investigated the use of integrated models for transmission inference (Supplementary Appendix 11), focusing on source localization (n = 4)270,271,272,273, determining the underlying transmission network or pattern (n = 2)236,274, inferring the health status of unobserved individuals (n = 1)275, reconstructing disease evolution dynamics (n = 1)276, and inferring incidence from death records (n = 1)33.

One study defined the COVID-19 transmission mechanism based on the renewal equation33. This knowledge was then incorporated into the loss function of a convolutional neural network, which connected death records with incidence data, similar to the PINNs/EAAMs described previously. The other eight studies relied on individual-based disease models. Due to challenges in obtaining real-world individual-level disease transmission networks, the majority (6 out of 8 studies) proposed general methodological frameworks and evaluated their performance using hypothetical networks and disease transmission scenarios. Only two studies validated their methods using real-world COVID-19 or tuberculosis datasets272,274.

Among studies employing individual-level models, one study formulated transmission inference as an optimal control problem with the unknown network structure treated as the control variable, the underlying transmission dynamics as constraints, and the difference between actual and estimated observations as the objective function236. The problem was then solved using a neural network approach. Another study employed a tree-based classifier to infer the health status of unobserved individuals based on their attributes. Disease propagation properties derived from epidemiological models were used as features in the classifier275.

The remaining six studies employed AI models trained on datasets generated by epidemiological models. These trained models were then applied to unseen synthetic or real-world data to infer transmission dynamics270,271,272,273,274,276, such as generating source probability distributions272,273 or identifying the underlying transmission pattern (e.g., homogeneous transmission and super-spreader transmission)274. GNN-based models were used in 5 of these 6 studies, owing to their ability to learn the intricate structures of transmission networks and dynamics on networks.
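
To ground the GNN idea, here is a plain-PyTorch sketch (no GNN library) of training a two-layer message-passing network on synthetic cascades from a toy contact network to score candidate source nodes; the network, spreading process, and architecture are illustrative assumptions rather than the setup of any reviewed study.

```python
import numpy as np
import torch
import torch.nn as nn

# Toy contact network; node features = infection status, label = index of the source node
rng = np.random.default_rng(0)
N = 50
A = (rng.random((N, N)) < 0.08).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0)
A_hat = torch.tensor(A + np.eye(N), dtype=torch.float32)
A_hat = A_hat / A_hat.sum(1, keepdim=True)       # row-normalized adjacency with self-loops

def simulate_cascade(source, steps=4, p=0.3):
    """Simple stochastic SI spread from a single source on the fixed network."""
    infected = np.zeros(N)
    infected[source] = 1
    for _ in range(steps):
        risk = A @ infected
        infected = np.maximum(infected, (rng.random(N) < 1 - (1 - p) ** risk).astype(float))
    return infected

class SourceGNN(nn.Module):
    """Two rounds of neighborhood averaging followed by a per-node source score."""
    def __init__(self, hidden=32):
        super().__init__()
        self.w1, self.w2 = nn.Linear(1, hidden), nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                          # x: (N, 1) infection indicators
        h = torch.relu(self.w1(A_hat @ x))
        h = torch.relu(self.w2(A_hat @ h))
        return self.out(h).squeeze(-1)             # unnormalized per-node source scores

model = SourceGNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(3000):                              # train on synthetic cascades with known sources
    src = rng.integers(N)
    x = torch.tensor(simulate_cascade(src), dtype=torch.float32).unsqueeze(-1)
    loss = nn.functional.cross_entropy(model(x).unsqueeze(0), torch.tensor([src]))
    opt.zero_grad()
    loss.backward()
    opt.step()
# At inference, a softmax over the scores gives a source probability distribution over nodes.
```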

Outbreak detection

Seven studies reported on the use of integrated models for outbreak detection (Supplementary Appendix 12). Among these, two studies formulated the problem of COVID-19 outbreak detection as a classification problem, where tree-based, SVM-based, or multilayer perceptron (MLP) classifiers were trained to predict the outbreak risk level in a region based on its associated features277,278. Three studies estimated the outbreak risk of vector-borne diseases using epidemiological models parameterized by tree-based or natural language processing (NLP) methods279,280,281. The final two studies gauged influenza outbreak risks using posterior probabilities of epidemiological models in the presence and absence of outbreaks282,283. Specifically, NLP methods were used to extract patient diagnosis data from emergency department reports, which were then input into Bayesian frameworks to derive the posterior probabilities for model selection.
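
As a minimal illustration of the classification formulation, the sketch below trains a gradient-boosted classifier on hypothetical regional features with simulated risk labels; in the reviewed studies, features and labels would instead come from surveillance data or epidemiological model output.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Hypothetical regional features: recent case growth, mobility index, test positivity, vector index
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
# Simulated labels for illustration only: 1 = elevated outbreak risk
risk_score = 1.2 * X[:, 0] + 0.8 * X[:, 2] + rng.normal(0, 0.5, 2000)
y = (risk_score > 0.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```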

Summary of integration methodologies

We identified nine primary methodological frameworks (Fig. 3, Supplementary Appendix 16), with surrogate modeling/synthetically-trained AI models comprising the largest proportion at 28% (68 studies). AI-augmented epidemiological models accounted for 26% (64 studies), AI-enhanced optimization frameworks made up 20% (48 studies), and PINNs and EAAMs collectively represented 17% (42 studies). Epidemiological models utilizing improved observational data appeared in 4% (9 studies), Bayesian neural networks in 3% (7 studies), ensemble learning frameworks in 2% (6 studies), AI models incorporating epidemiological input features in 1% (3 studies), and cluster-based transmission analysis frameworks in 1% (2 studies).

Fig. 3: Illustrative examples of the methodological frameworks employed by studies satisfying the inclusion criteria.

Nine primary frameworks were identified, including AI-augmented epidemiological models (a), epidemiological models with improved observational data (b), PINNs/EAAMs (c), AI models incorporating epidemiological input features (d), surrogate modeling/synthetically-trained AI models (e), ensemble learning frameworks (f), Bayesian neural networks (g), AI-enhanced optimization frameworks (h), and cluster-based transmission analysis frameworks (i). Details of these frameworks can be found in Supplementary Appendix 16. PINNs physics-informed neural networks, EAAMs epidemiology-aware AI models.

Seven integration approaches were adopted across methodological frameworks (Fig. 4). Nearly half of the studies (112) employed AI models to learn unknown components of epidemiological models, enabling the incorporation of time-varying components and diverse datasets into disease modeling. Another common integration approach (76 studies) was training AI techniques on data generated from epidemiological models. This approach was used to learn disease transmission mechanisms, build surrogates for faster estimation and evaluation of model outcomes, or overcome the limitations of scarce and low-quality real-world data by leveraging synthetic datasets. Additionally, 73 studies demonstrated the integration of epidemiological knowledge into the input, loss functions, architectures, and learning processes of AI models. Forty-seven studies utilized AI models, primarily within RL and optimal control theory-based frameworks, to determine optimal decisions under dynamic disease spreading processes. Only ten studies employed AI models to enhance observational data by extracting auxiliary information from non-traditional surveillance sources, while six studies combined AI and epidemiological models through ensemble modeling frameworks to improve epidemic forecasting performance. Finally, one study used clustering methods to decompose large-scale epidemiological models.

Fig. 4: Sankey diagram visualization of study categorization.

The weight of each edge is proportional to the number of studies. Edges between application areas and methodological frameworks are colored based on application areas. Edges between methodological frameworks and integration approaches are colored based on integration approaches. Source data for the Sankey diagram can be found in Supplementary Appendix 14. PINNs physics-informed neural networks, EAAMs epidemiology-aware AI models.

Measures of quality

Among the 178 articles published in peer-reviewed journals, 14 (8%) did not list an impact factor, and 15 (8%) lacked a listed h5-index in Google Scholar (Supplementary Appendix 13). Seventy-four articles (42%) were published in quartile 1 (Q1) journals, meaning their impact factors were higher than those of at least 75% of journals in the same subject domain. Of the 67 articles published in the proceedings of conferences, workshops, or symposiums, 15 (22%) lacked a listed h5-index in Google Scholar. Additionally, 75 articles (26 published in peer-reviewed journals and 49 published in the proceedings of conferences, workshops, or symposiums) did not have citation information in Web of Science.

Discussion

The rapid expansion of big data and advancements in computational capacity have greatly broadened the integration of AI techniques with mechanistic epidemiological modeling. This scoping review identified and synthesized contributions to this burgeoning field. Among the 245 studies reviewed, nearly 90% were published during the past four years, propelled by the surge in COVID-19 research. We identified 26 infectious diseases that have been investigated using integrated models, with COVID-19 research constituting 60% of the studies. The applications of integrated models fell into six primary areas: infectious disease forecasting, model parameterization and calibration, disease intervention assessment and optimization, retrospective epidemic course analysis, transmission inference, and outbreak detection. The majority of studies focused on the first three categories. In contrast, fewer studies addressed retrospective epidemic course analysis, with notably limited research on transmission inference and outbreak detection, highlighting potential areas for future exploration. The majority of studies validated their proposed frameworks using real-world datasets (Supplementary Appendix 5). However, studies on transmission inference or intervention optimization often relied on synthetic data due to a sparsity of real-world data required for validation.

Integrated models have successfully addressed the challenges posed by mechanistic models in the face of continuously evolving epidemiological situations. This success has been achieved by leveraging AI techniques (Supplementary Appendix 15) to extract valuable information from diverse databases, uncover hidden spatiotemporal dependencies within high-dimensional data, discern complex relationships between variables and outcomes of interest, effectively learn and transfer knowledge embedded in the data, and introduce methodological innovations within established Bayesian and optimization frameworks.

Our review identified significant gaps and opportunities in the literature regarding the use of AI in mechanistic epidemiological modeling. First, among the six application areas identified in our review, integrated models stand out for their practical potential to revolutionize disease forecasting, model parameterization, and calibration in the near future. Traditional mechanistic models, grounded in human knowledge of disease progression and pathogen characteristics, have been constrained by their inflexibility in rapidly refining model structures and parameters to reflect current disease landscapes and policy priorities5,13. Continuous model refinement is resource-intensive and time-consuming, potentially leading to more complex models that are difficult to calibrate. Empowered by AI’s ability to handle diverse datasets and approximate complex functions, integrated models offer crucial and timely solutions. These advancements enable the effective utilization of widely available, yet dynamically evolving and intrinsically noisy, non-traditional surveillance data and facilitate the calibration of increasingly sophisticated mechanistic models with numerous free parameters. While extensive studies focus on intervention optimization, most remain theoretical, with limited demonstrations of practical applicability. For example, despite the heterogeneity in intervention objectives, most studies formulated the optimization problem using oversimplified assumptions about disease transmission dynamics, decision-making processes, and the costs and impacts of intervention strategies. Realistic considerations for decision-makers, such as the reasonableness of model assumptions, the feasibility of intervention strategies, public responses, and the trade-offs between disease and socioeconomic outcomes, were frequently overlooked. Although several studies attempted to incorporate economic factors into decision-making processes to balance health benefits and costs, significant gaps remain between proof-of-concept methods and real-world applications.

Second, while big data has great potential to enhance these models, the integration of non-traditional surveillance data such as social media content, search queries, medical reports, and satellite imagery remains limited. These data types could significantly augment or even replace traditional data sources in some contexts. For example, the transmission of climate-sensitive vector-borne diseases is influenced by a variety of climate and environmental factors. However, existing mechanistic models often utilize basic weather data, such as temperature, rainfall, and humidity11,284. These data, primarily collected by meteorological institutions, can be limited by spatial coverage and temporal resolution. Satellite imagery presents a valuable supplement, offering real-time, high-resolution data for a wide range of climate and environmental variables47,285. Previous studies have shown the potential of integrating satellite data into disease surveillance and forecasting models, utilizing purely AI approaches286,287,288. This highlights the ability of AI to extract rich information from satellite data, enabling mechanistic models to build more comprehensive and dynamic representations of abiotic and biotic drivers of disease transmission.

Third, disease transmission is a complex process influenced by a confluence of epidemiological, biological, and socio-behavioral factors. However, existing integrated models focus predominantly on epidemiological aspects, often neglecting the intricate interplays between biological and socio-behavioral processes5,289. This omission constrains the models’ utility for in-depth analysis, long-term forecasting, and strategic decision-making. These observations underscore the need for broader data integration and the development of new analytical tools capable of generating detailed, timely, and high-resolution insights into disease dynamics and evolution, policy impacts, and population behavior. The successful application of AI techniques across related fields290,291,292, including sociology and biology, creates opportunities to bridge this gap. For instance, agent-based models, equipped with large language models to enable human-like reasoning and decision-making, have demonstrated remarkable success in replicating human behaviors293. Incorporating such advancements into agent-based models of infectious diseases, which often rely on rule-based methods, has the potential to improve the realism of simulations in capturing complex human behaviors during epidemics294. Furthermore, machine learning and deep learning methods, trained on the rapidly growing volume of biological data, exhibit great promise for forecasting viral evolutionary dynamics and understanding immunity landscapes295,296.

Fourth, our review reveals a research landscape that is currently concentrated on direct transmission (especially COVID-19). The dominance of COVID-19 research is likely attributable to the vast amounts of data collected and shared during the pandemic. However, such extensive data may not be available for less prevalent or local diseases, particularly in low- and middle-income countries with limited surveillance capabilities. The practicality and scalability of most methodological frameworks depend on the availability of abundant, high-quality data, which is often lacking. For instance, GNN-based methods require individual-level contact networks that are frequently unavailable. Therefore, it is crucial to invest in both enhanced disease surveillance and research to improve modeling techniques capable of handling incomplete and noisy data.

Moreover, this narrow focus on direct transmission raises concerns about the generalizability of existing integrated models to diseases with indirect transmission routes. While these models offer valuable insights into common modeling challenges faced across disease types, such as model calibration, parameter estimation, and capturing the non-linear impacts of interventions, their lack of consideration of more complex transmission mechanisms limits their full potential. For example, models for vector-borne diseases should account for vector population dynamics and the interactions between vector and human populations. Similarly, models for water-borne diseases require a detailed representation of environmental factors, such as sanitation infrastructure and contamination pathways, and of human behavioral factors, such as hygiene practices and access to clean water sources. Traditional mechanistic models, constrained by simplified assumptions and parameter uncertainties, face challenges in fully capturing these complexities. By leveraging AI’s ability to integrate diverse data, learn complex patterns, and generate accurate predictions, integrated models can potentially enhance our understanding of underlying transmission mechanisms while striking a balance between model simplicity and realism. This would enable the application of integrated models to a broader spectrum of infectious diseases, each with unique challenges requiring tailored AI solutions.

Finally, while this review identified a diverse range of methodological frameworks, many studies lacked rigorous evaluations of robustness, sensitivity, and generalizability, all of which are crucial for real-world application. Addressing these deficiencies in performance evaluation is essential to increase model transparency and reliability, ultimately fostering public trust and facilitating wider adoption of these models. Moreover, the proliferation of methodological frameworks raises important questions about the relative performance of integrated, purely AI, and traditional mechanistic models. While integrated models often outperform simple mechanistic models (e.g., classical SIR systems), they may not surpass more detailed mechanistic models that already capture sufficient realism for practical decision-making. Additionally, although traditional statistical models fall outside the scope of AI techniques considered in our review, their integration into mechanistic epidemiological models remains valuable38. However, comparisons between these integrated approaches, especially those incorporating traditional statistical models versus those based on ML/DL, are rarely discussed. For example, while LSTM networks, with their ability to capture temporal dependencies, were frequently used in AI-augmented epidemiological models to predict dynamic model parameters, traditional statistical models such as autoregressive integrated moving average (ARIMA) models may be more robust when faced with noisy and scarce time series data297. Therefore, future research is needed to rigorously examine the limitations and comparative utility of AI-assisted and well-established traditional disease transmission models to guide effective model selection.

In conclusion, AI techniques and mechanistic epidemiological models can synergistically enhance one another, leveraging the strength of AI methods to learn complex input-output relationships while incorporating the prior epidemiological knowledge embedded within mechanistic models. This scoping review systematically synthesizes the literature in this field and identifies diverse applications of integrated models, including disease forecasting, model calibration, and intervention optimization. While highlighting promising methodological advancements with practical potential, our review also reveals significant gaps in the current literature. These include the need for rigorous evaluation and comparison of methodologies, better incorporation of domain expertise in guiding the development of integrated models for policy-relevant decision-making, and expanded exploration of diverse datasets and underlying biological and socio-behavioral mechanisms. By addressing these challenges through interdisciplinary collaboration, we can unlock the full potential of AI to enrich the toolkit for epidemiological modeling, ultimately enhancing our ability to understand, prevent, mitigate, and respond to infectious disease outbreaks.

Methods

Overview

We conducted a scoping review to synthesize the literature on the integration of AI techniques, specifically ML and DL models, with mechanistic epidemiological models of infectious disease dynamics. Our methodology adhered to the framework proposed by the Joanna Briggs Institute (JBI)298, and reporting followed the guidelines for the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR)299. The protocol was registered prospectively with the Open Science Framework (registration https://doi.org/10.17605/OSF.IO/E8ZG7) on October 9, 2023. Our review focuses on three key questions: (i) In which areas of epidemiological modeling have integrated models been applied? (ii) What infectious diseases have been modeled using integrated models? (iii) How have AI techniques and mechanistic epidemiological models been integrated?

Eligibility criteria

Our review included studies that integrated AI techniques with mechanistic models of infectious disease, irrespective of the disease type or research objective. We included studies published in English that underwent peer review, whether in journals or in the proceedings of conferences, workshops, or symposiums. Seven types of mechanistic models, commonly used in epidemiological modeling, were considered eligible for this review: compartmental models, individual/agent-based models, metapopulation models, cellular automata, renewal equations, chain binomial models, and branching processes. Eligible AI techniques include all ML and DL models, excluding statistical models and fuzzy logic systems. We define “integrated models” as those combining mechanistic epidemiological models with AI techniques specifically for transmission analysis and disease intervention optimization. Studies that constructed AI techniques as alternative modeling methods, compared the performance of AI techniques with mechanistic models, or used mechanistic models solely to generate validation/testing datasets for AI techniques were not considered eligible under this definition. We excluded studies that did not use eligible AI techniques or mechanistic models, did not integrate AI techniques with mechanistic models, or were not original research (e.g., reviews, commentaries, and editorial notes). Studies lacking methodological details, numerical results, or accessible full texts were excluded. Detailed eligibility criteria and justifications are provided in Supplementary Appendix 1.

Search strategy

We conducted searches across six databases: PubMed, Embase (Ovid), Web of Science, Scopus, IEEE Xplore, and ACM Digital Library (the ACM Full-Text collection). Our search combined six categories of terms (“AI”, “Epidemic modeling”, “Modeling”, “Infectious disease”, “Infectious agents”, and “Spreading”) using Boolean operators. We employed three kinds of search strings: (“AI” AND “Epidemic modeling”), (“AI” AND “Modeling” AND “Infectious disease”), and (“AI” AND “Modeling” AND “Infectious agents” AND “Spreading”). Within each category, terms were linked by the Boolean operator “OR.” Restrictions on search functionality within IEEE Xplore and ACM Digital Library required multiple separate searches. We used the Polyglot tool300 to translate search syntax across databases. Details of the search strings for each database can be found in Supplementary Appendix 2.

Our initial literature search commenced on October 6, 2023. To ensure comprehensive results, we updated this search on November 7, 2023 and December 6, 2023 to include newly published literature. We also conducted manual searches of relevant journals and conference proceedings (e.g., Nature Machine Intelligence and the ACM SIGKDD Conference on Knowledge Discovery and Data Mining). Finally, we reviewed the references of all included studies to identify additional relevant studies.

Selection of sources of evidence

All retrieved records were initially imported into EndNote 20.1301 to remove duplicates before being transferred to Covidence302 for screening and processing. Three reviewers (YY, AP, and CB) screened titles and abstracts during the primary screening stage, using the ASReview software (version 1.5)303, an open-source, ML-assisted tool for active learning and systematic reviews, to aid in decision-making (Supplementary Appendix 3). Full-text screening was conducted independently by six reviewers (YY, AP, CB, DMS, RR, and AS), with any discrepancies resolved through team discussion, ensuring adherence to the inclusion and exclusion criteria throughout the review process.

Data charting and data items

We extracted data from studies satisfying the inclusion criteria using Covidence and exported the data to Google Forms for further analysis. We piloted a standardized data extraction form with two members of the team (Y.Y. and A.P.) on three studies to ensure it captured all necessary information; the form is available in Supplementary Appendix 4. The data extraction form was first completed by one reviewer (Y.Y.), and subsequently verified by another reviewer (A.P.) for correctness, with any discrepancies resolved through discussion.

Quality assessment

The quality of each included study was assessed through the 2022 journal impact factor (not applicable for proceedings of conferences, workshops, or symposiums), the h5-index of the publication venue (journal, conference, workshop, or symposium) from Google Scholar, and the number of citations listed in Web of Science as of November 13, 2024. The h-index measures both the productivity and citation impact of a publication venue, and the h5-index is the h-index for articles published in the last 5 complete years (2018–2022).

Synthesis of results

We tabulated and summarized the characteristics of eligible studies by grouping them based on their application area, methodological framework, and integration approach. The application area encompassed the specific use of integrated models, such as outbreak detection and forecasting. We also identified the advantages of integrated models and highlighted the most commonly used AI techniques.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.