Abstract
Infectious disease threats to individual and public health are numerous, varied and frequently unexpected. Artificial intelligence (AI) and related technologies, which are already supporting human decision making in economics, medicine and social science, have the potential to transform the scope and power of infectious disease epidemiology. Here we consider the application to infectious disease modelling of AI systems that combine machine learning, computational statistics, information retrieval and data science. We first outline how recent advances in AI can accelerate breakthroughs in answering key epidemiological questions and we discuss specific AI methods that can be applied to routinely collected infectious disease surveillance data. Second, we elaborate on the social context of AI for infectious disease epidemiology, including issues such as explainability, safety, accountability and ethics. Finally, we summarize some limitations of AI applications in this field and provide recommendations for how infectious disease epidemiology can most effectively harness current and future developments in AI.
Acknowledgements
M.U.G.K. acknowledges funding from The Rockefeller Foundation (PC-2022-POP-005), Google.org, the Oxford Martin School Programmes in Pandemic Genomics and Digital Pandemic Preparedness (also C.F. and L.F.), European Union’s Horizon Europe programme projects MOOD (874850) and E4Warning (101086640), the John Fell Fund, a Branco Weiss Fellowship, Wellcome Trust grants 303666/Z/23/Z, 226052/Z/22/Z and 228186/Z/23/Z (also H.T.), United Kingdom Research and Innovation (APP8583) and the Medical Research Foundation (MRF-RG-ICCH-2022-100069, also H.T.), UK International Development (301542-403), the Bill & Melinda Gates Foundation (INV-063472) and Novo Nordisk Foundation (NNF24OC0094346, also H.T.); E.C.H. from a National Health and Medical Research Council (NHMRC) Investigator award and by AIR@InnoHK administered by the Innovation and Technology Commission, Hong Kong Special Administrative Region, China; and S. Bhatt from the MRC Centre for Global Infectious Disease Analysis (reference MR/X020258/1), funded by the UK Medical Research Council (MRC). This UK funded award is carried out in the framework of the Global Health EDCTP3 Joint Undertaking. S. Bhatt is funded by the National Institute for Health and Care Research (NIHR) Health Protection Research Unit in Modelling and Health Economics, a partnership between UK Health Security Agency, Imperial College London and LSHTM (grant code NIHR200908). S. Bhatt acknowledges support from the Novo Nordisk Foundation through The Novo Nordisk Young Investigator Award (NNF20OC0059309). S. Bhatt acknowledges the Danish National Research Foundation (DNRF160) through the chair grant, which also supports N. Scheidwasser, M.P.K. and J.L.C.-S. S. Bhatt acknowledges support from The Eric and Wendy Schmidt Fund For Strategic Innovation through the Schmidt Polymath Award (G-22-63345). O.R. acknowledges funding support from the Bill & Melinda Gates Foundation (OPP117509, OPP1084362), EPSRC (EP/X038440/1), NIH (R01AI155080) and the Moderna Charitable Foundation; and E.J.T. from the US National Institutes of Health grant UM1TR004407. M.A.S. is supported in part through the US National Institutes of Health grants R01 AI153044 and R01 AI162611. E.S. acknowledges support in part by the AI2050 program at Schmidt Futures (grant G-22-64476). S. Bajaj is supported by the Clarendon Scholarship and St Edmund Hall College, University of Oxford and NERC DTP (grant number NE/S007474/1). T.S. acknowledges funding from ETH Zürich; F.D. from the NIH U24ES035309; C.F. from the Moh Family Foundation; M.G.-R. from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 945719); D.A.D. from the Novo Nordisk Foundation through The Novo Nordisk Data Science Emerging Investigator Award (NNF23OC0084647); and S.F. and M.Z. from the Engineering and Physical Sciences Research Council (EP/V002910/2). J.L.-H.T. is supported by a Yeotown Scholarship from New College, University of Oxford. M.J.P.’s research on infectious disease ethics is supported by the Moh Family Foundation and Wellcome (221719). B.S. acknowledges funding by the Machine Learning Cluster of Excellence, EXC number 2064/1. H.T. acknowledges support to CERI from grants from the South African Medical Research Council (SAMRC) with funds received from the National Department of Health, the Rockefeller Foundation (HTH 017), the Abbott Pandemic Defense Coalition (APDC), the National Institute of Health USA (U01 AI151698) for the United World Antivirus Research Network (UWARN), the INFORM Africa project through IHVN (U54 TW012041) and the eLwazi Open Data Science Platform and Coordinating Center (U2CEB032224), the SAMRC South African mRNA Vaccine Consortium (SAMVAC), the European Union supported by the Global Health EDCTP3 Joint Undertaking and its members, the European Union’s Horizon Europe Research and Innovation Programme (101046041), the Health Emergency Preparedness and Response Umbrella Program (HEPR Program), managed by the World Bank Group (TF0B8412), the GIZ commissioned by the Government of the Federal Republic of Germany, the UK’s Medical Research Foundation (MRF-RG-ICCH-2022-100069) and the Wellcome Trust for the Global.health project (228186/Z/23/Z). C.M. is supported by a studentship from the UK’s Engineering and Physical Sciences Research Council; and C.A.D. by the UK National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emerging and Zoonotic Infections in partnership with Public Health England (grant number: HPRU200907). The contents of this publication are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission, NIH, NIHR, UK Health Security Agency or the Department of Health and Social Care, or the other funders.
Author information
Contributions
S. Bhatt and M.U.G.K. conceptualized the study with input from O.G.P. and S.C.; J.L.-H.T. made the figures with input from S. Bhatt and M.U.G.K.; M.U.G.K. and S. Bhatt wrote the original draft with input from O.G.P. and S.C. All of the authors contributed to sections and reviewed, edited and approved the manuscript. S. Bhatt and M.U.G.K. administered the project.
Ethics declarations
Competing interests
S. Bhatt is a paid member of the Academic Council of the Schmidt Science Fellows programme outside the scope of this work. This affiliation is unrelated to the submitted work, and the programme does not stand to benefit from this publication. M.A.S. receives grants from the US National Institutes of Health within the scope of this work, and grants and contracts from the US Food and Drug Administration, the US Department of Veterans Affairs, and Johnson and Johnson, all outside the scope of this work. C.F. is a member of two committees that advise the UK Department of Health on emerging epidemics, namely NERVTAG and SPI-M. The other authors declare no competing interests.
Peer review
Peer review information
Nature thanks Peter Klimek, Amalio Telenti and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kraemer, M.U.G., Tsui, J.LH., Chang, S.Y. et al. Artificial intelligence for modelling infectious disease epidemics. Nature 638, 623–635 (2025). https://doi.org/10.1038/s41586-024-08564-w
DOI: https://doi.org/10.1038/s41586-024-08564-w