Scientific Reports
An evaluation of a Bayesian method to track outbreaks of known and novel influenza-like illnesses
  • Article
  • Open access
  • Published: 01 April 2026

  • John M. Aronis1,
  • Ye Ye2,
  • Jessi Espino1,
  • Marian G. Michaels3,
  • Harry Hochheiser1 &
  • …
  • Gregory F. Cooper1 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Diseases
  • Health care
  • Mathematics and computing
  • Medical research

Abstract

Tracking influenza and similar respiratory diseases is an important problem in public health and clinical medicine. The problem is complicated by the clinical similarity and co-occurrence of many of these illnesses. Additionally, recent history has shown that detecting new or reemergent diseases, such as COVID-19, is of paramount importance. This paper describes the design and testing of a system called ILI Tracker that can track known influenza-like illnesses and detect the presence of a novel disease, such as COVID-19, early and accurately. Using natural language processing, we extracted clinical findings from 2.9 million clinical records from five emergency departments. We constructed statistical models of six influenza-like illnesses for the first five years of the dataset and then used these models and a Bayesian filter to track the rates of these diseases in the five remaining years of data. We found significant daily correlation with the number of patients who were diagnosed with influenza and respiratory syncytial virus (RSV), but lower correlation with the other tracked diseases. We extended ILI Tracker to detect the presence of a novel, unmodeled disease. The extended system produced a strong signal near the beginning of the COVID-19 outbreak, in response to artificial injections of COVID-19 cases into case data streams, and for known outbreaks of influenza and RSV treated as novel, unmodeled diseases. Our results suggest that ILI Tracker can detect the presence of a novel, unmodeled disease in a timely fashion with few false alarms. The ILI Tracker system is freely available.
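
To make the idea of Bayesian rate tracking concrete, the following is a minimal illustrative sketch and not the ILI Tracker implementation. It assumes a single modeled disease, a uniform prior, and that each patient's findings have already been reduced to two strictly positive likelihoods: one given the disease and one given any other cause. The function names, the grid approximation, and the example likelihood values are all hypothetical. The sketch computes a posterior over the fraction of one day's patients who have the disease.

```python
import math


def posterior_over_prevalence(likelihoods, grid_size=101):
    """Grid posterior over theta, the fraction of patients with the disease.

    likelihoods: list of (p_findings_given_disease, p_findings_given_other)
    pairs, one per patient; values are assumed to be strictly positive.
    A uniform prior over theta is assumed (hypothetical choice).
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_post = []
    for theta in grid:
        total = 0.0
        for l_disease, l_other in likelihoods:
            # Each patient is a mixture: diseased with probability theta.
            mix = theta * l_disease + (1.0 - theta) * l_other
            total += math.log(mix) if mix > 0 else float("-inf")
        log_post.append(total)
    # Subtract the peak before exponentiating to avoid underflow.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    norm = sum(weights)
    return grid, [w / norm for w in weights]


def posterior_mean(grid, posterior):
    """Expected value of theta under the grid posterior."""
    return sum(t * p for t, p in zip(grid, posterior))


if __name__ == "__main__":
    # Made-up day: 6 patients whose findings fit the disease, 14 who do not.
    day = [(0.30, 0.02)] * 6 + [(0.01, 0.20)] * 14
    grid, post = posterior_over_prevalence(day)
    print(f"estimated fraction with disease: {posterior_mean(grid, post):.2f}")
```

Repeating this calculation for each day would give a time series of estimated rates, which could then be compared with counts of diagnosed patients. A full tracker would also need to handle several diseases at once and carry information forward from day to day, which this sketch does not attempt.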

Data availability

ILI Tracker is freely available for use in monitoring for disease outbreaks, including outbreaks of novel or reemergent diseases. The source code is available at https://github.com/RodsLaboratory/PDS. We have also made available a Docker container that embeds the ILI Tracker code within a more comprehensive outbreak detection system, which is available at https://github.com/rodslaboratory/pds-docker. The container includes the following components: (1) scripts to run MetaMap Lite to convert free-text ED reports into coded Concept Unique Identifiers (CUIs) from UMLS, (2) the CDS case-detection system, which performs Bayesian case diagnosis of modeled diseases, (3) the ILI Tracker program, (4) scripts that integrate all components into a complete processing pipeline, (5) a web application for running the entire system, and (6) simulated ED reports to use in conducting a test drive of the system. Documentation for the Docker container also includes instructions for processing a user’s own ED reports through the system. Real-time deployment in EDs and other high-volume healthcare settings could help clinicians and public health officials recognize the emergence of new disease outbreaks earlier. The source ED data used in this project are protected health information and therefore cannot be shared externally.

Funding

This research was supported by grant R01LM013509 (Automated Surveillance of Overlapping Outbreaks and New Outbreak Diseases) from the National Library of Medicine (NLM) of the U.S. National Institutes of Health (NIH). Harry Hochheiser and Jessi Espino also received support from NIGMS grant U24GM132013 (MIDAS Coordination Center) and NIGMS grant R24GM153920 (MIDAS Coordination Center). Ye Ye also received support from NLM grant R00LM013383 (Transfer Learning to Improve the Re-usability of Computable Biomedical Knowledge). Marian Michaels also received support from CDC grant U01IP001152 (New Vaccine Surveillance Network). Jessi Espino also received support from CDC grant 5U01IP001184 (Evaluating Respiratory Virus Vaccine Effectiveness in a Large, Diverse Healthcare System). This work was also supported by the National Institutes of Health through Grant Number UL1 TR001857.

Author information

Authors and Affiliations

  1. Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA

    John M. Aronis, Jessi Espino, Harry Hochheiser & Gregory F. Cooper

  2. School of Public Health and Emergency Management, Southern University of Science and Technology, Shenzhen, China

    Ye Ye

  3. Department of Pediatrics, University of Pittsburgh School of Medicine, UPMC Children’s Hospital of Pittsburgh, Pittsburgh, Pennsylvania, USA

    Marian G. Michaels

Contributions

All authors contributed to the writing of this paper. John Aronis provided conceptual formulation, mathematical modeling, and implementation and testing of the ILI Tracker system. Ye Ye provided conceptual formulation, mathematical modeling, and implementation and testing of patient modeling. Jessi Espino provided conceptual formulation, data management, and extraction of MetaMap findings. Marian Michaels provided conceptual formulation and clinical expertise. Harry Hochheiser provided conceptual formulation. Gregory Cooper provided conceptual formulation, clinical expertise, and mathematical modeling.

Corresponding author

Correspondence to John M. Aronis.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Aronis, J.M., Ye, Y., Espino, J. et al. An evaluation of a Bayesian method to track outbreaks of known and novel influenza-like illnesses. Sci Rep (2026). https://doi.org/10.1038/s41598-026-45934-y

  • Received: 18 August 2025

  • Accepted: 23 March 2026

  • Published: 01 April 2026

  • DOI: https://doi.org/10.1038/s41598-026-45934-y

Keywords

  • Biosurveillance
  • Outbreak
  • Disease modeling
  • Novel disease
  • Natural language processing