Scientific Data
An annotated dataset of Gram stains from positive blood cultures
  • Data Descriptor
  • Open access
  • Published: 23 January 2026

  • Qiaolian Yi1,
  • Xiaoyan Gou1,
  • Renyuan Zhu1,
  • Xiuli Xie1,
  • Mengting Hu1,
  • Xing Wang1,
  • Tai’e Wang1,
  • Kaiwen Xu1 &
  • …
  • Ying-Chun Xu1 

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Laboratory techniques and procedures
  • Medical imaging

Abstract

Bloodstream infections (BSIs) cause high morbidity and mortality across all age groups and require urgent, accurate intervention. Gram stain interpretation of positive blood cultures (PBCs) is crucial for the early diagnosis of BSIs, yet this manual process is labor-intensive, time-consuming, and highly operator-dependent. Artificial intelligence (AI)-assisted microscopic interpretation of stained smears could benefit microbiology diagnostics. To support the automated identification of blood-culture Gram stains, this study introduces a dataset of Gram-stained smears collected in clinical practice. The dataset includes 505 microscopic images, covering up to 57 species associated with BSIs, with a total of 7528 annotations. The annotations are categorized by staining characteristics and morphological features into cocci, bacilli, and fungi. We trained and validated an object detection model based on the YOLOv10 architecture on this dataset to automatically localize and classify these morphological categories in microscopic images. The publicly released dataset will support the development of AI tools that automatically interpret Gram stains from PBCs for routine clinical application.

Data availability

The dataset is available at the Figshare repository (ref. 16).

Code availability

The annotation tool used for dataset labelling is publicly available on GitHub at https://github.com/jsbroks/coco-annotator/. The customizable Image Annotation Tools used for the technical check of the dataset labelling are available from https://github.com/KeyOfSpectator/ImageAnnotationTools, including the Double Check IoU Annotation Tool and the COCO Json Merge/Split Tool.
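The page does not describe how the Double Check IoU Annotation Tool works internally. As a rough illustration of the kind of check its name suggests, the sketch below computes the standard intersection-over-union (IoU) of two bounding boxes in the COCO `[x, y, width, height]` convention and flags boxes from one annotation pass that have no counterpart in a second pass. The function names `bbox_iou` and `unmatched_boxes` and the 0.5 threshold are assumptions made for this example, not part of the linked repository.

```python
from typing import List, Sequence


def bbox_iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """Intersection-over-union of two COCO-style boxes [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def unmatched_boxes(
    first: List[Sequence[float]],
    second: List[Sequence[float]],
    threshold: float = 0.5,  # assumed cut-off, chosen only for illustration
) -> List[Sequence[float]]:
    """Boxes in `first` whose IoU with every box in `second` is below `threshold`."""
    return [a for a in first if all(bbox_iou(a, b) < threshold for b in second)]


# Toy boxes, not taken from the dataset.
pass_one = [[0, 0, 10, 10], [50, 50, 10, 10]]
pass_two = [[1, 1, 10, 10]]
needs_review = unmatched_boxes(pass_one, pass_two)
```

In this toy case the first box overlaps its counterpart closely, so only the second box would be flagged for review. A real double check would probably also compare category labels (cocci, bacilli, fungi), which this sketch leaves out.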

References

  1. Jin, L. et al. Clinical Profile, Prognostic Factors, and Outcome Prediction in Hospitalized Patients With Bloodstream Infection: Results From a 10-Year Prospective Multicenter Study. Front Med (Lausanne) 8 (2021).

  2. Dubourg, G., Raoult, D. & Fenollar, F. Emerging methodologies for pathogen identification in bloodstream infections: an update. Expert Rev Mol Diagn 19, 161–173 (2019).

  3. Adrie, C. et al. Attributable mortality of ICU-acquired bloodstream infections: Impact of the source, causative micro-organism, resistance profile and antimicrobial therapy. J Infect 74, 131–141 (2017).

  4. Ikuta, K. S. et al. Global mortality associated with 33 bacterial pathogens in 2019: a systematic analysis for the Global Burden of Disease Study 2019. The Lancet 400, 2221–2248 (2022).

  5. Timsit, J. F., Ruppé, E., Barbier, F., Tabah, A. & Bassetti, M. Bloodstream infections in critically ill patients: an expert statement. Intensive Care Med 46, 266–284 (2020).

  6. Cecconi, M., Evans, L., Levy, M. & Rhodes, A. Sepsis and septic shock. The Lancet 392, 75–87 (2018).

  7. Kern, W. V. & Rieg, S. Burden of bacterial bloodstream infection—a brief update on epidemiology and significance of multidrug-resistant pathogens. Clinical Microbiology and Infection 26, 151–157 (2020).

  8. Pien, B. C. et al. The clinical and prognostic importance of positive blood cultures in adults. American Journal of Medicine 123, 819–828 (2010).

  9. Lamy, B., Sundqvist, M. & Idelevich, E. A. Bloodstream infections – Standard and progress in pathogen diagnostics. Clinical Microbiology and Infection 26, 142–150 (2020).

  10. Ito, H. et al. The role of Gram stain in reducing broad-spectrum antibiotic use: A systematic literature review and meta-analysis. Infect Dis Now 53, 104764 (2023).

  11. Thomson, R. B. One small step for the Gram stain, one giant leap for clinical microbiology. J Clin Microbiol 54, 1416–1417 (2016).

  12. Smith, K. P. & Kirby, J. E. Image analysis and artificial intelligence in infectious disease diagnostics. Clinical Microbiology and Infection 26, 1318–1323 (2020).

  13. Smith, K. P., Kang, A. D. & Kirby, J. E. Automated interpretation of blood culture Gram stains by use of a deep convolutional neural network. J Clin Microbiol 56 (2018).

  14. Walter, C. et al. Performance evaluation of machine-assisted interpretation of Gram stains from positive blood cultures. J Clin Microbiol 62 (2024).

  15. Makrai, L. et al. Annotated dataset for deep-learning-based bacterial colony detection. Sci Data 10, 497 (2023).

  16. Yi, Q. et al. An annotated dataset of Gram stains from positive blood cultures. Figshare. https://doi.org/10.6084/m9.figshare.26004610

Acknowledgements

We thank Mr. Shichun Feng for his help with technical validation. This work was supported by the Noncommunicable Chronic Diseases-National Science and Technology Major Project [No. 2024ZD0532800], the Peking Union Medical College Hospital Talent Cultivation Program-Category D [No. UHB12289], and the Young Elite Scientists Sponsorship Program of the Beijing High Innovation Plan.

Author information

Authors and Affiliations

  1. Department of Laboratory Medicine, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China

    Qiaolian Yi, Xiaoyan Gou, Renyuan Zhu, Xiuli Xie, Mengting Hu, Xing Wang, Tai’e Wang, Kaiwen Xu & Ying-Chun Xu

Contributions

Q.Y. and Y.X. conceived the concept of the work. X.G., M.H., X.W. and T.W. performed the microorganism identification. Q.Y., X.G., R.Z., K.X. and X.X. acquired and annotated the digital images. R.Z., K.X. and X.X. curated the digital images. Q.Y. drafted the manuscript and performed the technical validation. Q.Y. and Y.X. had overarching administrative responsibility for the project. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Qiaolian Yi or Ying-Chun Xu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Yi, Q., Gou, X., Zhu, R. et al. An annotated dataset of Gram stains from positive blood cultures. Sci Data (2026). https://doi.org/10.1038/s41597-026-06651-3

  • Received: 05 May 2025

  • Accepted: 19 January 2026

  • Published: 23 January 2026

  • DOI: https://doi.org/10.1038/s41597-026-06651-3

Scientific Data (Sci Data)

ISSN 2052-4463 (online)
