npj Digital Medicine
CODE-II: a large-scale dataset for artificial intelligence in ECG analysis
  • Article
  • Open access
  • Published: 26 May 2026

  • Petrus E. O. G. B. Abreu1,
  • Gabriela M. M. Paixão1,
  • Jiawei Li2,
  • Paulo R. Gomes3,
  • Peter W. Macfarlane4,
  • Ana C. S. Oliveira1,
  • Vinícius T. Carvalho1,
  • Thomas B. Schön2,
  • Antonio Luiz P. Ribeiro1,3 &
  • Antônio H. Ribeiro2 

npj Digital Medicine (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Cardiology
  • Computational biology and bioinformatics
  • Diseases
  • Health care
  • Medical research

Abstract

Data-driven methods for electrocardiogram (ECG) interpretation are rapidly progressing. Large datasets have enabled advances in artificial intelligence (AI)-based ECG analysis, yet limitations in annotation quality, size, and scope remain major challenges. Here we present CODE-II, a large-scale real-world dataset of 2,735,269 12-lead ECGs from 2,093,807 adult patients collected by the Telehealth Network of Minas Gerais (TNMG), Brazil. Each exam was annotated using standardized diagnostic criteria and reviewed by cardiologists. A defining feature of CODE-II is a set of 66 clinically meaningful diagnostic classes, developed with cardiologist input and routinely used in telehealth practice. We additionally provide CODE-II-open, a publicly available subset of 15,000 patients, and CODE-II-test, a non-overlapping set of 8,475 exams reviewed by multiple cardiologists for blinded evaluation. A neural network pre-trained on CODE-II achieved superior transfer performance on external benchmarks (PTB-XL and CPSC 2018) and outperformed alternatives trained on larger datasets.

Acknowledgements

The authors thank the Telehealth Network of Minas Gerais for its long-term support for data acquisition and clinical validation, and the cardiologists and healthcare professionals involved in generating and reviewing the electrocardiographic reports. The authors also acknowledge the institutional support from the Brazilian Health Ministry, participating universities, and research centers, which enabled the development and execution of this study. This work was supported by the National Council for Scientific and Technological Development (CNPq), grants 310790/2021-2, 409604/2022-4, 443121/2023-0 and 408659/2024-6; the Minas Gerais State Foundation for Research Support (FAPEMIG), grants PPE-00030-21 and RED-00192-23; and the Secretary for Information and Digital Health (SEIDIGI) of the Brazilian Ministry of Health (TEDs 22 and 114/2024). A.H.R. is partially supported by the eSSENCE strategic collaborative research program. A.L.P.R. is supported by the Innovation Center on Artificial Intelligence for Health (CIIA-S) and the Institute for Health Assessment and Translation for Chronic and Neglected Diseases of High Relevance (IATS-CARE). P.E.O.G.B.A. is supported by a CNPq scholarship (Brazil), grants 317219/2023-5, 302087/2024-9 and 201639/2024-6. T.B.S. is partially supported by the Kjell och Märta Beijer Foundation. The funders had no role in the study design, data collection, analysis, interpretation of the results, manuscript preparation, or the decision to submit the manuscript for publication.

Funding

Open access funding provided by Uppsala University.

Author information

Authors and Affiliations

  1. Faculdade de Medicina, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil

    Petrus E. O. G. B. Abreu, Gabriela M. M. Paixão, Ana C. S. Oliveira, Vinícius T. Carvalho & Antonio Luiz P. Ribeiro

  2. Uppsala University, Uppsala, Sweden

    Jiawei Li, Thomas B. Schön & Antônio H. Ribeiro

  3. Telehealth Center, Hospital das Clínicas, UFMG, Belo Horizonte, Brazil

    Paulo R. Gomes & Antonio Luiz P. Ribeiro

  4. University of Glasgow, Glasgow, UK

    Peter W. Macfarlane

Corresponding authors

Correspondence to Petrus E. O. G. B. Abreu, Antonio Luiz P. Ribeiro or Antônio H. Ribeiro.

Ethics declarations

Competing interests

A.H.R. holds equity options in Einthoven Tecnologia LTDA and serves as a technical advisor for the company. The other authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

  • Supplementary Information (PDF)
  • Supplementary Data 1 (XLSX)
  • Supplementary Data 2 (XLSX)
  • Supplementary Data 3 (XLSX)
  • Supplementary Data 4 (XLSX)
  • Supplementary Data 5 (XLSX)
  • Supplementary Data 6 (XLSX)
  • Supplementary Data 7 (XLSX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Abreu, P.E.O.G.B., Paixão, G.M.M., Li, J. et al. CODE-II: a large-scale dataset for artificial intelligence in ECG analysis. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02704-4

  • Received: 12 December 2025

  • Accepted: 23 April 2026

  • Published: 26 May 2026

  • DOI: https://doi.org/10.1038/s41746-026-02704-4

npj Digital Medicine (npj Digit. Med.)

ISSN 2398-6352 (online)
