Scientific Data
A CT Dataset with RECIST Measurements and Comprehensive Segmentation Masks for Tumors and Lymph Nodes
  • Data Descriptor
  • Open access
  • Published: 20 January 2026

  • Roberto Rojas-Pizarro (ORCID: orcid.org/0009-0008-3564-3774) 1,2,
  • Constanza Vásquez-Venegas 1,3,
  • Gonzalo Pereira 4,
  • María F. Eyssautier 4,
  • Felipe Bravo-Bahamóndez (ORCID: orcid.org/0009-0003-1960-9123) 5,
  • Nicolás Sanhueza 4,
  • Paulina Gallardo-Badilla 4,
  • Francisca Caro-Flores 4,
  • Camila Ormeño-Candia 4,
  • Felipe Santander 1,
  • Nicolás Pérez 1,
  • María M. Molina 5,
  • Gonzalo Rojas 3,
  • Steffen Härtel 1,5,6,7,8 &
  • Guillermo Cabrera-Vives 3,9

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Cancer imaging
  • Data publication and archiving
  • Machine learning

Abstract

The Response Evaluation Criteria in Solid Tumors (RECIST 1.1) protocol is the gold standard for assessing treatment response in oncological clinical trials and routine practice. It requires radiologists to review and select appropriate target lesions and perform precise diameter measurements, making the process labor-intensive and variable. Artificial Intelligence (AI) holds great promise for automating this workflow, but progress is hindered by the lack of public datasets with comprehensive lesion annotations and RECIST-compliant measurements. We address this gap by presenting a dataset of 1,246 manually segmented lesions from 58 CT scans of 22 cancer patients treated at the Clinical Hospital of the University of Chile (HCUCH). All cases were evaluated under RECIST 1.1, with diameter measurements reported for 82 target lesions. This resource supports diverse applications, including validating automated RECIST tools, applying radiomics to study metastatic heterogeneity, benchmarking segmentation algorithms, and advancing foundation models in medical imaging. By including data from a Latin American institution, this dataset also promotes global representation in the development of generalizable medical AI tools.
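
To illustrate the kind of diameter measurement that RECIST 1.1 calls for, the following is a minimal Python sketch, not the authors' method, that estimates the longest in-plane diameter of a lesion from a 2D binary segmentation mask. The function name `longest_diameter_mm`, the `spacing_mm` parameter, and the brute-force pairwise search are assumptions made for this example. Distances are taken between pixel centres, so the result slightly underestimates the true extent, and the short-axis measurement that RECIST 1.1 uses for lymph nodes is not covered.

```python
import numpy as np


def longest_diameter_mm(mask_2d, spacing_mm=(1.0, 1.0)):
    """Longest distance (mm) between any two foreground pixel centres.

    mask_2d    : 2D array, non-zero where the lesion is segmented.
    spacing_mm : (row_spacing, col_spacing) in millimetres (assumed input).
    """
    # (row, col) indices of every foreground pixel
    coords = np.argwhere(np.asarray(mask_2d) > 0).astype(float)
    if len(coords) < 2:
        return 0.0
    # Convert pixel indices to physical coordinates in mm
    pts = coords * np.asarray(spacing_mm, dtype=float)
    # Brute-force pairwise distances: O(n^2) memory, fine for small lesions
    diffs = pts[:, None, :] - pts[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).max())


# Usage: a 1 x 5 pixel lesion with 2 mm column spacing
demo = np.zeros((5, 8), dtype=np.uint8)
demo[2, 1:6] = 1
print(longest_diameter_mm(demo, spacing_mm=(1.0, 2.0)))
```

For large masks, restricting the search to boundary or convex-hull pixels would reduce the cost without changing the result.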

Data availability

The dataset is available on Zenodo (https://zenodo.org/records/17788162) under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (ref. 15).

Code availability

All code used for data conversion, windowing normalization and statistical analysis is available at https://github.com/robertorojasp06/recist-dataset. Code for fine-tuning and evaluating nnUNet is available at https://github.com/robertorojasp06/nnUNet. Code for experiments using MedSAM is available at https://github.com/robertorojasp06/MedSAM.

All code for dataset preprocessing and characterization was implemented in Python 3.10 on Ubuntu 22.04.5 LTS. The Python packages used are listed in the environment.yml file of the recist-dataset repository (https://github.com/robertorojasp06/recist-dataset). Experiments involving nnUNet and MedSAM used the package versions specified in their respective GitHub repositories. All experiments were run on a machine equipped with an AMD Ryzen 7 5700G processor and an NVIDIA GeForce RTX 4090 GPU.
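
The windowing normalization mentioned above can be sketched as follows. This is a generic illustration of intensity windowing for CT, not the code from the recist-dataset repository: the function name `window_normalize` and the default window (center 40 HU, width 400 HU, a common soft-tissue setting) are assumptions chosen for the example.

```python
import numpy as np


def window_normalize(hu, center=40.0, width=400.0):
    """Clip Hounsfield units to a display window and rescale to [0, 1].

    center and width are hypothetical defaults (soft-tissue window);
    the values used by the dataset authors may differ.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    lo = center - width / 2.0   # lower bound of the window in HU
    hi = center + width / 2.0   # upper bound of the window in HU
    clipped = np.clip(np.asarray(hu, dtype=np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)


# Usage: air, window bounds, window center, and dense bone
print(window_normalize([-1000, -160, 40, 240, 3000]))
```

Values below the window map to 0 and values above it map to 1, so the full output range is spent on the tissue densities of interest.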

References

  1. Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).

  2. Lewandowska, A. M. et al. Environmental risk factors for cancer-review paper. Ann. Agric. Environ. Med. 26, 1–7 (2018).

  3. Iannessi, A. et al. RECIST 1.1 assessments variability: a systematic pictorial review of blinded double reads. Insights Imaging 15, 199 (2024).

  4. Bucho, T. M. T. et al. Reproducing RECIST lesion selection via machine learning: insights into intra and inter-radiologist variation. Eur. J. Radiol. Open 12, 100562 (2024).

  5. Welsh, J. L. et al. Comparison of response evaluation criteria in solid tumors with volumetric measurements for estimation of tumor burden in pancreatic adenocarcinoma and hepatocellular carcinoma. Am. J. Surg. 204, 580–585 (2012).

  6. Beaumont, H. et al. Radiology workflow for RECIST assessment in clinical trials: Can we reconcile time-efficiency and quality? Eur. J. Radiol. 118, 257–263 (2019).

  7. Bucho, T. M. T. et al. How does target lesion selection affect RECIST? A computer simulation study. Invest. Radiol. 59, 465–471 (2024).

  8. van der Loo, I. et al. Measurement variability of radiologists when measuring brain tumors. Eur. J. Radiol. 183, 111874 (2025).

  9. Azad, R. et al. Medical image segmentation review: The success of U-Net. IEEE Trans. Pattern Anal. Mach. Intell. (2024).

  10. Rayed, M. E. et al. Deep learning for medical image segmentation: State-of-the-art advancements and challenges. Inform. Med. Unlocked 43, 101504 (2024).

  11. Moawad, A. W. et al. Multimodality annotated hepatocellular carcinoma data set including pre- and post-TACE with imaging segmentation. Sci. Data 10, 33 (2023).

  12. Ma, J. et al. Segment anything in medical images. Nat. Commun. 15, 654 (2024).

  13. Cryptography Community. Fernet (symmetric encryption), version 44.0.0. Cryptography. https://cryptography.io/en/latest/fernet (2025).

  14. Vásquez-Venegas, C. et al. Human-in-the-loop—A Deep Learning Strategy in Combination with a Patient-Specific Gaussian Mixture Model Leads to the Fast Characterization of Volumetric Ground-Glass Opacity and Consolidation in the Computed Tomography Scans of COVID-19 Patients. J. Clin. Med. 13, 5231 (2024).

  15. Rojas-Pizarro, R. et al. A CT Dataset with RECIST Measurements and Comprehensive Segmentation Masks for Tumors and Lymph Nodes, version 1.1.0. Zenodo https://doi.org/10.5281/zenodo.17788162 (2025).

  16. Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18, 203–211 (2021).

  17. Antonelli, M. et al. The medical segmentation decathlon. Nat. Commun. 13, 4128 (2022).

Acknowledgements

This research was supported by the Chilean National Commission for Scientific and Technological Research (FONDEQUIP EQM210020) and by projects FONDEF ID23I10337 (all authors), NCN2024_068 (SH), FONDECYT 1211988 (SH, GP), CNRS IRL 2807 (SH), CORFO 16CTTS-66390 (SH), MINEDUC grant RED 21994 (SH, MM), BASAL FB210005 (SH), DAAD 57519605 (SH) and Centro CTI220001 (SH).

Author information

Authors and Affiliations

  1. Laboratory for Scientific Image Analysis SCIAN-Lab, Interdisciplinary Nucleus for Biology and Genetics, Institute of Biomedical Sciences ICBM, Faculty of Medicine, University of Chile, Av. Independencia 1027, Santiago, 8380453, Chile

    Roberto Rojas-Pizarro, Constanza Vásquez-Venegas, Felipe Santander, Nicolás Pérez & Steffen Härtel

  2. Department of Medical Technology, Faculty of Medicine, University of Chile, Av. Independencia 1027, Santiago, 8380453, Chile

    Roberto Rojas-Pizarro

  3. Department of Computer Science, Faculty of Engineering, Universidad de Concepción, Edmundo Larenas 219, Concepción, 4030000, Chile

    Constanza Vásquez-Venegas, Gonzalo Rojas & Guillermo Cabrera-Vives

  4. Radiology Department, Clinical Hospital University of Chile, University of Chile, Dr. Carlos Lorca Tobar 999, Santiago, 8380420, Chile

    Gonzalo Pereira, María F. Eyssautier, Nicolás Sanhueza, Paulina Gallardo-Badilla, Francisca Caro-Flores & Camila Ormeño-Candia

  5. Centro de Informática Médica y Telemedicina CIMT, Institute of Biomedical Sciences ICBM, Faculty of Medicine, University of Chile, Av. Independencia 1027, Santiago, 8380453, Chile

    Felipe Bravo-Bahamóndez, María M. Molina & Steffen Härtel

  6. Biomedical Neuroscience Institute BNI, Faculty of Medicine, University of Chile, Av. Independencia 1027, Santiago, 8380453, Chile

    Steffen Härtel

  7. National Center for Health Information Systems CENS, Av. Independencia 1027, Santiago, 8380453, Chile

    Steffen Härtel

  8. Centro de Modelamiento Matemático, Universidad de Chile, Beauchef 851, Casilla 170-3, Santiago, Chile

    Steffen Härtel

  9. Center for Data and Artificial Intelligence, Universidad de Concepción, Concepción, Chile

    Guillermo Cabrera-Vives

Contributions

R.R.P., C.V.V., S.H. and G.C.V. conceived and designed the study. N.S., P.G.B., F.C.F. and C.O.C. annotated the training data. G.P. and M.F.E. annotated the test data and reviewed the annotations of the training data. G.R., F.S. and N.P. integrated MedSAM into 3D Slicer to assist annotation of the training data. R.R.P., C.V.V. and G.C.V. conceived and conducted the DL experiments for technical validation. F.B. conducted the experiments on automated diameter measurement. R.R.P., C.V.V., G.P., S.H., G.C.V. and M.M.M. analysed the results. R.R.P. prepared the draft of the manuscript. All authors reviewed and revised the manuscript.

Corresponding authors

Correspondence to Steffen Härtel or Guillermo Cabrera-Vives.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Material

Open datasets with annotations of tumors

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Rojas-Pizarro, R., Vásquez-Venegas, C., Pereira, G. et al. A CT Dataset with RECIST Measurements and Comprehensive Segmentation Masks for Tumors and Lymph Nodes. Sci Data (2026). https://doi.org/10.1038/s41597-026-06597-6

  • Received: 11 September 2025

  • Accepted: 08 January 2026

  • Published: 20 January 2026

  • DOI: https://doi.org/10.1038/s41597-026-06597-6

Scientific Data (Sci Data)

ISSN 2052-4463 (online)
