Abstract
The Response Evaluation Criteria in Solid Tumors (RECIST 1.1) protocol is the gold standard for assessing treatment response in oncological clinical trials and routine practice. It requires radiologists to review and select appropriate target lesions and perform precise diameter measurements, making the process labor-intensive and variable. Artificial Intelligence (AI) holds great promise for automating this workflow, but progress is hindered by the lack of public datasets with comprehensive lesion annotations and RECIST-compliant measurements. We address this gap by presenting a dataset of 1,246 manually segmented lesions from 58 CT scans of 22 cancer patients treated at the Clinical Hospital of the University of Chile (HCUCH). All cases were evaluated under RECIST 1.1, with diameter measurements reported for 82 target lesions. This resource supports diverse applications, including validating automated RECIST tools, applying radiomics to study metastatic heterogeneity, benchmarking segmentation algorithms, and advancing foundation models in medical imaging. By including data from a Latin American institution, this dataset also promotes global representation in the development of generalizable medical AI tools.
Data availability
The dataset is available on Zenodo (https://zenodo.org/records/17788162) under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (ref. 15).
Code availability
All code used for data conversion, windowing normalization and statistical analysis is available at https://github.com/robertorojasp06/recist-dataset. Code for fine-tuning and evaluating nnUNet is available at https://github.com/robertorojasp06/nnUNet. Code for experiments using MedSAM is available at https://github.com/robertorojasp06/MedSAM.
All code for dataset preprocessing and characterization was implemented in Python 3.10 on Ubuntu 22.04.5 LTS. The Python packages used are listed in the environment.yml file of the recist-dataset repository (https://github.com/robertorojasp06/recist-dataset). Experiments involving nnUNet and MedSAM used the package versions specified in their respective GitHub repositories. All experiments were run on a machine equipped with an AMD Ryzen 7 5700G processor and an NVIDIA GeForce RTX 4090 GPU.
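As an illustration of what windowing normalization of CT data typically involves, the following is a minimal sketch and not the code from the recist-dataset repository. It assumes the input is already expressed in Hounsfield units (HU); the function name `window_normalize` and the default soft-tissue window (level 40 HU, width 400 HU) are assumptions chosen for the example, not values taken from the paper.

```python
import numpy as np


def window_normalize(hu, level=40.0, width=400.0):
    """Clip HU values to an intensity window and rescale them to [0, 1].

    `level` and `width` are illustrative defaults (a common soft-tissue
    window); they are assumptions, not the settings used by the authors.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    lower = level - width / 2.0   # lowest HU value kept by the window
    upper = level + width / 2.0   # highest HU value kept by the window
    clipped = np.clip(np.asarray(hu, dtype=np.float32), lower, upper)
    # Linear rescale: `lower` maps to 0.0 and `upper` maps to 1.0.
    return (clipped - lower) / (upper - lower)


# Usage on a toy array standing in for a CT slice (values in HU).
toy_slice = np.array([[-1000.0, -160.0], [40.0, 240.0]])
normalized = window_normalize(toy_slice)
```

Values below the window saturate at 0 and values above it saturate at 1, so a different window (for example a lung or bone window) only requires changing `level` and `width`.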
References
Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).
Lewandowska, A. M. et al. Environmental risk factors for cancer-review paper. Ann. Agric. Environ. Med. 26, 1–7 (2018).
Iannessi, A. et al. RECIST 1.1 assessments variability: a systematic pictorial review of blinded double reads. Insights Imaging 15, 199 (2024).
Bucho, T. M. T. et al. Reproducing RECIST lesion selection via machine learning: insights into intra and inter-radiologist variation. Eur. J. Radiol. Open 12, 100562 (2024).
Welsh, J. L. et al. Comparison of response evaluation criteria in solid tumors with volumetric measurements for estimation of tumor burden in pancreatic adenocarcinoma and hepatocellular carcinoma. Am. J. Surg. 204, 580–585 (2012).
Beaumont, H. et al. Radiology workflow for RECIST assessment in clinical trials: Can we reconcile time-efficiency and quality? Eur. J. Radiol. 118, 257–263 (2019).
Bucho, T. M. T. et al. How does target lesion selection affect RECIST? A computer simulation study. Invest. Radiol. 59, 465–471 (2024).
van der Loo, I. et al. Measurement variability of radiologists when measuring brain tumors. Eur. J. Radiol. 183, 111874 (2025).
Azad, R. et al. Medical image segmentation review: The success of U-Net. IEEE Trans. Pattern Anal. Mach. Intell. (2024).
Rayed, M. E. et al. Deep learning for medical image segmentation: State-of-the-art advancements and challenges. Inform. Med. Unlocked 43, 101504 (2024).
Moawad, A. W. et al. Multimodality annotated hepatocellular carcinoma data set including pre- and post-TACE with imaging segmentation. Sci. Data 10, 33 (2023).
Ma, J. et al. Segment anything in medical images. Nat. Commun. 15, 654 (2024).
Cryptography Community. Fernet (symmetric encryption), version 44.0.0. Cryptography. https://cryptography.io/en/latest/fernet (2025).
Vásquez-Venegas, C. et al. Human-in-the-loop—A Deep Learning Strategy in Combination with a Patient-Specific Gaussian Mixture Model Leads to the Fast Characterization of Volumetric Ground-Glass Opacity and Consolidation in the Computed Tomography Scans of COVID-19 Patients. J. Clin. Med. 13, 5231 (2024).
Rojas-Pizarro, R. et al. A CT Dataset with RECIST Measurements and Comprehensive Segmentation Masks for Tumors and Lymph Nodes, version 1.1.0. Zenodo https://doi.org/10.5281/zenodo.17788162 (2025).
Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18, 203–211 (2021).
Antonelli, M. et al. The medical segmentation decathlon. Nat. Commun. 13, 4128 (2022).
Acknowledgements
This research was supported by the Chilean National Commission for Scientific and Technological Research (FONDEQUIP EQM210020) and by projects FONDEF ID23I10337 (all authors), NCN2024_068 (SH), FONDECYT 1211988 (SH, GP), CNRS IRL 2807 (SH), CORFO 16CTTS-66390 (SH), MINEDUC grant RED 21994 (SH, MM), BASAL FB210005 (SH), DAAD 57519605 (SH), and Centro CTI220001 (SH).
Author information
Contributions
R.R.P., C.V.V., S.H. and G.C.V. conceived and designed the study. N.S., P.G.B., F.C.F. and C.O.C. annotated the training data. G.P. and M.F.E. annotated the test data and reviewed the annotations of training data. G.R., F.S. and N.P. integrated MedSAM into 3D Slicer to assist annotation of training data. R.R.P., C.V.V. and G.C.V. conceived and conducted the DL experiments for technical validation. F.B. conducted experiments of automated diameter length measurements. R.R.P., C.V.V., G.P., S.H., G.C.V. and M.M.M. analysed the results. R.R.P. prepared the draft of the work. All authors reviewed and revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Rojas-Pizarro, R., Vásquez-Venegas, C., Pereira, G. et al. A CT Dataset with RECIST Measurements and Comprehensive Segmentation Masks for Tumors and Lymph Nodes. Sci Data (2026). https://doi.org/10.1038/s41597-026-06597-6
DOI: https://doi.org/10.1038/s41597-026-06597-6


