Table 1 Overview of the data splits used in the study.
From: Detection of circulating tumor cells by means of machine learning using Smart-Seq2 sequencing
| | Training and validation | Test set I | Test set II |
|---|---|---|---|
| | CTC dataset | CTC dataset | Primary tumor dataset |
| Cancer cells | 130 + 8 clusters | 132 + 6 clusters | 1,534 |
| Blood cells | 38 | 43 | 27,620 |
| Source | 50% of GSE109761 | 50% of GSE109761 | GSE118389 + E-HCAD-4 |