Table 1 Overview of the data splits used in the study.

From: Detection of circulating tumor cells by means of machine learning using Smart-Seq2 sequencing

 

| | Training and validation | Test set I | Test set II |
|---|---|---|---|
| | CTC dataset | CTC dataset | Primary tumor dataset |
| Cancer cells | 130 + 8 clusters | 132 + 6 clusters | 1,534 |
| Blood cells | 38 | 43 | 27,620 |
| Source | 50% of GSE109761 | 50% of GSE109761 | GSE118389 + E-HCAD-4 |