Table 1 Confusion matrices of breast cancer subtype classification using the frequency of k-mers discovered by GECKO and the transcript per million values of the PAM50 gene set
From: GECKO is a genetic algorithm to classify and explore high throughput sequencing data
Classification with GECKO k-mers (rows: predicted class; columns: true class)

Predicted class | Basal | Her2 | LumA | LumB
|---|---|---|---|---|
Basal | 97.7 | 2.2 | 0 | 0
Her2 | 2 | 87.5 | 6.2 | 4.2
LumA | 1.5 | 1.5 | 92.3 | 4.6
LumB | 0 | 3.4 | 18.8 | 77.8

Classification with PAM50 TPM values (rows: predicted class; columns: true class)

Predicted class | Basal | Her2 | LumA | LumB
|---|---|---|---|---|
Basal | 86 | 5.2 | 5.5 | 3.3
Her2 | 15.3 | 60.6 | 3.6 | 20.6
LumA | 15.3 | 2.2 | 88.1 | 8.6
LumB | 5.9 | 15.4 | 36.5 | 42.2
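
In each matrix, the diagonal holds the cases where the predicted and true subtype agree. The following is a minimal Python sketch for comparing the two classifiers on that basis. The values are copied from the table; the helper names (`diagonal`, `mean_diagonal`, `row_sums`) are illustrative and not part of GECKO. It assumes the entries are percentages normalized per predicted class, so `row_sums` serves as a transcription check, and the unweighted mean of the diagonal is only a rough summary, not the overall accuracy.

```python
# Values transcribed from Table 1; rows = predicted class, columns = true class.
CLASSES = ["Basal", "Her2", "LumA", "LumB"]

GECKO = [
    [97.7, 2.2, 0.0, 0.0],
    [2.0, 87.5, 6.2, 4.2],
    [1.5, 1.5, 92.3, 4.6],
    [0.0, 3.4, 18.8, 77.8],
]

PAM50 = [
    [86.0, 5.2, 5.5, 3.3],
    [15.3, 60.6, 3.6, 20.6],
    [15.3, 2.2, 88.1, 8.6],
    [5.9, 15.4, 36.5, 42.2],
]


def diagonal(matrix):
    """Return the agreeing (predicted == true) entry for each class."""
    return {name: matrix[i][i] for i, name in enumerate(CLASSES)}


def mean_diagonal(matrix):
    """Unweighted mean of the diagonal entries (a rough summary only)."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


def row_sums(matrix):
    """Sum of each predicted-class row, rounded to one decimal place.

    Assumption: rows are normalized to about 100; a row far from 100
    suggests a transcription problem in the values above.
    """
    return {name: round(sum(row), 1) for name, row in zip(CLASSES, matrix)}


if __name__ == "__main__":
    for label, matrix in (("GECKO k-mers", GECKO), ("PAM50 TPM", PAM50)):
        print(label)
        print("  diagonal:     ", diagonal(matrix))
        print("  mean diagonal:", round(mean_diagonal(matrix), 1))
        print("  row sums:     ", row_sums(matrix))
```

With the values as transcribed, every row sums to roughly 100 except the PAM50 LumA row, so that row is worth checking against the original publication before drawing conclusions from it.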