Table 1 Confusion matrices of breast cancer subtype classification using the frequency of k-mers discovered by GECKO and the transcripts per million (TPM) values of the PAM50 gene set

From: GECKO is a genetic algorithm to classify and explore high throughput sequencing data

 

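Classification with GECKO k-mers (rows: predicted class; columns: true class)

| Predicted class | Basal | Her2 | LumA | LumB |
|-----------------|-------|------|------|------|
| Basal           | 97.7  | 2.2  | 0    | 0    |
| Her2            | 2     | 87.5 | 6.2  | 4.2  |
| LumA            | 1.5   | 1.5  | 92.3 | 4.6  |
| LumB            | 0     | 3.4  | 18.8 | 77.8 |

Classification with PAM50 TPM values (rows: predicted class; columns: true class)

| Predicted class | Basal | Her2 | LumA | LumB |
|-----------------|-------|------|------|------|
| Basal           | 86    | 5.2  | 5.5  | 3.3  |
| Her2            | 15.3  | 60.6 | 3.6  | 20.6 |
| LumA            | 15.3  | 2.2  | 88.1 | 8.6  |
| LumB            | 5.9   | 15.4 | 36.5 | 42.2 |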
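To compare the two classifiers cell by cell, the values of Table 1 can be loaded into a short script. The sketch below is illustrative only and is not part of GECKO: the names `CLASSES`, `GECKO`, `PAM50`, `diagonal` and `mean_diagonal` are hypothetical, and it assumes that the entries are percentages and that the diagonal of each matrix (predicted class equal to true class) holds the correctly classified fraction of each subtype. The unweighted mean of the diagonal is a rough summary computed here for convenience; it is not an overall accuracy, since the number of samples per subtype is not given in the table.

```python
# Class order shared by the rows (predicted class) and columns (true class).
CLASSES = ["Basal", "Her2", "LumA", "LumB"]

# Values copied from Table 1 (assumed to be percentages).
# Rows = predicted class, columns = true class.
GECKO = [
    [97.7, 2.2, 0.0, 0.0],
    [2.0, 87.5, 6.2, 4.2],
    [1.5, 1.5, 92.3, 4.6],
    [0.0, 3.4, 18.8, 77.8],
]
PAM50 = [
    [86.0, 5.2, 5.5, 3.3],
    [15.3, 60.6, 3.6, 20.6],
    [15.3, 2.2, 88.1, 8.6],
    [5.9, 15.4, 36.5, 42.2],
]


def diagonal(matrix):
    """Return the diagonal entry (predicted == true) for each class."""
    return {name: matrix[i][i] for i, name in enumerate(CLASSES)}


def mean_diagonal(matrix):
    """Unweighted mean of the diagonal entries (not an overall accuracy)."""
    return sum(matrix[i][i] for i in range(len(CLASSES))) / len(CLASSES)


if __name__ == "__main__":
    for label, matrix in (("GECKO k-mers", GECKO), ("PAM50 TPM", PAM50)):
        print(label, diagonal(matrix), round(mean_diagonal(matrix), 1))
```

Reading the diagonals side by side shows where the two feature sets differ most: the LumB entry is 77.8 with GECKO k-mers and 42.2 with PAM50 TPM values, and the Her2 entry is 87.5 versus 60.6.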