Table 6. The search space of each classifier, given as the distributions over its hyperparameters (n.b. F denotes the feature count; for biased categorical distributions, each tuple (p_s, v) gives the sampling probability and the value assigned)

From: A principled machine learning framework improves accuracy of stage II colorectal cancer prognosis

| Classifier | Hyperparameter | Distribution | Values |
| --- | --- | --- | --- |
| SVM, linear kernel | C | Log-uniform | [ln(1e−5), ln(1e2)] |
| | Class weight | Categorical | Balanced or none |
| SVM, RBF kernel | C | Log-uniform | [ln(1e−5), ln(1e2)] |
| | Gamma | Log-uniform | [ln(1e−3), ln(1e3)] |
| | Class weight | Categorical | Balanced or none |
| LR | Type of penalty | Categorical | L1 or L2 |
| | C | Log-uniform | [ln(1e−5), ln(1e2)] |
| | Class weight | Categorical | Balanced or none |
| RF | Number of trees | Log-uniform integer | [10, 1000] |
| | Criterion | Categorical | Gini or entropy |
| | Maximum features | Biased categorical | (0.2, √F), (0.1, ln F), (0.1, F), (0.6, U(0, F)) |
| | Maximum depth | Biased categorical | (0.1, 2), (0.1, 3), (0.1, 4), (0.7, none) |
| | Bootstrap | Categorical | True or False |
| | Class weight | Categorical | Balanced or none |
| KNN | K | Log-uniform integer | [1, 50] |
| | Weights | Categorical | Uniform, or Euclidean distance |
| | Metric | Categorical | Balanced or none |
| | P | Categorical | Balanced or none |
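
The distributions in the table can be read as sampling rules for a random search over hyperparameters. The following is a minimal Python sketch under that reading, not the authors' code. It assumes that "log-uniform" means the natural logarithm of the value is uniform on the stated interval, that "log-uniform integer" rounds such a draw to the nearest integer, and that U(0, F) is an integer drawn uniformly from 1 to F. The function names and dictionary keys are hypothetical and chosen only for illustration.

```python
import math
import random


def log_uniform(rng, low, high):
    """Draw x so that ln(x) is uniform on [ln(low), ln(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def log_uniform_int(rng, low, high):
    """Log-uniform draw, rounded to an integer and clipped to [low, high]."""
    return min(high, max(low, round(log_uniform(rng, low, high))))


def biased_categorical(rng, pairs):
    """Pick v from (p_s, v) tuples, where p_s is the sampling probability."""
    probs, values = zip(*pairs)
    return rng.choices(values, weights=probs, k=1)[0]


def sample_svm_rbf(rng):
    """One configuration from the 'SVM, RBF kernel' rows."""
    return {
        "C": log_uniform(rng, 1e-5, 1e2),
        "gamma": log_uniform(rng, 1e-3, 1e3),
        "class_weight": rng.choice(["balanced", None]),
    }


def sample_rf(rng, n_features):
    """One configuration from the 'RF' rows; n_features is F in the table."""
    F = n_features
    max_features = biased_categorical(rng, [
        (0.2, max(1, int(math.sqrt(F)))),  # sqrt(F), truncated (assumption)
        (0.1, max(1, int(math.log(F)))),   # ln F, truncated (assumption)
        (0.1, F),
        (0.6, rng.randint(1, F)),          # U(0, F) read as an integer in [1, F]
    ])
    max_depth = biased_categorical(
        rng, [(0.1, 2), (0.1, 3), (0.1, 4), (0.7, None)]
    )
    return {
        "n_trees": log_uniform_int(rng, 10, 1000),
        "criterion": rng.choice(["gini", "entropy"]),
        "max_features": max_features,
        "max_depth": max_depth,
        "bootstrap": rng.choice([True, False]),
        "class_weight": rng.choice(["balanced", None]),
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same values
    print(sample_svm_rbf(rng))
    print(sample_rf(rng, n_features=30))
```

Sampling on the log scale spreads draws evenly across orders of magnitude, so small values of C or gamma are visited as often as large ones. The biased categorical rows put most of the probability on one option (an unrestricted depth, or a uniformly drawn feature count) while still trying the fixed alternatives occasionally.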