Table 4 Hyperparameter search spaces for each classification model used in GridSearchCV. (Abbreviations: kNN = k-nearest neighbors, LR = logistic regression, NB = naïve Bayes, DT = decision tree, SVM = support vector machine, RF = random forest, and XGBoost = extreme gradient boosting. "Default parameters" indicates a model for which no hyperparameters were tuned.)

Models	Scikit-learn parameter names	Hyperparameter grid
kNN	n_neighbors	[3, 5, 7, 9, 11, 13, 15]
LR	C	[0.001, 0.01, 0.1, 1, 10, 100]
NB	–	Default parameters (GaussianNB)
DT	max_depth	[5, 10, 15, 20, None]
	min_samples_split	[2, 5, 10, 15]
SVM	C	[0.01, 0.1, 1, 10, 100]
	kernel	['linear', 'rbf', 'poly']
	degree	[2, 3, 4, 5]
RF	n_estimators	[50, 100, 200, 500]
	max_depth	[5, 10, 15, 20, None]
	min_samples_split	[2, 5, 10, 15]
XGBoost	n_estimators	[50, 100, 200, 500]
	max_depth	[3, 6, 9]
	learning_rate	[0.01, 0.1, 0.2]
	subsample	[0.6, 0.8, 1.0]
	colsample_bytree	[0.6, 0.8, 1.0]
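
As a rough illustration of how the grids in Table 4 might be expressed in code, the sketch below transcribes them into plain Python dictionaries and counts the candidate combinations per model. The names `PARAM_GRIDS`, `n_candidates`, and `build_search` are hypothetical, and the `cv=5` and `scoring="accuracy"` defaults are assumptions chosen for illustration; the table itself does not state the cross-validation scheme or the scoring metric.

```python
from math import prod

# Grids transcribed from Table 4. Keys are the scikit-learn / XGBoost
# parameter names listed in the table; the dictionary name is hypothetical.
PARAM_GRIDS = {
    "kNN": {"n_neighbors": [3, 5, 7, 9, 11, 13, 15]},
    "LR": {"C": [0.001, 0.01, 0.1, 1, 10, 100]},
    "NB": {},  # GaussianNB kept at its defaults, so there is nothing to search
    "DT": {
        "max_depth": [5, 10, 15, 20, None],
        "min_samples_split": [2, 5, 10, 15],
    },
    "SVM": {
        "C": [0.01, 0.1, 1, 10, 100],
        "kernel": ["linear", "rbf", "poly"],
        "degree": [2, 3, 4, 5],
    },
    "RF": {
        "n_estimators": [50, 100, 200, 500],
        "max_depth": [5, 10, 15, 20, None],
        "min_samples_split": [2, 5, 10, 15],
    },
    "XGBoost": {
        "n_estimators": [50, 100, 200, 500],
        "max_depth": [3, 6, 9],
        "learning_rate": [0.01, 0.1, 0.2],
        "subsample": [0.6, 0.8, 1.0],
        "colsample_bytree": [0.6, 0.8, 1.0],
    },
}


def n_candidates(grid):
    """Number of combinations in the full Cartesian product of a grid."""
    return prod(len(values) for values in grid.values())


def build_search(name, estimator, cv=5, scoring="accuracy"):
    """Wrap an estimator in GridSearchCV with the Table 4 grid for `name`.

    `cv` and `scoring` are illustrative assumptions, not values from the table.
    scikit-learn is imported lazily so the grids can be inspected without it.
    """
    from sklearn.model_selection import GridSearchCV

    return GridSearchCV(estimator, PARAM_GRIDS[name], cv=cv, scoring=scoring)


if __name__ == "__main__":
    for name, grid in PARAM_GRIDS.items():
        print(f"{name}: {n_candidates(grid)} candidate(s)")
```

A call such as `build_search("kNN", KNeighborsClassifier()).fit(X, y)` would then run the search for one model, assuming `X` and `y` hold the training data. One design point worth noting: scikit-learn's `SVC` ignores `degree` for every kernel except `'poly'`, so a full Cartesian product over the SVM row refits identical linear and RBF models for each degree value. Passing a list of two dictionaries (one for `'linear'` and `'rbf'` without `degree`, one for `'poly'` with it) would avoid the duplicates; the table does not say which form was used.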
