Table 1 Parameters used for models when comparing to EBMs

From: StratoMod: predicting sequencing and variant calling errors with interpretable machine learning

| Model | Implementation | Hyperparameter levels |
| --- | --- | --- |
| Decision tree | rpart (R) | cost_complexity: 0.00001, 0.0001, 0.001, 0.01, 0.1 |
| Logistic regression | glmnet (R) | penalty: 0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 10; mixture: 0, 0.5, 1 |
| Random forest | ranger (R) | mtry: 1, 4, 7; trees: 500, 1000, 2000 |
| XGBoost | xgboost (Python, GPU-accelerated) | max_depth: 3, 6, 9; n_estimators: 100, 500, 1000; gamma: 1, 10, 100 |

  1. All models (including the EBMs) were trained on a compute cluster with 512 GB of memory, two 20-core Intel Xeon E5-2698 v4 CPUs, and eight Nvidia Tesla V100 GPUs per node. Each job was allowed 3 days of compute time. Of all the algorithms used (including the EBMs), only XGBoost was able to take advantage of GPU acceleration.
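For concreteness, the sketch below shows how the XGBoost grid from Table 1 might be searched. This is an illustrative reconstruction, not the paper's code: the placeholder data `X` and `y`, the 5-fold cross-validation, and the ROC-AUC scoring are assumptions, and `device="cuda"` (xgboost >= 2.0) stands in for the GPU acceleration noted in footnote 1.

```python
# Illustrative grid search over the XGBoost hyperparameter levels in Table 1.
# X, y, the CV scheme, and the scoring metric are placeholders; the paper
# does not specify the search strategy.
import numpy as np
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 7))      # placeholder feature matrix
y = rng.integers(0, 2, size=1000)   # placeholder binary labels

param_grid = {
    "max_depth": [3, 6, 9],          # levels from Table 1
    "n_estimators": [100, 500, 1000],
    "gamma": [1, 10, 100],
}

search = GridSearchCV(
    # device="cuda" requests GPU training (xgboost >= 2.0);
    # drop it to run the same search on CPU.
    XGBClassifier(tree_method="hist", device="cuda", eval_metric="logloss"),
    param_grid,
    scoring="roc_auc",  # assumed metric for illustration
    cv=5,               # assumed fold count
)
search.fit(X, y)
print(search.best_params_)
```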