Table 2 XGBoost hyperparameters, search spaces explored, and optimised values after the nested CV with BHO
From: XGBoost model for the quantitative assessment of stress corrosion cracking
Hyperparameter | Function | Search Space | Optimised Value |
---|---|---|---|
eval_metric | Evaluates the model’s performance during training using the root mean squared error (RMSE). | - | “rmse” |
booster | Specifies the use of gradient boosted trees as the base learner for the XGBoost model. | - | “gbtree” |
objective | Indicates that the model is trained for a regression task using the squared error as the loss function. | - | “reg:squarederror” |
reg_lambda | Controls the strength of the L2 (ridge) penalty on the leaf weights, preventing overfitting. | 1e-8 to 10 | 114 |
reg_alpha | Controls the strength of the L1 (lasso) penalty on the leaf weights, promoting sparsity and feature selection. | 1e-8 to 10 | 9.831 |
n_estimators | Determines the number of boosting rounds (i.e., trees) used in the model. | 100 to 10000 | 5376 |
max_depth | Limits the maximum depth of each tree, controlling model complexity and preventing overfitting. | 3 to 20 | 12 |
max_leaves | Limits the maximum number of leaves in each tree when grow policy is set to “lossguide”. | 3 to 20 | 7 |
max_delta_step | Restricts the maximum change in the weight estimation of each leaf during a tree update, contributing to model robustness. | 1 to 20 | 13 |
subsample | Specifies the fraction of training samples used for growing each tree, introducing randomness and reducing overfitting. | 0.6 to 0.9 | 0.882 |
colsample_bytree | Specifies the fraction of features randomly sampled for each tree, further reducing overfitting. | 0.6 to 0.9 | 0.867 |
colsample_bynode | Specifies the fraction of features randomly sampled for each split within a tree. | 0.6 to 0.9 | 0.668 |
colsample_bylevel | Specifies the fraction of features randomly sampled for each level in a tree. | 0.6 to 0.9 | 0.727 |
min_child_weight | Defines the minimum sum of instance weights needed in a child node, helping to control overfitting. | 1 to 10 | 1 |
learning_rate | Determines the step size at each boosting iteration while moving towards the minimum of the loss function, affecting the model’s convergence speed and accuracy. | 1e-8 to 0.1 | 134 |
gamma | Sets the minimum reduction in the loss function required to make a further partition on a leaf node of the tree, acting as a regularization parameter. | 1e-8 to 1.0 | 1.84e-06 |
grow_policy | Controls how new nodes are added to the tree, aiming to optimise the loss reduction with each split. | “depthwise” or “lossguide” | “lossguide” |
num_parallel_tree | Determines the number of parallel trees constructed during each iteration in boosted random forests. | 2 to 10 | 4 |
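
As a rough illustration of how the table could be used in practice, the sketch below transcribes the search spaces and optimised values into plain Python and checks that each numeric value lies inside its stated range. It is not taken from the article. The names `NUMERIC`, `NON_NUMERIC`, `out_of_range` and `build_params` are hypothetical, and the four sampling fractions are taken as 0.882, 0.867, 0.668 and 0.727, which is an assumption about the intended decimals. The values for `reg_lambda` and `learning_rate` are entered as 114 and 134 and left uncorrected, since the intended decimal placement cannot be inferred from the table alone.

```python
# Search spaces and optimised values transcribed from the table.
# Each entry maps a hyperparameter name to (low, high, optimised value).
NUMERIC = {
    "reg_lambda": (1e-8, 10, 114),          # as transcribed; outside its range
    "reg_alpha": (1e-8, 10, 9.831),
    "n_estimators": (100, 10000, 5376),
    "max_depth": (3, 20, 12),
    "max_leaves": (3, 20, 7),
    "max_delta_step": (1, 20, 13),
    "subsample": (0.6, 0.9, 0.882),         # assumed decimal placement
    "colsample_bytree": (0.6, 0.9, 0.867),  # assumed decimal placement
    "colsample_bynode": (0.6, 0.9, 0.668),  # assumed decimal placement
    "colsample_bylevel": (0.6, 0.9, 0.727), # assumed decimal placement
    "min_child_weight": (1, 10, 1),
    "learning_rate": (1e-8, 0.1, 134),      # as transcribed; outside its range
    "gamma": (1e-8, 1.0, 1.84e-06),
    "num_parallel_tree": (2, 10, 4),
}

# Settings given as strings in the table (three fixed, one categorical).
NON_NUMERIC = {
    "eval_metric": "rmse",
    "booster": "gbtree",
    "objective": "reg:squarederror",
    "grow_policy": "lossguide",
}


def out_of_range(spaces):
    """Return the sorted names whose optimised value lies outside [low, high]."""
    return sorted(
        name
        for name, (low, high, value) in spaces.items()
        if not low <= value <= high
    )


def build_params(spaces, non_numeric):
    """Collect a keyword dictionary, leaving out any out-of-range values."""
    flagged_names = set(out_of_range(spaces))
    params = dict(non_numeric)
    for name, (_, _, value) in spaces.items():
        if name not in flagged_names:
            params[name] = value
    return params


flagged = out_of_range(NUMERIC)
params = build_params(NUMERIC, NON_NUMERIC)
print("Out of range, check against the article:", flagged)
print("Usable settings:", len(params))
```

If the `xgboost` package is available, a dictionary like `params` could be passed as keyword arguments to `xgboost.XGBRegressor(**params)`. The two flagged entries would first need to be confirmed against the original article rather than guessed.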