Table 2 XGBoost hyperparameters, search spaces explored, and optimised values after nested cross-validation (CV) with Bayesian hyperparameter optimisation (BHO)

From: XGBoost model for the quantitative assessment of stress corrosion cracking

| Hyperparameter^a | Function | Search space | Optimised value |
|---|---|---|---|
| eval_metric | Evaluates the model's performance during training using the root mean square error (RMSE). | – | "rmse" |
| booster | Specifies gradient-boosted trees as the base learner for the XGBoost model. | – | "gbtree" |
| objective | Indicates that the model is trained for a regression task with the squared error as the loss function. | – | "reg:squarederror" |
| reg_lambda | Controls the strength of the L2 (ridge) regularisation on the model weights, preventing overfitting. | 1e-8 to 10 | 114 |
| reg_alpha | Controls the strength of the L1 (lasso) regularisation on the model weights, promoting sparsity and feature selection. | 1e-8 to 10 | 9.831 |
| n_estimators | Determines the number of boosting rounds (i.e., trees) used in the model. | 100 to 10000 | 5376 |
| max_depth | Limits the maximum depth of each tree, controlling model complexity and preventing overfitting. | 3 to 20 | 12 |
| max_leaves | Limits the maximum number of leaves in each tree when grow_policy is set to "lossguide". | 3 to 20 | 7 |
| max_delta_step | Restricts the maximum change in the weight estimate of each leaf during a tree update, contributing to model robustness. | 1 to 20 | 13 |
| subsample | Specifies the fraction of training samples used for growing each tree, introducing randomness and reducing overfitting. | 0.6 to 0.9 | 0.882 |
| colsample_bytree | Specifies the fraction of features randomly sampled for each tree, further reducing overfitting. | 0.6 to 0.9 | 0.867 |
| colsample_bynode | Specifies the fraction of features randomly sampled for each split within a tree. | 0.6 to 0.9 | 0.668 |
| colsample_bylevel | Specifies the fraction of features randomly sampled for each level in a tree. | 0.6 to 0.9 | 0.727 |
| min_child_weight | Defines the minimum sum of instance weights needed in a child node, helping to control overfitting. | 1 to 10 | 1 |
| learning_rate | Determines the step size at each boosting iteration, affecting the model's convergence speed and accuracy. | 1e-8 to 0.1 | 134 |
| gamma | Sets the minimum loss reduction required to make a further partition on a leaf node, acting as a regularisation parameter. | 1e-8 to 1.0 | 1.84e-06 |
| grow_policy | Controls how new nodes are added to the tree, aiming to optimise the loss reduction with each split. | "depthwise" or "lossguide" | "lossguide" |
| num_parallel_tree | Determines the number of parallel trees constructed at each boosting iteration (used for boosted random forests). | 2 to 10 | 4 |

a. Hyperparameter names are reported as specified in the XGBoost Python library documentation129.
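The search spaces above can be sketched in code. The following is a minimal, illustrative sampler over the Table 2 spaces; it assumes log-uniform draws for the parameters whose ranges span several orders of magnitude (reg_lambda, reg_alpha, learning_rate, gamma) and uniform/integer draws elsewhere. The paper's actual optimiser was Bayesian (BHO) inside nested CV, which this plain random sampling does not reproduce — it only makes the spaces concrete.

```python
import math
import random

# Table 2 search spaces: (kind, low, high). "log" is an assumed log-uniform
# scale for ranges like 1e-8 to 10; the paper does not state the scale.
SEARCH_SPACE = {
    "reg_lambda":        ("log", 1e-8, 10.0),
    "reg_alpha":         ("log", 1e-8, 10.0),
    "n_estimators":      ("int", 100, 10000),
    "max_depth":         ("int", 3, 20),
    "max_leaves":        ("int", 3, 20),
    "max_delta_step":    ("int", 1, 20),
    "subsample":         ("float", 0.6, 0.9),
    "colsample_bytree":  ("float", 0.6, 0.9),
    "colsample_bynode":  ("float", 0.6, 0.9),
    "colsample_bylevel": ("float", 0.6, 0.9),
    "min_child_weight":  ("int", 1, 10),
    "learning_rate":     ("log", 1e-8, 0.1),
    "gamma":             ("log", 1e-8, 1.0),
    "num_parallel_tree": ("int", 2, 10),
}

# Parameters with no search space in Table 2 (held fixed).
FIXED = {
    "eval_metric": "rmse",
    "booster": "gbtree",
    "objective": "reg:squarederror",
}

def sample_params(rng: random.Random) -> dict:
    """Draw one candidate configuration from the Table 2 spaces."""
    params = dict(FIXED)
    for name, (kind, low, high) in SEARCH_SPACE.items():
        if kind == "log":      # log-uniform over [low, high]
            params[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        elif kind == "int":    # uniform integer, inclusive bounds
            params[name] = rng.randint(low, high)
        else:                  # uniform float
            params[name] = rng.uniform(low, high)
    params["grow_policy"] = rng.choice(["depthwise", "lossguide"])
    return params

candidate = sample_params(random.Random(0))
```

A sampled `candidate` dictionary can be passed directly as keyword arguments to `xgboost.XGBRegressor(**candidate)` when the library is available.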