Table 5 Default values used and remarks.
| Model | Default values used | Remarks |
|---|---|---|
| Decision tree | `random_state=42`, `max_depth=None`, `min_samples_split=2`, `min_samples_leaf=1` | No maximum depth (trees split until leaves are pure). Larger depths risk overfitting; smaller depths can underfit. |
| LR | `fit_intercept=True`, `normalize=False` (deprecated; default behavior) | Assumes a linear relationship; no direct regularization. Sensitive to multicollinearity and outliers. |
| RR | `alpha=1.0`, `fit_intercept=True` | L2 regularization shrinks coefficients; helps with multicollinearity and reduces overfitting. |
| Lasso regression | `alpha=1.0`, `fit_intercept=True` | L1 regularization encourages sparsity by driving the coefficients of less important features to zero. |
| SVR | `kernel='rbf'`, `C=1.0`, `epsilon=0.1`, `gamma='scale'` | Learns a function within an ε-tube. Sensitive to `C`, `epsilon`, and `gamma`; may require careful feature scaling and tuning for best results. |
| RF | `random_state=42`, `n_estimators=100`, `max_depth=None` | Ensemble of decision trees built via bagging with random feature subsets. Generally robust to outliers and able to handle high-dimensional data. |
| GB | `random_state=42`, `n_estimators=100`, `learning_rate=0.1`, `max_depth=3` | Sequentially adds weak learners to minimize the loss. Can overfit if `n_estimators` is large without regularization. |
| AdaBoost | `random_state=42`, `n_estimators=50`, `learning_rate=1.0` | Boosts performance by reweighting mis-predicted samples. Works well with shallow base estimators (e.g., short decision trees). |
| XGBoost | `random_state=42`, `n_estimators=100`, `learning_rate=0.1`, `max_depth=6`, `subsample=1.0`, `colsample_bytree=1.0` | Efficient gradient-boosting library with built-in regularization and tree pruning. Can overfit if parameters are not tuned. |
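
The following is a minimal sketch of how the models in Table 5 could be instantiated with the listed values, assuming the scikit-learn and xgboost Python packages and a regression task; the dictionary name `models` and the training arrays `X_train`/`y_train` are illustrative, not taken from the original study. Parameters not listed in the table are left at library defaults.

```python
# Sketch only: scikit-learn / xgboost regressors configured with the values from Table 5.
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.svm import SVR
from sklearn.ensemble import (
    RandomForestRegressor,
    GradientBoostingRegressor,
    AdaBoostRegressor,
)
from xgboost import XGBRegressor

models = {
    "Decision tree": DecisionTreeRegressor(
        random_state=42, max_depth=None, min_samples_split=2, min_samples_leaf=1
    ),
    # `normalize` is omitted here because it is deprecated in recent scikit-learn releases;
    # the table notes it corresponds to the default behavior.
    "LR": LinearRegression(fit_intercept=True),
    "RR": Ridge(alpha=1.0, fit_intercept=True),
    "Lasso regression": Lasso(alpha=1.0, fit_intercept=True),
    "SVR": SVR(kernel="rbf", C=1.0, epsilon=0.1, gamma="scale"),
    "RF": RandomForestRegressor(random_state=42, n_estimators=100, max_depth=None),
    "GB": GradientBoostingRegressor(
        random_state=42, n_estimators=100, learning_rate=0.1, max_depth=3
    ),
    "AdaBoost": AdaBoostRegressor(random_state=42, n_estimators=50, learning_rate=1.0),
    "XGBoost": XGBRegressor(
        random_state=42,
        n_estimators=100,
        learning_rate=0.1,
        max_depth=6,
        subsample=1.0,
        colsample_bytree=1.0,
    ),
}

# Hypothetical usage: fit each configured model on a training set (X_train, y_train
# are placeholders and not defined in this sketch).
# for name, model in models.items():
#     model.fit(X_train, y_train)
```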