Scientific Reports

Table 7 Hyperparameters for random forest in Arabic text classification.

From: Quantum computing and machine learning for Arabic language sentiment classification in social media

Hyperparameter	Value	Description
n_estimators	100	The number of decision trees in the Random Forest ensemble. A larger number of trees generally reduces the variance of the ensemble's predictions and increases robustness to noise in the data, at the cost of longer training and prediction time
max_depth	None	The maximum depth allowed for each decision tree in the ensemble. A deeper tree can capture more complex relationships in the data but is more prone to overfitting. Setting it to None lets each tree expand until all leaves are pure or until the remaining nodes contain fewer samples than min_samples_split
min_samples_split	2	The minimum number of samples required to split an internal node during the construction of a decision tree. It sets the threshold for further partitioning of nodes; the value 2 is the least restrictive setting, and a higher value limits overfitting by avoiding splits on nodes with too few samples
min_samples_leaf	1	The minimum number of samples required at a leaf node. A split is considered only if it leaves at least this many samples in each resulting branch. The value 1 imposes no restriction, and a higher value limits overfitting by avoiding leaf nodes with too few instances
max_features	"auto"	The number of features to consider when looking for the best split at each tree node. Assuming scikit-learn's RandomForestClassifier, "auto" is equivalent to "sqrt", which uses the square root of the total number of features (the "auto" alias was removed in scikit-learn 1.3); "log2" uses the base-2 logarithm of the total number of features, and None uses all features. Selecting a smaller value can reduce the correlation among trees and enhance diversity
bootstrap	True	A Boolean value indicating whether bootstrap samples are used when building decision trees. Setting it to True trains each tree on a random sample drawn with replacement from the training set, which introduces randomness and diversity into the ensemble
class_weight	None	An optional parameter that assigns weights to the classes. With None, all classes have equal weight. If the dataset is imbalanced, setting it to "balanced" automatically sets the weights inversely proportional to the class frequencies, which gives more weight to minority classes
random_state	None	A seed value used by the random number generator. Using the same integer seed makes results reproducible. With None, a different random state is used on each execution, so repeated runs produce different ensemble models
n_jobs	None	The number of parallel jobs to run for fitting and predicting. Specifying None uses one job, while -1 uses all available processors, potentially speeding up the training and prediction process
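
The settings in Table 7 could be passed to a random forest roughly as follows. This is a minimal sketch that assumes scikit-learn's RandomForestClassifier, which the parameter names suggest but the table does not state. The synthetic data from make_classification is a placeholder for the paper's Arabic text features, "sqrt" stands in for "auto" (removed in scikit-learn 1.3), and random_state is fixed to 0 only so the sketch is repeatable, whereas the table uses None.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder data (assumption): not the paper's dataset or feature extraction.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters as listed in Table 7, with two labeled deviations.
rf_params = {
    "n_estimators": 100,
    "max_depth": None,
    "min_samples_split": 2,
    "min_samples_leaf": 1,
    "max_features": "sqrt",  # assumption: replaces "auto", removed in scikit-learn 1.3
    "bootstrap": True,
    "class_weight": None,
    "random_state": 0,  # assumption: the table uses None; fixed here for repeatability
    "n_jobs": None,
}

clf = RandomForestClassifier(**rf_params)
clf.fit(X_train, y_train)

# Mean accuracy on the held-out split of the placeholder data.
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Keeping the values in a dictionary makes it easy to log the exact configuration next to the results or to vary a single entry, such as setting class_weight to "balanced".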

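The "balanced" option of class_weight described in the table can be worked through by hand. The sketch below assumes the formula that scikit-learn documents for this option, n_samples / (n_classes * class_count); the function name balanced_class_weights and the 80/20 label split are hypothetical and chosen only for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return weights inversely proportional to class frequencies.

    Hypothetical helper using the formula n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Illustrative imbalanced label set (assumption): 80 positive, 20 negative.
labels = ["positive"] * 80 + ["negative"] * 20
weights = balanced_class_weights(labels)
print(weights)  # {'positive': 0.625, 'negative': 2.5}
```

In this example the minority class receives four times the weight of the majority class, matching the 4:1 ratio of their frequencies.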