Table 4 Classification techniques.
Classifiers | Description |
|---|---|
Naive Bayes Algorithm (NB)33 | Notable for multi-class prediction: this algorithm estimates the probability of each class of the target variable. In this work, we use three variants of the naive Bayes algorithm to build models for predicting web service anti-patterns, i.e., Gaussian Naive Bayes (GNB), Multinomial Naive Bayes (MNB), and Bernoulli Naive Bayes (BNB) (see the scikit-learn sketch after this table). |
Decision Tree (DT)34 | Estimates a target variable from several independent variables using a tree-structured decision model. |
Logistic Regression Analysis (LOGR)35 | A statistical approach for analyzing a dataset in which one or more independent variables are used to predict the outcome of a dependent variable. |
Support Vector Classifier (SVC)36 | Functions as a non-probabilistic binary linear classifier that assigns input data to one of two categories, which makes it a good choice for building a classification model. SVCs with three different kernels, i.e., linear (SVC-L), polynomial (SVC-P), and radial (SVC-R), are employed for training models in this work. |
Least Square Support Vector Machine (LSSVM)37 | This algorithm minimizes the sum of squared errors in its objective function. It is a supervised learning method that analyzes data to recognize patterns. LSSVM with linear (LSSVM-Lin), polynomial (LSSVM-Poly), and radial basis function (LSSVM-RBF) kernels is used for training the models (see the LSSVM sketch after this table). |
Extreme Learning Machine (ELM)38 | A learning procedure for single-hidden-layer feed-forward neural networks. Its key feature is the random generation of hidden nodes: hidden-node parameters are assigned at random, independently of the training samples. The anti-pattern detection models were trained with ELM using linear (ELM-Lin), polynomial (ELM-Poly), and radial basis function (ELM-RBF) kernels. |
Weighted Extreme Learning Machine (WELM)39 | When dealing with imbalanced data, this approach assigns larger weights to the minority class and smaller weights to the majority class. WELM selects a weighting scheme based on the class distribution; the weights are inversely proportional to the number of training samples in each class. We applied four different kernel functions (Sigmoid, Radbas, Tribas, and Sine) in WELM to further improve its performance (see the ELM/WELM sketch after this table). |
Multi-Layer Perceptron (MLP)40 | MLP can learn a non-linear function approximator for either classification or regression from a collection of features and a target. It differs from logistic regression in that one or more non-linear layers, called hidden layers, can sit between the input and output layers. |
MLP with Stochastic Gradient Descent (MLP-SGD) | While training an MLP, the weights must be updated to reduce the output error; SGD is employed for this purpose. The SGD technique searches for minima in the error space using the first-order derivative of the total error function. |
MLP with Quasi-Newton Method (MLP-LNF) | A fast optimization approach that can be used as an alternative to conjugate gradient methods. It requires calculating second-order derivatives of the total error function for each component of the gradient vector. |
MLP with Stochastic Gradient with Adaptive Learning Rate Method (MLP-ADAM) | If the learning rate (\(\alpha\)) is set too small, the training procedure takes excessive time to converge. Although it is theoretically possible to estimate a good value of \(\alpha\) before training, it is practically impossible to predict how it should change throughout the training process. Thus, ADAM, which adapts the learning rate during training, is employed for training the prediction model in this study. |
K-Nearest Neighbour (KNN)41 | KNN is a non-parametric algorithm, meaning it makes no assumptions about the input data. It is sometimes referred to as a lazy learner because it does not learn from the training set immediately; instead, it stores the dataset and acts on it only when new data must be classified. During the training phase, KNN simply saves the dataset; a new instance is then classified into the category whose stored examples it most closely resembles. |
Bagging Classifier (BAG)42 | An ensemble meta-estimator that fits base classifiers on random subsets of the original dataset and then aggregates their individual predictions, either by voting or by averaging, to form the final prediction. |
Random Forest Classifier (RF)43 | This algorithm builds decision trees on data samples, obtains a prediction from each tree, and finally selects the best result by voting. |
Extra Trees Classifier (EXTR)44 | Implements a meta-estimator that fits a number of randomized decision trees (extra-trees) on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. |
AdaBoost Classifier (AdaB)45 | A meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset, with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus more on difficult cases. |
Gradient Boosting Classifier (GraB)46 | The idea behind the GraB classifier is to minimize the loss, i.e., the difference between the actual class value of a training instance and the predicted class value. It builds an additive model in a forward stage-wise fashion. |
Deep Learning Technique (DL)47 | Deep learning uses artificial neural networks, a kind of machine learning whose structure and operation are inspired by the human brain. The algorithm is trained on many instances from the dataset or relevant examples. The primary benefit of an ANN over other types of algorithms is its distinctive information-processing architecture. In this work, we use the deep learning (DL) technique with a varying number of hidden layers, i.e., DL with one hidden layer (DL1), two hidden layers (DL2), three hidden layers (DL3), four hidden layers (DL4), five hidden layers (DL5), and six hidden layers (DL6) (see the DL sketch after this table). |
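Most of the classifiers in Table 4 are available in scikit-learn. The following minimal sketch shows how they could be instantiated; the hyper-parameter values (e.g., n_estimators=100, n_neighbors=5) are illustrative defaults, not necessarily the settings used in this study.

```python
# Minimal sketch: instantiating the standard classifiers from Table 4 with scikit-learn.
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              ExtraTreesClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.neural_network import MLPClassifier

classifiers = {
    "GNB": GaussianNB(),
    "MNB": MultinomialNB(),
    "BNB": BernoulliNB(),
    "DT": DecisionTreeClassifier(),
    "LOGR": LogisticRegression(max_iter=1000),
    "SVC-L": SVC(kernel="linear"),
    "SVC-P": SVC(kernel="poly"),
    "SVC-R": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "BAG": BaggingClassifier(n_estimators=100),
    "RF": RandomForestClassifier(n_estimators=100),
    "EXTR": ExtraTreesClassifier(n_estimators=100),
    "AdaB": AdaBoostClassifier(n_estimators=100),
    "GraB": GradientBoostingClassifier(n_estimators=100),
    # The MLP variants differ only in the weight-update rule (solver).
    "MLP-SGD": MLPClassifier(solver="sgd"),
    "MLP-LNF": MLPClassifier(solver="lbfgs"),   # quasi-Newton (L-BFGS)
    "MLP-ADAM": MLPClassifier(solver="adam"),   # adaptive learning rate
}

# Usage: fit each model on the training split and predict on the test split, e.g.
# for name, clf in classifiers.items():
#     clf.fit(X_train, y_train)
#     y_pred = clf.predict(X_test)
```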
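LSSVM replaces the SVM's inequality constraints with equality constraints, so training reduces to solving a single linear system for the dual variables. The sketch below is one common formulation with an RBF kernel and ±1 labels; it is a sketch of the general technique under these assumptions, not the exact implementation used in this work.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_a = np.sum(A**2, axis=1)[:, None]
    sq_b = np.sum(B**2, axis=1)[None, :]
    return np.exp(-(sq_a + sq_b - 2 * A @ B.T) / (2 * sigma**2))

def lssvm_fit(X, y, gamma=1.0, sigma=1.0):
    """Least-Squares SVM: dual coefficients obtained from one linear system.
    Labels y must be encoded as +1 / -1."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)            # swap in a linear or polynomial kernel as needed
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma      # regularized kernel matrix
    rhs = np.concatenate(([0.0], y.astype(float)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                 # bias b, dual coefficients alpha

def lssvm_predict(X_train, X_new, b, alpha, sigma=1.0):
    return np.sign(rbf_kernel(X_new, X_train, sigma) @ alpha + b)
```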
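ELM assigns the hidden-layer parameters at random and solves only the output weights in closed form; WELM adds per-sample weights that are inversely proportional to class size. The sketch below uses a sigmoid hidden activation and a one-hot target matrix as illustrative assumptions rather than the study's exact configuration.

```python
import numpy as np

def elm_fit(X, T, n_hidden=100, sample_weight=None, seed=0):
    """(W)ELM: random hidden layer, output weights by (weighted) least squares.
    T is a one-hot target matrix; sample_weight implements the WELM scheme."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))     # random input-to-hidden weights
    b = rng.normal(size=n_hidden)                   # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))          # sigmoid hidden activations
    if sample_weight is not None:                   # weighted least squares (WELM)
        s = np.sqrt(sample_weight)[:, None]
        beta = np.linalg.pinv(s * H) @ (s * T)
    else:                                           # plain ELM
        beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

def welm_weights(y):
    """WELM weighting: each sample weighted inversely to its class frequency."""
    counts = np.bincount(y)
    return 1.0 / counts[y]
```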
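The DL1-DL6 variants differ only in the number of hidden layers. A minimal sketch, assuming a Keras implementation with a fixed layer width of 64 units, ReLU activations, and a sigmoid output for binary (anti-pattern present or absent) classification:

```python
# Illustrative assumptions: Keras backend, 64-unit hidden layers, Adam optimizer.
from tensorflow import keras

def build_dl_model(n_hidden_layers, n_features):
    """Fully connected network; DL1..DL6 differ only in n_hidden_layers."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for _ in range(n_hidden_layers):
        model.add(keras.layers.Dense(64, activation="relu"))
    model.add(keras.layers.Dense(1, activation="sigmoid"))  # anti-pattern / no anti-pattern
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# One model per variant DL1 ... DL6 (n_features=20 is a placeholder, not the study's value).
dl_models = {f"DL{k}": build_dl_model(k, n_features=20) for k in range(1, 7)}
```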