Table 2 Evaluation metrics of machine learning models employed for risk prediction of HPAI.

From: Investigating environmental determinants and spatiotemporal dynamics of highly pathogenic avian influenza H5N1 outbreaks in India through machine learning

| Sl. No. | Model | Model specification | Kappa | ROC | TSS | AUC | Accuracy | Error rate | F1 score | Log loss |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | GLM | \(E(Y\mid X)=\mu=g^{-1}(X\beta)\), where \(E(Y\mid X)\) is the expected value of \(Y\) conditional on \(X\), \(X\beta\) the linear predictor, and \(g\) the link function | 0.34 | 0.80 | 0.49 | 0.80 | 0.75 | 0.25 | 0.89 | 0.43 |
| 2 | GAM | \(g(E(Y))=\beta_{0}+f_{1}(x_{1})+f_{2}(x_{2})+\cdots+f_{i}(x_{i})\), where \(Y\) is the response variable, \(g\) the link function, \(f_{i}\) smooth functions of specified parametric form, and \(x_{i}\) the predictor variables | 0.34 | 0.80 | 0.49 | 0.80 | 0.75 | 0.25 | 0.89 | 0.43 |
| 3 | RF | \(Y=\sum_{n=1}^{N}f(t_{n})\), where \(Y\) is the average of the aggregated predictions of the multiple decision trees and \(t_{n}\) are decision trees trained on different subsets of the same training data | 0.68 | 1.00 | 0.96 | 1.00 | 0.98 | 0.02 | 0.99 | 0.12 |
| 4 | GBM | \(f(x)=\arg\min_{\theta}\sum_{i=1}^{n}L(y_{i},\theta)+\sum_{m=1}^{M}\eta\rho_{m}\varphi_{m}(x)\), where \(m\) is the iteration, \(\eta\) the learning rate, and \(\rho_{m}\) the step length | 0.50 | 0.93 | 0.72 | 0.93 | 0.84 | 0.16 | 0.92 | 0.31 |
| 5 | NNET | \(Y=f\left(\sum_{i=1}^{n}x_{i}w_{i}\right)+b\), where \(Y\) is the output, \(x_{i}\) the inputs, \(w_{i}\) the weights, and \(b\) the bias | 0.00 | 0.50 | 0.00 | 0.50 | 0.25 | 0.75 | 0.00 | 25.90 |
| 6 | MARS | \(\hat{f}(x)=\sum_{i=1}^{k}c_{i}B_{i}(x)\), where \(c_{i}\) are constant coefficients and \(B_{i}(x)\) are basis functions | 0.41 | 0.87 | 0.60 | 0.87 | 0.80 | 0.21 | 0.91 | 0.34 |
| 7 | FDA | \(\eta_{l}(x)=X^{T}\beta_{l}\) | -0.01 | 0.50 | -0.01 | 0.50 | 0.79 | 0.21 | 0.88 | 7.28 |
| 8 | CT | \(f(x)=\sum_{j=1}^{T}w_{j}\,I(x\in R_{j})\) | 0.66 | 0.93 | 0.74 | 0.93 | 0.85 | 0.15 | 0.93 | 0.34 |
| 9 | SVM | \(\{x: f(x)=x^{T}\beta+\beta_{0}=0\}\) | 0.51 | 0.87 | 0.68 | 0.87 | 0.85 | 0.15 | 0.91 | 0.53 |
| 10 | NB | \(P(c\mid x)=\frac{P(x\mid c)\,P(c)}{P(x)}\), where \(P(c\mid x)\) is the posterior probability, \(P(x\mid c)\) the likelihood, \(P(c)\) the class prior probability, and \(P(x)\) the predictor prior probability | -0.20 | 0.72 | -0.18 | 0.72 | 0.30 | 0.70 | 0.47 | 3.55 |
| 11 | ADA | \(F_{T}(x)=\sum_{t=1}^{T}f_{t}(x)\), where \(f_{t}\) is a weak learner, \(x\) the input, and \(T\) the number of positive or negative weak classifiers combined | 0.73 | 0.83 | 0.66 | 0.83 | 0.92 | 0.08 | 0.95 | 2.71 |

Generalized Linear Models (GLM), Generalized Additive Models (GAM), Random Forest (RF), Gradient Boosting Machine (GBM), Artificial Neural Network (NNET), Multivariate Adaptive Regression Splines (MARS), Flexible Discriminant Analysis (FDA), Classification Tree Analysis (CT), Support Vector Machine (SVM), Naive Bayes (NB), Adaptive Boosting (ADA), Receiver Operating Characteristic (ROC) curve, True Skill Statistic (TSS), Area Under the ROC Curve (AUC), Logistic Loss (LOGLOSS).
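
The columns of Table 2 are standard binary-classification measures. As a minimal illustrative sketch only, and not the authors' modelling pipeline, the snippet below shows how each metric can be computed for a generic presence/absence classifier with scikit-learn; the arrays `y_true` and `y_prob` and the 0.5 decision threshold are hypothetical placeholders.

```python
# Illustrative sketch: computing the Table 2 metrics for a generic binary
# classifier. y_true and y_prob are hypothetical placeholder data, not values
# from the study.
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix, f1_score, log_loss,
                             roc_auc_score)

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])                   # 0/1 absence-presence labels
y_prob = np.array([0.1, 0.4, 0.8, 0.7, 0.2, 0.9, 0.6, 0.3])   # predicted probabilities
y_pred = (y_prob >= 0.5).astype(int)                          # thresholded class labels

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)          # true positive rate
specificity = tn / (tn + fp)          # true negative rate

metrics = {
    "Kappa": cohen_kappa_score(y_true, y_pred),
    "ROC AUC": roc_auc_score(y_true, y_prob),       # area under the ROC curve
    "TSS": sensitivity + specificity - 1,           # true skill statistic
    "Accuracy": accuracy_score(y_true, y_pred),
    "Error rate": 1 - accuracy_score(y_true, y_pred),
    "F1 score": f1_score(y_true, y_pred),
    "Log loss": log_loss(y_true, y_prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

TSS is derived from the confusion matrix as sensitivity plus specificity minus one, which is why it ranges from -1 to 1, like Kappa, rather than 0 to 1.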