Table 9 Post hoc test of the proposed hybrid stacking model with the other top four models for each metric.

From: Lifestyle data-based multiclass obesity prediction with interpretable ensemble models incorporating SHAP and LIME analysis

| Metric | Hybrid stacking vs. | Statistic | Adjusted p-value | H0 |
|---|---|---|---|---|
| Accuracy | RF | 1.78885 | 0.29455 | A |
| | XGB | 1.34164 | 0.53914 | A |
| | GB | 0.89443 | 0.74219 | A |
| | Hybrid voting | 0.44721 | 0.74219 | A |
| F1-score | RF | 1.78885 | 0.29455 | A |
| | XGB | 1.34164 | 0.53914 | A |
| | GB | 0.89443 | 0.74219 | A |
| | Hybrid voting | 0.44721 | 0.74219 | A |
| AUC | CB | 1.56525 | 0.4701 | A |
| | ET | 1.11803 | 0.79066 | A |
| | XGB | 0.67082 | 1 | A |
| | Hybrid voting | 0 | 1 | A |
| Precision | XGB | 1.78885 | 0.29455 | A |
| | RF | 1.34164 | 0.53914 | A |
| | GB | 0.89443 | 0.74219 | A |
| | Hybrid voting | 0.44721 | 0.74219 | A |
| MCC | RF | 1.78885 | 0.29455 | A |
| | XGB | 1.34164 | 0.53914 | A |
| | GB | 0.89443 | 0.74219 | A |
| | Hybrid voting | 0.44721 | 0.74219 | A |
| Recall | RF | 1.78885 | 0.29455 | A |
| | XGB | 1.34164 | 0.53914 | A |
| | GB | 0.89443 | 0.74219 | A |
| | Hybrid voting | 0.44721 | 0.74219 | A |
| Kappa | RF | 1.78885 | 0.29455 | A |
| | XGB | 1.34164 | 0.53914 | A |
| | GB | 0.89443 | 0.74219 | A |
| | Hybrid voting | 0.44721 | 0.74219 | A |
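
The table does not name the adjustment procedure behind the "Adjusted p-value" column. The reported values are numerically consistent with a Holm step-down correction applied to two-sided standard-normal p-values of the "Statistic" column; that reading is an inference from the numbers, not something stated in the table. The sketch below, under that assumption, recomputes the adjusted p-values for the Accuracy rows. The helper names `two_sided_p` and `holm_adjust` and the significance level `alpha = 0.05` are illustrative choices, not taken from the source.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The (rank+1)-th smallest p-value is multiplied by (m - rank), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Statistics copied from the Accuracy rows of the table.
models = ["RF", "XGB", "GB", "Hybrid voting"]
accuracy_stats = [1.78885, 1.34164, 0.89443, 0.44721]

accuracy_adj = holm_adjust([two_sided_p(z) for z in accuracy_stats])

alpha = 0.05  # assumed significance level, not given in the table
for name, z, p_adj in zip(models, accuracy_stats, accuracy_adj):
    decision = "reject H0" if p_adj < alpha else "do not reject H0"
    print(f"{name:14s} z={z:.5f} adjusted p={p_adj:.5f} -> {decision}")
```

Under this assumption, every adjusted p-value in the Accuracy block stays above 0.05, which matches the "A" entries in the H0 column if "A" is read as the null hypothesis not being rejected.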