Table 12 Statistical test results.

From: A practical evaluation of AutoML tools for binary, multiclass, and multilabel classification

(a) Statistical test results for \(F_1\) score. Each row shows the normality checks, the variance (homoscedasticity) checks, the selected method, and whether the differences were significant. Note that the native multilabel datasets 41465 and 41473 met both the normality and homoscedasticity assumptions, allowing an ANOVA + Tukey approach; every other row falls back to Kruskal–Wallis + Dunn.

| Scenario | Dataset | Normal pairs | Normal? | Homoscedastic pairs | Homoscedastic? | Parametric? | Method | p-value | Significant? |
|---|---|---|---|---|---|---|---|---|---|
| Binary | 1462 | 15/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(3.1\times 10^{-39}\) | Yes |
| Binary | 1479 | 36/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.6\times 10^{-53}\) | Yes |
| Binary | 1510 | 21/120 | No | 120/120 | Yes | No | Kruskal–Wallis + Dunn | \(1.3\times 10^{-17}\) | Yes |
| Binary | 31 | 105/120 | No | 120/120 | Yes | No | Kruskal–Wallis + Dunn | \(8.3\times 10^{-24}\) | Yes |
| Binary | 37 | 66/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.4\times 10^{-18}\) | Yes |
| Binary | 40945 | 36/66 | No | 0/66 | No | No | Kruskal–Wallis + Dunn | \(6.8\times 10^{-27}\) | Yes |
| Binary | 44 | 66/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(4.5\times 10^{-33}\) | Yes |
| Multiclass | 1466 | 36/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.3\times 10^{-45}\) | Yes |
| Multiclass | 181 | 91/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.7\times 10^{-34}\) | Yes |
| Multiclass | 23 | 91/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.4\times 10^{-25}\) | Yes |
| Multiclass | 36 | 91/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(2.0\times 10^{-11}\) | Yes |
| Multiclass | 40691 | 105/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.5\times 10^{-31}\) | Yes |
| Multiclass | 40975 | 45/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(4.1\times 10^{-34}\) | Yes |
| Multiclass | 54 | 91/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(3.2\times 10^{-28}\) | Yes |
| Multilabel (Native) | 285 | 3/6 | No | 0/6 | No | No | Kruskal–Wallis + Dunn | \(2.0\times 10^{-9}\) | Yes |
| Multilabel (Native) | 41464 | 3/3 | Yes | 0/3 | No | No | Kruskal–Wallis + Dunn | \(4.1\times 10^{-12}\) | Yes |
| Multilabel (Native) | 41465 | 6/6 | Yes | 6/6 | Yes | Yes | ANOVA + Tukey | \(4.7\times 10^{-60}\) | Yes |
| Multilabel (Native) | 41468 | 6/6 | Yes | 0/6 | No | No | Kruskal–Wallis + Dunn | \(9.2\times 10^{-15}\) | Yes |
| Multilabel (Native) | 41470 | 6/6 | Yes | 0/6 | No | No | Kruskal–Wallis + Dunn | \(1.7\times 10^{-14}\) | Yes |
| Multilabel (Native) | 41471 | 3/6 | No | 6/6 | Yes | No | Kruskal–Wallis + Dunn | \(3.5\times 10^{-15}\) | Yes |
| Multilabel (Native) | 41473 | 6/6 | Yes | 6/6 | Yes | Yes | ANOVA + Tukey | \(3.1\times 10^{-80}\) | Yes |
| Multilabel (Powerset) | 285ps | 15/45 | No | 0/45 | No | No | Kruskal–Wallis + Dunn | \(4.4\times 10^{-23}\) | Yes |
| Multilabel (Powerset) | 41464 | 36/66 | No | 0/66 | No | No | Kruskal–Wallis + Dunn | \(6.7\times 10^{-25}\) | Yes |
| Multilabel (Powerset) | 41465 | 66/78 | No | 0/78 | No | No | Kruskal–Wallis + Dunn | \(5.6\times 10^{-22}\) | Yes |
| Multilabel (Powerset) | 41468 | 45/105 | No | 0/105 | No | No | Kruskal–Wallis + Dunn | \(3.2\times 10^{-26}\) | Yes |
| Multilabel (Powerset) | 41470 | 36/78 | No | 0/78 | No | No | Kruskal–Wallis + Dunn | \(6.5\times 10^{-28}\) | Yes |
| Multilabel (Powerset) | 41471 | 21/78 | No | 0/78 | No | No | Kruskal–Wallis + Dunn | \(4.6\times 10^{-26}\) | Yes |
| Multilabel (Powerset) | 41473 | 21/66 | No | 0/66 | No | No | Kruskal–Wallis + Dunn | \(1.0\times 10^{-27}\) | Yes |
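
The rows of part (a) are consistent with a simple selection rule: a dataset counts as normal (or homoscedastic) only when every pair passes the corresponding check, and the parametric ANOVA + Tukey route is taken only when both hold. The sketch below encodes that rule as inferred from the table; it is an assumption, not the authors' published code, and the names `AssumptionCheck` and `select_method` are hypothetical.

```python
from dataclasses import dataclass

PARAMETRIC = "ANOVA + Tukey"
NON_PARAMETRIC = "Kruskal-Wallis + Dunn"


@dataclass(frozen=True)
class AssumptionCheck:
    """Pair counts as reported in the 'Normal pairs' and 'Homoscedastic pairs' columns."""
    normal_pairs: int
    homoscedastic_pairs: int
    total_pairs: int


def select_method(check: AssumptionCheck) -> tuple[bool, bool, bool, str]:
    """Return (normal, homoscedastic, parametric, method).

    Assumed rule (inferred from the table rows): an assumption holds only
    if all pairs pass, and the parametric test needs both assumptions.
    """
    normal = check.normal_pairs == check.total_pairs
    homoscedastic = check.homoscedastic_pairs == check.total_pairs
    parametric = normal and homoscedastic
    method = PARAMETRIC if parametric else NON_PARAMETRIC
    return normal, homoscedastic, parametric, method


# Three rows taken from part (a) of the table.
rows = {
    "31 (binary)": AssumptionCheck(105, 120, 120),
    "41464 (native multilabel)": AssumptionCheck(3, 0, 3),
    "41465 (native multilabel)": AssumptionCheck(6, 6, 6),
}
for name, check in rows.items():
    print(name, select_method(check))
```

Under this rule, dataset 31 stays non-parametric even with 105 of 120 normal pairs, which matches its row in the table.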

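(b) Statistical test results for training time. Each row shows the normality and variance checks, the chosen method, and the resulting significance. No dataset met both assumptions, so Kruskal–Wallis + Dunn was used throughout.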

| Scenario | Dataset | Normal pairs | Normal? | Homoscedastic pairs | Homoscedastic? | Parametric? | Method | p-value | Significant? |
|---|---|---|---|---|---|---|---|---|---|
| Binary | 1462 | 15/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(2.9\times 10^{-51}\) | Yes |
| Binary | 1479 | 36/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.2\times 10^{-53}\) | Yes |
| Binary | 1510 | 6/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(3.3\times 10^{-50}\) | Yes |
| Binary | 31 | 28/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.4\times 10^{-52}\) | Yes |
| Binary | 37 | 28/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(2.9\times 10^{-52}\) | Yes |
| Binary | 40945 | 28/66 | No | 0/66 | No | No | Kruskal–Wallis + Dunn | \(2.0\times 10^{-38}\) | Yes |
| Binary | 44 | 36/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(2.5\times 10^{-53}\) | Yes |
| Multiclass | 1466 | 36/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(5.9\times 10^{-51}\) | Yes |
| Multiclass | 181 | 6/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(5.2\times 10^{-47}\) | Yes |
| Multiclass | 23 | 15/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.1\times 10^{-49}\) | Yes |
| Multiclass | 36 | 21/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(6.7\times 10^{-47}\) | Yes |
| Multiclass | 40691 | 3/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(1.3\times 10^{-49}\) | Yes |
| Multiclass | 40975 | 36/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(3.5\times 10^{-47}\) | Yes |
| Multiclass | 54 | 45/120 | No | 0/120 | No | No | Kruskal–Wallis + Dunn | \(3.4\times 10^{-47}\) | Yes |
| Multilabel (Native) | 285 | 0/6 | No | 0/6 | No | No | Kruskal–Wallis + Dunn | \(6.7\times 10^{-12}\) | Yes |
| Multilabel (Native) | 41464 | 1/3 | No | 0/3 | No | No | Kruskal–Wallis + Dunn | \(8.7\times 10^{-11}\) | Yes |
| Multilabel (Native) | 41465 | 3/6 | No | 0/6 | No | No | Kruskal–Wallis + Dunn | \(3.7\times 10^{-15}\) | Yes |
| Multilabel (Native) | 41468 | 3/6 | No | 0/6 | No | No | Kruskal–Wallis + Dunn | \(1.4\times 10^{-13}\) | Yes |
| Multilabel (Native) | 41470 | 0/6 | No | 0/6 | No | No | Kruskal–Wallis + Dunn | \(1.2\times 10^{-15}\) | Yes |
| Multilabel (Native) | 41471 | 1/6 | No | 0/6 | No | No | Kruskal–Wallis + Dunn | \(1.3\times 10^{-15}\) | Yes |
| Multilabel (Native) | 41473 | 3/6 | No | 0/6 | No | No | Kruskal–Wallis + Dunn | \(3.9\times 10^{-14}\) | Yes |
| Multilabel (Powerset) | 285ps | 15/45 | No | 0/45 | No | No | Kruskal–Wallis + Dunn | \(1.6\times 10^{-30}\) | Yes |
| Multilabel (Powerset) | 41464 | 10/66 | No | 0/66 | No | No | Kruskal–Wallis + Dunn | \(1.8\times 10^{-29}\) | Yes |
| Multilabel (Powerset) | 41465 | 10/78 | No | 0/78 | No | No | Kruskal–Wallis + Dunn | \(2.2\times 10^{-39}\) | Yes |
| Multilabel (Powerset) | 41468 | 21/105 | No | 0/105 | No | No | Kruskal–Wallis + Dunn | \(3.2\times 10^{-37}\) | Yes |
| Multilabel (Powerset) | 41470 | 3/78 | No | 0/78 | No | No | Kruskal–Wallis + Dunn | \(1.1\times 10^{-37}\) | Yes |
| Multilabel (Powerset) | 41471 | 6/78 | No | 0/78 | No | No | Kruskal–Wallis + Dunn | \(3.0\times 10^{-38}\) | Yes |
| Multilabel (Powerset) | 41473 | 3/66 | No | 0/66 | No | No | Kruskal–Wallis + Dunn | \(9.2\times 10^{-42}\) | Yes |
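
The pair totals in both parts (120, 105, 78, 66, 45, 6, 3) are all of the form \(\binom{n}{2}\), which is what one would expect if each total counts every unordered pair among \(n\) compared groups. The table does not state \(n\), so the sketch below only inverts the arithmetic; reading \(n\) as the number of tools that produced results on a dataset is an assumption, and `groups_from_pairs` is a hypothetical helper name.

```python
import math
from typing import Optional


def groups_from_pairs(total_pairs: int) -> Optional[int]:
    """Solve n * (n - 1) / 2 == total_pairs for a positive integer n.

    Returns None when total_pairs is not a valid pair count.
    """
    if total_pairs < 0:
        return None
    # Positive root of n^2 - n - 2 * total_pairs = 0, using integer sqrt.
    n = (1 + math.isqrt(1 + 8 * total_pairs)) // 2
    return n if math.comb(n, 2) == total_pairs else None


# Pair totals that appear in the denominators of the table.
for total in (120, 105, 78, 66, 45, 6, 3):
    print(total, "pairs ->", groups_from_pairs(total), "groups")
```

For example, a denominator of 120 corresponds to 16 groups and a denominator of 66 to 12, which would explain why some datasets have fewer pairs than others in the same scenario.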