Table 7 Post-hoc analysis summary across datasets.
Dataset | Significant Difference (KW Test)? | Significant Pairs (Post-hoc)? | Were Proposed Models Statistically Better? | Non-significant but Outperformed Cases |
---|---|---|---|---|
Satimage | Yes (p < 0.00001) | Yes (≥ 79 pairs) | Yes; several proposed models (e.g., P4, P3, P10) were significantly better than traditional models (e.g., RAF, Tan-Sig, Sine, Sigmoid) | In some cases (e.g., P1 vs. P6) performance was better but the difference was not significant |
 | Yes (p < 0.00001) | Yes (≥ 83 pairs) | Yes; P10, P11, P5, and P7 were superior to several traditional models | P10 vs. P4 showed better scores but the difference was not significant |
Breast | Yes (p < 0.00001) | Yes (≥ 66 pairs) | Yes; P10, P3, and P11 were consistently better than the other models | Variants such as P2 and P4 performed well but not always significantly |
IRIS | Yes (p < 0.00001) | Yes (≥ 30 pairs) | Yes; P11 outperformed the other models and performed on par with RAF, Sigmoid, and Tan-Sig | Most models performed comparably; IRIS is a simple classification task, so even the base classifiers matched the ensemble models |
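The Kruskal-Wallis (KW) statistic underlying the tests summarized above can be sketched in plain Python. The sketch below computes the H statistic from ranked scores; the sample data is synthetic for illustration only and does not reproduce the study's results, and no tie correction is applied.

```python
def average_ranks(values):
    # Assign 1-based ranks to a flat list of values; tied values share
    # the average of the rank positions they occupy.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_h(*groups):
    # Kruskal-Wallis H over k groups (no tie correction):
    # H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1),
    # where R_i is the rank sum of group i and N the total sample size.
    data = [v for g in groups for v in g]
    ranks = average_ranks(data)
    n = len(data)
    total, idx = 0.0, 0
    for g in groups:
        r_sum = sum(ranks[idx: idx + len(g)])
        total += r_sum ** 2 / len(g)
        idx += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Synthetic example: two clearly separated groups.
h = kruskal_h([1, 2, 3], [4, 5, 6])
```

In a post-hoc setting such as the one summarized in Table 7, `kruskal_h` (or an equivalent rank-sum test) would be applied to each pair of models with a multiple-comparison correction (e.g., Bonferroni) to identify the significant pairs.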