Table 1 Micro- and macro-averaged F1 scores for all models across EC levels 1-4 on the test set

Metric	Level	CLEAN (MLP)	CLEAN (KAN)	DeepECtransformer (MLP)	DeepECtransformer (KAN)	DeepEC (MLP)	DeepEC (KAN)
Micro F1-scores	1	0.968 ± 0.007	0.971 ± 0.008*	0.784 ± 0.059	0.905 ± 0.052*	0.895 ± 0.019	0.915 ± 0.014*
	2	0.938 ± 0.012	0.948 ± 0.011*	0.773 ± 0.063	0.894 ± 0.059*	0.887 ± 0.022	0.914 ± 0.019*
	3	0.882 ± 0.014	0.891 ± 0.013*	0.768 ± 0.064	0.880 ± 0.060*	0.880 ± 0.019	0.890 ± 0.018*
	4	0.875 ± 0.010	0.880 ± 0.010*	0.787 ± 0.015	0.867 ± 0.046*	0.859 ± 0.014	0.874 ± 0.013*
Macro F1-scores	1	0.962 ± 0.021	0.965 ± 0.019*	0.797 ± 0.057	0.903 ± 0.051*	0.906 ± 0.028	0.923 ± 0.020*
	2	0.910 ± 0.013	0.915 ± 0.010*	0.687 ± 0.086	0.819 ± 0.071*	0.808 ± 0.021	0.899 ± 0.019*
	3	0.809 ± 0.018	0.812 ± 0.012*	0.582 ± 0.095	0.781 ± 0.080*	0.720 ± 0.022	0.791 ± 0.014*
	4	0.802 ± 0.024	0.811 ± 0.021*	0.585 ± 0.068	0.727 ± 0.059*	0.589 ± 0.019	0.737 ± 0.013*

Asterisks (*) denote a statistically significant difference between the KAN and MLP variants of the same model (Wilcoxon signed-rank test); boldface indicates the better-performing variant (MLP or KAN) within each model.
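
For readers less familiar with the two averaging schemes reported in Table 1, the following is a minimal sketch of how micro- and macro-averaged F1 differ. It is not the evaluation code used for the table: the function names, the toy labels, and the simplification to single-label predictions are all assumptions made for illustration. Micro-averaging pools true positives, false positives, and false negatives over all classes before computing F1, whereas macro-averaging computes F1 per class and takes the unweighted mean, so rare classes that are predicted poorly lower the macro score more than the micro score.

```python
from collections import Counter


def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); taken as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions.

    Hypothetical helper for illustration; assumes one label per sample.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1_from_counts(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy data (invented): class "1" is frequent, class "3" is rare and always missed.
y_true = ["1", "1", "1", "1", "2", "3"]
y_pred = ["1", "1", "1", "1", "2", "2"]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
# prints: micro F1 = 0.833, macro F1 = 0.556
```

In this toy case the single missed rare class pulls the macro score well below the micro score, which is the same direction as the micro/macro gap in the table at the finer EC levels.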
