Table 1 Predictive performance comparison of multiple classification algorithms on the constructed dataset

Dataset	Methods	Accuracy	Precision	Recall	F1 Score	ROC–AUC	PR–AUC	MCC
All	Ridge regression	0.857	0.360	0.742	0.485	0.881	0.574	0.450
	Balanced bagging	0.886	0.419	0.650	0.510	0.840	0.493	0.463
	Linear SVC	0.918	0.677	0.192	0.299	0.870	0.549	0.331
	Random forest	0.939	0.800	0.433	0.562	0.888	0.633	0.561
	XGBoost	0.949	0.791	0.600	0.683	0.885	0.692	0.663
Bacteria	Ridge regression	0.870	0.363	0.857	0.510	0.917	0.519	0.504
	Balanced bagging	0.891	0.396	0.714	0.509	0.892	0.548	0.479
	Linear SVC	0.931	0.614	0.351	0.446	0.915	0.516	0.431
	Random forest	0.939	0.674	0.429	0.524	0.927	0.657	0.507
	XGBoost	0.946	0.688	0.571	0.624	0.922	0.660	0.598
Eukaryota	Ridge regression	0.849	0.326	0.737	0.452	0.844	0.475	0.422
	Balanced bagging	0.813	0.275	0.737	0.400	0.856	0.456	0.370
	Linear SVC	0.920	0.556	0.263	0.357	0.855	0.433	0.346
	Random forest	0.929	1.000	0.158	0.273	0.929	0.694	0.383
	XGBoost	0.942	0.688	0.579	0.629	0.940	0.685	0.600
Viruses	Ridge regression	0.800	0.500	0.417	0.455	0.792	0.572	0.335
	Balanced bagging	0.808	0.526	0.417	0.465	0.716	0.453	0.354
	Linear SVC	0.842	0.857	0.250	0.387	0.780	0.573	0.409
	Random forest	0.850	0.800	0.333	0.471	0.794	0.611	0.452
	XGBoost	0.867	0.786	0.458	0.579	0.764	0.583	0.532
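
As a quick consistency check on Table 1, the F1 scores can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below does this for the XGBoost rows only. The helper name `f1_from_pr` is a hypothetical name chosen for illustration, and the inputs are the rounded values from the table, so small differences in the third decimal place are expected.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the XGBoost rows of Table 1.
xgboost_rows = {
    "All": (0.791, 0.600, 0.683),
    "Bacteria": (0.688, 0.571, 0.624),
    "Eukaryota": (0.688, 0.579, 0.629),
    "Viruses": (0.786, 0.458, 0.579),
}

for dataset, (p, r, reported) in xgboost_rows.items():
    recomputed = f1_from_pr(p, r)
    print(f"{dataset}: recomputed={recomputed:.3f}, reported={reported:.3f}")
```

The same check applies to any other row by swapping in its precision and recall. It does not extend to accuracy or MCC, which also depend on the confusion-matrix counts not reported here.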
