Table 1 The average performance and standard deviation of various methods on the new 115-endpoint acute toxicity dataset via 5-fold cross-validation

Methods	All 115 endpoints		68 Small-sized endpoints		11 Human endpoints
	R² ↑	RMSE ↓	R² ↑	RMSE ↓	R² ↑	RMSE ↓
ST-DNN	0.11 ± 0.47	1.10 ± 0.38	−0.13 ± 0.45	1.28 ± 0.34	−0.50 ± 0.24	1.70 ± 0.32
ST-RF	0.21 ± 0.42	1.01 ± 0.35	0.01 ± 0.42	1.16 ± 0.32	−0.39 ± 0.25	1.60 ± 0.30
GAT	0.08 ± 0.32	0.94 ± 0.26	−0.01 ± 0.38	1.00 ± 0.30	−0.06 ± 0.24	1.23 ± 0.26
GCN	0.12 ± 0.43	0.91 ± 0.29	−0.04 ± 0.48	1.00 ± 0.32	−0.01 ± 0.19	1.19 ± 0.21
Attentive FP	0.32 ± 0.22	0.82 ± 0.24	0.26 ± 0.25	0.87 ± 0.27	0.24 ± 0.12	1.06 ± 0.24
MT-DNN	0.36 ± 0.37	0.77 ± 0.32	0.22 ± 0.40	0.88 ± 0.34	−0.22 ± 0.23	1.34 ± 0.30
MT-GCN	0.34 ± 0.35	0.80 ± 0.32	0.20 ± 0.38	0.91 ± 0.33	−0.25 ± 0.20	1.34 ± 0.31
DLCA	0.39 ± 0.32	0.76 ± 0.29	0.28 ± 0.35	0.86 ± 0.32	−0.15 ± 0.18	1.28 ± 0.29
ToxACoL	0.51 ± 0.24	0.68 ± 0.23	0.44 ± 0.28	0.73 ± 0.26	0.28 ± 0.24	1.02 ± 0.26

A small-sized endpoint has no more than 200 available toxicity measurements. An upward arrow next to a metric indicates that higher values represent better performance, while a downward arrow indicates that lower values are better. The best performance among all methods for each metric is shown in bold.
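Several R² entries in the table are negative. The sketch below shows how that arises under the standard definitions of the two metrics: R² drops below zero whenever a model's squared error exceeds that of always predicting the mean of the observed values. The function names (`r2_score`, `rmse`, `summarize`) and the toy numbers are illustrative assumptions, not taken from the paper. The use of a population standard deviation for the "mean ± std" summary is also an assumption, since the source does not state which form was used or exactly which set of values the ± spread is computed over.

```python
import math
from statistics import mean, pstdev


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def summarize(scores):
    """Return (mean, std) of a list of scores.

    Assumption: population standard deviation; the source does not say
    whether the sample or population form was used.
    """
    return mean(scores), pstdev(scores)


# Toy data (hypothetical, for illustration only).
y_true = [1, 2, 3, 4]

perfect = r2_score(y_true, [1, 2, 3, 4])           # 1.0: exact predictions
baseline = r2_score(y_true, [2.5, 2.5, 2.5, 2.5])  # 0.0: always predict the mean
reversed_ = r2_score(y_true, [4, 3, 2, 1])         # -3.0: worse than the mean

# Aggregate hypothetical per-fold (or per-endpoint) scores as "mean ± std".
avg, std = summarize([0.5, 0.3, 0.1])
print(f"R2 = {avg:.2f} ± {std:.2f}")
```

Read this way, a negative mean R² for a method on a subset of endpoints means it did worse on average there than a constant predictor of each endpoint's mean, while RMSE stays non-negative and is expressed in the units of the toxicity values.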
