Fig. 3: Performance comparison between ToxACoL and baseline models on small-sized acute toxicity endpoints.

a, c, e Mean R² of each model over 5-fold cross-validation on human-oral-TDLo, women-oral-TDLo, and man-oral-TDLo. Differences between models were tested for significance with a two-sided Student's t-test. b, d, f Acute toxicity estimation curves of ToxACoL for test compounds at the three human-related endpoints. Here, ToxACoL was trained on four folds of the whole toxicity dataset, and all test compounds come from the remaining held-out fold. g Comparison between ToxACoL and advanced baseline methods on additional small-sized endpoints. The first subplot covers the 4 endpoints (n = 4) with fewer than 130 measurements; the following three subplots are defined analogously. Each dot is the R² value at a single endpoint, and each bar with its error bar shows the mean R² and standard deviation over the n small-sized endpoints (from left to right, n = 4, 8, 14, and 21). h Comparison between ToxACoL and advanced baseline methods on 11 large-sized endpoints (n = 11). Each bar with its error bar shows the mean R² and standard deviation over the 11 large-sized endpoints. Source data are provided as a Source Data file.
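
As a reading aid for the bars and error bars in panels g and h, the following minimal sketch shows one way a per-endpoint R² and its mean and standard deviation over a group of endpoints could be computed. The function names `r2_score` and `summarize` are hypothetical, the numbers are made up for illustration, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the caption does not state which form was used.

```python
from statistics import mean, stdev


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(r2_values):
    """Mean and sample standard deviation of R2 values.

    Assumption: the n - 1 denominator; the source does not specify it.
    """
    return mean(r2_values), stdev(r2_values)


# Illustrative values only; these are not data from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
endpoint_r2 = r2_score(y_true, y_pred)

# One R2 value per endpoint in a hypothetical group of three endpoints.
group_mean, group_sd = summarize([0.5, 0.6, 0.7])
print(endpoint_r2, group_mean, group_sd)
```

In this sketch each dot in panel g would correspond to one call to `r2_score`, and each bar with its error bar to one call to `summarize` over the endpoints in that group.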