Fig. 8: Structural similarity between toxic compounds and endogenous metabolites evaluated using different molecular fingerprints.

a Distribution of SICL-based structural distances for all possible pairs of toxic compounds and non-toxic endogenous metabolites (from YMDB and HMDB) that share the same molecular formula. Plots indicate the number of pairs whose absolute SICL difference falls below each threshold. b Comparison of similarity scores obtained with the AP, ECFP, and SICL fingerprints. Tanimoto coefficients were used for AP and ECFP; SICL scores were normalized to a 0–1 scale by dividing by the maximum value. Blue circles indicate AP, red circles ECFP, and black circles SICL. c Evaluation of high-similarity pairs (similarity ≥ 0.9) between toxic compounds and non-toxic metabolites for each fingerprint. The number of positional isomer pairs and the number of distinct compound pairs falsely scored as identical (similarity = 1.0) were also assessed. Notably, SICL yielded no falsely identical pairs (n = 0), whereas AP and ECFP produced 4 and 3 such pairs, respectively.
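
The two scoring schemes named in panel b can be illustrated with a minimal sketch. It assumes that each AP or ECFP fingerprint is available as a set of on-bit indices and that SICL scores are non-negative numbers; the function names `tanimoto` and `normalize_by_max` are hypothetical, and the sketch does not reproduce how SICL values themselves are computed.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints are treated as identical.
        return 1.0
    return len(a & b) / union


def normalize_by_max(scores):
    """Scale non-negative scores to the 0-1 range by dividing by the maximum."""
    scores = list(scores)
    top = max(scores)
    if top == 0:
        # All scores are zero, so there is nothing to scale.
        return [0.0 for _ in scores]
    return [s / top for s in scores]


# Toy inputs, not data from the figure.
fp_toxic = {1, 2, 3}
fp_metabolite = {2, 3, 4}
print(tanimoto(fp_toxic, fp_metabolite))
print(normalize_by_max([2, 4, 8]))
```

Under this sketch, a Tanimoto coefficient of 1.0 means only that the two bit sets coincide, which is how a fingerprint can score two distinct compounds as identical.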