Table 1 Correlation of reliability metrics to train set similarity

From: Molecular deep learning at the edge of chemical space

| Reliability metric | Scaffold similarity | Molecular core overlap | Pharmacophore similarity |
| --- | --- | --- | --- |
| Unfamiliarity | −0.46 ± 0.03 | −0.32 ± 0.04 | −0.24 ± 0.04 |
| Embedding distance | −0.36 ± 0.03 | −0.56 ± 0.02 | −0.10 ± 0.05 |
| Uncertainty | −0.15 ± 0.04 | 0.02 ± 0.04 | −0.15 ± 0.04 |

Spearman correlation between three model-based reliability metrics and several test-to-training set similarity metrics. Embedding distance is determined as the Mahalanobis distance of embeddings (z vectors) to the training set. Mean and s.e.m. for all datasets are reported. Highest correlations are reported in bold.
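
The quantities named in this note can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Mahalanobis distance is taken to the mean of the training embeddings using their sample covariance, adds a small ridge term for numerical stability, and runs on synthetic data. All function names and the `ridge` parameter are hypothetical.

```python
import numpy as np


def mahalanobis_to_train(z_train, z_test, ridge=1e-6):
    """Mahalanobis distance of each test embedding to the training distribution.

    Assumption: the training set is summarised by its mean and sample covariance.
    The ridge term is an illustrative safeguard against a singular covariance.
    """
    z_train = np.asarray(z_train, dtype=float)
    z_test = np.asarray(z_test, dtype=float)
    mu = z_train.mean(axis=0)
    cov = np.cov(z_train, rowvar=False) + ridge * np.eye(z_train.shape[1])
    cov_inv = np.linalg.inv(cov)
    diff = z_test - mu
    # Quadratic form diff^T * cov_inv * diff, evaluated row by row.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))


def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        mask = x == value
        ranks[mask] = ranks[mask].mean()
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])


def mean_and_sem(values):
    """Mean and standard error of the mean over per-dataset correlations."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1) / np.sqrt(len(values)))


if __name__ == "__main__":
    # Synthetic stand-ins for embeddings and similarity scores (not real data).
    rng = np.random.default_rng(0)
    per_dataset_rho = []
    for _ in range(5):
        z_train = rng.normal(size=(200, 8))
        z_test = rng.normal(size=(50, 8))
        distance = mahalanobis_to_train(z_train, z_test)
        similarity = -distance + rng.normal(scale=0.5, size=50)
        per_dataset_rho.append(spearman(distance, similarity))
    mean, sem = mean_and_sem(per_dataset_rho)
    print(f"{mean:.2f} ± {sem:.2f}")
```

Under these assumptions, each table entry corresponds to one such mean ± s.e.m. value, computed per reliability metric and similarity metric across datasets.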