Table 3 Experimental data for GEOM species from MoleculeNet (ref. 31).

From: GEOM, energy-annotated molecular conformations for property prediction and molecular generation

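| Category | Dataset | Property | Tasks | Species | Recovered | Sources (refs.) |
| --- | --- | --- | --- | --- | --- | --- |
| Physical chemistry | ESOL | Water solubility | 1 | 1,113 | 99.6% | 28 |
| Physical chemistry | FreeSolv | Hydration free energy | 1 | 642 | 100.0% | 29 |
| Physical chemistry | Lipophilicity | log K_octanol-water | 1 | 4,194 | 99.9% | 24, 101 |
| Biophysics | BACE | BACE-1 inhibition | 1 | 1,511 | 99.9% | 39 |
| Physiology | BBBP | Blood-brain barrier penetration | 1 | 1,959 | 99.2% | 102 |
| Physiology | Tox21 | Qualitative toxicity | 12 | 7,677 | 98.0% | 103 |
| Physiology | ToxCast | Qualitative toxicity | 617 | 8,405 | 98.0% | 104 |
| Physiology | SIDER | Drug side effects | 27 | 1,356 | 95.1% | 105 |
| Physiology | ClinTox | Toxicity of failed, approved drugs | 2 | 1,438 | 98.7% | 106, 107 |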

  1. “Species” denotes the number of MoleculeNet compounds that have CREST CREs in vacuum. “Recovered” gives this number as a percentage of the original number of compounds in MoleculeNet. The original counts used to compute the “Recovered” percentage differ slightly from those in ref. 31, because several of the original compounds were found to be identical after SMILES pre-processing and conversion to InChI keys. Note that 1,511 BACE species (99.9%) also have CREST CREs in water.
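
The relationship between the “Species” and “Recovered” columns can be illustrated with a minimal sketch. It assumes each compound has already been reduced to a canonical identifier (for example an InChI key produced by a cheminformatics toolkit); the function name `recovered_percent` and the placeholder keys are hypothetical and are not taken from the GEOM code or data.

```python
def recovered_percent(original_keys, keys_with_ensembles):
    """Return (species, percent) for one dataset.

    original_keys: identifiers of the original compounds, possibly with
        duplicates (compounds that turn out identical after canonicalization).
    keys_with_ensembles: identifiers of compounds that have a conformer ensemble.
    """
    unique_original = set(original_keys)  # identical compounds collapse here
    if not unique_original:
        raise ValueError("no original compounds given")
    recovered = unique_original & set(keys_with_ensembles)
    return len(recovered), 100.0 * len(recovered) / len(unique_original)


# Toy placeholder identifiers, not real InChI keys or MoleculeNet entries.
original = ["KEY-A", "KEY-B", "KEY-B", "KEY-C", "KEY-D"]  # KEY-B is a duplicate
with_ensembles = {"KEY-A", "KEY-B", "KEY-C"}

species, percent = recovered_percent(original, with_ensembles)
print(f"Species: {species}, Recovered: {percent:.1f}%")  # Species: 3, Recovered: 75.0%
```

Because the duplicate entry is removed before dividing, the denominator here is 4 rather than 5, which is the same reason the original counts behind the table differ slightly from those in ref. 31.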