Table 2. Overall performance of SCAGE and state-of-the-art methods on nine molecular property benchmarks with a random scaffold split

From: A self-conformation-aware pre-training framework for molecular property prediction with substructure interpretability

| Methods | BACE [63] | BBBP [58] | ClinTox [61] | Tox21 [59] | ToxCast [62] | SIDER [60] | FreeSolv [64] | ESOL [66] | Lipophilicity [65] |
|---|---|---|---|---|---|---|---|---|---|
| Task type (metric) | Classification (AUC-ROC) | Classification (AUC-ROC) | Classification (AUC-ROC) | Classification (AUC-ROC) | Classification (AUC-ROC) | Classification (AUC-ROC) | Regression (RMSE) | Regression (RMSE) | Regression (RMSE) |
| Size | 1513 | 2039 | 1478 | 7831 | 8575 | 1427 | 642 | 1128 | 4200 |
| Tasks | 1 | 1 | 2 | 12 | 617 | 27 | 1 | 1 | 1 |
| GROVER [17] | 0.894(0.028) | 0.940(0.019) | 0.944(0.021) | 0.831(0.025) | 0.737(0.010) | 0.658(0.023) | 1.544(0.397) | 0.831(0.120) | 0.560(0.035) |
| GROVER-10M [17] | 0.923(0.005) | 0.940(0.004) | 0.956(0.004) | 0.840(0.002) | 0.741(0.004) | 0.691(0.009) | 1.366(0.175) | 0.730(0.026) | 0.556(0.012) |
| MolCLR [24] | 0.828(0.007) | 0.733(0.010) | 0.898(0.027) | 0.741(0.053) | 0.659(0.021) | 0.612(0.036) | 2.301(0.247) | 1.113(0.023) | 0.789(0.009) |
| ImageMol [29] | 0.939(0.010) | 0.952(0.002) | 0.975(0.007) | 0.847(0.003) | 0.752(0.002) | 0.708(0.010) | 1.149(0.004) | 0.690(0.690) | 0.625(0.009) |
| GEM [26] | 0.925(0.010) | 0.953(0.007) | 0.977(0.019) | 0.849(0.003) | 0.742(0.004) | 0.663(0.014) | - | - | - |
| KANO [25] | 0.931(0.021) | 0.960(0.016) | 0.944(0.003) | 0.837(0.013) | 0.732(0.016) | 0.652(0.008) | 1.142(0.258) | 0.670(0.019) | 0.566(0.007) |
| SCAGE (Ours) | 0.959(0.010) | 0.968(0.003) | 0.993(0.001) | 0.856(0.009) | 0.753(0.011) | 0.734(0.014) | 0.802(0.033) | 0.621(0.011) | 0.534(0.006) |

  1. The best-performing results are highlighted in bold. The second-best results are underlined. Note that “-” denotes that the result is not available.
  2. The evaluation metrics are the area under the receiver operating characteristic curve (AUC-ROC) for classification, where higher is better, and the root mean square error (RMSE) for regression, where lower is better.
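
For readers less familiar with the two metrics named in the notes, the following is a minimal standard-library sketch of how each can be computed for a single task. It is not the authors' evaluation code: the function names `rmse` and `auc_roc` are illustrative, and the AUC-ROC is computed with the pairwise-ranking formulation (the fraction of positive/negative pairs in which the positive is scored higher, with ties counted as one half), which is equivalent to the area under the ROC curve.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y - y_hat)^2)). Lower is better."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def auc_roc(labels, scores):
    """AUC-ROC for binary labels (0/1) via pairwise ranking. Higher is better.

    Equals the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative; ties count as 0.5.
    This O(P*N) loop is for illustration, not for large datasets.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values for illustration only; they are not taken from the table.
    print(rmse([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))
    print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

How scores are aggregated across tasks for the multi-task datasets (for example, the 12 Tox21 tasks) is not stated in the table, so this sketch covers the single-task case only.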