Table 5. Screening success of the general and target-specific scoring functions trained with MLR, SMOreg and RF for the FA7, RENI, TRYB1 and UROK datasets from DUD-E. ac, dec and tot are the numbers of active compounds, decoy compounds and total molecules in the final dataset (i.e., compounds that were docked with DockThor and rescored with DockTScore). For each scoring function (SF), only the top-scored protonation state of each compound was kept.

From: New machine learning and physics-based scoring functions for drug discovery

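| Target      | Metric               | General MLR | General SMOreg | General RF | Protease-specific MLR | Protease-specific SMOreg | Protease-specific RF |
|-------------|----------------------|-------------|----------------|------------|-----------------------|--------------------------|----------------------|
| FA7         | AUC                  | 0.789       | 0.860          | 0.875      | 0.818                 | 0.893                    | 0.869                |
| ac = 112    | EF1% (max = 52.973)  | 8.979       | 9.876          | 8.979      | 12.570                | 17.059                   | 17.059               |
| dec = 5,821 | BEDROC20             | 0.299       | 0.346          | 0.328      | 0.350                 | 0.478                    | 0.397                |
| tot = 5,933 | BEDROC100            | 0.181       | 0.181          | 0.165      | 0.230                 | 0.333                    | 0.310                |
| RENI        | AUC                  | 0.786       | 0.769          | 0.763      | 0.807                 | 0.771                    | 0.782                |
| ac = 73     | EF1% (max = 86.425)  | 16.462      | 20.577         | 10.975     | 17.834                | 16.462                   | 8.231                |
| dec = 6,236 | BEDROC20             | 0.300       | 0.334          | 0.271      | 0.349                 | 0.346                    | 0.268                |
| tot = 6,309 | BEDROC100            | 0.253       | 0.281          | 0.155      | 0.283                 | 0.207                    | 0.119                |
| TRYB1       | AUC                  | 0.619       | 0.649          | 0.614      | 0.651                 | 0.651                    | 0.633                |
| ac = 147    | EF1% (max = 51.633)  | 1.359       | 1.359          | 2.038      | 4.076                 | 7.473                    | 8.153                |
| dec = 7,443 | BEDROC20             | 0.099       | 0.103          | 0.080      | 0.141                 | 0.203                    | 0.169                |
| tot = 7,590 | BEDROC100            | 0.037       | 0.040          | 0.046      | 0.080                 | 0.167                    | 0.167                |
| UROK        | AUC                  | 0.740       | 0.774          | 0.775      | 0.762                 | 0.814                    | 0.788                |
| ac = 129    | EF1% (max = 69.837)  | 7.760       | 8.536          | 6.208      | 11.640                | 14.743                   | 10.088               |
| dec = 8,880 | BEDROC20             | 0.262       | 0.306          | 0.295      | 0.295                 | 0.352                    | 0.283                |
| tot = 9,009 | BEDROC100            | 0.123       | 0.147          | 0.118      | 0.179                 | 0.232                    | 0.182                |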
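The "EF1% (max = ...)" entries can be related to the ac and tot counts with a short calculation. The sketch below is an illustration, not the authors' code: it assumes the common definition of the enrichment factor (fraction of actives among the top-ranked 1% divided by the fraction of actives in the whole dataset) and assumes the size of the top 1% is rounded to the nearest integer. The function names `enrichment_factor` and `max_enrichment_factor` are hypothetical. The hit count used in the usage example (10 actives in the top 1% for FA7) is back-calculated from the reported EF1% and is not stated in the table.

```python
def enrichment_factor(n_hits, n_actives, n_total, fraction=0.01):
    """Enrichment factor at a given fraction of the ranked list.

    n_hits    : actives found in the top `fraction` of the ranking
    n_actives : actives in the whole dataset ("ac" in the table)
    n_total   : all molecules in the dataset ("tot" in the table)
    """
    # Assumption: the cutoff size is rounded to the nearest integer.
    n_selected = max(1, round(fraction * n_total))
    hit_rate_top = n_hits / n_selected
    hit_rate_all = n_actives / n_total
    return hit_rate_top / hit_rate_all


def max_enrichment_factor(n_actives, n_total, fraction=0.01):
    """Best achievable EF: every top-ranked slot holds an active (if enough exist)."""
    n_selected = max(1, round(fraction * n_total))
    best_hits = min(n_actives, n_selected)
    return enrichment_factor(best_hits, n_actives, n_total, fraction)


if __name__ == "__main__":
    # ac and tot values taken from the table.
    datasets = {"FA7": (112, 5933), "RENI": (73, 6309),
                "TRYB1": (147, 7590), "UROK": (129, 9009)}
    for name, (ac, tot) in datasets.items():
        print(name, round(max_enrichment_factor(ac, tot), 3))
    # Inferred (not reported) hit count: 10 actives in the top 1% for FA7.
    print("FA7 EF1% with 10 hits:", round(enrichment_factor(10, 112, 5933), 3))
```

Under these assumptions the maximum reduces to tot/ac whenever the dataset contains at least as many actives as there are slots in the top 1%, which is the case for all four targets here. This makes the reported EF1% values easier to compare across targets by expressing them as a share of the attainable maximum.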