Table 6 Comparison among the molecular generative models targeting each protein

From: Molecular optimization using a conditional transformer for reaction-aware compound exploration with reinforcement learning

| Protein | Model | Total | Uniqueness (%) | Uniqueness to USPTO (%) | FCD | QSAR > 0.5: Num | QSAR > 0.5: Diversity |
|---|---|---|---|---|---|---|---|
| DRD2 | TRACER (Sum of SMs 1–5) | 96043 | 89.2 | 99.8 | 14.3 | 16372 (17.0%) | 0.523 |
| | Molecule Chef | 218823 | 80.0 | 91.1 | 2.60 | 1963 (0.90%) | 0.490 |
| | DoG-Gen | 153199 | 73.1 | 41.3 | 2.97 | 26780 (23.9%) | 0.425 |
| | CasVAE | 615 | 85.7 | 91.5 | - | 201 (32.7%) | 0.526 |
| | SynFlowNet | 58680 | 10.9 | 99.6 | 17.0 | 3 (0.09%) | 0.343 |
| AKT1 | TRACER (Sum of SMs 6–10) | 40959 | 82.3 | 99.9 | 12.2 | 10424 (25.5%) | 0.564 |
| | Molecule Chef | 174959 | 99.5 | 95.2 | 3.13 | 827 (0.47%) | 0.534 |
| | DoG-Gen | 164369 | 74.2 | 46.2 | 3.14 | 34199 (28.0%) | 0.639 |
| | CasVAE | 830 | 79.8 | 71.8 | - | 295 (35.5%) | 0.396 |
| | SynFlowNet | 61999 | 2.42 | 99.9 | - | 87 (0.14%) | 0.551 |
| CXCR4 | TRACER (Sum of SMs 11–15) | 80280 | 92.4 | 99.8 | 11.2 | 10834 (13.5%) | 0.552 |
| | Molecule Chef | 151469 | 99.3 | 94.1 | 1.92 | 765 (0.51%) | 0.578 |
| | DoG-Gen | 143602 | 82.6 | 44.6 | 4.42 | 29204 (24.6%) | 0.603 |
| | CasVAE | 540 | 74.3 | 48.1 | - | 63 (11.7%) | 0.476 |
| | SynFlowNet | 62627 | 2.85 | 99.9 | - | 7 (0.01%) | 0.478 |

  1. In accordance with the original paper, which indicated that FCD calculations are unreliable when using fewer than 5000 compounds, FCD values are not shown for cases where the number of unique molecules was less than 5000.
  2. The highest values or efficiencies are shown in bold.
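
The reporting rule in footnote 1 can be illustrated with a minimal sketch. It assumes that the number of unique molecules is approximately Total × Uniqueness (%) / 100, which the table does not state explicitly; the function names `unique_count` and `fcd_reportable` are hypothetical and not taken from the paper.

```python
# Threshold from footnote 1: FCD is omitted below 5000 unique molecules.
FCD_MIN_UNIQUE = 5000


def unique_count(total: int, uniqueness_pct: float) -> int:
    """Estimate the unique-molecule count (assumed: total * uniqueness / 100)."""
    return round(total * uniqueness_pct / 100)


def fcd_reportable(total: int, uniqueness_pct: float,
                   threshold: int = FCD_MIN_UNIQUE) -> bool:
    """Return True if the estimated unique count reaches the FCD threshold."""
    return unique_count(total, uniqueness_pct) >= threshold


# (Total, Uniqueness %) pairs copied from the table.
rows = {
    ("DRD2", "SynFlowNet"): (58680, 10.9),
    ("AKT1", "SynFlowNet"): (61999, 2.42),
    ("DRD2", "CasVAE"): (615, 85.7),
}

for (protein, model), (total, uniq) in rows.items():
    n_unique = unique_count(total, uniq)
    status = "FCD shown" if fcd_reportable(total, uniq) else "FCD omitted"
    print(f"{protein} {model}: ~{n_unique} unique molecules, {status}")
```

Under this assumption, the estimate agrees with the table for these three rows: SynFlowNet on DRD2 has an FCD value, while SynFlowNet on AKT1 and CasVAE on DRD2 show "-".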