Table 1 Comparison of DRAGONFLY with a fine-tuned recurrent neural network (RNN) approach, assessing the percentage of molecules meeting various criteria: (i) Unique and novel, (ii) Novelty score ≥ 0.65, (iii) Retrosynthetic accessibility score (RAScore) ≥ 0.5, (iv) QSAR score ≤ 1 μM, and (v) meeting all four criteria

From: Prospective de novo drug design with deep interactome learning

| Template / Method | Unique and novel / % | Novelty score ≥ 0.65 / % | RAScore ≥ 0.5 / % | QSAR score ≤ 1 μM / % | All criteria / % |
|---|---|---|---|---|---|
| PPARγ | | | | | |
| RNN-SMILES | 75.4 (±2.7) | 28.7 (±1.1) | 67.9 (±2.3) | 29.6 (±2.4) | 5.1 (±0.2) |
| DRAGONFLY-SMILES | 91.8 (±0.3) | 47.9 (±1.4) | 86.0 (±0.3) | 34.7 (±0.3) | 9.4 (±0.0) |
| DRAGONFLY-SELFIES | 99.8 (±0.1) | 77.4 (±0.1) | 82.2 (±0.2) | 31.9 (±0.1) | 13.3 (±0.0) |
| LXRβ | | | | | |
| RNN-SMILES | 92.4 (±2.5) | 65.9 (±2.6) | 87.9 (±2.8) | 28.6 (±0.9) | 11.3 (±0.4) |
| DRAGONFLY-SMILES | 94.3 (±0.5) | 80.2 (±1.2) | 89.1 (±0.5) | 26.2 (±0.2) | 11.8 (±0.1) |
| DRAGONFLY-SELFIES | 100 (±0.0) | 91.3 (±0.5) | 84.2 (±0.3) | 27.9 (±0.2) | 11.1 (±0.1) |
| RARα | | | | | |
| RNN-SMILES | 69.7 (±5.9) | 41.9 (±3.3) | 57.2 (±4.3) | 30.1 (±1.8) | 11.1 (±0.7) |
| DRAGONFLY-SMILES | 92.2 (±0.4) | 62.4 (±0.7) | 75.6 (±0.5) | 32.4 (±0.7) | 12.7 (±0.2) |
| DRAGONFLY-SELFIES | 99.8 (±0.0) | 87.5 (±0.3) | 77.1 (±0.2) | 29.6 (±0.3) | 14.0 (±0.1) |
| BRAF | | | | | |
| RNN-SMILES | 89.2 (±3.5) | 35.1 (±3.1) | 85.9 (±3.0) | 35.0 (±1.3) | 6.7 (±0.3) |
| DRAGONFLY-SMILES | 87.9 (±0.6) | 46.0 (±0.8) | 80.9 (±0.5) | 42.9 (±0.5) | 10.7 (±0.1) |
| DRAGONFLY-SELFIES | 99.7 (±0.1) | 81.1 (±0.6) | 77.3 (±0.4) | 34.3 (±0.1) | 12.4 (±0.0) |
| BTK | | | | | |
| RNN-SMILES | 82.0 (±4.4) | 64.5 (±4.1) | 61.9 (±4.7) | 20.7 (±1.8) | 4.5 (±0.2) |
| DRAGONFLY-SMILES | 88.9 (±0.7) | 53.2 (±0.4) | 69.6 (±0.9) | 36.3 (±0.7) | 8.8 (±0.1) |
| DRAGONFLY-SELFIES | 100 (±0.0) | 85.8 (±0.7) | 68.2 (±1.0) | 25.8 (±0.1) | 5.8 (±0.0) |
| JAK2 | | | | | |
| RNN-SMILES | 88.8 (±3.9) | 60.2 (±4.2) | 79.9 (±3.4) | 35.0 (±2.2) | 14.5 (±0.8) |
| DRAGONFLY-SMILES | 84.8 (±1.0) | 39.4 (±0.9) | 69.0 (±1.0) | 55.9 (±1.5) | 14.8 (±0.2) |
| DRAGONFLY-SELFIES | 99.2 (±0.0) | 73.3 (±0.8) | 70.5 (±0.5) | 50.5 (±1.0) | 18.3 (±0.2) |

Bold indicates whether the SELFIES- or SMILES-based models achieve a higher value for the investigated property in both structure- and ligand-based models. The values are presented as mean and standard deviation, based on three runs (N = 3), each sampling 2000 SMILES strings. The complete list of 20 investigated targets can be found in Tables S2–S6. JAK: Janus kinase; PPAR: peroxisome proliferator-activated receptor; BRAF: serine/threonine-protein kinase B-Raf (rapidly accelerated fibrosarcoma); BTK: Bruton's tyrosine kinase; RAR: retinoic acid receptor; LXR: liver X receptor.
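
The caption's thresholds and the mean and standard deviation over three runs can be illustrated with a short sketch. This is not the authors' evaluation code: the `Molecule` record, its field names, and the helper functions are hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption, since the table does not say which was used.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Molecule:
    """Hypothetical per-molecule record; field names are illustrative only."""
    unique_and_novel: bool   # criterion (i)
    novelty_score: float     # criterion (ii)
    ra_score: float          # criterion (iii), retrosynthetic accessibility
    qsar_score_um: float     # criterion (iv), QSAR-predicted activity in μM


def meets_all(m: Molecule) -> bool:
    """Criterion (v): all four thresholds from the table caption hold."""
    return (
        m.unique_and_novel
        and m.novelty_score >= 0.65
        and m.ra_score >= 0.5
        and m.qsar_score_um <= 1.0
    )


def percent_meeting_all(run: list[Molecule]) -> float:
    """Percentage of molecules in one sampling run that pass every criterion."""
    return 100.0 * sum(meets_all(m) for m in run) / len(run)


def summarize(runs: list[list[Molecule]]) -> tuple[float, float]:
    """Mean and sample standard deviation of the per-run percentages.

    Assumption: sample standard deviation (statistics.stdev); swap in
    statistics.pstdev if the population form is intended.
    """
    percentages = [percent_meeting_all(run) for run in runs]
    return mean(percentages), stdev(percentages)
```

Under these assumptions, each cell in the "All criteria" column would correspond to `summarize` applied to three runs of 2000 sampled molecules; the other columns follow the same pattern with a single criterion in place of `meets_all`.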