Table 1. Top-k accuracies (%) of the RSGPT model and baselines on the USPTO-50k [60] dataset with reaction class unknown

From: RSGPT: a generative transformer model for retrosynthesis planning pre-trained on ten billion datapoints

| Model | Top-1 | Top-3 | Top-5 | Top-10 |
|---|---|---|---|---|
| *Template-based* | | | | |
| RetroSim [61] | 37.3 | 54.7 | 63.6 | 74.1 |
| NeuralSym [62] | 44.4 | 65.3 | 72.4 | 78.9 |
| GLN [8] | 52.5 | 69.0 | 75.6 | 83.7 |
| LocalRetro [52] | 53.4 | 77.5 | 85.9 | 92.4 |
| RetroComposer [9] | 54.5 | 77.2 | 83.2 | 87.7 |
| *Semi-template-based* | | | | |
| G2G [63] | 48.9 | 67.6 | 72.5 | 75.5 |
| RetroXpert [51] | 50.4 | 61.1 | 62.3 | 63.4 |
| RetroPrime [64] | 51.4 | 70.8 | 74.0 | 76.1 |
| G2Retro [65] | 54.1 | 74.1 | 81.2 | 86.7 |
| SemiRetro [12] | 54.9 | 75.3 | 80.4 | 84.1 |
| Graph2Edits [13] | 55.1 | 77.3 | 83.4 | 89.4 |
| *Template-free* | | | | |
| SCROP [17] | 43.7 | 60.0 | 65.2 | 68.7 |
| MEGAN [66] | 48.1 | 70.7 | 78.4 | 86.1 |
| Graph2SMILES [18] | 52.9 | 66.5 | 70.0 | 72.9 |
| R-SMILES [43] | 56.3 | 79.2 | 86.2 | 91.0 |
| NAG2G [10] | 55.1 | 76.9 | 83.4 | 89.9 |
| EditRetro [45] (×20)ᵃ | 60.8 | 80.6 | 86.0 | 90.3 |
| RSGPT | 63.4 | 84.2 | 89.2 | 93.0 |
| RSGPT (×20)ᵃ | **77.0** | **90.9** | **94.3** | **96.7** |

  1. a Twenty-fold SMILES augmentation was applied to both the training and test sets.
  2. The performance of existing methods is taken from their respective references. The best-performing results are marked in bold. Model types are distinguished by the italicized group labels “Template-based”, “Semi-template-based”, and “Template-free”.
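
For readers unfamiliar with the metric, the sketch below shows one common way a top-k accuracy such as those in the table can be computed: a test reaction counts as a hit if the ground-truth reactant string appears among the model's first k ranked predictions. This is a minimal illustration, not the authors' evaluation script. The function name `top_k_accuracy` and the toy SMILES strings are hypothetical, and the code assumes all strings have already been canonicalized so that exact string matching is meaningful.

```python
from typing import Sequence


def top_k_accuracy(
    predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int,
) -> float:
    """Return the percentage of targets found in the first k ranked predictions.

    Assumption: predictions and targets are already canonical SMILES strings,
    so a plain string comparison decides whether a prediction is correct.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one test example is required")
    hits = sum(
        1 for ranked, truth in zip(predictions, targets) if truth in ranked[:k]
    )
    return 100.0 * hits / len(targets)


# Hypothetical toy data: four products, two ranked reactant predictions each.
ranked_predictions = [
    ["CCO.CC(=O)O", "CCO.CC(=O)Cl"],
    ["Brc1ccccc1.OB(O)c1ccccc1", "Ic1ccccc1.OB(O)c1ccccc1"],
    ["CN.CC(=O)O", "CN.CC(=O)Cl"],
    ["CC=O.CN", "CC(=O)O.CN"],
]
ground_truth = [
    "CCO.CC(=O)O",              # matched at rank 1
    "Ic1ccccc1.OB(O)c1ccccc1",  # matched at rank 2
    "CN.CC(=O)Cl",              # matched at rank 2
    "CCBr.CN",                  # not among the predictions
]

for k in (1, 2):
    print(f"Top-{k}: {top_k_accuracy(ranked_predictions, ground_truth, k):.1f}")
```

In practice, predicted and reference SMILES are usually canonicalized with a cheminformatics toolkit before comparison, since the same molecule can be written as many different strings.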