Table 4 Top-k accuracies of RSGPT in ablation experiments on training strategies and data augmentation, evaluated on the USPTO-50k dataset [60] with reaction class unknown

From: RSGPT: a generative transformer model for retrosynthesis planning pre-trained on ten billion datapoints

| Training strategies: Pre-training, RLAIF^a, Fine-tuning | Top-1 (%) | Top-3 (%) | Top-5 (%) | Top-10 (%) |
| --- | --- | --- | --- | --- |
|  | 63.4 | 84.2 | 89.2 | 93.0 |
| × | 59.9 | 80.0 | 87.3 | 92.9 |
| × × | 26.4 | 37.5 | 41.4 | 46.4 |

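| Augmentation of training set | Augmentation of test set | Top-1 (%) | Top-3 (%) | Top-5 (%) | Top-10 (%) |
| --- | --- | --- | --- | --- | --- |
| ×1 | ×1 | 63.4 | 84.2 | 89.2 | 93.0 |
| ×20 | ×1 | 55.1 | 73.6 | 78.8 | 85.0 |
| ×20 | ×5 | 75.9 | 90.0 | 93.6 | 96.0 |
| ×20 | ×10 | 76.5 | 90.7 | 94.2 | 96.4 |
| ×20 | ×20 | 77.0 | 90.9 | 94.3 | 96.7 |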

  1. a RLAIF represents reinforcement learning from artificial intelligence feedback.
  2. The performance regarding existing methods is derived from their references. The best-performing results are marked in bold.
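
For readers unfamiliar with the metric, the following is a minimal sketch of how a top-k accuracy such as those reported in Table 4 can be computed from ranked predictions. It assumes that a prediction counts as correct when it matches the ground-truth string exactly; the function name `top_k_accuracy` and the toy data are hypothetical and are not taken from the RSGPT code, which may compare molecules differently (for example after canonicalization).

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   targets: Sequence[str],
                   k: int) -> float:
    """Percentage of examples whose target is among the first k predictions.

    Assumption: correctness is exact string equality between a prediction
    and the target (hypothetical choice for this sketch).
    """
    if len(ranked_predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one example is required")
    hits = sum(
        target in preds[:k]
        for preds, target in zip(ranked_predictions, targets)
    )
    return 100.0 * hits / len(targets)


# Toy example with placeholder labels (not real reactants, not paper data).
predictions = [
    ["A", "B", "C"],  # target "A" is ranked first
    ["D", "E", "F"],  # target "E" is ranked second
    ["G", "H", "I"],  # target "I" is ranked third
    ["J", "K", "L"],  # target "Z" is never predicted
]
targets = ["A", "E", "I", "Z"]

for k in (1, 2, 3):
    print(f"Top-{k}: {top_k_accuracy(predictions, targets, k):.1f}%")
```

Because the first k predictions always contain the first k-1, top-k accuracy cannot decrease as k grows, which matches the left-to-right increase in every row of the table.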