Table 2 Transformer parameter settings in the dual-encoder and the substructure-level sequence-to-sequence model
From: Single-step retrosynthesis prediction by leveraging commonly preserved substructures
| Parameters | Dual-encoder | Substructure-level seq-to-seq |
|---|---|---|
| Embedding size | 512 | 512 |
| Hidden size | 256 | 512 |
| Feedforward hidden size | 2048 | 2048 |
| Encoder blocks | 3 | 10 |
| Encoder attention heads | 8 | 8 |
| Max total training steps | 500,000 | 500,000 |
| Warm-up steps | 4000 | 8000 |
| Dropout | 0.1 | 0.1 |
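The two columns above can be captured as configuration objects. This is a minimal sketch, assuming a plain Python representation; the class and field names are illustrative and not taken from the authors' code. The `noam_lr` helper shows the standard warm-up learning-rate schedule from "Attention Is All You Need", which is the usual interpretation of a "warm-up steps" hyperparameter in Transformer training, though the paper may use a different schedule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransformerConfig:
    """Hyperparameters from Table 2 (field names are illustrative)."""
    embedding_size: int
    hidden_size: int
    ffn_hidden_size: int
    encoder_blocks: int
    encoder_attention_heads: int
    max_training_steps: int
    warmup_steps: int
    dropout: float

# Dual-encoder column of Table 2
DUAL_ENCODER = TransformerConfig(
    embedding_size=512, hidden_size=256, ffn_hidden_size=2048,
    encoder_blocks=3, encoder_attention_heads=8,
    max_training_steps=500_000, warmup_steps=4000, dropout=0.1,
)

# Substructure-level seq-to-seq column of Table 2
SUBSTRUCTURE_SEQ2SEQ = TransformerConfig(
    embedding_size=512, hidden_size=512, ffn_hidden_size=2048,
    encoder_blocks=10, encoder_attention_heads=8,
    max_training_steps=500_000, warmup_steps=8000, dropout=0.1,
)

def noam_lr(step: int, model_size: int, warmup_steps: int) -> float:
    """Warm-up schedule: lr rises linearly for `warmup_steps`, then decays as step^-0.5."""
    return model_size ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```

With this schedule, the learning rate peaks exactly at `warmup_steps`, so the substructure-level model (8000 warm-up steps) ramps up more slowly than the dual-encoder (4000).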