Extended Data Fig. 1: Explanation of the double sub-flow design of SynGFN and the pre-training strategy. | Nature Computational Science

From: SynGFN: learning across chemical space with generative flow-based molecular discovery

(a) The hierarchical action design adopted by SynGFN, illustrating how a new reactant is added in a single step. The current state St represents a building block or an intermediate product. Policy model 1 and Policy model 2 predict actions based on the current state and the previous actions. For Action 1, the model predicts a probability for each candidate reaction; for Action 2, it predicts a probability for each candidate reactant in the same way. The sampled actions are the selected reaction and the selected reactant.

(b) The flow network behind SynGFN. The flow network is defined as a directed acyclic graph (DAG). Each node represents a state, with intermediate states denoted x0, x1 and terminal states denoted t0, t1. Generating an object in the flow network is analogous to water flowing from a start point to an endpoint: the probability of taking an action at a state corresponds to the flow through a pipe leaving that state. A flow-matching constraint ensures that the flow entering a state equals the flow leaving it, and the flow into each terminal state is constrained to equal its reward R(x). We implement a double sub-flow design for state transitions, representing the hierarchical actions. For Action 2, Policy model 2 is pre-trained on a multi-label classification task so that, before the main training, it learns to recognize and select reactants that match the specified reaction.
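The flow-matching constraint and the double sub-flow idea in panel (b) can be made concrete with a small numerical sketch. The Python below is not the SynGFN implementation. It is a toy network under stated assumptions: the node names (`s0`, `s0|rxnA`, and so on), the edge flows, and the rewards are all hypothetical, and each state transition is modelled as two consecutive edges, one for Action 1 (choose a reaction) and one for Action 2 (choose a reactant). The sketch checks that inflow equals outflow at every intermediate node, that inflow equals R(x) at every terminal node, and that action probabilities can be read off as normalized outgoing flows.

```python
from collections import defaultdict

# Hypothetical toy flow network; names and numbers are illustrative only.
# Every transition is split into two sub-flows:
#   state --(Action 1: reaction)--> "state|reaction" --(Action 2: reactant)--> next state
EDGE_FLOW = {
    ("s0", "s0|rxnA"): 6.0,
    ("s0", "s0|rxnB"): 4.0,
    ("s0|rxnA", "x0"): 2.0,
    ("s0|rxnA", "t0"): 4.0,
    ("s0|rxnB", "x0"): 1.0,
    ("s0|rxnB", "t1"): 3.0,
    ("x0", "x0|rxnA"): 3.0,
    ("x0|rxnA", "t1"): 3.0,
}
REWARD = {"t0": 4.0, "t1": 6.0}  # assumed rewards R(x) of the terminal states
SOURCE = "s0"


def node_flows(edge_flow):
    """Return (inflow, outflow) per node, summed over all edges."""
    inflow, outflow = defaultdict(float), defaultdict(float)
    for (src, dst), flow in edge_flow.items():
        outflow[src] += flow
        inflow[dst] += flow
    return inflow, outflow


def flow_matching_residuals(edge_flow, reward, source):
    """Inflow minus target for each non-source node.

    The target is R(x) for terminal nodes and the total outflow otherwise;
    all residuals are zero when the flow-matching constraint holds.
    """
    inflow, outflow = node_flows(edge_flow)
    nodes = (set(inflow) | set(outflow)) - {source}
    return {
        n: inflow[n] - (reward[n] if n in reward else outflow[n])
        for n in nodes
    }


def forward_policy(edge_flow, node):
    """Action probabilities at `node`: outgoing edge flows, normalized."""
    _, outflow = node_flows(edge_flow)
    return {
        dst: flow / outflow[node]
        for (src, dst), flow in edge_flow.items()
        if src == node
    }


def terminal_probability(edge_flow, node, terminal):
    """Probability of ending at `terminal` when following the policy from `node`."""
    if node == terminal:
        return 1.0
    return sum(
        p * terminal_probability(edge_flow, child, terminal)
        for child, p in forward_policy(edge_flow, node).items()
    )
```

In this toy network, calling `terminal_probability(EDGE_FLOW, SOURCE, "t1")` sums over all three routes into `t1`, and the result is that terminal state's share of the total reward. That proportionality between sampling probability and reward is the property the flow-matching constraint is meant to deliver.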
