Fig. 12: Normalized differences in setpoints selected by two policies trained under identical conditions, differing only in the random seed used to initialize neural network weights. | Communications Engineering

From: Dynamic optimizers for complex industrial systems via direct data-driven synthesis

Deviations from zero indicate differences in the setpoints selected by the two policies. For instance, the second policy chooses a lower reactor pressure and a higher recycle valve position (suboptimal) but compensates by selecting a lower reactor level (optimal). These variations illustrate the behavioral variability of the learned policies and the optimizer’s ability to achieve its objectives via alternative strategies. Here, yA and yAC denote the respective proportions of A and A + C in the reactor feed.
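
As a rough illustration of how such a comparison could be computed, the sketch below scales the difference between two policies' setpoints by each setpoint's allowed range. The caption does not state the normalization actually used, so dividing by the range is an assumption, and the function name, setpoint names, ranges, and numeric values are all hypothetical, chosen only to mirror the directions described in the caption.

```python
from typing import Dict, Tuple


def normalized_setpoint_differences(
    first: Dict[str, float],
    second: Dict[str, float],
    ranges: Dict[str, Tuple[float, float]],
) -> Dict[str, float]:
    """Return (second - first) / (high - low) for each setpoint.

    Assumption: normalization by the setpoint's allowed range; the
    figure's actual normalization is not specified in the caption.
    """
    diffs = {}
    for name, (low, high) in ranges.items():
        span = high - low
        if span <= 0:
            raise ValueError(f"invalid range for {name}: ({low}, {high})")
        diffs[name] = (second[name] - first[name]) / span
    return diffs


# Hypothetical ranges and setpoints, for illustration only.
ranges = {
    "reactor_pressure": (2600.0, 2900.0),
    "recycle_valve": (0.0, 100.0),
    "reactor_level": (50.0, 100.0),
}
policy_1 = {"reactor_pressure": 2800.0, "recycle_valve": 20.0, "reactor_level": 70.0}
policy_2 = {"reactor_pressure": 2750.0, "recycle_valve": 30.0, "reactor_level": 65.0}

diffs = normalized_setpoint_differences(policy_1, policy_2, ranges)
for name, value in diffs.items():
    # A value of zero would mean both policies chose the same setpoint.
    print(f"{name}: {value:+.3f}")
```

With these made-up values, a negative entry means the second policy selected a lower setpoint than the first, and a positive entry means it selected a higher one.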
