Table 2: Hyperparameters and their values used in the problem of quantum compiling by rotation operators. The proximal policy optimization (PPO) agent uses scaled exponential linear units (SELU) as nonlinear activation functions in the hidden layers of the neural network.
| Area | Hyperparameter | Value |
|---|---|---|
| Neural network | hidden layer sizes | 128, 128 |
| | activations | SELU, SELU |
| Training | learning rate | 0.0001 |
| | batch size | 128 |
| | # agents | 40 |
| Algorithm | max episode length | 300 |
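To make the network architecture in the table concrete, the following is a minimal NumPy sketch of a policy network with two 128-unit SELU hidden layers. The input and output dimensions (`obs_dim`, `n_actions`) are illustrative placeholders, not values from the table, and this is a hand-rolled forward pass rather than the authors' implementation.

```python
import numpy as np

# SELU constants (Klambauer et al., 2017)
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit activation."""
    return SELU_SCALE * np.where(x > 0, x, SELU_ALPHA * (np.exp(x) - 1.0))

def init_layer(n_in, n_out, rng):
    """LeCun-normal initialisation, as recommended for SELU networks."""
    return rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out)), np.zeros(n_out)

def policy_forward(obs, params):
    """Forward pass: two SELU hidden layers, then a softmax over actions."""
    h = obs
    for W, b in params[:-1]:
        h = selu(h @ W + b)
    W, b = params[-1]
    logits = h @ W + b
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
obs_dim, n_actions = 12, 6              # illustrative dimensions only
params = [init_layer(obs_dim, 128, rng),
          init_layer(128, 128, rng),    # two hidden layers of 128 units each
          init_layer(128, n_actions, rng)]

probs = policy_forward(rng.normal(size=obs_dim), params)
```

The LeCun-normal initialisation matters here: SELU's self-normalising property only holds when weights are drawn with variance 1/fan-in.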