Table 2 Hyperparameters and their values used in the quantum-compiling problem with rotation operators. The proximal policy optimization (PPO) agent uses scaled exponential linear units (SELU) as the nonlinear activation functions in the hidden layers of its neural network.

From: Quantum compiling by deep reinforcement learning

Area             Hyperparameter        Value
---------------  --------------------  ----------
Neural network   hidden layer sizes    128, 128
                 activations           SELU, SELU
Training         learning rate         0.0001
                 batch size            128
                 # agents              40
Algorithm        max episode length    300
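
The network rows of the table (two hidden layers of 128 units, each followed by SELU) can be sketched as a minimal NumPy forward pass. This is an illustrative assumption, not the paper's implementation: the observation and action dimensions (10 and 4) are placeholders, and the actual agent would be built in a deep learning framework with trainable weights.

```python
import numpy as np

# SELU constants (Klambauer et al., self-normalizing networks)
SELU_SCALE = 1.0507009873554805
SELU_ALPHA = 1.6732632423543772

def selu(x):
    """Scaled exponential linear unit, as used in the hidden layers."""
    return SELU_SCALE * np.where(x > 0, x, SELU_ALPHA * np.expm1(x))

def mlp_forward(x, weights, biases):
    """SELU on every hidden layer; linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = selu(h @ W + b)
    return h @ weights[-1] + biases[-1]

# Placeholder input/output sizes around the two 128-unit hidden layers.
rng = np.random.default_rng(0)
sizes = [10, 128, 128, 4]
weights = [rng.standard_normal((m, n)) / np.sqrt(m)
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

out = mlp_forward(rng.standard_normal((1, 10)), weights, biases)
```

The 1/sqrt(fan-in) weight scaling is a common choice that pairs well with SELU's self-normalizing property; PPO itself (clipped surrogate objective, 40 parallel agents, batch size 128) sits on top of this network and is not shown here.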