Fig. 8: CSS code and encoding discovery in a next-to-nearest neighbor connectivity with periodic boundary conditions.

The initial layer of Hadamard gates was chosen by us and fixed. We considered two scenarios: Starting from just that initial Hadamard layer (as in [[17, 1, 5]]) or also providing the first layer of CNOTs to start from neighboring Bell pairs (as in [[25, 1, 5]]). The rest of the circuit is successfully discovered by the RL agent. We remark that the agent does not place gates in parallel, the circuits shown here show gates in parallel for compactness.