Fig. 2: Initial state preparation by the reinforcement learning (RL) method.
From: Single-atom exploration of optimized nonequilibrium quantum thermodynamics by reinforcement learning

a, b Designed pulses of the effective Rabi frequency \(\widetilde{\Omega}\) and phase ϕ, respectively, for initial state preparation using the RL method. c Experimental measurement of the Stokes parameters Sx, Sy, and Sz, compared with the theoretical simulation; the Stokes parameters are acquired by measuring the populations along the x, y, and z directions (see Supplementary Note 3). The error bars indicate the statistical standard deviation of the experimental data, obtained from 10,000 measurements for each data point. After the last step, the fidelity is F = 0.9799 ± 0.0103. d Time evolution of the population Pe due to decay, from which we obtain the effective decay rate γeff = 11.99 kHz. Other parameters: Rabi frequency Ω0/2π = 20 kHz and duration of each step δτ = 0.25/Ω0.
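
As a rough illustration of the quantities in this caption, the Python sketch below converts a measured population into a Stokes parameter with a shot-noise error bar and evaluates the step duration δτ. It rests on assumptions that are not stated in the caption: the common convention S = 2P − 1 for a population P measured along one axis, binomial statistics over the 10,000 repetitions, and Ω0 expressed as an angular frequency. The function names are hypothetical, and the actual procedure is the one described in Supplementary Note 3.

```python
import math

N_SHOTS = 10_000  # measurements per data point, as stated in the caption


def stokes_from_population(p_up: float, n_shots: int = N_SHOTS):
    """Return (S, sigma_S) for one measurement axis.

    Assumption (not from the caption): S = 2*P - 1, with P the measured
    population along that axis, and binomial (shot-noise) statistics.
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("population must lie in [0, 1]")
    s = 2.0 * p_up - 1.0
    # Standard deviation of a binomial proportion estimated from n_shots trials
    sigma_p = math.sqrt(p_up * (1.0 - p_up) / n_shots)
    # S is linear in P with slope 2, so the error scales by the same factor
    return s, 2.0 * sigma_p


def step_duration(rabi_hz: float = 20e3, factor: float = 0.25) -> float:
    """Return delta_tau = factor / Omega_0 in seconds.

    Assumption: Omega_0 is an angular frequency, Omega_0 = 2*pi*rabi_hz,
    with Omega_0 / 2pi = 20 kHz taken from the caption.
    """
    omega0 = 2.0 * math.pi * rabi_hz  # rad/s
    return factor / omega0


if __name__ == "__main__":
    s, sigma = stokes_from_population(0.5)
    print(f"S = {s:.4f} +/- {sigma:.4f}")
    print(f"delta_tau = {step_duration() * 1e6:.3f} us")
```

Under these assumptions each step lasts roughly 2 µs, and the shot-noise error on a Stokes parameter is at most 0.01 for 10,000 repetitions.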