Fig. 4: Single-qubit nonequilibrium thermodynamic transformations in the open system.
From: Single-atom exploration of optimized nonequilibrium quantum thermodynamics by reinforcement learning

a Designed pulses for the effective Rabi frequency \(\widetilde{\Omega}\) and phase ϕ, segmented into 13 steps, obtained by reinforcement learning (RL) control. b Dynamics of coherence with and without RL control. c Time evolution of the fidelity F and the entropy production reduction ΔΣ, where dots are experimental results and lines are theoretical simulations. The blue and red lines denote the RL-engineered evolution and the free evolution, respectively, and the black line shows the variation of ΔΣ. After the last step, Fopt = 0.9886 ± 0.0067, ΔΣ = 0.9420 ± 0.0341 and the reduction of the work dW = 0.8360 ± 0.1321. The inset presents the time evolution of the population Pe due to decay. d, e Comparison of robustness against deviations of the Rabi frequency δΩ/Ω0 and the resonance frequency Δ/Ω0 for Fopt, ΔΣ and dW in the presence of decay, where Fopt can exceed 95% when the deviation is 20%. The error bars indicate the statistical standard deviation of the experimental data, obtained from 10,000 measurements for each data point. Other parameters: Rabi frequency Ω0/2π = 20 kHz, decay rate γeff = 0.0216Ω0, duration of each step δτ = 0.25/Ω0, and inverse temperature βΩ0 = 3.
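
The parameters quoted in the caption are given in units of Ω0, so it can help to convert them to laboratory units. The following is a minimal sketch of that arithmetic only; the function name `caption_timescales` and the dictionary keys are hypothetical, and it assumes that all 13 steps have the same duration δτ and that Ω0 in "0.25/Ω0" is the angular frequency (2π × 20 kHz).

```python
import math

# Values quoted in the caption.
OMEGA0_OVER_2PI_HZ = 20e3      # Rabi frequency Ω0/2π = 20 kHz
N_STEPS = 13                   # number of pulse segments
STEP_TIMES_OMEGA0 = 0.25       # δτ = 0.25/Ω0
GAMMA_OVER_OMEGA0 = 0.0216     # γeff = 0.0216 Ω0


def caption_timescales():
    """Convert the caption's dimensionless parameters to SI units.

    Assumes equal-length steps and that Ω0 is an angular frequency.
    """
    omega0 = 2.0 * math.pi * OMEGA0_OVER_2PI_HZ   # rad/s
    step_s = STEP_TIMES_OMEGA0 / omega0           # duration of one step, s
    total_s = N_STEPS * step_s                    # duration of the full pulse, s
    gamma_eff = GAMMA_OVER_OMEGA0 * omega0        # effective decay rate, 1/s
    return {
        "omega0_rad_per_s": omega0,
        "step_s": step_s,
        "total_s": total_s,
        "total_times_omega0": N_STEPS * STEP_TIMES_OMEGA0,
        "gamma_eff_per_s": gamma_eff,
        "gamma_eff_times_total": gamma_eff * total_s,
    }


if __name__ == "__main__":
    for name, value in caption_timescales().items():
        print(f"{name}: {value:.6g}")
```

Under these assumptions each step lasts about 1.99 µs and the 13-step pulse about 25.9 µs, so the product of γeff and the total duration is 0.0216 × 3.25 ≈ 0.07, meaning the decay acts only weakly over the course of the controlled evolution.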