Fig. 7: RL navigation in the double gyre flow field.

a Navigation problem setup. The start and target regions are L/2 in diameter and located at (3L/2, L/2) and (L/2, L/2), respectively. b A naive policy achieves a 40.9% success rate on average. c The velocity RL swimmer trained on the cylinder wake navigates the double gyre flow poorly, indicating that its navigation policy did not generalize. d After training on the double gyre flow, the velocity RL swimmer adapts and navigates more effectively than either of the other swimmers. As with the cylinder flow, successful attempts to reach the target are shown in green and unsuccessful attempts in red. An episode is successful when a swimmer comes within a distance of L/50 of the target location. The stated success rates are averaged over 12,500 episodes and are shown with one standard deviation across the five training runs of each swimmer.
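The success criterion and the reported statistics can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the unit value of L, the trajectory format (an array of x, y positions), and the use of the population standard deviation are all assumptions made for illustration. The caption also does not state how the 12,500 episodes are divided among the five training runs, so the sketch simply takes one array of outcomes per run.

```python
import numpy as np

L = 1.0                             # characteristic length; unit value is an assumption
TARGET = np.array([L / 2, L / 2])   # target location from the caption
SUCCESS_RADIUS = L / 50             # success threshold from the caption


def episode_succeeded(trajectory, target=TARGET, radius=SUCCESS_RADIUS):
    """Return True if any position along the trajectory lies within
    `radius` of `target`. `trajectory` is a sequence of (x, y) points."""
    positions = np.asarray(trajectory, dtype=float)
    distances = np.linalg.norm(positions - target, axis=1)
    return bool(np.any(distances <= radius))


def success_statistics(outcomes_per_run):
    """Mean and standard deviation of the per-run success rates.

    `outcomes_per_run` holds one sequence of booleans per training run.
    The population standard deviation (ddof=0) is an assumption; the
    caption does not say which estimator was used.
    """
    rates = np.array([np.mean(run) for run in outcomes_per_run], dtype=float)
    return float(rates.mean()), float(rates.std())
```

For example, `success_statistics` applied to five boolean arrays, one per trained swimmer, would return a mean success rate and its spread across training runs in the form quoted in the panels.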