Fig. 4: Cross-policy transferability evaluations.

(a, b) Stroke-averaged center-of-mass velocity \(\bar{v}\) of N-bead microswimmers deployed with policies optimized for \(N_T\) beads (color-coded dotted lines) for type A and B microswimmers, respectively. The vast majority of the policies evolved for \(N_T\)-bead microswimmers generalize well to very different N-bead morphologies without further optimization (circle symbols mark the values of N used; magenta × symbols mark the training conditions, \(N_T = N\)). (c, d) Mean angular velocity \(\bar{\omega }\) of the arm-length limit-cycle dynamics of the best-performing N-bead type A and B microswimmers, respectively (blue circles; fastest policies trained with \(N_T\) beads and deployed to N-bead microswimmers); magenta × symbols show \(\bar{\omega }\) for the corresponding training conditions, \(N = N_T\). Similar to (c, d), panels (e, f) show the mean cross-correlation time \(\bar{\tau }\) between neighboring arm lengths, and (g, h) show the corresponding (dimensionless) wavelength \(\bar{\lambda }\) (see columns in Fig. 3e, f at fixed time \(t_k\), and see the “Transferable evolved policies: decentralized decision-making generalizes to arbitrary morphologies” subsection of the “Results and Discussion” for details) of type A and B microswimmers, respectively, as a function of N. Dashed lines indicate functional fits: (c) \(\bar{\omega }\approx a\ln N+b\) with \(a=-5.26\times 10^{-3}\) rad/Δt and \(b=1.13\times 10^{-1}\) rad/Δt; (d) \(\bar{\omega }\approx \alpha {N}^{\beta }\) with \(\alpha =3.92\times 10^{-1}\) rad/Δt and \(\beta =-8.19\times 10^{-1}\); (e) \(\bar{\tau }\approx 12.2\,\Delta t\); (g) \(\bar{\lambda }\approx a\ln N+b\) with \(a=2.37\times 10^{-1}\) and \(b=4.64\); (h) \(\bar{\lambda }\approx \alpha {N}^{\beta }\) with \(\alpha =1.33\) and \(\beta =9.17\times 10^{-1}\). Error bars in (c–h) represent the standard deviation (STD) of \(\bar{\omega }\), \(\bar{\tau }\) and \(\bar{\lambda }=2\pi /(\bar{\omega }\bar{\tau })\) (see also the “Swimming-gait analysis” subsection in the “Methods”), computed over all data points within 3σ; error bars in (c, d) are smaller than the symbols.
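
The fits quoted above can be cross-checked numerically against the relation \(\bar{\lambda }=2\pi /(\bar{\omega }\bar{\tau })\). The following is a minimal sketch, not the authors' analysis code: the function and constant names are hypothetical, and it assumes the panel (c) and (e) fits for type A can be combined directly, whereas the caption's panel (g) fit was obtained from the data themselves. Some mismatch between the two estimates is therefore expected.

```python
import math

# Fit parameters as quoted in the caption
# (angular velocities in rad/Δt, times in Δt, wavelengths dimensionless).
OMEGA_A = (-5.26e-3, 1.13e-1)   # type A, panel c: omega ≈ a ln N + b
OMEGA_B = (3.92e-1, -8.19e-1)   # type B, panel d: omega ≈ alpha N^beta
TAU_A = 12.2                    # type A, panel e: tau ≈ 12.2 Δt
LAMBDA_A = (2.37e-1, 4.64)      # type A, panel g: lambda ≈ a ln N + b
LAMBDA_B = (1.33, 9.17e-1)      # type B, panel h: lambda ≈ alpha N^beta


def log_fit(n, a, b):
    """Logarithmic fit a ln N + b."""
    return a * math.log(n) + b


def power_fit(n, alpha, beta):
    """Power-law fit alpha N^beta."""
    return alpha * n ** beta


def wavelength(omega, tau):
    """Dimensionless wavelength lambda = 2 pi / (omega tau)."""
    return 2.0 * math.pi / (omega * tau)


def compare_type_a(n):
    """Return (lambda from the relation, lambda from the panel g fit)."""
    omega = log_fit(n, *OMEGA_A)
    return wavelength(omega, TAU_A), log_fit(n, *LAMBDA_A)


if __name__ == "__main__":
    for n in (3, 10, 100):
        from_relation, from_fit = compare_type_a(n)
        print(f"N={n}: relation {from_relation:.2f}, fit {from_fit:.2f}")
```

Under these assumptions the two type A estimates stay close to each other for N between 3 and 100, which suggests that the quoted fits are mutually consistent. The type B fits can be evaluated the same way with `power_fit`, although no fit for \(\bar{\tau }\) is quoted for panel (f).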