Fig. 5: Gaussian RBF kernel regression on high-dimensional spherical data.

a Phase diagram for non-monotonic learning curves obtained from the theory by counting the zeros of \(\frac{\partial {E}_{g}}{\partial \alpha }\). Colored squares and colored circles correspond to the curves in c and d, respectively. b Kernel regression with the Gaussian RBF \(K({\bf{x}},{\bf{x}}^{\prime} )={\mathrm{e}}^{-\frac{1}{2D{\omega }^{2}}\Vert {\bf{x}}-{\bf{x}}^{\prime} {\Vert }^{2}}\) with ω = 3, D = 100 and noise-free labels. The target is \(\bar{f}({\bf{x}})={\sum }_{k,m}{\bar{w}}_{km}\sqrt{{\eta }_{km}}{Y}_{km}({\bf{x}})\) with random, centered weights \({\bar{w}}_{km}\) such that \(\left\langle {\bar{w}}_{km}^{2}\right\rangle ={\eta }_{km}\) (Supplementary Note 5). Dashed lines mark the locations of N(D, 1) and N(D, 2), which separate different learning stages. c, d Generalization error for the Gaussian RBF kernel in D = 100 for various kernel widths ω, corresponding to the specific values of \({\tilde{\lambda }}_{L}\) and noise variances \({\tilde{\sigma }}_{L}\) indicated in the phase diagram. Solid lines: theory (Eq. (4)). Larger regularization suppresses the descent peaks, which occur at P* ~ N(D, L), shown by the vertical dashed lines. c Varying \({\tilde{\lambda }}_{L}\) at fixed \({\tilde{\sigma }}_{L}\). d Varying \({\tilde{\sigma }}_{L}\) at fixed \({\tilde{\lambda }}_{L}\). For fixed noise, we observe an optimal \({\tilde{\lambda }}_{1}\) up to P/N(D, 1) ~ 10, after which the next learning stage starts. Error bars indicate the standard deviation over 300 trials for b and 100 trials for c, d.
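As a rough illustration of the quantities named in this caption, the following Python sketch evaluates the Gaussian RBF kernel as written above and the sample sizes at which the dashed lines would fall. It assumes that N(D, k) denotes the number of degree-k hyperspherical harmonics in D dimensions, which the caption itself does not state; the function names `num_harmonics` and `gaussian_rbf` are hypothetical and chosen only for this example.

```python
import math


def num_harmonics(D, k):
    """Assumed meaning of N(D, k): the number of linearly independent
    degree-k spherical harmonics on the unit sphere in R^D (D >= 3)."""
    if k == 0:
        return 1
    # Standard degeneracy formula: (2k + D - 2)/k * C(k + D - 3, k - 1).
    return (2 * k + D - 2) * math.comb(k + D - 3, k - 1) // k


def gaussian_rbf(x, y, omega):
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 D omega^2)) from the caption,
    with D taken as the length of the input vectors."""
    D = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * D * omega ** 2))


if __name__ == "__main__":
    D = 100
    # Sample sizes where the first two learning stages would be marked.
    print(num_harmonics(D, 1), num_harmonics(D, 2))
```

Under this assumption, D = 100 gives N(D, 1) = 100 and N(D, 2) = 5049, so the two dashed lines would sit roughly a factor of 50 apart on the P axis.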