Fig. 1: Effect of task-model alignment on the generalization of kernel regression.

a, b Projections of MNIST digits onto the top two (uncentered) kernel principal components of the 2-layer NTK for 0s vs. 1s and 8s vs. 9s, respectively. c Learning curves for both tasks. The theoretical learning curves (Eq. (4), dashed lines) agree closely with experiment (dots). d The kernel eigenspectra for the respective datasets. e The cumulative power distributions C(ρ). Error bars show the standard deviation over 50 trials.
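
As a rough illustration of the quantities plotted in panels a, b, d and e, the following sketch computes an empirical kernel eigenspectrum, the top two uncentered kernel principal component projections, and a cumulative power curve. It rests on several assumptions that are not taken from the figure: an RBF kernel stands in for the 2-layer NTK, synthetic Gaussian data stands in for MNIST, and C(ρ) is taken to be the fraction of target power captured by the top ρ eigenmodes of the Gram matrix. The function names are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Stand-in kernel (assumption): the figure uses a 2-layer NTK instead.
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def spectrum_and_cumulative_power(K, y):
    """Eigenspectrum, cumulative target power, and top-2 uncentered
    kernel PCA projections for a Gram matrix K and targets y."""
    P = K.shape[0]
    # eigh returns ascending eigenvalues; reorder to descending.
    evals, evecs = np.linalg.eigh(K / P)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Power of the target along each eigenvector of the Gram matrix.
    power = (evecs.T @ y) ** 2
    cum_power = np.cumsum(power) / np.sum(power)

    # Uncentered kernel PCA: coordinate of sample i on component k is
    # sqrt(lambda_k) * v_k[i], with lambda_k an eigenvalue of K itself.
    top = np.sqrt(np.maximum(evals[:2], 0.0) * P)
    projections = evecs[:, :2] * top
    return evals, cum_power, projections

# Synthetic binary task (assumption): label is the sign of one coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0])

K = rbf_kernel(X, X)
evals, cum_power, projections = spectrum_and_cumulative_power(K, y)
```

Under this reading, a cumulative power curve that rises quickly with ρ means the target is concentrated in the leading kernel modes, which is the sense in which a task is well aligned with the kernel.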