Fig. 1: Overview of lazy and non-lazy regimes.
From: Coding schemes in neural networks learning classification tasks

a Sketch of the network with two hidden layers (L = 2) and two outputs (m = 2). b, d Last-layer feature space in the lazy regime, where the predominantly random features are nearly orthogonal to the readout weight vector (b), and in the non-lazy regime, where the learned features are aligned with the readout weight vector (d). c, e Kernels and activations of feature-layer neurons in lazy (c) and non-lazy (e) networks for a binary classification task and three neuronal nonlinearities (linear, sigmoidal, and ReLU). For ReLU networks, only the 20 most active neurons are shown. In (e), all three kernels show a similar, pronounced task-learned structure, but the patterns of activation are strikingly different.
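The geometric contrast in panels (b) and (d) can be sketched numerically: in high dimensions a random feature vector is nearly orthogonal to the readout weights, whereas a feature with a learned component along the readout direction is strongly aligned. The sketch below is illustrative only; the dimension `n`, the scale of the aligned component, and the cosine-similarity measure are assumptions for the demonstration, not quantities from the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # number of feature-layer neurons (illustrative size, not from the figure)

# Hypothetical readout weight vector, normalized to unit length
w = rng.standard_normal(n)
w /= np.linalg.norm(w)

# Lazy regime: features remain essentially random, so a typical feature
# vector is almost orthogonal to the readout weights (cosine ~ 1/sqrt(n)).
x_lazy = rng.standard_normal(n)
cos_lazy = x_lazy @ w / np.linalg.norm(x_lazy)

# Non-lazy regime: learning aligns features with the readout direction;
# here this is mimicked by adding a large component along w.
x_rich = rng.standard_normal(n) + 20.0 * w
cos_rich = x_rich @ w / np.linalg.norm(x_rich)

print(f"lazy alignment:     {cos_lazy:+.3f}")  # close to 0
print(f"non-lazy alignment: {cos_rich:+.3f}")  # close to 1
```

The cosine similarity between features and readout weights is one simple way to quantify the alignment that distinguishes the two regimes in the figure.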