Fig. 6: Dissociating the effect of frequency and information in supervised learning tasks. | Nature Communications

From: Efficient neural codes naturally emerge through gradient descent learning

a We trained 3-layer nonlinear neural networks to classify the orientation of sinusoidal gratings into 3 bins, varying either input frequency or output noise. b We controlled the informativeness of input orientations by injecting noise into the labels as a function of orientation (specifically, the binary labels were multiplied by a Bernoulli dropout mask whose rate depends on orientation). The sensitivity of the first layer to input orientation is shown over the course of learning. With uniform input statistics, the more informative features are preferentially learned. c The effect of varying input frequency without label noise: in this case, the more frequent features are preferentially learned. d We then balanced noise and frequency so that the total information in the input dataset about each output label is uniform (see Methods). Even so, learning with gradient descent still prefers the more common angles.
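The orientation-dependent label-dropout scheme described in panel b can be sketched as follows. This is a minimal illustration, not the authors' code: the cosine dropout schedule and the bin edges are assumptions for demonstration (the paper's exact schedule is given in its Methods).

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_labels(orientations, onehot, rng=rng):
    """Multiply binary (one-hot) labels by a Bernoulli dropout mask
    whose rate depends on the input orientation."""
    # Hypothetical schedule: dropout rate varies smoothly with orientation,
    # ranging from 0 (fully informative) to 0.5 (least informative).
    rate = 0.25 * (1.0 + np.cos(2.0 * orientations))
    keep = rng.random(len(orientations)) >= rate  # Bernoulli keep mask
    return onehot * keep[:, None]                 # zero out dropped labels

# Orientations in [0, pi), discretized into 3 classification bins.
n = 1000
orientations = rng.uniform(0.0, np.pi, n)
bins = np.digitize(orientations, np.linspace(0.0, np.pi, 4)[1:-1])
onehot = np.eye(3)[bins]
noisy = dropout_labels(orientations, onehot)
```

Orientations near the zeros of the assumed rate function keep their labels almost always, so they carry more information about the output; orientations near its peaks are frequently zeroed out and become less informative, which is the dissociation the figure exploits.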
