
Extended Data Fig. 7: Effect of nonlinearities at the compression layer.

From: Optimal routing to cerebellum-like structures


To achieve performance with nonlinear compression layer units comparable to that of linear units, we set Nc = 250. To maximize the dimension of the compression layer representation after the nonlinearity, we also introduced a random rotation of the optimal compression matrix (see Methods 5). a: Dimension of the compression layer representation for linear versus nonlinear (ReLU) compression. For ReLU compression, the nonlinearity is applied after random (left), PC-aligned (center), and whitening (right) compression. b: Same as a, but showing the noise strength Δc at the compression layer. c: Same as a, but showing the fraction of errors in the random classification task. In panels a–c, the box extends from the first to the third quartile of the data, the whiskers extend from the box by 1.5 times the interquartile range, and the horizontal line indicates the median. d: Fraction of errors over training when the compression weights are trained using gradient descent and the compression layer units are nonlinear (ReLU). For comparison, the horizontal dashed lines indicate the performance of networks with linear compression layer units. Solid lines indicate the mean over 10 network realizations; shading indicates the standard deviation across realizations. e: Performance at convergence for the same networks as in d. For all panels, parameters were N = D = P = 500, Nc = 250, M = 2000, f = 0.1, fc = 0.3, and σ = 0.1.
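As a rough illustration of the quantity plotted in panel a, the sketch below estimates the dimension (via the participation ratio) of a compression layer for linear versus ReLU units, with a random rotation applied after the compression weights. This is a minimal sketch under stated assumptions: Gaussian input patterns stand in for the task patterns, a random compression matrix stands in for the paper's optimal one, and names such as `participation_ratio` are illustrative rather than taken from the paper's code.

```python
# Minimal sketch of the panel-a dimension measurement (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)
N, Nc, P = 500, 250, 500            # input size, compression layer size, number of patterns

X = rng.standard_normal((P, N))     # placeholder input patterns (assumed Gaussian)

# Random compression weights followed by a random orthogonal rotation;
# in the paper the rotation is applied to the optimal compression matrix
# (Methods 5) to maximize dimension after the nonlinearity.
W = rng.standard_normal((N, Nc)) / np.sqrt(N)
Q, _ = np.linalg.qr(rng.standard_normal((Nc, Nc)))   # random rotation

C_lin = X @ W @ Q                   # linear compression layer responses
C_relu = np.maximum(C_lin, 0.0)     # ReLU compression layer responses

def participation_ratio(C):
    """Dimension of the representation: (sum lambda_i)^2 / sum lambda_i^2,
    where lambda_i are eigenvalues of the response covariance."""
    lam = np.linalg.eigvalsh(np.cov(C.T))
    return lam.sum() ** 2 / (lam ** 2).sum()

print(f"linear dim: {participation_ratio(C_lin):.1f}")
print(f"ReLU dim:   {participation_ratio(C_relu):.1f}")
```

Repeating this over many network realizations and collecting the resulting dimensions would yield the kind of distribution summarized by the box plots in panel a.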
