Fig. 3: Learning in eight-layer SNNs on the MNIST dataset for different models and initializations.

a A ReLU network with standard deep learning initialization of its weights \(W^{(n)}\) is trained (red, dashed). The B1-model (identity mapping \(w^{(n)} = W^{(n)}\)) with standard deep learning initialization follows the same training curve (light blue), converging to 100% training accuracy in fewer than 100 epochs. The α1-model (\(w^{(n)} = \frac{W^{(n)}}{B_i^{(n)}}\)) with smart α1 initialization needs a much smaller learning rate and deviates from the ReLU-network training curve (dark blue); if training is continued, it eventually converges to 100% training accuracy after around 2000 epochs. b Cosine similarity, measured during training, between the weights of the ReLU network and the weights of (i) the B1-model with standard deep learning initialization and (ii) the α1-model with smart α1 initialization.
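A minimal NumPy sketch of the two weight mappings and of the cosine-similarity measure from panel b, under stated assumptions: the layer shapes, the per-neuron factors \(B_i^{(n)}\) (here a placeholder positive array), and the helper `cosine_similarity` are illustrative, not the authors' implementation.

```python
import numpy as np

def cosine_similarity(w_relu, w_snn):
    """Cosine similarity between two flattened weight tensors."""
    a, b = w_relu.ravel(), w_snn.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)

# Hypothetical per-layer weights W^(n) of a trained ReLU network.
W = [rng.standard_normal((128, 128)) for _ in range(8)]

# B1-model: identity mapping, SNN weights equal the ReLU weights.
w_b1 = [Wn.copy() for Wn in W]

# alpha1-model: row i of layer n is divided by a per-neuron factor
# B_i^(n) (assumed given; here a random positive placeholder).
B = [rng.uniform(0.5, 2.0, size=(128, 1)) for _ in range(8)]
w_alpha1 = [Wn / Bn for Wn, Bn in zip(W, B)]

# Panel-b style measurement: per-layer similarity to the ReLU weights.
for n, (Wn, wn) in enumerate(zip(W, w_alpha1), start=1):
    print(f"layer {n}: cos(W, w_alpha1) = {cosine_similarity(Wn, wn):.3f}")
```

Because the B1 mapping is the identity, its cosine similarity to the ReLU weights starts at exactly 1, while the per-neuron rescaling of the α1-model generally yields a value below 1.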