Fig. 1: Illustration of gradient update computation steps.

From: Fast and robust analog in-memory deep neural network training

The general structure of the gradient computation is shared by all of the improved learning algorithms discussed and is based on Tiki-Taka version 2 (TTv2) (see ref. 24). For each input vector \({{\bf{x}}}\) and backpropagated error vector \({{\bf{d}}}\), the weight gradient is first accumulated on a crossbar array \(\breve{A}\) using a parallel pulsed outer-product update with learning rate \({\lambda }_{A}\) (Eq. (13); see Supplementary Alg. 1). Note that the matrices are displayed here in transposed fashion, so that voltage inputs \({{\bf{x}}}\) are delivered from the left and \({{\bf{d}}}\) from the bottom. A single row of the accumulated gradient in \(\breve{A}\) is then read out intermittently every \({n}_{s}\) vector updates (looping through the rows over time), and digital computation is used to arrive at an FP vector \({{{\bf{z}}}}_{k}\) that is added to the digital storage H with learning rate \({\lambda }_{H}\). Finally, the corresponding row of the actual weight matrix, which is represented by a second crossbar array \({\breve{W}}\), is updated when a threshold is crossed, and the hidden matrix H is reset correspondingly. The newly proposed algorithms differ in the digital computation used to arrive at \(\hat{{{\bf{x}}}}\) and \({{{\bf{z}}}}_{k}\). For the TTv2 baseline algorithm, \(\hat{{{\bf{x}}}}\equiv {{\bf{x}}}\) and \({{{\bf{z}}}}_{k}\equiv (\breve{A}-\breve{R})\,{{{\bf{v}}}}_{k}\), where the reference crossbar array \(\breve{R}\) is programmed before DNN training and a fast differential analog MVM is used for readout (using the one-hot unit vector \({{{\bf{v}}}}_{k}\)). See “Methods” section “Fast and robust in-memory training” and Supplementary Fig. 2 for more details on the digital operations of the proposed algorithms.
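
To make the per-step flow concrete, below is a minimal NumPy sketch of the TTv2 baseline loop described in the caption (i.e., with \(\hat{{{\bf{x}}}}\equiv {{\bf{x}}}\) and \({{{\bf{z}}}}_{k}\equiv (\breve{A}-\breve{R})\,{{{\bf{v}}}}_{k}\)). The function name ttv2_update, the values of lr_A, lr_H, n_s, and the transfer threshold are illustrative assumptions, and analog noise, pulse granularity, and other crossbar non-idealities are omitted; it is a sketch of the computation steps, not the authors' implementation.

```python
# Minimal sketch of the TTv2-style gradient update loop (assumed names/values).
import numpy as np

def ttv2_update(A, R, W, H, x, d, k, step,
                lr_A=0.1, lr_H=1.0, n_s=2, threshold=1.0):
    """Process one input/error pair (x, d).

    A: gradient-accumulation crossbar, R: reference crossbar,
    W: weight crossbar, H: digital hidden matrix (all shape [out, in]).
    k: index of the row to be read out next; step: global update counter.
    Returns the updated row index k.
    """
    # 1) Pulsed outer-product gradient accumulation on crossbar A.
    A += lr_A * np.outer(d, x)

    # 2) Every n_s vector updates, read out one row of (A - R) via the
    #    differential MVM with a one-hot unit vector v_k (row k here,
    #    matching the caption's transposed array layout).
    if step % n_s == 0:
        v_k = np.zeros(A.shape[0])
        v_k[k] = 1.0
        z_k = (A - R).T @ v_k          # z_k = (A - R) v_k in the caption
        H[k] += lr_H * z_k             # accumulate in digital storage H

        # 3) Threshold-gated transfer of row k into the weight crossbar W,
        #    resetting the transferred part of the hidden matrix H.
        mask = np.abs(H[k]) >= threshold
        W[k, mask] += np.sign(H[k, mask])           # one update pulse per entry
        H[k, mask] -= np.sign(H[k, mask]) * threshold
        k = (k + 1) % A.shape[0]                    # loop through rows over time
    return k

# Toy usage: random input/error pairs on a 4x8 layer.
out_dim, in_dim = 4, 8
A = np.zeros((out_dim, in_dim)); R = np.zeros_like(A)
W = np.zeros((out_dim, in_dim)); H = np.zeros_like(W)
k = 0
for step in range(1, 101):
    x = np.random.randn(in_dim)
    d = np.random.randn(out_dim)
    k = ttv2_update(A, R, W, H, x, d, k, step)
```

In this sketch, the threshold-gated transfer is what gives the algorithm its robustness to device noise: only gradient information that has accumulated consistently in H is committed to the weight crossbar W.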
