Fig. 4: Weight programming error using different learning algorithms.

The standard deviation of the converged analog weights \(\breve{W}\) from the target weights is plotted in color code. The reference offset device-to-device variation \(\sigma_r\) increases along the horizontal axis, while the number of material device states \(n_\text{states}\) (see Eq. (6)) varies along the vertical axis. A lower number of states generally corresponds to a noisier conductance response (e.g., typical ReRAM materials), whereas a higher number of states corresponds to a more ideal device conductance response (e.g., ECRAM). A In-memory SGD using stochastic pulse trains. B Baseline Tiki-Taka version 2 (TTv2). C The proposed Chopped-TTv2 (c-TTv2) algorithm. D The proposed AGAD algorithm. Simulation details: Parameter settings are as in Fig. 3, except that \(\sigma_r\) and \(\delta\) are varied. Additionally, we set \(\sigma_b = 0\) for \(\breve{W}\) only (so that the results are not confounded by cases where the target weight cannot be represented by \(\breve{W}\)) and set \(\sigma_\pm = 0.1\) (to avoid a large impact of a few failed devices on the weight error). The target matrix and inputs are fixed for each case for better comparison. Results are averaged over three construction seeds of the device variations.
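
To make the plotted quantity concrete, here is a minimal NumPy sketch of the evaluation loop behind such a figure, not the authors' actual code: `program_weights` is a hypothetical stand-in for the analog training simulation (a real run would use one of the algorithms of panels A–D), and the grid values for \(\sigma_r\) and \(n_\text{states}\) are illustrative, not the paper's settings.

```python
import numpy as np

N_SEEDS = 3  # construction seeds of the device variations (from the caption)

def program_weights(target, algorithm, sigma_r, n_states, seed):
    """Hypothetical placeholder for the analog programming simulation.

    `algorithm` would select in-memory SGD, TTv2, c-TTv2, or AGAD; it is
    unused in this mock. A real simulation would return the converged
    analog weights W_breve after training on the fixed inputs.
    """
    rng = np.random.default_rng(seed)
    # Mock output: target plus residual noise that grows with sigma_r
    # and shrinks with the number of device states.
    return target + rng.normal(0.0, sigma_r / np.sqrt(n_states), target.shape)

def weight_error(target, algorithm, sigma_r, n_states):
    """Std. deviation of (W_breve - target), averaged over construction seeds."""
    errors = []
    for seed in range(N_SEEDS):
        w_breve = program_weights(target, algorithm, sigma_r, n_states, seed)
        errors.append(np.std(w_breve - target))
    return np.mean(errors)

# Fixed target matrix so all grid points are directly comparable (per caption).
rng = np.random.default_rng(0)
target = rng.uniform(-1.0, 1.0, size=(64, 64))

# Sweep sigma_r horizontally and n_states vertically (illustrative values).
sigma_r_grid = np.linspace(0.0, 0.5, 6)
n_states_grid = [10, 30, 100, 300, 1000]

error_map = np.array([
    [weight_error(target, "AGAD", s, n) for s in sigma_r_grid]
    for n in n_states_grid
])
print(error_map.round(4))  # rows: n_states, columns: sigma_r
```

The resulting `error_map` corresponds to one color-coded panel of the figure; repeating the sweep for each algorithm would yield the four panels A–D.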