Figure 1

Analog computation of AI workloads is a promising solution to this power-intensive task. RRAM-based AI hardware accelerators leverage in-memory computation, in contrast to state-of-the-art von Neumann architectures. (a) The von Neumann computing architecture consists of physically separated memory and logic units, which are bottlenecked by data transfer (shown by the blue arrow). (b) In-memory computation reduces power and latency by computing directly within the memory unit and also exploits the inherent parallelism of the memory architecture. (c) A simple neural network is shown with input layer (blue), hidden layer (green), and output layer (red). (d) An example of an RRAM-based memory array for hardware realization of a neural network, where each RRAM cell stores a synaptic weight value. Applying input voltages simultaneously to the rows produces column currents that realize multiply-and-accumulate (MAC) operations. It should be noted that MAC operations are a significant contributor to the overall neural network training workload. (e) A 1-transistor 1-RRAM (1T1R) unit cell is shown, highlighting a cross-section of the RRAM structure with oxygen vacancies in the switching layer depicted as grey dots.
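The in-memory MAC operation of panel (d) can be sketched numerically: with conductances G storing the weights, row voltages V drive per-cell currents, and Kirchhoff's current law sums them along each column, giving I = Gᵀ·V in a single step. The array size, values, and function name below are illustrative assumptions, not taken from the figure.

```python
import numpy as np

def crossbar_mac(G, V):
    """Column currents of an RRAM crossbar: I_j = sum_i G[i, j] * V[i].

    G : (rows, cols) array of cell conductances in siemens (the weights)
    V : (rows,) array of input voltages in volts
    Returns the (cols,) array of column currents in amperes.
    """
    return G.T @ V

# Hypothetical 3-row x 2-column crossbar (values chosen for illustration)
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6],
              [5e-6, 6e-6]])   # conductances (S)
V = np.array([0.1, 0.2, 0.3])  # input voltages (V)

I = crossbar_mac(G, V)         # column currents (A): [2.2e-6, 2.8e-6]
```

Each column current is one dot product of the input vector with a weight column, so the whole matrix-vector product is computed in parallel by the physics of the array rather than by sequential memory fetches.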