Fig. 3: In-sensor-in-memory operations.

a An illustration of a computer vision model in which an image undergoes preprocessing followed by feature-extraction steps. The number of operations scales with the depth of the network, as shown for a LeNet-5 model: the first layer, which computes directly on the input image, is the most computationally demanding.
b An illustration of the direct computation of convolutions during image sensing, leveraging the crossbar topology of the active pixel array.
c An illustration of emulated Gaussian blurring of an input image as a preprocessing step.
d An illustration of the Hough-transform pipeline, in which in-sensor computations preprocess images to generate inputs for computational memory tiles. The computational memory performs MVM and accumulation operations to detect lines in the images.
e In the first operation, the input image is converted into a vector that is multiplied by a matrix encoding the parametric-space transformation. The resulting output becomes the input to the accumulator space, in which select PCM devices increase in conductance according to the number of times they are programmed by the input.
f An experimental plot of a computational memory tile, showing the regions encoded for the MVM and accumulation operations. The MVM region is programmed only once, whereas for the accumulation operation all devices are initially reset and the conductance map then evolves according to the number of input pulses the devices receive. ADCu stands for analog-to-digital conversion units.
g An experimental MVM plot showing the measured output of the computational memory. The black trace represents the ideal result of a floating-point MVM.
h A 3D plot of the accumulator space after the computational memory has preprocessed an input image. Two unit-cells, each representing a unique (r, θ) tuple, underwent the largest increase in conductance.
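Panels d and e describe line detection via a Hough transform realized as a matrix-vector multiplication followed by accumulation. The following is a minimal software sketch of that idea, not the authors' implementation; the image size, accumulator resolution, and all variable names are illustrative assumptions:

```python
import numpy as np

# Hedged sketch (not the authors' code): the Hough line transform expressed
# as a matrix-vector multiplication (MVM) followed by accumulation, mirroring
# the computational-memory pipeline of panels d-e. All sizes are illustrative.

H, W = 8, 8                       # tiny binary input image
n_theta, n_r = 16, 12             # resolution of the (r, theta) accumulator

thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
r_max = np.hypot(H - 1, W - 1)
r_edges = np.linspace(-r_max, r_max, n_r + 1)

# Transformation matrix T: one row per (r, theta) unit-cell, one column per
# pixel. T[k, p] = 1 if pixel p lies on the line parameterized by cell k.
T = np.zeros((n_r * n_theta, H * W))
for p, (y, x) in enumerate(np.ndindex(H, W)):
    for j, th in enumerate(thetas):
        r = x * np.cos(th) + y * np.sin(th)
        i = int(np.clip(np.searchsorted(r_edges, r) - 1, 0, n_r - 1))
        T[i * n_theta + j, p] = 1.0

# Binary input image containing a single horizontal line at row y = 3.
img = np.zeros((H, W))
img[3, :] = 1.0

# MVM step: each output element counts the votes a unit-cell receives.
votes = T @ img.ravel()

# Accumulation step: in hardware, each vote is a programming pulse that
# raises the conductance of the corresponding PCM unit-cell; here we simply
# read out the cell with the most votes.
k = int(np.argmax(votes))
r_idx, th_idx = divmod(k, n_theta)
print(f"detected line: theta = {thetas[th_idx]:.3f} rad, "
      f"r in [{r_edges[r_idx]:.2f}, {r_edges[r_idx + 1]:.2f}]")
```

For this input, the winning cell recovers θ = π/2 and an r bin containing r = 3, i.e. the horizontal line at y = 3; in the hardware experiment this readout corresponds to the unit-cells with the largest conductance increase shown in panel h.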