Fig. 5: Capacity of smooth manifolds from warped ImageNet images.
From: Separability and geometry of object manifolds in deep neural networks

a Illustration of smooth, densely sampled affine-transformed images; 36 samples from a 2-d translation manifold (left) and a 2-d shear manifold (right). Each manifold sample is associated with a coordinate specifying the horizontal and vertical translation or shear of a base image, and corresponds to an image in which the object is warped using the appropriate affine transformation. b, c Classification capacity for 2-d smooth manifolds (solid line: translation; dashed line: shear) along the layers of AlexNet (b) and VGG-16 (c). Lines and markers indicate the mean over four different choices of 64 objects; the surrounding shaded areas indicate the 95% confidence interval. The x-axis labels give abbreviations of the layer types. Marker shape represents layer type (circle: pixel layer, square: convolution layer, right-triangle: max-pooling layer, hexagon: fully connected layer, down-triangle: local normalization). Features in linear layers are extracted after a ReLU nonlinearity. Color (blue: AlexNet, green: VGG-16) changes from dark to light along the network. d Capacity increase from the input (pixel layer) to the output (features layer) of AlexNet (blue markers) and VGG-16 (green markers) for 2-d translation smooth manifolds. The capacity increase is specified as the ratio of the capacity at the last layer to the capacity at the pixel layer (y-axis), at different levels of stimulus variation measured using Supplementary Eq. (3) at the pixel layer (x-axis).
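
The sampling in panel a can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the helper names (`linspace`, `affine_grid`, `apply_affine`), the 6 × 6 grid layout, the symmetric range `[-max_amount, max_amount]`, and the 2 × 3 matrix parameterization of shear are all hypothetical choices made for the example. Each of the 36 coordinates maps to one affine matrix, which would then be used to warp the base image.

```python
def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def affine_grid(kind, max_amount, n_per_axis=6):
    """Build a 2-d grid of affine warps (assumed layout: n_per_axis x n_per_axis).

    Returns a list of ((a, b), matrix) pairs, where (a, b) is the horizontal and
    vertical coordinate on the manifold and matrix is a 2x3 affine matrix.
    """
    values = linspace(-max_amount, max_amount, n_per_axis)
    samples = []
    for a in values:          # horizontal coordinate
        for b in values:      # vertical coordinate
            if kind == "translation":
                # Identity linear part, offset (a, b).
                matrix = [[1.0, 0.0, a],
                          [0.0, 1.0, b]]
            elif kind == "shear":
                # Horizontal shear a, vertical shear b, no offset.
                matrix = [[1.0, a, 0.0],
                          [b, 1.0, 0.0]]
            else:
                raise ValueError("kind must be 'translation' or 'shear'")
            samples.append(((a, b), matrix))
    return samples


def apply_affine(matrix, x, y):
    """Map a point (x, y) through a 2x3 affine matrix."""
    new_x = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    new_y = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    return new_x, new_y


translation_manifold = affine_grid("translation", max_amount=0.1)
shear_manifold = affine_grid("shear", max_amount=0.2)
```

In practice each matrix would be handed to an image-warping routine to produce the actual manifold sample; that step is omitted here because the caption does not specify how the warping was implemented.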