Fig. 8: Recurrent neural network trained to perform motion and depth computations.
From: Flexible computation of object motion and depth based on viewing geometry inferred from optic flow

A Inputs and outputs of the network. The network receives three inputs: the retinal motion of the target object and the image motion of two background dots (one near and one far relative to the fixation point). It produces two outputs: object motion in the world and depth from motion parallax (MP). B Outputs of the trained RNN resemble human behavior. Left: the relationship between input retinal direction and the network's estimated motion direction in the R+T (blue) and R (orange) geometries. Right: the relationship between estimated depth sign and retinal direction. Dashed and solid curves denote leftward and rightward eye movement, respectively. C Joint velocity tuning for retinal and eye velocities in the R (top) and R+T (bottom) geometries for 10 example recurrent units; corresponding units are shown for both geometries. D Motion and depth computations require distinct joint representations of retinal and eye velocities. Left: velocity in world coordinates (orange lines) increases along the diagonal direction indicated by the black arrow. Right: depths from MP are represented by lines with varying slopes. E Histograms of the tuning shifts observed in RNN units (left) and MT neurons (right; adapted from ref. 57) for the R (orange) and R+T (blue) viewing geometries. A shift of 0% indicates a retinal-centered representation; a shift of 100% indicates a world-centered representation.
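The input/output structure described in panel A can be sketched as a minimal recurrent network in plain Python. This is an illustrative sketch only, not the authors' trained model: the hidden size, random weights, and tanh nonlinearity are assumptions, and the inputs here are arbitrary placeholder values standing in for the target's retinal motion and the two background-dot motions.

```python
import math
import random

random.seed(0)

# Assumed dimensions: 3 inputs (target retinal motion, near dot, far dot),
# 2 outputs (object motion in the world, depth from MP).
# The hidden size of 8 is an arbitrary choice for illustration.
N_IN, N_HID, N_OUT = 3, 8, 2

def rand_matrix(rows, cols, scale=0.5):
    return [[random.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]

W_in = rand_matrix(N_HID, N_IN)    # input weights
W_rec = rand_matrix(N_HID, N_HID)  # recurrent weights
W_out = rand_matrix(N_OUT, N_HID)  # readout weights

def step(h, x):
    """One recurrent update: h' = tanh(W_in @ x + W_rec @ h)."""
    return [math.tanh(sum(W_in[i][j] * x[j] for j in range(N_IN))
                      + sum(W_rec[i][k] * h[k] for k in range(N_HID)))
            for i in range(N_HID)]

def readout(h):
    """Linear readout of the two outputs from the hidden state."""
    return [sum(W_out[o][k] * h[k] for k in range(N_HID))
            for o in range(N_OUT)]

# Run the network for 5 time steps on a constant placeholder input.
h = [0.0] * N_HID
x = [0.1, -0.2, 0.05]  # [target retinal motion, near dot, far dot]
for _ in range(5):
    h = step(h, x)
out = readout(h)  # [estimated world motion, depth-from-MP signal]
```

In the paper's setting the recurrent units (whose joint velocity tuning is shown in panel C) would be trained so that the two readouts match the target's world motion and depth sign across the R and R+T viewing geometries.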