Motion Estimation by Brains and Machines

Geniculate neuron pairs exhibit a diverse range of tuning properties. Shown in the lower left are the contours of the spatial receptive fields of 7 geniculate neurons recorded simultaneously, each represented by a different color and number. ON cells are represented by solid lines, OFF cells by dashed lines. The upper triangle shows the array of tuning properties for all pairwise combinations of cells. Half-width at half-height (HWHH) values, in degrees, are given for the polar plots in the top row.

Garrett Stanley, Jose-Manuel Alonso¹, Michael J. Black

As we move through our visual environment, the spatial and temporal pattern of light that enters our eyes is strongly influenced by the properties of objects within the environment, their motion relative to each other, and our own motion relative to the external world. For the computer vision community, an important challenge is inferring the motion of the external environment (or “optical flow”) from sequences of 2D images. For the neuroscience community, quantifying the distributed neural representation of motion in the early visual pathway is a critical step in understanding how information concerning motion is extracted and prepared for processing in higher visual centers. This interdisciplinary collaborative project brings these two distinct problems together, with tightly integrated aims designed to serve both efforts.

This project explores the estimation of visual motion by biological and computational systems in response to complex natural image sequences. Through this work we are developing a rich set of methods for the representation and recovery of motion boundaries and optical flow by brains and machines. Our computational and biological models are unified through a common probabilistic framework and a unique dataset of spatiotemporal naturalistic scenes in which we simultaneously know the optical flow at each image pixel and neural activity of a large population of single neurons densely sampled from the visual thalamus.

Natural visual stimuli have highly structured spatial and temporal properties, which strongly shape the activity of neurons in the early visual pathway. In response to natural scenes, neurons in the LGN are temporally precise on a time scale of 10-20 ms both within single cells and across cells within a population. Given that thalamic neurons with overlapping receptive fields are likely to converge at common cortical targets, that the thalamocortical synapse is highly sensitive to the timing of thalamic inputs on a time scale of approximately 10 ms, and that cortical neurons can be reliably driven with a small number of thalamic inputs, we posit a potential role for the synchronous activity of thalamic input in the establishment of cortical response properties.

Input from thalamic neurons with large spatial separation (i.e. > 2 receptive field centers) could naturally provide a highly selective signal for orientation, but given their receptive field separation, they are unlikely to project to a common recipient cortical neuron or even a common orientation column, creating a paradox in the emergence of important kinds of selectivity in visual cortex.

We have shown that the synchronous firing of geniculate cell pairs with highly overlapped receptive fields is strongly selective for orientation even though individual LGN neurons are not. This is made possible by the asymmetry in the fine temporal precision of geniculate responses. Neurons with highly overlapped receptive fields and similar response latencies can generate direction selectivity in a cortical target that reads out the synchronous inputs. We show that this stimulus selectivity remains unchanged under different contrasts, stimulus velocities and temporal integration windows.
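To make the idea of reading out synchronous inputs concrete, the following is a minimal sketch of how synchronous firing in a cell pair could be counted within a temporal integration window and tabulated per stimulus condition. It is not the analysis used in this project: the function names, the definition of synchrony (a spike in one cell with at least one spike in the other cell within half the window on either side), and the toy spike times are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def synchronous_spike_count(spikes_a, spikes_b, window_ms=10.0):
    """Count spikes of cell A that have at least one spike of cell B
    within +/- window_ms / 2 (an assumed definition of synchrony)."""
    b = sorted(spikes_b)
    half = window_ms / 2.0
    count = 0
    for t in spikes_a:
        # Index range of B spikes falling in [t - half, t + half].
        lo = bisect_left(b, t - half)
        hi = bisect_right(b, t + half)
        if hi > lo:
            count += 1
    return count


def synchrony_tuning(trains_a, trains_b, window_ms=10.0):
    """Synchronous spike count for each stimulus condition.

    trains_a, trains_b: dicts mapping a condition label (for example an
    orientation in degrees) to a list of spike times in milliseconds.
    """
    return {
        cond: synchronous_spike_count(trains_a[cond], trains_b[cond], window_ms)
        for cond in trains_a
    }


# Hypothetical spike times (ms), invented purely for illustration.
cell_a = {0: [10.0, 50.0, 100.0], 90: [20.0, 60.0, 110.0]}
cell_b = {0: [12.0, 53.0, 104.0], 90: [40.0, 80.0, 130.0]}

tuning = synchrony_tuning(cell_a, cell_b, window_ms=10.0)
print(tuning)
```

In this toy example both cells fire the same number of spikes in each condition, yet the synchronous count differs between conditions because only the relative spike timing changes. Repeating the call with different values of `window_ms` is one simple way to check how a tuning measure depends on the integration window.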

Our findings suggest a novel population code in the synchronous firing of neurons in the early visual pathway that could serve as the substrate for establishing cortical representations of the visual scene. Thalamic synchrony could play a role in the nonlinear relationship between cortical membrane potential and cortical firing rate. This suggests an extremely simple conceptual model of the constituent elements of the cortical representation of the visual scene, one that relies only on the intrinsic spatial and temporal diversity of a highly localized thalamic population. It also provides support for a feedforward model of thalamocortical processing that incorporates the physiological role of thalamic timing and synchrony in shaping cortical response properties.

Our ongoing work uses more complex naturalistic scenes as visual stimuli. Using computer graphics sequences with realistic scene statistics (see the MPI-Sintel flow project) lets us ask new questions about what drives neurons in LGN and V1. We are also using this understanding to propose new computational models of motion processing.

¹ Department of Biological Sciences, State University of New York, College of Optometry, New York, NY, USA