From feedforward vision to natural vision: The impact of free viewing and clutter on monkey inferior temporal object representations

Similar documents
Emergence of transformation tolerant representations of visual objects in rat lateral extrastriate cortex. Davide Zoccolan

Trade-Off between Object Selectivity and Tolerance in Monkey Inferotemporal Cortex

What Response Properties Do Individual Neurons Need to Underlie Position and Clutter Invariant Object Recognition?

Information and neural computations

Just One View: Invariances in Inferotemporal Cell Tuning

Chapter 7: First steps into inferior temporal cortex

NeuroImage 70 (2013) Contents lists available at SciVerse ScienceDirect. NeuroImage. journal homepage:

Bottom-up and top-down processing in visual perception

Position invariant recognition in the visual system with cluttered environments

Experience-Dependent Sharpening of Visual Shape Selectivity in Inferior Temporal Cortex

The Receptive Fields of Inferior Temporal Cortex Neurons in Natural Scenes

Using population decoding to understand neural content and coding

Selectivity and Tolerance ( Invariance ) Both Increase as Visual Information Propagates from Cortical Area V4 to IT

Retinotopy of Facial Expression Adaptation

Modeling the Deployment of Spatial Attention

Face Neurons. Chapter 4. Edmund T. Rolls. The discovery of face neurons

Visual Categorization: How the Monkey Brain Does It

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

Commentary on Moran and Desimone's 'spotlight in V4

U.S. copyright law (title 17 of U.S. code) governs the reproduction and redistribution of copyrighted material.

Lecture overview. What hypothesis to test in the fly? Quantitative data collection Visual physiology conventions ( Methods )

Object Selectivity of Local Field Potentials and Spikes in the Macaque Inferior Temporal Cortex

Visual Object Recognition Computational Models and Neurophysiological Mechanisms Neurobiology 130/230. Harvard College/GSAS 78454

Key questions about attention

Computer Science and Artificial Intelligence Laboratory Technical Report. July 29, 2010

A quantitative theory of immediate visual recognition

The detection of visual signals by macaque frontal eye field during masking

Neuroscience Tutorial

Report. Decoding the Representation of Multiple Simultaneous Objects in Human Occipitotemporal Cortex

Supplemental Material

Spatiotemporal Dynamics Underlying Object Completion in Human Ventral Visual Cortex

Prof. Greg Francis 7/31/15

Selectivity of Inferior Temporal Neurons for Realistic Pictures Predicted by Algorithms for Image Database Navigation

Selectivity of Local Field Potentials in Macaque Inferior Temporal Cortex

Attention March 4, 2009

A Detailed Look at Scale and Translation Invariance in a Hierarchical Neural Model of Visual Object Recognition

Object recognition in cortex: Neural mechanisms, and possible roles for attention

THE ENCODING OF PARTS AND WHOLES

Properties of Shape Tuning of Macaque Inferior Temporal Neurons Examined Using Rapid Serial Visual Presentation

1 Introduction Synchronous ring of action potentials amongst multiple neurons is a phenomenon that has been observed in a wide range of neural systems

Neural representation of action sequences: how far can a simple snippet-matching model take us?

View-invariant Representations of Familiar Objects by Neurons in the Inferior Temporal Visual Cortex

A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex

A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex

Object recognition and hierarchical computation

Adventures into terra incognita

Neuronal responses to plaids

A Source for Feature-Based Attention in the Prefrontal Cortex

Sum of Neurally Distinct Stimulus- and Task-Related Components.

Manuscript under review for Psychological Science. Direct Electrophysiological Measurement of Attentional Templates in Visual Working Memory

Model-free detection of synchrony in neuronal spike trains, with an application to primate somatosensory cortex

Repetition Suppression Accompanying Behavioral Priming. in Macaque Inferotemporal Cortex. David B.T. McMahon* and Carl R. Olson

Continuous transformation learning of translation invariant representations

Memory, Attention, and Decision-Making

Nonlinear processing in LGN neurons

Neurons and Perception March 17, 2009

Object-based vision and attention in primates Carl R Olson

Experimental Design. Thomas Wolbers Space and Aging Laboratory Centre for Cognitive and Neural Systems

Incremental grouping of image elements in vision

Visual attention mediated by biased competition in extrastriate visual cortex

Signals in inferotemporal and perirhinal cortex suggest an untangling of visual target information

Comparison of Associative Learning- Related Signals in the Macaque Perirhinal Cortex and Hippocampus

Object Category Structure in Response Patterns of Neuronal Population in Monkey Inferior Temporal Cortex

Attention Response Functions: Characterizing Brain Areas Using fmri Activation during Parametric Variations of Attentional Load

Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C., Fried, I. (2005). Invariant visual representation by single neurons in the human brain, Nature,

Morton-Style Factorial Coding of Color in Primary Visual Cortex

Journal of Neuroscience. For Peer Review Only

HST 583 fmri DATA ANALYSIS AND ACQUISITION

A Neurally-Inspired Model for Detecting and Localizing Simple Motion Patterns in Image Sequences

The perirhinal cortex and long-term familiarity memory

Neural Mechanisms Underlying Visual Object Recognition

Selective Attention. Modes of Control. Domains of Selection

Visual processing is usually subdivided into an early preattentive

Frank Tong. Department of Psychology Green Hall Princeton University Princeton, NJ 08544

Effects of Illumination Intensity and Direction on Object Coding in Macaque Inferior Temporal Cortex

A quantitative theory of immediate visual recognition

Consciousness The final frontier!

A quantitative theory of immediate visual recognition

Networks and Hierarchical Processing: Object Recognition in Human and Computer Vision

Article. A Growth-Cone Model for the Spread of Object-Based Attention during Contour Grouping

Size-invariant but viewpoint-dependent representation of faces

Processing of the incomplete representation of the visual world

Information in the Neuronal Representation of Individual Stimuli in the Primate Temporal Visual Cortex

Spatial Sensitivity of Macaque Inferior Temporal Neurons

Sensitivity to timing and order in human visual cortex

ATTENTIONAL MODULATION OF VISUAL PROCESSING

Linearly Additive Shape and Color Signals in Monkey Inferotemporal Cortex

Ultra-fast Object Recognition from Few Spikes Chou Hung, Gabriel Kreiman, Tomaso Poggio, James J. DiCarlo

Spatial coding and invariance in object-selective cortex

arxiv: v1 [q-bio.nc] 12 Jun 2014

How does the ventral pathway contribute to spatial attention and the planning of eye movements?

Cognitive Modelling Themes in Neural Computation. Tom Hartley

Competition in visual working memory for control of search

Search for a category target in clutter

The representation of perceived shape similarity and its role for category learning in monkeys: A modeling study

Object detection in natural scenes by feedback

Neuronal selectivity, population sparseness, and ergodicity in the inferior temporal visual cortex

Visual masking and RSVP reveal neural competition

Attention, short-term memory, and action selection: A unifying theory

Transcription:

From feedforward vision to natural vision: The impact of free viewing and clutter on monkey inferior temporal object representations James DiCarlo The McGovern Institute for Brain Research Department of Brain and Cognitive Sciences Massachusetts Institute of Technology, Cambridge MA

The core problem of object recognition Position Size Pose Illumination Clutter Background scene Other objects How does the brain recognize each object across this wide range of conditions? One needs an image representation that is selective for object identity, yet tolerant to such transformations.

Rhesus monkey model We have some idea of where we can find such an image representation (IT). We can study it at the most appropriate level of abstraction (neuronal spikes).

Monkey visual system Decision and action Memory Explicit Implicit

AIT contains a rapidly evoked, explicit object representation 100 ms 100 ms Isolated, single objects. Passive viewing. time Object identity or category is directly* available in the population response, regardless of (e.g.) object position and scale. Hung, Kreiman, Poggio and DiCarlo Science (2005) Gross et al. (1972), Perret et al. (1982), Desimone et al.. (1984), Tovee et al. (1984), Schwartz et al (1985), Ito et al. (1995), Logethetis et al. (1996), Op de Beeck and Vogles (2000), DiCarlo and Maunsell (2000), etc.

Feedforward* representation (The Core) * First evoked pattern of IT activity when an image is presented to the eye The Core is fast. The Core is powerful. The Core is not yet understood. Mechanisms? Role in natural vision? (Is it generalizable?)

The Core and natural vision What is natural vision? You know it when you see it.

The Core and natural vision

The Core and natural vision How does natural vision challenge the basic model of core vision? 1) Eye movements ( free viewing 2) Clutter / Scene / Context: objects appear among other objects and on backgrounds 3) Goal directed (e.g. feature and spatial attention, motor preparation to act, arousal) )

Object identification task Object Response Object Response A A 20 deg B C D B C D

Discriminable Detectable

fixation point

Example IT neuron

IT Population summary Population average response (normalized) 1.8 1.0 0 best target controlled free n = 63 worst target DiCarlo and Maunsell, Nature Neuroscience, 3: 814-821 (2000)

IT responses are nearly identical in controlled and free viewing conditions DiCarlo and Maunsell, Nature Neuroscience, 3: 814-821 (2000) DiCarlo and Maunsell, J Neurophysiology (2005)

The Core and natural vision How does natural vision challenge the basic model of core vision? 1) Eye movements ( free viewing ) 2) Clutter / Scene / Context: objects appear among other objects and on backgrounds Not much to worry about here. DiCarlo and Maunsell, 2000 Sheinberg and Logothetis, 2001 3) Goal directed (e.g. feature and spatial attention, motor preparation to act, arousal)

Natural vision: Clutter, scene, and context In the real world In the lab

Natural vision: Clutter, scene, and context IT Receptive Field

Long term goal: Understand IT in clutter IT responses to object are typically reduced when additional objects are presented (Sato, 1989; Miller et al., 1993; Rolls and Tovee, 1995; Chelazzi et al., 1998; Missal et al., 1999) Single objects Pair of objects Preferred object RF IT response Poor object

First open questions Object pairs: IT response:???? Any systematic relationship between: response to an object pair responses to the constituent objects?

Experimental design overview Davide Zoccolan and David Cox Recorded IT neuronal responses to the presentation of: Single objects Pairs of objects Triplets of objects In three monkeys Using two complementary experiments

Experiment 1

Experiment 2

Stimulus conditions EXPERIMENT 1 EXPERIMENT 2 Single objects 2 deg Object pairs

Stimulus conditions EXPERIMENT 1 EXPERIMENT 2 Single objects 2 deg Object triplets

Core response: Rapid visual presentation 100 ms 100 ms Fixate 100 ms 100 ms 100 ms Stimuli presented at 5 per sec Passive viewing 300 ms 104 neurons recorded in three monkeys

spikes/s Example IT neuron

Example IT neuron Response to object pairs (spikes/s) Sum of responses to single objects (spikes/s)

Example IT neuron Response to object pairs (spikes/s) Sum Average Zoccolan, Cox and DiCarlo, 2005 Sum of responses to single objects (spikes/s)

Population analysis Pairs (n Triplets (n = 79) r = 0.92 = 48) r = 0.91 Response to multiple objects (spikes/s) 120 80 40 0 0 40 80 Sum Average 120 150 100 50 0 0 50 Sum Average 100 150 Zoccolan, Cox and DiCarlo, 2005 Sum of responses to single objects (spikes/s)

Summary: The Core and multiple objects Under the conditions described here: An average rule is a very good predictor of the response of individual IT neurons (explains ~63% of response variance r 0.8) => The response pattern of The Core can be predicted by the response pattern to each constituent object => useful for supporting the simultaneous representation of multiple objects

The Core and natural vision How does natural vision challenge the basic model of core vision? 1) Eye movements ( free viewing ) 2) Clutter / Scene / Context: objects appear among other objects and on backgrounds Not much to worry about here. DiCarlo and Maunsell, 2000 Sheinberg and Logothetis, 2001 3) Goal directed (e.g. feature and spatial attention, motor preparation to act, arousal) Very important challenge. Beginnings of a systematic understanding. Zoccolan, Cox and DiCarlo, 2005