A new path to understanding vision


A new path to understanding vision: from the perspective of the primary visual cortex. Li Zhaoping. [Figure: brain schematic labeling the retina, primary visual cortex (V1), other visual cortices, and frontal brain areas.]

A new path to understanding vision. Li Zhaoping, July 16, 2018, presented at APCV 2018, HangZhou, China. Traditional paths to understanding vision: (1) low-level vision, mid-level vision, high-level vision; (2) David Marr: primal sketch, 2.5-D sketch, 3-D model.

Talk outline: (1) The functional role of the primary visual cortex (V1). (2) In light of V1's role, a new plan to understand vision. (3) A first example study in this new plan.

1953, Stephen Kuffler: center-surround feature detectors in the retina.

1960s-70s, Hubel and Wiesel: bar/edge feature detectors in V1.

Some standard models of neural encoding in the retina/V1 (from Fig. 1 of Carandini et al 2005), explaining: retinal receptive fields; V1 properties: orientation tuning, color tuning, motion-direction tuning, spatial-frequency tuning, ocular dominance (eye-of-origin tuning), simple/complex cells; plus models of gain control and response normalization by population activities. These explain 80% of the response variance in the retina, but only 35% of the response variance in V1!!
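The standard receptive-field models just listed can be sketched in a few lines: a difference-of-Gaussians for the retinal/LGN center-surround detector, and a Gabor for the V1 oriented bar/edge detector. A minimal numpy sketch; all sizes and bandwidths are illustrative choices, not values from the slide:

```python
import numpy as np

def difference_of_gaussians(size=21, sigma_c=1.5, sigma_s=3.0):
    """Center-surround receptive field: the standard retinal/LGN model."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    center = np.exp(-r2 / (2 * sigma_c ** 2)) / (2 * np.pi * sigma_c ** 2)
    surround = np.exp(-r2 / (2 * sigma_s ** 2)) / (2 * np.pi * sigma_s ** 2)
    return center - surround          # excitatory center, inhibitory surround

def gabor(size=21, sigma=3.0, wavelength=8.0, theta=0.0):
    """Oriented edge detector (odd-symmetric Gabor): the standard V1 simple-cell model."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    xr = xx * np.cos(theta) + yy * np.sin(theta)
    envelope = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return envelope * np.sin(2 * np.pi * xr / wavelength)

dog = difference_of_gaussians()   # responds to local contrast, not mean light
gab = gabor(theta=np.pi / 4)      # tuned to 45-degree edges
```

Convolving an image with `dog` or `gab` gives the linear stage of the "linear filtering, rectification and squaring, response normalization" cascade discussed below.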

Do we know early vision? What is V1 doing? "Do we really know what the early visual system does?" (Carandini, Demb, Mante, Tolhurst, Dan, Olshausen, Gallant, Rust 2005). Standard models of neurons in the retina account for 80% of the response variance, but standard models of V1 capture only 35% of the variance in V1. "How close are we to understanding V1?" (Olshausen and Field 2005). Standard models combining linear filtering, rectification and squaring, and response normalization still miss 86% of the variances in V1 responses.

Is V1 doing sparse coding (more efficient encoding than in the retina)? Olshausen and Field 1996: emergence of simple-cell receptive field properties by learning a sparse code for natural images. Bell and Sejnowski 1997: the independent components of natural scenes are edge filters. Simoncelli and Olshausen 2001: natural image statistics and neural representation.

But other than for stereo vision, why not make the encoding more efficient already in the retina? V1 has ~100 times as many cells as the retina (Barlow 1981), quite inefficient/redundant! And what about contextual influences on neural responses?!
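The 80% and 35% figures are fractions of response variance explained by a model. A minimal sketch of how such a number is computed, using hypothetical data; the noise levels are chosen only to reproduce retina-like and V1-like values:

```python
import numpy as np

def variance_explained(response, prediction):
    """Fraction of the response variance captured by the model prediction."""
    return 1.0 - (response - prediction).var() / response.var()

rng = np.random.default_rng(0)
signal = rng.standard_normal(10_000)   # the component a standard model captures

# Hypothetical neurons: the measured response adds unmodeled structure
retina_like = signal + 0.5 * rng.standard_normal(10_000)   # ~80% explained
v1_like = signal + 1.4 * rng.standard_normal(10_000)       # ~35% explained

print(round(variance_explained(retina_like, signal), 2))   # ~0.8
print(round(variance_explained(v1_like, signal), 2))       # ~0.34
```

The point of the slide is that for real V1 neurons most of the variance behaves like the unmodeled term here.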

The primary visual cortex (V1). 1953, Stephen Kuffler: retina; 1959--, Hubel and Wiesel: V1. Then, experimentally: V1 and beyond; theoretically, modelling: Reichardt, Marr, etc. 2005: "How close are we to understanding V1?" (Olshausen and Field 2005); "Do we really know what the early visual system does?" (Carandini, Demb, Mante, Tolhurst, Dan, Olshausen, Gallant, Rust 2005). Standard models of the V1 neural receptive field, combining linear filtering, rectification and squaring, and response normalization, capture only 15-35% of the variance in V1 responses. 2012, David Hubel, in answer to "What do you feel are the next big questions in the field?": "We have some idea for the retina, the lateral geniculate body, and the primary visual cortex, but that's about it." (Hubel & Wiesel 2012, Neuron)

Questions: Is a lack of understanding of V1 hindering our progress beyond V1, physiologically and functionally (in behaviour)?

Information bottlenecks in the visual pathway:
Retinal input: ~10^9 bits/second (Kelly 1962); ~25 frames/second, 2000x2000 pixels, 1 byte/pixel.
Optic nerve: ~10^7 bits/second; ~10^6 neurons, ~10 spikes/neuron, ~1 bit/spike.
Perception: ~40 bits/second (Sziklai 1956), e.g., reading "To be or not to be, that is the question...".

We are nearly blind! Vision ~ looking (selecting) + seeing. Top-down vs. bottom-up selection. Task: find a uniquely oriented bar.

Questions: which brain areas are doing the selection? Frontal? Parietal? Saliency regardless of visual features (Koch & Ullman 1985, Itti & Koch 2001, etc.).
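The bottleneck numbers above are simple arithmetic; a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the bottleneck numbers on the slide.
frames_per_second = 25
pixels = 2000 * 2000
bits_per_pixel = 8                      # 1 byte/pixel

retinal_input = frames_per_second * pixels * bits_per_pixel   # bits/second
optic_nerve = 10**6 * 10 * 1            # ~1e6 neurons x ~10 spikes/s x ~1 bit/spike
perception = 40                         # bits/second (Sziklai 1956)

print(retinal_input)   # 800000000, i.e. ~1e9 bits/second
print(optic_nerve)     # 10000000, i.e. 1e7 bits/second
print(perception)      # 40
```

So each stage discards roughly two orders of magnitude, and perception keeps about one part in twenty million of the retinal input.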

[Diagram: eye --> LGN (thalamus) --> V1 --> extrastriate cortices --> parietal cortex and frontal eye field; superior colliculus --> brain stem, etc., to control the eye muscles; area percentages in the figure: 11.5%, 17%, 3.5%, 12.5%, 29%.] In lower mammals, a smaller percentage of the total cortex is prefrontal (Fuster 2002). Non-mammals do not have a neocortex, and the forebrain of fish serves olfaction more than vision (Butler & Hodos 2005).

The V1 Saliency Hypothesis: a bottom-up saliency map in the primary visual cortex (Li 1999, 2002). Retinal inputs drive V1; the V1 firing rates (the highest at each location) constitute the saliency map; a winner-take-all operation, via the superior colliculus, selects the most salient location. Neural activities serve as a universal currency to bid for visual selection.

The mechanism: iso-feature suppression via intracortical connections (Bosking et al 1997): iso-orientation suppression (Blakemore & Tobin 1972, Gilbert & Wiesel 1983, Rockland & Lund 1983, Allman et al 1985, Hirsch & Gilbert 1991, Li & Li 1994, etc.), iso-color suppression (Wachtler et al 2003), and iso-motion-direction suppression (Jones et al 2001).

A surprising prediction: an invisible feature attracts attention! (Zhaoping 2008, 2012). An eye-of-origin singleton escapes iso-eye-of-origin suppression even though it is invisible in the fused percept of the left-eye and right-eye images, just as an orientation singleton escapes iso-orientation suppression. Looking without seeing! Replicated by multiple research groups since: a fingerprint of V1.
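A toy sketch of the hypothesis's computational core, under strong simplifying assumptions (binary orientation labels, nearest-neighbor suppression only, no dynamics; the function and parameter values are inventions for this example): iso-orientation suppression lowers the responses of units surrounded by similarly oriented items, so an orientation singleton keeps the highest firing rate and wins the winner-take-all selection.

```python
import numpy as np

def saliency_map(orientations, suppression=0.08):
    """Toy V1 saliency: each unit's response (baseline 1.0) is reduced by
    iso-orientation suppression from 8-connected neighbors that share its
    preferred orientation."""
    h, w = orientations.shape
    responses = np.ones((h, w))
    for i in range(h):
        for j in range(w):
            same = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (di or dj) and 0 <= ni < h and 0 <= nj < w:
                        same += orientations[ni, nj] == orientations[i, j]
            responses[i, j] -= suppression * same
    return responses

# A field of 45-degree bars with one 135-degree singleton: "find the odd bar".
bars = np.full((5, 7), 45)
bars[2, 4] = 135
sal = saliency_map(bars)
winner = tuple(int(k) for k in np.unravel_index(np.argmax(sal), sal.shape))
print(winner)  # (2, 4): the orientation singleton pops out and wins
```

The singleton has no iso-oriented neighbors, so it alone keeps the full response; every other unit is suppressed, and the winner-take-all stage selects the singleton's location.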

Looking for saliency signals in the brain without the awareness/top-down confound: a collaboration with Fang Fang (Beijing University) and his student Xilin Zhang (VSS 2011; Neuron, 2012). A salient orientation pop-out (an orientation contrast at the cue) is quickly masked to prevent awareness (verified); a vernier-task probe may appear at the cued location or in the other hemifield, and a cueing effect is measured. Which brain areas are activated by this invisible cue? By fMRI and ERP: V1, but not IPS or FEF!! That is, a saliency effect without awareness.

Testing the V1 theory on behaving monkeys (Yan, Zhaoping & Li, submitted). Task: saccade to a uniquely oriented bar as soon as possible. V1 neural responses (spikes/sec) to the input stimulus in the receptive field, plotted against time since visual-input onset, are higher for trials with faster saccades than for trials with slower or failed saccades. Quantitative, zero-parameter predictions from the theory (solid curve: Planck's law; squares: data points) imply that higher visual areas are not involved (Koene & Zhaoping 2007; Zhaoping & Zhe 2015).

Talk outline: (1) The functional role of the primary visual cortex (V1): attentional selection. (2) In light of V1's role, a new plan to understand vision. (3) A first example study in this new plan.

Information bottlenecks in the visual pathway: ~10^9 bits/second at the retina (Kelly 1962; ~25 frames/second, 2000x2000 pixels, 1 byte/pixel), ~10^7 bits/second in the optic nerve (~10^6 neurons, ~10 spikes/neuron, ~1 bit/spike), and ~40 bits/second for perception (Sziklai 1956; e.g., reading "To be or not to be, that is the question...") --- bottom-up selection.

A new path to understanding vision: from 20 frames, 20 megabytes/second, down to 40 bits/second. Traditional paths to understanding vision: (1) low-level vision, mid-level vision, high-level vision; (2) David Marr: primal sketch, 2.5-D sketch, 3-D model.

Talk outline: (1) The functional role of the primary visual cortex (V1): attentional selection. (2) In light of V1's role, a new plan to understand vision. (3) A first example study in this new plan.

A new path to understanding vision: 20 frames, 20 megabytes/second --- looking (periphery); 40 bits/second --- seeing (center).

Looking and seeing --- peripheral and central --- are two separate processes. Demo (Zhaoping & Guyader 2007): V1 is tuned to primitive bars; IT and higher areas recognize the 'X' shape, rotationally invariantly. Add a horizontal bar or a vertical bar to each oblique bar.

The display spans 46x32 degrees of visual angle --- condition A.

Gaze arrives at the target after a few saccades: looking, driven mainly by the bottom-up saliency of the unique orientation. Then gaze dawdles around the target, abandons it, and returns: seeing.

A new path to understanding vision: 20 frames, 20 megabytes/second for looking; 40 bits/second for seeing. Peripheral vision and central vision must differ qualitatively in their computational algorithms for seeing. Focus on feedforward/feedback interactions with V1. A demo of crowding in the periphery.

Perception = ? Subject task: report the perceived tilt. Grating stimuli are shown to the left eye and the right eye; in V1, the signals are efficiently encoded by two de-correlated channels, the ocular sum (left + right) and the ocular difference (left - right) (Li & Atick 1994). The summation percept dominates (Zhaoping 2017).
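The decorrelation claim of Li & Atick 1994 can be illustrated numerically: when left- and right-eye signals are dominated by a shared component, the sum and difference channels are nearly uncorrelated even though the eyes' inputs are strongly correlated. A sketch with synthetic signals (the noise level 0.3 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(1)
shared = rng.standard_normal(100_000)            # what both eyes see in common
left = shared + 0.3 * rng.standard_normal(100_000)
right = shared + 0.3 * rng.standard_normal(100_000)

ocular_sum = left + right     # S+ channel
ocular_diff = left - right    # S- channel

corr_eyes = np.corrcoef(left, right)[0, 1]
corr_channels = np.corrcoef(ocular_sum, ocular_diff)[0, 1]
print(round(corr_eyes, 2))      # strongly correlated inputs (~0.92)
print(round(corr_channels, 2))  # decorrelated channels (~0.0)
```

Transmitting the two decorrelated channels, rather than the two redundant eye signals, is the efficient-coding rationale the slide refers to.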

Why does perception prefer the ocular summation channel? (Zhaoping 2017) Higher areas perform analysis-by-synthesis (c.f. predictive coding): feedforward, feedback, verify, and re-weight (FFVW). The higher-level "Bayesian(?) monster" reasons: if I perceive it, it is likely (by the prior) shown to both my left and my right eyes, so it should resemble the input in the sum channel!!! Signals flow from the left and right eyes through the retina to V1's ocular-sum and ocular-difference channels and onward, with top-down feedback to V1 to verify. This is not because the periphery cannot see tilt!

Proposal: top-down feedback to V1 for analysis-by-synthesis is weaker or absent in peripheral vision.
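The FFVW re-weighting idea can be cartooned as follows. This is an illustrative sketch only, not the model in Zhaoping 2017; the function `perceived_tilt`, its `feedback_gain` parameter, and the 0.8 discount are inventions for this example. Strong feedback (central vision) lets the prior veto the difference channel; absent feedback (peripheral vision) leaves it un-vetoed.

```python
# Cartoon of feedforward-feedback-verify-reweight (FFVW); not the actual
# model in Zhaoping 2017. Two V1 channels report conflicting tilts; the
# higher area's prior expects binocularly matched input, so evidence from
# the ocular-difference channel is down-weighted before the percept forms.

def perceived_tilt(tilt_sum, tilt_diff, strength_sum, strength_diff,
                   feedback_gain=1.0):
    """Weighted tilt percept; feedback_gain < 1 models weak or absent
    top-down feedback (peripheral vision), leaving the difference
    channel un-vetoed."""
    w_sum = strength_sum
    w_diff = strength_diff * (1.0 - 0.8 * feedback_gain)  # prior's veto
    return (w_sum * tilt_sum + w_diff * tilt_diff) / (w_sum + w_diff)

central = perceived_tilt(+10, -10, 1.0, 1.0, feedback_gain=1.0)
peripheral = perceived_tilt(+10, -10, 1.0, 1.0, feedback_gain=0.0)
print(central, peripheral)  # the sum channel dominates only with feedback
```

With full feedback the percept is pulled toward the sum channel's tilt; with no feedback the two channels contribute equally.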

Testing it in depth perception (Zhaoping & Ackermann 2018). Stimulus: a central disk (non-zero disparity) and a surrounding ring (zero disparity) in random-dot stereograms; the dots of the central disk are either correlated or anti-correlated between the two eyes. A V1 neuron's response as a function of disparity reverses for anticorrelated versus correlated dots (Cumming and Parker 1997), so V1 feeds forward reversed depth to higher brain areas. Yet normally there is no reversed-depth percept: V1's report is vetoed by the top-down feedback that verifies it, because the prior expects correlated inputs between the eyes. If peripheral vision has weaker or no feedback, the reversed depth should survive there. Measure: the accuracy of reporting disparity-defined depth for correlated vs. anti-correlated central disks.

Proposal under test: top-down feedback to V1 is weaker or absent in peripheral vision for analysis-by-synthesis!
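Why anticorrelated dots reverse the sign of V1's disparity signal can be seen in a toy correlation-based detector (a simplification in the spirit of, not the model from, Cumming and Parker 1997): with dot contrasts of -1 or +1, the correlation at the true disparity is exactly +1 for correlated dots and -1 for anticorrelated dots.

```python
import numpy as np

rng = np.random.default_rng(2)
left = rng.choice([-1.0, 1.0], size=500)   # random dot contrasts, left eye
true_disparity = 3

def detector(left_eye, right_eye, max_shift=6):
    """Correlation at each candidate disparity (toy binocular detector)."""
    n = len(left_eye)
    return {d: float(np.mean(left_eye[max_shift:n - max_shift] *
                             right_eye[max_shift + d:n - max_shift + d]))
            for d in range(-max_shift, max_shift + 1)}

right_corr = np.roll(left, true_disparity)  # correlated dots, shifted in depth
right_anti = -right_corr                    # anticorrelated: contrasts inverted

resp_corr = detector(left, right_corr)
resp_anti = detector(left, right_anti)
print(resp_corr[true_disparity], resp_anti[true_disparity])  # 1.0 -1.0
```

The anticorrelated response is the sign-flipped correlated response, which is exactly the "reversed depth" signal that V1 feeds forward and that feedback normally vetoes.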

Thanks to the funding sources: Gatsby Foundation, EPSRC, BBSRC; and to collaborators: Wu Li, Yin Yan, Ansgar Koene, Li Zhe, Joelle Ackermann, etc. A new path to understanding vision: falsifiable; from 20 frames, 20 megabytes/second, to 40 bits/second; opening a window onto our brain. Theory motivates and inspires experiments; experiments test theory.