Eye movements, recognition, and memory

Eye movements, recognition, and memory. Garrison W. Cottrell, TDLC, Gary's Unbelievable Research Unit (GURU), Computer Science and Engineering Department, Institute for Neural Computation, UCSD. Joint work with Luke Barrington, Janet Hsiao, & Tim Marks.

Background: I have a model of face and object processing that accounts for a lot of data.

The Face and Object Processing System: [figure] a pipeline from the Pixel (Retina) Level through the Perceptual (V1) Level (Gabor filtering) and the Gestalt Level (PCA) to a Hidden Layer and the Category Level, with an expert-level classifier (FFA: Bob, Sue, Jane) and a basic-level classifier (LOC: book, can, cup, face).

Effects accounted for by this model:
- Categorical effects in facial expression recognition (Dailey et al., 2002)
- Similarity effects in facial expression recognition (ibid.)
- Why fear is hard (ibid.)
- Rank of difficulty in configural, featural, and inverted face discrimination (Zhang & Cottrell, 2006)
- Holistic processing of identity and expression, and the lack of interaction between the two (Cottrell et al., 2000)
- How a specialized face processor may develop under realistic developmental constraints (Dailey & Cottrell, 1999)
- How the FFA could be recruited for other areas of expertise (Sugimoto & Cottrell, 2001; Joyce & Cottrell, 2004; Tran et al., 2004; Tong et al., 2005)
- The other-race advantage in visual search (Haque & Cottrell, 2005)
- Generalization gradients in expertise (Nguyen & Cottrell, 2005)
- Priming effects (face or character) on discrimination of ambiguous Chinese characters (McCleery et al., under review)
- Age of acquisition effects in face recognition (Lake & Cottrell, 2005; L&C, under review)
- Memory for faces (Dailey et al., 1998, 1999)
- Cultural effects in facial expression recognition (Dailey et al., in preparation)

Background: I have a model of face and object processing that accounts for a lot of data. But it is fundamentally wrong in an important way! It assumes that the whole face is processed to the same level of detail everywhere, and that the face is always aligned with the brain.

Background: Instead, people and animals actively sample the world around them with eye movements. The images our eyes take in slosh around on the retina like an MTV video. We only get patches of information from any fixation, but somehow we perceive a stable visual world.

Background: So, a model of visual processing needs to integrate information dynamically over time. We need to decide where to look at any moment: there are both bottom-up (inherently interesting things) and top-down (task demands) influences on this.

Questions for today What form should a model take in order to model eye movements without getting into muscle control? Do people actually gain information from saccades in face processing? How is the information integrated? How do features change over time?

Answers so far: What form should a model take in order to model eye movements without getting into muscle control? Probabilistic. Do people actually gain information from saccades? Yes. How is the information integrated? In a Bayesian way. How do features change over time? Don't know yet, but we hypothesize that the effective receptive fields grow over time.

Today's talk: What form should a model take in order to model eye movements without getting into muscle control? Probabilistic. Do people actually gain information from saccades? Yes. How is the information integrated? In a Bayesian way: NIMBLE. How do features change over time? Don't know yet, but we hypothesize that the effective receptive fields grow over time.
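The "Bayesian way" of integrating information across fixations can be sketched as sequential evidence accumulation: each fixation yields a likelihood over identities, and the posterior is the normalized product of those likelihoods. This is a minimal illustration, not the NIMBLE implementation; it assumes fragment likelihoods are conditionally independent given identity, and the numbers are made up.

```python
import numpy as np

def integrate_fixations(likelihoods, prior=None):
    """Combine per-fixation likelihoods over identities into a posterior.

    likelihoods: array of shape (n_fixations, n_identities); each row holds
    P(fragment_t | identity), assumed conditionally independent given identity.
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    n_ids = likelihoods.shape[1]
    log_post = np.log(prior if prior is not None else np.full(n_ids, 1.0 / n_ids))
    for lik in likelihoods:          # one Bayesian update per fixation
        log_post += np.log(lik)
        log_post -= log_post.max()   # keep the numerics stable
    post = np.exp(log_post)
    return post / post.sum()

# Toy example: three fixations over two identities; identity 0 wins overall.
post = integrate_fixations([[0.8, 0.2], [0.6, 0.4], [0.3, 0.7]])
```

Note that a single fixation favoring identity 1 (the third one) does not overturn the accumulated evidence for identity 0.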

Do people really need more than one fixation to recognize a face? To answer this question, we ran a simple face recognition experiment: Subjects saw 32 faces, then were given an old/new recognition task with those 32 faces and 32 more. However, subjects were restricted to 1, 2, 3, or unlimited fixations on the face at test time.

Experimental Design: Deleted until publication - sorry!

Results Deleted until publication - sorry!


Where should our model look? We have lots of ideas about this, but for the purposes of today's talk, we will just use bottom-up salience: figure out where the image is interesting, and look there. Interesting = areas of high contour.

Interest points created using Gabor filter variance
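A minimal sketch of that idea: compute salience as the local variance of Gabor filter responses across orientations, and take the most salient locations as interest points. The filter parameters, window size, and number of orientations here are illustrative assumptions, not the values used in the talk.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gabor_kernel(theta, wavelength=8.0, sigma=3.0, size=15):
    """Real (even) Gabor filter at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

def convolve_same(image, kernel):
    """2-D same-size filtering via sliding windows, with zero padding."""
    ph, pw = kernel.shape[0] // 2, kernel.shape[1] // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    windows = sliding_window_view(padded, kernel.shape)
    return np.einsum('ijkl,kl->ij', windows, kernel)

def salience_map(image, n_orientations=4, window=9):
    """Salience = variance of Gabor responses across orientations, pooled locally."""
    image = np.asarray(image, dtype=float)
    responses = np.stack([
        convolve_same(image, gabor_kernel(np.pi * k / n_orientations))
        for k in range(n_orientations)
    ])
    var = responses.var(axis=0)   # high where contours break orientation symmetry
    return convolve_same(var, np.full((window, window), 1.0 / window**2))

def top_interest_points(image, n_points=10):
    """Return (row, col) coordinates of the n_points most salient pixels."""
    s = salience_map(image)
    idx = np.argsort(s, axis=None)[::-1][:n_points]
    return np.column_stack(np.unravel_index(idx, s.shape))
```

In flat regions all orientations respond identically, so the variance (and hence the salience) is zero; it is nonzero only along contours, matching the "high contour" intuition above.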

Given those fragments, how do we recognize people? The basic idea is that we simply store the fragments with labels: Bob, Carol, Ted, or Alice; in the study set vs. not in the study set. (Note that we have cast face recognition as a categorization problem.) When we look at a new face, we look at the labels of the most similar fragments, and vote. This can be done easily in a (naïve) Bayesian framework.
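A toy version of this store-and-vote scheme, using a nearest-neighbor vote as a stand-in for the full Bayesian treatment. The feature vectors and class names are hypothetical; real fragments would be image patches in some feature space.

```python
import numpy as np
from collections import Counter

class FragmentMemory:
    """Store labeled fragments; classify a new one by a vote of its neighbors.

    A toy stand-in for the scheme in the talk: the label of a new fragment is
    decided by voting among the k most similar stored fragments.
    """

    def __init__(self, k=5):
        self.k = k
        self.fragments = []   # stored feature vectors
        self.labels = []      # one label per stored fragment

    def store(self, fragment, label):
        self.fragments.append(np.asarray(fragment, dtype=float))
        self.labels.append(label)

    def classify(self, fragment):
        fragment = np.asarray(fragment, dtype=float)
        dists = [np.linalg.norm(fragment - f) for f in self.fragments]
        nearest = np.argsort(dists)[: self.k]          # k most similar fragments
        votes = Counter(self.labels[i] for i in nearest)
        return votes.most_common(1)[0][0]

# Usage: two identities with slightly different (made-up) fragment statistics.
mem = FragmentMemory(k=3)
rng = np.random.default_rng(0)
for _ in range(10):
    mem.store(rng.normal(0.0, 0.1, size=4), "Bob")
    mem.store(rng.normal(1.0, 0.1, size=4), "Alice")
who = mem.classify(np.full(4, 0.95))
```

Replacing the hard vote with per-class kernel density estimates over the stored fragments gives the probabilistic version the talk alludes to.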

A basic test: Face Recognition. Following Duchaine & Nakayama (2005): images are displayed for 3 seconds, which means about 10 fixations, which means about 10 fragments are stored, chosen at random using bottom-up salience; 30 lures. Normal human ROC area (A′) is from 0.9 to 1.0. NIMBLE results:
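For reference, A′ is the standard nonparametric estimate of ROC area computed from a hit rate H and false-alarm rate F (Pollack & Norman's formula); the rates in the usage line are made up.

```python
def a_prime(hit_rate, fa_rate):
    """Nonparametric ROC-area estimate A' from hit and false-alarm rates."""
    h, f = hit_rate, fa_rate
    if h == f:                      # chance performance
        return 0.5
    if h < f:                       # below-chance case, by symmetry
        return 1.0 - a_prime(f, h)
    return 0.5 + (h - f) * (1 + h - f) / (4 * h * (1 - f))

# Usage: H = 0.8, F = 0.2 gives A' = 0.875; a perfect observer gives 1.0.
score = a_prime(0.8, 0.2)
```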

NIM's face memory performance as a function of fixations. (Unfortunately these are not fixated on the first three saccades, so I cropped the image!)

Obviously, these results are fuzzy, but suggestive of the human results!

Multi-Class Memory Tasks: An identification task tests NIMBLE on novel images from multiple classes. Face identification: 29 identities from the FERET dataset (changes in expression and illumination). Object identification: 20 object classes from the COIL-100 dataset (changes in orientation, scale, and illumination).

Identification Results: Belongie, Malik & Puzicha (2002) report 97.6% accuracy on a 20-class, 4-training-image set from COIL-100.

What does this have to do with the temporal dynamics of learning? We now have evidence that humans require multiple fixations to perform well on this task. We have a model that makes fixations and performs the task over time. Now we need to add top-down control to the model so that it can decide at any moment where to look next - the model will need to learn a visual routine depending on the task! We need to come up with a way to adapt the representations over learning time.

Long-term questions: Do people have visual routines for sampling from stimuli? Do people have different routines for different tasks on identical stimuli? Do the routines change depending on the information gained at any particular location? Do different people have different routines? How do those routines change with expertise? How do they change over the acquisition of expertise?

END