Demonstratives in Vision

Similar documents
A neglected problem in the Representational Theory of Mind. Object Tracking and the Mind-World Connection

Visual indexes, preconceptual objects, and situated vision

Is Cognitive Science Special? In what way is it special? Cognitive science is a delicate mixture of the obvious and the incredible

Perceptual cognition and the design of Air Traffic Control interfaces

Page # Perception and Action. Lecture 3: Perception & Action. ACT-R Cognitive Architecture. Another (older) view

Deictic Codes for the Embodiment of Cognition. Ballard, Hayhoe, Pook, & Rao

Some Primitive Mechanisms of Spatial Attention 1

Change Blindness. The greater the lie, the greater the chance that it will be believed.

Zoo400 Exam 1: Mar 25, 1999

Cognition 113 (2009) Contents lists available at ScienceDirect. Cognition. journal homepage:

Birds' Judgments of Number and Quantity

What is mid level vision? Mid Level Vision. What is mid level vision? Lightness perception as revealed by lightness illusions

Theoretical Neuroscience: The Binding Problem Jan Scholz, , University of Osnabrück

Competing Frameworks in Perception

Competing Frameworks in Perception

(Visual) Attention. October 3, PSY Visual Attention 1

Neural circuits PSY 310 Greg Francis. Lecture 05. Rods and cones

Cognitive Modelling Themes in Neural Computation. Tom Hartley

Natural Scene Statistics and Perception. W.S. Geisler

How do Categories Work?

Identify these objects

Modeling Visual Search Time for Soft Keyboards. Lecture #14

Visual Selection and Attention

Animal Cognition. Introduction to Cognitive Science

VIEW AS Fit Page! PRESS PgDn to advance slides!

Attention! 5. Lecture 16 Attention. Science B Attention. 1. Attention: What is it? --> Selection

Lecture 2.1 What is Perception?

Selective Attention. Inattentional blindness [demo] Cocktail party phenomenon William James definition

Validating the Visual Saliency Model

Expert System Profile

UNIT. Experiments and the Common Cold. Biology. Unit Description. Unit Requirements

Learning and Adaptive Behavior, Part II

OVERVIEW TUTORIAL BEHAVIORAL METHODS CLAIM: EMLAR VII EYE TRACKING: READING. Lecture (50 min) Short break (10 min) Computer Assignments (30 min)

Searching through subsets:

Re: ENSC 370 Project Gerbil Functional Specifications

Spatial working memory load affects counting but not subitizing in enumeration

Sign Language Recognition using Webcams

Dynamics and Modeling in Cognitive Science - I

M.Sc. in Cognitive Systems. Model Curriculum

Introduction to. Metaphysics. Mary ET Boyle, Ph.D. Department of Cognitive Science UCSD

The Vote! Winners. $100 Question from Ch 10 11/16/11

the human 1 of 3 Lecture 6 chapter 1 Remember to start on your paper prototyping

How Lertap and Iteman Flag Items

Introduction and Historical Background. August 22, 2007

(In)Attention and Visual Awareness IAT814

Introduction to Computational Neuroscience

Detection of change in shape: an advantage for concavities

Games for Young Mathematicians Dot Cards

Computational Cognitive Science

Selective Attention (dichotic listening)

Concerns with Identity Theory. Eliminative Materialism. Eliminative Materialism. Eliminative Materialism Argument

Attention and Scene Perception

Turns Out, Counting on Your Fingers Makes You Smarter

Do New Objects Capture Attention? Steven L. Franconeri, 1 Andrew Hollingworth, 2 and Daniel J. Simons 3

Psychology 354 Week 10: Functional Analysis

Choosing Life: Empowerment, Action, Results! CLEAR Menu Sessions. Substance Use Risk 2: What Are My External Drug and Alcohol Triggers?

PSYC 441 Cognitive Psychology II

Lecture 9 Cognitive Processes Part I. Kashif Sajjad Bhatti Assistant Professor IIU, Islamabad

Feature binding in object-file representations of multiple moving items

Fodor on Functionalism EMILY HULL

Computational Cognitive Science. The Visual Processing Pipeline. The Visual Processing Pipeline. Lecture 15: Visual Attention.

Phil 490: Consciousness and the Self Handout [16] Jesse Prinz: Mental Pointing Phenomenal Knowledge Without Concepts

What Matters in the Cued Task-Switching Paradigm: Tasks or Cues? Ulrich Mayr. University of Oregon

Artificial Intelligence: Its Scope and Limits, by James Fetzer, Kluver Academic Publishers, Dordrecht, Boston, London. Artificial Intelligence (AI)

Satiation in name and face recognition

Hypothesis-Driven Research

IAT 814 Knowledge Visualization. Visual Attention. Lyn Bartram

Lecture 9: Lab in Human Cognition. Todd M. Gureckis Department of Psychology New York University

Goal-Directed Deployment of Attention in a Computational Model: A Study in Multiple-Object Tracking

VISA to (W)hole Personal Leadership

Getting the Design Right Daniel Luna, Mackenzie Miller, Saloni Parikh, Ben Tebbs

DATA COLLECTION TOOLS

Information Design. Information Design

Standard KOA3 5=3+2 5=4+1. Sarah Krauss CCLM^2 Project Summer 2012

Part 1 Making the initial neuron connection

The interplay of domain-specific and domain general processes, skills and abilities in the development of science knowledge

Offsets and prioritizing the selection of new elements in search displays: More evidence for attentional capture in the preview effect

Behavior Architectures

Understanding Users. - cognitive processes. Unit 3

PLEASE SCROLL DOWN FOR ARTICLE

80/20 THE 80/20 RULE. consider life: consider business:

The synergy of top-down and bottom-up attention in complex task: going beyond saliency models.

Prof. Greg Francis 7/31/15

Asymmetries in ecological and sensorimotor laws: towards a theory of subjective experience. James J. Clark

What matters in the cued task-switching paradigm: Tasks or cues?

Nature of the mind Intersection of Philosophy and Neuroscience

Canadian Cancer Society Relay For Life Survivor Chair

New objects do not capture attention without a sensory transient

User Interface. Colors, Icons, Text, and Presentation SWEN-444

An Attentional Framework for 3D Object Discovery

AI and Philosophy. Gilbert Harman. Thursday, October 9, What is the difference between people and other animals?

Psychology Research Methods Lab Session Week 10. Survey Design. Due at the Start of Lab: Lab Assignment 3. Rationale for Today s Lab Session

L6: Overview. with side orders of lecture revision, pokemon, and silly experiments. Dani Navarro

Psychology Perception

Chapter 5 Test Review. Try the practice questions in the Study Guide and on line

Demonstrative Thought

Lecture 9: Sound Localization

VISUAL PERCEPTION & COGNITIVE PROCESSES

= add definition here. Definition Slide

Spatial Attention and the Apprehension of Spatial Relations

Transcription:

Situating Vision in the World Zenon W. Pylyshyn (2000) Demonstratives in Vision Abstract Classical theories lack a connection between visual representations and the real world. They need a direct pre-conceptual connection between (proto-)objects in the visible world and visual representations. FINSTs (fingers of instantiation, aka Visual Indexes) do the needed work. 1

REPRESENTATIONS They encode properties of the world in the same way that words do. They can be incorrect. (We can misrepresent a wolf as a dog). But conceptual (descriptive) representations lack indexical reference. Indexical reference: representations whose reference depends on what is being pointed at by the speaker. REPRESENTATIONS Situated Vision tries to address the problem of connecting to the world by eliminating representations altogether. (Agre takes it to this extreme.) Behaviorism agents are just a bunch of complex reflexes. Visual Index theories and Situated Vision theories agree that there is a need to take into account the nature of the actual environment, not only the represented one, in explaining intelligent behavior. 2

We use the world as an extension of our minds. We can search through a visual scene the way we search through our own memories. We store object pointers so we can look up information about them in the world. Herb Simon (a major proponent of representations) has noted that, e.g., ants may seem to exhibit complex behaviors, but they likely in fact follow simple rules, like move in the direction of the sun and avoid large obstacles. In learning to add, we can learn a rule in terms of the next column on the left. To use the environment in this way, people need to keep track of individual objects, and use tracked objects as markers for cognitive activities. The visual index (FINST) is a mechanism for making this possible. 3

DEMONSTRATIVE REFERENCE Using demonstrative reference (deictic pointers) avoids the need to encode the scene in terms of global properties, and allows the encoding of relations between the objects and the perceiver/actor. Relevant objects can be selected directly. Contrast: There is something that is the North Star vs. This very thing is the North Star. The latter allows action. CONSTRUCTION OF THE IMAGE Object representations need to be posited anyway, to explain how the visual system constructs the image. The representation that is the output of the visual system is constructed from the retinal input in a series of stages. Some of the construction involves movement of the eyes (saccades). Look at a page, it is all clear, but fix your eyes and only a small area is clear. In scanning the system has to match points across inputs. This is the correspondence problem for incremental visual encoding. 4

CONSTRUCTION OF THE IMAGE Solving correspondence would be easy if the agent had an accurate 3D representation of all objects in the coordinates. But experiments show that little information is carried from one fixation to the next, and there aren t representations of absolute locations of objects. Changes in a scene are rarely noticed, unless attention is on the changing object. Another mechanism is needed to solve correspondence. Demonstrative pointers would do the needed work. Must point to objects, not locations, to work in dynamic scenes. The mechanism can t use descriptions of objects because the properties change CONSTRUCTION OF THE IMAGE Also, for pattern recognition, the system has to execute serial visual routines. The routines often involve the marking/tagging of objects. For example, a routine is needed for the counting of embedded squares. Instead of tags, Pylyshyn suggests pointers. FINSTs are pre-conceptual reference pointers in the sense that the objects are identified and represented without appeal to their properties (i.e., the concepts they fall under). The FINST might latch onto an object because of the object s properties, but at the introspectible output image, the object representations are not descriptions. 5

MULTIPLE OBJECT TRACKING First Exp: 8 dots on a screen, 4 of them flicker, and then they all move around the screen for about 10 seconds. When they stop, the subjects task is to identify the ones that had flickered. Subjects consistently track the 4 objects (87%) How? Not by encoding locations and updating information about locations too much attention and scanning is required. Other tracking experiments: odots can t be tracked when connected to nontarget dots with a bar (the visual system seems to treat the barbells as objects). odots can be tracked thru occluders and thru changes in color and shape. It takes less time to find a property among targets than non-targets. OTHER EVIDENCE FOR INDEXES Ullman s subitizing, or rapid enumeration (for < 4 items). RT increases by 60ms/item from 2 to 4 items; by about 100ms/item when more than 4 items. FINST explanation: there are 4 or 5 available pointers. Enumeration of active pointers does not require visual scanning of the display. Predictions: o If objects aren t individuated with focal attention, they can t be subitized (evidence: concentric squares). o Objects that suddenly appear in the environment get indexed and once indexed, it can be accessed without searching for it by its properties. (evidence: pop-out) Pointers vs. Priority Tags: Steve Yantis tags don t explain correspondence or directed eye movements. Tags in the real world might help, if we also had matching tags in the representation. That s, in effect, what pointers do. 6

IMPLEMENTATION: But how??? Koch and Ullman s neural net: It finds the most active unit and shuts off the rest. Then a detector for property P is sent out. If the detector fires, then we know that the focus region has property P. [But what stays active over time???] RELATED RESEARCH Object file theory: Kahneman s lexical priming experiments show that actual objects rather than their locations provide the locus for storing/accessing properties of the objects. Priming: Prior occurrence of a letter decreases RTs to the letter. The priming effect travels with the box in which the letter occurred. oexplanation: when an object first appears, an object file is created and properties are stored. 7

RELATED RESEARCH Diexis in eye-body coordination (Ballard):Viewercentered representation of the coordinates makes the task more computationally tractable. Gaze allows objects to be referenced without appeal to their properties (with fewer objects in the domain, the indexicals aren t ambiguous). Ballard monitored eye gaze in task of copying an arrangement of blocks. Subjects seem to use gaze as a pointing device to serialize the task. Index theory assumes that only indexed objects can be the targets of motor commands, including direction of gaze. Infants are able to distinguish between one and two objects earlier than they are able to use the objects properties for recognizing an object seen before. Conclusions The visual system needs some kind of direct reference mechanism to represent objects. Object representations are non-conceptual -- they do not encode properties. It was wrong to think that humans are first equipped to detect properties in the world; we seem to detect (proto-)objects first. 8

Final Thought Is a real-world connection really required? FINSTs aren t literally fingers. How might FINSTs be implemented? Perhaps it helps to consider how they might be implemented in a brain in a vat. 9