An Attentional Framework for 3D Object Discovery

Similar documents
Computational Cognitive Science

Computational Cognitive Science. The Visual Processing Pipeline. The Visual Processing Pipeline. Lecture 15: Visual Attention.

Real-time computational attention model for dynamic scenes analysis

Keywords- Saliency Visual Computational Model, Saliency Detection, Computer Vision, Saliency Map. Attention Bottle Neck

VIDEO SALIENCY INCORPORATING SPATIOTEMPORAL CUES AND UNCERTAINTY WEIGHTING

Goal-directed search with a top-down modulated computational attention system

A Computational Model of Saliency Depletion/Recovery Phenomena for the Salient Region Extraction of Videos

Natural Scene Statistics and Perception. W.S. Geisler

Computational Cognitive Science

Validating the Visual Saliency Model

On the implementation of Visual Attention Architectures

Can Saliency Map Models Predict Human Egocentric Visual Attention?

Introduction to Computational Neuroscience

Compound Effects of Top-down and Bottom-up Influences on Visual Attention During Action Recognition

Visual Task Inference Using Hidden Markov Models

A Model of Saliency-Based Visual Attention for Rapid Scene Analysis

A Hierarchical Visual Saliency Model for Character Detection in Natural Scenes

Experiences on Attention Direction through Manipulation of Salient Features

Motion Saliency Outweighs Other Low-level Features While Watching Videos

Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling

HUMAN VISUAL PERCEPTION CONCEPTS AS MECHANISMS FOR SALIENCY DETECTION

Computational Models of Visual Attention: Bottom-Up and Top-Down. By: Soheil Borhani

Neurally Inspired Mechanisms for the Dynamic Visual Attention Map Generation Task

Methods for comparing scanpaths and saliency maps: strengths and weaknesses

NEOCORTICAL CIRCUITS. specifications

A context-dependent attention system for a social robot

The 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian

What we see is most likely to be what matters: Visual attention and applications

Visual Attention Framework: Application to Event Analysis

MEMORABILITY OF NATURAL SCENES: THE ROLE OF ATTENTION

Object-based Saliency as a Predictor of Attention in Visual Tasks

The Attraction of Visual Attention to Texts in Real-World Scenes

Attention and Scene Understanding

A Model for Automatic Diagnostic of Road Signs Saliency

Evaluation of the Impetuses of Scan Path in Real Scene Searching

An Evaluation of Motion in Artificial Selective Attention

Computational modeling of visual attention and saliency in the Smart Playroom

GESTALT SALIENCY: SALIENT REGION DETECTION BASED ON GESTALT PRINCIPLES

Reading Assignments: Lecture 18: Visual Pre-Processing. Chapters TMB Brain Theory and Artificial Intelligence

Change Blindness. The greater the lie, the greater the chance that it will be believed.

Supplementary Material for submission 2147: Traditional Saliency Reloaded: A Good Old Model in New Shape

Experimental evaluation of the accuracy of the second generation of Microsoft Kinect system, for using in stroke rehabilitation applications

Goal-Directed Deployment of Attention in a Computational Model: A Study in Multiple-Object Tracking

A Dynamical Systems Approach to Visual Attention Based on Saliency Maps

The synergy of top-down and bottom-up attention in complex task: going beyond saliency models.

Object-Level Saliency Detection Combining the Contrast and Spatial Compactness Hypothesis

Recurrent Refinement for Visual Saliency Estimation in Surveillance Scenarios

Advertisement Evaluation Based On Visual Attention Mechanism

ELL 788 Computational Perception & Cognition July November 2015

Novelty Is Not Always the Best Policy Inhibition of Return and Facilitation of Return as a Function of Visual Task

The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search

VENUS: A System for Novelty Detection in Video Streams with Learning

Selective Attention. Modes of Control. Domains of Selection

Inattentional Blindness in a Coupled Perceptual Cognitive System

IAT 814 Knowledge Visualization. Visual Attention. Lyn Bartram

I. INTRODUCTION VISUAL saliency, which is a term for the pop-out

Modeling the Deployment of Spatial Attention

A LEARNING-BASED VISUAL SALIENCY FUSION MODEL FOR HIGH DYNAMIC RANGE VIDEO (LBVS-HDR)

A Knowledge Driven Computational Visual Attention Model

An Experimental Analysis of Saliency Detection with respect to Three Saliency Levels

Increasing Spatial Competition Enhances Visual Prediction Learning

(In)Attention and Visual Awareness IAT814

The discriminant center-surround hypothesis for bottom-up saliency

Novelty is not always the best policy: Inhibition of return and facilitation of return. as a function of visual task. Michael D.

On the role of context in probabilistic models of visual saliency

Visual Attention. International Lecture Serie. Nicolas P. Rougier. INRIA National Institute for Research in Computer Science and Control

Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition

Salient Object Detection in Videos Based on SPATIO-Temporal Saliency Maps and Colour Features

Object detection in natural scenes by feedback

A Neural Network Architecture for.

Finding Saliency in Noisy Images

Visual Perception. Agenda. Visual perception. CS Information Visualization January 20, 2011 John Stasko. Pre-attentive processing Color Etc.

Dynamic Eye Movement Datasets and Learnt Saliency Models for Visual Action Recognition

Adding Shape to Saliency: A Computational Model of Shape Contrast

Attending to Learn and Learning to Attend for a Social Robot

Control of Selective Visual Attention: Modeling the "Where" Pathway

Does scene context always facilitate retrieval of visual object representations?

Attention Estimation by Simultaneous Observation of Viewer and View

Understanding eye movements in face recognition with hidden Markov model

Evaluating Visual Saliency Algorithms: Past, Present and Future

A Computer Vision Model for Visual-Object-Based Attention and Eye Movements

Tumor cut segmentation for Blemish Cells Detection in Human Brain Based on Cellular Automata

VISUAL search is necessary for rapid scene analysis

Cancer Cells Detection using OTSU Threshold Algorithm

Designing Caption Production Rules Based on Face, Text and Motion Detections

intensities saliency map

Learning to Predict Saliency on Face Images

Local Image Structures and Optic Flow Estimation

Computational Architectures in Biological Vision, USC, Spring 2001

Top-down Attention Signals in Saliency 邓凝旖

State-of-the-Art in Visual Attention Modeling

Relative Influence of Bottom-up & Top-down Attention

Empirical Validation of the Saliency-based Model of Visual Attention

FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

5/20/2014. Leaving Andy Clark's safe shores: Scaling Predictive Processing to higher cognition. ANC sysmposium 19 May 2014

Attention. What is attention? Attention metaphors. Definitions of attention. Chapter 6. Attention as a mental process

Visual Thinking for Design Colin Ware

Image-Based Estimation of Real Food Size for Accurate Food Calorie Estimation

Attentional Selection for Action in Mobile Robots

Transcription:

An Attentional Framework for 3D Object Discovery Germán Martín García and Simone Frintrop Cognitive Vision Group Institute of Computer Science III University of Bonn, Germany

Saliency Computation Saliency Computation Difference Center Input Image Surround Saliency map Simone Frintrop 2

Attention vs Saliency Why visual attention is more than saliency computation: Attention is a process that operates over time Look at one region after the other explore a scene by eye movements (saccades) examine the painting freely assess the ages of the characters The gaze is triggered by bottom-up (saliency) AND top-down cues Simone Frintrop 3 [Yarbus 1967; Images: Sasha Archibald, after Ilya Repin, An Unexpected Visitor, 1884)]

Visual attention Tasks to be solved by visual attention systems: Where to direct the attention at (saliency + top-down cues) When to switch How to prevent returning to already visited locations Get new frame Saliency computation Detect Proto-objects no Inhibit Proto-object yes Fixated long enough? Simone Frintrop 4

Visual attention Tasks to be solved by visual attention systems: Where to direct the attention suppression at (saliency of processing + top-down of cues) locations When to switch and objects that had recently been the How to prevent returning focus to already of attention visited locations encourages orienting toward novelty Get new frame Inhibition of return (IOR) [Posner 1984] Saliency computation Detect Proto-objects happens in spatial coordinates and not in retinotopic coordinates no Inhibit Proto-object yes Fixated long enough? Simone Frintrop 5

Inhibition of Return Many systems simply compute the saliency map and ignore IOR Inhibition of return in computational attention systems: Inhibition of return [Itti,Koch,Niebur PAMI 1998] Feature 1 Max Finder Saliency map... Feature n Trajectory of FOAs Input image If realized, usually by setting values to zero in saliency map. In temporal sequences, if camera positions and/or objects change location, this is not sufficient! It is necessary to remember which regions/objects have already been attended not in image but in spatial coordinates Simone Frintrop 6

3D Object Discovery We propose an architecture for object discovery that operates in 3D (on RGB-D data) Object information is accumulated over time and 3D object models are built based on information from current and previous time steps 3D voxels are tagged by inhibition of return flags that indicate if a region was recently attended Get new frame Saliency computation Detect Proto-objects no Inhibit Proto-object yes Fixated long enough? 3D Map update (object information + IOR flags) Simone Frintrop 7

3D Object Discovery Color image Germán Martín García Depth image 3D Object Map [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 8

3D Object Discovery Color image Germán Martín García Depth image What are these proto-objects? [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 9

What are proto-objects? The triadic architecture of Rensink (2000): Proto-objects: Proto-objects are object hypotheses built from simple features Volatile structures with limited coherence in space and time (Collections of) protoobjects are the target of attention ( the focused attention system acts as a hand to grab protoobjects ) Simone Frintrop 10 R. A. Rensink, The Dynamic Representation of Scenes, Visual Cognition (7), 2000

Saliency Detection For saliency computation, use information-theoretic saliency: Dominik Klein Represent center and surround regions by probability distributions Otherwise structure similar to Itti/Koch [Itti et al, PAMI 1998] (image pyramids, feature channels, conspicuity maps, saliency map) BITS [Klein/Frintrop ICCV 2011]: CoDi [Klein/Frintrop DAGM 2012]: Discrete distributions compare with Kullback-Leibler Divergence Continuous distributions (Gaussians) compare with Wasserstein distance W 2 Surround Center C++ code available at: http://www.iai.uni-bonn.de/~kleind/ Simone Frintrop 11

MSRA Salient Object Database BITS: [Klein/Frintrop, ICCV 2011] AC10: [Achanta/Süsstrunk, 2010] Simone Frintrop invt: [Itti et al., PAMI 1998] HZ08: [Hou/Zhang, 2008] 12 ST: [Walther/Koch, 2006] AIM: [Bruce/Tsotsos, 2009]

Saliency performance on MSRA Salient Object Database [Klein/Frintrop, DAGM, 2012] Simone Frintrop 13

From saliency to proto-objects We have now a saliency map: To obtain proto-objects, we follow the two-stage processing strategy of the HVS: Pre-attentive: parallel processing to find region of interest Attentive: serial processing, one region of interest at the time processing becomes selective. Simone Frintrop [Neisser, Cognitive Psychology, 1967] 14 [Treisman, Cognitive Psychology, 1985]

From saliency to proto-objects Generate proto objects: Pre-attentive (parallel) stage: 1. compute the saliency map 2. threshold the saliency map 3. find connected components (discard too big and too small objects) 4. rank by average saliency 5. Inhibit already attended regions Attentive stage: Frame t 1. Pick one salient blob & fit a rectangle around IOR information from frames 1 to t-1 2. Refine shape of blob with GrabCut segmentation [Rother 2004] (OpenCV implementation) saliency values serve to initialize pixel likelihoods [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 15

Proto-Objects Now, we have a segmented proto-object for the currently attended region Examples: [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 16

Salient Object Discovery in 3D Original image Saliency map Object hypotheses [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)]

Salient Object Discovery in 3D Color image Depth image That s was all in 2D. Now we come to 3D [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 18

Salient Object Discovery in 3D Color image From the depth data, incrementally built a 3D environment map with the Kinectfusion algorithm [Newcomb 2011]. It tracks the pose of the camera integrates the new measurement into the global model Then, project the proto-objects into the 3D scene Depth image 3D Object Map [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 19

Inhibition of Return in 3D Each voxel stores An IOR flag (active/not active) An IOR weight Increase IOR weight of attended voxels. If weight reaches threshold, set IOR flag active Decrease IOR weight of not attended voxels If weight reaches 0, set IOR flag inactive 20

Salient Object Discovery in 3D Color image Use knowledge from 3D map to improve object detection & segmentation raycast the object labels & inhibition flags to the current viewpoint Depth image constrain the segmentation step to not label pixels as foreground that belong to other objects Use inhibition flags to prevent selecting already attended blobs 3D Object Map [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 21

Results: Table top sequence Ground truth Detected Objects Simone Frintrop 22

Results: Table top sequence Ground truth Detected Objects Simone Frintrop 23

Salient Object Discovery in 3D [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)]

Results: Table top sequence Object Precision Recall Juice 97 31 Bowl 92 56 Cereals 98 61 Apple 98 45 Coffee 1 90 47 Coffee 2 96 49 Simone Frintrop 25

Results: Coffee machine sequence Ground truth Detected Objects Simone Frintrop

Results: Coffee machine sequence Ground truth Detected Objects Simone Frintrop 27

Results: Coffee machine sequence Object Precision Recall Note 69 43 Y. cup 90 40 W. cup 93 40 Box 99 30 Paper 99 40 Sugar 62 61 Cup 90 36 Water 99 37 Milk 98 39 28

3D Object Discovery Color image Take home message: Real-world attention requires inhibition of return in spatial coordinates Depth image Thank you for your attention! 3D Object Map [G. Martín-García & S. Frintrop, Proc. of the annual meeting of Cognitive Sciences (CogSci), 2013 (to appear)] [G. Martín-García & S. Frintrop, German Journal of Artificial Intelligence, 2013 (to appear)] 29