A Computational Model of Saliency Depletion/Recovery Phenomena for the Salient Region Extraction of Videos

A Computational Model of Saliency Depletion/Recovery Phenomena for the Salient Region Extraction of Videos
July 03, 2007
Media Information Laboratory, NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
Clement Leung, Akisato Kimura, Tatsuto Takeuchi, Kunio Kashino

Overview
- Objective and related works
- Basic algorithm structure
- New computational model:
  - Instantaneous saliency depletion with gradual recovery
  - Gradual saliency depletion with instantaneous recovery
- Algorithm evaluation against previous algorithms using eye-tracking tests
- Summary

Objective
We use the strategy of focusing on the more relevant regions of a video and suppressing the irrelevant ones:
- Reduces the amount of data to be processed
- Only unique information is retained
A promising approach is to build a computational model based on the human visual system, which has a powerful ability to extract important information from a given scene.

Related Works
Itti, Koch & Niebur (1998), still-image algorithm:
- Proposed a model for computing saliency from still images
- Features: intensity, color, and orientation
- Restricted to still images
Itti, Dhavale & Pighin (2003), moving-image algorithm:
- Added flicker and motion features to the previous model to extract saliency from videos
- Did not take into account the temporal dynamics of the human visual system

New Computational Model
We have extended the previous algorithms to include two important temporal characteristics:
1. Instantaneous saliency depletion with gradual recovery
   Based on Inhibition of Return (Posner, 1984): human attention tends to be slow to notice salient events in regions it has previously focused on.
2. Gradual saliency depletion with instantaneous recovery
   Based on neural adaptation (Hartline, 1940): saliency gradually decreases over time when no surprising events occur in a video.

Basic Algorithm Structure
Input frame n (from a webcam or a sample video) is decomposed into five features: intensity, color, orientation, flicker, and motion.
- Across-scale addition of the feature maps and normalization with the norm operator N(.) yields one conspicuity map per feature.
- The conspicuity maps are combined into a saliency map.
- An ISD (instantaneous saliency depletion) mask and a GSD (gradual saliency depletion) mask are generated and convolved with the saliency map to produce the new saliency map.
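The feature-combination stage can be sketched as follows. This is an illustrative simplification, not the paper's implementation: the grid-based approximation of the norm operator N(.) and the equal-weight averaging of conspicuity maps are assumptions.

```python
import numpy as np

def normalize_map(m, step=16):
    """Simplified Itti-style norm operator N(.): rescale to [0, 1], then
    promote maps with one strong peak over maps with many similar peaks."""
    m = (m - m.min()) / (m.max() - m.min() + 1e-9)
    # approximate the "other local maxima" by the maxima of coarse grid blocks
    local_max = [m[y:y + step, x:x + step].max()
                 for y in range(0, m.shape[0], step)
                 for x in range(0, m.shape[1], step)]
    m_bar = (sum(local_max) - m.max()) / max(len(local_max) - 1, 1)
    return m * (m.max() - m_bar) ** 2

def saliency_map(conspicuity):
    """Combine the five normalized conspicuity maps into one saliency map."""
    maps = [normalize_map(c) for c in conspicuity.values()]
    return sum(maps) / len(maps)

# toy conspicuity maps for one frame
features = {k: np.random.rand(64, 80)
            for k in ("intensity", "color", "orientation", "flicker", "motion")}
S = saliency_map(features)
```

A map with a single dominant peak keeps a large weight after N(.), while a map with many comparable peaks is suppressed, which is the behavior the conspicuity-map combination relies on.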

Instantaneous Saliency Depletion: Graphical Interpretation
(1) Attention focuses on the most salient region (MSR).
(2) The saliency of the MSR drops instantaneously (within 30-70 ms) due to Inhibition of Return (IOR).
(3) The saliency of the MSR recovers gradually (over 500-900 ms).
(Saliency drop and recovery times are based on the IOR literature.)
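The curve above can be written as a simple time-weight function. The linear recovery shape and the midpoint timings (50 ms drop, 700 ms recovery, chosen from the 30-70 ms and 500-900 ms ranges) are assumptions for illustration.

```python
def isd_weight(t_ms, drop_ms=50.0, recover_ms=700.0):
    """Saliency weight of the attended region vs. time since attention
    landed on it: full weight while attended, an instantaneous drop to
    zero at drop_ms (IOR), then gradual linear recovery back to one."""
    if t_ms < drop_ms:
        return 1.0                         # (1) still focused on the MSR
    elapsed = t_ms - drop_ms
    return min(elapsed / recover_ms, 1.0)  # (2) drop, (3) gradual recovery
```

For example, `isd_weight(400)` returns 0.5: halfway through the recovery window.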

Instantaneous Saliency Depletion: Implementation Strategy
An instantaneous saliency depletion (ISD) mask is created for each frame and multiplied with the corresponding saliency map:
MSR depletion masks + recovery mask -> ISD masks; ISD masks * saliency maps -> new saliency maps.

Instantaneous Saliency Depletion: MSR Depletion Masks
Saliency depletion regions move together with the object(s) in motion; the new coordinates of the depletion regions are found using the X and Y optical-flow information.
- Frame 1: the MSR of frame 0 is blacked out (1st spot: weight = 0).
- Frame 2: the MSR of frame 1 is blacked out (2nd spot: weight = 0), while the 1st spot starts to recover (weight = 0.1).
- Frame 3: the MSR of frame 2 is blacked out (3rd spot: weight = 0), while the previous two spots recover (1st spot: weight = 0.2, 2nd spot: weight = 0.1).
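One frame of this mask update might look like the sketch below. The circular spot shape, the fixed radius, the single global flow vector, and the linear 0.1-per-frame recovery are assumptions consistent with the weights on the slide, not the paper's exact implementation.

```python
import numpy as np

def update_isd_mask(spots, flow, msr_center, recovery_step=0.1,
                    radius=8, shape=(60, 80)):
    """One frame of ISD mask generation.

    spots:      list of (y, x, weight) for previously attended regions.
    flow:       (dy, dx) optical-flow displacement applied to every spot.
    msr_center: (y, x) of this frame's most salient region.
    Returns the multiplicative mask and the updated spot list.
    """
    dy, dx = flow
    # shift old spots with the motion and let their weights recover toward 1
    spots = [(y + dy, x + dx, min(w + recovery_step, 1.0))
             for y, x, w in spots]
    spots.append((*msr_center, 0.0))        # black out the new MSR
    mask = np.ones(shape)
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    for y, x, w in spots:
        region = (yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2
        mask[region] = np.minimum(mask[region], w)
    return mask, spots
```

Multiplying this mask into the saliency map zeroes the current MSR and partially suppresses recently attended spots, matching the frame-by-frame example above.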

Instantaneous Saliency Depletion: Example (original video)

Instantaneous Saliency Depletion: Example (saliency with instantaneous saliency depletion, and the instantaneous saliency depletion/recovery mask)

Gradual Saliency Depletion: Graphical Interpretation
Gradual saliency depletion with instantaneous recovery applies to cases such as still, unchanging videos, constant velocity, or a constant flickering pattern. Saliency depletes gradually over roughly 20-30 s due to neural adaptation.
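As a sketch, the gradual depletion curve could be modeled as below. The linear decay and the 25 s time constant (an assumed midpoint of the 20-30 s range on the slide) are illustrative assumptions.

```python
def gsd_weight(t_unchanged_sec, surprise=False, deplete_sec=25.0):
    """Saliency weight under neural adaptation: decays linearly toward
    zero while nothing changes, and snaps back to full weight the moment
    a surprising event occurs (instantaneous recovery)."""
    if surprise:
        return 1.0
    return max(1.0 - t_unchanged_sec / deplete_sec, 0.0)
```

So a region that has been static for 12.5 s carries half its original saliency, and a surprising event restores it in a single frame.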

Gradual Saliency Depletion: Implementation Strategy
Whole-region depletion masks + surprise masks -> GSD masks; GSD masks * saliency maps -> new saliency maps.

Gradual Saliency Depletion: Surprise Masks
Step 1: Take the conspicuity maps of frames n-8 .. n for a specific feature (e.g. intensity).
Step 2: Extract temporal feature maps (Itemp, Ctemp, Otemp, Ftemp, Mtemp) through difference-of-Gaussian (DoG) filtering along the temporal domain, followed by across-scale addition.
Step 3: Create a surprise mask from the summed temporal feature maps.
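The temporal DoG step can be sketched as follows for one feature. The kernel widths and the choice of centering the kernels on the newest frame are assumptions; the paper's actual filter parameters are not given on the slide.

```python
import numpy as np

def temporal_dog(conspicuity_history, sigma1=1.0, sigma2=3.0):
    """Difference-of-Gaussians along the time axis over the last 9
    conspicuity maps of one feature.

    conspicuity_history: array of shape (9, H, W), frames n-8 .. n.
    Returns a per-pixel temporal feature map that is large where the
    feature has changed recently (a 'surprise' response).
    """
    t = np.arange(conspicuity_history.shape[0])
    center = t[-1]                       # kernels centered on frame n
    g1 = np.exp(-(t - center) ** 2 / (2 * sigma1 ** 2))
    g2 = np.exp(-(t - center) ** 2 / (2 * sigma2 ** 2))
    dog = g1 / g1.sum() - g2 / g2.sum()  # narrow minus wide Gaussian
    # weighted sum along time: unchanging input cancels out exactly
    return np.tensordot(dog, conspicuity_history, axes=(0, 0))
```

Because the two normalized kernels sum to the same value, a constant history produces a zero map (no surprise), while a sudden change in the newest frames produces a strong response.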

Gradual Saliency Depletion: Example (original video)

Gradual Saliency Depletion: Example (saliency with gradual saliency depletion, and the gradual saliency depletion/recovery mask)

Algorithm Evaluation
Procedure: 5 subjects each view 6 different sample videos while we track their eye movements. The eye-tracking results are compared against the saliency videos produced by three algorithms:
a) Itti's still-image algorithm
b) Itti's moving-image algorithm
c) Our algorithm:
   Case 1: instantaneous depletion / gradual recovery only
   Case 2: gradual depletion / instantaneous recovery only
   Case 3: both depletion properties included

Algorithm Evaluation
Measure: for each saliency video frame, locate the region the eye is focusing on and calculate its normalized average pixel value, the NETR (normalized eye-tracking region) value:
NETR = (average pixel value in the eye-focusing region) / (total pixel sum of the frame)
The eye-focusing region is centered on the eye-tracking point with radius = frame height / 9. A higher average NETR value indicates better performance.
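The per-frame NETR value can be computed as below. The division and the height/9 radius follow the slide's definition; the circular region shape, function name, and array layout are assumptions.

```python
import numpy as np

def netr(saliency_frame, gaze_xy, radius=None):
    """NETR value for one frame: average pixel value inside the
    eye-focusing region around the gaze point, divided by the total
    pixel sum of the frame. Default radius = frame height / 9."""
    h, w = saliency_frame.shape
    if radius is None:
        radius = h / 9
    gx, gy = gaze_xy
    yy, xx = np.mgrid[:h, :w]
    region = (yy - gy) ** 2 + (xx - gx) ** 2 <= radius ** 2
    return saliency_frame[region].mean() / saliency_frame.sum()
```

A saliency map that concentrates its mass exactly where the subject looks yields a high NETR; a uniform map yields the baseline value 1 / (number of pixels).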

Results
[Bar chart: NETR values (0 to 5.0e-5) for videos 1-6 under five conditions: the still-image algorithm, the moving algorithm, and our algorithm in case 1 (instantaneous saliency depletion), case 2 (gradual saliency depletion), and case 3 (both depletion masks included).]
Overall, our algorithm performs approximately the same as, or substantially better than, the other two algorithms.

Summary
- The algorithm extracts early visual features to create saliency maps.
- Instantaneous saliency depletion: based on Inhibition of Return, attention instantaneously diverts away from an area after it has been attended, followed by gradual saliency recovery.
- Gradual saliency depletion: based on neural adaptation, saliency gradually decreases over time, with instantaneous saliency recovery for surprising areas.
- Test results show that our algorithm performs approximately the same as, or substantially better than, previous algorithms, depending on the video.
Some demonstration movies will be available at http://www.brl.ntt.co.jp/people/akisato


Future Plans
- Run more tests with more subjects, using artificial videos.
- Conduct further studies to find more precise durations for instantaneous/gradual saliency depletion and recovery.
- Incorporate the color-intensity relationship.
- Introduce high-level knowledge strategies: learning algorithms that account for people's preferences.

Analysis of Results
[Example frames from videos 3 and 5: the original video, the moving algorithm, our algorithm case 1, and our algorithm case 2.]

Analysis of Results
The high standard error in our algorithm's results for each video is due to:
1. High-level knowledge influencing where subjects look
2. The limited number of test subjects
3. In our algorithm, the NETR is very low when the eye is off a salient region and very high when it is on one.
4. In the previous algorithms, fewer regions are suppressed, so even when the eye is not on the most salient region, other, less salient regions still give higher values than in our algorithm.