A Model for Automatic Diagnostic of Road Signs Saliency

Similar documents
Computational Cognitive Science

Validating the Visual Saliency Model

Computational Cognitive Science. The Visual Processing Pipeline. The Visual Processing Pipeline. Lecture 15: Visual Attention.

MEMORABILITY OF NATURAL SCENES: THE ROLE OF ATTENTION

Computational Cognitive Science

The Attraction of Visual Attention to Texts in Real-World Scenes

The 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian

Characterizing Visual Attention during Driving and Non-driving Hazard Perception Tasks in a Simulated Environment

Computational modeling of visual attention and saliency in the Smart Playroom

Compound Effects of Top-down and Bottom-up Influences on Visual Attention During Action Recognition

Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition

Experiences on Attention Direction through Manipulation of Salient Features

AC : USABILITY EVALUATION OF A PROBLEM SOLVING ENVIRONMENT FOR AUTOMATED SYSTEM INTEGRATION EDUCA- TION USING EYE-TRACKING

Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling

Can Saliency Map Models Predict Human Egocentric Visual Attention?

FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES

(In)Attention and Visual Awareness IAT814

Relevance of Computational model for Detection of Optic Disc in Retinal images

Detection of terrorist threats in air passenger luggage: expertise development

An Attentional Framework for 3D Object Discovery

Mammogram Analysis: Tumor Classification

Mammogram Analysis: Tumor Classification

Identification of human factors criteria to assess the effects of level crossing safety measures

On the implementation of Visual Attention Architectures

Real-time computational attention model for dynamic scenes analysis

NIH Public Access Author Manuscript J Vis. Author manuscript; available in PMC 2010 August 4.

Advertisement Evaluation Based On Visual Attention Mechanism

Development of goal-directed gaze shift based on predictive learning

{djamasbi, ahphillips,

Methods for comparing scanpaths and saliency maps: strengths and weaknesses

IAT 814 Knowledge Visualization. Visual Attention. Lyn Bartram

A micropower support vector machine based seizure detection architecture for embedded medical devices

Measuring Focused Attention Using Fixation Inner-Density

Utilizing Posterior Probability for Race-composite Age Estimation

Pupillary Response Based Cognitive Workload Measurement under Luminance Changes

Human Learning of Contextual Priors for Object Search: Where does the time go?

Skin color detection for face localization in humanmachine

Traffic Sign Detection and Identification

Evaluation of the Impetuses of Scan Path in Real Scene Searching

Human Learning of Contextual Priors for Object Search: Where does the time go?

Keywords- Saliency Visual Computational Model, Saliency Detection, Computer Vision, Saliency Map. Attention Bottle Neck

VIDEO SALIENCY INCORPORATING SPATIOTEMPORAL CUES AND UNCERTAINTY WEIGHTING

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

Pupil Dilation as an Indicator of Cognitive Workload in Human-Computer Interaction

An Evaluation of Motion in Artificial Selective Attention

Estimation of Driver Inattention to Forward Objects Using Facial Direction with Application to Forward Collision Avoidance Systems

ROAD SIGN CONSPICUITY AND MEMORABILITY: WHAT WE SEE AND REMEMBER

Knowledge-driven Gaze Control in the NIM Model

Morton-Style Factorial Coding of Color in Primary Visual Cortex

Local Image Structures and Optic Flow Estimation

The Danger of Incorrect Expectations In Driving: The Failure to Respond

Neuromorphic convolutional recurrent neural network for road safety or safety near the road

Koji Sakai. Kyoto Koka Women s University, Ukyo-ku Kyoto, Japan

NMF-Density: NMF-Based Breast Density Classifier

ADAPTING COPYCAT TO CONTEXT-DEPENDENT VISUAL OBJECT RECOGNITION

Protecting Workers with Smart E-Vests

Multimodal Driver Displays: Potential and Limitations. Ioannis Politis

THE EFFECT OF EXPECTATIONS ON VISUAL INSPECTION PERFORMANCE

Show Me the Features: Regular Viewing Patterns. During Encoding and Recognition of Faces, Objects, and Places. Makiko Fujimoto

Differential Viewing Strategies towards Attractive and Unattractive Human Faces

Early Detection of Lung Cancer

Visual Attention and Change Detection

Meaning-based guidance of attention in scenes as revealed by meaning maps

Study on Aging Effect on Facial Expression Recognition

INCORPORATING VISUAL ATTENTION MODELS INTO IMAGE QUALITY METRICS

Finding Saliency in Noisy Images

Spiking Inputs to a Winner-take-all Network

Automated Assessment of Diabetic Retinal Image Quality Based on Blood Vessel Detection

Where we look when we drive: A multidisciplinary approach

A Model of Saliency-Based Visual Attention for Rapid Scene Analysis

Cancer Cells Detection using OTSU Threshold Algorithm

Classifying Cognitive Load and Driving Situation with Machine Learning

196 IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, VOL. 2, NO. 3, SEPTEMBER 2010

HUMAN VISUAL PERCEPTION CONCEPTS AS MECHANISMS FOR SALIENCY DETECTION

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation

The synergy of top-down and bottom-up attention in complex task: going beyond saliency models.

Eye Movement Patterns and Driving Performance

IJESRT. Scientific Journal Impact Factor: (ISRA), Impact Factor: 1.852

CS 771 Artificial Intelligence. Intelligent Agents

Analysis of Glance Movements in Critical Intersection Scenarios

A Vision-based Affective Computing System. Jieyu Zhao Ningbo University, China

A FIELD STUDY ASSESSING DRIVING PERFORMANCE, VISUAL ATTENTION, HEART RATE AND SUBJECTIVE RATINGS IN RESPONSE TO TWO TYPES OF COGNITIVE WORKLOAD

UC Merced Proceedings of the Annual Meeting of the Cognitive Science Society

Online Vigilance Analysis Combining Video and Electrooculography Features

Quantifying the relationship between visual salience and visual importance

A Smart Texting System For Android Mobile Users

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

Attention Estimation by Simultaneous Observation of Viewer and View

The Effect of Training Context on Fixations Made During Visual Discriminations

Statement of research interest

ANALYSIS OF FACIAL FEATURES OF DRIVERS UNDER COGNITIVE AND VISUAL DISTRACTIONS

(Visual) Attention. October 3, PSY Visual Attention 1

OPTIC FLOW IN DRIVING SIMULATORS

IDENTIFICATION OF MYOCARDIAL INFARCTION TISSUE BASED ON TEXTURE ANALYSIS FROM ECHOCARDIOGRAPHY IMAGES

REACTION TIME MEASUREMENT APPLIED TO MULTIMODAL HUMAN CONTROL MODELING

Understanding eye movements in face recognition with hidden Markov model

Gender Based Emotion Recognition using Speech Signals: A Review

Deriving an appropriate baseline for describing fixation behaviour. Alasdair D. F. Clarke. 1. Institute of Language, Cognition and Computation

Computational Models of Visual Attention: Bottom-Up and Top-Down. By: Soheil Borhani

The influence of clutter on real-world scene search: Evidence from search efficiency and eye movements

Transcription:

A Model for Automatic Diagnostic of Road Signs Saliency Ludovic Simon (1), Jean-Philippe Tarel (2), Roland Brémond (2) (1) Researcher-Engineer DREIF-CETE Ile-de-France, Dept. Mobility 12 rue Teisserenc de Bort, 78190, Trappes, France lsimon@developpement-durable.gouv.fr (2) Researcher University Paris Est-INRETS-LCPC (LEPSiS) 58 Bd Lefebvre, 75015, Paris, France tarel@lcpc.fr bremond@lcpc.fr Abstract Road signs, the main communication media towards the drivers, play a significant role in road safety and traffic control through drivers guidance, warning, and information. However, not all traffic signs are seen by all drivers, which sometimes lead to dangerous situations. In order to manage safer roads, the estimation of the legibility of the road environment is thus of importance for road engineers and authorities who aim at making and keeping traffic signs salient enough to attract attention regardless of the driver's workload. Our long term obective is to build a system for the automatic estimation of road sign saliency along a road network, from images taken with a digital camera on-board a vehicle. This system will be interesting for accident analysis and prevention since it will enable a fine diagnostic of the road signs saliency, helping the road manager decide on which signs he must act and how (replacement or background modification). This should lead to improved asset management, road infrastructure maintenance and road safety. What attracts driver's attention is related both to psychological factors (motivations, driving task, etc.) and to the photometrical and geometrical characteristics of the road scene (colours, background, etc.). The saliency (or conspicuity) of an obect is the degree to which this obect attracts visual attention for a given background. Road signs perception depends on the two main components of visual attention: obects pop-out and visual search. The first one is less relevant when the task is to search for a particular obect, whereas one important part of the driving task is to look for road signs. As most of current computational models of visual search saliency are limited to laboratorysituations, we propose a new model to compute visual search saliency in natural scenes. Relying on statistical learning algorithms, the proposed algorithm emulates the priors a driver learns on obect appearance for any given class of road signs. The algorithm performs both the detection of the obect of interest in the image and the estimation of its saliency. The proposed computational model of saliency was evaluated through psycho-visual experiments. This opens the possibility to design automatic diagnostic systems for road signs saliency. 1. Introduction The points on which the driver focuses his gaze depend on the traffic situation, on the ongoing driver task and on the saliency of the obects relative to their background. Visual attention (Knudsen, 2007) can be thought of as a two components process: obect pop-out and visual search. These two components are mixed during vehicle driving. The obect pop-out is only due to the high saliency of the obect which attracts attention, independently of the ongoing task. It is a bottom-up process which applies when, for instance, an observer is looking at a meaningless

picture. The visual search which is a top-down process. It consists in a goal-driven process linked to voluntary attention, which depends on the driver s experience, the current task and motivation. Searching for a specific detail in a picture is an example of pure visual search. Figure 1: On the left, the focus of attention predicted by a bottom-up model (Itti, 1998). On the middle, focus points predicted as salient by our model for the search of no entry signs. On the right, scan-path of a subect searching for no entry signs. The focus points are labelled in decreasing order with a number inside each circle. Although several computational models of the saliency for obect pop-out have been proposed in the last decade, due to the complexity of human behaviour, it is only very recently that a few computational models of the saliency for visual search have been proposed. The most popular computational saliency model was proposed in (Itti, 1998). This algorithm computes a bottom up saliency map based on a modelling of the low levels of the Human Visual System (HVS). This model was tested against oculometric data and it succeeds when the observer task is to memorize images, but as expected, it fails when the task is to search for an obect, see (Underwood and others, 2006). Fig. 1 illustrates this limit on a road scene: on the left the focus points predicted by (Itti, 1998) are incorrectly spread on the whole image, whereas the observed scan-path is mostly along the line of horizon as displayed on right image. To design an automatic system for the estimation of road sign saliency along a road network from images, a computational model of search saliency is thus requested. Until recently, there was no complete computational model, only theoretical models of the search saliency or computational models working only on simple laboratory situations and thus not on road environment. As a consequence, in (Simon and others, 2007), we proposed a new computational model of search saliency, based on learning the appearance of the sign(s) of interest in the images. As explained first in (Simon and others, 2007), our paradigm consists in linking the confidence in the detection with the search saliency, see section 2. Then, the approach was tested by psycho-visual experiments on no-entry sign, as described in section 3. 2. Saliency estimation based on sign appearance learning The visual search task for a road sign is a detection problem in terms of pattern recognition. The search saliency of a given sign can be thus interpreted as a measure of the difficulty to detect it in an image. 2.1 Saliency estimation paradigm To perform road sign detection in images, we used recently developed statistical learning algorithms. The learning is performed from a set of positive and negative examples of the signs to be detected, each example being represented as a feature vector. Positive feature vectors are

samples of the appearance of the sign, while negative feature vectors are samples of the appearance of a possible background, see Fig. 2. This set of positive and negative vectors is usually called the learning database. From this database, the Support Vector Machine (SVM) algorithm (Schölkopf and Smola, 2002) is able to infer the frontier which splits the feature space into positive and negative parts. After this learning stage, the resulting classifier is able to set a positive or a negative label to any new feature vector, i.e to any window in a new image. This classifier can thus be used to perform sign detection in an image at several scales using sliding windows. Figure 2: Positive (top) and negative (bottom) samples of the appearance of a no-entry sign. The main advantage of the SVM is that the resulting classification function gives continuous values and not only binary values as with most classification algorithms. This classification value is related to the confidence in the pattern recognition: when it is higher than one, the recognition is positive, when it is lower than zero, the recognition is negative. The closer to zero the classification value is, the more hazardous the recognition decision. Our paradigm is thus to rely on a learning algorithm for modelling the appearance of the obect of interest, and to define the search saliency of a given window image as a function of the classification value. The input of the SVM algorithm, the learning database, is a set of feature vectors with positive or negative labels. In (Simon and others, 2008) are described the experiments which allowed us to select which features are adequate to represent the appearance of an image window. We found that a simple colour histogram is enough with the obective of no-entry sign detection for saliency estimation. The choice of the kernel is also of importance to achieve correct detection performance. As detailed in (Simon and others, 2008), the best choice appears to be the power kernel α k( x, x' ) = x x' which allows for implicit adaptation to the data density, unlike most classical kernels (α=1 in the following). 2.2 Saliency map Figure 3: Scheme for the computation of the search saliency.

The SVM classifier is applied at different scales on sliding windows in a road image. Each scale results in a confidence map at a given scale of the positive classification values. The global confidence map for an image is built as the maximum of the confidence maps over the various scales. Finally, to take into account the saliency of the background around each detected road signs, the global confidence map is corrected by subtracting its local mean over the so-called background-window. The size of the background-window is of constant angular value set to 2 degrees, from our experiments in (Simon and others, 2009). The resulting map is defined as the Background related Computed Saliency (BCS) map. Notice that the BCS is independent of the size of the detected obect, whereas it is known that the size plays an important role in the saliency. Thus, the positive connex components are extracted from the BCS map and for each component i the Search Computed Saliency (SCS) is computed as: SCS ( i) = i 4 BCS( i) A( ) where BCS(i) is the average BCS over the connex component i and A(i) is its area. The search saliency of each road sign can be summarized by Fig. 3. The field of applications of this computational model is not limited to the estimation of road signs saliency by inspection vehicle; it should also serve Advanced Driver Assistance Systems (ADAS), as explained in (Simon and others, 2009). 3. Experiments 3.1 Apparatus Figure 4: Psycho-vision laboratory, including an eye-tracker system used for psycho-visual experiments. In order to test the proposed model, we conducted experiments in a display room which is photometrically controlled, as shown in Fig. 4. The room consists in subect's and advisor's screens. The display device is equipped with a remote eye-tracker (SMI) which tracks the subect s gaze to record fixations positions and duration with an accuracy of 0.5 degree at a frequency of 50Hz. The screen is 19'' wide with a viewing distance of 70 cm. Thus, the subects saw the road scene with a visual angle of 20 degrees. For each subect, 40 road images were displayed, containing a total of 76 no-entry signs. These images were selected with various sign appearances and background, leading to a large variety of saliency levels for the observed signs.

3.2 Detection score and subective saliency Thirty subects were asked to pretend they were drivers of the car from which the images were taken. The experiment consisted in two phases. In the first phase, the subects were asked to count the no-entry signs, knowing that images would disappear after 5s. In the second phase, the subects were asked to rate the saliency of each no-entry sign by giving a score between 0 and 10. The analysis of the gaze fixations associated with the subects answers (reported number of noentry signs) allowed us to know whether the subects detected each displayed no-entry sign. For a given sign, the mean percentage of detection over all subects gives the Human Detection Rate (HDR). In the second phase, due to the subects variability in the use of the score scale, all scores were standardized enforcing for each subect the same Gaussian law with mean value 5. Subect Standardized Score saliency (SSS) (subect index i), related to subective saliency (of sign ), is defined by: SSS ( i, ) = score i, E ( score i Ei ( scorei, ) + 5 E ( score )) i, i i, 3.3 Results We first checked that the average detection rate HDR is related to the Subective Standardized Score (SSS). As illustrated in the left of Fig. 5, the greater the SSS, the greater the HDR. This correlation between the subective saliency score and the correct detection rate supports the proposed paradigm: the search saliency is an evaluation of the difficulty to detect an obect. Then, we studied the correlation between the Subective Standardized Score (SSS) and the Search Computed Saliency (SCS). As shown in the right of Fig. 5, a linear relation can be observed between the SSS and the SCS. As a consequence, the proposed computational model of search saliency is correlated with the subective saliency. The link between the HDR and SSS implies that the SCS is also correlated to the Human Detection Rate (HDR) which is a saliency indicator. Figure 5: On the left, the correlation between the Subect Standardized Score (SSS) and the Human Detection Rate (HDR). On the right, the linear correlation between the SSS and the Search Computed Saliency (SCS).

The statistical analysis detailed in (Simon and others, 2009) showed that the proposed computational SCS model explains 56% of the variance between signs, and 39% overall. The same test using the 4 th squared root of the road sign size instead of the proposed model SCS explains 46% of the variance between signs and 32% overall. Thus, the proposed computational saliency model improves the size-based mode by an increase of the explanation of the variance in an amount of 18%. 4. Conclusion Most available computational models of the visual saliency are limited to obects pop-out, whereas the need for an automatic diagnostic system for road signs saliency implies to have a computational model of saliency during visual search task. We thus proposed a paradigm to define search saliency and a computational model to estimate the search saliency in an image within a search task using SVM. From our psycho-visual experiments, this computational model is correlated to subect's score of road sign saliency. In future work, we will test our computational model on other road signs, our goal being to develop a reliable automatic diagnostic system of road signs saliency. 5. References 1. Knudsen E.I. (2007), Fundamental Components of Attention, Annual Review Neuroscience, vol. 30, pp. 57-78. 2. Itti L., Koch C. & Niebur E. (1998), A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 1254-1259. 3. Underwood G., Foulsham T., van Loon E., Humphreys L. And Bloyce. (2006), Eye movements during scene inspection: A test of the saliency map hypothesis, European Journal of cognitive Psychology, vol. 18, pp. 321-342. 4. Simon L., Tarel J.-P. & Bremond R. (2007), A new paradigm for the computation of conspicuity of traffic signss in real images, Proceedings of the 26 th session of Commission Internationale de L'éclairage (CIE'07), vol. 2, China, pp. 161-164. 5. Schölkopf B. & Smola A. (2002), learning with kernels, MIT Press, USA. 6. Simon L., Tarel J.-P. & Bremond R. (2008), Towards the Estimation of Conspicuity with Visual Priors, Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP'08), Portugal, pp. 323-328. 7. Simon L., Tarel J.-P. & Bremond R. (2009), Alerting the Drivers about Road Signs with Poor Visual Saliency, Proceedings of the IEEE Intelligent Vehicle Symposium (IV'2009), China, pp. 48-53.