Scanpath modeling and classification with Hidden Markov Models


Title: Scanpath modeling and classification with Hidden Markov Models
Author(s): Coutrot, A; Hsiao, JHW; Chan, AB
Citation: Behavior Research Methods, 2017
Issued Date: 2017
Rights: The final publication is available at Springer via its DOI. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

DOI 10.3758/s

Scanpath modeling and classification with hidden Markov models

Antoine Coutrot 1 (acoutrot@gmail.com), Janet H. Hsiao 2, Antoni B. Chan 3
1 CoMPLEX, University College London, London, UK
2 Department of Psychology, The University of Hong Kong, Pok Fu Lam, Hong Kong
3 Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong

© The Author(s) 2017. This article is published with open access at Springerlink.com

Abstract
How people look at visual information reveals fundamental information about them: their interests and their states of mind. Previous studies showed that the scanpath, i.e., the sequence of eye movements made by an observer exploring a visual stimulus, can be used to infer observer-related (e.g., task at hand) and stimuli-related (e.g., image semantic category) information. However, eye movements are complex signals, and many of these studies rely on limited gaze descriptors and bespoke datasets. Here, we provide a turnkey method for scanpath modeling and classification. This method relies on variational hidden Markov models (HMMs) and discriminant analysis (DA). HMMs encapsulate the dynamic and individualistic dimensions of gaze behavior, allowing DA to capture systematic patterns diagnostic of a given class of observers and/or stimuli. We test our approach on two very different datasets. First, we use fixations recorded while viewing 800 static natural scene images, and infer an observer-related characteristic: the task at hand. We achieve an average of 55.9% correct classification rate (chance = 33%). We show that correct classification rates positively correlate with the number of salient regions present in the stimuli. Second, we use eye positions recorded while viewing 15 conversational videos, and infer a stimulus-related characteristic: the presence or absence of the original soundtrack. We achieve an average 81.2% correct classification rate (chance = 50%). HMMs make it possible to integrate bottom-up, top-down, and oculomotor influences into a single model of gaze behavior. This synergistic approach between behavior and machine learning will open new avenues for simple quantification of gaze behavior. We release SMAC with HMM, a Matlab toolbox freely available to the community under an open-source license agreement.

Keywords: Scanpath; Eye movements; Hidden Markov models; Classification; Machine learning; Toolbox

Introduction
We use vision to guide our interactions with the world, but we cannot process all the visual information that our surroundings provide. Instead, we sequentially allocate our attention to the most relevant parts of the environment by moving our eyes to bring objects onto our high-resolution fovea for fine-grained analysis. In natural vision, this endless endeavor is accomplished through a sequence of eye movements such as saccades and smooth pursuit, followed by fixations. These patterns of eye movements, also called scanpaths, are guided by the interaction of three main factors (Kollmorgen et al., 2010). First, top-down mechanisms are linked to the observers, who adapt their eye movements to their personal characteristics. They can be conscious, like performing the task at hand, or unconscious, like the observers' culture, age, gender, personality, or state of health. Second, bottom-up mechanisms are linked to the visual stimulus. They can be low-level, such as local image features (motion, color, luminance, spatial frequency), or
high-level, such as the social context or the presence of faces and other semantic content. The third factor is related to the characteristics inherent to the oculomotor system, such as the spatial bias toward the center region and the geometric properties of saccades.

Gaze patterns contain a wealth of information
As a byproduct of these three multivariate mechanisms, eye movements are an exceptionally rich source of information about the observers and what they look at; they provide a high-resolution spatiotemporal measure of the cognitive and visual processes that are used to guide behavior. Since the seminal work of Buswell and Yarbus (Buswell, 1935; Yarbus, 1965), several recent studies have proposed computational and statistical methods to infer observers' characteristics from their eye movements. Since 2012, numerous studies have tried to classify observers' gaze patterns according to the task at hand during reading (Henderson et al., 2013), counting (Haji-Abolhassani & Clark, 2013), searching (Zelinsky et al., 2013), driving (Lemonnier et al., 2014), mind wandering (Mills et al., 2015), and memorizing and exploring static artificial or natural scenes (Kanan et al., 2014; Borji & Itti, 2014; Haji-Abolhassani & Clark, 2014). For a thorough review of task-prediction algorithms, see (Boisvert & Bruce, 2016). Eye movements can also be used to quantify mental workload, especially during demanding tasks such as air traffic control (Ahlstrom & Friedman-Berg, 2006; Di Nocera et al., 2006; Kang & Landry, 2015; McClung & Kang, 2016; Mannaru et al., 2016). Another very promising line of studies is gaze-based disease screening (Itti, 2015). Eye movement statistical analysis is opening new avenues for quantitative and inexpensive evaluation of disease. Visual attention and eye movement networks are so pervasive in the brain that many disorders affect their functioning, resulting in quantifiable alterations of eye movement behavior. Both mental and eye disease diagnostics can be informed with gaze data. Mental disorders include Parkinson's disease, attention deficit hyperactivity disorder, fetal alcohol spectrum disorder (Tseng et al., 2013), autism spectrum disorder (Wang et al., 2015), early dementia (Seligman & Giovannetti, 2015), and Alzheimer's disease (Lagun et al., 2011; Alberdi et al., 2016). See (Anderson & MacAskill, 2013) for a review of the impact of neurodegenerative disorders on eye movements. Eye tracking can also help diagnose eye diseases such as glaucoma (Crabb et al., 2014), age-related macular degeneration (Rubin & Feely, 2009; Van der Stigchel et al., 2013; Kumar & Chung, 2014), strabismus (Chen et al., 2015), and amblyopia (Chung et al., 2015). In many cases (particularly where patients or young infants cannot talk), this has the added advantage of bypassing verbal report. Given the prevalence of (sometimes subtle) health disorders, developing assessment methods that allow researchers to reliably and objectively test all ages could prove crucial for effective early intervention. Other studies have used eye movements to successfully infer observers' characteristics such as their gender (Coutrot et al., 2016), age (French et al., 2016), personality (Mercer Moss et al., 2012), and level of expertise (e.g., novices vs. experts in air traffic control (Kang & Landry, 2015), medicine (Cooper et al., 2009), and sports (Vaeyens et al., 2007); see (Gegenfurtner et al., 2011) for a meta-analysis). A complementary approach uses eye movements to extract information about what is being seen.
For instance, machine learning approaches have been used to infer the valence (positive, negative, neutral) of static natural scenes (Tavakoli et al., 2015). The category of a visual scene (e.g., conversation vs. landscape) can also be determined from eye movements in both static (O'Connell & Walther, 2015) and dynamic (Coutrot & Guyader, 2015) natural scenes.

Capturing gaze information
All the gaze-based inference and classification studies mentioned so far rely on a very broad range of gaze features. Gaze is a complex signal and has been described in a number of ways. In Fig. 1, we review the main approaches proposed in the literature. Figure 1a takes an inventory of all direct eye movement parameters: fixation duration, location, dispersion, and clusters (Mital et al., 2011; Lagun et al., 2011; Mills et al., 2011; Kardan et al., 2015; Tavakoli et al., 2015; Mills et al., 2015); saccade amplitude, duration, latency, direction, and velocity (Le Meur & Liu, 2015; Le Meur & Coutrot, 2016); microsaccade amplitude, duration, latency, direction, and velocity (Martinez-Conde et al., 2009; Ohl et al., 2016); pupil dilation (Rieger & Savin-Williams, 2012; Bednarik et al., 2012; Wass & Smith, 2014; Binetti et al., 2016); and blink frequency and duration (Ahlstrom & Friedman-Berg, 2006; Bulling et al., 2011). The advantages of these features are their direct interpretability and the fact that they can be recorded on any stimulus set without having to tune arbitrary parameters (except saccade detection thresholds). Their drawback is that high-quality eye data is required to precisely parse fixations and saccades and measure their parameters, with a sufficiently high sampling frequency (Nyström & Holmqvist, 2010). Moreover, they are synchronic indicators: the events they measure occur at a specific point in time and do not capture the spatio-temporal aspect of visual exploration. In Fig. 1b, authors introduce spatial information with eye position maps, or heatmaps, which are three-dimensional objects (x, y, fixation density) representing the spatial distribution of eye positions at a given time. They can be either binary, or continuous if smoothed with a Gaussian filter. Different metrics have been proposed to compare two eye position maps and are either distribution-based: the Kullback-Leibler divergence (KLD) (Rajashekar et al., 2004), the Pearson (Le Meur et al., 2006) or Spearman
(Toet, 2011) correlation coefficients (CC), the similarity (SIM) and the earth mover's distance (EMD) (Judd et al., 2012); or location-based: the normalized scanpath saliency (NSS) (Peters et al., 2005), the percentage of fixations in the salient region (PF) (Torralba et al., 2006), the percentile (Peters & Itti, 2008), and the information gain (Kümmerer et al., 2015).

Fig. 1 State-of-the-art in eye movement modeling and comparison. The different approaches are clustered into four groups: (a) oculomotor parameters, (b) spatial distribution of eye positions (KLD = Kullback-Leibler divergence, CC = correlation coefficient, SIM = similarity, EMD = earth mover's distance, AUC = area under curve, NSS = normalized scanpath saliency, PF = percentage of fixations), (c) string-based and geometric scanpath comparisons, and (d) probabilistic approaches. Each technique is referenced, and relevant reviews are suggested. On the lower part, a table summarizes the pros and cons of each type of approach: does it require high-quality eye data?; does it provide an easily interpretable model?; does it capture temporal information?; is it data-driven?; can it be applied to all types of stimuli? [1] Mills et al. (2015); [2] Lagun et al. (2011); [3] Tavakoli et al. (2015); [4] Mital et al. (2011); [5] Mills et al. (2011); [6] Kardan et al. (2015); [7] Le Meur & Liu (2015); [8] Le Meur & Coutrot (2016); [9] Martinez-Conde et al. (2009); [10] Ohl et al. (2016); [11] Ahlstrom & Friedman-Berg (2006); [12] Bulling et al. (2011); [13] Rieger & Savin-Williams (2012); [14] Bednarik et al. (2012); [15] Wass & Smith (2014); [16] Binetti et al. (2016); [17] Rajashekar et al. (2004); [18] Le Meur et al. (2006); [19] Toet (2011); [20] Judd et al. (2012); [21] Peters et al. (2005); [22] Torralba et al. (2006); [23] Peters & Itti (2008); [24] Kümmerer et al. (2015); [25] Riche et al. (2013); [26] Bylinskii et al. (2016); [27] Caldara & Miellet (2011); [28] Lao et al. (2016); [29] Mannan et al. (1996); [30] Mathôt et al. (2012); [31] Dewhurst et al. (2012); [32] Anderson et al. (2013); [33] Haass et al. (2016); [34] Foerster & Schneider (2013); [35] Levenshtein (1966); [36] Cristino et al. (2010); [37] Duchowski et al. (2010); [38] Räihä (2010); [39] Hembrooke et al. (2006); [40] Sutcliffe & Namoun (2012); [41] Goldberg & Helfman (2010); [42] Eraslan et al. (2015); [43] Eraslan et al. (2016); [44] Le Meur & Baccino (2013); [45] Anderson et al. (2014); [46] Kübler et al. (2016); [47] West et al. (2006); [48] Kanan et al. (2015); [49] Barthelmé et al. (2013); [50] Engbert et al. (2015); [51] Ylitalo et al. (2016); [52] Rigas et al. (2012); [53] Cantoni et al. (2015); [54] Dolezalova & Popelka (2016); [55] Vincent et al. (2009); [56] Couronné et al. (2010); [57] Haji-Abolhassani & Clark (2014); [58] Coutrot et al. (2016); [59] Chuk et al. (2017); [60] Chuk et al. (2014); [61] Brockmann & Geisel (2000); [62] Boccignone & Ferraro (2004); [63] Boccignone (2015); [64] Galdi et al. (2016)
Most of these metrics have been created to compare ground-truth eye position maps with visual saliency maps computed from a saliency model. Eye position maps are easy to compute with any stimuli; they only require simple (x, y) gaze coordinates. For instance, iMap is a popular open-source toolbox for statistical analysis of eye position maps (Caldara & Miellet, 2011; Lao et al., 2016). As with eye movement parameters, this approach is mostly data-driven: only the size of the smoothing Gaussian kernel needs to be defined by the user. Eye position maps can be visually meaningful. However, each metric measures the distance between slightly different aspects of spatial distributions, which can be hard to interpret. We refer the interested reader to the following reviews: (Riche et al., 2013; Bylinskii et al., 2016). Their main drawback is that they fail to take into account a critical aspect of gaze behavior: its highly dynamic nature. To acknowledge that visual exploration is a chronological sequence of fixations and saccades, the authors listed in Fig. 1c represent it as scanpaths. Different metrics have been proposed to compare two scanpaths. The simplest are string-edit distances (Levenshtein, 1966; Cristino et al., 2010; Duchowski et al., 2010). They first convert a sequence of fixations within predefined regions of interest (or on a simple grid) into a sequence of symbols. In this representation, comparing two scanpaths boils down to comparing two strings of symbols, i.e., computing the minimum number of edits needed to transform one string into the other. More complex vector-based methods avoid having to manually predefine regions of interest by geometrically aligning scanpaths (Mannan et al., 1996; Mathôt et al., 2012; Dewhurst et al., 2012; Anderson et al., 2013; Haass et al., 2016; Foerster & Schneider, 2013) or by finding common sequences shared by two scanpaths (Räihä, 2010; Hembrooke et al., 2006; Sutcliffe & Namoun, 2012; Goldberg & Helfman, 2010; Eraslan et al., 2016). For instance, MultiMatch aligns two scanpaths according to different dimensions (shape, length, duration, angle) before computing various measures of similarity between vectors (Dewhurst et al., 2012). For further details, the reader is referred to the following reviews: (Le Meur & Baccino, 2013; Anderson et al., 2014; Eraslan et al., 2016). The major drawback of both string-edit and geometric-based approaches is that they do not provide the user with an interpretable model of visual exploration, and they often rely heavily on free parameters (e.g., the grid resolution). Figure 1d lists probabilistic approaches for eye movement modeling. These approaches hypothesize that eye movement parameters are random variables generated by underlying stochastic processes. The simplest probabilistic gaze model is probably the Gaussian mixture model (GMM), where a set of eye positions is modeled by a sum of two-dimensional Gaussians. If the stimulus is static, eye positions can be recorded from the same observer and added up through time (Vincent et al., 2009; Couronné et al., 2010). They can also be recorded from different observers viewing the same stimulus at a given time (Mital et al., 2011). Modeling gaze with GMMs makes it possible to take into account fixations falling slightly outside regions of interest, accounting for phenomena such as the dissociation between the center of gaze and the covert focus of attention, and the imprecision of the human oculomotor system and of the eye-tracker. However, the main advantage of statistical modeling is its data-driven aspect.
For instance, the parameters of the Gaussians (center and variance) can be directly learnt from eye data via the expectation-maximization algorithm (Dempster et al., 1977), and the optimal number of Gaussians can be determined via a criterion such as the Bayesian Information Criterion, which penalizes the likelihood of models with too many parameters. To introduce the temporal component of gaze behavior into the approach, a few authors used hidden Markov models (HMMs), which capture the percentage of transitions from one region of interest (state of the model) to another (Chuk et al., 2017, 2014; Haji-Abolhassani & Clark, 2014; Coutrot et al., 2016). HMM parameters can be directly learnt from eye data via maximum likelihood estimation. For more details on HMM computation, cf. the Hidden Markov models section. HMMs are data-driven, contain temporal information, and do not require high-quality eye data. Nevertheless, they are easily interpretable only with stimuli featuring clear regions of interest (cf. the Inferring observer characteristics from eye data section). This gaze representation can be made even more compact with Fisher vectors, which are a concatenation of normalized GMM or HMM parameters into a single vector (Kanan et al., 2015). Although rich in information, these vectors are not intuitively interpretable. For a review of eye movement modeling with Markov processes, we refer the reader to Boccignone's thorough introduction (Boccignone, 2015). Some studies in the field of biometrics and gaze-based human identification propose a graph representation (Rigas et al., 2012; Cantoni et al., 2015; Galdi et al., 2016). For instance, in (Cantoni et al., 2015), the authors subdivided the clouds of fixation points with a grid to build a graph representing the gaze density and fixation durations within each cell, and the transition probabilities between cells. Finally, spatial point processes constitute a probabilistic way of modeling the spatial distribution of gaze. They make it possible to jointly model the influence of different spatial covariates, such as viewing biases or bottom-up saliency, on gaze spatial patterns (Barthelmé et al., 2013; Engbert et al., 2015; Ylitalo et al., 2016). Their main drawback is that the temporal dimension is not taken into account.
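To make this data-driven aspect concrete, here is a minimal sketch (in Python, not part of the SMAC toolbox) of fitting mixtures of two-dimensional Gaussians to fixation coordinates with the EM algorithm and selecting the number of components with the BIC; the fixation array is synthetic and the variable names are purely illustrative.

```python
# Minimal sketch: fit GMMs to fixation coordinates via EM and pick the
# number of Gaussians with the Bayesian Information Criterion (BIC).
# `fixations` is assumed to be an (N, 2) array of x/y gaze coordinates.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
fixations = rng.normal(loc=[[300, 200]], scale=40, size=(200, 2))  # toy data

best_gmm, best_bic = None, np.inf
for n_components in range(1, 6):
    gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                          random_state=0).fit(fixations)
    bic = gmm.bic(fixations)          # likelihood penalized by model complexity
    if bic < best_bic:
        best_gmm, best_bic = gmm, bic

print("selected number of Gaussians:", best_gmm.n_components)
print("ROI centers:\n", best_gmm.means_)   # centers of the 2-D Gaussians
```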

Contributions
The aim of this paper is to provide a ready-made solution for gaze modeling and classification, as well as an associated Matlab toolbox: SMAC with HMM (Scanpath Modeling And Classification with Hidden Markov Models). Our approach is based on hidden Markov models (HMMs). It integrates influences stemming from top-down mechanisms, bottom-up mechanisms, and viewing biases into a single model. It meets three criteria. First, it encapsulates the dynamic dimension of gaze behavior. Visual exploration is inherently dynamic: we do not average eye movements over time. Second, it encapsulates the individualistic dimension of gaze behavior. As mentioned in the introduction, visual exploration is a highly idiosyncratic process. As such, we want to model gaze in a data-driven fashion, learning parameters directly from eye data. Third, our approach is visually meaningful and intuitive. The rise of low-cost eye-tracking will enable a growing number of researchers to record and include eye data in their studies (Krafka et al., 2016). We want our model to be usable by scientists from all backgrounds. Our method works with any eye-data sampling frequency and does not require any input other than gaze coordinates. This paper is structured as follows. First, we formally describe HMMs in the context of gaze behavior modeling, and present our open-source toolbox. Then, we illustrate the strength and versatility of this approach by using HMM parameters to infer observer-related and stimuli-related characteristics from two very different datasets. Finally, we discuss some limitations of our framework.

Methods

Hidden Markov models for eye movement modeling

Definitions
HMMs model data varying over time, which can be seen as generated by a process switching between different phases or states at different time points. They are widely used to model Markov processes in fields as varied as speech recognition, genetics, or thermodynamics. Markov processes are memory-less stochastic processes: the probability distribution of the next state only depends on the current state and not on the sequence of events that preceded it. The adjective "hidden" means that a state is not directly observable. In the context of eye movement modeling, it can be inferred from the association between the assumed hidden state (region of interest, or ROI, of the image) and the observed data (eye positions). Here we follow the approach used in Chuk et al. (2014). More specifically, the emission densities, i.e., the distribution of fixations in each ROI, are modeled as two-dimensional Gaussian distributions. The transition from the current hidden state to the next one represents a saccade, whose probability is modeled by the transition matrix of the HMM. The initial state of the model, i.e., the probability distribution of the first fixation, is modeled by the prior values. To summarize, an HMM with K hidden states is defined by:
1. N_i(m_i, Σ_i), i ∈ [1..K], the Gaussian emission densities, with m_i the center and Σ_i the covariance of the i-th state emission;
2. A = (a_ij), (i,j) ∈ [1..K]², the transition matrix, with a_ij the probability of transitioning from state i to state j;
3. (p_i), i ∈ [1..K], the priors of the model.
Figure 2 represents nine scanpaths modeled by a single HMM. Scanpaths consisted of sequences of fixation points (on average, three fixations per second). Figure 3 is similar, but the scanpaths consisted of eye positions time-sampled at 25 Hz.
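To make these three ingredients concrete, the following toy sketch defines a three-state gaze HMM (priors, transition matrix, two-dimensional Gaussian emissions) and samples a synthetic scanpath from it. All numbers, ROI centers, and covariances are invented for illustration; this is not code from the SMAC toolbox.

```python
# Toy generative sketch of a gaze HMM: priors p, transition matrix A,
# and one 2-D Gaussian emission (m_i, Sigma_i) per hidden state (ROI).
# All values below are made up for illustration.
import numpy as np

priors = np.array([0.6, 0.3, 0.1])                 # p_i: distribution of the first fixation
A = np.array([[0.7, 0.2, 0.1],                     # a_ij: P(next state = j | current state = i)
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
means = np.array([[200., 300.], [500., 280.], [650., 400.]])   # ROI centers (pixels)
covs = np.array([np.diag([900., 900.])] * 3)                   # ROI spreads

def sample_scanpath(n_fixations, rng=np.random.default_rng(0)):
    """Sample a sequence of hidden states (ROIs) and fixation locations."""
    states, fixations = [], []
    state = rng.choice(3, p=priors)
    for _ in range(n_fixations):
        states.append(state)
        fixations.append(rng.multivariate_normal(means[state], covs[state]))
        state = rng.choice(3, p=A[state])          # Markov transition = saccade
    return np.array(states), np.array(fixations)

states, fixations = sample_scanpath(10)
print(states)          # e.g. [0 0 1 1 1 2 ...]
print(fixations[:3])   # first three simulated fixation coordinates
```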
The sampling frequency affects the transition matrix coefficients: the higher the frequency, the closer to one the diagonal coefficients. Note that using time-sampled eye positions allows fixation durations to be taken into account.

Variational approach
A critical parameter is K, the number of states. For the approach to be as data-driven as possible, this value must not be determined a priori but optimized according to the recorded eye data. This is a problem, since traditional maximum likelihood methods tend to give a greater probability to more complex model structures, leading to overfitting. In our case, an HMM with a large number of states might have a high likelihood but will be hard to interpret in terms of ROIs, and hard to compare to other HMMs trained with other sets of eye positions. The variational approach to Bayesian inference enables simultaneous estimation of model parameters and model complexity (McGrory & Titterington, 2009). It leads to an automatic choice of model complexity, including the number of states K (see also Chuk et al., 2014, 2017).

Learning an HMM from one or several observers
Two different approaches can be followed. An HMM can be learned from a group of scanpaths, as depicted in Figs. 2 and 3. This is useful to visualize and compare the gaze behavior of two different groups of observers, for instance in two different experimental conditions. It is also possible to learn one HMM per scanpath to investigate individual differences or to train a gaze-based classifier, as depicted in Fig. 4. In the following, we focus on the latter approach. To link the HMM states learned from eye data to the actual ROIs of the stimuli, we sort them according to their emission centers, from left to right. This allows comparing HMMs learned from different scanpaths.
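As a rough stand-in for this procedure, the sketch below fits Gaussian-emission HMMs to a single scanpath with the hmmlearn Python package for K = 1..K_max, keeps the model with the lowest BIC-style penalized likelihood (a crude substitute for the variational selection of K), and sorts the states by the x coordinate of their emission centers. The function name and toy data are assumptions for illustration; the SMAC toolbox itself is written in Matlab and uses variational Bayesian inference.

```python
# Illustrative sketch (not the SMAC toolbox): fit one Gaussian-emission HMM
# per scanpath with hmmlearn, choose K with a BIC-style penalty as a crude
# stand-in for the variational approach, and sort states left to right.
import numpy as np
from hmmlearn import hmm

def fit_scanpath_hmm(fixations, k_max=3):
    """fixations: (T, 2) array of fixation or gaze coordinates for one scanpath."""
    best_model, best_score = None, np.inf
    for k in range(1, k_max + 1):
        model = hmm.GaussianHMM(n_components=k, covariance_type="full", n_iter=100)
        model.fit(fixations)
        n_params = (k - 1) + k * (k - 1) + k * 2 + k * 3   # priors, transitions, means, covariances
        bic = -2 * model.score(fixations) + n_params * np.log(len(fixations))
        if bic < best_score:
            best_model, best_score = model, bic
    order = np.argsort(best_model.means_[:, 0])            # sort ROIs by x position
    return best_model, order

rng = np.random.default_rng(1)
toy_scanpath = np.vstack([rng.normal([200, 300], 30, size=(15, 2)),
                          rng.normal([500, 250], 30, size=(15, 2))])
model, order = fit_scanpath_hmm(toy_scanpath)
print(model.n_components, model.means_[order])
```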

Fig. 2 SMAC with HMM toolbox plot. Three-state HMM modeling nine scanpaths on an image from Koehler's dataset. Scanpaths: fixation points of the same color belong to the same observer. Emissions: three states have been identified. Emission counts: number of fixations associated with each state. Posterior probabilities: temporal evolution of the probability of being in each state. Shaded error bars represent the standard error from the mean. Transition matrix: probability of going from state (or region of interest) i to j, with (i,j) ∈ [1..3]². Priors: initial state of the model.

Toolbox
For more information, please refer to the SMAC with HMM toolbox manual in Supporting Information. The toolbox is available online at net/public/index.html.

Classification from HMM parameters
A variety of classification methods have been used in the gaze-based inference literature, including discriminant analysis (linear or quadratic) (Greene et al., 2012; Tseng et al., 2013; Kardan et al., 2015; French et al., 2016; Coutrot et al., 2016), support vector machines (Lagun et al., 2011; Greene et al., 2012; Zelinsky et al., 2013; Tseng et al., 2013; Kanan et al., 2014; Lemonnier et al., 2014; Tavakoli et al., 2015; Borji et al., 2015; Wang et al., 2015; Kanan et al., 2015; Mills et al., 2015; French et al., 2016), naïve Bayes (Mercer Moss et al., 2012; Borji et al., 2015; Kardan et al., 2015; Mills et al., 2015), boosting classifiers (AdaBoost, RUSBoost) (Borji & Itti, 2014; Boisvert & Bruce, 2016), clustering (mean-shift, k-means, DBSCAN) (Rajashekar et al., 2004; Kang & Landry, 2015; Engbert et al., 2015; Haass et al., 2016), random forests (Mills et al., 2015; Boisvert & Bruce, 2016), and maximum likelihood estimation (Kanan et al., 2015; Coutrot et al., 2016). See (Boisvert & Bruce, 2016) for a review. As stated in the Contributions section, this paper aims to provide an intuitive and visually meaningful method for gaze-based classification. We will focus on discriminant analysis, as it includes both a predictive and a descriptive component: it is an efficient classification method, and it provides information on the relative importance of the variables (here, gaze features) used in the analysis. Let g ∈ R^k be a k-dimensional gaze feature vector and GC = {g_i, c_j}, i ∈ [1..N], j ∈ [1..M], be a set of N observations labeled by M classes, with n_j the number of observations in class j.
Fig. 3 SMAC with HMM toolbox plot. Three-state HMM modeling nine scanpaths recorded on a video from Coutrot's dataset. Scanpaths: eye positions of the same color belong to the same observer. Emissions: three states have been identified. Emission counts: number of eye positions associated with each state. Posterior probabilities: temporal evolution of the probability of being in each state. Shaded error bars represent the standard error from the mean. Transition matrix: probability of going from state (or region of interest) i to j, with (i,j) ∈ [1..3]². Priors: initial state of the model.

Observations are the gaze features used to describe the recorded eye data (here, HMM parameters). Classes can represent any information about the stimuli or the observers (e.g., task at hand, experimental condition, etc.).

Discriminant analysis
Discriminant analysis combines the k gaze features to create a new feature space optimizing the separation between the M classes. Let μ_j be the mean of class j and W_j its variance-covariance matrix. The goal is to find a space where the observations belonging to the same class are as close as possible to each other, and as far away as possible from observations belonging to other classes. First, g is normalized to unit standard deviation and zero mean. The intra-group dispersion matrix W and the inter-group variance-covariance matrix B are defined by

W = (1/N) Σ_{j=1..M} n_j W_j  and  B = (1/N) Σ_{j=1..M} n_j (μ_j − μ)(μ_j − μ)ᵀ   (1)

with μ the global mean; the symbol ᵀ represents transposition. The eigenvectors u of the new space maximize the expression

argmax_u ( uᵀ B u / uᵀ (W + B) u )   (2)

The absolute values of the coefficients of u provide information on the relative importance of the different gaze features in separating the classes: the higher the value, the more important the corresponding feature.

Classification
The method is general, but for the sake of clarity, let's focus on an LDA-based two-class classification only. Let y_1 and y_2 be the respective projections of the class 1 and class 2 averages on u:

y_1 = uᵀ μ_1  and  y_2 = uᵀ μ_2   (3)

Let g be the new observation we want to classify and y = uᵀ g its projection on u.
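A small numerical sketch of Eqs. (1) and (2): it builds W and B from a toy two-class feature matrix and extracts the leading discriminant direction by solving the corresponding generalized eigenvalue problem. The data, dimensions, and variable names are invented for illustration.

```python
# Sketch of Eqs. (1)-(2): intra-group matrix W, inter-group matrix B,
# and the leading discriminant direction u. Toy data, two classes.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(1, 1, (40, 5))])  # N x k features
y = np.array([0] * 40 + [1] * 40)
X = (X - X.mean(0)) / X.std(0)                 # zero mean, unit standard deviation

N, mu = len(X), X.mean(axis=0)
W = np.zeros((X.shape[1],) * 2)
B = np.zeros_like(W)
for c in np.unique(y):
    Xc = X[y == c]
    n_c, mu_c = len(Xc), Xc.mean(axis=0)
    W += n_c * np.cov(Xc, rowvar=False, bias=True) / N       # intra-group dispersion
    B += n_c * np.outer(mu_c - mu, mu_c - mu) / N            # inter-group dispersion

# Generalized eigenproblem: maximize u'Bu / u'(W+B)u
eigvals, eigvecs = eigh(B, W + B)
u = eigvecs[:, -1]                              # eigenvector with the largest eigenvalue
print(np.abs(u) / np.abs(u).sum())              # relative importance of each feature
```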

Fig. 4 SMAC with HMM toolbox plot. One HMM for each of nine scanpaths recorded on a video from Coutrot's dataset. Maximum state number K_max = 3. Small white circles represent the observer's eye positions; red, green, and blue distributions represent HMM states. Covariance matrices have been tied to produce similar circular distributions.

The classification consists in assigning g to the class whose average it is closest to along u, i.e., assuming y_1 > y_2, g is assigned to class 1 if

y > (y_1 + y_2) / 2   (4)

We follow a leave-one-out approach: at each iteration, one observation is held out for testing, and the classifier is trained with all the others. The correct classification rate is then the number of iterations where the class is correctly guessed divided by N, the total number of iterations.

Application to gaze-based inference
Here, the gaze feature vector g is made of HMM parameters:

g = [ (p_i)_{i ∈ [1..K]}, (a_ij)_{(i,j) ∈ [1..K]²}, (m_i)_{i ∈ [1..K]}, (Σ_i)_{i ∈ [1..K]} ]   (5)

with (p_i) the priors, (a_ij) the transition matrix coefficients, and (m_i) and (Σ_i) the center and covariance matrix coefficients of the Gaussian emissions. K represents the number of states used in the HMM. As presented in the previous section, this number is determined by a variational approach and can change from one observation to the other. In order for g to have the same dimensionality for all observations, we define K_max as the highest number of states across all observations. For observations where K < K_max, we pad their gaze feature vector with zeros, introducing "ghost states". See for instance the first row of Fig. 5, where K_max = 3: in the free viewing and saliency viewing tasks, only one state is used, and the coefficients corresponding to the other ones are set to zero; for the object search task, two states are used, and the coefficients of the last one are set to zero.

Regularization
A problem can appear if gaze feature vectors are padded with too many zeros, or if the dimension of g exceeds the number of observations N. In that case, the intra-group dispersion matrix W is singular and therefore cannot be inverted: the eigenvectors u cannot be computed. To solve the problem, two solutions can be adopted. The first one is to simply reduce the dimensionality of g with a principal component analysis, keeping only the first P < N principal components. The second one is to use a regularized discriminant analysis (RDA) approach, which uses (1 − λ)W + λI instead of W, with a small λ called the shrinkage estimator.
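The sketch below illustrates, under the same assumptions as the earlier hmmlearn example, how the feature vector of Eq. (5) might be assembled from a fitted HMM and zero-padded up to K_max (ghost states), and how a shrinkage-regularized LDA could then be trained. The names `hmm_to_features`, `feature_matrix`, and `labels` are hypothetical and stand for one padded vector and one class label per scanpath.

```python
# Sketch of Eq. (5): flatten HMM parameters into a fixed-length feature vector,
# padding with zeros ("ghost states") when K < K_max, then classify with a
# shrinkage-regularized LDA. Attribute names follow hmmlearn's GaussianHMM.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def hmm_to_features(model, k_max=3):
    k = model.n_components
    priors = np.zeros(k_max)
    priors[:k] = model.startprob_
    A = np.zeros((k_max, k_max))
    A[:k, :k] = model.transmat_
    means = np.zeros((k_max, 2))
    means[:k] = model.means_
    variances = np.zeros((k_max, 2))
    variances[:k] = np.array([np.diag(c) for c in model.covars_])  # sigma_x^2, sigma_y^2
    return np.concatenate([priors, A.ravel(), means.ravel(), variances.ravel()])  # 24-D when k_max = 3

# Hypothetical usage: feature_matrix holds one row per scanpath, labels one class per scanpath.
# clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage=1e-5).fit(feature_matrix, labels)
# predictions = clf.predict(feature_matrix_test)
```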

Fig. 5 Hidden Markov models for four images and three tasks (panels: original image, free viewing, saliency viewing, object search; each with its fixations, transition matrix, and priors). For each image and each task, we train one HMM with the eye data of one observer. Small white circles represent the fixations of all observers performing the same task. HMMs are made of states represented by Gaussian pdfs (red, green, and blue), a transition matrix, and priors. The optimal number of states has been determined by the Bayesian variational approach.

Results
We illustrate the versatility of our approach with two very different public datasets. In the first one, we model gaze behavior on 800 still natural scene images, and infer an observer-related characteristic: the task at hand. In the second one, we model gaze behavior on 15 conversational videos, and infer a stimuli-related characteristic: the presence or absence of the original soundtrack.

Inferring observer characteristics from eye data

Koehler's dataset
This dataset was originally presented in (Koehler et al., 2014) and is freely available online (pages/saliencydata.html). It consists of 158 participants split into three tasks: free viewing, saliency search task, and cued object search task. Participants in the saliency search condition were instructed to determine whether the most salient object or location in an image was on the left or right half of the image. Participants in the cued object search task were instructed to determine
whether a target object was present in a displayed image. Stimuli consisted of 800 natural scene pictures, comprising both indoor and outdoor locations with a variety of sceneries and objects. Images were centrally displayed on a gray background for 2,000 ms, and subtended 15 × 15 degrees of visual angle. Every trial began with an initial fixation cross randomly placed either centered, 3 degrees left of center, or 3 degrees right of center. Eye data was recorded with an EyeLink eye-tracker monitoring gaze position at 250 Hz.

HMM computation
We trained one HMM per scanpath, i.e., one HMM per participant and per image. We set K_max = 3. Higher values of K_max have been tried, but in most instances the variational approach selected models with K ≤ 3. As a minimum of four points are needed to compute a three-state HMM, scanpaths with fewer than four fixations have been discarded from the analysis. We did not set a maximum number of fixations. Four representative examples are given in Fig. 5.

Task classification
Each scanpath is described by a 24-dimensional vector g: since K_max = 3, there are three priors, 3 × 3 transition matrix coefficients, 3 × 2 Gaussian center coordinates, and 3 × 2 Gaussian variance coefficients along the x and y axes. These parameters have different magnitudes, so g is normalized to unit standard deviation and zero mean. Regularized linear discriminant analysis is then used for classification. Since ghost states might be involved (for models where K < K_max), we had to regularize the training matrix. We took (1 − λ)W + λI instead of W, with λ = 1e-5. We followed a leave-one-out approach: at each iteration, we trained the classifier with all but one scanpath recorded on a given image, and tested with the removed scanpath. This led to an average correct classification rate of 55.9% (min = 2.9%, max = 87.2%, 95% confidence interval (CI) = [55.1% 56.7%]). See Fig. 6 for the distribution of classification rates across stimuli. This classifier performs significantly above chance (which is 33%). To test the significance of this performance, we ran a permutation test. We randomly shuffled the label of the class (the task) for each observation and computed a random classification rate. We repeated this procedure 1e5 times. A p value represents the fraction of trials where the classifier did as well as or better than on the original data. In our case, we found p < .001. In Fig. 7, we show the absolute average values of the coefficients of the first LDA eigenvector, with a unity-sum constraint. The higher the coefficients, the more important they are for separating the three classes. First, we notice that the priors and the transition matrix coefficients play a bigger role than the Gaussian parameters. Then, we see that all the parameters linked to the third state are higher than the other ones. We computed the average number of real states for all scanpaths in the three tasks. We found that during the search task, scanpaths have significantly more real states (M = 2.26) than during free viewing or saliency viewing (both M = 2.2, 95% CI = [2.1 2.3]).

Fig. 6 Task classification success rate histogram. The average success rate is .559, significantly above chance (.33; permutation test, p < .001). Each sample of this distribution corresponds to the mean classification rate for a given image. We show eight images drawn from the left and right tails of the distribution. Images with good task classification rates contain more salient objects. On the contrary, tasks performed on images without particularly salient objects are harder to classify.
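A compact sketch of the leave-one-out evaluation and label-permutation test just described, assuming a feature matrix with one padded HMM vector per scanpath and the corresponding task labels for a single image; scikit-learn's LDA is used as the classifier, and far fewer shuffles than the 1e5 used above are drawn, for brevity.

```python
# Sketch of the leave-one-out evaluation and label-permutation test described
# above. `features` is (n_scanpaths, 24) and `labels` holds the task of each
# scanpath for one image; both are assumed to be available.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

def loo_rate(features, labels):
    clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage=1e-5)
    return cross_val_score(clf, features, labels, cv=LeaveOneOut()).mean()

def permutation_pvalue(features, labels, n_perm=1000, rng=np.random.default_rng(0)):
    observed = loo_rate(features, labels)
    # Null distribution: classification rates obtained with shuffled task labels
    null = [loo_rate(features, rng.permutation(labels)) for _ in range(n_perm)]
    # p value: fraction of shuffles doing as well as or better than the real labels
    return observed, np.mean(np.array(null) >= observed)
```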

Fig. 7 LDA first eigenvector coefficients (absolute values, unity-sum constraint). (p_i), i ∈ [1..K], represent the priors; (a_ij), (i,j) ∈ [1..K]², the transition matrix coefficients; (x_i, y_i), i ∈ [1..K], the centers of the Gaussian states; and (σ_i^x, σ_i^y), i ∈ [1..K], their variances along the x and y axes. The higher the coefficient, the more important the corresponding parameter is for separating the classes. These coefficients optimize the separation between the three tasks in Koehler's data. The maximum number of states is K_max = 3.

Is the method equally efficient with all visual content?
Correct classification rates have a Gaussian-shaped distribution across stimuli (see Fig. 6). Why is task classification more efficient with some images than with others? We hypothesize that in order to obtain high correct classification rates, images must contain several regions of interest. If there is no region of interest (e.g., a picture of a uniform sky), observers' exploration strategies might be too random for the classifier to capture systematic patterns diagnostic of a given class. If the image only contains one salient object (e.g., a red ball on a beach), observers' exploration strategies might be too similar: everyone would stay focused on the only region of interest, and the classifier would fail for the same reason. To test this hypothesis, we looked at the correlation between the number of regions of interest and the image's correct classification score. To compute the number of regions of interest, for each image we computed its bottom-up saliency map with the attention based on information maximization (AIM) and the adaptive whitening saliency (AWS) models (Bruce & Tsotsos, 2006; Garcia-Diaz et al., 2012). We chose AIM and AWS because they provide good saliency estimation and the least spatially biased results, rendering them suitable for tasks in which there is no information about the underlying spatial bias of the stimuli (Wloka & Tsotsos, 2016). Each saliency map is thresholded to a binary image. The number of regions of interest (or "salient blobs") in the binary map is the number of connected components (bwlabel Matlab function). We found a significant positive Pearson correlation between the number of salient objects and the classification score, both for AIM (r = .4, p < .001) and for AWS. This means that images with higher correct classification rates contain more salient objects. On the other hand, images without particularly salient objects are harder to classify.

Inferring stimulus characteristics from eye data

Coutrot's dataset
This dataset was originally presented in (Coutrot & Guyader, 2014) and is freely available online. It consists of 15 conversational videos split into two auditory conditions: with or without the original soundtrack. Videos featured conversation partners embedded in a natural environment and lasted from 12 to 30 s. Original soundtracks were made of the conversation partners' voices and environmental noises; non-original soundtracks were made of natural, meaningless, slowly varying sounds such as wind or rain. Each video has been seen in each auditory condition by 18 different participants. Every trial began with an initial centered fixation cross. Eye data were recorded with an EyeLink eye-tracker monitoring gaze position at 1000 Hz.

HMM computation
We trained one HMM per scanpath, i.e., one HMM per participant and per video. HMMs were trained with the average gaze positions of the first 200
frames of the video (8 s), i.e., with 200 gaze points. We set K_max = 3. As with Koehler's dataset, higher values of K_max have been tried, but the variational approach selected models with K ≤ 3. In the first row of Fig. 8, we give an example where the ROI covariances are determined by the data. The ROI covariance seems larger without than with the original soundtrack. To test this, we computed the average real-state covariance for each HMM: σ = σ_x² + σ_y². We indeed found a greater average covariance without (M = 5568 pixels) than with (M = 44 pixels) the original soundtrack (two-sample t test: p = .02). In the second row of Fig. 8, we used a method called parameter tying to force a unique covariance matrix across all states (Rabiner, 1989). A parameter is said to be tied in the HMMs of two scanpaths if it is identical for both of them. Tying covariances makes all emissions cover the same area. This can be useful when the size of the ROIs is similar and consistent across stimuli, which is the case in this dataset, where faces are always the most salient objects. We chose the tied covariance so that state distributions are circles of the same size as the conversation partners' faces.

Fig. 8 Hidden Markov models for two videos and two auditory conditions (with and without the original soundtrack). For each video, we train one HMM with the eye data of one observer (small white circles) in each auditory condition. HMMs are made of states represented by Gaussian pdfs (red, green, and blue), a transition matrix, and priors. The optimal number of states has been determined by the Bayesian variational approach. The covariance of the HMM states in the first row is data-driven, while that of the second row has been tied to a circular distribution.

Stimuli classification
We followed the same approach as described for Koehler's dataset, except that we have here two classes of auditory conditions. Using parameter tying and K_max = 3, we achieve an average correct classification rate over all stimuli of 81.2% (min = 54.3%, max = 90.9%, 95% CI = [76.1% 86.3%]). This classifier performs significantly above chance (50%; permutation tests: p < .001).

Comparison with other gaze features and classifiers
In this section, we compare the performance of our HMM-based gaze features with other gaze features used in the literature. As described in the introduction, gaze has been
modeled in two ways: static (averaging eye movement parameters over time) and dynamic (representing gaze as a time series). We chose to compare our method with two widely popular representatives of each approach. Static: we use the average fixation duration, the standard deviation of the fixation duration distribution, the saccade amplitude, the standard deviation of the saccade amplitude distribution, the eye position dispersion (within-subject variance), and the first five eye position coordinates, as in (Greene et al., 2012; Borji & Itti, 2014; Kardan et al., 2015; Mills et al., 2015; Tavakoli et al., 2015). We can apply to these features the same classifiers as to our HMM-based features. We used linear discriminant analysis (LDA), a support vector machine with linear kernel (SVM), a relevance vector machine (RVM), and AdaBoost. RVM is similar to SVM but uses Bayesian inference to obtain parsimonious solutions for probabilistic classification (Tipping, 2001). AdaBoost investigates non-linear relationships between features by combining a number of weak classifiers to learn a strong classifier. It had previously been successfully used in visual attention modeling (Zhao & Koch, 2012; Borji, 2012; Boisvert & Bruce, 2016). Dynamic: we use ScanMatch, designed to compare pairs of scanpaths (Cristino et al., 2010). This method is based on the Needleman-Wunsch algorithm used in bioinformatics to compare DNA sequences. It compares the scanpaths of each class with each other, within and between classes. Within-class comparisons should have higher similarity scores than between-class comparisons. A k-means clustering algorithm is used to assign each comparison to either the within-class or between-class group. We used Coutrot's dataset, as fixation durations and saccade amplitudes are not available in Koehler's data. Moreover, it is a two-class classification problem (with or without the original soundtrack), directly compatible with ScanMatch. We compared the performance of classifiers trained with static and dynamic features previously used in the literature, HMM spatial features (ROI center coordinates and covariance), HMM temporal features (priors and transition matrix coefficients), and HMM spatio-temporal features (both). Table 1 shows that the best results are achieved with LDA trained with HMM spatio-temporal features.

Discussion

Integrating bottom-up, top-down, and oculomotor influences on gaze behavior
Visual attention, and hence gaze behavior, is thought to be driven by the interplay between three different mechanisms: bottom-up (stimuli-related), top-down (observer-related), and spatial viewing biases (Kollmorgen et al., 2010). In this paper, we describe a classification algorithm relying on discriminant analysis (DA) fed with hidden Markov model (HMM) parameters directly learnt from eye data. By applying it to very different datasets, we showed that this approach is able to capture gaze patterns linked to each mechanism.

Bottom-up influences
We modeled scanpaths recorded while viewing conversational videos from Coutrot's dataset. Videos were seen in two auditory conditions: with and without their original soundtracks. Our method is able to infer under which auditory condition a video was seen with an 81.2% correct classification rate (chance = 50%). HMMs trained with eye data recorded without the original soundtrack had ROIs with a greater average covariance than those trained with the original soundtrack.
This is coherent with previous studies showing that the presence of sound reduces the variability in observers' eye movements (Coutrot et al., 2012), especially while viewing conversational videos (Foulsham & Sanderson, 2013; Coutrot & Guyader, 2014). This shows that HMMs are able to capture a bottom-up influence: the presence or absence of the original soundtrack.

Top-down influences
We also modeled scanpaths recorded while viewing static natural scenes from Koehler's dataset. Observers were asked to look at pictures under three different tasks: free viewing, saliency viewing, and object search. Our method is able to infer under which task an image was seen with a 55.9% correct classification rate (chance = 33.3%). HMMs are hence able to capture a top-down influence: the task at hand.

Table 1 Correct classification scores on Coutrot's dataset, with different gaze features and classifiers

Gaze features                                                                  | LDA   | SVM   | RVM   | AdaBoost | k-means
Static (saccade & fixation parameters averaged over time)                     | 52.4% | 56.7% | 63.8% | 57.6%    | n/a
Dynamic (ScanMatch scores)                                                     | n/a   | n/a   | n/a   | n/a      | 59.5%
HMM spatial features (ROI mean + covariance)                                   | 59.%  | 57.4% | 62.5% | 55.2%    | n/a
HMM temporal features (priors + transition matrix)                             | 5.3%  | 54.8% | 6.4%  | 54.6%    | n/a
HMM spatio-temporal features (priors + transition matrix + mean + covariance)  | 81.2% | 58.%  | 58.7% | 54.8%    | n/a

Scores significantly above chance are in bold in the original article (binomial test, p < .05 between 56% and 59%, p < .01 above 59%). Chance level is 50%.

This complements two previous studies that also successfully used HMMs to infer observer-related properties: the observer's gender during face exploration (Coutrot et al., 2016), and the observer's processing state during reading (Simola et al., 2008).

Viewing biases
Looking at Fig. 5, we notice that in the search task there is often a greater number of fixations at the center of the stimuli than in the other tasks. This cluster of central fixations is clearly modeled by a third HMM state in the second and third images. On average across all stimuli, we found a higher number of real HMM states in the search task than in the free viewing or saliency viewing tasks (2.26 versus 2.2). Figure 7 indicates that the LDA first eigenvector coefficients related to the third state are higher than the other ones. Having a real third component is hence one of the criteria used by the classifier as a good marker of the search task. Moreover, the posterior probabilities of the states displayed in Fig. 2 indicate that this center bias is stronger at the beginning of the exploration. This corroborates the idea that the center of the image is an optimal location for early and efficient information processing, often reported in the literature as the center bias (Tatler, 2007). Hence, HMMs are able to integrate influences stemming from top-down mechanisms (task at hand), bottom-up mechanisms (presence of the original soundtrack), and viewing biases (center bias) in a single model of gaze behavior.

Interpretability
The choice of both gaze features and classification algorithm is fundamental for efficient classification. A good illustration of this is Greene et al.'s reported failure to computationally replicate Yarbus's seminal claim that the observer's task can be predicted from their eye movement patterns (Greene et al., 2012). Using linear discriminant analysis and simple eye movement parameters (fixation durations, saccade amplitudes, etc.), they did not obtain correct classification rates higher than chance. In 2014, Borji et al. and Kanan et al. obtained positive results with the same dataset by respectively adding spatial and temporal information (Borji & Itti, 2014; Kanan et al., 2014). They used non-linear classification methods such as k-nearest-neighbors (kNN), random undersampling boosting (RUSBoost), and Fisher kernel learning, and obtained correct classification rates significantly above chance. Going further, one can hypothesize that even higher correct classification rates could be reached using deep learning networks (DLNs), which have proven unbeatable for visual saliency prediction (Bylinskii et al., 2015). However, boosting algorithms and DLNs suffer from an important drawback: both rely on thousands of parameters, whose roles and weights are hard to interpret (although see (Lipton, 2016)). Conversely, in addition to providing good correct classification rates, our approach is easy for users to understand and interpret. Our classification approach takes as input a limited number of identified and meaningful HMM parameters (priors, transition probabilities between learnt regions of interest, Gaussian centers and covariances), and outputs weights indicating the importance of the corresponding parameters in the classification process.

Simplicity
In order to make gaze-based classification easily usable in as many contexts as possible, relying on simple features is essential. In a recent study, Boisvert et al. used Koehler's dataset to classify observers' task from eye data (Boisvert & Bruce, 2016).
They achieved a correct classification score of 56.37%, similar to ours (55.9%). They trained a random forest classifier with a combination of gaze-based (fixation density maps) and image-based features. They convolved each image with 48 filters from the Leung-Malik filter bank, corresponding to different spatial scales and orientations (Leung & Malik, 2001), and extracted the response of each filter at each eye position. They also computed histograms of oriented gradients from every fixated location, as well as a holistic representation of the scene based on the Gist descriptor (Oliva & Torralba, 2006). This approach is very interesting, as it allows assessing the role of specific features or image structure at fixated locations. However, computing such features can be computationally costly, and even impossible if the visual stimuli are not available. On the other hand, our approach only relies on gaze coordinates, either fixations or eye positions sampled at a given frequency.

Limitations
Our approach suffers from a number of limitations. First, HMMs are dependent on the structure of the visual stimuli. In order to have a meaningful and stable model, stimuli must contain regions of interest (ROIs). For instance, modeling the visual exploration of a uniform landscape is difficult, as nothing drives observers' exploration: the corresponding HMM would most likely have a single uninformative central state. This is illustrated by the distribution of correct classification rates across stimuli in Fig. 6. We showed a positive correlation between the number of ROIs and the images' correct classification rates. This means that in order for different gaze patterns to develop and to get captured by the model, visual stimuli must feature a few salient regions. Another consequence of the dependence on visual content is the difficulty of aggregating eye data recorded while viewing different stimuli. It is possible when the stimuli share the same layout, or have similar ROIs. For instance, a recent study used eye data of observers looking at different faces to train a single HMM (Coutrot et al., 2016). This was possible since faces share the same features and can be aligned to each other; but this would not be possible with Koehler's dataset, as it is made of diverse natural scenes featuring ROIs of various sizes at


More information

Adding Shape to Saliency: A Computational Model of Shape Contrast

Adding Shape to Saliency: A Computational Model of Shape Contrast Adding Shape to Saliency: A Computational Model of Shape Contrast Yupei Chen 1, Chen-Ping Yu 2, Gregory Zelinsky 1,2 Department of Psychology 1, Department of Computer Science 2 Stony Brook University

More information

EECS 433 Statistical Pattern Recognition

EECS 433 Statistical Pattern Recognition EECS 433 Statistical Pattern Recognition Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1 / 19 Outline What is Pattern

More information

A Visual Saliency Map Based on Random Sub-Window Means

A Visual Saliency Map Based on Random Sub-Window Means A Visual Saliency Map Based on Random Sub-Window Means Tadmeri Narayan Vikram 1,2, Marko Tscherepanow 1 and Britta Wrede 1,2 1 Applied Informatics Group 2 Research Institute for Cognition and Robotics

More information

Introduction to Computational Neuroscience

Introduction to Computational Neuroscience Introduction to Computational Neuroscience Lecture 5: Data analysis II Lesson Title 1 Introduction 2 Structure and Function of the NS 3 Windows to the Brain 4 Data analysis 5 Data analysis II 6 Single

More information

Object-based Saliency as a Predictor of Attention in Visual Tasks

Object-based Saliency as a Predictor of Attention in Visual Tasks Object-based Saliency as a Predictor of Attention in Visual Tasks Michal Dziemianko (m.dziemianko@sms.ed.ac.uk) Alasdair Clarke (a.clarke@ed.ac.uk) Frank Keller (keller@inf.ed.ac.uk) Institute for Language,

More information

Changing expectations about speed alters perceived motion direction

Changing expectations about speed alters perceived motion direction Current Biology, in press Supplemental Information: Changing expectations about speed alters perceived motion direction Grigorios Sotiropoulos, Aaron R. Seitz, and Peggy Seriès Supplemental Data Detailed

More information

Saliency aggregation: Does unity make strength?

Saliency aggregation: Does unity make strength? Saliency aggregation: Does unity make strength? Olivier Le Meur a and Zhi Liu a,b a IRISA, University of Rennes 1, FRANCE b School of Communication and Information Engineering, Shanghai University, CHINA

More information

Evaluation of the Impetuses of Scan Path in Real Scene Searching

Evaluation of the Impetuses of Scan Path in Real Scene Searching Evaluation of the Impetuses of Scan Path in Real Scene Searching Chen Chi, Laiyun Qing,Jun Miao, Xilin Chen Graduate University of Chinese Academy of Science,Beijing 0009, China. Key Laboratory of Intelligent

More information

Deriving an appropriate baseline for describing fixation behaviour. Alasdair D. F. Clarke. 1. Institute of Language, Cognition and Computation

Deriving an appropriate baseline for describing fixation behaviour. Alasdair D. F. Clarke. 1. Institute of Language, Cognition and Computation Central Baselines 1 Running head: CENTRAL BASELINES Deriving an appropriate baseline for describing fixation behaviour Alasdair D. F. Clarke 1. Institute of Language, Cognition and Computation School of

More information

Performance and Saliency Analysis of Data from the Anomaly Detection Task Study

Performance and Saliency Analysis of Data from the Anomaly Detection Task Study Performance and Saliency Analysis of Data from the Anomaly Detection Task Study Adrienne Raglin 1 and Andre Harrison 2 1 U.S. Army Research Laboratory, Adelphi, MD. 20783, USA {adrienne.j.raglin.civ, andre.v.harrison2.civ}@mail.mil

More information

Introduction to Computational Neuroscience

Introduction to Computational Neuroscience Introduction to Computational Neuroscience Lecture 11: Attention & Decision making Lesson Title 1 Introduction 2 Structure and Function of the NS 3 Windows to the Brain 4 Data analysis 5 Data analysis

More information

Lecturer: Rob van der Willigen 11/9/08

Lecturer: Rob van der Willigen 11/9/08 Auditory Perception - Detection versus Discrimination - Localization versus Discrimination - - Electrophysiological Measurements Psychophysical Measurements Three Approaches to Researching Audition physiology

More information

The Importance of Time in Visual Attention Models

The Importance of Time in Visual Attention Models The Importance of Time in Visual Attention Models Degree s Thesis Audiovisual Systems Engineering Author: Advisors: Marta Coll Pol Xavier Giró-i-Nieto and Kevin Mc Guinness Dublin City University (DCU)

More information

Lecturer: Rob van der Willigen 11/9/08

Lecturer: Rob van der Willigen 11/9/08 Auditory Perception - Detection versus Discrimination - Localization versus Discrimination - Electrophysiological Measurements - Psychophysical Measurements 1 Three Approaches to Researching Audition physiology

More information

Does scene context always facilitate retrieval of visual object representations?

Does scene context always facilitate retrieval of visual object representations? Psychon Bull Rev (2011) 18:309 315 DOI 10.3758/s13423-010-0045-x Does scene context always facilitate retrieval of visual object representations? Ryoichi Nakashima & Kazuhiko Yokosawa Published online:

More information

What we see is most likely to be what matters: Visual attention and applications

What we see is most likely to be what matters: Visual attention and applications What we see is most likely to be what matters: Visual attention and applications O. Le Meur P. Le Callet olemeur@irisa.fr patrick.lecallet@univ-nantes.fr http://www.irisa.fr/temics/staff/lemeur/ November

More information

A Model for Automatic Diagnostic of Road Signs Saliency

A Model for Automatic Diagnostic of Road Signs Saliency A Model for Automatic Diagnostic of Road Signs Saliency Ludovic Simon (1), Jean-Philippe Tarel (2), Roland Brémond (2) (1) Researcher-Engineer DREIF-CETE Ile-de-France, Dept. Mobility 12 rue Teisserenc

More information

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks Marc Assens 1, Kevin McGuinness 1, Xavier Giro-i-Nieto 2, and Noel E. O Connor 1 1 Insight Centre for Data Analytic, Dublin City

More information

Learning Spatiotemporal Gaps between Where We Look and What We Focus on

Learning Spatiotemporal Gaps between Where We Look and What We Focus on Express Paper Learning Spatiotemporal Gaps between Where We Look and What We Focus on Ryo Yonetani 1,a) Hiroaki Kawashima 1,b) Takashi Matsuyama 1,c) Received: March 11, 2013, Accepted: April 24, 2013,

More information

Methods for comparing scanpaths and saliency maps: strengths and weaknesses

Methods for comparing scanpaths and saliency maps: strengths and weaknesses Methods for comparing scanpaths and saliency maps: strengths and weaknesses Olivier Le Meur, Thierry Baccino To cite this version: Olivier Le Meur, Thierry Baccino. Methods for comparing scanpaths and

More information

Supplementary materials for: Executive control processes underlying multi- item working memory

Supplementary materials for: Executive control processes underlying multi- item working memory Supplementary materials for: Executive control processes underlying multi- item working memory Antonio H. Lara & Jonathan D. Wallis Supplementary Figure 1 Supplementary Figure 1. Behavioral measures of

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Evaluating Visual Saliency Algorithms: Past, Present and Future

Evaluating Visual Saliency Algorithms: Past, Present and Future Journal of Imaging Science and Technology R 59(5): 050501-1 050501-17, 2015. c Society for Imaging Science and Technology 2015 Evaluating Visual Saliency Algorithms: Past, Present and Future Puneet Sharma

More information

Identification of Tissue Independent Cancer Driver Genes

Identification of Tissue Independent Cancer Driver Genes Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important

More information

Meaning-based guidance of attention in scenes as revealed by meaning maps

Meaning-based guidance of attention in scenes as revealed by meaning maps SUPPLEMENTARY INFORMATION Letters DOI: 1.138/s41562-17-28- In the format provided by the authors and unedited. -based guidance of attention in scenes as revealed by meaning maps John M. Henderson 1,2 *

More information

A HMM-based Pre-training Approach for Sequential Data

A HMM-based Pre-training Approach for Sequential Data A HMM-based Pre-training Approach for Sequential Data Luca Pasa 1, Alberto Testolin 2, Alessandro Sperduti 1 1- Department of Mathematics 2- Department of Developmental Psychology and Socialisation University

More information

Eye Movement Pattern in Face Recognition is Associated with Cognitive Decline in the Elderly

Eye Movement Pattern in Face Recognition is Associated with Cognitive Decline in the Elderly Eye Movement Pattern in Face Recognition is Associated with Cognitive Decline in the Elderly Cynthia Y.H. Chan (cynchan@hku.hk) Department of Psychology, University of Hong Kong, Pokfulam Road, Hong Kong

More information

The Role of Color and Attention in Fast Natural Scene Recognition

The Role of Color and Attention in Fast Natural Scene Recognition Color and Fast Scene Recognition 1 The Role of Color and Attention in Fast Natural Scene Recognition Angela Chapman Department of Cognitive and Neural Systems Boston University 677 Beacon St. Boston, MA

More information

Influence of Low-Level Stimulus Features, Task Dependent Factors, and Spatial Biases on Overt Visual Attention

Influence of Low-Level Stimulus Features, Task Dependent Factors, and Spatial Biases on Overt Visual Attention Influence of Low-Level Stimulus Features, Task Dependent Factors, and Spatial Biases on Overt Visual Attention Sepp Kollmorgen 1,2., Nora Nortmann 1., Sylvia Schröder 1,2 *., Peter König 1 1 Institute

More information

A Comparison of Collaborative Filtering Methods for Medication Reconciliation

A Comparison of Collaborative Filtering Methods for Medication Reconciliation A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,

More information

Recognizing Scenes by Simulating Implied Social Interaction Networks

Recognizing Scenes by Simulating Implied Social Interaction Networks Recognizing Scenes by Simulating Implied Social Interaction Networks MaryAnne Fields and Craig Lennon Army Research Laboratory, Aberdeen, MD, USA Christian Lebiere and Michael Martin Carnegie Mellon University,

More information

The 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian

The 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian The 29th Fuzzy System Symposium (Osaka, September 9-, 3) A Fuzzy Inference Method Based on Saliency Map for Prediction Mao Wang, Yoichiro Maeda 2, Yasutake Takahashi Graduate School of Engineering, University

More information

NIH Public Access Author Manuscript J Vis. Author manuscript; available in PMC 2010 August 4.

NIH Public Access Author Manuscript J Vis. Author manuscript; available in PMC 2010 August 4. NIH Public Access Author Manuscript Published in final edited form as: J Vis. ; 9(11): 25.1 2522. doi:10.1167/9.11.25. Everyone knows what is interesting: Salient locations which should be fixated Christopher

More information

Contributions to Brain MRI Processing and Analysis

Contributions to Brain MRI Processing and Analysis Contributions to Brain MRI Processing and Analysis Dissertation presented to the Department of Computer Science and Artificial Intelligence By María Teresa García Sebastián PhD Advisor: Prof. Manuel Graña

More information

How does image noise affect actual and predicted human gaze allocation in assessing image quality?

How does image noise affect actual and predicted human gaze allocation in assessing image quality? How does image noise affect actual and predicted human gaze allocation in assessing image quality? Florian Röhrbein 1, Peter Goddard 2, Michael Schneider 1,Georgina James 2, Kun Guo 2 1 Institut für Informatik

More information

Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling

Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling AAAI -13 July 16, 2013 Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling Sheng-hua ZHONG 1, Yan LIU 1, Feifei REN 1,2, Jinghuan ZHANG 2, Tongwei REN 3 1 Department of

More information

Webpage Saliency. National University of Singapore

Webpage Saliency. National University of Singapore Webpage Saliency Chengyao Shen 1,2 and Qi Zhao 2 1 Graduate School for Integrated Science and Engineering, 2 Department of Electrical and Computer Engineering, National University of Singapore Abstract.

More information

Computational Cognitive Science

Computational Cognitive Science Computational Cognitive Science Lecture 15: Visual Attention Chris Lucas (Slides adapted from Frank Keller s) School of Informatics University of Edinburgh clucas2@inf.ed.ac.uk 14 November 2017 1 / 28

More information

Humans Have Idiosyncratic and Task-specific Scanpaths for Judging Faces

Humans Have Idiosyncratic and Task-specific Scanpaths for Judging Faces Humans Have Idiosyncratic and Task-specific Scanpaths for Judging Faces Christopher Kanan Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA Dina N.F. Bseiso Department of

More information

Finding Saliency in Noisy Images

Finding Saliency in Noisy Images Finding Saliency in Noisy Images Chelhwon Kim and Peyman Milanfar Electrical Engineering Department, University of California, Santa Cruz, CA, USA ABSTRACT Recently, many computational saliency models

More information

Classification. Methods Course: Gene Expression Data Analysis -Day Five. Rainer Spang

Classification. Methods Course: Gene Expression Data Analysis -Day Five. Rainer Spang Classification Methods Course: Gene Expression Data Analysis -Day Five Rainer Spang Ms. Smith DNA Chip of Ms. Smith Expression profile of Ms. Smith Ms. Smith 30.000 properties of Ms. Smith The expression

More information

Contribution of Color Information in Visual Saliency Model for Videos

Contribution of Color Information in Visual Saliency Model for Videos Contribution of Color Information in Visual Saliency Model for Videos Shahrbanoo Hamel, Nathalie Guyader, Denis Pellerin, and Dominique Houzet GIPSA-lab, UMR 5216, Grenoble, France Abstract. Much research

More information

Computational Cognitive Science. The Visual Processing Pipeline. The Visual Processing Pipeline. Lecture 15: Visual Attention.

Computational Cognitive Science. The Visual Processing Pipeline. The Visual Processing Pipeline. Lecture 15: Visual Attention. Lecture 15: Visual Attention School of Informatics University of Edinburgh keller@inf.ed.ac.uk November 11, 2016 1 2 3 Reading: Itti et al. (1998). 1 2 When we view an image, we actually see this: The

More information

Functional Fixedness: The Functional Significance of Delayed Disengagement Based on Attention Set

Functional Fixedness: The Functional Significance of Delayed Disengagement Based on Attention Set In press, Journal of Experimental Psychology: Human Perception and Performance Functional Fixedness: The Functional Significance of Delayed Disengagement Based on Attention Set Timothy J. Wright 1, Walter

More information

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES Varinthira Duangudom and David V Anderson School of Electrical and Computer Engineering, Georgia Institute of Technology Atlanta, GA 30332

More information

RECENT progress in computer visual recognition, in particular

RECENT progress in computer visual recognition, in particular IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, REVISED, AUGUST 2014. 1 Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition Stefan Mathe, Member,

More information

Convergence Principles: Information in the Answer

Convergence Principles: Information in the Answer Convergence Principles: Information in the Answer Sets of Some Multiple-Choice Intelligence Tests A. P. White and J. E. Zammarelli University of Durham It is hypothesized that some common multiplechoice

More information

Quantifying location privacy

Quantifying location privacy Sébastien Gambs Quantifying location privacy 1 Quantifying location privacy Sébastien Gambs Université de Rennes 1 - INRIA sgambs@irisa.fr 10 September 2013 Sébastien Gambs Quantifying location privacy

More information

RECENT progress in computer visual recognition, in particular

RECENT progress in computer visual recognition, in particular 1 Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition Stefan Mathe, Member, IEEE, Cristian Sminchisescu, Member, IEEE arxiv:1312.7570v1 [cs.cv] 29 Dec 2013 Abstract

More information

TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS)

TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS) TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS) AUTHORS: Tejas Prahlad INTRODUCTION Acute Respiratory Distress Syndrome (ARDS) is a condition

More information

FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES

FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES Riku Matsumoto, Hiroki Yoshimura, Masashi Nishiyama, and Yoshio Iwai Department of Information and Electronics,

More information

Recurrent Refinement for Visual Saliency Estimation in Surveillance Scenarios

Recurrent Refinement for Visual Saliency Estimation in Surveillance Scenarios 2012 Ninth Conference on Computer and Robot Vision Recurrent Refinement for Visual Saliency Estimation in Surveillance Scenarios Neil D. B. Bruce*, Xun Shi*, and John K. Tsotsos Department of Computer

More information

A Locally Weighted Fixation Density-Based Metric for Assessing the Quality of Visual Saliency Predictions

A Locally Weighted Fixation Density-Based Metric for Assessing the Quality of Visual Saliency Predictions FINAL VERSION PUBLISHED IN IEEE TRANSACTIONS ON IMAGE PROCESSING 06 A Locally Weighted Fixation Density-Based Metric for Assessing the Quality of Visual Saliency Predictions Milind S. Gide and Lina J.

More information

Dynamic Eye Movement Datasets and Learnt Saliency Models for Visual Action Recognition

Dynamic Eye Movement Datasets and Learnt Saliency Models for Visual Action Recognition Dynamic Eye Movement Datasets and Learnt Saliency Models for Visual Action Recognition Stefan Mathe 1,3 and Cristian Sminchisescu 2,1 1 Institute of Mathematics of the Romanian Academy (IMAR) 2 Faculty

More information

Classification and Statistical Analysis of Auditory FMRI Data Using Linear Discriminative Analysis and Quadratic Discriminative Analysis

Classification and Statistical Analysis of Auditory FMRI Data Using Linear Discriminative Analysis and Quadratic Discriminative Analysis International Journal of Innovative Research in Computer Science & Technology (IJIRCST) ISSN: 2347-5552, Volume-2, Issue-6, November-2014 Classification and Statistical Analysis of Auditory FMRI Data Using

More information

ELL 788 Computational Perception & Cognition July November 2015

ELL 788 Computational Perception & Cognition July November 2015 ELL 788 Computational Perception & Cognition July November 2015 Module 8 Audio and Multimodal Attention Audio Scene Analysis Two-stage process Segmentation: decomposition to time-frequency segments Grouping

More information

The influence of clutter on real-world scene search: Evidence from search efficiency and eye movements

The influence of clutter on real-world scene search: Evidence from search efficiency and eye movements The influence of clutter on real-world scene search: Evidence from search efficiency and eye movements John Henderson, Myriam Chanceaux, Tim Smith To cite this version: John Henderson, Myriam Chanceaux,

More information

UC Merced Proceedings of the Annual Meeting of the Cognitive Science Society

UC Merced Proceedings of the Annual Meeting of the Cognitive Science Society UC Merced Proceedings of the Annual Meeting of the Cognitive Science Society Title A Non-Verbal Pre-Training Based on Eye Movements to Foster Comprehension of Static and Dynamic Learning Environments Permalink

More information

SUPPLEMENTARY INFORMATION In format provided by Javier DeFelipe et al. (MARCH 2013)

SUPPLEMENTARY INFORMATION In format provided by Javier DeFelipe et al. (MARCH 2013) Supplementary Online Information S2 Analysis of raw data Forty-two out of the 48 experts finished the experiment, and only data from these 42 experts are considered in the remainder of the analysis. We

More information

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models White Paper 23-12 Estimating Complex Phenotype Prevalence Using Predictive Models Authors: Nicholas A. Furlotte Aaron Kleinman Robin Smith David Hinds Created: September 25 th, 2015 September 25th, 2015

More information

An Evaluation of Motion in Artificial Selective Attention

An Evaluation of Motion in Artificial Selective Attention An Evaluation of Motion in Artificial Selective Attention Trent J. Williams Bruce A. Draper Colorado State University Computer Science Department Fort Collins, CO, U.S.A, 80523 E-mail: {trent, draper}@cs.colostate.edu

More information

Mammogram Analysis: Tumor Classification

Mammogram Analysis: Tumor Classification Mammogram Analysis: Tumor Classification Term Project Report Geethapriya Raghavan geeragh@mail.utexas.edu EE 381K - Multidimensional Digital Signal Processing Spring 2005 Abstract Breast cancer is the

More information

Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis

Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis Thesis Proposal Indrayana Rustandi April 3, 2007 Outline Motivation and Thesis Preliminary results: Hierarchical

More information

Abstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction

Abstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction Optimization strategy of Copy Number Variant calling using Multiplicom solutions Michael Vyverman, PhD; Laura Standaert, PhD and Wouter Bossuyt, PhD Abstract Copy number variations (CNVs) represent a significant

More information

Putting Context into. Vision. September 15, Derek Hoiem

Putting Context into. Vision. September 15, Derek Hoiem Putting Context into Vision Derek Hoiem September 15, 2004 Questions to Answer What is context? How is context used in human vision? How is context currently used in computer vision? Conclusions Context

More information

Predicting Breast Cancer Survival Using Treatment and Patient Factors

Predicting Breast Cancer Survival Using Treatment and Patient Factors Predicting Breast Cancer Survival Using Treatment and Patient Factors William Chen wchen808@stanford.edu Henry Wang hwang9@stanford.edu 1. Introduction Breast cancer is the leading type of cancer in women

More information

Learning to classify integral-dimension stimuli

Learning to classify integral-dimension stimuli Psychonomic Bulletin & Review 1996, 3 (2), 222 226 Learning to classify integral-dimension stimuli ROBERT M. NOSOFSKY Indiana University, Bloomington, Indiana and THOMAS J. PALMERI Vanderbilt University,

More information

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018 Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this

More information

Can Saliency Map Models Predict Human Egocentric Visual Attention?

Can Saliency Map Models Predict Human Egocentric Visual Attention? Can Saliency Map Models Predict Human Egocentric Visual Attention? Kentaro Yamada 1, Yusuke Sugano 1, Takahiro Okabe 1 Yoichi Sato 1, Akihiro Sugimoto 2, and Kazuo Hiraki 3 1 The University of Tokyo, Tokyo,

More information

Visual Strategies in Analogical Reasoning Development: A New Method for Classifying Scanpaths

Visual Strategies in Analogical Reasoning Development: A New Method for Classifying Scanpaths Visual Strategies in Analogical Reasoning Development: A New Method for Classifying Scanpaths Yannick Glady, Jean-Pierre Thibaut, Robert French {yannick.glady, jean-pierre.thibaut, robert.french}@u-bourgogne.fr

More information

Supplemental Material

Supplemental Material 1 Supplemental Material Golomb, J.D, and Kanwisher, N. (2012). Higher-level visual cortex represents retinotopic, not spatiotopic, object location. Cerebral Cortex. Contents: - Supplemental Figures S1-S3

More information

Sound Texture Classification Using Statistics from an Auditory Model

Sound Texture Classification Using Statistics from an Auditory Model Sound Texture Classification Using Statistics from an Auditory Model Gabriele Carotti-Sha Evan Penn Daniel Villamizar Electrical Engineering Email: gcarotti@stanford.edu Mangement Science & Engineering

More information

Emotion Recognition using a Cauchy Naive Bayes Classifier

Emotion Recognition using a Cauchy Naive Bayes Classifier Emotion Recognition using a Cauchy Naive Bayes Classifier Abstract Recognizing human facial expression and emotion by computer is an interesting and challenging problem. In this paper we propose a method

More information

Visual Saliency with Statistical Priors

Visual Saliency with Statistical Priors Int J Comput Vis (2014) 107:239 253 DOI 10.1007/s11263-013-0678-0 Visual Saliency with Statistical Priors Jia Li Yonghong Tian Tiejun Huang Received: 21 December 2012 / Accepted: 21 November 2013 / Published

More information

Comparative Study of K-means, Gaussian Mixture Model, Fuzzy C-means algorithms for Brain Tumor Segmentation

Comparative Study of K-means, Gaussian Mixture Model, Fuzzy C-means algorithms for Brain Tumor Segmentation Comparative Study of K-means, Gaussian Mixture Model, Fuzzy C-means algorithms for Brain Tumor Segmentation U. Baid 1, S. Talbar 2 and S. Talbar 1 1 Department of E&TC Engineering, Shri Guru Gobind Singhji

More information

Auditory Scene Analysis

Auditory Scene Analysis 1 Auditory Scene Analysis Albert S. Bregman Department of Psychology McGill University 1205 Docteur Penfield Avenue Montreal, QC Canada H3A 1B1 E-mail: bregman@hebb.psych.mcgill.ca To appear in N.J. Smelzer

More information

2012 Course : The Statistician Brain: the Bayesian Revolution in Cognitive Science

2012 Course : The Statistician Brain: the Bayesian Revolution in Cognitive Science 2012 Course : The Statistician Brain: the Bayesian Revolution in Cognitive Science Stanislas Dehaene Chair in Experimental Cognitive Psychology Lecture No. 4 Constraints combination and selection of a

More information

Real-time computational attention model for dynamic scenes analysis

Real-time computational attention model for dynamic scenes analysis Computer Science Image and Interaction Laboratory Real-time computational attention model for dynamic scenes analysis Matthieu Perreira Da Silva Vincent Courboulay 19/04/2012 Photonics Europe 2012 Symposium,

More information