Valence-arousal evaluation using physiological signals in an emotion recall paradigm. CHANEL, Guillaume, ANSARI ASL, Karim, PUN, Thierry.

Similar documents
A Bayesian Framework for Video Affective Representation. SOLEYMANI, Mohammad, et al.

Proceedings Chapter. Boredom, Engagement and Anxiety as Indicators for Adaptation to Difficulty in Games. CHANEL, Guillaume, et al.

On Shape and the Computability of Emotions. X. Lu, et al.

Towards Human Affect Modeling: A Comparative Analysis of Discrete Affect and Valence-Arousal Labeling

INTER-RATER RELIABILITY OF ACTUAL TAGGED EMOTION CATEGORIES VALIDATION USING COHEN'S KAPPA COEFFICIENT

Article. Short-term emotion assessment in a recall paradigm. CHANEL, Guillaume, et al.

PHYSIOLOGICAL RESEARCH

DISCRETE WAVELET PACKET TRANSFORM FOR ELECTROENCEPHALOGRAM- BASED EMOTION RECOGNITION IN THE VALENCE-AROUSAL SPACE

Distributed Multisensory Signals Acquisition and Analysis in Dyadic Interactions

Emotion Recognition using a Cauchy Naive Bayes Classifier

Dimensional Emotion Prediction from Spontaneous Head Gestures for Interaction with Sensitive Artificial Listeners

Relationship of Terror Feelings and Physiological Response During Watching Horror Movie

Affective Impact of Movies: Task Overview and Results

Emotion classification using linear predictive features on wavelet-decomposed EEG data*

Analysis of EEG signals and facial expressions for continuous emotion detection

Facial Expression Recognition Using Principal Component Analysis

THE PHYSIOLOGICAL UNDERPINNINGS OF AFFECTIVE REACTIONS TO PICTURES AND MUSIC. Matthew Schafer The College of William and Mary SREBCS, USC

A USER INDEPENDENT, BIOSIGNAL BASED, EMOTION RECOGNITION METHOD

AMIGOS: A Dataset for Affect, Personality and Mood Research on Individuals and Groups

Affective pictures and emotion analysis of facial expressions with local binary pattern operator: Preliminary results

Emotion Detection Using Physiological Signals. M.A.Sc. Thesis Proposal Haiyan Xu Supervisor: Prof. K.N. Plataniotis

Emote to Win: Affective Interactions with a Computer Game Agent

A Protocol for Cross-Validating Large Crowdsourced Data: The Case of the LIRIS-ACCEDE Affective Video Dataset

Emotion Affective Color Transfer Using Feature Based Facial Expression Recognition

EEG-based Valence Level Recognition for Real-Time Applications

Detecting emotion from EEG signals using the emotive epoc device

Estimating Multiple Evoked Emotions from Videos

Emotion based E-learning System using Physiological Signals. Dr. Jerritta S, Dr. Arun S School of Engineering, Vels University, Chennai

Facial expression recognition with spatiotemporal local descriptors

MSAS: An M-mental health care System for Automatic Stress detection

A Large Video Database for Computational Models of Induced Emotion

Recognising Emotions from Keyboard Stroke Pattern

Human Factors & User Experience LAB

Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information

Bio-sensing for Emotional Characterization without Word Labels

PERFORMANCE EVALUATION OF VARIOUS EMOTION CLASSIFICATION APPROACHES FROM PHYSIOLOGICAL SIGNALS

Statistical Methods for Wearable Technology in CNS Trials

Outline. Teager Energy and Modulation Features for Speech Applications. Dept. of ECE Technical Univ. of Crete

Crowdsourcing for Affective Annotation of Video : Development of a Viewer-reported Boredom Corpus. SOLEYMANI, Mohammad, LARSON, Martha.

EMOTION RECOGNITION FROM USERS' EEG SIGNALS WITH THE HELP OF STIMULUS VIDEOS

PREFACE Emotions and Personality in Personalized Systems

IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 19, NO. 5, JULY

Edge Based Grid Super-Imposition for Crowd Emotion Recognition

The Ordinal Nature of Emotions. Georgios N. Yannakakis, Roddy Cowie and Carlos Busso

Using multi-method measures in consumer research investigating eye-tracking, electro-dermal activity and self report

FROM CROWDSOURCED RANKINGS TO AFFECTIVE RATINGS

Asymmetric spatial pattern for EEG-based emotion detection

AUDIO-VISUAL EMOTION RECOGNITION USING AN EMOTION SPACE CONCEPT

arxiv: v1 [cs.hc] 1 Jun 2016

Facial Expression Biometrics Using Tracker Displacement Features

Sleep Stage Estimation By Evolutionary Computation Using Heartbeat Data and Body-Movement

Face Analysis : Identity vs. Expressions

Formulating Emotion Perception as a Probabilistic Model with Application to Categorical Emotion Classification

ANALYSIS OF FACIAL FEATURES OF DRIVERS UNDER COGNITIVE AND VISUAL DISTRACTIONS

Towards an EEG-based Emotion Recognizer for Humanoid Robots

Recognition of Sleep Dependent Memory Consolidation with Multi-modal Sensor Data

Temporal Context and the Recognition of Emotion from Facial Expression

Comparison of Two Approaches for Direct Food Calorie Estimation

Validation of a Low-Cost EEG Device for Mood Induction Studies

International Journal of Research in Engineering, Science and Management Volume-1, Issue-9, September ISSN (Online):

Regression algorithm for emotion detection

Emotion Classification along Valence Axis Using ERP Signals

Blue Eyes Technology

Implicit Emotional Tagging of Multimedia Using EEG Signals and Brain Computer Interface

A micropower support vector machine based seizure detection architecture for embedded medical devices

Smart Sensor Based Human Emotion Recognition

Estimating Intent for Human-Robot Interaction

An Algorithm to Detect Emotion States and Stress Levels Using EEG Signals

Social Context Based Emotion Expression

Real-time Recognition of the Affective User State with Physiological Signals

Continuous Facial Expression Representation for Multimodal Emotion Detection

Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain

EEG-Based Emotion Recognition via Fast and Robust Feature Smoothing

Pattern Classification

Design of Intelligent Emotion Feedback to Assist Users Regulate Emotions: Framework and Principles

Classification of valence using facial expressions of TV-viewers

Mammogram Analysis: Tumor Classification

Feature Extraction for Emotion Recognition and Modelling using Neurophysiological Data

This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and

A Pilot Study on Emotion Recognition System Using Electroencephalography (EEG) Signals

DETECTING VIEWER INTEREST IN VIDEO USING FACIAL AND HEART RATE RESPONSES

Emotional State Recognition via Physiological Measurement and Processing

Signal Processing (journal article, SciVerse ScienceDirect)

Gender Based Emotion Recognition using Speech Signals: A Review

Skin color detection for face localization in human-machine

Emotion Classification in Arousal Valence Model using MAHNOB-HCI Database

Emotionally Augmented Storytelling Agent

Affective Priming: Valence and Arousal

This is a repository copy of Facial Expression Classification Using EEG and Gyroscope Signals.

Primary Level Classification of Brain Tumor using PCA and PNN

Audio-visual Classification and Fusion of Spontaneous Affective Data in Likelihood Space

Adaptation in Affective Video Games: A Literature Review

Identity Verification Using Iris Images: Performance of Human Examiners

Local Image Structures and Optic Flow Estimation

Panopto: Captioning for Videos. Automated Speech Recognition for Individual Videos

Statistical and Neural Methods for Vision-based Analysis of Facial Expressions and Gender

Predicting About-to-Eat Moments for Just-in-Time Eating Intervention

Emotion Recognition from Physiological Signals for User Modeling of Affect

Transcription:

Proceedings Chapter

Valence-arousal evaluation using physiological signals in an emotion recall paradigm

CHANEL, Guillaume, ANSARI ASL, Karim, PUN, Thierry

Abstract

The work presented in this paper aims at assessing human emotions using peripheral as well as electroencephalographic (EEG) physiological signals. Three specific areas of the valence-arousal emotional space are defined, corresponding to negatively excited, positively excited, and calm-neutral states. An acquisition protocol based on the recall of past emotional events was designed to acquire data from both peripheral and EEG signals. Pattern classification is used to distinguish between the three areas of the valence-arousal space. The performance of two classifiers has been evaluated on different feature sets: peripheral data, EEG data, and EEG data with prior feature selection. Comparison of the results obtained using either peripheral or EEG signals confirms the interest of using EEG to assess valence and arousal in emotion recall conditions.

Reference

CHANEL, Guillaume, ANSARI ASL, Karim, PUN, Thierry. Valence-arousal evaluation using physiological signals in an emotion recall paradigm. In: 2007 IEEE International Conference on Systems, Man and Cybernetics (SMC 2007): proceedings. Institute of Electrical and Electronics Engineers (IEEE), 2007. DOI: 10.1109/ICSMC.2007.4413638

Available at: http://archive-ouverte.unige.ch/unige:47741

Disclaimer: the layout of this document may differ from the published version.
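As a concrete illustration of the pattern-classification step described in the abstract, the sketch below compares two classifiers on peripheral features, EEG features, and EEG features with prior feature selection. It is a minimal sketch only, not the authors' code: the feature matrices, labels, classifier choices (Gaussian naive Bayes and LDA as stand-ins for the two unnamed classifiers) and the ANOVA-based feature selection are all assumptions for illustration, using scikit-learn.

```python
# Minimal sketch (not the authors' code): two generic classifiers evaluated on
# peripheral features, EEG features, and EEG features with prior selection.
# X_peripheral, X_eeg and y are hypothetical arrays standing in for the data
# described in the abstract (three valence-arousal areas as class labels).
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_trials = 90
X_peripheral = rng.normal(size=(n_trials, 18))  # e.g. GSR, heart rate, respiration, EMG features
X_eeg = rng.normal(size=(n_trials, 160))        # e.g. band-power features over EEG electrodes
y = rng.integers(0, 3, size=n_trials)           # 0 = calm-neutral, 1 = positively excited, 2 = negatively excited

feature_sets = {
    "peripheral": X_peripheral,
    "EEG": X_eeg,
    "EEG + selection": X_eeg,  # same data, but with feature selection in the pipeline
}
classifiers = {"NaiveBayes": GaussianNB(), "LDA": LinearDiscriminantAnalysis()}

for set_name, X in feature_sets.items():
    for clf_name, clf in classifiers.items():
        steps = [StandardScaler()]
        if set_name == "EEG + selection":
            # simple ANOVA-based selection as a stand-in for the paper's feature selection
            steps.append(SelectKBest(f_classif, k=20))
        steps.append(clf)
        pipe = make_pipeline(*steps)
        acc = cross_val_score(pipe, X, y, cv=5).mean()
        print(f"{set_name:16s} {clf_name:10s} accuracy={acc:.2f}")
```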

Valence-Arousal Representation of Movie Scenes Based on Multimedia Content Analysis and User's Physiological Emotional Responses

Mohammad Soleymani, Guillaume Chanel, Joep Kierkels, Thierry Pun
Computer Vision and Multimedia Laboratory, Computer Science Department, University of Geneva, Battelle Campus, Building A, Rte. de Drize 7, CH-1227 Carouge, Geneva, Switzerland
Phone: +41 (22) 379 0185, Mohammad.soleymani@cui.unige.ch

Keywords: multimodal indexing, affective personalization and characterization, emotion recognition and assessment, affective computing, physiological signal analysis.

1 Introduction

In this paper, we propose an approach for the affective representation of movie scenes based on the emotions that are actually felt by spectators. Affect, to be used as a multimodal indexing feature, is estimated from both physiological signals and multimedia content analysis. Emotions are not discrete phenomena but rather continuous ones. Psychologists therefore represent emotions or feelings in an n-dimensional space (generally 2- or 3-dimensional). The most widely used such space, adopted in the present study, is the 2D valence/arousal space [1]. Valence, in the range [0-1], represents the way one judges a situation, from unpleasant (negative emotion) to pleasant (positive emotion). Arousal, in the range [0-1], expresses the degree of excitement, from calm to exciting. The main goal of this work is therefore to automatically estimate the arousal and valence elicited by a movie scene, as affect indicators for the emotional characterization of these scenes. Since the affect of movie scenes can be represented by arousal and valence grades, we compared the self-assessed arousal and valence grades of the emotional content of the scenes with affective grades automatically estimated from physiological responses and from multimedia content analysis.

There are only a limited number of studies on multimedia content-based affective representation and understanding of movies, and these mostly rely on self-assessments or population averages to obtain the emotional content of movies [2; 3]. While affect characterization is usually based on self-assessment, we propose here to use physiological signals as a modality for emotion assessment. Physiological responses have the advantage of not interrupting users for self-reporting phases; furthermore, self-reports cannot capture dynamic changes, whereas physiological measurements make it possible to measure user responses dynamically. With the advancement of wearable systems for recording peripheral physiological signals, it is becoming more practically feasible to employ these signals in an easy-to-use human-computer interface and to use them for emotion assessment [4; 5].

As opposed to signals from the central nervous system (electroencephalograms), we therefore concentrated on peripheral physiological signals for assessing emotion, namely: galvanic skin response (GSR), blood pressure (from which heart rate was derived), respiration pattern, skin temperature, and electromyograms (EMG). In order to record facial muscle activity, we used EMG from the Zygomaticus major and Frontalis muscles.

2 Material and Methods

A video dataset of 64 movie scenes of different genres, each one to two minutes long, was created, from which content-based low-level multimodal features were determined. The scenes were extracted from the following movies: Saving Private Ryan (action, drama), Kill Bill, Vol. 1 (action), Hotel Rwanda (drama), The Pianist (drama), Mr. Bean's Holiday (comedy), Love Actually (comedy), The Ring, Japanese version (horror), and 28 Days Later (horror). Experiments were conducted during which physiological signals were recorded from spectators watching the clips. In order to obtain their self-assessed emotions, participants were asked to characterize each movie scene by arousal and valence grades using the Self-Assessment Manikin (SAM) [6]. To avoid participant fatigue, the protocol divided the show into two sessions of approximately two hours each, with 32 movie scenes presented in random order per session. Before each scene, a 30-second emotionally neutral clip was shown as a baseline. Eight healthy participants (three female and five male, from 22 to 40 years old) took part in the experiment, during which peripheral physiological signals and facial-expression EMG signals were recorded for emotion assessment. Examples of physiological signals recorded during a surprising scene are given in Figure 1.

After the experiment, three types of affective information about each movie clip were available from different modalities: multimedia content-based information extracted from the audio and video signals; physiological responses from the spectators' bodily reactions (due to the autonomic nervous system) and from the muscular activity underlying facial expressions; and self-assessed arousal and valence, used as ground truth for the true feelings of the spectator. Next, we aim at demonstrating how those true feelings about the movie scenes can be estimated, in terms of valence and arousal, from the information that is either extracted from the audio and video signals or contained within the recorded physiological signals.

Corresponding to the 64 video clips, a set of multimedia feature vectors was extracted, each vector composed of 64 elements (one per clip). Each feature vector thus highlights a single characteristic (e.g., average sound energy) across the 64 movie scenes. These features, such as audio energy, the motion component in video, and color variance, were chosen based on the existing literature, for example [1] and [2]. Regarding the physiological signals, the following affect-representative features were extracted: energy of the EMG signals; average and standard deviation of the GSR; skin temperature; the main frequency of the respiration pattern acquired from the respiration belt; heart rate; heart rate variability; and average blood pressure [3].
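To make the feature list above concrete, here is a minimal sketch (not the authors' code) of how such features could be computed from raw signal windows with NumPy/SciPy. The sampling rate, the peak-based beat detection on the blood pressure signal, and all function names are assumptions introduced for illustration.

```python
# Minimal sketch, not the authors' code: computing the physiological features
# listed above from raw 1-D signal arrays. The sampling rate and the use of
# scipy.signal.find_peaks as a simple beat detector are assumptions.
import numpy as np
from scipy.signal import find_peaks

FS = 256  # assumed sampling rate in Hz

def emg_energy(emg):
    """Energy of an EMG channel (e.g. Zygomaticus major or Frontalis)."""
    return float(np.sum(emg.astype(float) ** 2))

def respiration_main_frequency(resp, fs=FS):
    """Dominant frequency (Hz) of the respiration belt signal, via the FFT."""
    resp = resp - resp.mean()
    spectrum = np.abs(np.fft.rfft(resp))
    freqs = np.fft.rfftfreq(resp.size, d=1.0 / fs)
    return float(freqs[np.argmax(spectrum)])

def heart_rate_features(blood_pressure, fs=FS):
    """Mean heart rate (bpm) and heart-rate variability (std of inter-beat intervals, s)."""
    peaks, _ = find_peaks(blood_pressure, distance=int(0.4 * fs))  # at least 0.4 s between beats
    ibi = np.diff(peaks) / fs                                      # inter-beat intervals in seconds
    return 60.0 / ibi.mean(), float(ibi.std())

def extract_features(emg, gsr, temperature, respiration, blood_pressure):
    """One feature vector per clip, following the feature list in the text."""
    hr, hrv = heart_rate_features(blood_pressure)
    return {
        "emg_energy": emg_energy(emg),
        "gsr_mean": float(gsr.mean()),
        "gsr_std": float(gsr.std()),
        "skin_temperature": float(temperature.mean()),
        "resp_main_freq": respiration_main_frequency(respiration),
        "heart_rate": hr,
        "heart_rate_variability": hrv,
        "blood_pressure_mean": float(blood_pressure.mean()),
    }
```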

In order to select the features that are most relevant for estimating affect, the correlation between each single-feature vector and the self-assessed arousal/valence vector was determined. Only the features with a high absolute correlation coefficient with the self-assessments were retained for affect estimation. A linear regression of the features extracted from the physiological signals was used to estimate the degrees of arousal and valence; the same regression was applied to the multimedia features to estimate those degrees. In order to determine the optimum estimated points in the arousal/valence space (each corresponding to one particular emotion), the linear regression was computed by means of a linear relevance vector machine (RVM) from the Tipping RVM toolbox [7]. This procedure was performed four times, on the self-assessed arousal/valence and on the feature-estimated arousal/valence, to optimize the weights corresponding to: (1) physiological features when estimating valence; (2) physiological features when estimating arousal; (3) multimedia features when estimating valence; (4) multimedia features when estimating arousal.

3 Results and Conclusion

Physiological responses of the participants were recorded while they watched the baseline clips as well as the movie scenes. The key features were extracted from these responses after removing the average of the baseline physiological signals. After selecting the features with high correlation coefficients with the self-assessments, arousal/valence grades were estimated. Facial EMG signals and GSR were found to be highly correlated with the valence and arousal grades for all subjects. Concerning the multimedia content features, audio energy and visual features such as color variance were found to be the most relevant for affect characterization.

Table 1. Average Euclidean distance between estimated points and self-assessed points in the valence-arousal space, for participants 1 to 8.

Participant   Average distance, multimedia features   Average distance, physiological features
1             0.25                                    0.22
2             0.20                                    0.18
3             0.19                                    0.21
4             0.20                                    0.21
5             0.23                                    0.25
6             0.24                                    0.28
7             0.19                                    0.18
8             0.17                                    0.19

Figure 1. Physiological response (participant 2) to a surprising action scene. The following raw physiological signals are shown: respiration pattern (a), GSR (b), blood pressure (c), and Frontalis EMG (d). The surprise moment is indicated by an arrow.

A leave-one-out cross-validation was performed to evaluate the regression accuracy, using one sample (one scene out of 64) as the test set and the rest of the dataset as the training set. The accuracy of the estimated arousal/valence grades was evaluated by computing the Euclidean distance, in the valence-arousal space, between the estimated points and the self-assessments (ground truth).
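The following sketch illustrates the pipeline just described: correlation-based feature selection, a linear regression onto the self-assessed grades, leave-one-out cross-validation, and the Euclidean error in the valence-arousal space. It is an assumption-laden illustration, not the authors' implementation: scikit-learn's ARDRegression is used only as a sparse Bayesian stand-in for the linear RVM from the Tipping toolbox, the correlation threshold is arbitrary, and the data are synthetic.

```python
# Minimal sketch, not the authors' pipeline: correlation-based feature selection,
# linear regression of the retained features onto self-assessed grades,
# leave-one-out evaluation, and the Euclidean error in valence-arousal space.
import numpy as np
from sklearn.linear_model import ARDRegression  # stand-in for the linear RVM [7]
from sklearn.model_selection import LeaveOneOut

def select_by_correlation(X, y, threshold=0.3):
    """Keep features whose |Pearson correlation| with the self-assessments exceeds the threshold."""
    corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    keep = np.where(np.abs(corr) > threshold)[0]
    # fall back to the single most correlated feature if nothing passes the threshold
    return keep if keep.size else np.array([np.argmax(np.abs(corr))])

def loo_estimates(X, y):
    """Leave-one-out predictions of one affective dimension from one feature set."""
    preds = np.empty_like(y, dtype=float)
    for train, test in LeaveOneOut().split(X):
        cols = select_by_correlation(X[train], y[train])
        model = ARDRegression().fit(X[train][:, cols], y[train])
        preds[test] = model.predict(X[test][:, cols])
    return preds

# Hypothetical data: 64 scenes, physiological feature vectors, self-assessed grades in [0, 1].
rng = np.random.default_rng(1)
X_phys = rng.normal(size=(64, 12))
valence_true = rng.uniform(0, 1, 64)
arousal_true = rng.uniform(0, 1, 64)

valence_hat = loo_estimates(X_phys, valence_true)
arousal_hat = loo_estimates(X_phys, arousal_true)

# Estimation error = Euclidean distance between estimated and self-assessed points.
distances = np.sqrt((valence_hat - valence_true) ** 2 + (arousal_hat - arousal_true) ** 2)
print("average distance:", distances.mean())
```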

Valence, arousal, and the estimation error are all expressed in the normalized range [0-1]. Table 1 shows the average distance, i.e., the estimation error between the self-assessed points and the estimated points. The error distance mostly lies in the range [0-0.25], which is small enough to allow our estimations to be used for emotional annotation or tagging. The weaker affect estimation for some participants was often caused by inconsistent self-assessments (such as declaring high valence values for negative emotions). It was also observed that the error on valence estimation was lower than the error on arousal estimation. We now plan to use prior information about the movies, such as genre and ratings from movie databases, to improve the estimation accuracy.

Previous work in the field of emotion assessment has mostly focused on the discrete classification of basic emotions; in contrast, the method presented here allows a continuous determination and representation of emotion. Quite promising results were obtained regarding the accuracy of the multimodal affect estimation. This shows that the method could be used to automatically annotate and index multimedia data, for instance when designing personal emotion-based content delivery systems. Given a sufficiently precise affect estimation method, a similar strategy could also be applied to neuromarketing, where consumers' reactions to marketing stimuli could be predicted.

References

1. J. A. Russell and A. Mehrabian, "Evidence for a three-factor theory of emotions", Journal of Research in Personality, vol. 11, no. 3, pp. 273-294, 1977.
2. A. Hanjalic and L. Q. Xu, "Affective video content representation and modeling", IEEE Transactions on Multimedia, vol. 7, no. 1, pp. 143-154, 2005.
3. H. L. Wang and L. F. Cheong, "Affective understanding in film", IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 6, pp. 689-704, June 2006.
4. G. Chanel, K. Ansari-Asl, and T. Pun, "Valence-arousal evaluation using physiological signals in an emotion recall paradigm", in Proc. 2007 IEEE International Conference on Systems, Man and Cybernetics (SMC 2007), Montreal, October 2007.
5. J. A. Healey, "Wearable and Automotive Systems for Affect Recognition from Physiology", Ph.D. thesis, Massachusetts Institute of Technology, May 2000.
6. J. D. Morris, "Observations: SAM: The Self-Assessment Manikin - an efficient cross-cultural measurement of emotional response", Journal of Advertising Research, vol. 35, no. 6, pp. 63-68, 1995.
7. M. E. Tipping, "Sparse Bayesian learning and the relevance vector machine", Journal of Machine Learning Research, vol. 1, pp. 211-244, 2001.