A Linear Regression Model to Detect User Emotion for Touch Input Interactive Systems


2015 International Conference on Affective Computing and Intelligent Interaction (ACII)

A Linear Regression Model to Detect User Emotion for Touch Input Interactive Systems

Samit Bhattacharya
Dept. of Computer Science & Engineering, IIT Guwahati, Guwahati, India
samit3k@gmail.com

Abstract: Human emotion plays a significant role in affecting our reasoning, learning, cognition and decision making, which in turn may affect the usability of interactive systems. Detection of the emotion of interactive system users is therefore important, as it can help design for improved user experience. In this work, we propose a model to detect the emotional state of the users of touch screen devices. Although a number of methods have been developed to detect human emotion, those are computationally intensive and require setup cost. The model we propose aims to avoid these limitations and make the detection process viable for mobile platforms. We assume three emotional states of a user: positive, negative and neutral. The touch interaction is characterized by a set of seven features, derived from the finger strokes and taps. Our proposed model is a linear combination of these features. The model was developed and validated with empirical data involving 57 participants performing touch input tasks. The validation study demonstrates a high prediction accuracy of 90.47%.

We propose a very basic three-level categorization of emotional states: positive, negative and neutral. We further propose a set of seven features to characterize touch interaction: deviation in number of finger strokes, deviation in number of finger taps, average length of the finger strokes, average speed of finger strokes, average delay, total delay and turnaround time. We assume these features to be indirect cues to the user's emotional state. Our proposed model is a linear combination of these features. With the linear model, we can predict the user's emotional state as one of the three basic categories.

We conducted empirical studies to develop and validate the model. The studies were conducted in two parts.
In the first part, we collected data from 36 participants for touch interaction tasks on 10 tablets. The data were used to perform regression analysis, to determine the linear combination of the features. In the second part, we collected data from 21 participants. These data were used to ascertain the efficacy of the proposed model. The validation study shows that the proposed model is able to achieve a prediction accuracy of 90.47%.

Keywords: emotion; touch input; finger strokes and taps; linear regression; empirical study

I. INTRODUCTION

In recent years, there has been a significant growth in the use of touch input devices, primarily due to the availability of smart phones and tablets at affordable prices. The users of such devices come from varied socio-economic and cultural backgrounds, with wide variations in their profiles. Consequently, the role of HCI in the design of interfaces for touch input devices is very important. The emotional state of an individual may have an impact on human factors [16] affecting usability, the primary concern in HCI. As a result, we need to detect user emotion so as to design interfaces with improved usability.

The proposed model, along with the details of the empirical studies, is described in this paper. The paper is organized as follows. In Section II, we present the related literature. This is followed by the description of the proposed approach, including the feature set, in Section III. The empirical studies are presented in Section IV, followed by discussion (Section V) and conclusion (Section VI).

II. RELATED WORK

Emotion and its role in computing system design and application has become a much studied area in recent years [18]. It is often found to be the driving force behind motivation. Therefore, HCI researchers have attempted to integrate the theory of emotions with usable system design [1]. There is a substantial body of literature available in the field of emotion recognition. These works mostly involve computer vision and image processing techniques, which are computationally very expensive. In addition, such methods also require additional setups (hardware/software).
In this work, we propose an approach to detect emotion for individuals using touch input systems. The method we propose relies on users' touch interaction behaviour (strokes and taps). As a result, the proposed method does not require any extra setup, and the computations involved are also much lighter. Hence, the proposed approach is expected to be more suitable for mobile touch input devices, which have limited computational resources.

978-1-4799-9953-8/15/$31.00 ©2015 IEEE

There are broadly two ways of representing emotions: the discrete model [6] of emotion and the continuous model [19]. The former view posits that emotions are discrete, measurable and physiologically distinct. According to this model, there are basic emotions such as happy, sad, angry, scared, tender and excited. Any emotional state can be considered to be a sub-state of one of the basic emotional states. The continuous model, on the other hand, represents emotion as a point in a two-dimensional space of valence and arousal: the x-axis represents the valence and the y-axis represents the arousal values.

Based on these two models, several works on emotion detection have been reported. Ekman et al. [7] proposed the theory of emotion by facial expression, which had a significant effect on later work on emotion detection (e.g., [2], [11], [13]). Body movements and gestures also provide cues to emotion. Glowinski et al. [9] reported on GEMEP (Geneva multimodal emotion portrayals), relating human upper-body movements to affective states. Bianchi-Berthouze and Kleinsmith [3] formalized a general description of posture based on angles and distances between body joints, and used it to create an affective posture recognition system using an associative neural network. Emotion detection from full body movements was reported by Kapur et al. [12] and Camurri et al. [4].

Researchers have also made efforts to use physiological signals for emotion detection. Such signals include the electrooculogram (EOG), galvanic skin response (GSR), heart rate (HR), electrocardiogram (ECG) and eye blinking rate (EBR). Takahashi [21] recorded EEG and peripheral physiological signals like pulse and skin conductance for recognition of five basic emotions. Koelstra et al. [15] recorded multiple physiological signals such as EOG, HR, GSR, EBR and EEG to detect emotions. In AlZoubi et al. [1], three physiological signals, namely EMG (electromyography), ECG and GSR, were recorded and used for detection of the user's affective states. Hazlett [10] reported the use of the EMG signal to measure positive and negative emotional valence during interactive experience. Soleymani et al. [20] worked on using eye gaze data for emotion detection.

In this work, we aim to detect emotion for touch input users. The primary devices for such users are smart phones and tablets (i.e., small handheld devices). These devices have limited computing power. In contrast, the major approaches towards emotion detection require significant computation.
Moreover, many of those approaches require additional hardware setups such as video sensors, sensors to measure physiological signals, eye trackers and so on. These extra setups may be costly, may not be convenient to use considering the mobility of the targeted devices, and may not be supported by the devices at all. As a result, we need to come up with techniques that do not require extra setup or significant computation.

We propose to use touch interaction characteristics, namely the strokes and taps, to predict emotion. We assume that these provide an indirect indication (cue) of the user's affective state. Since we are not collecting any other input, no extra setup is required. We found very few works in this direction. Khanna and Sasikumar [14] have shown that emotion can be recognized from keystroke patterns, especially the frequency of some special keys (e.g., spacebar and backspace). Epp et al. [5] have also tried to identify emotional states using keystroke dynamics. However, these works were aimed at detecting emotion for desktop computer users. Gao et al. [8] studied emotion detection from touch information, in the limited context of game playing on an iPod. In this work, we propose a predictive model for emotion detection, which works based on touch information and requires much less computation compared to the other methods. The proposed approach is described next.

III. PROPOSED APPROACH

Our proposed approach is based on the discrete model of emotion. We assume three broad emotional states, namely positive, negative and neutral. Each of these broad states represents a set of basic affective states. The positive state encompasses the happy, excited and elated emotions. Sad, anger, fear and disgust are represented by the negative state. The neutral state represents the calm, relaxed and contented emotions. Given a user's touch interaction behavior (in terms of finger strokes and taps), we propose to map his/her affective state of mind into one of these three broad states. In order to do that, we designed a set of seven features based on the strokes and taps.
The feature set and the mapping approach are discussed in the following sections.

A. Proposed Feature Set

There are three actions during a touch interaction: down, up and move. A down action signifies the time instance at which the finger touches the screen. Likewise, an up action is the time instance when the finger is released. After a down action, if the finger moves on the screen without an up action, we call it a move action. These three actions can be used to define two touch interaction characteristics: stroke and tap. A tap is a combination of down and up actions, whereas a stroke is a combination of down, up and move actions. In practice, however, we may not be able to have a perfect tap: there will always be some small movement of the finger, although we intend to avoid it. Hence, we differentiate between the two based on the stroke length. If the length is less than or equal to a specified limit, we designate it a tap; otherwise, a stroke. On the basis of the concepts of stroke and tap, we propose the following seven features.

1. Average stroke length.
2. Average stroke speed.
3. Deviation in number of strokes: this is basically the difference between the actual number of strokes made to perform a task and the minimum required.
4. Deviation in number of taps: similar to the above, but in terms of taps.
5. Total delay: delay is the time lag between the completion of the current touch action (stroke or tap) and the start of the next one. We can determine the delay between two consecutive touch actions by taking the difference between the up action time (i.e., finishing) of the current touch action and the down action time (i.e., starting) of the next touch action. We add up these values for all the touch actions to get the value of the feature.
6. Average delay.
7. Turnaround time: the total time taken to complete a task. It is calculated by subtracting the down time of the first touch action of the task from the up time of the last touch action of the task.
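As an illustration, the seven features above can be computed from a logged sequence of touch actions roughly as follows. This is a sketch under our own assumptions about the log format (one tuple per action with down time, up time and movement length); the tap-length limit and the minimum-action counts are illustrative values, not taken from the paper.

```python
# Sketch: computing the seven features from one task's touch log.
# Assumed log format: a time-ordered list of (down_time, up_time, length)
# tuples, one per touch action. TAP_LIMIT is a hypothetical threshold.

TAP_LIMIT = 10.0  # max movement for an action to count as a tap (assumed)

def extract_features(actions, min_strokes, min_taps):
    strokes = [a for a in actions if a[2] > TAP_LIMIT]
    taps = [a for a in actions if a[2] <= TAP_LIMIT]

    avg_len = sum(a[2] for a in strokes) / len(strokes) if strokes else 0.0
    # speed of a stroke = length / duration (duration = up - down)
    avg_speed = (sum(a[2] / (a[1] - a[0]) for a in strokes) / len(strokes)
                 if strokes else 0.0)
    dev_strokes = len(strokes) - min_strokes
    dev_taps = len(taps) - min_taps

    # delay = gap between the up of one action and the down of the next
    delays = [b[0] - a[1] for a, b in zip(actions, actions[1:])]
    total_delay = sum(delays)
    avg_delay = total_delay / len(delays) if delays else 0.0

    turnaround = actions[-1][1] - actions[0][0]  # last up minus first down

    return [dev_strokes, dev_taps, avg_len, avg_speed,
            avg_delay, total_delay, turnaround]
```

The function returns the features in the order deviation in strokes, deviation in taps, average length, average speed, average delay, total delay, turnaround time, matching the notation f_1 to f_7 used later.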
We are using these features based on the assumption that the emotional state has a role to play in inducing error behavior, the speed of strokes, the delay between two consecutive touch actions and the total task completion time. For example, a person in an excited state is more likely to make some errors than someone in a calm state. Therefore, we can expect different values for these features for different emotional states. In other words, the features act as indirect cues (indicators) to the emotional states.

B. Proposed Model

Our proposed model is a linear combination of the feature values. In order to develop the model, we assumed that the feature values are mapped along the x-axis and the emotional state of an individual along the y-axis in a two-dimensional space. For the three states, we assigned three distinct ranges of values along the y-axis. These values were obtained from empirical data. We then established a linear relationship between the features and the emotional states, using the linear regression technique. The linear relation assumes the following form (Eq. 1), for each feature f:

y = A + B * f    (1)

We propose three different relations, one for each of the positive, negative and neutral states. These relations constitute the components of the model. The relationships were established with the following approach. We start by taking individual features one at a time for a particular emotional state. For example, let us consider the feature average delay for the affective state positive. For this feature, we plotted the data points and performed regression analysis. This procedure is repeated for each of the remaining features. In this way, we obtained a set of seven linear equations, one for each feature, for each of the emotional states. Next, we combined these seven equations to come up with a single equation, of the form shown in Eq. 2, for a particular emotional state:

y = Σ_{i=1}^{7} (A_i + B_i * f_i)    (2)

We then obtain the final form of the model by adding to the right hand side of Eq. 2 a constant value, which is unique to a particular emotional state. Thus, we obtain three different linear relations, one for each of the three emotional states, as shown in Eqs. 3-5.
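As an aside, the per-feature fitting and combination step can be sketched as follows. This is a minimal illustration using ordinary one-dimensional least squares; the data values used here are illustrative, not the paper's empirical measurements.

```python
# Sketch: fit y = A + B * f for a single feature by ordinary least
# squares, then combine seven per-feature fits in the Eq. 2 form.

def fit_line(xs, ys):
    """Return (A, B) minimizing sum((y - (A + B*x))^2)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)          # assumes xs not constant
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    B = cov / var
    A = my - B * mx
    return A, B

def combined_model(per_feature_fits, feature_vector, C):
    """Eq. 3-5 form: y = sum over i of (A_i + B_i * f_i), plus constant C."""
    return sum(A + B * f
               for (A, B), f in zip(per_feature_fits, feature_vector)) + C
```

For example, `fit_line([0, 1, 2], [1, 3, 5])` recovers A = 1 and B = 2 for points lying exactly on y = 1 + 2x.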
y_pos ∈ [POS], y_pos = Σ_{i=1}^{7} (A_i + B_i * f_i) + C_POS    (3)

y_neg ∈ [NEG], y_neg = Σ_{i=1}^{7} (A_i + B_i * f_i) + C_NEG    (4)

y_neut ∈ [NEU], y_neut = Σ_{i=1}^{7} (A_i + B_i * f_i) + C_NEU    (5)

In the above set of equations, there are several parameters. These include the numeric ranges for each emotional state ([POS] for positive emotion, [NEG] for negative emotion and [NEU] for neutral emotion) and the constants A_i, B_i and C. These parameters were determined empirically, as described next.

IV. EMPIRICAL STUDY

A. Experimental Setup

In the empirical study, we collected touch usage data from 36 participants. The data were collected with 10 Aakash™ tablets running the Android (Gingerbread) OS, version 4.0.3. We had developed an Android app for data collection. The app was developed in Eclipse™ using the Android Development Kit (ADK). The app contained seven general tasks, which required finger strokes and taps to execute. We also estimated the minimum number of strokes and taps required to perform each task. The tasks were chosen since they represent typical functionalities of touch devices. The tasks are listed below.

1. Set a reminder for a specific date and time. Also type a reminder message.
2. There is a list of 30 songs in alphabetical order and the user has to play a particular song, which is 5th from the last.
3. Type 'hello' on the message screen and select the send button.
4. There is a list of 25 contacts in the phone directory. The user has to call one of them, which is 5th from the bottom.
5. Draw a pattern, e.g., IIT LIFE, on a blank canvas.
6. There is a list of 25 contacts in the phone directory. The user has to edit one contact, which is the second last in the list.
7. Type 'google' in the URL box and select the search button.

Usually during touch interaction, we have to choose the desired app icon from a set of app icons on the screen. Sometimes, the required icon may not be on the current screen. In that case, we have to change screens. In order to mimic this behavior, we designed our app with four screens. In each screen, 15 icons were shown in grid view, with a total of 60 icons.
Among them, only seven icons were activated for the tasks; the rest were dummy icons. A few task icons were placed on the first (main) screen itself. However, icons for some of the tasks could be accessed only through screen changes. The app captured and stored the finger down time, up time and stroke length in a log file.

B. Participants

We selected a group of 36 male participants in the age group of 20 to 26. They were undergraduate and postgraduate students. The participants were chosen on the basis of their familiarity with touch devices. All of them were regular users of touch devices (smart phones and tablets). The participants were further divided equally into three sub-groups (each having 12 participants) corresponding to the three emotional states. Participants belonging to a sub-group provided data for the corresponding emotion only.

C. Procedure

We divided the data collection study into five stages: (1) training, (2) intentional emotion changing, (3) self-assessment questionnaire, (4) actual data collection, and (5) self-assessment questionnaire. During the training session, participants were familiarized with the app. The app training included introducing the participants to the active task icons, the steps required to locate those icons in the four screens and the steps for executing the seven tasks. They were given some dummy tasks to perform for the purpose. Training sessions lasted for about 10-15 minutes. Each participant was provided with a volunteer ready to help at any stage.

In order to collect data for the positive and negative states (from those participants whom we put into those sub-groups), we used a method to bring a participant to one of these states. Usually, it is difficult to change one's mental state from positive to negative or vice versa; changing the mental state from neutral to positive or neutral to negative is much easier. Therefore, we performed an initial screening of the participants and selected those whose mental states were likely to be neutral. We interviewed them about the activities they performed prior to coming for the test, and made the judgment based on their responses. For example, if a participant said that he had been playing football and scored a difficult goal, he was likely to be in an excited state, so we decided not to collect data from him at that point of time. On the other hand, if someone reported listening to devotional songs before joining the experiment, or sleeping, we assumed him to be in a neutral state and included him in the study. For taking the user to a particular emotional state, we defined some intentional emotion changing dummy tasks. A participant took around thirty to forty minutes to carry out the tasks. We informed the participants beforehand that the tasks were likely to trigger a change in emotional state and obtained their informed consent.
We brought a participant into a positive emotional state by showing them funny videos, comedy videos and comedy scenes from YouTube. The videos were selected on the basis of viewers' ratings. These tasks were expected to induce the happy emotion in the participants, which is a part of the positive emotional state. We also set up some SGT puzzles¹, of which we chose UNTANGLE, to make participants excited, another positive emotion. In addition, we made use of some inspirational videos and motivational tasks to trigger the positive emotion in the participants.

We used a slow-response device to change participants' emotional state to negative. On this device, we had defined a task that took nearly 30 minutes to complete. Due to its slow response, the participants were expected to become angry and frustrated, triggering a negative emotional state. We also showed some explicit videos related to poverty, malnutrition and post-war trauma, to induce the negative emotion among the participants. Participants were also asked to get 25 points within 60 seconds playing the SGT puzzle, which was beyond their abilities. This was expected to trigger the negative emotional state.

¹ Simon Tatham's portable puzzle collection. http://www.chiark.greenend.org.uk/~sgtatham/puzzles/

The previous step was followed by a self-assessment questionnaire. It helped us decide whether that particular participant had come into the desired emotional state. There were eight Yes/No type questions in the self-assessment questionnaire, which are listed below.

[1] Are you happy or excited?
[2] Are you enjoying yourself?
[3] Do you want to leave this room?
[4] Are you interested in doing the same thing you did in the last 20 minutes?
[5] Would you like to listen to a joke?
[6] Would you like to solve a puzzle?
[7] Would you like to sing a sad song?
[8] Are you sad, angry or frustrated?

We determined the effectiveness of the emotion changing tasks by evaluating the answers to the above questions. Questions 1, 2, 4, 5 and 6 focus on positive aspects of mind like happiness, excitement, enjoyment and affinity towards a rewarding thing.
If a participant answered YES to all these questions and NO to the other questions, we concluded that he was in the positive emotional state. Similarly, YES to questions 3, 7 and 8 and NO to the other questions led us to conclude his emotional state to be negative. We repeated the emotion changing tasks if the self-assessment questionnaire did not indicate a change in the participant's emotion to the desired state.

In the fourth stage, we asked the participants to perform the seven tasks on the tablet in a single session, which took between 3-5 minutes for a participant to complete. The order of the tasks was counterbalanced (changed for each participant) to take into account the learning effects (if any). After the data collection stage, we administered the same self-assessment questions to the participants, to ensure that the emotional states during data collection were the desired ones. If we found that a participant's emotion after the tasks was different than before, we repeated the steps again.

D. Result and Analysis

Our first objective was to identify the numeric ranges that characterize each emotional state, with the constraints that (a) the ranges should not overlap, and (b) the ranges should be such that any feature vector (the set of seven feature values) can map to only one of the three ranges. We performed several trials and errors with various ranges to fit the empirical data subject to the satisfaction of the constraints. The empirical data consisted of three sets of feature vectors for the three emotional states, each set having 12 vectors. We chose the three unique non-overlapping ranges that satisfied the constraints and gave the closest match with the empirical data.
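The constraint check applied during this trial-and-error fitting can be sketched as follows. The helper below is illustrative: it verifies conditions (a) and (b) for a candidate set of ranges against a list of computed y-values.

```python
# Sketch: check the two constraints on three candidate ranges:
# (a) the ranges are pairwise non-overlapping, and
# (b) every y-value falls in exactly one range.
# Range endpoints passed in are candidates, not the fitted values.

def in_range(y, rng):
    lo, hi = rng
    return lo <= y <= hi

def ranges_valid(ranges, y_values):
    # (a) pairwise non-overlapping intervals
    for i in range(len(ranges)):
        for j in range(i + 1, len(ranges)):
            lo_i, hi_i = ranges[i]
            lo_j, hi_j = ranges[j]
            if not (hi_i < lo_j or hi_j < lo_i):
                return False
    # (b) each y maps to exactly one range
    return all(sum(in_range(y, r) for r in ranges) == 1 for y in y_values)
```

A candidate triple of ranges would be accepted only if `ranges_valid` returns True for all training y-values.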

The three ranges we obtained through this approach are as follows.

1) Positive emotion range: [50, 105]
2) Negative emotion range: [1, 12]
3) Neutral emotion range: [25, 36]

In order to obtain the other parameters (the A's, B's and C's of Eqs. 3-5), we used an elaborate assignment-based approach. Let us consider the positive emotional state for illustration. We had 12 feature vectors corresponding to the 12 participants for this state. The emotion of each of the participants was assigned a number in [50, 105] (i.e., the positive range). The assignments started at 50, were separated by 5 and ended at 105. For example, if one participant is assigned the value 50, the next is assigned 55, the next 60, and so on till 105. With one such assignment, we estimated the constants A's, B's and C's through linear regression. We then reassigned the numbers to the participants (e.g., the participant who was assigned 50 was re-assigned 55, and so on) and re-estimated the constants. The process was repeated 12 times, corresponding to the 12 participants. The final values of the A's, B's and C's were obtained by taking the average of all 12 estimates. We applied a similar procedure to determine the constants for the other two emotional states; the only minor difference was that the assigned numbers were separated by 1 in those cases, rather than 5. The final estimated values are shown in Table I. The constants C of Eqs. 3-5 were estimated as 7.5, 6.5 and 30.5 for the positive, negative and neutral emotions respectively (through the trial and error method, satisfying the two constraints mentioned before). Thus, we obtained Eqs. 6-8 as our final proposed model, where the B values are taken from Table I for the corresponding f_i:

y_pos = 13.91 + 0.14 * Σ_{i=1}^{7} B_i * f_i + 7.5    (6)

y_neg = 13.91 + 0.14 * Σ_{i=1}^{7} B_i * f_i + 6.5    (7)

y_neut = 13.91 + 0.14 * Σ_{i=1}^{7} B_i * f_i + 30.5    (8)

TABLE I. THE MODEL CONSTANTS ESTIMATED FROM EMPIRICAL DATA.
Feature                     Notation   A        B
Deviation in strokes        f_1        34.58    0.25
Deviation in taps           f_2        36.26    0.54
Average length of strokes   f_3        21.23    -0.02
Average speed of strokes    f_4        -26.54   91.9
Average delay               f_5        20.13    0.0
Total delay                 f_6        15.59    0.0
Turnaround time             f_7        15.59    0.0

After we obtained Eqs. 6-8, we recomputed the y values for the given feature vectors and refined the emotion ranges. The refined ranges we obtained are as follows.

1) Positive emotion range: [97.38, 102.21]
2) Negative emotion range: [25.38, 30.7]
3) Neutral emotion range: [48.99, 53.89]

Therefore, our proposed model consists of Eqs. 6-8 along with the three refined ranges mentioned above.

V. DISCUSSION

In order to ascertain the validity of the proposed model, we performed further empirical studies. The experimental setup was the same, including the devices, app and tasks. The interaction data of finger down time, finger up time and stroke length were logged, as before. We selected a new set of 21 participants for this study, in the age group of 22-26. They were postgraduate male students with good exposure to touch devices. The participants were divided into 3 groups of 7 each, corresponding to the three emotional states. Data were collected using the same procedure as before. The feature vectors obtained at the end of data collection were used to predict the state of the corresponding participant. We matched these predictions with the (known) emotional states of the participants. The proposed model was able to correctly predict 19 out of 21 cases. Hence the accuracy of the model was found to be 90.47%.

The empirical study results show that the model can predict with reasonably high accuracy, which makes it suitable for practical use. Moreover, the features are very simple to compute. The model therefore does not require any additional resources or setup for prediction. Hence, we believe that the proposed model offers a suitable alternative for predicting the emotional state of touch screen users. The model can be used as follows.
For a touch input user, we get a feature vector containing the seven feature values. We use this vector to compute the y values with Eqs. 6-8. The range in which the y value lies indicates the user's emotional state (positive, negative or neutral). Once we are able to detect the user's emotional state from the interaction with the device, we can change the look and feel of the interface to complement that state. We can also change the way tasks are performed depending on the current state of user emotion. This may lead to polite interfaces, which are empathic. Such qualities, in turn, are expected to improve usability and enhance user experience.

In order to develop the model, we assumed that the seven features provide indirect cues to the emotional state. The high accuracy observed during the empirical validation indicates the validity of this assumption. We also assumed that all seven features are necessary for computing the user's emotional state. Since the features are very simple to compute, any reduction in the feature set may not lead to a significant reduction in computation. Hence, we did not perform a separate study to decide whether the same prediction results can be obtained with a smaller number of features. A third important assumption we made was that all emotional states can be clustered into three

broader states (positive, negative and neutral). It is true that there are various models for representing emotions, as mentioned in the related works section. However, we think it is reasonable to consider emotions as belonging to one of the three broader classes. This assumption helps in the development of a simpler prediction model, compared to the more complex models found in the literature. The validation results give an indirect justification for this assumption. We also assumed that we can induce emotions following the method we used in the empirical study. It may be noted that the emotion inducement process we developed is a unique approach, not found in the literature. The model parameters were estimated from the empirical data collected following the steps that include the emotion-evoking tasks and the self-assessment questionnaires. Since the validation results show high prediction accuracy, we believe this particular approach is justified. In summary, we believe that the assumptions are reasonable, based on the empirical validation of the model.

It may also be noted that the model was built by first obtaining the numeric ranges for each emotional state through trial and error. This was followed by regression analysis assuming those ranges to be correct. The initial ranges were then finalized by applying the model to the training vectors. The empirical validation results demonstrate that the final ranges and the model obtained through this process are reliable predictors with a high degree of accuracy. However, it may be necessary to base the justifications on a firmer theoretical foundation before we can reach any conclusion about the validity of those assumptions and the method we adopted for data analysis. We plan to work in this direction in future. The empirical data were obtained from participants who represent a homogeneous group (male postgraduate students within the age group of 22-26). Moreover, the number of participants was modest (57). Therefore, the results we obtained may be termed indicative.
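The build procedure recapped above — rotating the target-number assignments, fitting by regression, averaging the estimates, and then refining the ranges by rescoring the training vectors — can be sketched as follows. This is a minimal illustration under stated assumptions: per-feature univariate fits (one A_i, B_i pair per feature, as in Table I), the reconstructed scoring form y = 13.91 + 0.14 * Sum_i B_i * f_i + C, and randomly generated stand-in feature vectors rather than the authors' actual data:

```python
import numpy as np

def estimate_constants(F, targets):
    """Rotating-assignment estimation: for each cyclic shift of the target
    values, regress the shifted targets on each feature column separately,
    then average the per-rotation intercepts (A_i) and slopes (B_i)."""
    n, k = F.shape                      # n participants, k features
    A, B = np.zeros(k), np.zeros(k)
    for r in range(n):                  # one rotation per participant
        y = np.roll(targets, r)         # cyclically reassign the numbers
        for i in range(k):
            slope, intercept = np.polyfit(F[:, i], y, 1)
            A[i] += intercept
            B[i] += slope
    return A / n, B / n

def refine_range(F, B, C):
    """Refined range for one emotion class: min and max of the recomputed
    y values (Eqs. 6-8 form) over that class's training feature vectors."""
    ys = 13.91 + 0.14 * F @ B + C
    return float(ys.min()), float(ys.max())

# Hypothetical stand-in for the 12 positive-state participants, 7 features
rng = np.random.default_rng(0)
F_pos = rng.random((12, 7))
targets_pos = np.arange(50, 106, 5)     # 50, 55, ..., 105
A_pos, B_pos = estimate_constants(F_pos, targets_pos)
lo, hi = refine_range(F_pos, B_pos, C=77.5)
```

The negative and neutral states would follow the same routine with `np.arange(1, 13)` and `np.arange(25, 37)` as targets and their respective C constants.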
We plan to conduct more empirical studies with a larger number of heterogeneous participants before coming to any conclusion about the model's predictive power.

VI. CONCLUSIONS

We proposed an empirically derived linear model to predict the affective state of touch input users. The model predicts the user's affective state as one of three classes: positive, negative and neutral. The model parameters were estimated from empirical data, and empirical validation indicates high prediction accuracy of the proposed model. In order to refine and improve the model further, we plan to work on the theoretical justification for the various assumptions made in the model, on model refinement and validation with more empirical data from a larger, heterogeneous group of users, and on the design of interfaces and interactions that complement the user's emotional state.

ACKNOWLEDGMENT

We thank Sachin Shah, Panthadeep Bhattacharjee and the volunteers who helped us with the experimental setup and data collection.

978-1-4799-9953-8/15/$31.00 ©2015 IEEE