Unveiling the Multimedia Unconscious: Implicit Cognitive Processes and Multimedia Content Analysis


Marco Cristani (1), Alessandro Vinciarelli (2,3), Cristina Segalin (1), Alessandro Perina (4)
(1) University of Verona (Italy)
(2) University of Glasgow (UK)
(3) Idiap Research Institute (Switzerland)
(4) Microsoft Research (USA)

ABSTRACT
One of the main findings of cognitive sciences is that automatic processes of which we are unaware shape, to a significant extent, our perception of the environment. The phenomenon applies not only to the real world, but also to multimedia data we consume every day. Whenever we look at pictures, watch a video or listen to audio recordings, our conscious attention efforts focus on the observable content, but our cognition spontaneously perceives intentions, beliefs, values, attitudes and other constructs that, while being outside of our conscious awareness, still shape our actions and behavior. So far, multimedia technologies have neglected such a phenomenon to a large extent. This paper argues that taking into account cognitive effects is possible and it can also improve multimedia approaches. As a supporting proof-of-concept, the paper shows not only that there are visual patterns correlated with the personality traits of 300 Flickr users to a statistically significant extent, but also that the personality traits (both self-assessed and attributed by others) of those users can be inferred from the images these latter post as favourite.

1. INTRODUCTION
Until a few years ago, production and diffusion of multimedia data required skills and infrastructure that were the privilege of a few individuals and organizations (archives, digital libraries, online repositories, etc.) [38]. Nowadays, technologies as ubiquitous and user-friendly as smartphones and tablets allow one to easily create multimedia material (pictures, videos, soundbites, text and their combinations) and share it with others - typically through social media or other online technologies - by simply pushing a button. In such a technological landscape, multimedia data is not just a way to transmit knowledge and information - as it used to be traditionally for any type of data [5, 41] - but one of the channels through which we interact with others.

The core idea of this article is that, in such an unprecedented scenario, the exchange of multimedia data has become a form of human-human communication and, therefore, it should involve the cognitive phenomena typically observed in human-human interactions. This applies in particular to implicit cognitive processes that take place outside our conscious awareness, but still shape to a large extent our perception of the world and our behavior [23], namely the tendency to express and attribute to others goals, values, intentions, traits, beliefs and any other type of socially relevant characteristics [49].

To have a measure of how much multimedia data have become a means of communication between people, it is sufficient to consider a few statistics available on Youtube at the moment this article is being written: while uploading every day 12 years of video material, Youtube users access the popular on-line platform one trillion times per year, an average of 140 visits per person on Earth (the figure refers to 2011). In other words, there seems to be no multimedia sample produced by one person that is not consumed by someone else. Unlike a mere few years ago, creation, diffusion and sharing of multimedia data is no longer the exclusive prerogative of skilled professionals, but the everyday practice of the lay person. Multimedia data are no longer, or no longer exclusively, the carefully crafted product of creativity and communication skills, but the spontaneous expression of common individuals involved in everyday social interactions.

The problem is that our cognitive processes are the result of a long evolutionary history and cannot change at the pace of technology. Therefore, our cognition keeps following patterns that were shaped during times when technology was far from existing [32].
In particular, a large body of evidence shows that our cognition constantly works to make sense of the world around us and that this happens, to a large extent, effortlessly, and even unintentionally [48]. This means that the information we gather and process through our conscious attention - the typical realm of current multimedia technologies - is only one of the factors that drive our actions towards the environment, the others being implicit, even automatic processes: implicit attitudes, inferences, goals and theories, and the affect and behaviors they produce [49], where the word implicit means outside our conscious awareness.

To the best of our knowledge, multimedia approaches neglected so far to a large, if not full, extent the phenomena above. Most of the current technologies take into account only information that can be automatically detected in the data (e.g., objects in pictures) or inferred from it (e.g., genre from music). The few attempts to take into account implicit cognitive processes focus on observable effects, including emotional, behavioral and physiological reactions, used, e.g., for retrieval [1] and tagging purposes [36]. However, such reactions might be difficult to detect, especially in settings where social norms impose behavioral limitations (e.g., public spaces). Furthermore, observable reactions are nothing else than effects that follow the actual changes in the user, namely those that concern implicit attitudes, inferences, goals and theories (see above).

The possible solution to such a state of affairs is that cognitive changes behind observable reactions are not random, but tend to follow, according to the Brunswik Lens [4], stable and predictable patterns. The Brunswik Lens, one of the most effective models developed in cognitive psychology, provides a framework suitable for investigating how multimedia data can be adopted as an observable evidence of attitudes, inferences, goals and theories (see above) of data producers. Symmetrically, the model helps to explain how data consumers attribute attitudes, inferences, goals and theories to data producers.

This paper shows that cognitive effects are detectable at least in the case of the interplay between Flickr pictures and personality traits of Flickr users. The results, obtained over PsychoFlickr (a novel image dataset of 60,000 images posted by 300 individuals), show not only that there is a statistically significant correlation between personality traits of the users and features extracted from the images they posted, but also that the same features are correlated with the traits that picture observers assign to the users, even if observers and users have never been in contact. Furthermore, both associations are sufficiently stable to be learned by supervised statistical classifiers. This opens up to a set of applications like, e.g., automatic attribute prediction: given a pool of images, the goal is to infer personal characteristics of its owner. This goal is performed here by projecting images on low-dimensional manifolds and exploiting sparse regression.
The rest of this paper is organized as follows: Section 2 surveys research trends relevant to this work, Section 3 describes the Brunswik Lens model, Section 4 presents the dataset used for the experiments, Section 5 reports on experimental evidence supporting the core-idea of this paper, Section 6 presents application domains that can benefit from this work and the final Section 7 draws some conclusions.

2. NEIGHBORING AREAS
The key-idea of this article is that the exchange of multimedia data has become a form of human-human communication and, therefore, it should give rise to the same cognitive phenomena (e.g., see [48, 49]) typically observed in any human-human interaction. To the best of our knowledge, this article is the first attempt to adopt such a perspective in multimedia technologies. However, several domains consider neighboring issues that, while being different from the ones proposed in this article, still include aspects relevant to this work.

The application of the sociotechnical perspective in studying the use of digital libraries - until a few years ago the most common infrastructure for the exchange of multimedia data - is one of the earliest attempts to take into account social issues in technological applications: "To understand, use, plan for and evaluate digital libraries, we need to attend to social practice, which we define as people's routine activities that are learned, shaped, and performed individually and together" [38]. The main difference between the sociotechnical perspective and the research direction proposed in this work is that the former focuses on use and usability issues (especially in professional and institutional settings) while the latter targets the communication between individuals, a step made possible only by recent technologies (social media, mobile devices, etc.).

In parallel, several efforts were made to improve multimedia technologies by automatically detecting and understanding emotional, behavioral and physiological reactions of data consumers (e.g., if a person watching a video laughs, then the video can be tagged as "funny") [1, 25, 36]. The core-idea of these trends is that the content of the data produces observable changes in data consumers, so that the observation of these latter provides information about the data.
The main difference with respect to this article is that the accent is on the data content, like in most of the multimedia technologies, and not on the communication process underlying the data exchange between individuals.

More recently, some works investigated the interplay between observable characteristics of multimedia data and cognition [26, 54]. The first work [26] considers images tagged as favourite by a certain person as an expression of her aesthetic preferences and shows that, given a certain amount of pictures tagged as favourite by a certain individual, it is possible to predict whether the same will happen for another picture or set of pictures. The second work [54] investigates the characteristics of abstract paintings that stimulate certain emotional reactions rather than others. Both works shift the attention from the bare content of images to their potential role in a communication process, namely a person expressing aesthetic preferences in [26] and a painter eliciting emotions in [54]. However, unlike the perspective advocated in this article, both works take into account only one of the parties involved in the communication process.

To the best of our knowledge, the only two works that seem to consider multimedia data as a form of communication are in [11, 14]. The work in [14] studies the perception of profile pictures on social media and, in particular, the agreement between the actual personality traits of profile holders and traits attributed by others based on the profile picture. The work in [11] does a similar analysis, but it considers all elements that can appear in a profile. Not surprisingly, these works focus on social media, an interaction-oriented technology that allows users to use multimedia material to communicate with others. However early, the approaches in [11, 14] seem to confirm the action of implicit cognitive processes when using multimedia data in a communication scenario, the key-idea advocated in this article. Still, both works focus on a specific case and do not try to identify the underlying perspective that can be applied to many different cases.

3. THE MULTIMEDIA LENS MODEL
This section provides a conceptual framework that illustrates the perspective proposed in this article, namely a simplified version of the Brunswik's Lens (see Figure 1), the model originally proposed in [4] and successively modified to investigate, among other interaction phenomena, the influence of nonverbal behavior in face-to-face interactions [43] or the judgment of rapport [2].

Figure 1: The picture shows a simplified version of the Brunswik Lens Model adapted to the exchange of multimedia data between a Data Producer and a Data Consumer. (Diagram labels: Data Producer, Externalization, State with measure $\mu_S$, feature 1 to feature N, Attribution, Perceptual Judgment with measure $\mu_P$, Data Consumer, Ecological Validity $\rho_{EV}$, Representation Validity $\rho_{RV}$, Functional Validity $\rho_{FV}$.)

In the model of Figure 1, the multimedia data is considered a form of communication between Data Producers (DP) and Data Consumers (DC). The key-idea of this article is that the process includes not only the exchange of content, a problem that the multimedia indexing and retrieval community has extensively investigated for at least two decades, but also implicit cognitive processes typical of any human-human interaction like, e.g., the spontaneous attribution of socially relevant characteristics (attractiveness, trustworthiness, etc.) or the development of impressions.

The DP is always assumed to be in a certain state that can be either transient (e.g., emotions, attitudes, goals, physiological conditions, etc.) or stable (e.g., personality traits, values, social status, etc.). In operational terms, the states are defined as quantitative measures (identified as $\mu_S$ in Figure 1) to be obtained via objective processes depending on the particular case under observation. For example, in the case of the social status, the measure can be the yearly income of the DP, while for the physiological condition it can be the heart rate or the galvanic skin conductance. In many cases, the states correspond to psychological constructs (e.g., personality traits or interpersonal attractiveness) and the measures are the outcome of psychometric questionnaires. These latter are typically administered to the DPs and include questions associated to Likert scales (see Section 5 for an example).

According to the model, the multimedia data are an externalization of the DP state, i.e. an observable effect of it. Furthermore, the data is all the DCs know about a DP. From an operational point of view, the data correspond not only to the actual multimedia material (e.g., pictures, video, soundbites, etc.), but also to any feature that can be extracted, manually or automatically, from the material itself.
The empirical covariation of state measures and features quantifies the ecological validity of these latter, i.e. their effectiveness in accounting for the DP state. In Figure 1, the ecological validity is indicated with $\rho_{EV}$ and, typically, it corresponds to the correlation or the Spearman coefficient between features and $\mu_S$.

When the DCs consume the data, they attribute the DP a state of measure $\mu_P$. The process is called attribution and $\mu_P$ is referred to as perceptual judgment. For example, the DCs can attribute a certain yearly income to the DP based on the pictures and videos this latter shows. In the case of the psychological constructs (see Section 5), the attribution process is typically unconscious and it takes place spontaneously, whether the DC needs it (wants it) or not [48, 49]. In principle, $\mu_S$ and $\mu_P$ should have the same value (or at least similar values), but communication processes are always noisy, especially when the communication takes place through ambiguous channels like multimedia data are.

The empirical covariation between features and perceptual judgments accounts for the representation validity of the features (identified as $\rho_{RV}$), i.e. for the influence these latter have on the attribution process. Like in the case of $\rho_{EV}$, the most common measurements of $\rho_{RV}$ are correlation and Spearman coefficient.

The cognitive processes this paper focuses on are active in particular at the perceptual judgment stage, when DCs unconsciously develop an impression about the DP even if all they know about this latter is the multimedia data they are consuming. However, the processes are important for the DP as well because, in a communication scenario, there is no data production without an attempt to convey an impression, i.e. to ensure that $\mu_S$ and $\mu_P$ are close to each other. The empirical covariation of $\mu_S$ and $\mu_P$ (identified as $\rho_{FV}$ in Figure 1) accounts for the latter aspect and it is called functional validity.

4. THE PSYCHOFLICKR DATASET
We experiment the core-idea of this paper on Flickr, one of the most popular online photo-sharing platforms. To this purpose we collected a corpus, dubbed PsychoFlickr, that reflects the Lens Model and includes both pictures and personality assessments.
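For a corpus that pairs image features with personality assessments, the three validity measures of the Lens Model reduce to rank correlations between per-user vectors. The following is a minimal sketch, assuming one scalar feature per user and using only the Python standard library; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for j in range(start, end + 1):
            result[order[j]] = average_rank
        start = end + 1
    return result


def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson(ranks(a), ranks(b))


# Hypothetical toy data for six users (illustration only).
feature = [0.10, 0.40, 0.35, 0.80, 0.55, 0.90]  # one image feature per user
mu_s = [-2, 0, -1, 3, 1, 4]                     # self-assessed trait (state measure)
mu_p = [-1.5, 0.5, -0.25, 2.0, 2.5, 3.0]        # attributed trait (perceptual judgment)

rho_ev = spearman(feature, mu_s)  # ecological validity
rho_rv = spearman(feature, mu_p)  # representation validity
rho_fv = spearman(mu_s, mu_p)     # functional validity
```

The sketch omits significance testing; in practice a library routine that also returns a p-value would be the natural choice.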
The corpus is publicly available at [the dataset will be made available in case of acceptance] and was collected as follows: we contacted 300 random pro users, i.e. individuals that pay a yearly fee in order to access privileged Flickr functionalities. These users are expected to be, on average, more adept than others at photography language and techniques. For each of these 300 users, we collected the 200 latest pictures made by others they labeled as favourite, for a total of 60,000 images. Furthermore, each user filled the BFI-10 (Big Five Inventory 10) [39], a personality questionnaire aimed at measuring the personality of an individual in terms of the Big Five, five broad dimensions shown to capture most of the individual differences [42].

Figure 2: The picture shows the Brunswik Lens model for the PsychoFlickr dataset, where the state corresponds to the Big Five traits (as assessed with the BFI-10). Ecological and Representation validities are measured with the Spearman Coefficient and the picture shows (for each trait) features for which both values are statistically significant (p < 5%). (Feature labels in the diagram: Size Regions, Hue Ang. Disp., Texture L1 ch. H, Texture L3 ch. V, # Faces, Dominance, Colorfulness, # People, Size People, # Edges, # Cars, Purple.)

The outcome of the BFI-10 is a five dimensional vector where each component measures how high an individual is with respect to each of the Big Five traits, namely Openness (tendency to be intellectually open, curious and have wide interests), Conscientiousness (tendency to be responsible, reliable and trustworthy), Extraversion (tendency to interact and spend time with others), Agreeableness (tendency to be kind, generous, etc.) and Neuroticism (tendency to experience the negative aspects of life, to be anxious, sensitive, etc.). In particular, for each trait, we have an integer which goes from -4 (low tendency) to 4 (high tendency).

Finally, we hired eight assessors that looked at the set of 200 favourite images provided by each of the users and, for each of them, filled the BFI-10 questionnaire. However, while the Flickr users adopted the self-assessment version of the BFI-10, the assessors used the other-assessment version. In other words, the users rated statements like "I am a reserved person", while the assessors rated statements like "This person is reserved", where by "person" is meant the user that has labeled the images under exam as favourite. Each of the 8 assessors filled the personality questionnaires for each of the 300 users. The users and the assessors were never in contact and, furthermore, the images were the only information the assessors had at disposition about the users under exam.
The 8 personality ratings produced by the different assessors about the same user were averaged to obtain the perceptual judgment (according to experimental psychology practices [39]). The agreement among the assessors was measured with the Krippendorff's α [22], a reliability coefficient suitable for a wide variety of assessments (binary, nominal, ordinal, interval, etc.), and robust to small sample sizes. Table 1 reports the α values for the different traits. The values are statistically significant and comparable to those observed in the literature for zero acquaintance scenarios [3], i.e. situations where assessors and subjects being rated do not have any personal contact (as is the case in the PsychoFlickr corpus).

Table 1: Krippendorff's α for the Big Five traits (O, C, E, A, N).

In terms of the Lens Model, the pro users are the Data Producers, the assessors are the Data Consumers, the personality is the state and the outcome of the BFI questionnaire is the state measure (see Section 3 for more details).

5. MULTIMEDIA LENS AND FLICKR
This section measures, in quantitative terms, whether implicit cognitive processes are actually at work in the scenario underlying the PsychoFlickr corpus or not. In particular, the section addresses the following questions: Is there a consistent relation between features extracted from sets of favourite images and personality traits of Flickr users (both self-assessed and attributed)? If yes, is the relation sufficiently stable to automatically predict the personality traits of Flickr users (both self-assessed and attributed) based on sets of favourite images? If the key-idea of this work holds, and implicit cognitive processes influence the exchange of multimedia data (according to the Lens Model), then the answer should be positive in both cases.

Category | Name | L | Short Description
Aesthetics | Use of light | 1 | Average pixel intensity of V channel [9]
Aesthetics | HSV statistics | 4 | Mean of S channel and standard deviation of S, V channels [27]; Hue angular dispersion in IHLS color space [30]
Aesthetics | Emotion-based | 3 | Amount of Pleasure, Arousal, Dominance [27, 50]
Aesthetics | Colorfulness | 1 | Colorfulness measure based on Earth Mover's Distance (EMD) [9, 27]
Aesthetics | Color Name | 11 | Amount of Black, Blue, Brown, Green, Gray, Orange, Pink, Purple, Red, White, Yellow [27]
Aesthetics | Entropy | 1 | Image entropy [26]
Aesthetics | Wavelet textures | 12 | Level of spatial graininess measured with a three-level (L1, L2, L3) Daubechies wavelet transform on the HSV channels [9]
Aesthetics | Tamura | 3 | Amount of Coarseness, Contrast, Directionality [46]
Aesthetics | GLCM-features | 12 | Amount of Contrast, Correlation, Energy, Homogeneity for each HSV channel [27]
Aesthetics | Edges | 1 | Total number (#) of edge points, extracted with Canny [26]
Aesthetics | Level of detail | 1 | Number of regions (after mean shift segmentation) [6, 16]
Aesthetics | Regions | 1 | Average size of the regions (after mean shift segmentation) [6, 16]
Aesthetics | Low depth of field (DOF) | 3 | Amount of focus sharpness in the inner part of the image w.r.t. the overall focus [9, 27]
Aesthetics | Rule of thirds | 2 | Mean of S, V channels in the inner rectangle of the image [9, 27]
Aesthetics | Image parameters | 2 | Size of the image [9, 26]
Content | Objects | 28 | Objects detectors [12]: we kept the number of instances (#) and their average bounding box size
Content | Faces | 2 | Number (#) and size of faces after Viola-Jones face detection algorithm [51]
Content | GIST descriptors | 24 | Level of openness, ruggedness, roughness and expansion for scene recognition [35]

Table 2: Summary of all features. The column L indicates the feature vector length for each type of feature.

5.1 Features and Personality
We adopted a wide, though not exhaustive, spectrum of features, here grouped into two families (see Table 2). On one side, we have the cues that focus on aesthetic aspects [9, 27]: the reason is that the PsychoFlickr corpus includes pictures posted as favourite, i.e. likely to represent the aesthetic preferences of the users under exam. On the other side, we focus on the content of the images; to this end, we employed robust probabilistic object detectors [12] (for a complete list of all detectable objects see [12]); we also retain the average area (the algorithm gives also the bounding box of the detected objects).
In addition, we focus on the faces, adopting the standard Viola-Jones face detection algorithm [51] implemented in the OpenCV libraries. Finally, we adopted the GIST scene descriptors, which amounts to applying a set of oriented band-pass filters.

Figure 2 shows the features with high ecological (covariation of self-assessment and features) and representation (covariation of features and perceptual judgment) validity with respect to the Big Five traits. The covariations, measured with the Spearman Coefficient, are statistically significant (p < 5%). Therefore, implicit cognitive processes seem to be actually at work when Flickr users share their set of favourite images. The answer to the first question at the beginning of this section is positive. Further confirmation comes from Figure 3, showing a random selection of images labeled as favourite by extravert (collage "a") and introvert (collage "b") subjects. The images favoured by the former picture people far more frequently than those favoured by the latter (80% and 17% of the images in the collage, respectively).

5.2 Personality Prediction
Ecological and representation validity values of Figure 2 seem to suggest that predicting personality traits (both self-assessed and attributed) using favourite images is possible. The problem was cast as a regression instance on the traits of the users, considering users as the sets of their preferred images (see Appendix A for a description of the regression approach). The performance was measured with the Spearman correlation coefficient between actual and predicted personality traits: the higher the coefficient, the closer the prediction to the true value. The results are reported in Table 3 where the first column shows the trait, the second explains whether the predicted trait is self-assessed or attributed by the assessors, Max ρ is the maximum correlation found across the tests (i.e. the various configurations of the regression approach), Mean (Std) are mean value and standard deviation computed on correlations with p-values < 5%, and % s.s. is the percentage of different configurations of the regression approach that yield a statistically significant result. In line with the literature on personality computing [28], the performances achieved over self-assessed traits are lower than those obtained over attributed ones.
The reason is that the former depend on an individual assessment and tend to be more noisy while the latter, resulting from the consensus among different assessors, tend to correlate better with measurable characteristics of the data. In particular, for the attributed traits, all configurations of the regression approach tested in the experiments led to statistically significant results, while for the self-assessed traits this happens only for Openness.

The best performance is achieved, for the attributed assessments, for Extraversion and Conscientiousness. The same applies to most of the works on personality computing presented in the literature and the reason is that Extraversion and Conscientiousness are the two traits people perceive more quickly and effectively [19]. In the case of this work, the performance is satisfactory for Neuroticism and Agreeableness as well. The trait for which the performance is lower is Openness. The probable explanation is that the distribution of the users along such trait is peaked around high values (Openness is the trait of creativity and pro Flickr users are, not surprisingly, higher than the average along the trait).

The results presented in Table 3 are statistically significant (p < 5%) and, therefore, the answer to the second question of the beginning of this section is positive. In other words, implicit cognitive processes seem to influence the attribution of personality traits in Data Consumers watching favourite pictures on Flickr.
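The summary columns of Table 3 (Max ρ, Mean (Std) ρ and % s.s.) can be derived from the per-configuration results of the regression approach. The sketch below is one plausible reading of those definitions, assuming that each configuration yields a Spearman coefficient together with its p-value; the function name `summarize`, the use of the sample standard deviation and the toy numbers are assumptions, not details given in the paper.

```python
from statistics import mean, stdev


def summarize(results, alpha=0.05):
    """Summarize (rho, p_value) pairs, one pair per regression configuration.

    Max rho is taken over all configurations, while mean and standard
    deviation use only the statistically significant ones (p < alpha).
    """
    significant = [rho for rho, p in results if p < alpha]
    return {
        "max_rho": max(rho for rho, _ in results),
        "mean_rho": mean(significant) if significant else None,
        # Sample standard deviation; the paper does not say which estimator it used.
        "std_rho": stdev(significant) if len(significant) > 1 else None,
        "pct_significant": 100.0 * len(significant) / len(results),
    }


# Hypothetical results for four configurations (illustration only).
toy_results = [(0.30, 0.01), (0.20, 0.04), (0.10, 0.20), (0.25, 0.03)]
summary = summarize(toy_results)
```

With these toy numbers, three of the four configurations are significant, so the mean and standard deviation are computed over three coefficients only.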

Figure 3: Collage (a) and (b) are a random selection of favourite pictures of subjects high and low in Extraversion (as attributed by the assessors), respectively. The most important difference is that extrovert individuals show a preference for pictures portraying people (80% of the samples in the collage) while introvert show the opposite preference (17% of the pictures in the collage).

Trait | Label | Max ρ | Mean (Std) ρ | % s.s.
O | Self | | (0.04) | 100%
O | Attributed | | (0.04) | 100%
C | Self | | (0.03) | 44%
C | Attributed | | (0.05) | 100%
E | Self | | (0.05) | 88%
E | Attributed | | (0.03) | 100%
A | Self | | (0.03) | 55%
A | Attributed | | (0.05) | 100%
N | Self | | (0.07) | 7%
N | Attributed | | (0.04) | 100%

Table 3: Prediction Results. ρ is the Spearman Correlation Coefficient.

6. POTENTIAL APPLICATIONS
Section 2 surveys areas that share some aspects with the perspective proposed in this article while still showing substantial differences. This section focuses on application domains that involve the exchange of multimedia data and, therefore, might benefit from taking into account the cognitive processes that, according to the results presented above, seem to influence such type of process.

Understanding and modeling of cognitive processes involved in multimedia data consumption are likely to be beneficial for Human Information Interaction (HII), the domain studying the relationship between people and information [13]. HII researchers are particularly interested in modelling people proflections, i.e. individual's conscious and unconscious projections on information objects (e.g., pictures) and the reflections that other people and machines create to those projections (e.g., links and annotations) [29]. This applies in particular to multimedia retrieval technologies that might be enhanced by taking into account not only the data content (like most of current technologies do [25]), but also the interplay between content and perceptual judgments (personality, values, goals, intentions, etc.).

The role of cognitive biases can be of interest for Digital Humanities as well, especially for what concerns the effort towards new modes of knowledge formation enabled by networked, digital environments and the focus on distinctive modes of producing knowledge and distinctive models of knowledge itself [5].
In particular, Digital Humanities investigate the impact of media authoring technologies on the transmission of knowledge and information, a phenomenon likely to involve implicit cognitive processes like those described in this work.

In a similar vein, the perspective adopted in this article can be useful in Big Data Analytics - the domain aimed at making sense of large amounts of unstructured data [31] - one of the most important challenges technology faces today. In particular, there is consensus among Big Data experts that no useful information can be extracted from large databases without associating automatic mining approaches and human interpretation [34]. This latter is likely to be influenced by cognitive processes similar to those illustrated in the experiments of this work.

Viral marketing, the diffusion of information about the product and its adoption over the network [24], is an advertisement technique aimed at spreading information as widely as possible through (mostly online) word of mouth mechanisms. The previous part of this paper has shown that the exchange of multimedia data, being a form of human-human communication, can be thought of as a form of word of mouth. Therefore, implicit cognitive processes might contribute to explain and enhance virality. In the same vein, communication strategies based on social media can benefit from the prediction of perceptual judgments likely to be attributed to a given multimedia message diffused through online social platforms [20].

According to the European Consumer Commission, "Personal data is the new oil of the internet and the new currency of the digital world" [15]. The states of data producers (see Section 3) often correspond to personal characteristics of potential interest for different bodies (e.g., companies trying to model their customers or governments interested in gathering information about the population). Approaches like those presented in this work can help to obtain such information by analyzing publicly available data that people usually post on personal home pages, Youtube, Facebook, etc. [21]. In parallel, the development of technologies capable of going beyond the mere content and inferring personal characteristics of data producers requires a definition of the concept of privacy and a careful analysis of ethical issues [8].

A peculiar form of communication through multimedia material is the participation in online games where several participants interact via avatars or animated characters. The choice of a particular character or particular gaming strategies and options is likely to convey information about the player states (see, e.g., the approaches in [17, 53, 55] for the case of personality). In a similar way, computer mediated communication can be influenced by implicit cognitive processes via interface characteristics like the profile picture of Skype users.

Creating and viewing photographs as a process of self-insight and personal change is the main principle of phototherapy and therapeutic photography [52, 45], two recent psychology perspectives; for the therapists, images provide an undercurrent of emotion and ideas that enrich interpersonal dynamics, often on a level that is not fully conscious or capable of being verbalized. Of particular interest for these fields is how the language of composition and visual design intersects with the language of unconscious primary cognitive processes, including emotional/ideational association. Our study suggests that answers to these questions may be found with the help of computers.

Last, but not least, the crucial role that image processing and machine learning would have is apparent; at the same time, our study delineates new challenges for these areas. For example, discovering visual patterns that correlate with personal traits in a stronger way than ordinary features could be a research mission for the field of deep learning and feature learning [44]. Generative modeling can also be involved, looking for new models that mimic the way diverse visual features should be combined together to communicate a certain personal trait. An immediate example applies to the Counting Grid used in Section A.1: in this paper, we employed CG as a mere dimensionality reduction strategy, without accounting for the trait labels. Including this information may lead to a low-dimensional embedding where nearby images exhibit similar features and personal traits.
The list presented in this section is far from being exhaustive, but it is representative of the scenarios where the investigations proposed in this article can be relevant, namely those where individuals produce, exchange and consume (possibly multimedia) data.

7. CONCLUSIONS
This article advocates the idea that the exchange of multimedia data has become a human-human communication scenario and, therefore, it involves the same cognitive phenomena of any other form of interaction between people, especially when it comes to expression and mutual attribution of socially relevant characteristics (attractiveness, social status, personality, goals, values, intentions, etc.). As a supporting evidence, the paper proposes experiments on the interplay between personality traits and Flickr pictures. The results show that the personality of an individual can be predicted, to a statistically significant extent, through the pictures she labels as favourite. Furthermore, the experiments show that the images can be used to predict the traits that others attribute to such an individual. Therefore, at least for what concerns personality, the exchange of images via Flickr seems to work according to the Brunswik Lens (see Section 3), the cognitive model underlying social interactions. In other words, the key-idea proposed in this work appears to hold.

To the best of our knowledge, such a perspective has never been adopted in a multimedia technology context before. The probable reason is that multimedia data became an interaction channel only recently, when the diffusion of appropriate technologies for data production (cameras, smartphones, tablets, etc.) and consumption (social media, digital libraries, etc.) made it possible to exchange multimedia data as easily as we previously exchanged written material (letters, messages, etc.) [5]. This new scenario opens several research questions (the list is not exhaustive): Is it possible to improve multimedia technologies by taking into account implicit cognitive processes? Do implicit cognitive processes influence our behavior as multimedia technology users? Does multimedia technology need to change to accommodate implicit cognitive processes? If yes, how? What do we reveal about ourselves when we share multimedia data? What is the effect of the multimedia data we share on the impression others develop about us?
It can be expected that the perceptual judgments we make about those who produce the data we consume end up influencing our perception of the data. For example, we might tend to like more, or to find more relevant, data produced by people we perceive as more similar to us. If actually observed, such an effect (known in psychology as similarity-attraction [7]) might not only improve retrieval technologies, but also contribute to explaining our behavior as users and, in the ultimate analysis, lead to higher technology usability and effectiveness. Similar considerations apply to any technology that involves the consumption of data.

Symmetrically, the increasing amount of multimedia information we produce and share (Instagram pictures, Tweets including pointers to video and audio data, etc.) is probably contributing to a larger and larger extent to our appearance, one of the characteristics that most influence the impression others develop about us (an effect known in psychology as the halo effect [33]). However, while we know how to manage our appearance in face-to-face interactions, in most cases we are still not aware of the way others see us through the lens of the multimedia data we produce.

The two examples above show how cognitive and technological issues are tightly intertwined in everyday scenarios involving production and consumption of multimedia data. The two cases focus on specific aspects, but the perspective proposed in this work might show that the full range of phenomena taking place in face-to-face interactions (see [10] for a monograph) takes place through multimedia data as well. If true, the door would be open towards new multimedia applications as well as novel findings in cognitive sciences.

APPENDIX

A. THE REGRESSION APPROACH

Applying a standard regression approach is problematic because there are multiple images associated with the same target. Straightforward algorithms such as summing all the image descriptors of each user and then performing regression do not work, because such a process adds noise to a weak signal. Multiple instance regression [40] is also inadvisable because of its high computational complexity, especially when the number of images for each user is large. Therefore, we propose an alternative approach composed of the following three steps:

1. Feature Extraction and Normalization. We first extract from all the images the set of features listed in Section 5.1; since each z-th cue expresses the level of presence of a given quantity, i.e. a count $c_z$, we can think of each image as a histogram of counts $\{c_z\}$, or bag-of-features (BoF). After that, we normalize each $c_z$ to ensure that each feature takes values in the same range. This prevents some features (e.g., number of edges) from overwhelming others (e.g., GIST, amount of coarseness).

2. Clustering. After dividing the users into training and testing users, we consider all the images of the training users. By means of a clustering algorithm, we learn a low-dimensional representation that maps each t-th image (i.e. its BoF) to a 2-dimensional location $l_t$, lying on a smooth manifold. As clustering method we employ the Counting Grid [37], a recent generative model which embeds BoF representations in N-dimensional manifolds (see footnote 3). This way, each user $u$ becomes a set of locations $L_u = \{l_t\}$ on the manifold.

3. Regression and Trait Prediction. Considering the training users, we train a regressor for the personality traits. Specifically, for each user $u$ we have a five-dimensional target that characterizes the Big Five personality traits $p \in \{O, C, E, A, N\}$, where each trait is described by a value $y^u_p \in [-4, 4]$. As regression method, we use Lasso [47]. Trait prediction amounts to testing the regressor on the test users.

In the following, we detail the latter two steps of the process.

A.1 Clustering: the Counting Grid Model

The counting grid (CG) is a generative model recently introduced in [37] for analyzing image collections. It assumes that images are represented as histograms $\{c_z\}$ or bags of features, where $c_z$ counts the occurrences of feature $z$. Considering its two-dimensional version, a CG is a 2D finite discrete grid where each location $i = (x, y)$ contains a normalized count of features $\pi_{i,z}$. Under this model, an image (i.e.
its BoF $\{c_z\}$) can be thought of as produced by the following generative process: a small window is located in the grid, the feature counts within it are averaged to obtain a local probability mass function over the features, and an appropriate number of features in the bag is then generated from it (see Fig. 4). In other words, unlike a straightforward embedding (e.g. PCA) that links an image with a point location, the counting grid forces the image to link with a small window of locations. Given that the size $E_1 \times E_2$ of a counting grid is usually small compared to the number of images, this also forces windows linked to different images to overlap, and to co-exist by finding a shared compromise in the feature counts located in their intersection. The overall effect of these constraints is to produce locally smooth transitions between strongly different feature counts by gradually phasing features in/out in the intermediate locations. In practice, local neighborhoods in the grid represent similar concepts, and images mapped to close locations are somehow similar.

Figure 4: Generating an image from a simple 3 × 3 counting grid: given a 2 × 2 window on the grid, we average the feature counts, obtaining a bag of features which corresponds to the final image. "V. edge" and "H. edge" are toy features meaning vertical and horizontal edges, respectively.

Footnote 3: Here we decided on N = 2 for the sake of clarity; other dimensions can be explored. In addition, we tried different dimensionality reduction approaches (Mixtures of Dirichlet distributions), leading to inferior performances.

Formally, the counting grid $\pi_{i,z}$ is a 2D finite discrete grid, spatially indexed by $i = (x, y) \in [1 \ldots E_1] \times [1 \ldots E_2]$, and containing normalized counts of features indexed by $z$. Thus, we have $\sum_z \pi_{i,z} = 1$ everywhere on the grid. A given BoF $\{c_z\}$ is generated by selecting a certain location $k$, calculating the distribution $h_{k,z} = \frac{1}{W_n} \sum_{i \in W_k} \pi_{i,z}$ by averaging all the word (feature) counts within the window $W_k$ (with area $W_n$) that starts at $k$, and then drawing feature counts from this distribution. In other words, the position of the window $k$ in the grid is a latent variable; given $k$, the likelihood of $\{c_z\}$ is

$$p(\{c_z\} \mid k) = \prod_z (h_{k,z})^{c_z} = \alpha \prod_z \Big( \sum_{i \in W_k} \pi_{i,z} \Big)^{c_z}, \quad (1)$$

where $\alpha$ is a fixed normalization factor.
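The window averaging and the likelihood of Eq. (1) can be illustrated with a minimal sketch in plain Python. The function names (`window_distribution`, `log_likelihood`), the toy grid values (loosely modelled on the 3 × 3 grid and 2 × 2 window of Fig. 4) and the wrap-around of windows at the grid border are assumptions made for this illustration only; they are not details given in the text. The likelihood is computed in log form, so the factor $\alpha$ is absorbed into the averaged distribution.

```python
import math

def window_distribution(grid, k, window):
    """Return h_k: the average of the per-location feature distributions
    inside the window whose top-left corner is k.

    grid[x][y] is a list of normalized feature counts (it sums to 1).
    Wrapping around the grid border is an assumption of this sketch.
    """
    e1, e2 = len(grid), len(grid[0])
    n_features = len(grid[0][0])
    kx, ky = k
    wx, wy = window
    h = [0.0] * n_features
    for dx in range(wx):
        for dy in range(wy):
            cell = grid[(kx + dx) % e1][(ky + dy) % e2]
            for z in range(n_features):
                h[z] += cell[z]
    area = wx * wy  # W_n in the text
    return [value / area for value in h]

def log_likelihood(counts, h):
    """Log form of Eq. (1): sum over z of c_z * log(h_{k,z})."""
    total = 0.0
    for c, p in zip(counts, h):
        if c > 0:
            if p == 0.0:
                return float("-inf")  # the window cannot generate this feature
            total += c * math.log(p)
    return total

# Hypothetical toy grid with two features, ordered [V. edge, H. edge]:
# the left column is purely vertical, the right column purely horizontal,
# and the middle column is an even mix.
row = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
toy_grid = [list(row), list(row), list(row)]

# A bag with three vertical-edge counts and one horizontal-edge count.
bag = [3, 1]
h_left = window_distribution(toy_grid, (0, 0), (2, 2))
h_right = window_distribution(toy_grid, (0, 1), (2, 2))
print(h_left, log_likelihood(bag, h_left))
print(h_right, log_likelihood(bag, h_right))
```

In this toy setting the bag dominated by vertical edges is better explained by the window over the left part of the grid, which is the kind of comparison the latent position $k$ encodes.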
To learn a counting grid, we need to maximize the likelihood over all $T$ training images, which can be written as

$$p(\{\{c^t_z\}, k^t\}_{t=1}^{T}) \propto \prod_t \prod_z \Big( \sum_{i \in W_{k^t}} \pi_{i,z} \Big)^{c^t_z}, \quad (2)$$

which is intractable, much like in mixtures; therefore, it is necessary to employ an iterative EM algorithm. Starting from a random initialization of the counting grid $\pi$, the E-step aligns all bags of features to grid windows, to match the bags' histograms, inferring where each bag maps on the grid, i.e.

$$q^t(i) \propto \exp \sum_z c^t_z \log h_{i,z}. \quad (3)$$

In the M-step the model parameter, i.e. the counting grid $\pi$, is re-estimated. For details on the learning algorithm and on its efficiency, the reader can refer to the original papers [37, 18]. For our purposes, the most interesting outputs are the posterior probabilities $q^t$, i.e. the position in the grid of each image. Summing, at each grid location $i$, the contributions $q^t(i)$ that are due to the images of a user $u$ provides a signature $L_u$. Essentially, it is a 2D matrix, of the same dimensions as the grid, where some locations $\{i\}$ are weighted by the $q^t$ values, indicating that some images of the user $u$ lie at those locations.

A.2 Regression and Trait Prediction

To assess the validity of our prediction method, we use the Leave-One-User-Out paradigm. We consider CGs of various complexities with size $E = [20 \times 20, 25 \times 25, \ldots]$ and window $W = [5 \times 5]$, and we learn a model with all the images belonging to the training users. Then, we compute $L_u$ for each user, and use this representation to regress on the profile $y^u_p$. We learn the regression weight vector $w$ by minimizing the error function

$$E(w) = \sum_{i=1}^{L} \big( y_p - w^T L_u(i) \big)^2, \quad (4)$$

where $L$ indicates the number of positions in the grid. We solve the problem using Lasso [47], a shrinkage and selection method for linear regression which enforces sparsity on the coefficients $w$ by bounding the sum of the absolute values of the coefficients. The bound is a constraint that has to be taken into account when minimizing the error function. At this point the training phase is complete and, to predict the personality trait of the held-out (test) user, we 1) infer the mappings of the user's images on the counting grid (Eq. 3), 2) compute the latent description $L^{test}(i)$ and 3) predict the trait by calculating

$$\hat{y}^{test}_p = w^T L^{test}. \quad (5)$$

B. REFERENCES

[1] I. Arapakis, J.M. Jose, and P.D. Gray. Affective feedback: an investigation into the role of emotions in the information seeking process. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval. ACM.
[2] F.J. Bernieri and J.S. Gillis. Judging rapport: Employing Brunswik's lens model to study interpersonal sensitivity. In J.A. Hall and F.J. Bernieri, editors, Interpersonal Sensitivity. Theory and Measurement. Lawrence Erlbaum.
[3] J.C. Biesanz and S.G. West. Personality coherence: Moderating self-other profile agreement and profile consensus. Journal of Personality and Social Psychology, 79(3).
[4] E. Brunswik. Perception and the representative design of psychological experiments. University of California Press.
[5] A. Burdick, J. Drucker, P. Lunenfeld, T. Presner, and J. Schnapp.
Digital Humanities. MIT Press.
[6] D. Comaniciu and P. Meer. Mean shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5).
[7] J.W. Condon and W.D. Crano. Inferred evaluation and the relation between attitude similarity and interpersonal attraction. Journal of Personality and Social Psychology, 54(5):789.
[8] R. Cowie. The good our field can hope to do, the harm it should avoid. IEEE Transactions on Affective Computing (to appear).
[9] R. Datta, D. Joshi, J. Li, and J. Wang. Studying aesthetics in photographic images using a computational approach. In Proceedings of the European Conference on Computer Vision, volume 3953 of Lecture Notes in Computer Science. Springer Verlag.
[10] J. Elster. Explaining Social Behavior. Cambridge University Press.
[11] D.C. Evans, S.D. Gosling, and A. Carroll. What elements of an online social networking profile predict target-rater agreement in personality impressions. In Proceedings of the International Conference on Weblogs and Social Media, pages 45-50.
[12] P.F. Felzenszwalb, R.B. Girshick, and D. McAllester. Discriminatively trained deformable part models, release 4. pff/latent-release4/.
[13] R. Fidel. Human Information Interaction. MIT Press.
[14] S. Fitzgerald, D.C. Evans, and R.K. Green. Is your profile picture worth 1000 words? Photo characteristics associated with personality impression agreement. In Proceedings of AAAI International Conference on Weblogs and Social Media.
[15] World Economic Forum. Personal data: the emergence of a new asset class. Technical report, World Economic Forum.
[16] C.M. Georgescu. Synergism in low level vision. In Proceedings of the International Conference on Pattern Recognition.
[17] D. Johnson and J. Gardner. Personality, motivation and video games. In Proceedings of the Conference of the Computer-Human Interaction Special Interest Group of Australia on Computer-Human Interaction.
[18] N. Jojic and A. Perina. Multidimensional counting grids: Inferring word order from disordered bags of words. In Proceedings of Uncertainty in Artificial Intelligence.
[19] C.M. Judd, L. James-Hawkins, V. Yzerbyt, and Y. Kashima. Fundamental dimensions of social judgment: Understanding the relations between judgments of competence and warmth. Journal of Personality and Social Psychology, 89(6).
[20] A.M. Kaplan and M. Haenlein. Users of the world, unite! The challenges and opportunities of social media. Business Horizons, 53(1):59-68.
[21] M. Kosinski, D. Stillwell, and T. Graepel. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15).
[22] K. Krippendorff. Reliability in content analysis. Human Communication Research, 30(3).
[23] Z. Kunda. Social cognition: Making sense of people. The MIT Press.
[24] J. Leskovec, L. Adamic, and B. Huberman. The dynamics of viral marketing. ACM Transactions on the Web, 1(1):5.
[25] M.S. Lew, N. Sebe, D. Chabane, and R. Jain. Content-based multimedia information retrieval: State of the art and challenges. ACM Transactions on Multimedia Computing, Communications, and Applications, 2(1):1-19.
[26] P. Lovato, A. Perina, N. Sebe, O. Zandoná, A. Montagnini, M. Bicego, and M. Cristani. Tell me what you like and I'll tell you what you are: discriminating visual preferences on Flickr data. In K.M. Lee, Y. Matsushita, J.M. Rehg, and Z. Hu, editors, Proceedings of the Asian Conference on Computer Vision, Lecture Notes in Computer Science. Springer Verlag.
[27] J. Machajdik and A. Hanbury. Affective image classification using features inspired by psychology and art theory. In Proceedings of the ACM International Conference on Multimedia, pages 83-92.
[28] F. Mairesse, M.A. Walker, M.R. Mehl, and R.K. Moore. Using linguistic cues for the automatic recognition of personality in conversation and text. Journal of Artificial Intelligence Research, 30.
[29] G. Marchionini. Human information interaction research and development. Library & Information Science Research, 30(3).
[30] K.V. Mardia and P.E. Jupp. Directional Statistics. Wiley Series in Probability and Statistics. Wiley.
[31] M. Minelli, M. Chambers, and A. Dhiraj. Big Data, Big Analytics. Wiley.
[32] C. Nass and S. Brave. Wired for speech: How voice activates and advances the Human-Computer relationship. The MIT Press.
[33] R.E. Nisbett and T.D. Wilson. The halo effect: Evidence for unconscious alteration of judgments. Journal of Personality and Social Psychology, 35(4).
[34] F.J. Ohlhorst. Big Data Analytics. Wiley.
[35] A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3).
[36] M. Pantic and A. Vinciarelli. Implicit Human-Centered Tagging. IEEE Signal Processing Magazine, 26(6).
[37] A. Perina and N. Jojic. Image analysis by counting on a grid. In Proceedings of the International Conference on Computer Vision and Pattern Recognition.
[38] A. Peterson Bishop, N.A. van House, and B.P. Buttenfield, editors. Digital Library Use. MIT Press.
[39] B. Rammstedt and O.P. John. Measuring personality in one minute or less: A 10-item short version of the Big Five Inventory in English and German. Journal of Research in Personality, 41.
[40] S. Ray. Multiple instance regression. In Proceedings of the International Conference on Machine Learning.
[41] D. Rosenberg. Data before the Fact. In L. Gitelman, editor, Raw data is an oxymoron. MIT Press.
[42] G. Saucier and L.R. Goldberg. The language of personality: Lexical perspectives on the five-factor model. In J.S. Wiggins, editor, The Five-Factor Model of Personality.
[43] K.R. Scherer. Personality markers in speech. In Social markers in speech. Cambridge University Press, Cambridge.
[44] K. Sohn, D.Y. Jung, H. Lee, and A.O. Hero. Efficient learning of sparse, distributed, convolutional feature representations for object recognition. In IEEE International Conference on Computer Vision.
[45] J. Suler. The psychotherapeutics of online photosharing. International Journal of Applied Psychoanalytic Studies, 6(4).
[46] H. Tamura, S. Mori, and T. Yamawaki. Texture features corresponding to visual perception. IEEE Transactions on Systems, Man and Cybernetics, 8(6).
[47] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58.
[48] J.S. Uleman, L.S. Newman, and G.B. Moskowitz. People as flexible interpreters: Evidence and issues from spontaneous trait inference. In M.P. Zanna, editor, Advances in Experimental Social Psychology, volume 28. Elsevier.
[49] J.S. Uleman, S.A. Saribay, and C.M. Gonzalez. Spontaneous inferences, implicit impressions, and implicit theories. Annual Reviews of Psychology, 59.
[50] P. Valdez and A. Mehrabian. Effects of color on emotions. Journal of Experimental Psychology: General, 123(4).
[51] P.A. Viola and M.J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2).
[52] J. Weiser. Phototherapy techniques: Exploring the secrets of personal snapshots and family albums. Jossey-Bass, San Francisco.
[53] C.Y. Yaakub, N. Sulaiman, and C.W. Kim. A study on personality identification using game based theory. In Proceedings of the International Conference on Computer Technology and Development.
[54] V. Yanulevskaya, J. Uijlings, E. Bruni, A. Sartori, E. Zamboni, F. Bacci, D. Melcher, and N. Sebe. In the eye of the beholder: employing statistical analysis and eye tracking for analyzing abstract paintings. In Proceedings of the ACM International Conference on Multimedia.
[55] N. Yee, N. Ducheneaut, L. Nelson, and P. Likarish. Introverted elves & conscientious gnomes: The expression of personality in World of Warcraft. In Proceedings of the Annual Conference on Human Factors in Computing Systems, 2011.