SCHIZOPHRENIA, AS SEEN BY A DEPTH CAMERA AND OTHER SENSORS
Daphna Weinshall
School of Computer Science & Engineering, Hebrew University of Jerusalem, Israel
RESEARCH GOAL
- Motivation: current psychiatric diagnosis and monitoring rely on subjective evaluations by clinicians, who cannot base their judgment on measurable physical properties
- Goal: investigate physical qualities, including measurements of facial expressions, movement and speech, as physical biomarkers for mental disorders (schizophrenia and depression)
[Diagram labels: Audio recording; Bodily movement; Facial expressions; Non-verbal behavior; Machine learning; Diagnostic tools; Descriptive tools]
SCHIZOPHRENIA
- One of the most severe mental disorders
- Lifetime prevalence of about 1% worldwide
- Negative symptoms: loss of functions and abilities (e.g. lack of speech and motivation, blunted affect)
- Positive symptoms: pathological functions not present in healthy individuals (e.g. auditory hallucinations, delusions and paranoid thoughts)
[Images: John Forbes Nash, Vincent Willem van Gogh]
SCHIZOPHRENIA: BACKGROUND
Positive and Negative Syndrome Scale (PANSS)
OUTLINE OF TALK
1. Automated facial expression analysis in schizophrenia: a continuous dynamic approach
2. Prosodic analysis of speech and the underlying mental state (schizophrenia and depression)
3. The dawn of big data: using accelerometer data, as measured by wearable sports watches
AUTOMATED FACIAL EXPRESSION ANALYSIS IN SCHIZOPHRENIA: A CONTINUOUS DYNAMIC APPROACH
Talia Tron
Shaar Menashe hospital: Avi Peled
STUDY OBJECTIVE
- Characterize the way schizophrenia is manifested in facial activity
- Develop automatic tools for quantitatively describing and analyzing relevant measures of this activity
FACIAL EXPRESSIONS?
- Reflect mental and emotional state
- Impaired in patients (flat affect, incongruity)
- Integral part of mental status examination
- Possible relation to neural mechanism
- Direct, non-invasive measurement
- Challenges: technology, computational
TECHNOLOGY (FACIAL AND BODILY EXPRESSIONS)
- Structured light 3D camera (Carmine 1.09)
- Action Units: a system to taxonomize human facial movements
- Extraction of facial Action Units from 3D video with Faceshift™
STRUCTURED LIGHT
WHAT ARE ACTION UNITS?
EXTRACTED FEATURES
Faceshift™ returns 4 output types:
1. Intensity level of 51 Faceshift Action Units (fs-AUs):
   - BROWS (up, down)
   - CHEEK (squint, puff)
   - NOSE (sneer)
   - CHIN (raise)
   - EYES (blink, squint, up, down, in, out)
   - LIPS (stretch, close, open, up, down, funnel, pucker)
   - MOUTH (left, right, frown, smile, dimple)
   - JAW (forward, left, right, open)
2. Eye gaze and position
3. 3D head coordinates
4. 3D position of facial markers
In our research we used only the intensity level of the 51 AUs.
FACESHIFT IN ACTION
METHOD
- Record: interview, 3D data
- Extract: facial activity
- Compute: facial features
- Characterize: richness, typicality, affect
- Predict: patients / control, symptom severity
DATA ACQUISITION
- 34 schizophrenia patients and 33 control subjects
- Recorded with a structured light 3D camera (Carmine 1.09): RGB + depth data
- 15-minute structured interview conducted by a trained psychiatrist
- Positive and Negative Syndrome Scale (PANSS)
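[Diagram: recording session timeline; recorded signals: RGB + depth data, sound]
- Personal details
- PANSS
- Structured interview
- IAPS x 60 (6 sec); question asked: "How did you feel while you were looking at the picture?"
- Resting state (1 min)
- EMBD x 3 (40 sec); question asked: "How did you feel while you were watching the film?"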
FACIAL ACTIVITY EXTRACTION
Facial Action Coding System (FACS), Ekman and Rosenberg (1997)
FACIAL ACTIVITY EXTRACTION
[Figure: raw interview data over time, control vs. patient]
FACIAL ACTIVITY EXTRACTION
23 Faceshift Action Units (AUs) were selected based on tracking sensitivity and noise level.
FACIAL ACTIVITY CHARACTERIZING FEATURES
We measured the diversity in facial activity throughout the video:
- Typicality: the range of subtle changes in facial activity
- Richness: the range of prototypical expressions used
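
A minimal sketch of how two diversity measures of this kind could be computed, assuming each video frame is a vector of AU intensities and that a set of prototypical expressions (for example, cluster centers) is already available. The function names, the nearest-prototype assignment, and the use of mean distance as a stand-in for subtle variation are illustrative assumptions, not the definitions used in the study.

```python
import math

def nearest_prototype(frame, prototypes):
    """Return (index, distance) of the prototype closest to one frame."""
    distances = [math.dist(frame, p) for p in prototypes]
    best = min(range(len(prototypes)), key=lambda i: distances[i])
    return best, distances[best]

def diversity_measures(frames, prototypes):
    """Assumed stand-ins for the two measures:
    - prototypes_used: how many distinct prototypes the frames visit
    - mean_deviation: average distance of a frame from its nearest prototype
    """
    assigned = [nearest_prototype(f, prototypes) for f in frames]
    prototypes_used = len({idx for idx, _ in assigned})
    mean_deviation = sum(d for _, d in assigned) / len(assigned)
    return prototypes_used, mean_deviation

# Toy data: 2 AU intensities per frame, 3 hypothetical prototypes.
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
flat_video = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05)]
varied_video = [(0.1, 0.0), (0.9, 0.1), (0.0, 0.8), (1.0, 0.0)]

print(diversity_measures(flat_video, prototypes))
print(diversity_measures(varied_video, prototypes))
```

In this toy example the flat video stays near a single prototype, while the varied video visits all three.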
RESULTS
[Figure: prototypical expressions, control vs. patients. Labeled expressions: 1. Neutral, flat; 4. Sadness, fear, anger; 7. Happiness, content]
PREDICTION, LEARNING DETAILS
Two-step learning algorithm:
- SVM for patients vs. controls classification
- Regularized regression (ridge) for symptom severity prediction
- Leave One Out (LOO) train-test paradigm
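
The regression step and the leave-one-out paradigm can be sketched as follows. This is a self-contained illustration in plain Python, not the code used in the study: the helper names (`solve`, `fit_ridge`, `loo_predictions`, `pearson_r`), the regularization strength `alpha`, and the toy data are all assumptions, and the SVM classification step is omitted.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_ridge(X, y, alpha):
    """Ridge regression with an unpenalized intercept (fit on centered data)."""
    n, d = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(d)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(d)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc^T Xc + alpha I) w = Xc^T yc
    A = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(d)]
    w = solve(A, b)
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, intercept

def loo_predictions(X, y, alpha=1.0):
    """Predict each subject's score from a model trained on all the others."""
    preds = []
    for i in range(len(X)):
        w, b0 = fit_ridge(X[:i] + X[i + 1:], y[:i] + y[i + 1:], alpha)
        preds.append(b0 + sum(wj * xj for wj, xj in zip(w, X[i])))
    return preds

def pearson_r(a, b):
    """Pearson correlation between predicted and reference scores."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Toy data: 2 hypothetical facial features per subject, made-up severity scores.
features = [[0.2, 1.0], [0.4, 0.8], [0.5, 0.9], [0.7, 0.3], [0.9, 0.2], [1.0, 0.1]]
severity = [10.0, 12.0, 13.0, 18.0, 21.0, 22.0]
predicted = loo_predictions(features, severity, alpha=0.1)
print(pearson_r(predicted, severity))
```

The correlation between leave-one-out predictions and reference scores is the kind of quantity that a predicted-vs-clinician scatter plot summarizes.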
RESULTS - PREDICTION
- Patients vs. control classification
- Symptom severity prediction
[Figure: predicted score vs. psychiatrist score, R=0.53, p<<0.01]
RESULTS - DESCRIPTIVE (CURRENT WORK)
- Dynamic vs. intensity features
RESULTS - DESCRIPTIVE
- Dynamic vs. intensity features
- Emotional affect analysis
[Figure: activation level by emotion category (pleasant emotions, happy; sadness, fear, disgust), p<0.01]
Reference: Wallace V. Friesen and Paul Ekman. EMFACS-7: Emotional Facial Action Coding System. Unpublished manuscript, University of California at San Francisco, 2:36, 1983.
RESULTS - DESCRIPTIVE (CURRENT WORK)
- Dynamic vs. intensity features
- Emotional affect analysis
- Smile characterization
PART 1, RESULTS HIGHLIGHTS
- We measured the co-existence of flat affect and inappropriate affect in patients
- Flat affect is expressed by reduced intensity, slower dynamics and reduced variability of expression
- Facial expressions of patients showed reduced consistency (inappropriateness), without evidence of impaired emotional experience
PROSODIC ANALYSIS OF SPEECH AND THE UNDERLYING MENTAL STATE
Roi Kliper
McLean hospital, Boston: Shirley Portuguese
A HISTORICAL NOTE
- Dementia praecox patients: indifferent tone and distorted turns of speech. Dr. Emil Kraepelin (1913)
- Long-held belief/observation: the human voice (prosody) conveys information about people's feelings, emotions and mental state
- Standard part of the repertoire of mental status examination
- Reported changes in acoustic characteristics of speech prosody in the course of different mental disorders, notably depression and schizophrenia
- Negative symptoms
  - Alogia: poverty of speech, including the manner of speech
  - Affective flattening: the lack of vocal inflections
- MADRS (Montgomery-Åsberg Depression Rating Scale)
  - Apparent sadness: gloom and despair reflected in speech (as well)
SPEECH ANALYSIS
- Speech = prosody + content
- Prosody: the acoustic properties of speech
  - Frequency of the sound wave (pitch, fundamental frequency)
  - Amplitude of the sound wave (decibels)
  - Timing (length of speech and gaps)
- We measure these properties, and how they change
- Focus of this research: investigate properties of speech prosody that can be used to characterize and monitor mental illnesses
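
To make the first two acoustic properties concrete, here is a hedged sketch that estimates the fundamental frequency of a short frame by autocorrelation and its level in decibels from the RMS amplitude. Autocorrelation is one common way to estimate pitch; the slides do not say which estimator was used, and the search range of 75 to 400 Hz, the function names and the synthetic test tone are assumptions.

```python
import math

def rms_db(frame):
    """Level of a frame in dB relative to full scale (amplitude 1.0)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def estimate_f0(frame, sample_rate, f_min=75.0, f_max=400.0):
    """Pick the lag with the strongest autocorrelation inside the pitch range."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_val = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    # None means no positive correlation was found (e.g. silence).
    return sample_rate / best_lag if best_lag else None

# Synthetic 30 ms frame: a 200 Hz tone sampled at 8 kHz, amplitude 0.5.
sample_rate = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(240)]
print(estimate_f0(frame, sample_rate), rms_db(frame))
```

Applying such estimates frame by frame yields pitch and power contours, from which changes over time can be measured.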
- Data collection: McLean Hospital, Boston; healthy, depressed and schizophrenic individuals
- Feature extraction: select the most informative subset of features for the task
- Algorithm for mental state evaluation: train from data with machine learning tools
- Gain insight regarding the phenomenon: meta-analysis, look for biomarkers for mental illnesses
DATA COLLECTION
          Healthy   Schizophrenia   Depression   Total
Male      10        13              9            32
Female    10        9               11           30
Total     20        22              20           62
3 tasks:
- North American Adult Reading Test (NART): list of irregularly pronounced words
- Passage reading (the Rainbow Passage): all 40 phonemes of American English are utilized in proportion to their representation in everyday conversation
- Interview
ALTERATIONS OF SIGNAL CAN OCCUR AT DIFFERENT TIME-SCALES
- Macro scale: > 1 sec [semantic level]
- Meso scale: 25 ms to 1 sec
- Micro scale: < 10 ms [least voluntary]
[Diagram axes: Control, Time]
[Figure: speech signal segmented into alternating utterances and gaps]
UTTERANCE
A segment of continuous speech that exceeds 0.5 s
[Figure: waveform of an utterance; x-axis: Time (sec), 0 to 1.6]
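
The utterance definition above can be turned into a simple segmentation rule. The sketch below assumes a per-frame speech/silence decision is already available (how to obtain it is not covered here), treats speech runs of 0.5 s or less as non-utterances, and defines a gap as the interval between two consecutive utterances. These choices, the function names, and the spoken-ratio definition (utterance time divided by total time) are assumptions for illustration.

```python
def segment_utterances(is_speech, frame_sec, min_utt_sec=0.5):
    """Return (start_sec, end_sec) for each speech run longer than min_utt_sec."""
    utterances = []
    run_start = None
    # A trailing False acts as a sentinel so a final speech run is closed.
    for i, flag in enumerate(list(is_speech) + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if (i - run_start) * frame_sec > min_utt_sec:
                utterances.append((run_start * frame_sec, i * frame_sec))
            run_start = None
    return utterances

def timing_features(utterances, total_sec):
    """Summaries of utterance and gap durations over one recording."""
    durations = [end - start for start, end in utterances]
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(utterances, utterances[1:])]
    return {
        "mean_utterance_sec": sum(durations) / len(durations) if durations else 0.0,
        "mean_gap_sec": sum(gaps) / len(gaps) if gaps else 0.0,
        "spoken_ratio": sum(durations) / total_sec if total_sec > 0 else 0.0,
    }

# Toy recording with 0.1 s frames: 0.8 s speech, a pause, a 0.2 s blip, 1.0 s speech.
flags = [True] * 8 + [False] * 3 + [True] * 2 + [False] * 4 + [True] * 10
utts = segment_utterances(flags, frame_sec=0.1)
print(utts)
print(timing_features(utts, total_sec=len(flags) * 0.1))
```

The 0.2 s blip is dropped because it does not exceed the 0.5 s threshold, so the recording yields two utterances separated by one gap.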
FEATURES OF SPEECH PROSODY
- Macro scale (> 1 s): mean utterance duration, mean gap duration, mean spoken ratio (all utterances)
- Meso scale (25 ms to 1 s): pitch range, pitch standard deviation, power standard deviation
- Micro scale (< 10 ms): mean waveform correlation, mean jitter, mean shimmer
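
Several of the meso- and micro-scale features have widely used textbook definitions, sketched below. Jitter is taken here as the mean absolute difference between consecutive pitch periods divided by the mean period, and shimmer as the same ratio for peak amplitudes; these are common "local" definitions and are assumed, since the slides do not give formulas. The function names and input values are illustrative, and each input needs at least two values.

```python
import statistics

def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def jitter(periods_sec):
    """Cycle-to-cycle variability of the pitch period (dimensionless)."""
    return relative_perturbation(periods_sec)

def shimmer(peak_amplitudes):
    """Cycle-to-cycle variability of the peak amplitude (dimensionless)."""
    return relative_perturbation(peak_amplitudes)

def pitch_range_and_std(f0_hz):
    """Range and population standard deviation of a pitch contour in Hz."""
    return max(f0_hz) - min(f0_hz), statistics.pstdev(f0_hz)

# Made-up measurements: four consecutive glottal cycles and a short pitch contour.
periods = [0.0050, 0.0051, 0.0049, 0.0050]
amplitudes = [0.80, 0.78, 0.82, 0.80]
contour = [100.0, 110.0, 120.0]

print(jitter(periods), shimmer(amplitudes))
print(pitch_range_and_std(contour))
```

A perfectly periodic signal gives zero jitter and shimmer; a monotone voice gives a small pitch range and standard deviation.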
RESULTS: SINGLE FEATURES, MACRO SCALE
[Figure panels: Spoken ratio; Utterance length; Gap length]
RESULTS: SINGLE FEATURES, MESO SCALE
[Figure panels: Pitch range; STD pitch; STD power]
RESULTS: SINGLE FEATURES, MICRO SCALE
[Figure panels: MWC; Jitter; Shimmer]
CORRELATION WITH STANDARD SCALES
                     SANS alogia   SANS affective flattening
Spoken ratio         -0.58         -0.64
Utterance duration   -0.55         -0.49
Gap duration          0.45          0.54
PART 2, SUMMARY
- Investigate the potential use of speech to provide biomarkers for mental illnesses
- Motivation: the development of a reliable, objective, low-priced, and readily applicable assessment tool would enhance the accuracy of the clinical evaluation for diagnosis and severity
- Suitable technological apparatus, including speech recognition software, could allow this tool to be applied for screening or monitoring mental health status remotely
PART 3: THE DAWN OF BIG DATA
- Data: movement (accelerometer data) is recorded 24/7 by wearable sports watches
- Sensor: GENEActiv wearable watch; measures acceleration, temperature and light
- Work in progress
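
As a rough illustration of how raw tri-axial acceleration can be reduced to an activity level, the sketch below computes ENMO (Euclidean norm minus one), a summary often applied to wrist-worn accelerometer data. The slides do not state which movement measure is used in this work, so the choice of ENMO, the function names and the sample values are assumptions.

```python
import math

def enmo(sample_g):
    """Euclidean norm of one (x, y, z) sample in g, minus 1 g for gravity, floored at 0."""
    x, y, z = sample_g
    return max(0.0, math.sqrt(x * x + y * y + z * z) - 1.0)

def mean_activity(samples_g):
    """Average ENMO over a window of samples (e.g. one epoch of recording)."""
    return sum(enmo(s) for s in samples_g) / len(samples_g)

# Made-up samples: the watch at rest (gravity only) and during movement.
at_rest = [(0.0, 0.0, 1.0), (0.0, 0.02, 1.0), (0.01, 0.0, 0.99)]
moving = [(0.5, 0.2, 1.3), (0.9, 0.4, 1.1), (0.1, 0.8, 1.4)]

print(mean_activity(at_rest), mean_activity(moving))
```

Averaging such values over fixed windows turns a continuous 24/7 recording into an activity time series that can be compared with clinical measures.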
CORRELATION ANALYSIS
[Figure: correlation between our measurements and clinical measures]
DETECTING CHANGE OF MEDICATION
SUMMARY
Results:
- Automatically extracted facial activity has predictive power for diagnosis and symptom severity assessment in schizophrenia
- Speech prosody can be used to monitor depression and schizophrenia
- New technology may enable continuous 24/7 monitoring of patients
We seek to develop:
- Biomarkers for poorly understood mental disorders
- Tools for scientific investigation of the underlying causes and manifestations of the disorders