Nonlinear signal processing and statistical machine learning techniques to remotely quantify Parkinson's disease symptom severity

A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig
9 March 2011

Project background
- Neurological disorders claim lives at an epidemic rate worldwide
- Parkinson's disease (PD) is the second most common neurodegenerative disorder after Alzheimer's disease
- Incidence rate: 20/100,000 annually
- 1% of people over the age of 60 are affected (age is the most important risk factor)
- Aging population → more PD patients
- Drugs and surgery can alleviate symptoms, but there is no known cure

What we already know
- Speech impairment is common in PD (90%)
- Evidence suggests that speech degrades as PD progresses
- Disease symptom tracking is costly, subjective, time-consuming, and requires the patient's physical presence
- Symptoms → the clinician maps them to a clinical metric (inter-rater variability)
- Unified Parkinson's Disease Rating Scale (UPDRS)
- Range: motor-UPDRS 0-108, total-UPDRS 0-176 (44 sections)

Unified Parkinson's Disease Rating Scale (UPDRS)
The UPDRS comprises three components and 44 sections in total; each section is scored in the range 0-4.
- Component 1: Mentation, behavior and mood, 4 sections (1-4). Includes mentation, thought disorder, depression, and motivation/initiative.
- Component 2: Activities of daily living, 13 sections (5-17). Ability to complete daily tasks unassisted, e.g. dressing, walking, writing.
- Component 3: Motor (motor-UPDRS), 27 sections (18-44). Muscle problems, e.g. tremor, rigidity, posture, stability, bradykinesia.
- Section 5 (Speech): the clinician assesses whether the subject's vocal output is understandable during casual discussion.
- Section 18 (Speech): the clinician assesses whether the subject's vocal output is expressive during casual discussion.
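The quoted score ranges follow directly from the section counts: every section is scored 0-4, so the maximum is four times the number of sections. A small arithmetic check (the dictionary keys are illustrative names, not official UPDRS identifiers):

```python
# Section counts per UPDRS component, as listed on the slide.
# The key names are illustrative, not official identifiers.
SECTIONS = {
    "mentation_behavior_mood": 4,
    "activities_of_daily_living": 13,
    "motor": 27,
}
MAX_SCORE_PER_SECTION = 4  # each section is scored 0-4


def max_score(components):
    """Largest possible score over the given components."""
    return MAX_SCORE_PER_SECTION * sum(SECTIONS[c] for c in components)


motor_max = max_score(["motor"])   # 27 sections * 4
total_max = max_score(SECTIONS)    # 44 sections * 4
print(motor_max, total_max)        # 108 176
```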

What this study adds
- Novel mapping of speech dysphonias to UPDRS
- Accurate telemonitoring of PD progression
- Objective machine learning implementations
- Remote monitoring, less time-consuming
- Technology enabling large-scale clinical trials
- Facilitates design and testing of novel drug treatments

Telemedicine: the dawn of a new era

Data
- Untreated idiopathic PD (onset < 5 years at baseline)
- 42 PD subjects (28 males)
- Patient follow-up: UPDRS assessed at baseline, 3 months and 6 months
- Linear interpolation to obtain weekly UPDRS estimates
- 5,875 phonations in total, approximately 140 phonations per subject
- 6 phonations weekly: 4 at comfortable pitch and 2 loud
- Data partitioning: 4,010 signals for males and 1,865 signals for females
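The weekly UPDRS targets come from linear interpolation between the three clinic visits. A minimal sketch of that step, assuming the visits fall at weeks 0, 13 and 26 and using made-up scores (neither the week numbers nor the scores are from the study):

```python
def interpolate_weekly(visit_weeks, visit_scores):
    """Piecewise-linear interpolation of scores at every whole week
    between the first and the last visit."""
    pairs = list(zip(visit_weeks, visit_scores))
    weekly = []
    for week in range(visit_weeks[0], visit_weeks[-1] + 1):
        # Find the two consecutive visits that bracket this week.
        for (w0, s0), (w1, s1) in zip(pairs, pairs[1:]):
            if w0 <= week <= w1:
                frac = (week - w0) / (w1 - w0)
                weekly.append(s0 + frac * (s1 - s0))
                break
    return weekly


# Hypothetical subject: baseline, 3-month and 6-month total-UPDRS.
weekly = interpolate_weekly([0, 13, 26], [20.0, 23.0, 21.0])
print(len(weekly))  # one estimate per week, both end points included
```

Each phonation recorded in a given week would then be paired with that week's interpolated score as its regression target.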

Methodology of this work
Speech signal → Extract feature vector → Feature selection/transformation → Statistical machine learning (mapping features to UPDRS) → Report results

Features
- Classical dysphonia measures (Tsanas et al. IEEE TBME 2010a): capture fundamental frequency changes, amplitude, variability...
- Improved classical dysphonia measures (Tsanas et al. ICASSP 2010b): power transformation of classical schemes leads to improved results
- Wavelets, entropies and energy (Tsanas et al. NOLTA 2010c): wavelet decomposition and application of entropy and linear/nonlinear energy concepts
- Nonlinear dynamical systems theory (Tsanas et al. JRSI 2010d)

Novel features (background)
- Pathological voices exhibit high-frequency noise
- Asynchronous excitation due to incomplete vocal fold closure
- Loss of signal power and increase of noise power → signal-to-noise ratio concepts
- Unstable control of the voice production mechanism, not confined to the vocal folds
- Tentative belief that voice disorders affect different frequency bands differently
- Nonlinearities in voice production

Novel features (algorithms)
- Empirical mode decomposition excitation ratio: the first components represent noise, the remaining components the signal
- Vocal fold excitation ratio: detect glottal pulses and scan the frequency range using 500 Hz bands; 1 Hz - 2.5 kHz denotes signal, the rest is noise (a heuristic finding in this study)
- Glottis quotient: use the DYPSA algorithm to detect glottal pulses and quantify periodicity, a concept borrowed from the classical dysphonia measures

Empirical mode decomposition excitation ratio (EMD-ER)
- The Hilbert-Huang transform decomposes the signal into components known as intrinsic mode functions
- The first intrinsic mode function contains the highest-frequency component in the signal → noise in the signal
- Subsequent components have lower frequencies
- Experiment to find appropriate ratios of components that express the signal-to-noise ratio

Vocal fold excitation ratio (VFER)
- Detect glottal pulses using a robust F0 algorithm
- Scan the frequency range using 500 Hz bands
- Optimize the frequency ranges assigned to signal and noise by experimentation
- Compute ratios and entropy values of the resulting signal-to-noise ratio signals
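The results tables refer to VFER variants computed with the Teager-Kaiser energy operator (TKEO), whose standard discrete form is ψ[n] = x[n]² − x[n−1]·x[n+1]. The sketch below shows only the final ratio step, as a simplified illustration and not the authors' VFER: it assumes the band-passed signals are already available (no glottal pulse detection or filtering is done here), it omits the log transform mentioned in the feature description, and the band layout and test tones are hypothetical.

```python
import math


def tkeo(x):
    """Teager-Kaiser energy operator: x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def mean_tkeo(x):
    values = tkeo(x)
    return sum(values) / len(values)


def noise_to_signal_ratio(bands, split_hz=2500):
    """bands maps the lower edge (Hz) of each band to its band-passed signal.
    Bands below split_hz count as signal, the rest as noise."""
    signal = sum(mean_tkeo(x) for lo, x in bands.items() if lo < split_hz)
    noise = sum(mean_tkeo(x) for lo, x in bands.items() if lo >= split_hz)
    return noise / signal


# Hypothetical stand-ins for band-passed signals: two pure tones.
fs = 16000  # assumed sampling rate in Hz
low_tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
high_tone = [0.1 * math.sin(2 * math.pi * 3200 * n / fs) for n in range(400)]
nsr = noise_to_signal_ratio({0: low_tone, 3000: high_tone})
```

For a pure sinusoid of amplitude A and digital frequency ω, the TKEO is exactly A²·sin²(ω), so the operator weights both amplitude and frequency. This is one reason a weak high-frequency component can still yield a sizeable ratio.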

Feature selection
- Curse of dimensionality
- Reducing the number of features enables a) improved performance, and b) more accurate inference of the underlying characteristics of the modelled system
- LASSO, elastic net, maximum relevance minimum redundancy
- We have recently proposed the relevance, redundancy and complementarity trade-off (RRCT)
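To make the maximum relevance minimum redundancy idea concrete, here is a greedy sketch. It is an illustration under a stated substitution: absolute Pearson correlation stands in for mutual information, which is not what the study used, and the toy data and feature names are hypothetical.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def greedy_mrmr(features, target, k):
    """features: dict name -> list of values. Repeatedly pick the feature
    with the best (relevance - mean redundancy with already chosen ones)."""
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    while len(selected) < k:
        best_name, best_score = None, float("-inf")
        for name, col in features.items():
            if name in selected:
                continue
            if selected:
                redundancy = sum(abs(pearson(col, features[s])) for s in selected)
                redundancy /= len(selected)
            else:
                redundancy = 0.0
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data: "b" is a rescaled copy of "a"; "c" is uncorrelated with "a".
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [3 * v for v in a]
c = [1, -1, -1, 1, 1, -1, -1, 1]
target = [x + 2 * y for x, y in zip(a, c)]
selected = greedy_mrmr({"a": a, "b": b, "c": c}, target, k=2)
```

With this data the second pick is `c`: the duplicate feature is highly relevant on its own but is penalised for being redundant with the feature already chosen.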

Statistical mapping
- y = f(X), where X denotes the features and y the UPDRS
- Random forests: a powerful nonparametric learner
- The feature subsets selected by the feature selection algorithms are used as input to the random forests
- One-standard-error rule to select a parsimonious model
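The one-standard-error rule picks the simplest model whose error is within one standard error of the best model's error. A minimal sketch, with hypothetical candidate models described by (number of features, mean error, standard error); the numbers are invented for illustration:

```python
def one_standard_error_choice(candidates):
    """candidates: list of (n_features, mean_error, std_error) tuples.
    Returns the candidate with the fewest features whose mean error does
    not exceed the best mean error plus that best model's standard error."""
    best = min(candidates, key=lambda c: c[1])
    threshold = best[1] + best[2]
    eligible = [c for c in candidates if c[1] <= threshold]
    return min(eligible, key=lambda c: c[0])


# Hypothetical cross-validation summaries, not results from the study.
candidates = [
    (5, 2.40, 0.10),
    (10, 2.05, 0.08),
    (20, 1.98, 0.09),
    (40, 1.95, 0.09),
]
chosen = one_standard_error_choice(candidates)
print(chosen[0])  # 20: fewest features within one SE of the 40-feature model
```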

Test methods
- Use 10-fold cross validation with 100 repetitions for confidence
- Report the mean absolute error, $\mathrm{MAE} = \frac{1}{L} \sum_{i \in Q} \left| y_i - f(\mathbf{x}_i) \right|$, where $L$ is the number of samples in $Q$, the set containing the indices of the samples in each cross validation repetition
- Statistical significance tests (Kolmogorov-Smirnov, Wilcoxon)
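The evaluation protocol can be sketched as follows. This is an illustrative skeleton only: the study used random forests, whereas here a trivial predictor that outputs the training-set mean is plugged in, and the data are synthetic. The `fit` and `predict` callables are hypothetical hooks for whatever learner is being evaluated.

```python
import random


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def repeated_kfold_mae(X, y, fit, predict, k=10, repetitions=100, seed=0):
    """Mean and standard deviation of the out-of-sample MAE over all folds
    of all repetitions. fit(X_train, y_train) returns a model and
    predict(model, x) returns a prediction for one sample."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    errors = []
    for _ in range(repetitions):
        rng.shuffle(indices)                       # new random partition
        folds = [indices[i::k] for i in range(k)]  # k disjoint test sets
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            preds = [predict(model, X[i]) for i in test_idx]
            errors.append(mean_absolute_error([y[i] for i in test_idx], preds))
    mean = sum(errors) / len(errors)
    std = (sum((e - mean) ** 2 for e in errors) / (len(errors) - 1)) ** 0.5
    return mean, std


# Synthetic data and a baseline that always predicts the training mean.
X = list(range(50))
y = [20 + 0.1 * i for i in X]
fit_mean = lambda X_train, y_train: sum(y_train) / len(y_train)
predict_mean = lambda model, x: model
baseline_mae, baseline_std = repeated_kfold_mae(X, y, fit_mean, predict_mean,
                                                k=10, repetitions=5)
```

A mean-predictor baseline of this kind is also a natural reference point when testing whether a learned model's errors are significantly lower.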

Results
Table 1: Most strongly associated features with motor-UPDRS and total-UPDRS for males (MI: mutual information relevance; R: Spearman correlation)

| Measure | Description | Motor-UPDRS MI | Motor-UPDRS R | Total-UPDRS MI | Total-UPDRS R |
|---|---|---|---|---|---|
| VFER NSR,TKEO | Ratio of the sum of the log-transformed mean TKEO of the band-pass signals for frequencies >2.5 kHz to the sum of the mean TKEO of the band-pass signals for frequencies <2.5 kHz | 0.105 | 0.159 | 0.132 | 0.187 |
| DFA | Characterizes the extent of turbulent noise, quantifying its stochastic self-similarity (Little et al. 2007) | 0.078 | -0.162 | 0.115 | -0.205 |
| 7th MFCC coef | 7th Mel Frequency Cepstral Coefficient (Brookes 2006) | 0.079 | -0.066 | 0.108 | 0.0070 |
| 6th MFCC coef | 6th Mel Frequency Cepstral Coefficient (Brookes 2006) | 0.106 | -0.277 | 0.102 | -0.294 |
| F0,Sun - F0,expected | Mean difference of the cycle-to-cycle estimate (extracted using Sun's algorithm) and the average expected in age- and sex-matched healthy controls | 0.088 | 0.097 | 0.101 | 0.018 |
| Log energy | Estimate of the logarithmic energy (Brookes 2006) | 0.090 | 0.149 | 0.099 | 0.169 |
| 4th MFCC coef | 4th Mel Frequency Cepstral Coefficient (Brookes 2006) | 0.088 | -0.082 | 0.098 | -0.061 |
| 0th MFCC coef | 0th Mel Frequency Cepstral Coefficient (Brookes 2006) | 0.079 | 0.171 | 0.099 | 0.197 |
| 8th MFCC coef | 8th Mel Frequency Cepstral Coefficient (Brookes 2006) | 0.106 | 0.276 | 0.095 | 0.259 |
| 8th delta MFCC coef | 8th delta Mel Frequency Cepstral Coefficient (first derivative of the 8th MFCC) (Brookes 2006) | 0.073 | 0.181 | 0.093 | 0.205 |
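The R columns in the results tables are Spearman correlations, that is, Pearson correlations computed on ranks, with tied values sharing their average rank. A self-contained sketch of that computation (the function names are illustrative; the example inputs are not study data):

```python
def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = rank
        i = j + 1
    return ranks


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Because only ranks matter, any monotonic but nonlinear relation between a dysphonia measure and UPDRS still gives |R| = 1, which is why a rank correlation suits features with skewed distributions.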

Results
Table 2: Most strongly associated features with motor-UPDRS and total-UPDRS for females (MI: mutual information relevance; R: Spearman correlation)

| Measure | Description | Motor-UPDRS MI | Motor-UPDRS R | Total-UPDRS MI | Total-UPDRS R |
|---|---|---|---|---|---|
| Std F0,mixture | Standard deviation of the extracted F0 | 0.205 | 0.475 | 0.216 | 0.470 |
| Std F0,Rapt | Standard deviation of the extracted F0 | 0.174 | 0.437 | 0.195 | 0.434 |
| GQ closed | Standard deviation of the duration that the vocal folds remain closed | 0.211 | 0.236 | 0.195 | 0.250 |
| 0th MFCC coef | 0th delta Mel Frequency Cepstral Coefficient (Brookes 2006) | 0.200 | -0.327 | 0.187 | -0.344 |
| F0,Praat - F0,expected | Mean difference of the cycle-to-cycle estimate (extracted using Praat's algorithm) and the average expected in age- and sex-matched healthy controls | 0.198 | 0.103 | 0.176 | 0.034 |
| 1st MFCC coef | 1st delta Mel Frequency Cepstral Coefficient (Brookes 2006) | 0.135 | -0.047 | 0.170 | -0.031 |
| Log energy | Estimate of the logarithmic energy (Brookes 2006) | 0.179 | -0.458 | 0.170 | -0.487 |
| F0,mixture - F0,expected | Mean difference of the cycle-to-cycle estimate (extracted using the mixture algorithm) and the average expected in age- and sex-matched healthy controls | 0.181 | 0.019 | 0.164 | -0.055 |
| F0,Rapt - F0,expected | Mean difference of the cycle-to-cycle estimate (extracted using Rapt's algorithm) and the average expected in age- and sex-matched healthy controls | 0.173 | 0.022 | 0.158 | -0.054 |
| TKEO(F0),prc5 | 5th percentile of the TKEO of the fundamental frequency values, obtained with the mixture algorithm | 0.177 | -0.411 | 0.153 | -0.369 |

Results
Table 3: Selected feature subset using the LASSO (only 17 measures shown due to lack of space; MI: mutual information relevance; R: Spearman correlation)

MALES (33 dysphonia measures)

| Dysphonia measure | Motor-UPDRS MI | Motor-UPDRS R | Total-UPDRS MI | Total-UPDRS R |
|---|---|---|---|---|
| 6th MFCC coef | 0.106 | -0.277 | 0.102 | -0.294 |
| 8th MFCC coef | 0.106 | 0.276 | 0.095 | 0.259 |
| VFER SNR,TKEO | 0.077 | -0.076 | 0.077 | -0.108 |
| VFER mean | 0.076 | 0.154 | 0.089 | 0.13 |
| 8th delta MFCC | 0.073 | 0.181 | 0.093 | 0.205 |
| 12th delta MFCC | 0.048 | 0.172 | 0.054 | 0.167 |
| 0th MFCC coef | 0.079 | 0.171 | 0.097 | 0.197 |
| 2nd MFCC coef | 0.082 | -0.149 | 0.084 | -0.182 |
| 3rd MFCC coef | 0.071 | 0.091 | 0.077 | 0.067 |
| 2nd delta MFCC | 0.047 | 0.130 | 0.050 | 0.125 |
| 3rd delta MFCC | 0.046 | 0.169 | 0.054 | 0.161 |
| Std F0,Sun | 0.046 | 0.144 | 0.050 | 0.129 |
| 9th MFCC coef | 0.075 | -0.194 | 0.073 | -0.153 |
| 7th MFCC coef | 0.079 | -0.066 | 0.108 | 0.007 |
| 4th delta MFCC | 0.041 | 0.001 | 0.044 | 0.007 |
| GNE SNR,TKEO | 0.023 | 0.074 | 0.024 | 0.089 |
| Shimmer A0,abs | 0.042 | -0.079 | 0.058 | -0.135 |

FEMALES (33 dysphonia measures)

| Dysphonia measure | Motor-UPDRS MI | Motor-UPDRS R | Total-UPDRS MI | Total-UPDRS R |
|---|---|---|---|---|
| Log energy | 0.179 | -0.458 | 0.170 | -0.487 |
| Std F0,Rapt | 0.205 | 0.475 | 0.216 | 0.470 |
| 10th MFCC coef | 0.112 | 0.239 | 0.107 | 0.250 |
| PPE | 0.118 | 0.436 | 0.105 | 0.396 |
| 12th MFCC coef | 0.094 | 0.204 | 0.088 | 0.261 |
| IMF SNR,TKEO | 0.075 | -0.127 | 0.067 | -0.067 |
| 8th MFCC coef | 0.114 | -0.341 | 0.092 | -0.255 |
| 11th MFCC coef | 0.078 | 0.127 | 0.100 | 0.187 |
| IMF NSR,SEO | 0.099 | -0.117 | 0.065 | -0.058 |
| GNE mean | 0.090 | 0.035 | 0.086 | -0.062 |
| 3rd delta MFCC | 0.070 | 0.149 | 0.064 | 0.119 |
| HNR std | 0.072 | 0.224 | 0.066 | 0.195 |
| 5th MFCC coef | 0.113 | 0.173 | 0.115 | 0.188 |
| 2nd delta MFCC | 0.055 | 0.172 | 0.056 | 0.206 |
| GNE SNR,TKEO | 0.036 | 0.038 | 0.042 | 0.033 |
| 10th delta MFCC | 0.071 | -0.064 | 0.066 | -0.079 |
| GQ open | 0.061 | 0.256 | 0.057 | 0.248 |

Results
Table 4: Out-of-sample mean absolute error (MAE) for motor-UPDRS and total-UPDRS

| Target | MAE (UPDRS points) |
|---|---|
| Motor-UPDRS (males) | 1.62 ± 0.17 |
| Total-UPDRS (males) | 1.96 ± 0.23 |
| Motor-UPDRS (females) | 1.72 ± 0.16 |
| Total-UPDRS (females) | 2.20 ± 0.21 |

- Inter-rater variability: 4-5 UPDRS points
- Critically important difference: ~2.5 motor-UPDRS points and ~4.5 total-UPDRS points
- Results are statistically significant (p < 0.001) with respect to the mean UPDRS and baseline UPDRS, using the Kolmogorov-Smirnov and Wilcoxon significance tests

Results 6-month UPDRS-tracking for a male and a female subject

Conclusions
- Speech signals convey clinically useful information
- Fast, accurate, remote, objective monitoring of PD is shown to be possible using speech alone
- Results are better than the inter-rater variability (which is about 5 UPDRS points)
- These results could potentially be improved further in conjunction with other recorded PD signals (e.g. tremor data)

Current and future work
- A robust feature selection algorithm which outperforms popular filters (paper submitted to JMLR in December)
- PD metrics (about to submit a paper to Movement Disorders)
- Apply the novel features to PD versus controls (in preparation for IEEE TBME)
- Currently extending feature selection algorithms using geometrical concepts
- A new machine learning approach for speech F0 estimation, extending our JRSI suggestion (expecting ground truth signals from collaborators)
- Exploring a new approach to random forests

References
- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: "Accurate telemonitoring of Parkinson's disease progression by non-invasive speech tests", IEEE Transactions on Biomedical Engineering, Vol. 57, pp. 884-893, 2010a
- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: "Enhanced classical dysphonia measures and sparse regression for telemonitoring of Parkinson's disease progression", ICASSP, pp. 594-597, Dallas, Texas, US, 14-19 March 2010b
- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: "New nonlinear markers and insights into speech signal degradation for effective tracking of Parkinson's disease symptom severity", International Symposium on Nonlinear Theory and its Applications (NOLTA), pp. 457-460, Krakow, Poland, 5-8 September 2010c (invited)
- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: "Remote tracking of Parkinson's disease progression by extracting novel dysphonia patterns from speech signals", Journal of the Royal Society Interface, (in press) 2010d