HDMM 2017, April 22nd, 2017
MASC: Automatic Sleep Stage Classification Based on Brain and Myoelectric Signals
Yuta Suzuki, Makito Sato, Hiroaki Shiokawa, Masashi Yanagisawa, Hiroyuki Kitagawa
Graduate School of Systems and Information Engineering, University of Tsukuba
International Institute for Integrative Sleep Medicine, University of Tsukuba
Center for Computational Sciences, University of Tsukuba
2 Outline
1. Background and Motivation
2. Our proposal: MASC
3. Evaluation
4. Conclusion
3 Sleep Stages for Clinical Research
- Sleep can generally be divided into several stages: REM (Rapid Eye Movement) sleep, Non-REM sleep, and Wake.
- Sleep staging is the basis for sleep disorder diagnosis and research.
- Patients with sleep disorders usually show abnormal sleep stage transitions.
- Clinical experts can assess the effectiveness of a medication by comparing the stages before and after it is administered.
4 Sleep Staging on Mice
- Clinical experts usually analyze sleep stages in mice to clarify how effective medications are for sleep disorders.
- EEG (electroencephalography): electric signal transitions observed from the brain.
- EMG (electromyography): electric signal transitions taken from the spinal cord.
5 Sleep Stages
- Mice generally exhibit three stages: REM, Non-REM, and Wake.
- The stages differ in the amplitudes of their EEG and EMG signals.
[Figure: example EEG and EMG traces]
6 How do they specify the stages?
- Clinical experts visually inspect all signals to determine the stages.
- They split the EEG/EMG signals into fixed-size subsequences, called epochs, and classify each one. The epoch length is generally set to 20 seconds.
- This is time-consuming and labor-intensive: an expert needs more than 24 hours to classify 8 hours of EEG/EMG signals.
[Figure: EEG and EMG traces split into four 20-second epochs; Epochs 1 and 2 are labeled Wake, Epochs 3 and 4 are labeled Non-REM]
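The epoch splitting described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `split_into_epochs` is hypothetical, the 250 Hz sampling rate is taken from the evaluation settings later in the deck, and dropping a trailing partial epoch is an assumption.

```python
def split_into_epochs(samples, sampling_rate_hz=250, epoch_seconds=20):
    """Split a 1-D signal into consecutive fixed-length epochs.

    Trailing samples that do not fill a whole epoch are dropped
    (an assumption; the slides do not say how a partial epoch is handled).
    """
    epoch_len = sampling_rate_hz * epoch_seconds
    n_epochs = len(samples) // epoch_len
    return [samples[i * epoch_len:(i + 1) * epoch_len] for i in range(n_epochs)]


# One minute of a dummy signal at 250 Hz yields three 20-second epochs
# of 5000 samples each.
signal = [0.0] * (250 * 60)
epochs = split_into_epochs(signal)
print(len(epochs), len(epochs[0]))  # 3 5000

# An 8-hour recording corresponds to this many epochs to label by hand.
epochs_in_8_hours = 8 * 3600 // 20
print(epochs_in_8_hours)  # 1440
```

With 20-second epochs, an 8-hour recording already contains 1440 epochs, which gives a sense of why manual labeling is slow.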
7 Why do they visually inspect?
- Clinical researchers require more than 95% accuracy!
- Existing work:
  - Neural network based method [Yokoyama et al., 1993]
  - Decision tree based method [Hanaoka et al., 2001]
  - LDA + decision tree [Brankack et al., 2010]
  - Naïve Bayes classifier [Rempe et al., 2015]
  - FASTER [Sunagawa et al., 2013], exfaster [Suzuki et al., 2015]
- None of these methods achieves 95% accuracy, so they still require expert interaction.
8 FASTER [Sunagawa et al., 2013]
- Unsupervised, fully automated sleep staging method.
- Observation: epochs in the same sleep stage show similar EEG/EMG spectra.
- FASTER achieves the best performance in terms of accuracy, but it does not reach 95% (it is roughly 80-90%).
[Figure: feature extraction followed by classification into Non-REM, Wake, and REM]
9 Our Goals & Contributions
- Our goal is to achieve 95% classification accuracy without any expert interaction.
- We present MASC, a supervised classification method for sleep staging.
- In this work, we experimentally select effective features for sleep staging.
Contributions
- Accurate: MASC shows almost 95% accuracy.
- Robust: MASC is more robust against noise.
- Automatic: MASC does not require any user interaction.
11 Key Observations
- Observation 1: Actual EEG/EMG signals have several transition patterns that appear frequently. However, FASTER fails to capture these temporal features.
- Observation 2: Some epochs contain multiple stages (ambiguous epochs). FASTER does not handle such ambiguous epochs.
12 Overview of MASC
- MASC: a supervised sleep stage classification method.
- MASC consists of the following three steps:
  - Step 1: Temporal feature construction
  - Step 2: Classification & ambiguous epoch extraction
  - Step 3: Reclassification
[Figure: training data and test data pass through Steps 1 to 3; Step 1 yields temporal features, Step 2 yields ambiguous epochs, and Step 3 yields the results]
14 1. Temporal feature construction
- The goal of this step is to extract a new feature that captures temporal sleep stage transitions.
- To do so, MASC first performs a preliminary classification of the epochs using the same features as FASTER.
[Figure: a One-vs-Rest SVM trained on the training data assigns preliminary stages (e.g., Wake) to the test data; the temporal features are then built from these preliminary stages]
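The One-vs-Rest decision rule used for the preliminary stages can be sketched as follows. This is only an illustration of the rule itself (one binary "stage vs. the rest" scorer per stage, pick the highest score). It does not train an SVM: the linear weights, biases, and the two-dimensional feature vector are hypothetical values chosen for the example.

```python
def decision_value(weights, bias, features):
    """Signed score of one binary 'stage vs. the rest' linear classifier."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def one_vs_rest_predict(classifiers, features):
    """Return (predicted stage, per-stage decision values)."""
    scores = {
        stage: decision_value(weights, bias, features)
        for stage, (weights, bias) in classifiers.items()
    }
    predicted = max(scores, key=scores.get)
    return predicted, scores


# Hypothetical, hand-set classifiers over two made-up features.
# A real system would learn these from expert-labeled training epochs.
classifiers = {
    "Wake": ([1.0, 1.0], -1.0),
    "Non-REM": ([1.0, -1.0], 0.0),
    "REM": ([-1.0, -1.0], 0.5),
}

stage, scores = one_vs_rest_predict(classifiers, [0.2, 2.0])
print(stage)  # Wake
```

The per-stage decision values are kept alongside the label because the later confidence step needs them.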
15 How do we build temporal features?
- MASC builds temporal features from the preliminary stages of the k epochs before and the k epochs behind the target epoch.
[Figure: Epochs 1 to 7 with preliminary stages Wake, Wake, Non-REM, Non-REM, Non-REM, Non-REM, Non-REM; the target epoch is flanked by the k epochs before it and the k epochs behind it]
16 Three types of temporal features
Example (k = 3, target epoch: Epoch 4): the preliminary stages of Epochs 1 to 7 are Wake, Wake, Non-REM, Non-REM, Non-REM, Non-REM, Non-REM.
1. String-based: the sequence of preliminary stages around the target epoch.
   {Wake, Wake, Non-REM, Non-REM, Non-REM, Non-REM}
2. Probability-based: the ratio of each sleep stage in the k epochs before (-) and the k epochs behind (+) the target epoch.
   {Wake-/k, Non-REM-/k, REM-/k, Wake+/k, Non-REM+/k, REM+/k} = {0.67, 0.33, 0, 0, 1, 0}
3. Weighted probability-based: the probability-based feature with a Gaussian-based weight function applied.
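The three feature types can be sketched on the Epoch 1 to 7 example from this slide. This is a minimal sketch under stated assumptions: all function names are hypothetical, windows are simply truncated at the start and end of the recording, ratios are normalized by the window's total weight (which equals k for a full, unweighted window), and the Gaussian weight with `sigma=1.0` is a guess, since the slides only say "Gaussian-based".

```python
import math

STAGES = ("Wake", "Non-REM", "REM")


def split_window(stages, target, k):
    """Preliminary stages of the k epochs before and the k epochs behind `target`."""
    before = stages[max(0, target - k):target]
    behind = stages[target + 1:target + 1 + k]
    return before, behind


def string_feature(stages, target, k):
    before, behind = split_window(stages, target, k)
    return before + behind


def stage_ratios(window, weights):
    """Weighted share of each stage in a window, in STAGES order."""
    total = sum(weights)
    if total == 0:
        return [0.0] * len(STAGES)
    return [
        sum(w for s, w in zip(window, weights) if s == stage) / total
        for stage in STAGES
    ]


def probability_feature(stages, target, k):
    before, behind = split_window(stages, target, k)
    return (stage_ratios(before, [1.0] * len(before))
            + stage_ratios(behind, [1.0] * len(behind)))


def weighted_probability_feature(stages, target, k, sigma=1.0):
    # Assumed weight: epochs closer to the target count more.
    def gauss(distance):
        return math.exp(-(distance ** 2) / (2 * sigma ** 2))

    before, behind = split_window(stages, target, k)
    w_before = [gauss(len(before) - i) for i in range(len(before))]
    w_behind = [gauss(i + 1) for i in range(len(behind))]
    return stage_ratios(before, w_before) + stage_ratios(behind, w_behind)


prelim = ["Wake", "Wake", "Non-REM", "Non-REM", "Non-REM", "Non-REM", "Non-REM"]
target = 3  # Epoch 4, zero-based

print(string_feature(prelim, target, 3))
print([round(x, 2) for x in probability_feature(prelim, target, 3)])
# [0.67, 0.33, 0.0, 0.0, 1.0, 0.0]
print([round(x, 2) for x in weighted_probability_feature(prelim, target, 3)])
```

In the weighted variant the Non-REM epoch directly before the target outweighs the two more distant Wake epochs, which is the intended effect of distance-based weighting.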
17 Which feature is the best?

Non-REM accuracy (%)
Feature                      k = 1          k = 2          k = 3          k = 4          k = 5
String-based                 91.13 ± 2.67   92.38 ± 2.02   92.58 ± 2.51   88.30 ± 5.29   89.97 ± 2.94
Probability-based            93.35 ± 1.66   93.94 ± 1.49   93.09 ± 1.53   94.19 ± 1.58   94.20 ± 1.55
Weighted probability-based   93.35 ± 1.66   93.36 ± 1.65   93.40 ± 1.61   93.65 ± 1.58   93.80 ± 1.52

REM accuracy (%)
Feature                      k = 1          k = 2          k = 3          k = 4          k = 5
String-based                 92.93 ± 3.17   92.72 ± 3.00   93.09 ± 2.95   94.04 ± 2.83   94.54 ± 2.12
Probability-based            96.03 ± 2.97   95.30 ± 2.89   94.75 ± 3.14   94.31 ± 3.23   94.05 ± 3.36
Weighted probability-based   96.03 ± 2.97   96.01 ± 2.95   95.89 ± 2.77   95.50 ± 2.80   95.25 ± 2.91

- The weighted probability-based feature with k=3 shows well-balanced and better performance across the sleep stages.
- However, the probability-based feature with k=5 is the best for classifying Non-REM.
- Also, the probability-based feature with k=1 is the best for REM.
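18 2. Classification & ambiguous epochs extraction
- MASC trains a One-vs-Rest SVM on the training data using the weighted probability-based features (k=3).
- The One-vs-Rest SVM consists of a Wake classifier, a Non-REM classifier, and a REM classifier; each separates its own stage (positive) from the others (negative).
- MASC evaluates the confidence of each classified epoch and separates the high-confidence epochs from the ambiguous epochs.
[Figure: example epochs (epoch1: Wake, epoch2: REM, epoch3: Wake) placed relative to the three classifiers and split into high-confidence epochs and ambiguous epochs]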
19 How do we find ambiguous epochs?
- Ambiguous epochs contain multiple types of sleep stages.
- Such epochs should lie on the positive side of multiple classifiers.
- We evaluate $D(e)$ for each epoch $e$ to detect ambiguous epochs:
  $D(e) = \min |d_i - d_j| \quad \text{s.t.} \quad d_i, d_j \in \{d_W, d_N, d_R\},\ i \neq j$
  where $d_W$, $d_N$, and $d_R$ are the distances of the epoch from the Wake, Non-REM, and REM classifiers.
[Figure: an epoch and its distances d_W, d_N, and d_R to the Wake, Non-REM, and REM classifiers, each of which separates its stage (positive) from the others (negative)]
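A small sketch of the ambiguity score, assuming D(e) is the smallest absolute gap between any two of the three classifier distances d_W, d_N, and d_R. The function names, the example distances, and the strict comparison against the threshold are assumptions made for illustration; the threshold of 4 is the value quoted on the following slide.

```python
from itertools import combinations


def ambiguity_score(distances):
    """Smallest absolute gap between any two classifiers' distances.

    `distances` maps each stage to the epoch's signed distance from that
    stage's classifier. A small score means two classifiers rate the epoch
    almost equally, so the epoch is hard to assign to a single stage.
    """
    return min(abs(a - b) for a, b in combinations(distances.values(), 2))


def is_ambiguous(distances, threshold=4.0):
    # Strict '<' is an assumption; the slides do not say whether the
    # threshold itself counts as ambiguous.
    return ambiguity_score(distances) < threshold


# Made-up distances for two epochs.
clear_epoch = {"Wake": -3.0, "Non-REM": 1.0, "REM": 6.5}
unclear_epoch = {"Wake": -5.0, "Non-REM": 2.5, "REM": 3.0}

print(ambiguity_score(clear_epoch), is_ambiguous(clear_epoch))      # 4.0 False
print(ambiguity_score(unclear_epoch), is_ambiguous(unclear_epoch))  # 0.5 True
```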
20 What's going on?
[Figure: three histograms of D(e) (x-axis: D(e) from 0 to 10, y-axis: frequency) for the epochs classified as Non-REM, as Wake, and as REM; each distinguishes correctly classified from misclassified epochs and marks the ambiguous epochs]
- We should re-classify REM epochs with a small D(e), using 4 as the threshold.
21 3. Re-classification
- MASC re-classifies the ambiguous epochs using probability-based features with k=5.
- REM and Non-REM have very similar features, so misclassified REM epochs are more likely to be Non-REM than Wake.
- Recall that probability-based features with k=5 give the best Non-REM accuracy.
[Figure: a One-vs-Rest SVM trained on the training data with probability-based features (k=5) re-classifies the ambiguous epochs and produces the final results]
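The routing between the first classification and the re-classification can be sketched as below. This is an illustration under assumptions, not the authors' implementation: `reclassify_ambiguous` and `second_stage` are hypothetical names, only epochs first labeled REM are revisited, and an epoch counts as ambiguous when its score is strictly below the threshold of 4.

```python
def reclassify_ambiguous(first_labels, ambiguity_scores, second_stage, threshold=4.0):
    """Keep confident first-pass labels and re-label ambiguous REM epochs.

    `second_stage` is a hypothetical callable that takes an epoch index and
    returns a stage; in MASC this role is played by the classifier that uses
    probability-based features with k=5.
    """
    final_labels = list(first_labels)
    for index, (label, score) in enumerate(zip(first_labels, ambiguity_scores)):
        if label == "REM" and score < threshold:
            final_labels[index] = second_stage(index)
    return final_labels


# Made-up first-pass labels and ambiguity scores for four epochs.
first_pass = ["Wake", "REM", "REM", "Non-REM"]
scores = [7.0, 1.2, 6.3, 0.8]


def always_non_rem(index):
    # Stand-in second-stage classifier used only for this example.
    return "Non-REM"


print(reclassify_ambiguous(first_pass, scores, always_non_rem))
# ['Wake', 'Non-REM', 'REM', 'Non-REM']
```

Only the second epoch changes: it was labeled REM with a low score, while the third epoch stays REM because its score is above the threshold.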
23 Evaluation Settings
- Competitors: MASC, FASTER [Sunagawa et al., 2013], exfaster [Suzuki et al., 2015]
- Datasets: two types of mice provided by IIIS at the University of Tsukuba
  - Noiseless mice (wild type): normal mice without any medication
  - Noisy mice: abnormal mice that have sleep disorders
- Each epoch is labeled by clinical experts.
- Number of mice: 14 for each type
- Sampling rate: 250 Hz; epoch size: 20 sec.
- In total, we used 240K epochs.
24 Example: Noisy Mice
25 Evaluation 1: Noiseless Mice

Metric                MASC             FASTER           exfaster
Non-REM Sensitivity   94.37 ± 1.44%    89.49 ± 0.41%    93.79 ± 0.21%
Non-REM Specificity   96.63 ± 1.30%    94.02 ± 0.73%    93.87 ± 0.54%
REM Sensitivity       94.74 ± 2.86%    78.36 ± 1.00%    79.72 ± 0.59%
REM Specificity       97.37 ± 0.94%    98.33 ± 0.06%    99.08 ± 0.05%
Wake Sensitivity      95.42 ± 2.25%    94.61 ± 0.72%    94.79 ± 0.59%
Wake Specificity      98.09 ± 0.93%    92.31 ± 0.36%    95.26 ± 0.20%
Accuracy              94.76 ± 1.02%    91.09 ± 0.21%    93.41 ± 0.21%
26 Evaluation 2: Noisy Mice

Metric                MASC             FASTER            exfaster
Non-REM Sensitivity   94.19 ± 5.47%    91.26 ± 6.01%     95.02 ± 2.49%
Non-REM Specificity   93.71 ± 2.07%    79.04 ± 25.86%    78.65 ± 25.44%
REM Sensitivity       92.95 ± 7.11%    88.95 ± 6.65%     88.08 ± 6.08%
REM Specificity       96.01 ± 2.60%    97.24 ± 3.03%     97.89 ± 3.02%
Wake Sensitivity      91.10 ± 4.98%    75.55 ± 28.85%    75.40 ± 28.64%
Wake Specificity      99.50 ± 0.34%    94.79 ± 4.01%     97.14 ± 1.71%
Accuracy              92.40 ± 3.31%    82.32 ± 13.39%    83.93 ± 14.69%
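For readers less familiar with the metrics in the two evaluation tables, the sketch below shows the standard definitions of per-stage sensitivity and specificity (each stage treated as positive against the other two) and of overall accuracy. The function names and the eight example labels are made up for illustration and are not taken from the datasets.

```python
def stage_metrics(true_labels, predicted_labels, stage):
    """Sensitivity and specificity for one stage, treated as the positive class."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == stage and predicted == stage:
            tp += 1
        elif truth == stage:
            fn += 1
        elif predicted == stage:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


def accuracy(true_labels, predicted_labels):
    """Fraction of epochs whose predicted stage matches the expert label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up expert labels and predictions for eight epochs.
expert = ["Wake", "Wake", "Non-REM", "Non-REM", "Non-REM", "REM", "REM", "Wake"]
predicted = ["Wake", "Non-REM", "Non-REM", "Non-REM", "Non-REM", "REM", "Non-REM", "Wake"]

print(stage_metrics(expert, predicted, "REM"))      # (0.5, 1.0)
print(stage_metrics(expert, predicted, "Non-REM"))  # (1.0, 0.6)
print(accuracy(expert, predicted))                  # 0.75
```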
28 Conclusion
- We present MASC, a machine learning approach with a feature engineering technique for sleep staging.
Contributions
- Accurate: MASC shows almost 95% accuracy.
- Robust: MASC is more robust against noise.
- Automatic: MASC does not require any user interaction.