Predicting Seizures in Intracranial EEG Recordings

Similar documents
Predict seizures in intracranial EEG recordings

THE data used in this project is provided. SEIZURE forecasting systems hold promise. Seizure Prediction from Intracranial EEG Recordings

Epileptic Dogs: Advanced Seizure Prediction

Seizure Prediction and Detection

Feature Parameter Optimization for Seizure Detection/Prediction

Forecasting Seizures in Dogs with Naturally Occurring Epilepsy

Epileptic Seizure Prediction by Exploiting Signal Transitions Phenomena

Seizure Detection/Prediction Devices and Therapies

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

Epileptic seizure detection using linear prediction filter

Optimal preictal period in seizure prediction

Exploration of Instantaneous Amplitude and Frequency Features for Epileptic Seizure Prediction

Linear Features, Principal Component Analysis, and Support Vector Machine for Epileptic Seizure Prediction Progress

EEG signal classification using Bayes and Naïve Bayes Classifiers and extracted features of Continuous Wavelet Transform

We are IntechOpen, the world s leading publisher of Open Access books Built by scientists, for scientists. International authors and editors

Crowdsourcing reproducible seizure forecasting in human and canine epilepsy

A micropower support vector machine based seizure detection architecture for embedded medical devices

Towards improved design and evaluation of epileptic seizure predictors

The seizure prediction characteristic: a general framework to assess and compare seizure prediction methods

Gene Selection for Tumor Classification Using Microarray Gene Expression Data

An ensemble learning approach to detect epileptic seizures from long intracranial EEG recordings.

Minimum Feature Selection for Epileptic Seizure Classification using Wavelet-based Feature Extraction and a Fuzzy Neural Network

A Machine Learning Framework for Space Medicine Predictive Diagnostics with Physiological Signals

Database of paroxysmal iceeg signals

EPILEPTIC SEIZURE PREDICTION

Automatic Seizure Detection using Inter Quartile Range

Detection of Epileptic Seizure

Epileptic Seizure Classification using Statistical Features of EEG Signal

MULTICLASS SUPPORT VECTOR MACHINE WITH NEW KERNEL FOR EEG CLASSIFICATION

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

Clinical Neurophysiology

Seizure Prediction Through Clustering and Temporal Analysis of Micro Electrocorticographic Data

Using of the interictal EEGs for epilepsy diagnosing

Predicting Sleep Using Consumer Wearable Sensing Devices

Statistical analysis of epileptic activities based on histogram and wavelet-spectral entropy

Current Literature In Clinical Science. Predicting Seizures: Are We There Yet?

Epileptic seizure detection using EEG signals by means of stationary wavelet transforms

Epileptic Seizure Classification of EEG Image Using ANN

Classification of EEG signals using Hyperbolic Tangent - Tangent Plot

Efficient Feature Extraction and Classification Methods in Neural Interfaces

Discrimination between ictal and seizure free EEG signals using empirical mode decomposition

Seizure Detection and Prediction Through Clustering and Temporal Analysis of Micro Electrocorticographic Data

An analysis of the use of animals in predicting human toxicology and drug safety: a review

Epilepsy is the fourth most common neurological disorder

Correlation analysis of seizure detection features

Seizure Prediction using Hilbert Huang Transform on Field Programmable Gate Array

Generic Single-Channel Detection of Absence Seizures

arxiv: v1 [cs.lg] 4 Feb 2019

J2.6 Imputation of missing data with nonlinear relationships

Real-time Seizure Detection System using Multiple Single-Neuron Recordings

Epilepsy Seizure Detection in EEG Signals Using Wavelet Transforms and Neural Networks

MLE #8. Econ 674. Purdue University. Justin L. Tobias (Purdue) MLE #8 1 / 20

Onset Detection in Epilepsy Patients: A Systems Approach

Reactive agents and perceptual ambiguity

Assigning B cell Maturity in Pediatric Leukemia Gabi Fragiadakis 1, Jamie Irvine 2 1 Microbiology and Immunology, 2 Computer Science

Epileptic Seizure Classification Using Neural Networks With 14 Features

Epilepsy Disorder Detection from EEG Signal

ONLINE SEIZURE DETECTION IN ADULTS WITH TEMPORAL LOBE EPILEPSY USING SINGLE-LEAD ECG

Online Seizure Prediction Using An Adaptive Learning Approach

Predicting Breast Cancer Survivability Rates

A GENETIC APPROACH TO SELECTING THE OPTIMAL FEATURE FOR EPILEPTIC SEIZURE PREDICTION

Fractal spectral analysis of pre-epileptic seizures in terms of criticality

Est-ce que l'eeg a toujours sa place en 2019?

The EEG and Epilepsy in Kelantan --- A Hospital/laboratory... Based Study

TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS)

Automated Detection of Videotaped Neonatal Seizures of Epileptic Origin

Empirical Mode Decomposition based Feature Extraction Method for the Classification of EEG Signal

Detection of Pre-stage of Epileptic Seizure by Exploiting Temporal Correlation of EMD Decomposed EEG Signals

Cerebral Cortex. Edmund T. Rolls. Principles of Operation. Presubiculum. Subiculum F S D. Neocortex. PHG & Perirhinal. CA1 Fornix CA3 S D

REDUCED-COMPLEXITY EPILEPTIC SEIZURE PREDICTION WITH EEG

Applying Data Mining for Epileptic Seizure Detection

COMP9444 Neural Networks and Deep Learning 5. Convolutional Networks

Early Detection of Seizure With a Sequential Analysis Approach

EEG-Based Emotion Recognition via Fast and Robust Feature Smoothing

Automatic Epileptic Seizure Onset Detection Using Matching Pursuit A case Study

Classification of Epileptic Seizure Predictors in EEG

Predicting Non-Seizure Regions to Expedite EEG Analysis

Efficient Feature Extraction and Classification Methods in Neural Interfaces

EEG Analysis Using Neural Networks for Seizure Detection

Feasibility Study of the Time-variant Functional Connectivity Pattern during an Epileptic Seizure

EEG Signal Classification using Fusion of DWT, SWT and PCA Features

Performance Analysis of Epileptic Seizure Detection Using DWT & ICA with Neural Networks

An IoT-based Drug Delivery System for Refractory Epilepsy

Error Detection based on neural signals

Seizure Detection Challenge The Fitzgerald team solution

Annotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation

HURST PARAMETER ESTIMATION FOR EPILEPTIC SEIZURE DETECTION

Classifying Substance Abuse among Young Teens

Saichiu Nelson Tong ALL RIGHTS RESERVED

One-shot Learning for ieeg Seizure Detection Using End-to-end Binary Operations: Local Binary Patterns with Hyperdimensional Computing

Genetic Algorithms and their Application to Continuum Generation

RBF Based Responsive Stimulators to Control Epilepsy

Investigating the performance of a CAD x scheme for mammography in specific BIRADS categories

FREQUENCY DOMAIN BASED AUTOMATIC EKG ARTIFACT

Intelligent Edge Detector Based on Multiple Edge Maps. M. Qasim, W.L. Woon, Z. Aung. Technical Report DNA # May 2012

Large-Scale Statistical Modelling via Machine Learning Classifiers

Analysis of EEG Signal for the Detection of Brain Abnormalities

EEG Signal Processing for Epileptic Seizure Prediction by Using MLPNN and SVM Classifiers

MITOCW conditional_probability

Surgical Decision Making in Temporal Lobe Epilepsy by Heterogeneous Classifier Ensembles

Transcription:

Sining Ma, Jiawei Zhu sma87@stanford.edu, jiaweiz@stanford.edu Abstract If seizure forecasting systems could reliably identify periods of increased probability of seizure occurrence, patients who suffer from epilepsy would be able to avoid dangerous activities and lead more normal lives. The goal of this project is to differentiate between the preictal and interictal states by analyzing intracranial EEG recordings. Data for each hour are organized into six ten-minute time sequences. and support vector machines () are applied to each time sequence to calculate the average test false negative rate. The idea of combining data from various time sequences is also experimented with to examine if false negative rates can be reduced. The results show that provides better prediction results for patients, and that combining training examples from different time sequences does not help improve prediction results. Since the pathogenesis of epilepsy may vary across different species, applying the same training models to both dogs and patients may be problematic. Keywords seizure, classification, logistic regression, support vector machines () I. INTRODUCTION Epilepsy, characterized by the occurrence of spontaneous seizures, afflicts nearly % of the population worldwide. Epilepsy sometimes leads to loss of consciousness and control of bowel or bladder function, creating not only the risk of serious injury, but also an intense feeling of helplessness that strongly impacts the everyday lives of epileptic patients. If computational algorithms could reliably predict seizure occurrences, devices designed to warn patients of impending seizures would help patients avoid potentially dangerous activities. Also, medications could be taken only when necessary to reduce overall side effects. An epileptic patient s brain activity can be classified into 4 states: Interictal (between seizures, or baseline), Preictal (prior to seizure), Ictal (seizure), and Post-ictal (after seizures). The primary challenge in seizure forecasting is to differentiate between the preictal and interictal states. Past research on seizure forecasting based on EEG has pointed out two difficulties. First, preictal and interictal EEG patterns across patients (and dogs) vary considerably. Second, EEG is highly complex and varies over time. II. OBJECTIVE This project represents an attempt to demonstrate the existence and accurate classification of the preictal brain state in dogs and humans with naturally occurring epilepsy. III. DATA Data used in this project, which we obtained from a Kaggle competition, are intracranial electroencephalography (EEG) recordings sampled from dogs and patients. The training data are organized into ten-minute EEG clips labeled "Preictal" for preseizure data segments, or "Interictal" for non-seizure data segments. Both the preictal and interictal training data segments are numbered sequentially. The sequence, numbered - 6, represents the index of the data segment within an hour. For example, the sequence 6 preictal data segment represents the EEG data between the 5st minute and the 6th minute in the hour before a seizure occurrence. The test data, which are preictal, are also organized into tenminutes EEG clips but are provided in random order. This allows us to pick any test data segment to verify the accuracy of every time sequence s training model. Preictal training and test data segments cover one hour prior to seizure with a five-minute seizure horizon (Figure ). Similarly, one hour sequences of interictal ten-minute data segments are provided. A. EEG recording for dogs Intracranial EEG for dogs are sampled from 6 electrodes at 4 Hz. Thus, the canine data contain 6 features, and the data for each ten-minute time sequence contain 6 24, data points. Each

CS229 Final Project, 24 Fall A. IV. MODELS This is a binary classification problem, so the first intuitive choice is logistic regression with sigmoid function h θ (x). We apply stochastic gradient ascent to the following equation: θ j := θ j + α(y (i) h θ (x (i) ))x (i) j Fig. : EEG recordings cover one hour prior to seizure with a five-minute seizure horizon data point represents the EEG recording for electrode i at some time t. The competition provided EEG recordings for 5 dogs. We choose to use the first hour (6 sequences) training data for dogs and 5. where h θ (x) = g(θ T x) = + e θt x A fixed learning rate α is used, and the implementation slowly decreases α to some value around zero to ensure the algorithm converges to the global minimum. B. Support Vector Machine () The goal of classification is to establish and test a mapping x y from EEG spectral features to either a preictal or an interictal label. To achieve this goal, we use C-support vector classification which solves the following primal optimization problems: min w,b,ξ 2 wt w + C l i= ξ i subject to y i (w T φ(x i ) + b) ξ i, ξ i, i =,..., l, Fig. 2: EEG recordings cover one hour prior to seizure with a five-minute seizure horizon B. EEG recording for patients Intracranial EEG for patients are sampled from 5 electrodes at 5, Hz. Thus, the human data contain 5 features, and the data for each tenminute time sequence contain 5 3, data points. The competition provided EEG recordings for 2 patients. We choose to use the first hour (6 sequences) training data for patient. where φ(x i ) maps x i into a higher-dimension space and C > is the regularization parameter. We use the radial basis function (RBF) kernel: K(x i, x j ) = exp( γ x i x j 2 ) There are two model parameters: C and the kernel factor γ. Using the grid search approach, we try all combinations of C and γ, and select the combination that produces the maximum classification accuracy. The search range for log C is from - to 6, and for log γ is from -3 to 4. We then run C- with the selected C and γ. We use the same C and γ for each (individual and combined) time sequence. 2

CS229 Final Project, 24 Fall C. Calculating False Negative Rates Data for dogs and patients are categorized as either preictal or interictal. For logistic regression, we label preictal data examples as positive (""), and interictal data examples as negative (""). For, we label preictal data examples as positive (""), and interictal examples as negative ("-"). All test examples should be predicted as preictal, which is positive (""). We are interested in the false negative rate, which is the fraction of the preictal test examples that the algorithm incorrectly predicts to be interictal. A small false negative rate implies a high probability of seizure occurrences. In the following equation, R F N stands for false negative rate. R F N = # of test examples predicted to be interictal # of test examples We apply logistic regression to a time sequence times using different test data segments, and compute the average of the false negative rates. We repeat this step for the other five time sequence and compare the average false negative rates. (For, the process is the same except we use 5 different test data segments). The time sequence with the lowest average false negative rate is the best in predicting seizures. We then combine some continuous or randomly selected tenminute sequences, and run the algorithms again, calculate and compare the average false negative rates to observe whether combining time sequences improves the accuracy of seizure predictions. V. RESULTS A. parameter selection Because of the extremely long training time, we select the last 6 8, examples in the canine data, and the last 5, examples in the human data to grid-search the optimal values of C and γ. C γ Accuracy Dog 8 2 77.899% Dog 5 8 2 84.923% Patient 2.% TABLE I: Cross Validation Results for C and γ B. Dogs We use all the canine data, 6 24,, to run logistic regression. Due to the aforementioned time constraint, we use only the last 6 8, canine training examples in each time sequence to run. We then combine sequences, 6 and sequences 3, 6 to examine if the accuracy can be improved. See Figure 3, 4, and Table II for Dog and Dog 5 results. 2 3 4 5 6 (a) Sequences to 6 4&5 5&6 (b) Combined Sequences Fig. 3: Dog Results Dog seq 87635.999345 seq 2.35574 358 seq 3 5336 7742 seq 4.36352 7995 seq 5 484 827 seq 6 8698.93298 seq &6.394584.977588 seq 3&6 28.95888 Dog 5 seq 9379.999345 seq 2 46686 358 seq 3.366449 7742 seq 4.526572 7995 seq 5.55742 827 seq 6.3765.93298 seq &6.37788.977588 seq 3&6.385755.95888 TABLE II: Dogs and 5 Results 3

CS229 Final Project, 24 Fall C. Patients 2 3 4 5 6 (a) Sequences to 6 4&5 5&6 (b) Combined Sequences Fig. 4: Dog 5 Results We use all the human data, 5 3,,, to run logistic regression. Due to the aforementioned time constraint, we use only the last 5, training examples in each time sequence to run. We then combine sequences 4, 5 and sequences 5, 6 to examine if the accuracy can be improved. See Figure 5 and Table III for Patient results. 2 3 4 5 6 4&5 5&6 Fig. 5: Histogram of Patients Results VI. DISCUSSION & CONCLUSION Based on the results of dogs and patients, we observe that combining time sequences does not Patient seq.527982.8346 seq 2.543429 2566 seq 3 86822.5524 seq 4 5355 6868 seq 5 86 582 seq 6.523835 seq 4&5 83.93442 seq 5&6.5485.5538 TABLE III: Patient Results improve the results. None of the combined sequences produce false negative rates smaller than the minimum of the individual sequences false negative rates. Although this contradicts the intuitive conjecture that adding more data should improve results, the finding is not entirely surprising because previous research has pointed out that EEG is highly complex and varies over time. If there is little correlation between any time sequences, continuous or random selected, then combining sequences would only add noise to each model and should not be expected to yield better predictions. Interestingly, while sequences, 3, 6 in the human data give low false negative rates, sequences, 3, 6 in the canine data give surprisingly high false negative rates (close to ). Our surmise is that logistic regression may not have classified the preictal and interictal states correctly because we omitted the process of cross validation before performing logistic regression. We observe some patterns from the results of the two dogs. The logistic regression false negative rates fluctuate (increase, decrease, increase, and finally decrease), while the false negative rates exhibit an entirely opposite, and much more volatile trend. These trends may be evidence that pre-seizure brain activities exhibit a high degree of fluctuation. If future medical devices can detect such fluctuation, then warning epileptic patients of impending seizures may be achievable. This approach may be more effective in forecasting seizures than solely focusing on the false negative rates of specific time sequences. According to the results of patient, outperforms logistic regression as expected. results are all < 25%. This means that we are able to predict seizures with reasonable accuracy. However, all logistic regression false negative rates are around 5%. Such relative "uniformity" makes it difficult to interpret which time sequences are better in seizure 4

CS229 Final Project, 24 Fall forecasting than others. The false negative rate for sequence 6 is surprising. However, since sequence 6 closely precedes seizures, the brain s behavior is expected to differ drastically from that during nonseizure periods, thus the low false negative error. After comparing the results between dogs and patient, we observe that sequence 6 has the lowest false negative rate for patient but not for dogs. This inconsistency implies that the pathogenesis of epilepsy may be different across different species. Hence, applying the same training models to both dogs and patients may be problematic. [3] Y. Park, L. Luo, K. Parhi and T. Netoff (2). Seizure prediction with spectral power of EEG using cost-sensitive support vector machines. Epilepsia 52:76-77. [4] J. Howbert, E. Patterson, S. Stead, B. Brinkmann, V. Vasoli, D. Crepeau, C. Vite and B. Sturges (24, Jan). Forecasting Seizures in Dogs with Naturally Occurring Epilepsy. PLoS One 9():e892. [5] Snyder DE, Echauz J, Grimes DB, Litt B (28) The statistics of a practical seizure warning system. J Neural Eng 5: 392?4. doi:.88/74-256/5/4/4 [6] American Epilepsy Society Seizure Prediction Challenge. [Online] https://www.kaggle.com/c/seizure-prediction VII. FUTURE This competition also provided data for another three dogs and one patient. Furthermore, for each dog and each patient, multiple hours worth of data are available. We could run the same algorithms several more times to observe patterns and anomalies. The new results could be used to verify our conclusions. Instead of 6 sequences, we could divide an hour into 2 or more sequences. That would give us a better understanding of the pathogenesis of epileptic seizures and the variation of the prediction results within shorter time intervals. Logistic regression does not work well for patient. The reason might be that human brains are much more complicated than canine brains. We could apply logistic regression with regularization to ameliorate the high variance problem. does not work well for dogs at sequences, 3 and 6. The cause is the large number of support vectors. Since the the number of support vectors determines the maximum number of misclassified data points, we could apply ν- instead of C-, and set ν =.5. This approach would force only half of our data to be support vectors and implicitly adapt C. REFERENCES [] C. Hsu, C. Chang and C. Lin. (2, April). A Practical Guide to Support Vector Classification. [Online] http://www.csie.ntu.edu.tw/ cjlin/papers/guide/guide.pdf. [2] R. Andrzejak, D. Chicharro, C. Elger and F. Mormann (29, July). Seizure prediction: Any better than chance? Clin Neurophysiol. 5