Adaptive Treatment of Epilepsy via Batch Mode Reinforcement Learning

Similar documents
Adaptive Treatment of Epilepsy via Batch-mode Reinforcement Learning

Recurrent Boosting for Classification of Natural and Synthetic Time-Series Data

Efficient Feature Extraction and Classification Methods in Neural Interfaces

Neuromodulation in Epilepsy. Gregory C. Mathews, M.D., Ph.D.

Sequential Decision Making

Lecture 13: Finding optimal treatment policies

Design Considerations and Clinical Applications of Closed-Loop Neural Disorder Control SoCs

Automatic learning of adaptive treatment strategies. using kernel-based reinforcement learning

Challenges in Developing Learning Algorithms to Personalize mhealth Treatments

SUPPLEMENTARY INFORMATION. Table 1 Patient characteristics Preoperative. language testing

Towards Learning to Ignore Irrelevant State Variables

Learning to Identify Irrelevant State Variables

Lecture 5: Sequential Multiple Assignment Randomized Trials (SMARTs) for DTR. Donglin Zeng, Department of Biostatistics, University of North Carolina

PCA Enhanced Kalman Filter for ECG Denoising

Error Detection based on neural signals

Toward Extending Auditory Steady-State Response (ASSR) Testing to Longer-Latency Equivalent Potentials

Computational Neuroscience. Instructor: Odelia Schwartz

Clinical Commissioning Policy: Deep Brain Stimulation for Refractory Epilepsy

rtms Versus ECT The Future of Neuromodulation & Brain Stimulation Therapies

arxiv: v2 [cs.lg] 1 Jun 2018

Shadowing and Blocking as Learning Interference Models

Feature Parameter Optimization for Seizure Detection/Prediction

Cognitive Neuroscience History of Neural Networks in Artificial Intelligence The concept of neural network in artificial intelligence

arxiv: v1 [cs.lg] 4 Feb 2019

Oxford Foundation for Theoretical Neuroscience and Artificial Intelligence

Dynamic Control Models as State Abstractions

Seizure Prediction and Detection

Modeling Individual and Group Behavior in Complex Environments. Modeling Individual and Group Behavior in Complex Environments

Sparse Coding in Sparse Winner Networks

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018

Sabrina Jedlicka 9/20/2010 NEUROENGINEERING

Applying Data Mining for Epileptic Seizure Detection

Jitter-aware time-frequency resource allocation and packing algorithm

Application of Tree Structures of Fuzzy Classifier to Diabetes Disease Diagnosis

Learning and Adaptive Behavior, Part II

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

Artificial Neural Networks (Ref: Negnevitsky, M. Artificial Intelligence, Chapter 6)

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine

Classification of EEG signals in an Object Recognition task

Toward a noninvasive automatic seizure control system with transcranial focal stimulations via tripolar concentric ring electrodes

Support system for breast cancer treatment

Understand the New 2019 Neurostimulator Analysis-Programming CPT Coding Structure and Associated Relative Value Units

Intracranial Studies Of Human Epilepsy In A Surgical Setting

Minimum Feature Selection for Epileptic Seizure Classification using Wavelet-based Feature Extraction and a Fuzzy Neural Network

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine

Choosing Mindray SpO2

Discovering Meaningful Cut-points to Predict High HbA1c Variation

SUPPLEMENTARY INFORMATION. Supplementary Figure 1

Extraction of Unwanted Noise in Electrocardiogram (ECG) Signals Using Discrete Wavelet Transformation

A model to explain the emergence of reward expectancy neurons using reinforcement learning and neural network $

Tactile Internet and Edge Computing: Emerging Technologies for Mobile Health

Learning to Use Episodic Memory

Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis

Probabilistic Graphical Models: Applications in Biomedicine

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation

Neuron, Volume 63 Spatial attention decorrelates intrinsic activity fluctuations in Macaque area V4.

Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012

FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED

Saichiu Nelson Tong ALL RIGHTS RESERVED

An Edge-Device for Accurate Seizure Detection in the IoT

Epileptic seizure detection using linear prediction filter

Epileptic seizure detection using EEG signals by means of stationary wavelet transforms

SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA. Henrik Kure

Why did the network make this prediction?

Measuring Focused Attention Using Fixation Inner-Density

THE PERFORMANCE OF TEMPORAL LOBE EPILEPSY (TLE) PATIENTS ON VERBAL AND NONVERBAL SELECTIVE REMINDING PROCEDURES: PRE AND POSTOPERATIVE COMPARISONS

Learning Classifier Systems (LCS/XCSF)

Classification and Statistical Analysis of Auditory FMRI Data Using Linear Discriminative Analysis and Quadratic Discriminative Analysis

Artificial Intelligence Lecture 7

Automatic Detection of Epileptic Seizures in EEG Using Machine Learning Methods

PERFORMANCE CALCULATION OF WAVELET TRANSFORMS FOR REMOVAL OF BASELINE WANDER FROM ECG

BLADDERSCAN PRIME PLUS TM DEEP LEARNING

Biceps Activity EMG Pattern Recognition Using Neural Networks

VAGUS NERVE STIMULATION FOR TREATMENT OF RHEUMATOID ARTHRITIS MEAGHAN O CONNELL BME 281 URI 2016

Image Captioning using Reinforcement Learning. Presentation by: Samarth Gupta

THE data used in this project is provided. SEIZURE forecasting systems hold promise. Seizure Prediction from Intracranial EEG Recordings

Morton-Style Factorial Coding of Color in Primary Visual Cortex

Data Mining Tools used in Deep Brain Stimulation Analysis Results

Information Processing During Transient Responses in the Crayfish Visual System

Neuromodulation in Dravet Syndrome. Eric BJ Segal, MD Director of Pediatric Epilepsy Northeast Regional Epilepsy Group Hackensack, New Jersey

Redundancy of Independent Component Analysis in Four Common Types of Childhood Epileptic Seizure

EEG Classification with BSA Spike Encoding Algorithm and Evolving Probabilistic Spiking Neural Network

Study and Design of a Shannon-Energy-Envelope based Phonocardiogram Peak Spacing Analysis for Estimating Arrhythmic Heart-Beat

Automated Detection of Epileptic Seizures in the EEG

PD233: Design of Biomedical Devices and Systems

HST 583 fmri DATA ANALYSIS AND ACQUISITION

Assessment of Reliability of Hamilton-Tompkins Algorithm to ECG Parameter Detection

International Journal of Research in Science and Technology. (IJRST) 2018, Vol. No. 8, Issue No. IV, Oct-Dec e-issn: , p-issn: X

Simultaneous Real-Time Detection of Motor Imagery and Error-Related Potentials for Improved BCI Accuracy

Computational & Systems Neuroscience Symposium

Classification of Epileptic Seizure Predictors in EEG

Empirical Mode Decomposition based Feature Extraction Method for the Classification of EEG Signal

Design of the HRV Analysis System Based on AD8232

Detection of Neuromuscular Diseases Using Surface Electromyograms

Neural Cognitive Modelling: A Biologically Constrained Spiking Neuron Model of the Tower of Hanoi Task

Neural Cognitive Modelling: A Biologically Constrained Spiking Neuron Model of the Tower of Hanoi Task

Time Experiencing by Robotic Agents

NEURAL CONTROL OF MOVEMENT: ENGINEERING THE RHYTHMS OF THE BRAIN

patients and family NEMOS for treatment of epilepsies

Transcription:

Adaptive Treatment of Epilepsy via Batch Mode Reinforcement Learning Arthur Guez, Robert D. Vincent and Joelle Pineau School of Computer Science, McGill University Massimo Avoli Montreal Neurological Institute 20th Conference on Innovative Applications of Artificial Intelligence 17 July 2008

Epilepsy Epilepsy is a common disorder of the nervous system. Affects ~0.6% of all humans worldwide. Up to 75% successfully treated with drug therapies. A few others are candidates for surgery. Causes: Varied Symptoms: Recurrent unprovoked seizures. www.chse.louisville.edu/graphics/epilepsy07.jpg

Deep brain stimulation (DBS) Electrical stimulation of the brain or vagus nerve is an attractive alternative treatment. http://www.medtronic.com/

Evidence in support of DBS Fixed frequency stimulation in brain slices decreases seizure duration D Arcangelo et al., Neurobiology of Disease. 2005.

Responsive stimulation for epilepsy The current paradigm: Open loop The emerging paradigm: Responsive Seizure prediction or detection: Algorithms measure electrical activity (e.g.) to determine patient's state. Seizure control: Prevent seizures through electrical stimulation.

Our objective To create a stimulation device which is: Adaptive: strategy evolves as a function of the observation. Automatic: stimulation strategy learned from data. Optimal: maximize seizure reduction and minimize stimulation.

Four reasons why this is a hard problem 1. Large amount of information available at each decision point. 2. Large variance in the disease (within patient, between patients, and from animal to patient). 3. No available generative model of epilepsy. 4. Data acquisition (for exploration and validation) is expensive.

Human seizure variation Two seizures recorded on the same patient:

Four reasons why this is a hard problem 1. Large amount of information available at each decision point. 2. Large variance in the disease (within patient, between patients, and from animal to patient). 3. No available generative model of epilepsy. 4. Data acquisition (for exploration and validation) is expensive.

Four reasons why this is a hard problem 1. Large amount of information available at each decision point. 2. Large variance in the disease (within patient, between patients, and from animal to patient). 3. No available generative model of epilepsy. 4. Data acquisition (for exploration and validation) is expensive.

Reinforcement Learning We model an agent interacting with a stochastic environment: Discrete time: t {0... T} State variables: s S Actions: a A Transition dynamics: P(s' s,a) Scalar reward function: R(s,a) The agent's goal is to learn a policy that maximizes the expected total discounted reward: RT = t t R(st,at) The value of each state/action pairing can be expressed as: Qt(s,a) = R(s,a) + s' P(s' s,a) maxa' Qt+1(s',a')

Batch RL using animal data Side step exploration problem by using data from a small set of fixed (hand selected) policies. Data collected using an in vitro animal model of epilepsy. Rat brain slices from mesial temporal structures. Seizure like discharges induced by 4AP added to medium. Discharges monitored with field potential extracellular recordings. Stimuli delivered using a square pulse, constant current, pattern.

Recordings of electrical activity Recorded from single sensing electrode placed in vitro. Raw data consists of a set of traces of electrical field activity in the sample (> 50 million data points) Each trace contains several minutes with each fixed stimulation strategy.

Converting raw data to a state representation We need to extract: D = { <st, at, rt, st+1>, t=1 D } Divide signal into overlapping windows of different length. st = 114 real valued features mean, range, energy, frequency spectra

Adding actions and rewards We need to extract: D = { <st, at, rt, st+1>, t=1 D } Divide signal into overlapping windows of different length. rt ={normal, stimulation, seizure} hand labeled for each frame. Seizure (large negative reward) Stimulation (small negative reward) Other (zero reward) at = {0 Hz, 0.2 Hz, 0.5 Hz, 1.0 Hz} stationary stimulation frequencies.

Estimating the value function Recall the value function: Qt(s,a) = R(s,a) + s' P(s' s,a) maxa' Qt+1(s',a') Don t know P(s' s,a) or how it may be affected by treatment. Cannot represent Q(s,a) in tabular form. Need a good regression function, appropriate for: Batches of continuous valued data. High dimensional input. Limited domain knowledge.

Fitted Q iteration algorithm Estimate Q function via a sequence of supervised learning trials Given data set: D = { <st, at, rt, st+1>, t=1 D } For n = 1 : N Form a training set Dn = { (il, ol) }, l = 1,, D } il = (stl, atl) ol = rtl + maxa A Qn 1 (slt+1, a) Estimate Qn from Dn using any regression algorithm. Gordon 1999; Ormoneit and Sen 2002

Regression via extremely randomized trees Build M trees independently from the complete training set D. Generate tests randomly: Select K candidate attributes (fi). Choose cut points (ti) uniformly. f i < ti f i < ti Keep the test with highest computed score. Predicted value is the mean of M trees. This method (and its relatives) appear to be good fi < ti supervised learning algorithms for high dimensional data. Ernst et al. 2005; Geurts et al. 2004 fi < ti

Evaluating performance Now we can learn! How do we know if our learned policy is good? We can t compare to the optimal stimulation strategy. Limited data means no optimality guarantees. Data collected under different (fixed) policy. But we have some indicators that suggest we are not doing too badly. Perform leave one out cross validation. We use rejection sampling to simulate the policy.

Comparison to fixed stimulation strategies Proportion of seizure states: QuickTimee ᆰ and a TIFF (LZW) decompressor are needed to see this picture.

Comparison to fixed stimulation strategies Number of stimulation actions:

Comparison to fixed stimulation strategies Empirical return:

Example trace

Future challenges Now conducting experiments in active slice models. Exploration: We need a safe and efficient method for learning online. Evaluation: We need better methods for evaluating the policy. Knowledge transfer / Multi task learning: Slice Slice Slice Animal Animal Human Applications to other diseases (e.g. depression, schizophrenia).

Summary We present a framework for automatically learning stimulation strategies for the treatment of epilepsy. Preliminary results using cross validation imply both improved seizure suppression and reduced stimulation. Now evaluating by learning policy from batch data and then applying the policy to an active slice.

Acknowledgements Collaborating researchers: Keith Bush, Ph.D., SOCS Philip de Guzman, Ph.D., MNI Gabriella Panuccio, Ph.D. Student, MNI Funding Natural Sciences and Engineering Research Council Canadian Institutes for Health Research