Partially-Observable Markov Decision Processes as Dynamical Causal Models. Finale Doshi-Velez NIPS Causality Workshop 2013


1 Partially-Observable Markov Decision Processes as Dynamical Causal Models Finale Doshi-Velez NIPS Causality Workshop 2013

2 The POMDP Mindset We poke the world (perform an action). [Diagram: Agent acting on World]

3 The POMDP Mindset We poke the world (perform an action). We get a poke back (see an observation). We get a poke back (get a reward: -$1). [Diagram: Agent and World exchanging action, observation, reward]

4 What next? We poke the world (perform an action). We get a poke back (see an observation). We get a poke back (get a reward: -$1). [Diagram: Agent and World]

5 What next? We poke the world (perform an action). We get a poke back (see an observation). We get a poke back (get a reward: -$1). ...the world is a mystery... [Diagram: Agent and World]

6 The agent needs a representation to use when making decisions: a representation of how the world works, and a representation of the current world state. We poke the world (perform an action). We get a poke back (see an observation). We get a poke back (get a reward: -$1). ...the world is a mystery... [Diagram: Agent and World]

7 Many problems can be framed this way Robot navigation (take movement actions, receive sensor measurements) Dialog management (ask questions, receive answers) Target tracking (search a particular area, receive sensor measurements) the list goes on...

8 The Causal Process, Unrolled [Diagram: unrolled sequence of actions a_{t-1}..a_{t+2}, observations o_{t-1}..o_{t+2}, and rewards r_{t-1}..r_{t+2} = -$1, -$1, -$5, $10]

9 The Causal Process, Unrolled Given a history of actions, observations, and rewards, how can we act in order to maximize long-term future rewards? [Diagram: unrolled actions, observations, and rewards]

10 The Causal Process, Unrolled Key Challenge: the entire history may be needed to make near-optimal decisions. [Diagram: unrolled actions, observations, and rewards]

11 The Causal Process, Unrolled All past events are needed to predict future events. [Diagram: unrolled actions, observations, and rewards]

12 The Causal Process, Unrolled The representation is a sufficient statistic that summarizes the history. [Diagram: unrolled actions, observations, and rewards with hidden states s_{t-1}..s_{t+2}]

13 The Causal Process, Unrolled The representation is a sufficient statistic that summarizes the history. We call this representation the information state. [Diagram: unrolled actions, observations, and rewards with hidden states s_{t-1}..s_{t+2}]

14 What is state? Sometimes, there exists an obvious choice for this hidden variable (such as a robot's true position) At other times, learning a representation that makes the system Markovian may provide insights into the problem.

15 Formal POMDP definition A POMDP consists of: a set of states S, actions A, and observations O; a transition function T( s' | s, a ); an observation function O( o | s, a ); a reward function R( s, a ); and a discount factor γ. The goal is to maximize E[ Σ_{t=1..∞} γ^t R_t ], the expected long-term discounted reward.
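The tuple above can be written down directly. A minimal sketch in Python; the two-state toy model, the array layouts (T[a, s, s'], O[a, s', o], R[s, a]), and all numbers are illustrative assumptions, not from the talk:

```python
import numpy as np

# Hypothetical two-state, two-action, two-observation POMDP (toy example).
# T[a, s, s'] = P(s' | s, a)   -- transition function
T = np.array([[[0.9, 0.1], [0.1, 0.9]],    # action 0 mostly keeps the state
              [[0.5, 0.5], [0.5, 0.5]]])   # action 1 randomizes it
# O[a, s', o] = P(o | s', a)   -- observation function (noisy readout of s')
O = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.8, 0.2], [0.2, 0.8]]])
# R[s, a] -- reward function; gamma -- discount factor
R = np.array([[1.0, -1.0], [-1.0, 1.0]])
gamma = 0.95

def discounted_return(rewards, gamma):
    """Sample analogue of the objective E[ sum_t gamma^t R_t ] for one rollout
    (indexing rewards from t = 0 here)."""
    return sum(gamma**t * r for t, r in enumerate(rewards))

print(discounted_return([1.0, 1.0, 1.0], gamma))  # 1 + 0.95 + 0.9025 = 2.8525
```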

16 Relationship to Other Models [Diagram: four graphical models arranged by Hidden State? × Decisions?] A Markov Model with hidden state is a Hidden Markov Model; a Markov Model with decisions (actions and rewards) is a Markov Decision Process; with both hidden state and decisions, it is a POMDP.

17 Formal POMDP definition A POMDP consists of: a set of states S, actions A, and observations O; a transition function T( s' | s, a ); an observation function O( o | s, a ); a reward function R( s, a ); and a discount factor γ. The goal is to maximize E[ Σ_{t=1..∞} γ^t R_t ], the expected long-term discounted reward. This optimization is called Planning.

18 Formal POMDP definition A POMDP consists of: a set of states S, actions A, and observations O; a transition function T( s' | s, a ); an observation function O( o | s, a ); a reward function R( s, a ); and a discount factor γ. The goal is to maximize E[ Σ_{t=1..∞} γ^t R_t ], the expected long-term discounted reward. Learning the model components T, O, R is called Learning.

19 Planning Bellman Recursion for the value (long-term expected reward): V(b) = max E[ Σ_{t=1..∞} γ^t R_t | b_0 = b ] = max_a [ R(b,a) + γ Σ_{o∈O} P(b_ao | b) V(b_ao) ]

20 State and State, a quick aside In the POMDP literature, the term state usually refers to the hidden state (i.e., the robot's true location). The posterior distribution over states s is called the belief b(s). It is a sufficient statistic for the history, and thus the information state for the POMDP.
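The belief can be maintained recursively with Bayes' rule: b'(s') ∝ O( o | s', a ) Σ_s T( s' | s, a ) b(s). A sketch in Python; the T[a, s, s'] / O[a, s', o] array layout and the toy numbers are illustrative assumptions, not the talk's notation:

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes filter: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    predicted = b @ T[a]              # P(s' | b, a), shape (n_states,)
    unnorm = O[a][:, o] * predicted   # weight by the likelihood of o
    return unnorm / unnorm.sum()      # normalize by P(o | b, a)

# Usage with a toy two-state model: start uniform, take action 0, observe o = 1.
T = np.array([[[0.9, 0.1], [0.1, 0.9]]])
O = np.array([[[0.8, 0.2], [0.2, 0.8]]])
b = np.array([0.5, 0.5])
b_next = belief_update(b, a=0, o=1, T=T, O=O)
print(b_next)  # observation 1 shifts the belief toward state 1: [0.2, 0.8]
```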

21 Planning Bellman Recursion for the value (long-term expected reward): V(b) = max E[ Σ_{t=1..∞} γ^t R_t | b_0 = b ] = max_a [ R(b,a) + γ Σ_{o∈O} P(b_ao | b) V(b_ao) ]

22 Planning Bellman Recursion for the value (long-term expected reward): V(b) = max E[ Σ_{t=1..∞} γ^t R_t | b_0 = b ] = max_a [ R(b,a) + γ Σ_{o∈O} P(b_ao | b) V(b_ao) ] Here the belief b is the sufficient statistic / information state.

23 Planning Bellman Recursion for the value (long-term expected reward): V(b) = max E[ Σ_{t=1..∞} γ^t R_t | b_0 = b ] = max_a [ R(b,a) + γ Σ_{o∈O} P(b_ao | b) V(b_ao) ] R(b,a) is the immediate reward for taking action a in belief b.

24 Planning Bellman Recursion for the value (long-term expected reward): V(b) = max E[ Σ_{t=1..∞} γ^t R_t | b_0 = b ] = max_a [ R(b,a) + γ Σ_{o∈O} P(b_ao | b) V(b_ao) ] The term γ Σ_{o∈O} P(b_ao | b) V(b_ao) gives the expected future rewards.

25 Planning Bellman Recursion for the value (long-term expected reward): V(b) = max E[ Σ_{t=1..∞} γ^t R_t | b_0 = b ] = max_a [ R(b,a) + γ Σ_{o∈O} P(b_ao | b) V(b_ao) ] Especially when b is high-dimensional, solving for this continuous function is not easy (PSPACE-hard).
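One backup of the recursion above can be sketched directly; here V is any stand-in approximation of the value function over beliefs. The QMDP-style heuristic used as V below, and the toy model, are illustrative assumptions, not the talk's method:

```python
import numpy as np

T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])   # T[a, s, s'] = P(s' | s, a)
O = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.8, 0.2], [0.2, 0.8]]])   # O[a, s', o] = P(o | s', a)
R = np.array([[1.0, -1.0], [-1.0, 1.0]])   # R[s, a]
gamma = 0.95

def bellman_backup(b, V):
    """One Bellman backup: max_a [ R(b,a) + gamma * sum_o P(b_ao | b) V(b_ao) ]."""
    best = -np.inf
    for a in range(T.shape[0]):
        predicted = b @ T[a]                 # P(s' | b, a)
        q = b @ R[:, a]                      # immediate reward R(b, a)
        for o in range(O.shape[2]):
            p_o = predicted @ O[a][:, o]     # P(o | b, a)
            if p_o > 0:
                b_ao = O[a][:, o] * predicted / p_o   # updated belief
                q += gamma * p_o * V(b_ao)
        best = max(best, q)
    return best

# QMDP-style stand-in for V: pretend the state becomes observable next step.
V_approx = lambda b: float(b @ R.max(axis=1))
print(bellman_backup(np.array([0.5, 0.5]), V_approx))  # ≈ 0.95 for this toy model
```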

26 Planning: Yes, we can! Global: approximate the entire function V(b) via a set of support points b' (e.g., SARSOP). Local: approximate the value for a particular belief with forward simulation (e.g., POMCP). [Diagram: belief points covering belief space vs. a forward-simulation tree of actions and observations from b_t]
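The "local" idea can be illustrated with plain Monte Carlo forward simulation. This random-rollout sketch is a crude stand-in for POMCP's full tree search, and the toy model is an assumption:

```python
import numpy as np

T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])   # T[a, s, s'] = P(s' | s, a)
R = np.array([[1.0, -1.0], [-1.0, 1.0]])   # R[s, a]
gamma = 0.95
rng = np.random.default_rng(0)

def rollout_value(b, depth=20, n_sims=500):
    """Estimate the value of belief b by sampling a hidden state from b and
    simulating a uniformly random policy forward (no search tree, for brevity)."""
    total = 0.0
    for _ in range(n_sims):
        s = rng.choice(len(b), p=b)        # sample a state from the belief
        ret, discount = 0.0, 1.0
        for _ in range(depth):
            a = rng.integers(T.shape[0])   # random action
            ret += discount * R[s, a]
            s = rng.choice(T.shape[1], p=T[a, s])
            discount *= gamma
        total += ret
    return total / n_sims

print(rollout_value(np.array([0.5, 0.5])))  # near 0 for this symmetric toy model
```

POMCP improves on this by growing a search tree over action-observation histories and biasing action choice toward promising branches, but the forward-simulation core is the same.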

27 Learning Given histories h = (a_1, r_1, o_1, a_2, r_2, o_2, ..., a_T, r_T, o_T), we can learn T, O, R via forward-filtering backward-sampling or <fill in your favorite timeseries algorithm>. Two principles usually suffice for exploring to learn: Optimism under uncertainty: try actions that might be good. Risk control: if an action seems risky, ask for help.
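The forward-filtering half can be sketched as a likelihood computation: it scores a candidate (T, O) on a history, which is the building block that forward-filtering backward-sampling and EM-style learners share. Array layouts and the toy model are illustrative assumptions:

```python
import numpy as np

def log_likelihood(history, b0, T, O):
    """Forward filter: log P(o_1..o_T | a_1..a_T, b0) under a candidate model.
    history is a list of (action, observation) pairs."""
    b, ll = b0.copy(), 0.0
    for a, o in history:
        predicted = b @ T[a]                 # predict the next hidden state
        p_o = predicted @ O[a][:, o]         # P(o_t | everything so far)
        ll += np.log(p_o)
        b = O[a][:, o] * predicted / p_o     # condition the belief on o_t
    return ll

# Usage: score one (action, observation) step under a toy two-state model.
T = np.array([[[0.9, 0.1], [0.1, 0.9]]])
O = np.array([[[0.8, 0.2], [0.2, 0.8]]])
print(log_likelihood([(0, 1)], np.array([0.5, 0.5]), T, O))  # log(0.5) ≈ -0.693
```

Candidate models with higher log-likelihood explain the history better; a backward-sampling pass over the stored filtered beliefs then draws a hidden state sequence.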

28 Example: Timeseries in Diabetes Data: Electronic health records of ~17,000 diabetics with 5+ A1c lab measurements and 5+ anti-diabetic agents prescribed. [Diagram: Clinician Model, Patient Model, Meds (antidiabetic agents), Lab Results (A1c)] Collaborators: Isaac Kohane, Stan Shaw

29 Example: Timeseries in Diabetes Data: Electronic health records of ~17,000 diabetics with 5+ A1c lab measurements and 5+ anti-diabetic agents prescribed. [Diagram: Clinician Model, Patient Model, Meds (antidiabetic agents), Lab Results (A1c)]

30 Discovered Patient States The patient states each correspond to a set of A1c levels (unsurprising), ranging from A1c < 5.5 through intermediate A1c bands up to A1c > 8.5.

31 Example: Timeseries in Diabetes Data: Electronic health records of ~17,000 diabetics with 5+ A1c lab measurements and 5+ anti-diabetic agents prescribed. [Diagram: Clinician Model, Patient Model, Meds (antidiabetic agents), Lab Results (A1c)]

32 Discovered Clinician States The clinician states follow the standard treatment protocols for diabetes (unsurprising, but exciting that we discovered this in a completely unsupervised manner). Next steps: incorporate more variables; identify patient and clinician outliers (quality of care). [Diagram: clinician states over regimens Metformin; Metformin, Glipizide; Metformin, Glyburide; Basic Insulins; Glargine, Lispro, Aspart; transitions on A1c up, self-loops under A1c control]

33 Example: Experimental Design In a very general sense: Action space: all possible experiments + submit. State space: which hypothesis is true. Observation space: results of experiments. Reward: cost of experiment. Allows for non-myopic sequencing of experiments. Example: Bayesian Optimization? Joint with: Ryan Adams/HIPS group

34 Summary POMDPs provide a framework for modeling causal dynamical systems and making optimal sequential decisions. POMDPs can be learned and solved! [Diagram: unrolled POMDP with actions, hidden states, observations, and rewards]


More information

Statement of research interest

Statement of research interest Statement of research interest Milos Hauskrecht My primary field of research interest is Artificial Intelligence (AI). Within AI, I am interested in problems related to probabilistic modeling, machine

More information

HARVARD PILGRIM HEALTH CARE RECOMMENDED MEDICATION REQUEST GUIDELINES

HARVARD PILGRIM HEALTH CARE RECOMMENDED MEDICATION REQUEST GUIDELINES Generic Brand HICL GCN Exception/Other INSULIN REGULAR, HUMAN AFREZZA 37619, 37622, 37623, 38923, 37624, 42833, 38918, 37621 GUIDELINES FOR USE 1. Is the member currently taking the requested medication

More information

Face Your Fear System

Face Your Fear System proudly announces Dr. Rob s Face Your Fear System Programs Freedom From OCD Freedom From Social Anxiety Greatest Me. Anxiety Free Specialized Youth Group Pathway To Peace Specialized Youth Group R.A.M.E

More information

Between-word regressions as part of rational reading

Between-word regressions as part of rational reading Between-word regressions as part of rational reading Klinton Bicknell & Roger Levy UC San Diego CUNY 2010: New York Bicknell & Levy (UC San Diego) Regressions as rational reading CUNY 2010 1 / 23 Introduction

More information

Understanding eye movements in face recognition with hidden Markov model

Understanding eye movements in face recognition with hidden Markov model Understanding eye movements in face recognition with hidden Markov model 1 Department of Psychology, The University of Hong Kong, Pokfulam Road, Hong Kong 2 Department of Computer Science, City University

More information

Towards Learning to Ignore Irrelevant State Variables

Towards Learning to Ignore Irrelevant State Variables Towards Learning to Ignore Irrelevant State Variables Nicholas K. Jong and Peter Stone Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 {nkj,pstone}@cs.utexas.edu Abstract

More information

Towards Applying Interactive POMDPs to Real-World Adversary Modeling

Towards Applying Interactive POMDPs to Real-World Adversary Modeling Proceedings of the Twenty-Second Innovative Applications of Artificial Intelligence Conference (IAAI-10) Towards Applying Interactive POMDPs to Real-World Adversary Modeling Brenda Ng and Carol Meyers

More information

Bayesian Perception & Decision for Intelligent Mobility

Bayesian Perception & Decision for Intelligent Mobility Bayesian Perception & Decision for Intelligent Mobility E-Motion & Chroma teams Inria Research Center Grenoble Rhône-Alpes Christian LAUGIER First Class Research Director at Inria San Francisco, 05/11/2015

More information

Increasing Motor Learning During Hand Rehabilitation Exercises Through the Use of Adaptive Games: A Pilot Study

Increasing Motor Learning During Hand Rehabilitation Exercises Through the Use of Adaptive Games: A Pilot Study Increasing Motor Learning During Hand Rehabilitation Exercises Through the Use of Adaptive Games: A Pilot Study Brittney A English Georgia Institute of Technology 85 5 th St. NW, Atlanta, GA 30308 brittney.english@gatech.edu

More information

Using Bayesian Networks to Analyze Expression Data. Xu Siwei, s Muhammad Ali Faisal, s Tejal Joshi, s

Using Bayesian Networks to Analyze Expression Data. Xu Siwei, s Muhammad Ali Faisal, s Tejal Joshi, s Using Bayesian Networks to Analyze Expression Data Xu Siwei, s0789023 Muhammad Ali Faisal, s0677834 Tejal Joshi, s0677858 Outline Introduction Bayesian Networks Equivalence Classes Applying to Expression

More information

Bayesian Networks for Modeling Emotional State and Personality: Progress Report

Bayesian Networks for Modeling Emotional State and Personality: Progress Report From: AAAI Technical Report FS-98-03. Compilation copyright 1998, AAAI (www.aaai.org). All rights reserved. Bayesian Networks for Modeling Emotional State and Personality: Progress Report Jack Breese Gene

More information

Accuracy and validity of Kinetisense joint measures for cardinal movements, compared to current experimental and clinical gold standards.

Accuracy and validity of Kinetisense joint measures for cardinal movements, compared to current experimental and clinical gold standards. Accuracy and validity of Kinetisense joint measures for cardinal movements, compared to current experimental and clinical gold standards. Prepared by Engineering and Human Performance Lab Department of

More information

Detecting Cognitive States Using Machine Learning

Detecting Cognitive States Using Machine Learning Detecting Cognitive States Using Machine Learning Xuerui Wang & Tom Mitchell Center for Automated Learning and Discovery School of Computer Science Carnegie Mellon University xuerui,tom.mitchell @cs.cmu.edu

More information

Trauma Introduction to Trauma-Informed Care and The Neurosequential Model

Trauma Introduction to Trauma-Informed Care and The Neurosequential Model Overview of Great Circle s Trauma-Informed Trainings Trauma 101 - Introduction to Trauma-Informed Care and The Neurosequential Model (called Trauma 101 for Great Circle staff) (4 hours) Trauma informed

More information

Chris L. Baker, Julian Jara-Ettinger, Rebecca Saxe, & Joshua B. Tenenbaum* Department of Brain and Cognitive Sciences

Chris L. Baker, Julian Jara-Ettinger, Rebecca Saxe, & Joshua B. Tenenbaum* Department of Brain and Cognitive Sciences Rational quantitative attribution of beliefs, desires, and percepts in human mentalizing Chris L. Baker, Julian Jara-Ettinger, Rebecca Saxe, & Joshua B. Tenenbaum* Department of Brain and Cognitive Sciences

More information

Lecture 10: Learning Optimal Personalized Treatment Rules Under Risk Constraint

Lecture 10: Learning Optimal Personalized Treatment Rules Under Risk Constraint Lecture 10: Learning Optimal Personalized Treatment Rules Under Risk Constraint Introduction Consider Both Efficacy and Safety Outcomes Clinician: Complete picture of treatment decision making involves

More information

Sensory Cue Integration

Sensory Cue Integration Sensory Cue Integration Summary by Byoung-Hee Kim Computer Science and Engineering (CSE) http://bi.snu.ac.kr/ Presentation Guideline Quiz on the gist of the chapter (5 min) Presenters: prepare one main

More information

Analyses of Markov decision process structure regarding the possible strategic use of interacting memory systems

Analyses of Markov decision process structure regarding the possible strategic use of interacting memory systems COMPUTATIONAL NEUROSCIENCE ORIGINAL RESEARCH ARTICLE published: 24 December 2008 doi: 10.3389/neuro.10.006.2008 Analyses of Markov decision process structure regarding the possible strategic use of interacting

More information

Neuro-Inspired Statistical. Rensselaer Polytechnic Institute National Science Foundation

Neuro-Inspired Statistical. Rensselaer Polytechnic Institute National Science Foundation Neuro-Inspired Statistical Pi Prior Model lfor Robust Visual Inference Qiang Ji Rensselaer Polytechnic Institute National Science Foundation 1 Status of Computer Vision CV has been an active area for over

More information

Reinforcement learning and the brain: the problems we face all day. Reinforcement Learning in the brain

Reinforcement learning and the brain: the problems we face all day. Reinforcement Learning in the brain Reinforcement learning and the brain: the problems we face all day Reinforcement Learning in the brain Reading: Y Niv, Reinforcement learning in the brain, 2009. Decision making at all levels Reinforcement

More information

Overcoming Barriers to Change: Insulin Pump Transition. Korey K. Hood, PhD Professor & Staff Psychologist Stanford University School of Medicine

Overcoming Barriers to Change: Insulin Pump Transition. Korey K. Hood, PhD Professor & Staff Psychologist Stanford University School of Medicine Overcoming Barriers to Change: Insulin Pump Transition Korey K. Hood, PhD Professor & Staff Psychologist Stanford University School of Medicine 1 Topics Reviewed Change is hard for everyone and can be

More information

Society for Ambulatory Anesthesia Consensus Statement on Perioperative Blood Glucose Management in Diabetic Patients Undergoing Ambulatory Surgery

Society for Ambulatory Anesthesia Consensus Statement on Perioperative Blood Glucose Management in Diabetic Patients Undergoing Ambulatory Surgery Society for Ambulatory Anesthesia Consensus Statement on Perioperative Blood Glucose Management in Diabetic Patients Undergoing Ambulatory Surgery Girish P. Joshi, MB BS, MD, FFARCSI Anesthesia & Analgesia

More information

Facial Event Classification with Task Oriented Dynamic Bayesian Network

Facial Event Classification with Task Oriented Dynamic Bayesian Network Facial Event Classification with Task Oriented Dynamic Bayesian Network Haisong Gu Dept. of Computer Science University of Nevada Reno haisonggu@ieee.org Qiang Ji Dept. of ECSE Rensselaer Polytechnic Institute

More information