Search e Fall /18/15

Size: px
Start display at page:

Download "Search e Fall /18/15"

Transcription

1 Sample Efficient Policy Click to edit Master title style Search Click to edit Emma Master Brunskill subtitle style e Fall

2 Sample Efficient RL Objectives Probably Approximately Correct Minimizing regret Bayes-optimal RL Focus today: Empirical performance 2

3 Last Time: Policy Search Using Gradients Gradient approaches only guaranteed to find a local optima Finite-difference methods scale with # of parameters needed to represent the policy, but don t require differentiable policy Likelihood ratio gradient approaches Require us to be able to compute gradient analytically Cost independent of # params Don t need to know dynamics model Benefit from using value function/baseline to reduce variance 3

4 Question from Last Class Figure from David Silver 4

5 How to Choose a Compatible Value Function Approximator? Ex. Policy parameterization: Compatibility requires that: Therefore a reasonable choice for value function is Example from Sutton, McAllester, SIngh & Mansour NIPS

6 Today: Sample Efficient Policy Search Powerful function approximators to represent policy value May not be easy to take derivative Like last time, may benefit from exploiting structure (e.g. not completely blackbox optimization Figure from David Silver 6

7 Recall: Gaussian Process to Represent MDP Dynamics/Reward Models s = + s s Figure adjusted from Wilson et al. JMLR

8 Today: Gaussian Process to Represent Value of Parameterized Policy V(ϴ) ϴ Figure adjusted from Wilson et al. JMLR

9 Now Generalizing Policy Value (Rather than Model Dynamics/Rewards) V(ϴ) ϴ Figure adjusted from Wilson et al. JMLR

10 Why Use GPs? Used frequently in Bayesian optimization Last ~5 years Bayesian optimization has become very influential & useful Brief (relevant) digression Two big motivations for Bayesian optimization 10

11 Motivation 1: ML Parameter Tuning ML methods getting more powerful and complex Sophisticated fitting methods Often involve tuning many parameters Hard for non experts to use Performance often substantially impacted by choice of parameters Material from Ryan Adams 11

12 Deep Learning Big investments by Google, Facebook, Microsoft, etc. Many choices: number of layers, weight regularization, layer size, which nonlinearity, batch size, learning rate schedule, stopping conditions Material from Ryan Adams 12

13 Classification of DNA Sequences Predict which DNA sequences will bind with which proteins, Miller et al. (2012) Choices: margin param, entropy param, converg. criterion Material from Ryan Adams 13

14 Motivation 2: Increasing Instances Which Blur Experimentation & Optimization 14

15 Website Design Ex. What Photo to Put Here? Material from Peter Frazier 15

16 6 Photos on Website: 51*50*49*48*47*46= 12,966,811,200 Options! Material from Peter Frazier 16

17 Design Choices Matter Material from Peter Frazier 17

18 Standard A/B Testing or Experimentation Doesn t Scale Material from Peter Frazier 18

19 Standard A/B Testing or Experimentation Doesn t Scale Material from Peter Frazier 19

20 Standard A/B Testing or Experimentation Doesn t Scale Material from Peter Frazier 20

21 Don t Want to Have Worse Revenue on 1.25 million Users Material from Peter Frazier 21

22 Why Use GPs? Used frequently in Bayesian optimization Last ~5 years Bayesian optimization has become very influential & useful Brief (relevant) digression Two big motivations for Bayesian optimization ML algorithm parameter tuning Large online experimental settings where care about performance (e.g. revenue) while testing 22

23 Bayesian Optimization Build a probabilistic model for the objective. Include hierarchical structure about units, etc. Compute the posterior predictive distribution. Integrate out all the possible true functions. Gaussian process regression popular Optimize a cheap proxy function instead. The model is much cheaper than that true objective Two key ideas Use model to guide how to search space. Model is an approximation, but when sampling a point in the real world is more costly than computation, very useful Use proxy function to guide how to balance exploration and exploitation 23

24 Historical Background of Bayesian Optimization Closely related to statistical ideas of optimal design of experiments, dating back to Kirstine Smith in As response surface methods, date back to Box and Wilson in 1951 As Bayesian optimization, studied first by Kushner in 1964 and then Mockus in Methodologically, it touches on several important machine learning areas: active learning, contextual bandits, Bayesian nonparametrics Started receiving serious attention in ML in 2007, Interest exploded when it was realized that Bayesian optimization provides an excellent tool for finding good ML hyperparameters. Material from Ryan Adams 24

25 Bayesian Optimization for Policy Search V(ϴ) ϴ Material from Ryan Adams 25

26 Bayesian Optimization for Policy Search Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 26

27 Exercise: Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 27

28 Probability of Improvement Exercise: Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 28

29 Expected Exercise:Improvement Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 29

30 Upper/Lower Confidence Bound Exercise: Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 30

31 Probability of Improvement Utility function relative to f, best point so far V(ϴ) Acquisition function 31 Figure modified from Ryan Adams, Adams Equations from

32 Expected Improvement Utility function relative to f, best point so far Acquisition function V(ϴ) 32 Figure modified from Ryan Adams, Equations from

33 Upper/Lower Confidence Bound Acquisition function V(ϴ) 33 Figure modified from Ryan Adams, Equations from

34 Acquisition Function Probability of improvement Expected Improvement Upper confidence bound Other ideas? What are the limitations of these? 34

35 Policy Search as Black Box Bayesian Optimization Bayesian Optimization with a Gaussian Process Create new training point [f(θi),r] π = f(θi) Execute policy π in environment for T steps, get total reward R 35

36 Policy Search as Bayesian Optimization Figure modified from Calandra, Seyfarth, Peters & Deissenroth

37 Gait Optimization GP Policy search Reduced samples needed to find a fast walk by about 3x Lizotte et al. IJCAI

38 More Gait Optimization Gait parameters: 4 threshold values of the FSM (two for each leg) & 4 control signals applied during extension and flexion (separately for knees and hips). Figures modified from Calandra, Seyfarth, Peters & Deissenroth

39 videos 39

40 Why is this Suboptimal? Bayesian Optimization with a Gaussian Process Create new training point [f(θi),r] π = f(θi) Execute policy π in environment for T steps, get total reward R 40

41 Throwing Away All Information But Policy Parameter & Total Reward from Trajectory. Also Ignores Structure of Relationship Between Policies Figure from David Silver 41

42 Covariance function: Key choice for GP When should 2 policies be considered close? Figures from Ryan Adams 42

43 Behavior Based Kernel (Wilson et al. JMLR 2014) KL divergence Probability of trajectory under policy ϴi 43

44 Behavior Based Kernel (Wilson et al. JMLR 2014) KL divergence Prob. trajectory under policy ϴi 44

45 Behavior Based Kernel (BBK) (Wilson et al. JMLR 2014) KL divergence Prob. trajectory under policy ϴi Do we need to know the dynamics model? 45

46 Throwing Away All Information But Policy Parameter & Total Reward from Trajectory. Also Ignores Structure of Relationship Between Policies Figure from David Silver 46

47 Model-Based Bayesian Optimization Algorithm (MBOA) (Wilson et al. JMLR 2014) Bayesian Optimization with a Gaussian Process π = f(θi) Use data to build a model. Use to accelerate learning Create new training point [f(θi),r] Execute policy π in environment for T steps, get total reward R 47

48 Figure from WIlson, Fern & Tadepalli JMLR

49 Figure from WIlson, Fern & Tadepalli JMLR

50 Figure from WIlson, Fern & Tadepalli JMLR

51 Figure from WIlson, Fern & Tadepalli JMLR

52 New Kernel vs Model Based Information? Using model information often greatly improves performance if it s a good model If it s not good, learns to ignore New kernel to relate policies (BBK) much less of an impact 52

53 Other Ways to Go Beyond Black-Box Bayesian Optimization Current work in my group (Rika, Christoph, Dexter Lee, Joe Runde) Bayesian Optimization with a Gaussian Process Create new training point [f(θi),r] π = f(θi) Execute policy π in environment for T steps, get total reward R 53

54 Subtleties of Bayesian Optimization Still have to choose policy class (this determines the input space) Have to choose kernel function Have to choose hyperparameters of kernel function (can optimize these) Have to choose acquisition function 54

55 Impact of Acquisition Function & Hyperparameters Manually fixed hyperparameters led to sub-optimal solutions for all the acquisition functions. Figures modified from Calandra, Seyfarth, Peters & Deissenroth

56 Summary: Bayesian Optimization for Efficient Policy Search Benefits Direct policy search Finds global optima Uses sophisticated function class (GP) to model input policy param & output policy value Use smart (but typically myopic) acquisition function to balance exploration/exploitation in searching policy space Can be very sample efficient Not a silver bullet Still have to decide on policy space Choose kernel function (though squared expl often works well) Should optimize hyperparameters 56

57 Stuff to Know: Bayesian Optimization for Efficient Policy Search Properties (global optima, no gradient information used) Define and know benefits/drawbacks of different acquisition functions Understand how to take a policy search problem and formulate as a black box Bayesian optimization problem Be able to list some things have to do in practice (optimize hyperparameters, etc) 57

Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012

Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012 Practical Bayesian Optimization of Machine Learning Algorithms Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012 ... (Gaussian Processes) are inadequate for doing speech and vision. I still think they're

More information

Exploiting Similarity to Optimize Recommendations from User Feedback

Exploiting Similarity to Optimize Recommendations from User Feedback 1 Exploiting Similarity to Optimize Recommendations from User Feedback Hasta Vanchinathan Andreas Krause (Learning and Adaptive Systems Group, D-INF,ETHZ ) Collaborators: Isidor Nikolic (Microsoft, Zurich),

More information

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018 Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this

More information

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine Policy Gradients CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 1 milestone due today (11:59 pm)! Don t be late! 2. Remember to start forming final project groups Today s

More information

Bayesians methods in system identification: equivalences, differences, and misunderstandings

Bayesians methods in system identification: equivalences, differences, and misunderstandings Bayesians methods in system identification: equivalences, differences, and misunderstandings Johan Schoukens and Carl Edward Rasmussen ERNSI 217 Workshop on System Identification Lyon, September 24-27,

More information

BayesOpt: Extensions and applications

BayesOpt: Extensions and applications BayesOpt: Extensions and applications Javier González Masterclass, 7-February, 2107 @Lancaster University Agenda of the day 9:00-11:00, Introduction to Bayesian Optimization: What is BayesOpt and why it

More information

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections New: Bias-variance decomposition, biasvariance tradeoff, overfitting, regularization, and feature selection Yi

More information

Challenges in Developing Learning Algorithms to Personalize mhealth Treatments

Challenges in Developing Learning Algorithms to Personalize mhealth Treatments Challenges in Developing Learning Algorithms to Personalize mhealth Treatments JOOLHEALTH Bar-Fit Susan A Murphy 01.16.18 HeartSteps SARA Sense 2 Stop Continually Learning Mobile Health Intervention 1)

More information

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine Policy Gradients CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 1 due today (11:59 pm)! Don t be late! 2. Remember to start forming final project groups Today s Lecture 1.

More information

Intelligent Systems. Discriminative Learning. Parts marked by * are optional. WS2013/2014 Carsten Rother, Dmitrij Schlesinger

Intelligent Systems. Discriminative Learning. Parts marked by * are optional. WS2013/2014 Carsten Rother, Dmitrij Schlesinger Intelligent Systems Discriminative Learning Parts marked by * are optional 30/12/2013 WS2013/2014 Carsten Rother, Dmitrij Schlesinger Discriminative models There exists a joint probability distribution

More information

Threshold Learning for Optimal Decision Making

Threshold Learning for Optimal Decision Making Threshold Learning for Optimal Decision Making Nathan F. Lepora Department of Engineering Mathematics, University of Bristol, UK n.lepora@bristol.ac.uk Abstract Decision making under uncertainty is commonly

More information

Sensory Cue Integration

Sensory Cue Integration Sensory Cue Integration Summary by Byoung-Hee Kim Computer Science and Engineering (CSE) http://bi.snu.ac.kr/ Presentation Guideline Quiz on the gist of the chapter (5 min) Presenters: prepare one main

More information

Learning from data when all models are wrong

Learning from data when all models are wrong Learning from data when all models are wrong Peter Grünwald CWI / Leiden Menu Two Pictures 1. Introduction 2. Learning when Models are Seriously Wrong Joint work with John Langford, Tim van Erven, Steven

More information

My Review of John Barban s Venus Factor (2015 Update and Bonus)

My Review of John Barban s Venus Factor (2015 Update and Bonus) My Review of John Barban s Venus Factor (2015 Update and Bonus) December 26, 2013 by Erin B. White 202 Comments (Edit) This article was originally posted at EBWEIGHTLOSS.com Venus Factor is a diet program

More information

Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis

Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis Thesis Proposal Indrayana Rustandi April 3, 2007 Outline Motivation and Thesis Preliminary results: Hierarchical

More information

Towards Learning to Ignore Irrelevant State Variables

Towards Learning to Ignore Irrelevant State Variables Towards Learning to Ignore Irrelevant State Variables Nicholas K. Jong and Peter Stone Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 {nkj,pstone}@cs.utexas.edu Abstract

More information

Expert System Profile

Expert System Profile Expert System Profile GENERAL Domain: Medical Main General Function: Diagnosis System Name: INTERNIST-I/ CADUCEUS (or INTERNIST-II) Dates: 1970 s 1980 s Researchers: Ph.D. Harry Pople, M.D. Jack D. Myers

More information

Learning to Identify Irrelevant State Variables

Learning to Identify Irrelevant State Variables Learning to Identify Irrelevant State Variables Nicholas K. Jong Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 nkj@cs.utexas.edu Peter Stone Department of Computer Sciences

More information

arxiv: v1 [stat.ap] 2 Feb 2016

arxiv: v1 [stat.ap] 2 Feb 2016 Better safe than sorry: Risky function exploitation through safe optimization Eric Schulz 1, Quentin J.M. Huys 2, Dominik R. Bach 3, Maarten Speekenbrink 1 & Andreas Krause 4 1 Department of Experimental

More information

Reach and grasp by people with tetraplegia using a neurally controlled robotic arm

Reach and grasp by people with tetraplegia using a neurally controlled robotic arm Leigh R. Hochberg et al. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm Nature, 17 May 2012 Paper overview Ilya Kuzovkin 11 April 2014, Tartu etc How it works?

More information

Analysis of acgh data: statistical models and computational challenges

Analysis of acgh data: statistical models and computational challenges : statistical models and computational challenges Ramón Díaz-Uriarte 2007-02-13 Díaz-Uriarte, R. acgh analysis: models and computation 2007-02-13 1 / 38 Outline 1 Introduction Alternative approaches What

More information

A Brief Introduction to Bayesian Statistics

A Brief Introduction to Bayesian Statistics A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon

More information

RoBO: A Flexible and Robust Bayesian Optimization Framework in Python

RoBO: A Flexible and Robust Bayesian Optimization Framework in Python RoBO: A Flexible and Robust Bayesian Optimization Framework in Python Aaron Klein kleinaa@cs.uni-freiburg.de Numair Mansur mansurm@cs.uni-freiburg.de Stefan Falkner sfalkner@cs.uni-freiburg.de Frank Hutter

More information

5/20/2014. Leaving Andy Clark's safe shores: Scaling Predictive Processing to higher cognition. ANC sysmposium 19 May 2014

5/20/2014. Leaving Andy Clark's safe shores: Scaling Predictive Processing to higher cognition. ANC sysmposium 19 May 2014 ANC sysmposium 19 May 2014 Lorentz workshop on HPI Leaving Andy Clark's safe shores: Scaling Predictive Processing to higher cognition Johan Kwisthout, Maria Otworowska, Harold Bekkering, Iris van Rooij

More information

Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions

Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions Ragnar Mogk Department of Computer Science Technische Universität Darmstadt ragnar.mogk@stud.tu-darmstadt.de 1 Introduction This

More information

A Decision-Theoretic Approach to Evaluating Posterior Probabilities of Mental Models

A Decision-Theoretic Approach to Evaluating Posterior Probabilities of Mental Models A Decision-Theoretic Approach to Evaluating Posterior Probabilities of Mental Models Jonathan Y. Ito and David V. Pynadath and Stacy C. Marsella Information Sciences Institute, University of Southern California

More information

EECS 433 Statistical Pattern Recognition

EECS 433 Statistical Pattern Recognition EECS 433 Statistical Pattern Recognition Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1 / 19 Outline What is Pattern

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning Michèle Sebag ; TP : Herilalaina Rakotoarison TAO, CNRS INRIA Université Paris-Sud Nov. 9h, 28 Credit for slides: Richard Sutton, Freek Stulp, Olivier Pietquin / 44 Introduction

More information

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,

More information

Information-theoretic stimulus design for neurophysiology & psychophysics

Information-theoretic stimulus design for neurophysiology & psychophysics Information-theoretic stimulus design for neurophysiology & psychophysics Christopher DiMattina, PhD Assistant Professor of Psychology Florida Gulf Coast University 2 Optimal experimental design Part 1

More information

arxiv: v1 [cs.lg] 17 Dec 2018

arxiv: v1 [cs.lg] 17 Dec 2018 Bayesian Optimization in AlphaGo Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver & Nando de Freitas arxiv:1812.06855v1 [cs.lg] 17 Dec 2018 DeepMind, London, UK

More information

Model-Based fmri Analysis. Will Alexander Dept. of Experimental Psychology Ghent University

Model-Based fmri Analysis. Will Alexander Dept. of Experimental Psychology Ghent University Model-Based fmri Analysis Will Alexander Dept. of Experimental Psychology Ghent University Motivation Models (general) Why you ought to care Model-based fmri Models (specific) From model to analysis Extended

More information

Motivation: Fraud Detection

Motivation: Fraud Detection Outlier Detection Motivation: Fraud Detection http://i.imgur.com/ckkoaop.gif Jian Pei: CMPT 741/459 Data Mining -- Outlier Detection (1) 2 Techniques: Fraud Detection Features Dissimilarity Groups and

More information

NONLINEAR REGRESSION I

NONLINEAR REGRESSION I EE613 Machine Learning for Engineers NONLINEAR REGRESSION I Sylvain Calinon Robot Learning & Interaction Group Idiap Research Institute Dec. 13, 2017 1 Outline Properties of multivariate Gaussian distributions

More information

Generative Adversarial Networks.

Generative Adversarial Networks. Generative Adversarial Networks www.cs.wisc.edu/~page/cs760/ Goals for the lecture you should understand the following concepts Nash equilibrium Minimax game Generative adversarial network Prisoners Dilemma

More information

Ch.20 Dynamic Cue Combination in Distributional Population Code Networks. Ka Yeon Kim Biopsychology

Ch.20 Dynamic Cue Combination in Distributional Population Code Networks. Ka Yeon Kim Biopsychology Ch.20 Dynamic Cue Combination in Distributional Population Code Networks Ka Yeon Kim Biopsychology Applying the coding scheme to dynamic cue combination (Experiment, Kording&Wolpert,2004) Dynamic sensorymotor

More information

Bayesian integration in sensorimotor learning

Bayesian integration in sensorimotor learning Bayesian integration in sensorimotor learning Introduction Learning new motor skills Variability in sensors and task Tennis: Velocity of ball Not all are equally probable over time Increased uncertainty:

More information

Bayesian Reinforcement Learning

Bayesian Reinforcement Learning Bayesian Reinforcement Learning Rowan McAllister and Karolina Dziugaite MLG RCC 21 March 2013 Rowan McAllister and Karolina Dziugaite (MLG RCC) Bayesian Reinforcement Learning 21 March 2013 1 / 34 Outline

More information

Human Behavior Modeling with Maximum Entropy Inverse Optimal Control

Human Behavior Modeling with Maximum Entropy Inverse Optimal Control Human Behavior Modeling with Maximum Entropy Inverse Optimal Control Brian D. Ziebart, Andrew Maas, J.Andrew Bagnell, and Anind K. Dey School of Computer Science Carnegie Mellon University Pittsburgh,

More information

Recognizing Ambiguity

Recognizing Ambiguity Recognizing Ambiguity How Lack of Information Scares Us Mark Clements Columbia University I. Abstract In this paper, I will examine two different approaches to an experimental decision problem posed by

More information

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1 Lecture 27: Systems Biology and Bayesian Networks Systems Biology and Regulatory Networks o Definitions o Network motifs o Examples

More information

4. Model evaluation & selection

4. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2017 4. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 10: Introduction to inference (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 17 What is inference? 2 / 17 Where did our data come from? Recall our sample is: Y, the vector

More information

Individualized Treatment Effects Using a Non-parametric Bayesian Approach

Individualized Treatment Effects Using a Non-parametric Bayesian Approach Individualized Treatment Effects Using a Non-parametric Bayesian Approach Ravi Varadhan Nicholas C. Henderson Division of Biostatistics & Bioinformatics Department of Oncology Johns Hopkins University

More information

The optimism bias may support rational action

The optimism bias may support rational action The optimism bias may support rational action Falk Lieder, Sidharth Goel, Ronald Kwan, Thomas L. Griffiths University of California, Berkeley 1 Introduction People systematically overestimate the probability

More information

Thank you for signing up for the 2017 COLOR ME BELMAR Walk Run Dash for PossABILITIES

Thank you for signing up for the 2017 COLOR ME BELMAR Walk Run Dash for PossABILITIES TEAM GUIDE Thank you for signing up for the 2017 COLOR ME BELMAR Walk Run Dash for PossABILITIES Your participation means so much to our organization AND the children, adults and families that we serve!

More information

Lecture 13: Finding optimal treatment policies

Lecture 13: Finding optimal treatment policies MACHINE LEARNING FOR HEALTHCARE 6.S897, HST.S53 Lecture 13: Finding optimal treatment policies Prof. David Sontag MIT EECS, CSAIL, IMES (Thanks to Peter Bodik for slides on reinforcement learning) Outline

More information

Computer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California

Computer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California Computer Age Statistical Inference Algorithms, Evidence, and Data Science BRADLEY EFRON Stanford University, California TREVOR HASTIE Stanford University, California ggf CAMBRIDGE UNIVERSITY PRESS Preface

More information

Sequential Decision Making

Sequential Decision Making Sequential Decision Making Sequential decisions Many (most) real world problems cannot be solved with a single action. Need a longer horizon Ex: Sequential decision problems We start at START and want

More information

Accelerated Arm Recovery Workout and Training Plan

Accelerated Arm Recovery Workout and Training Plan Accelerated Arm Recovery Workout and Training Plan After going through the Accelerated Arm Recovery, it s by far the best thing for shoulder recovery that I ve come across. It s really easy to pick up

More information

Combining Risks from Several Tumors Using Markov Chain Monte Carlo

Combining Risks from Several Tumors Using Markov Chain Monte Carlo University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln U.S. Environmental Protection Agency Papers U.S. Environmental Protection Agency 2009 Combining Risks from Several Tumors

More information

Intelligent Machines That Act Rationally. Hang Li Toutiao AI Lab

Intelligent Machines That Act Rationally. Hang Li Toutiao AI Lab Intelligent Machines That Act Rationally Hang Li Toutiao AI Lab Four Definitions of Artificial Intelligence Building intelligent machines (i.e., intelligent computers) Thinking humanly Acting humanly Thinking

More information

On Training of Deep Neural Network. Lornechen

On Training of Deep Neural Network. Lornechen On Training of Deep Neural Network Lornechen 2016.04.20 1 Outline Introduction Layer-wise Pre-training & Fine-tuning Activation Function Initialization Method Advanced Layers and Nets 2 Neural Network

More information

Bayesian Inference. Thomas Nichols. With thanks Lee Harrison

Bayesian Inference. Thomas Nichols. With thanks Lee Harrison Bayesian Inference Thomas Nichols With thanks Lee Harrison Attention to Motion Paradigm Results Attention No attention Büchel & Friston 1997, Cereb. Cortex Büchel et al. 1998, Brain - fixation only -

More information

Neurons and neural networks II. Hopfield network

Neurons and neural networks II. Hopfield network Neurons and neural networks II. Hopfield network 1 Perceptron recap key ingredient: adaptivity of the system unsupervised vs supervised learning architecture for discrimination: single neuron perceptron

More information

Conjoint Audiogram Estimation via Gaussian Process Classification

Conjoint Audiogram Estimation via Gaussian Process Classification Washington University in St. Louis Washington University Open Scholarship Engineering and Applied Science Theses & Dissertations Engineering and Applied Science Spring 5-2017 Conjoint Audiogram Estimation

More information

Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-seq Data

Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-seq Data Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-seq Data Haipeng Xing, Yifan Mo, Will Liao, Michael Q. Zhang Clayton Davis and Geoffrey

More information

You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful.

You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. icausalbayes USER MANUAL INTRODUCTION You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. We expect most of our users

More information

3. Model evaluation & selection

3. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2016 3. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

Locally-Biased Bayesian Optimization using Nonstationary Gaussian Processes

Locally-Biased Bayesian Optimization using Nonstationary Gaussian Processes Locally-Biased Bayesian Optimization using Nonstationary Gaussian Processes Ruben Martinez-Cantin Centro Universitario de la Defensa Zaragoza, 50090, Spain rmcantin@unizar.es Abstract Bayesian optimization

More information

Why did the network make this prediction?

Why did the network make this prediction? Why did the network make this prediction? Ankur Taly (Google Inc.) Joint work with Mukund Sundararajan and Qiqi Yan Some Deep Learning Successes Source: https://www.cbsinsights.com Deep Neural Networks

More information

arxiv: v2 [stat.ap] 7 Dec 2016

arxiv: v2 [stat.ap] 7 Dec 2016 A Bayesian Approach to Predicting Disengaged Youth arxiv:62.52v2 [stat.ap] 7 Dec 26 David Kohn New South Wales 26 david.kohn@sydney.edu.au Nick Glozier Brain Mind Centre New South Wales 26 Sally Cripps

More information

THE ANALYTICS EDGE. Intelligence, Happiness, and Health x The Analytics Edge

THE ANALYTICS EDGE. Intelligence, Happiness, and Health x The Analytics Edge THE ANALYTICS EDGE Intelligence, Happiness, and Health 15.071x The Analytics Edge Data is Powerful and Everywhere 2.7 Zettabytes of electronic data exist in the world today 2,700,000,000,000,000,000,000

More information

Bayesian and Frequentist Approaches

Bayesian and Frequentist Approaches Bayesian and Frequentist Approaches G. Jogesh Babu Penn State University http://sites.stat.psu.edu/ babu http://astrostatistics.psu.edu All models are wrong But some are useful George E. P. Box (son-in-law

More information

A Cooking Assistance System for Patients with Alzheimers Disease Using Reinforcement Learning

A Cooking Assistance System for Patients with Alzheimers Disease Using Reinforcement Learning International Journal of Information Technology Vol. 23 No. 2 2017 A Cooking Assistance System for Patients with Alzheimers Disease Using Reinforcement Learning Haipeng Chen 1 and Yeng Chai Soh 2 1 Joint

More information

Numerical Integration of Bivariate Gaussian Distribution

Numerical Integration of Bivariate Gaussian Distribution Numerical Integration of Bivariate Gaussian Distribution S. H. Derakhshan and C. V. Deutsch The bivariate normal distribution arises in many geostatistical applications as most geostatistical techniques

More information

K Armed Bandit. Goal. Introduction. Methodology. Approach. and optimal action rest of the time. Fig 1 shows the pseudo code

K Armed Bandit. Goal. Introduction. Methodology. Approach. and optimal action rest of the time. Fig 1 shows the pseudo code K Armed Bandit Goal Introduction Methodology Approach Epsilon Greedy Method: Greedy Method: Optimistic Initial Value Method: Implementation Experiments & Results E1: Comparison of all techniques for Stationary

More information

A Bayesian Approach to Tackling Hard Computational Challenges

A Bayesian Approach to Tackling Hard Computational Challenges A Bayesian Approach to Tackling Hard Computational Challenges Eric Horvitz Microsoft Research Joint work with: Y. Ruan, C. Gomes, H. Kautz, B. Selman, D. Chickering MS Research University of Washington

More information

Make a Difference: UNDERSTANDING INFLUENCE, POWER, AND PERSUASION. Tobias Spanier Extension Educator Center for Community Vitality

Make a Difference: UNDERSTANDING INFLUENCE, POWER, AND PERSUASION. Tobias Spanier Extension Educator Center for Community Vitality Make a Difference: UNDERSTANDING INFLUENCE, POWER, AND PERSUASION Tobias Spanier Extension Educator Center for Community Vitality 2016 Regents of the University of Minnesota. All rights reserved. capacity

More information

Using Heuristic Models to Understand Human and Optimal Decision-Making on Bandit Problems

Using Heuristic Models to Understand Human and Optimal Decision-Making on Bandit Problems Using Heuristic Models to Understand Human and Optimal Decision-Making on andit Problems Michael D. Lee (mdlee@uci.edu) Shunan Zhang (szhang@uci.edu) Miles Munro (mmunro@uci.edu) Mark Steyvers (msteyver@uci.edu)

More information

Cost-sensitive Dynamic Feature Selection

Cost-sensitive Dynamic Feature Selection Cost-sensitive Dynamic Feature Selection He He 1, Hal Daumé III 1 and Jason Eisner 2 1 University of Maryland, College Park 2 Johns Hopkins University June 30, 2012 He He, Hal Daumé III and Jason Eisner

More information

EXPLORATION FLOW 4/18/10

EXPLORATION FLOW 4/18/10 EXPLORATION Peter Bossaerts CNS 102b FLOW Canonical exploration problem: bandits Bayesian optimal exploration: The Gittins index Undirected exploration: e-greedy and softmax (logit) The economists and

More information

For general queries, contact

For general queries, contact Much of the work in Bayesian econometrics has focused on showing the value of Bayesian methods for parametric models (see, for example, Geweke (2005), Koop (2003), Li and Tobias (2011), and Rossi, Allenby,

More information

Practical Bayesian Design and Analysis for Drug and Device Clinical Trials

Practical Bayesian Design and Analysis for Drug and Device Clinical Trials Practical Bayesian Design and Analysis for Drug and Device Clinical Trials p. 1/2 Practical Bayesian Design and Analysis for Drug and Device Clinical Trials Brian P. Hobbs Plan B Advisor: Bradley P. Carlin

More information

A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning

A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning Azam Rabiee Computer Science and Engineering Isfahan University, Isfahan, Iran azamrabiei@yahoo.com Nasser Ghasem-Aghaee Computer

More information

CS4495 Computer Vision Introduction to Recognition. Aaron Bobick School of Interactive Computing

CS4495 Computer Vision Introduction to Recognition. Aaron Bobick School of Interactive Computing CS4495 Computer Vision Introduction to Recognition Aaron Bobick School of Interactive Computing What does recognition involve? Source: Fei Fei Li, Rob Fergus, Antonio Torralba. Verification: is that

More information

Introduction to Computational Neuroscience

Introduction to Computational Neuroscience Introduction to Computational Neuroscience Lecture 10: Brain-Computer Interfaces Ilya Kuzovkin So Far Stimulus So Far So Far Stimulus What are the neuroimaging techniques you know about? Stimulus So Far

More information

SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA. Henrik Kure

SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA. Henrik Kure SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA Henrik Kure Dina, The Royal Veterinary and Agricuural University Bülowsvej 48 DK 1870 Frederiksberg C. kure@dina.kvl.dk

More information

Macroeconometric Analysis. Chapter 1. Introduction

Macroeconometric Analysis. Chapter 1. Introduction Macroeconometric Analysis Chapter 1. Introduction Chetan Dave David N. DeJong 1 Background The seminal contribution of Kydland and Prescott (1982) marked the crest of a sea change in the way macroeconomists

More information

A Diabetes minimal model for Oral Glucose Tolerance Tests

A Diabetes minimal model for Oral Glucose Tolerance Tests arxiv:1601.04753v1 [stat.ap] 18 Jan 2016 A Diabetes minimal model for Oral Glucose Tolerance Tests J. Andrés Christen a, Marcos Capistrán a, Adriana Monroy b, Silvestre Alavez c, Silvia Quintana Vargas

More information

Progress in Risk Science and Causality

Progress in Risk Science and Causality Progress in Risk Science and Causality Tony Cox, tcoxdenver@aol.com AAPCA March 27, 2017 1 Vision for causal analytics Represent understanding of how the world works by an explicit causal model. Learn,

More information

Bayesian Nonparametric Methods for Precision Medicine

Bayesian Nonparametric Methods for Precision Medicine Bayesian Nonparametric Methods for Precision Medicine Brian Reich, NC State Collaborators: Qian Guan (NCSU), Eric Laber (NCSU) and Dipankar Bandyopadhyay (VCU) University of Illinois at Urbana-Champaign

More information

Data Analysis Using Regression and Multilevel/Hierarchical Models

Data Analysis Using Regression and Multilevel/Hierarchical Models Data Analysis Using Regression and Multilevel/Hierarchical Models ANDREW GELMAN Columbia University JENNIFER HILL Columbia University CAMBRIDGE UNIVERSITY PRESS Contents List of examples V a 9 e xv " Preface

More information

CLEANing the Reward: Counterfactual Actions Remove Exploratory Action Noise in Multiagent Learning

CLEANing the Reward: Counterfactual Actions Remove Exploratory Action Noise in Multiagent Learning CLEANing the Reward: Counterfactual Actions Remove Exploratory Action Noise in Multiagent Learning Chris HolmesParker Parflux LLC Salem, OR chris@parflux.com Matthew E. Taylor Washington State University

More information

Bayesian Inference Bayes Laplace

Bayesian Inference Bayes Laplace Bayesian Inference Bayes Laplace Course objective The aim of this course is to introduce the modern approach to Bayesian statistics, emphasizing the computational aspects and the differences between the

More information

The 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian

The 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian The 29th Fuzzy System Symposium (Osaka, September 9-, 3) A Fuzzy Inference Method Based on Saliency Map for Prediction Mao Wang, Yoichiro Maeda 2, Yasutake Takahashi Graduate School of Engineering, University

More information

NDT PERFORMANCE DEMONSTRATION USING SIMULATION

NDT PERFORMANCE DEMONSTRATION USING SIMULATION NDT PERFORMANCE DEMONSTRATION USING SIMULATION N. Dominguez, F. Jenson, CEA P. Dubois, F. Foucher, EXTENDE nicolas.dominguez@cea.fr CEA 10 AVRIL 2012 PAGE 1 NDT PERFORMANCE DEMO. USING SIMULATION OVERVIEW

More information

Psychology Research Methods Lab Session Week 10. Survey Design. Due at the Start of Lab: Lab Assignment 3. Rationale for Today s Lab Session

Psychology Research Methods Lab Session Week 10. Survey Design. Due at the Start of Lab: Lab Assignment 3. Rationale for Today s Lab Session Psychology Research Methods Lab Session Week 10 Due at the Start of Lab: Lab Assignment 3 Rationale for Today s Lab Session Survey Design This tutorial supplements your lecture notes on Measurement by

More information

Online Journal for Weightlifters & Coaches

Online Journal for Weightlifters & Coaches Online Journal for Weightlifters & Coaches WWW.WL-LOG.COM Genadi Hiskia Dublin, October 7, 2016 WL-LOG.com Professional Online Service for Weightlifters & Coaches Professional Online Service for Weightlifters

More information

ST440/550: Applied Bayesian Statistics. (10) Frequentist Properties of Bayesian Methods

ST440/550: Applied Bayesian Statistics. (10) Frequentist Properties of Bayesian Methods (10) Frequentist Properties of Bayesian Methods Calibrated Bayes So far we have discussed Bayesian methods as being separate from the frequentist approach However, in many cases methods with frequentist

More information

P. RICHARD HAHN. Research areas. Employment. Education. Research papers

P. RICHARD HAHN. Research areas. Employment. Education. Research papers P. RICHARD HAHN Arizona State University email: prhahn@asu.edu Research areas Bayesian methods, causal inference, foundations of statistics, nonlinear regression, Monte Carlo methods, applications to social

More information

SPSS Correlation/Regression

SPSS Correlation/Regression SPSS Correlation/Regression Experimental Psychology Lab Session Week 6 10/02/13 (or 10/03/13) Due at the Start of Lab: Lab 3 Rationale for Today s Lab Session This tutorial is designed to ensure that you

More information

ABSTRACT I. INTRODUCTION. Mohd Thousif Ahemad TSKC Faculty Nagarjuna Govt. College(A) Nalgonda, Telangana, India

ABSTRACT I. INTRODUCTION. Mohd Thousif Ahemad TSKC Faculty Nagarjuna Govt. College(A) Nalgonda, Telangana, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 1 ISSN : 2456-3307 Data Mining Techniques to Predict Cancer Diseases

More information

User Guide Seeing and Managing Patients with AASM SleepTM

User Guide Seeing and Managing Patients with AASM SleepTM User Guide Seeing and Managing Patients with AASM SleepTM Once you have activated your account with AASM SleepTM, your next step is to begin interacting with and seeing patients. This guide is designed

More information

You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful.

You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. icausalbayes USER MANUAL INTRODUCTION You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. We expect most of our users

More information

UNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER

UNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER UNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER 2016 DELIVERING VALUE WITH DATA SCIENCE BAYES APPROACH - MAKING DATA WORK HARDER The Ipsos MORI Data Science team increasingly

More information

Cindy Fan, Rick Huang, Maggie Liu, Ethan Zhang October 20, f: Design Check-in Tasks

Cindy Fan, Rick Huang, Maggie Liu, Ethan Zhang October 20, f: Design Check-in Tasks Cindy Fan, Rick Huang, Maggie Liu, Ethan Zhang October 20, 2014 2f: Design Check-in Tasks Tracking Liquid Intake Over Time - Easy Task Albert is a software developer in his late 20 s, working for a tech

More information