Search e Fall /18/15
|
|
- Edward Snow
- 6 years ago
- Views:
Transcription
1 Sample Efficient Policy Click to edit Master title style Search Click to edit Emma Master Brunskill subtitle style e Fall
2 Sample Efficient RL Objectives Probably Approximately Correct Minimizing regret Bayes-optimal RL Focus today: Empirical performance 2
3 Last Time: Policy Search Using Gradients Gradient approaches only guaranteed to find a local optima Finite-difference methods scale with # of parameters needed to represent the policy, but don t require differentiable policy Likelihood ratio gradient approaches Require us to be able to compute gradient analytically Cost independent of # params Don t need to know dynamics model Benefit from using value function/baseline to reduce variance 3
4 Question from Last Class Figure from David Silver 4
5 How to Choose a Compatible Value Function Approximator? Ex. Policy parameterization: Compatibility requires that: Therefore a reasonable choice for value function is Example from Sutton, McAllester, SIngh & Mansour NIPS
6 Today: Sample Efficient Policy Search Powerful function approximators to represent policy value May not be easy to take derivative Like last time, may benefit from exploiting structure (e.g. not completely blackbox optimization Figure from David Silver 6
7 Recall: Gaussian Process to Represent MDP Dynamics/Reward Models s = + s s Figure adjusted from Wilson et al. JMLR
8 Today: Gaussian Process to Represent Value of Parameterized Policy V(ϴ) ϴ Figure adjusted from Wilson et al. JMLR
9 Now Generalizing Policy Value (Rather than Model Dynamics/Rewards) V(ϴ) ϴ Figure adjusted from Wilson et al. JMLR
10 Why Use GPs? Used frequently in Bayesian optimization Last ~5 years Bayesian optimization has become very influential & useful Brief (relevant) digression Two big motivations for Bayesian optimization 10
11 Motivation 1: ML Parameter Tuning ML methods getting more powerful and complex Sophisticated fitting methods Often involve tuning many parameters Hard for non experts to use Performance often substantially impacted by choice of parameters Material from Ryan Adams 11
12 Deep Learning Big investments by Google, Facebook, Microsoft, etc. Many choices: number of layers, weight regularization, layer size, which nonlinearity, batch size, learning rate schedule, stopping conditions Material from Ryan Adams 12
13 Classification of DNA Sequences Predict which DNA sequences will bind with which proteins, Miller et al. (2012) Choices: margin param, entropy param, converg. criterion Material from Ryan Adams 13
14 Motivation 2: Increasing Instances Which Blur Experimentation & Optimization 14
15 Website Design Ex. What Photo to Put Here? Material from Peter Frazier 15
16 6 Photos on Website: 51*50*49*48*47*46= 12,966,811,200 Options! Material from Peter Frazier 16
17 Design Choices Matter Material from Peter Frazier 17
18 Standard A/B Testing or Experimentation Doesn t Scale Material from Peter Frazier 18
19 Standard A/B Testing or Experimentation Doesn t Scale Material from Peter Frazier 19
20 Standard A/B Testing or Experimentation Doesn t Scale Material from Peter Frazier 20
21 Don t Want to Have Worse Revenue on 1.25 million Users Material from Peter Frazier 21
22 Why Use GPs? Used frequently in Bayesian optimization Last ~5 years Bayesian optimization has become very influential & useful Brief (relevant) digression Two big motivations for Bayesian optimization ML algorithm parameter tuning Large online experimental settings where care about performance (e.g. revenue) while testing 22
23 Bayesian Optimization Build a probabilistic model for the objective. Include hierarchical structure about units, etc. Compute the posterior predictive distribution. Integrate out all the possible true functions. Gaussian process regression popular Optimize a cheap proxy function instead. The model is much cheaper than that true objective Two key ideas Use model to guide how to search space. Model is an approximation, but when sampling a point in the real world is more costly than computation, very useful Use proxy function to guide how to balance exploration and exploitation 23
24 Historical Background of Bayesian Optimization Closely related to statistical ideas of optimal design of experiments, dating back to Kirstine Smith in As response surface methods, date back to Box and Wilson in 1951 As Bayesian optimization, studied first by Kushner in 1964 and then Mockus in Methodologically, it touches on several important machine learning areas: active learning, contextual bandits, Bayesian nonparametrics Started receiving serious attention in ML in 2007, Interest exploded when it was realized that Bayesian optimization provides an excellent tool for finding good ML hyperparameters. Material from Ryan Adams 24
25 Bayesian Optimization for Policy Search V(ϴ) ϴ Material from Ryan Adams 25
26 Bayesian Optimization for Policy Search Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 26
27 Exercise: Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 27
28 Probability of Improvement Exercise: Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 28
29 Expected Exercise:Improvement Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 29
30 Upper/Lower Confidence Bound Exercise: Where Would You Sample Next (What Policy Would You Evaluate) & Why? What Algorithm Would You Use for Sampling? Dark points are the policies we ve already tried in the world V(ϴ) Blue line is estimated mean of function (value of policy) Grey areas represent uncertainty over function value ϴ Figure modified from Ryan Adams 30
31 Probability of Improvement Utility function relative to f, best point so far V(ϴ) Acquisition function 31 Figure modified from Ryan Adams, Adams Equations from
32 Expected Improvement Utility function relative to f, best point so far Acquisition function V(ϴ) 32 Figure modified from Ryan Adams, Equations from
33 Upper/Lower Confidence Bound Acquisition function V(ϴ) 33 Figure modified from Ryan Adams, Equations from
34 Acquisition Function Probability of improvement Expected Improvement Upper confidence bound Other ideas? What are the limitations of these? 34
35 Policy Search as Black Box Bayesian Optimization Bayesian Optimization with a Gaussian Process Create new training point [f(θi),r] π = f(θi) Execute policy π in environment for T steps, get total reward R 35
36 Policy Search as Bayesian Optimization Figure modified from Calandra, Seyfarth, Peters & Deissenroth
37 Gait Optimization GP Policy search Reduced samples needed to find a fast walk by about 3x Lizotte et al. IJCAI
38 More Gait Optimization Gait parameters: 4 threshold values of the FSM (two for each leg) & 4 control signals applied during extension and flexion (separately for knees and hips). Figures modified from Calandra, Seyfarth, Peters & Deissenroth
39 videos 39
40 Why is this Suboptimal? Bayesian Optimization with a Gaussian Process Create new training point [f(θi),r] π = f(θi) Execute policy π in environment for T steps, get total reward R 40
41 Throwing Away All Information But Policy Parameter & Total Reward from Trajectory. Also Ignores Structure of Relationship Between Policies Figure from David Silver 41
42 Covariance function: Key choice for GP When should 2 policies be considered close? Figures from Ryan Adams 42
43 Behavior Based Kernel (Wilson et al. JMLR 2014) KL divergence Probability of trajectory under policy ϴi 43
44 Behavior Based Kernel (Wilson et al. JMLR 2014) KL divergence Prob. trajectory under policy ϴi 44
45 Behavior Based Kernel (BBK) (Wilson et al. JMLR 2014) KL divergence Prob. trajectory under policy ϴi Do we need to know the dynamics model? 45
46 Throwing Away All Information But Policy Parameter & Total Reward from Trajectory. Also Ignores Structure of Relationship Between Policies Figure from David Silver 46
47 Model-Based Bayesian Optimization Algorithm (MBOA) (Wilson et al. JMLR 2014) Bayesian Optimization with a Gaussian Process π = f(θi) Use data to build a model. Use to accelerate learning Create new training point [f(θi),r] Execute policy π in environment for T steps, get total reward R 47
48 Figure from WIlson, Fern & Tadepalli JMLR
49 Figure from WIlson, Fern & Tadepalli JMLR
50 Figure from WIlson, Fern & Tadepalli JMLR
51 Figure from WIlson, Fern & Tadepalli JMLR
52 New Kernel vs Model Based Information? Using model information often greatly improves performance if it s a good model If it s not good, learns to ignore New kernel to relate policies (BBK) much less of an impact 52
53 Other Ways to Go Beyond Black-Box Bayesian Optimization Current work in my group (Rika, Christoph, Dexter Lee, Joe Runde) Bayesian Optimization with a Gaussian Process Create new training point [f(θi),r] π = f(θi) Execute policy π in environment for T steps, get total reward R 53
54 Subtleties of Bayesian Optimization Still have to choose policy class (this determines the input space) Have to choose kernel function Have to choose hyperparameters of kernel function (can optimize these) Have to choose acquisition function 54
55 Impact of Acquisition Function & Hyperparameters Manually fixed hyperparameters led to sub-optimal solutions for all the acquisition functions. Figures modified from Calandra, Seyfarth, Peters & Deissenroth
56 Summary: Bayesian Optimization for Efficient Policy Search Benefits Direct policy search Finds global optima Uses sophisticated function class (GP) to model input policy param & output policy value Use smart (but typically myopic) acquisition function to balance exploration/exploitation in searching policy space Can be very sample efficient Not a silver bullet Still have to decide on policy space Choose kernel function (though squared expl often works well) Should optimize hyperparameters 56
57 Stuff to Know: Bayesian Optimization for Efficient Policy Search Properties (global optima, no gradient information used) Define and know benefits/drawbacks of different acquisition functions Understand how to take a policy search problem and formulate as a black box Bayesian optimization problem Be able to list some things have to do in practice (optimize hyperparameters, etc) 57
Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012
Practical Bayesian Optimization of Machine Learning Algorithms Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012 ... (Gaussian Processes) are inadequate for doing speech and vision. I still think they're
More informationExploiting Similarity to Optimize Recommendations from User Feedback
1 Exploiting Similarity to Optimize Recommendations from User Feedback Hasta Vanchinathan Andreas Krause (Learning and Adaptive Systems Group, D-INF,ETHZ ) Collaborators: Isidor Nikolic (Microsoft, Zurich),
More informationIntroduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018
Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this
More informationPolicy Gradients. CS : Deep Reinforcement Learning Sergey Levine
Policy Gradients CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 1 milestone due today (11:59 pm)! Don t be late! 2. Remember to start forming final project groups Today s
More informationBayesians methods in system identification: equivalences, differences, and misunderstandings
Bayesians methods in system identification: equivalences, differences, and misunderstandings Johan Schoukens and Carl Edward Rasmussen ERNSI 217 Workshop on System Identification Lyon, September 24-27,
More informationBayesOpt: Extensions and applications
BayesOpt: Extensions and applications Javier González Masterclass, 7-February, 2107 @Lancaster University Agenda of the day 9:00-11:00, Introduction to Bayesian Optimization: What is BayesOpt and why it
More informationReview: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections
Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections New: Bias-variance decomposition, biasvariance tradeoff, overfitting, regularization, and feature selection Yi
More informationChallenges in Developing Learning Algorithms to Personalize mhealth Treatments
Challenges in Developing Learning Algorithms to Personalize mhealth Treatments JOOLHEALTH Bar-Fit Susan A Murphy 01.16.18 HeartSteps SARA Sense 2 Stop Continually Learning Mobile Health Intervention 1)
More informationPolicy Gradients. CS : Deep Reinforcement Learning Sergey Levine
Policy Gradients CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 1 due today (11:59 pm)! Don t be late! 2. Remember to start forming final project groups Today s Lecture 1.
More informationIntelligent Systems. Discriminative Learning. Parts marked by * are optional. WS2013/2014 Carsten Rother, Dmitrij Schlesinger
Intelligent Systems Discriminative Learning Parts marked by * are optional 30/12/2013 WS2013/2014 Carsten Rother, Dmitrij Schlesinger Discriminative models There exists a joint probability distribution
More informationThreshold Learning for Optimal Decision Making
Threshold Learning for Optimal Decision Making Nathan F. Lepora Department of Engineering Mathematics, University of Bristol, UK n.lepora@bristol.ac.uk Abstract Decision making under uncertainty is commonly
More informationSensory Cue Integration
Sensory Cue Integration Summary by Byoung-Hee Kim Computer Science and Engineering (CSE) http://bi.snu.ac.kr/ Presentation Guideline Quiz on the gist of the chapter (5 min) Presenters: prepare one main
More informationLearning from data when all models are wrong
Learning from data when all models are wrong Peter Grünwald CWI / Leiden Menu Two Pictures 1. Introduction 2. Learning when Models are Seriously Wrong Joint work with John Langford, Tim van Erven, Steven
More informationMy Review of John Barban s Venus Factor (2015 Update and Bonus)
My Review of John Barban s Venus Factor (2015 Update and Bonus) December 26, 2013 by Erin B. White 202 Comments (Edit) This article was originally posted at EBWEIGHTLOSS.com Venus Factor is a diet program
More informationBayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis
Bayesian Models for Combining Data Across Subjects and Studies in Predictive fmri Data Analysis Thesis Proposal Indrayana Rustandi April 3, 2007 Outline Motivation and Thesis Preliminary results: Hierarchical
More informationTowards Learning to Ignore Irrelevant State Variables
Towards Learning to Ignore Irrelevant State Variables Nicholas K. Jong and Peter Stone Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 {nkj,pstone}@cs.utexas.edu Abstract
More informationExpert System Profile
Expert System Profile GENERAL Domain: Medical Main General Function: Diagnosis System Name: INTERNIST-I/ CADUCEUS (or INTERNIST-II) Dates: 1970 s 1980 s Researchers: Ph.D. Harry Pople, M.D. Jack D. Myers
More informationLearning to Identify Irrelevant State Variables
Learning to Identify Irrelevant State Variables Nicholas K. Jong Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 nkj@cs.utexas.edu Peter Stone Department of Computer Sciences
More informationarxiv: v1 [stat.ap] 2 Feb 2016
Better safe than sorry: Risky function exploitation through safe optimization Eric Schulz 1, Quentin J.M. Huys 2, Dominik R. Bach 3, Maarten Speekenbrink 1 & Andreas Krause 4 1 Department of Experimental
More informationReach and grasp by people with tetraplegia using a neurally controlled robotic arm
Leigh R. Hochberg et al. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm Nature, 17 May 2012 Paper overview Ilya Kuzovkin 11 April 2014, Tartu etc How it works?
More informationAnalysis of acgh data: statistical models and computational challenges
: statistical models and computational challenges Ramón Díaz-Uriarte 2007-02-13 Díaz-Uriarte, R. acgh analysis: models and computation 2007-02-13 1 / 38 Outline 1 Introduction Alternative approaches What
More informationA Brief Introduction to Bayesian Statistics
A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon
More informationRoBO: A Flexible and Robust Bayesian Optimization Framework in Python
RoBO: A Flexible and Robust Bayesian Optimization Framework in Python Aaron Klein kleinaa@cs.uni-freiburg.de Numair Mansur mansurm@cs.uni-freiburg.de Stefan Falkner sfalkner@cs.uni-freiburg.de Frank Hutter
More information5/20/2014. Leaving Andy Clark's safe shores: Scaling Predictive Processing to higher cognition. ANC sysmposium 19 May 2014
ANC sysmposium 19 May 2014 Lorentz workshop on HPI Leaving Andy Clark's safe shores: Scaling Predictive Processing to higher cognition Johan Kwisthout, Maria Otworowska, Harold Bekkering, Iris van Rooij
More informationSeminar Thesis: Efficient Planning under Uncertainty with Macro-actions
Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions Ragnar Mogk Department of Computer Science Technische Universität Darmstadt ragnar.mogk@stud.tu-darmstadt.de 1 Introduction This
More informationA Decision-Theoretic Approach to Evaluating Posterior Probabilities of Mental Models
A Decision-Theoretic Approach to Evaluating Posterior Probabilities of Mental Models Jonathan Y. Ito and David V. Pynadath and Stacy C. Marsella Information Sciences Institute, University of Southern California
More informationEECS 433 Statistical Pattern Recognition
EECS 433 Statistical Pattern Recognition Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1 / 19 Outline What is Pattern
More informationReinforcement Learning
Reinforcement Learning Michèle Sebag ; TP : Herilalaina Rakotoarison TAO, CNRS INRIA Université Paris-Sud Nov. 9h, 28 Credit for slides: Richard Sutton, Freek Stulp, Olivier Pietquin / 44 Introduction
More informationStepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality
Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,
More informationInformation-theoretic stimulus design for neurophysiology & psychophysics
Information-theoretic stimulus design for neurophysiology & psychophysics Christopher DiMattina, PhD Assistant Professor of Psychology Florida Gulf Coast University 2 Optimal experimental design Part 1
More informationarxiv: v1 [cs.lg] 17 Dec 2018
Bayesian Optimization in AlphaGo Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver & Nando de Freitas arxiv:1812.06855v1 [cs.lg] 17 Dec 2018 DeepMind, London, UK
More informationModel-Based fmri Analysis. Will Alexander Dept. of Experimental Psychology Ghent University
Model-Based fmri Analysis Will Alexander Dept. of Experimental Psychology Ghent University Motivation Models (general) Why you ought to care Model-based fmri Models (specific) From model to analysis Extended
More informationMotivation: Fraud Detection
Outlier Detection Motivation: Fraud Detection http://i.imgur.com/ckkoaop.gif Jian Pei: CMPT 741/459 Data Mining -- Outlier Detection (1) 2 Techniques: Fraud Detection Features Dissimilarity Groups and
More informationNONLINEAR REGRESSION I
EE613 Machine Learning for Engineers NONLINEAR REGRESSION I Sylvain Calinon Robot Learning & Interaction Group Idiap Research Institute Dec. 13, 2017 1 Outline Properties of multivariate Gaussian distributions
More informationGenerative Adversarial Networks.
Generative Adversarial Networks www.cs.wisc.edu/~page/cs760/ Goals for the lecture you should understand the following concepts Nash equilibrium Minimax game Generative adversarial network Prisoners Dilemma
More informationCh.20 Dynamic Cue Combination in Distributional Population Code Networks. Ka Yeon Kim Biopsychology
Ch.20 Dynamic Cue Combination in Distributional Population Code Networks Ka Yeon Kim Biopsychology Applying the coding scheme to dynamic cue combination (Experiment, Kording&Wolpert,2004) Dynamic sensorymotor
More informationBayesian integration in sensorimotor learning
Bayesian integration in sensorimotor learning Introduction Learning new motor skills Variability in sensors and task Tennis: Velocity of ball Not all are equally probable over time Increased uncertainty:
More informationBayesian Reinforcement Learning
Bayesian Reinforcement Learning Rowan McAllister and Karolina Dziugaite MLG RCC 21 March 2013 Rowan McAllister and Karolina Dziugaite (MLG RCC) Bayesian Reinforcement Learning 21 March 2013 1 / 34 Outline
More informationHuman Behavior Modeling with Maximum Entropy Inverse Optimal Control
Human Behavior Modeling with Maximum Entropy Inverse Optimal Control Brian D. Ziebart, Andrew Maas, J.Andrew Bagnell, and Anind K. Dey School of Computer Science Carnegie Mellon University Pittsburgh,
More informationRecognizing Ambiguity
Recognizing Ambiguity How Lack of Information Scares Us Mark Clements Columbia University I. Abstract In this paper, I will examine two different approaches to an experimental decision problem posed by
More informationMBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks
MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1 Lecture 27: Systems Biology and Bayesian Networks Systems Biology and Regulatory Networks o Definitions o Network motifs o Examples
More information4. Model evaluation & selection
Foundations of Machine Learning CentraleSupélec Fall 2017 4. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 10: Introduction to inference (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 17 What is inference? 2 / 17 Where did our data come from? Recall our sample is: Y, the vector
More informationIndividualized Treatment Effects Using a Non-parametric Bayesian Approach
Individualized Treatment Effects Using a Non-parametric Bayesian Approach Ravi Varadhan Nicholas C. Henderson Division of Biostatistics & Bioinformatics Department of Oncology Johns Hopkins University
More informationThe optimism bias may support rational action
The optimism bias may support rational action Falk Lieder, Sidharth Goel, Ronald Kwan, Thomas L. Griffiths University of California, Berkeley 1 Introduction People systematically overestimate the probability
More informationThank you for signing up for the 2017 COLOR ME BELMAR Walk Run Dash for PossABILITIES
TEAM GUIDE Thank you for signing up for the 2017 COLOR ME BELMAR Walk Run Dash for PossABILITIES Your participation means so much to our organization AND the children, adults and families that we serve!
More informationLecture 13: Finding optimal treatment policies
MACHINE LEARNING FOR HEALTHCARE 6.S897, HST.S53 Lecture 13: Finding optimal treatment policies Prof. David Sontag MIT EECS, CSAIL, IMES (Thanks to Peter Bodik for slides on reinforcement learning) Outline
More informationComputer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California
Computer Age Statistical Inference Algorithms, Evidence, and Data Science BRADLEY EFRON Stanford University, California TREVOR HASTIE Stanford University, California ggf CAMBRIDGE UNIVERSITY PRESS Preface
More informationSequential Decision Making
Sequential Decision Making Sequential decisions Many (most) real world problems cannot be solved with a single action. Need a longer horizon Ex: Sequential decision problems We start at START and want
More informationAccelerated Arm Recovery Workout and Training Plan
Accelerated Arm Recovery Workout and Training Plan After going through the Accelerated Arm Recovery, it s by far the best thing for shoulder recovery that I ve come across. It s really easy to pick up
More informationCombining Risks from Several Tumors Using Markov Chain Monte Carlo
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln U.S. Environmental Protection Agency Papers U.S. Environmental Protection Agency 2009 Combining Risks from Several Tumors
More informationIntelligent Machines That Act Rationally. Hang Li Toutiao AI Lab
Intelligent Machines That Act Rationally Hang Li Toutiao AI Lab Four Definitions of Artificial Intelligence Building intelligent machines (i.e., intelligent computers) Thinking humanly Acting humanly Thinking
More informationOn Training of Deep Neural Network. Lornechen
On Training of Deep Neural Network Lornechen 2016.04.20 1 Outline Introduction Layer-wise Pre-training & Fine-tuning Activation Function Initialization Method Advanced Layers and Nets 2 Neural Network
More informationBayesian Inference. Thomas Nichols. With thanks Lee Harrison
Bayesian Inference Thomas Nichols With thanks Lee Harrison Attention to Motion Paradigm Results Attention No attention Büchel & Friston 1997, Cereb. Cortex Büchel et al. 1998, Brain - fixation only -
More informationNeurons and neural networks II. Hopfield network
Neurons and neural networks II. Hopfield network 1 Perceptron recap key ingredient: adaptivity of the system unsupervised vs supervised learning architecture for discrimination: single neuron perceptron
More informationConjoint Audiogram Estimation via Gaussian Process Classification
Washington University in St. Louis Washington University Open Scholarship Engineering and Applied Science Theses & Dissertations Engineering and Applied Science Spring 5-2017 Conjoint Audiogram Estimation
More informationGenome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-seq Data
Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-seq Data Haipeng Xing, Yifan Mo, Will Liao, Michael Q. Zhang Clayton Davis and Geoffrey
More informationYou can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful.
icausalbayes USER MANUAL INTRODUCTION You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. We expect most of our users
More information3. Model evaluation & selection
Foundations of Machine Learning CentraleSupélec Fall 2016 3. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr
More informationLocally-Biased Bayesian Optimization using Nonstationary Gaussian Processes
Locally-Biased Bayesian Optimization using Nonstationary Gaussian Processes Ruben Martinez-Cantin Centro Universitario de la Defensa Zaragoza, 50090, Spain rmcantin@unizar.es Abstract Bayesian optimization
More informationWhy did the network make this prediction?
Why did the network make this prediction? Ankur Taly (Google Inc.) Joint work with Mukund Sundararajan and Qiqi Yan Some Deep Learning Successes Source: https://www.cbsinsights.com Deep Neural Networks
More informationarxiv: v2 [stat.ap] 7 Dec 2016
A Bayesian Approach to Predicting Disengaged Youth arxiv:62.52v2 [stat.ap] 7 Dec 26 David Kohn New South Wales 26 david.kohn@sydney.edu.au Nick Glozier Brain Mind Centre New South Wales 26 Sally Cripps
More informationTHE ANALYTICS EDGE. Intelligence, Happiness, and Health x The Analytics Edge
THE ANALYTICS EDGE Intelligence, Happiness, and Health 15.071x The Analytics Edge Data is Powerful and Everywhere 2.7 Zettabytes of electronic data exist in the world today 2,700,000,000,000,000,000,000
More informationBayesian and Frequentist Approaches
Bayesian and Frequentist Approaches G. Jogesh Babu Penn State University http://sites.stat.psu.edu/ babu http://astrostatistics.psu.edu All models are wrong But some are useful George E. P. Box (son-in-law
More informationA Cooking Assistance System for Patients with Alzheimers Disease Using Reinforcement Learning
International Journal of Information Technology Vol. 23 No. 2 2017 A Cooking Assistance System for Patients with Alzheimers Disease Using Reinforcement Learning Haipeng Chen 1 and Yeng Chai Soh 2 1 Joint
More informationNumerical Integration of Bivariate Gaussian Distribution
Numerical Integration of Bivariate Gaussian Distribution S. H. Derakhshan and C. V. Deutsch The bivariate normal distribution arises in many geostatistical applications as most geostatistical techniques
More informationK Armed Bandit. Goal. Introduction. Methodology. Approach. and optimal action rest of the time. Fig 1 shows the pseudo code
K Armed Bandit Goal Introduction Methodology Approach Epsilon Greedy Method: Greedy Method: Optimistic Initial Value Method: Implementation Experiments & Results E1: Comparison of all techniques for Stationary
More informationA Bayesian Approach to Tackling Hard Computational Challenges
A Bayesian Approach to Tackling Hard Computational Challenges Eric Horvitz Microsoft Research Joint work with: Y. Ruan, C. Gomes, H. Kautz, B. Selman, D. Chickering MS Research University of Washington
More informationMake a Difference: UNDERSTANDING INFLUENCE, POWER, AND PERSUASION. Tobias Spanier Extension Educator Center for Community Vitality
Make a Difference: UNDERSTANDING INFLUENCE, POWER, AND PERSUASION Tobias Spanier Extension Educator Center for Community Vitality 2016 Regents of the University of Minnesota. All rights reserved. capacity
More informationUsing Heuristic Models to Understand Human and Optimal Decision-Making on Bandit Problems
Using Heuristic Models to Understand Human and Optimal Decision-Making on andit Problems Michael D. Lee (mdlee@uci.edu) Shunan Zhang (szhang@uci.edu) Miles Munro (mmunro@uci.edu) Mark Steyvers (msteyver@uci.edu)
More informationCost-sensitive Dynamic Feature Selection
Cost-sensitive Dynamic Feature Selection He He 1, Hal Daumé III 1 and Jason Eisner 2 1 University of Maryland, College Park 2 Johns Hopkins University June 30, 2012 He He, Hal Daumé III and Jason Eisner
More informationEXPLORATION FLOW 4/18/10
EXPLORATION Peter Bossaerts CNS 102b FLOW Canonical exploration problem: bandits Bayesian optimal exploration: The Gittins index Undirected exploration: e-greedy and softmax (logit) The economists and
More informationFor general queries, contact
Much of the work in Bayesian econometrics has focused on showing the value of Bayesian methods for parametric models (see, for example, Geweke (2005), Koop (2003), Li and Tobias (2011), and Rossi, Allenby,
More informationPractical Bayesian Design and Analysis for Drug and Device Clinical Trials
Practical Bayesian Design and Analysis for Drug and Device Clinical Trials p. 1/2 Practical Bayesian Design and Analysis for Drug and Device Clinical Trials Brian P. Hobbs Plan B Advisor: Bradley P. Carlin
More informationA Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning
A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning Azam Rabiee Computer Science and Engineering Isfahan University, Isfahan, Iran azamrabiei@yahoo.com Nasser Ghasem-Aghaee Computer
More informationCS4495 Computer Vision Introduction to Recognition. Aaron Bobick School of Interactive Computing
CS4495 Computer Vision Introduction to Recognition Aaron Bobick School of Interactive Computing What does recognition involve? Source: Fei Fei Li, Rob Fergus, Antonio Torralba. Verification: is that
More informationIntroduction to Computational Neuroscience
Introduction to Computational Neuroscience Lecture 10: Brain-Computer Interfaces Ilya Kuzovkin So Far Stimulus So Far So Far Stimulus What are the neuroimaging techniques you know about? Stimulus So Far
More informationSLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA. Henrik Kure
SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA Henrik Kure Dina, The Royal Veterinary and Agricuural University Bülowsvej 48 DK 1870 Frederiksberg C. kure@dina.kvl.dk
More informationMacroeconometric Analysis. Chapter 1. Introduction
Macroeconometric Analysis Chapter 1. Introduction Chetan Dave David N. DeJong 1 Background The seminal contribution of Kydland and Prescott (1982) marked the crest of a sea change in the way macroeconomists
More informationA Diabetes minimal model for Oral Glucose Tolerance Tests
arxiv:1601.04753v1 [stat.ap] 18 Jan 2016 A Diabetes minimal model for Oral Glucose Tolerance Tests J. Andrés Christen a, Marcos Capistrán a, Adriana Monroy b, Silvestre Alavez c, Silvia Quintana Vargas
More informationProgress in Risk Science and Causality
Progress in Risk Science and Causality Tony Cox, tcoxdenver@aol.com AAPCA March 27, 2017 1 Vision for causal analytics Represent understanding of how the world works by an explicit causal model. Learn,
More informationBayesian Nonparametric Methods for Precision Medicine
Bayesian Nonparametric Methods for Precision Medicine Brian Reich, NC State Collaborators: Qian Guan (NCSU), Eric Laber (NCSU) and Dipankar Bandyopadhyay (VCU) University of Illinois at Urbana-Champaign
More informationData Analysis Using Regression and Multilevel/Hierarchical Models
Data Analysis Using Regression and Multilevel/Hierarchical Models ANDREW GELMAN Columbia University JENNIFER HILL Columbia University CAMBRIDGE UNIVERSITY PRESS Contents List of examples V a 9 e xv " Preface
More informationCLEANing the Reward: Counterfactual Actions Remove Exploratory Action Noise in Multiagent Learning
CLEANing the Reward: Counterfactual Actions Remove Exploratory Action Noise in Multiagent Learning Chris HolmesParker Parflux LLC Salem, OR chris@parflux.com Matthew E. Taylor Washington State University
More informationBayesian Inference Bayes Laplace
Bayesian Inference Bayes Laplace Course objective The aim of this course is to introduce the modern approach to Bayesian statistics, emphasizing the computational aspects and the differences between the
More informationThe 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian
The 29th Fuzzy System Symposium (Osaka, September 9-, 3) A Fuzzy Inference Method Based on Saliency Map for Prediction Mao Wang, Yoichiro Maeda 2, Yasutake Takahashi Graduate School of Engineering, University
More informationNDT PERFORMANCE DEMONSTRATION USING SIMULATION
NDT PERFORMANCE DEMONSTRATION USING SIMULATION N. Dominguez, F. Jenson, CEA P. Dubois, F. Foucher, EXTENDE nicolas.dominguez@cea.fr CEA 10 AVRIL 2012 PAGE 1 NDT PERFORMANCE DEMO. USING SIMULATION OVERVIEW
More informationPsychology Research Methods Lab Session Week 10. Survey Design. Due at the Start of Lab: Lab Assignment 3. Rationale for Today s Lab Session
Psychology Research Methods Lab Session Week 10 Due at the Start of Lab: Lab Assignment 3 Rationale for Today s Lab Session Survey Design This tutorial supplements your lecture notes on Measurement by
More informationOnline Journal for Weightlifters & Coaches
Online Journal for Weightlifters & Coaches WWW.WL-LOG.COM Genadi Hiskia Dublin, October 7, 2016 WL-LOG.com Professional Online Service for Weightlifters & Coaches Professional Online Service for Weightlifters
More informationST440/550: Applied Bayesian Statistics. (10) Frequentist Properties of Bayesian Methods
(10) Frequentist Properties of Bayesian Methods Calibrated Bayes So far we have discussed Bayesian methods as being separate from the frequentist approach However, in many cases methods with frequentist
More informationP. RICHARD HAHN. Research areas. Employment. Education. Research papers
P. RICHARD HAHN Arizona State University email: prhahn@asu.edu Research areas Bayesian methods, causal inference, foundations of statistics, nonlinear regression, Monte Carlo methods, applications to social
More informationSPSS Correlation/Regression
SPSS Correlation/Regression Experimental Psychology Lab Session Week 6 10/02/13 (or 10/03/13) Due at the Start of Lab: Lab 3 Rationale for Today s Lab Session This tutorial is designed to ensure that you
More informationABSTRACT I. INTRODUCTION. Mohd Thousif Ahemad TSKC Faculty Nagarjuna Govt. College(A) Nalgonda, Telangana, India
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 1 ISSN : 2456-3307 Data Mining Techniques to Predict Cancer Diseases
More informationUser Guide Seeing and Managing Patients with AASM SleepTM
User Guide Seeing and Managing Patients with AASM SleepTM Once you have activated your account with AASM SleepTM, your next step is to begin interacting with and seeing patients. This guide is designed
More informationYou can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful.
icausalbayes USER MANUAL INTRODUCTION You can use this app to build a causal Bayesian network and experiment with inferences. We hope you ll find it interesting and helpful. We expect most of our users
More informationUNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER
UNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER 2016 DELIVERING VALUE WITH DATA SCIENCE BAYES APPROACH - MAKING DATA WORK HARDER The Ipsos MORI Data Science team increasingly
More informationCindy Fan, Rick Huang, Maggie Liu, Ethan Zhang October 20, f: Design Check-in Tasks
Cindy Fan, Rick Huang, Maggie Liu, Ethan Zhang October 20, 2014 2f: Design Check-in Tasks Tracking Liquid Intake Over Time - Easy Task Albert is a software developer in his late 20 s, working for a tech
More information