Reinforcement Learning and Artificial Intelligence


1 Reinforcement Learning and Artificial Intelligence. PIs: Rich Sutton, Michael Bowling, Dale Schuurmans, Vadim Bulitko; plus: Lori Troop, Mark Lee

2 Reinforcement learning is learning from interaction to achieve a goal. The agent takes actions in an environment and receives back states and rewards. A complete agent: temporally situated, continual learning & planning; its object is to affect the environment; the environment is stochastic & uncertain.
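A minimal sketch of this interaction loop (the toy environment, the -1-per-step reward, and all names here are invented for illustration):

```python
def run_episode(env_step, policy, start_state, max_steps=100):
    """Agent-environment loop: act, observe reward and next state, repeat."""
    state, total_reward = start_state, 0.0
    for _ in range(max_steps):
        action = policy(state)                         # agent chooses an action
        state, reward, done = env_step(state, action)  # environment responds
        total_reward += reward
        if done:
            break
    return total_reward

# Toy environment: walk right from position 0; the episode ends at position 5.
def env_step(state, action):
    state = state + (1 if action == "right" else -1)
    return state, -1.0, state >= 5                     # -1 per step until done

run_episode(env_step, lambda s: "right", start_state=0)  # cumulative reward -5.0
```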

3 Psychology Artificial Intelligence Reinforcement Learning Neuroscience Control Theory

4 Intro to RL, with a robot bias: policy, reward, value, TD error, world model, generalized policy iteration

5 World or world model

6 Hajime Kimura's RL robots (video panels: Backward; New Robot, Same Algorithm; Before/After)

7 Devilsticking: model-based reinforcement learning of devilsticking. Finnegan Southey (University of Alberta); Stefan Schaal & Chris Atkeson (Univ. of Southern California)

8

9

10 The RoboCup Soccer Competition

11 Autonomous Learning of Efficient Gait Kohl & Stone (UTexas) 2004

12

13

14

15 Policies. A policy maps each state to an action to take, like a stimulus-response rule. We seek a policy that maximizes cumulative reward; the policy is a subgoal to achieving reward.

16 The Reward Hypothesis: the goal of intelligence is to maximize the cumulative sum of a single received number: reward = pleasure - pain. Artificial intelligence = reward maximization.

17 Brain reward systems. What signal does this neuron carry? (Honeybee brain, VUM neuron; Hammer & Menzel)

18 Value

19 Value systems are hedonism with foresight. We value situations according to how much reward we expect will follow them. All efficient methods for solving sequential decision problems determine (learn or compute) value functions as an intermediate step. Value systems are a means to reward, yet we care more about values than rewards.

20 Pleasure = good. "Even enjoying yourself you call evil whenever it leads to the loss of a pleasure greater than its own, or lays up pains that outweigh its pleasures.... Isn't it the same when we turn back to pain? To suffer pain you call good when it either rids us of greater pains than its own or leads to pleasures that outweigh them." (Plato, Protagoras)

21 Reward: $r : \text{States} \to \Pr(R)$, e.g., $r_t \in \Re$. Policies: $\pi : \text{States} \to \Pr(\text{Actions})$; $\pi^*$ is optimal. Value functions: $V^\pi : \text{States} \to \Re$, $V^\pi(s) = E_\pi\{\sum_{k=1}^{\infty} \gamma^{k-1} r_{t+k} \mid s_t = s\}$, with discount factor $\gamma \approx 1$ but $\gamma < 1$.
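The sum inside that expectation can be checked numerically for a finite reward sequence; a small sketch (the function name is mine, not from the slides):

```python
def discounted_return(rewards, gamma=0.9):
    """Compute sum_{k=1}^{K} gamma^(k-1) * r_{t+k}: a reward k steps
    in the future is weighted by gamma^(k-1)."""
    return sum(gamma ** k * r for k, r in enumerate(rewards))

discounted_return([0.0, 0.0, 1.0], gamma=0.5)  # 0.5**2 * 1.0 = 0.25
```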

22 Backgammon. STATES: configurations of the playing board. ACTIONS: moves. REWARDS: win +1, lose -1, else 0. A big game.

23 TD-Gammon (Tesauro). Value network; action selection by 2-3 ply search; TD error $V_{t+1} - V_t$. Start with a random network; play millions of games against itself; learn a value function from this simulated experience. Six weeks later it's the best player of backgammon in the world.

24 The Mountain Car problem. Gravity wins: the car must back away from the goal to build momentum. SITUATIONS: car's position and velocity. ACTIONS: three thrusts: forward, reverse, none. REWARDS: always -1 until the car reaches the goal. No discounting; a minimum-time-to-goal problem (Moore, 1990).

25 Value functions learned while solving the Mountain Car problem. Minimize time to goal: value = estimated time to goal; goal region marked.

26 TD error

27 Reward: $r : \text{States} \to \Pr(R)$, e.g., $r_t \in \Re$. Policies: $\pi : \text{States} \to \Pr(\text{Actions})$; $\pi^*$ is optimal. Value functions: $V^\pi : \text{States} \to \Re$, $V^\pi(s) = E_\pi\{\sum_{k=1}^{\infty} \gamma^{k-1} r_{t+k} \mid s_t = s\}$, with discount factor $\gamma \approx 1$ but $\gamma < 1$.

28 $V_t = E\{\sum_{k=1}^{\infty} \gamma^{k-1} r_{t+k}\} = E\{r_{t+1} + \gamma \sum_{k=1}^{\infty} \gamma^{k-1} r_{t+1+k}\} \approx r_{t+1} + \gamma V_{t+1}$, so $\text{TDerror}_t = r_{t+1} + \gamma V_{t+1} - V_t$.
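This TD error drives the simplest value-learning rule, TD(0); a sketch under invented names (the states "A"/"B" and the step size alpha are illustrative):

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: move V(s) toward the bootstrapped target r + gamma*V(s')."""
    td_error = r + gamma * V[s_next] - V[s]  # the TD error defined above
    V[s] += alpha * td_error                 # learn a fraction of the error
    return td_error

V = {"A": 0.0, "B": 1.0}
td0_update(V, "A", r=0.0, s_next="B")  # TD error 0.9; V["A"] moves to 0.09
```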

29 Brain reward systems seem to signal TD error (Wolfram Schultz et al.)

30 World or world model

31 World models

32 Autonomous helicopter flight via reinforcement learning. Ng (Stanford); Kim, Jordan & Sastry (UC Berkeley), 2004

33

34

35 Reason as RL over imagined experience. 1. Learn a predictive model of the world's dynamics: transition probabilities, expected immediate rewards. 2. Use the model to generate imaginary experience: internal thought trials, mental simulation (Craik, 1943). 3. Apply RL as if the experience had really happened: vicarious trial and error (Tolman, 1932).
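Steps 2-3 correspond to the planning phase of Dyna-style algorithms; a sketch, assuming a deterministic learned model stored as a dict (all names here are my own, not the slides'):

```python
import random

def planning_sweep(Q, model, alpha=0.5, gamma=0.9, steps=10):
    """Replay transitions from the learned model as imaginary experience,
    applying an ordinary Q-learning update as if they had really happened."""
    for _ in range(steps):
        (s, a), (r, s2) = random.choice(list(model.items()))  # imagined step
        best_next = max(Q[s2].values()) if s2 in Q else 0.0   # 0 past the model
        Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
```

With a single remembered transition of reward 1, repeated sweeps drive that Q-value toward 1 without any further real experience.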

36 GridWorld Example

37 Intro to RL, with a robot bias: policy, reward, value, TD error, world model, generalized policy iteration

38 Policy Iteration: $\pi_0 \to V^{\pi_0} \to \pi_1 \to V^{\pi_1} \to \cdots \to \pi^* \to V^* \to \pi^*$, alternating policy evaluation and policy improvement (greedification). Improvement is monotonic; the process converges in a finite number of steps. Generalized policy iteration: intermix the two steps more finely, state by state, action by action, sample by sample.
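For a small deterministic MDP the alternation can be sketched directly (the transition table `P`, reward table `R`, and the overall shape are my own illustration, not the slides' notation):

```python
def policy_iteration(states, actions, P, R, gamma=0.9, tol=1e-6):
    """Alternate policy evaluation and greedification until the policy is stable.
    P[s][a] is the (deterministic) next state; R[s][a] the immediate reward."""
    policy = {s: actions[0] for s in states}
    while True:
        # Policy evaluation: iterate V(s) = R(s, pi(s)) + gamma * V(s')
        V = {s: 0.0 for s in states}
        while True:
            delta = 0.0
            for s in states:
                a = policy[s]
                v = R[s][a] + gamma * V[P[s][a]]
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Greedification: make the policy greedy with respect to V
        stable = True
        for s in states:
            best = max(actions, key=lambda a: R[s][a] + gamma * V[P[s][a]])
            if best != policy[s]:
                policy[s], stable = best, False
        if stable:
            return policy, V
```

On a two-state example where "go" from s0 earns reward 1, one round of evaluation plus greedification already finds the optimal policy; each improvement step is monotonic, as the slide notes.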

39 Generalized Policy Iteration: policy evaluation takes the policy $\pi$ to its value function $V^\pi$; greedification takes a value function back to an improved policy; alternating the two converges to $\pi^*$ and $V^*$.

40 Summary: RL's computational theory of mind: Reward, Policy, Value Function, Predictive Model

41 Summary: RL's computational theory of mind: Reward, Policy, Value Function, Predictive Model. The value function: a learned, time-varying prediction of imminent reward

42 Summary: RL's computational theory of mind: Reward, Policy, Value Function, Predictive Model. A learned, time-varying prediction of imminent reward; key to all efficient methods for finding optimal policies

43 Summary: RL's computational theory of mind: Reward, Policy, Value Function, Predictive Model. A learned, time-varying prediction of imminent reward; key to all efficient methods for finding optimal policies. This has nothing to do with either biology or computers

44 Summary: RL's computational theory of mind: Reward, Policy, Value Function, Predictive Model. It's all created from the scalar reward signal

45 Summary: RL's computational theory of mind: Reward, Policy, Value Function, Predictive Model. It's all created from the scalar reward signal, together with the causal structure of the world


47 Additions for next year: Sarsa; how RL might actually be applied to something the students might do

48 World or world model

49 Current work: knowledge is subjective. An individual's knowledge consists of predictions about its future (low-level) sensations and actions. Everything else (objects, houses, people, love) must be constructed from sensation and action.

50 Subjective/predictive knowledge. John is in the coffee room; my car is in the South parking lot. What we know about geography and navigation; what we know about how an object looks and rotates; what we know about how objects can be used; recognition strategies for objects and letters. The portrait of Washington on the dollar in the wallet in my other pants in the laundry has a mustache on it.

51 RoboCup soccer keepaway Stone, Sutton & Kuhlmann, 2005


More information

Consider the following aspects of human intelligence: consciousness, memory, abstract reasoning

Consider the following aspects of human intelligence: consciousness, memory, abstract reasoning All life is nucleic acid. The rest is commentary. Isaac Asimov Consider the following aspects of human intelligence: consciousness, memory, abstract reasoning and emotion. Discuss the relative difficulty

More information

Computational Neuroscience. Instructor: Odelia Schwartz

Computational Neuroscience. Instructor: Odelia Schwartz Computational Neuroscience 2017 1 Instructor: Odelia Schwartz From the NIH web site: Committee report: Brain 2025: A Scientific Vision (from 2014) #1. Discovering diversity: Identify and provide experimental

More information

The Psychology of Street Photography

The Psychology of Street Photography The Psychology of Street Photography I think street photography is 90% psychology if you can master your own mental psychology, then you will be able to achieve your personal maximum in your photography

More information

Cognitive and Computational Neuroscience of Aging

Cognitive and Computational Neuroscience of Aging Cognitive and Computational Neuroscience of Aging Hecke Dynamically Selforganized MPI Brains Can Compute Nervously Goettingen CNS Seminar SS 2007 Outline 1 Introduction 2 Psychological Models 3 Physiological

More information

Intelligent Agents. BBM 405 Fundamentals of Artificial Intelligence Pinar Duygulu Hacettepe University. Slides are mostly adapted from AIMA

Intelligent Agents. BBM 405 Fundamentals of Artificial Intelligence Pinar Duygulu Hacettepe University. Slides are mostly adapted from AIMA 1 Intelligent Agents BBM 405 Fundamentals of Artificial Intelligence Pinar Duygulu Hacettepe University Slides are mostly adapted from AIMA Outline 2 Agents and environments Rationality PEAS (Performance

More information

Learning to Use Episodic Memory

Learning to Use Episodic Memory Learning to Use Episodic Memory Nicholas A. Gorski (ngorski@umich.edu) John E. Laird (laird@umich.edu) Computer Science & Engineering, University of Michigan 2260 Hayward St., Ann Arbor, MI 48109 USA Abstract

More information

Refresh. The science of sleep for optimal performance and well being. Sleep and Exams: Strange Bedfellows

Refresh. The science of sleep for optimal performance and well being. Sleep and Exams: Strange Bedfellows Refresh The science of sleep for optimal performance and well being Unit 7: Sleep and Exams: Strange Bedfellows Can you remember a night when you were trying and trying to get to sleep because you had

More information

Luke A. Rosedahl & F. Gregory Ashby. University of California, Santa Barbara, CA 93106

Luke A. Rosedahl & F. Gregory Ashby. University of California, Santa Barbara, CA 93106 Luke A. Rosedahl & F. Gregory Ashby University of California, Santa Barbara, CA 93106 Provides good fits to lots of data Google Scholar returns 2,480 hits Over 130 videos online Own Wikipedia page Has

More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 2. Rational Agents Nature and Structure of Rational Agents and Their Environments Wolfram Burgard, Bernhard Nebel and Martin Riedmiller Albert-Ludwigs-Universität

More information

CS 331: Artificial Intelligence Intelligent Agents. Agent-Related Terms. Question du Jour. Rationality. General Properties of AI Systems

CS 331: Artificial Intelligence Intelligent Agents. Agent-Related Terms. Question du Jour. Rationality. General Properties of AI Systems General Properties of AI Systems CS 331: Artificial Intelligence Intelligent Agents Sensors Reasoning Actuators Percepts Actions Environmen nt This part is called an agent. Agent: anything that perceives

More information

Introduction to Mental Skills Training for Successful Athletes John Kontonis Assoc MAPS

Introduction to Mental Skills Training for Successful Athletes John Kontonis Assoc MAPS Introduction to Mental Skills Training for Successful Athletes John Kontonis Assoc MAPS "You can always become better." - Tiger Woods Successful Athletes 1. Choose and maintain a positive attitude. 2.

More information

Secrets to the Body of Your Life in 2017

Secrets to the Body of Your Life in 2017 Secrets to the Body of Your Life in 2017 YOU CAN HAVE RESULTS OR EXCUSES NOT BOTH. INTRO TO THIS LESSON Welcome to Lesson #3 of your BarStarzz Calisthenics Workshop! For any new comers, make sure you watch

More information

Brain-Based Devices. Studying Cognitive Functions with Embodied Models of the Nervous System

Brain-Based Devices. Studying Cognitive Functions with Embodied Models of the Nervous System Brain-Based Devices Studying Cognitive Functions with Embodied Models of the Nervous System Jeff Krichmar The Neurosciences Institute San Diego, California, USA http://www.nsi.edu/nomad develop theory

More information

Learning to act by predicting the future

Learning to act by predicting the future Learning to act by predicting the future Alexey Dosovitskiy and Vladlen Koltun Intel Labs, Santa Clara ICLR 2017, Toulon, France Sensorimotor control Aim: produce useful motor actions based on sensory

More information

Introduction to Biological Anthropology: Notes 13 Mating: Primate females and males Copyright Bruce Owen 2010 We want to understand the reasons

Introduction to Biological Anthropology: Notes 13 Mating: Primate females and males Copyright Bruce Owen 2010 We want to understand the reasons Introduction to Biological Anthropology: Notes 13 Mating: Primate females and males Copyright Bruce Owen 2010 We want to understand the reasons behind the lifestyles of our non-human primate relatives

More information

Marcus Hutter Canberra, ACT, 0200, Australia

Marcus Hutter Canberra, ACT, 0200, Australia Marcus Hutter Canberra, ACT, 0200, Australia http://www.hutter1.net/ Australian National University Abstract The approaches to Artificial Intelligence (AI) in the last century may be labelled as (a) trying

More information

Stress is different for everyone While what happens in the brain and the body is the same for all of us, the precipitating factors are very

Stress is different for everyone While what happens in the brain and the body is the same for all of us, the precipitating factors are very 1 Stress is different for everyone While what happens in the brain and the body is the same for all of us, the precipitating factors are very individual. What one person experiences as stressful might

More information

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine

Policy Gradients. CS : Deep Reinforcement Learning Sergey Levine Policy Gradients CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 1 milestone due today (11:59 pm)! Don t be late! 2. Remember to start forming final project groups Today s

More information

MITOCW watch?v=fmx5yj95sgq

MITOCW watch?v=fmx5yj95sgq MITOCW watch?v=fmx5yj95sgq The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To

More information

PART - A 1. Define Artificial Intelligence formulated by Haugeland. The exciting new effort to make computers think machines with minds in the full and literal sense. 2. Define Artificial Intelligence

More information

Solutions for Chapter 2 Intelligent Agents

Solutions for Chapter 2 Intelligent Agents Solutions for Chapter 2 Intelligent Agents 2.1 This question tests the student s understanding of environments, rational actions, and performance measures. Any sequential environment in which rewards may

More information

Agents. This course is about designing intelligent agents Agents and environments. Rationality. The vacuum-cleaner world

Agents. This course is about designing intelligent agents Agents and environments. Rationality. The vacuum-cleaner world This course is about designing intelligent agents and environments Rationality The vacuum-cleaner world The concept of rational behavior. Environment types Agent types 1 An agent is an entity that perceives

More information

Positive Thinking Train Your Mind For Success And Happiness

Positive Thinking Train Your Mind For Success And Happiness Positive Thinking Train Your Mind For Success And Happiness Francisco Bujan - 1 - Contents Intro 12 Part 1 - Mind dynamics 13 What is self talk and how to make it work for you? 14 Mind sets 15 How inspiration

More information

Introduction and Historical Background. August 22, 2007

Introduction and Historical Background. August 22, 2007 1 Cognitive Bases of Behavior Introduction and Historical Background August 22, 2007 2 Cognitive Psychology Concerned with full range of psychological processes from sensation to knowledge representation

More information

Contents. Foundations of Artificial Intelligence. Agents. Rational Agents

Contents. Foundations of Artificial Intelligence. Agents. Rational Agents Contents Foundations of Artificial Intelligence 2. Rational s Nature and Structure of Rational s and Their s Wolfram Burgard, Bernhard Nebel, and Martin Riedmiller Albert-Ludwigs-Universität Freiburg May

More information

Lecture 2 Agents & Environments (Chap. 2) Outline

Lecture 2 Agents & Environments (Chap. 2) Outline Lecture 2 Agents & Environments (Chap. 2) Based on slides by UW CSE AI faculty, Dan Klein, Stuart Russell, Andrew Moore Outline Agents and environments Rationality PEAS specification Environment types

More information