Reinforcement Learning and Artificial Intelligence
|
|
- Martha Oliver
- 6 years ago
- Views:
Transcription
1 Reinforcement Learning and Artificial Intelligence PIs: Rich Sutton Michael Bowling Dale Schuurmans Vadim Bulitko plus: Dale Schuurmans Vadim Bulitko Lori Troop Mark Lee
2 Reinforcement learning is learning from interaction to achieve a goal Environment state action reward Agent complete agent temporally situated continual learning & planning object is to affect environment environment stochastic & uncertain
3 Psychology Artificial Intelligence Reinforcement Learning Neuroscience Control Theory
4 Intro to RL with a robot bias policy reward value TD error world model generalized policy iteration
5 World or world model 5
6 Hajime Kimura s RL Robots Backward New Robot, Same algorithm Before After
7 Devilsticking Finnegan Southey University of Alberta Stefan Schaal & Chris Atkeson Univ. of Southern California Model-based Reinforcement Learning of Devilsticking
8
9
10 The RoboCup Soccer Competition
11 Autonomous Learning of Efficient Gait Kohl & Stone (UTexas) 2004
12
13
14
15 Policies A policy maps each state to an action to take Like a stimulus response rule We seek a policy that maximizes cumulative reward The policy is a subgoal to achieving reward
16 The Reward Hypothesis The goal of intelligence is to maximize the cumulative sum of a single received number: reward = pleasure - pain Artificial Intelligence = reward maximization
17 Brain reward systems What signal does this neuron carry? Honeybee Brain VUM Neuron Hammer, Menzel
18 Value
19 Value systems are hedonism with foresight We value situations according to how much reward we expect will follow them All efficient methods for solving sequential decision problems determine (learn or compute) value functions as an intermediate step Value systems are a means to reward, yet we care more about values than rewards
20 Pleasure good Even enjoying yourself you call evil whenever it leads to the loss of a pleasure greater than its own, or lays up pains that outweigh its pleasures.... Isn't it the same when we turn back to pain? To suffer pain you call good when it either rids us of greater pains than its own or leads to pleasures that outweigh them. Plato, Protagoras
21 Reward r : States Pr(R) e.g., r t R Policies π : States Pr(Actions) π is optimal Value Functions V π : States R V π (s) = E π γ k 1 r t +k k =1 given that s t = s discount factor 1 but <1
22 Backgammon STATES: configurations of the playing board ( ) ACTIONS: moves REWARDS: win: +1 lose: 1 else: 0 a big game
23 TD-Gammon Tesauro, Value Action selection by 2-3 ply search TD Error V t +1 V t Start with a random Network Play millions of games against itself Learn a value function from this simulated experience Six weeks later it s the best player of backgammon in the world
24 The Mountain Car Problem Goal Gravity wins SITUATIONS: car's position and velocity ACTIONS: three thrusts: forward, reverse, none REWARDS: always 1 until car reaches the goal No Discounting Minimum-Time-to-Goal Problem Moore, 1990
25 Value Functions Learned while solving the Mountain Car problem Minimize Time-to-Goal Value = estimated time to goal Goal region
26 TD error
27 Reward r : States Pr(R) e.g., r t R Policies π : States Pr(Actions) π is optimal Value Functions V π : States R V π (s) = E π γ k 1 r t +k k =1 given that s t = s discount factor 1 but <1
28 V t = E γ k 1 r t +k k =1 = E r t+1 + γ γ k 1 r t +1+k k =1 r t +1 +γv t +1 TDerror t = r t +1 +γv t +1 V t
29 Brain reward systems seem to signal TD error TD error Wolfram Schultz, et al.
30 World or world model 30
31 World models
32 Autonomous helicopter flight via Reinforcement Learning Ng (Stanford), Kim, Jordan, & Sastry (UC Berkeley) 2004
33
34
35 Reason as RL over Imagined Experience 1. Learn a predictive model of the world s dynamics transition probabilities, expected immediate rewards 2. Use model to generate imaginary experiences internal thought trials, mental simulation (Craik, 1943) 3. Apply RL as if experience had really happened vicarious trial and error (Tolman, 1932)
36 GridWorld Example
37 Intro to RL with a robot bias policy reward value TD error world model generalized policy iteration
38 Policy Iteration π 0 V π 0 π 1 V π 1 Lπ * V * π * policy evaluation policy improvement greedification Improvement is monotonic Converges is a finite number of steps Generalized Policy Iteration: Intermix the two steps more finely, state by state, action by action, sample by sample
39 Generalized Policy Iteration policy evaluation Policy π Value Function π V greedification π * V *
40 Summary: RL s Computational Theory of Mind Reward Policy Value Function Predictive Model
41 Summary: RL s Computational Theory of Mind Reward Policy Value Function Predictive Model A learned, time-varying prediction of imminent reward
42 Summary: RL s Computational Theory of Mind Reward Policy Value Function Predictive Model A learned, time-varying prediction of imminent reward Key to all efficient methods for finding optimal policies
43 Summary: RL s Computational Theory of Mind Reward Policy Value Function Predictive Model A learned, time-varying prediction of imminent reward Key to all efficient methods for finding optimal policies This has nothing to do with either biology or computers
44 Summary: RL s Computational Theory of Mind Reward Policy It s all created from the scalar reward signal Value Function Predictive Model
45 Summary: RL s Computational Theory of Mind Reward Policy It s all created from the scalar reward signal Value Function Predictive Model together with the causal structure of the world
46 Summary: RL s Computational Theory of Mind Reward Policy It s all created from the scalar reward signal Value Function Predictive Model together with the causal structure of the world
47 additions for next year sarsa how it might actually applied to something they might do
48 World or world model 48
49 Current work Knowledge is subjective An individual s knowledge consists of predictions about its future (low-level) sensations and actions Everything else (objects, houses, people, love) must be constructed from sensation and action
50 Subjective/Predictive Knowledge John is in the coffee room My car in is the South parking lot What we know about geography, navigation What we know about how an object looks, rotates What we know about how objects can be used Recognition strategies for objects and letters The portrait of Washington on the dollar in the wallet in my other pants in the laundry, has a mustache on it
51 RoboCup soccer keepaway Stone, Sutton & Kuhlmann, 2005
Chapter 1. Give an overview of the whole RL problem. Policies Value functions. Tic-Tac-Toe example
Chapter 1 Give an overview of the whole RL problem n Before we break it up into parts to study individually Introduce the cast of characters n Experience (reward) n n Policies Value functions n Models
More informationReinforcement Learning
Reinforcement Learning Michèle Sebag ; TP : Herilalaina Rakotoarison TAO, CNRS INRIA Université Paris-Sud Nov. 9h, 28 Credit for slides: Richard Sutton, Freek Stulp, Olivier Pietquin / 44 Introduction
More informationLecture 13: Finding optimal treatment policies
MACHINE LEARNING FOR HEALTHCARE 6.S897, HST.S53 Lecture 13: Finding optimal treatment policies Prof. David Sontag MIT EECS, CSAIL, IMES (Thanks to Peter Bodik for slides on reinforcement learning) Outline
More informationSequential Decision Making
Sequential Decision Making Sequential decisions Many (most) real world problems cannot be solved with a single action. Need a longer horizon Ex: Sequential decision problems We start at START and want
More informationMachine Learning! R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction! MIT Press, 1998!
Introduction Machine Learning! Literature! Introduction 1 R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction! MIT Press, 1998! http://www.cs.ualberta.ca/~sutton/book/the-book.html! E. Alpaydin:
More informationLearning Utility for Behavior Acquisition and Intention Inference of Other Agent
Learning Utility for Behavior Acquisition and Intention Inference of Other Agent Yasutake Takahashi, Teruyasu Kawamata, and Minoru Asada* Dept. of Adaptive Machine Systems, Graduate School of Engineering,
More informationCS343: Artificial Intelligence
CS343: Artificial Intelligence Introduction: Part 2 Prof. Scott Niekum University of Texas at Austin [Based on slides created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All materials
More informationLearning to Identify Irrelevant State Variables
Learning to Identify Irrelevant State Variables Nicholas K. Jong Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 nkj@cs.utexas.edu Peter Stone Department of Computer Sciences
More informationDopamine neurons activity in a multi-choice task: reward prediction error or value function?
Dopamine neurons activity in a multi-choice task: reward prediction error or value function? Jean Bellot 1,, Olivier Sigaud 1,, Matthew R Roesch 3,, Geoffrey Schoenbaum 5,, Benoît Girard 1,, Mehdi Khamassi
More informationThe optimism bias may support rational action
The optimism bias may support rational action Falk Lieder, Sidharth Goel, Ronald Kwan, Thomas L. Griffiths University of California, Berkeley 1 Introduction People systematically overestimate the probability
More informationIntroduction to Deep Reinforcement Learning and Control
Carnegie Mellon School of Computer Science Deep Reinforcement Learning and Control Introduction to Deep Reinforcement Learning and Control Lecture 1, CMU 10703 Katerina Fragkiadaki Logistics 3 assignments
More informationExploration and Exploitation in Reinforcement Learning
Exploration and Exploitation in Reinforcement Learning Melanie Coggan Research supervised by Prof. Doina Precup CRA-W DMP Project at McGill University (2004) 1/18 Introduction A common problem in reinforcement
More informationReinforcement learning and the brain: the problems we face all day. Reinforcement Learning in the brain
Reinforcement learning and the brain: the problems we face all day Reinforcement Learning in the brain Reading: Y Niv, Reinforcement learning in the brain, 2009. Decision making at all levels Reinforcement
More informationTowards Learning to Ignore Irrelevant State Variables
Towards Learning to Ignore Irrelevant State Variables Nicholas K. Jong and Peter Stone Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 {nkj,pstone}@cs.utexas.edu Abstract
More informationDiagnostic Measure of Procrastination. Dr. Piers Steel
Diagnostic Measure of Procrastination Dr. Piers Steel Scientific Stages of Development 1 2 3 4 Identification and Measurement: What is it and how can we identify it. Prediction and Description: What is
More informationOverview. What is an agent?
Artificial Intelligence Programming s and s Chris Brooks Overview What makes an agent? Defining an environment Overview What makes an agent? Defining an environment Department of Computer Science University
More informationCost-Sensitive Learning for Biological Motion
Olivier Sigaud Université Pierre et Marie Curie, PARIS 6 http://people.isir.upmc.fr/sigaud October 5, 2010 1 / 42 Table of contents The problem of movement time Statement of problem A discounted reward
More informationAgents and Environments
Artificial Intelligence Programming s and s Chris Brooks 3-2: Overview What makes an agent? Defining an environment Types of agent programs 3-3: Overview What makes an agent? Defining an environment Types
More informationWhy Do Kids Play Soccer? Why Do You Coach Soccer?
COMMUNICATION MOTIVATION Why Do Kids Play Soccer? Why Do You Coach Soccer? Why Do Kids Play Soccer? Because they want to have fun Because they want to learn, develop, and get better Because they want to
More informationUsing Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes
Using Eligibility Traces to Find the est Memoryless Policy in Partially Observable Markov Decision Processes John Loch Department of Computer Science University of Colorado oulder, CO 80309-0430 loch@cs.colorado.edu
More informationA Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning
A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning Azam Rabiee Computer Science and Engineering Isfahan University, Isfahan, Iran azamrabiei@yahoo.com Nasser Ghasem-Aghaee Computer
More informationA Reinforcement Learning Approach Involving a Shortest Path Finding Algorithm
Proceedings of the 003 IEEE/RSJ Intl. Conference on Intelligent Robots and Systems Proceedings Las Vegas, of Nevada the 003 October IEEE/RSJ 003 Intl. Conference on Intelligent Robots and Systems Las Vegas,
More informationSUPERVISED-REINFORCEMENT LEARNING FOR A MOBILE ROBOT IN A REAL-WORLD ENVIRONMENT. Karla G. Conn. Thesis. Submitted to the Faculty of the
SUPERVISED-REINFORCEMENT LEARNING FOR A MOBILE ROBOT IN A REAL-WORLD ENVIRONMENT By Karla G. Conn Thesis Submitted to the Faculty of the Graduate School of Vanderbilt University in partial fulfillment
More informationC2 Qu2 DP1 How can psychology affect performance?
C2 Qu2 DP1 How can psychology affect performance? Hi Guys Let s move onto the second critical question in Core 2 How can psychology affect performance? In this video, we will be explore motivation from
More informationOscillatory Neural Network for Image Segmentation with Biased Competition for Attention
Oscillatory Neural Network for Image Segmentation with Biased Competition for Attention Tapani Raiko and Harri Valpola School of Science and Technology Aalto University (formerly Helsinki University of
More informationA Model of Dopamine and Uncertainty Using Temporal Difference
A Model of Dopamine and Uncertainty Using Temporal Difference Angela J. Thurnham* (a.j.thurnham@herts.ac.uk), D. John Done** (d.j.done@herts.ac.uk), Neil Davey* (n.davey@herts.ac.uk), ay J. Frank* (r.j.frank@herts.ac.uk)
More informationCS324-Artificial Intelligence
CS324-Artificial Intelligence Lecture 3: Intelligent Agents Waheed Noor Computer Science and Information Technology, University of Balochistan, Quetta, Pakistan Waheed Noor (CS&IT, UoB, Quetta) CS324-Artificial
More informationIntangible Attributes of Baseball s Best Players
by Jim Murphy Intangible Attributes of Baseball s Best Players 1 Dealing with adversity Baseball is a game of failure, and the top players are comfortable with adversity. They don t base their self confidence
More informationDAY 2 RESULTS WORKSHOP 7 KEYS TO C HANGING A NYTHING IN Y OUR LIFE TODAY!
H DAY 2 RESULTS WORKSHOP 7 KEYS TO C HANGING A NYTHING IN Y OUR LIFE TODAY! appy, vibrant, successful people think and behave in certain ways, as do miserable and unfulfilled people. In other words, there
More informationLecture 5: Sequential Multiple Assignment Randomized Trials (SMARTs) for DTR. Donglin Zeng, Department of Biostatistics, University of North Carolina
Lecture 5: Sequential Multiple Assignment Randomized Trials (SMARTs) for DTR Introduction Introduction Consider simple DTRs: D = (D 1,..., D K ) D k (H k ) = 1 or 1 (A k = { 1, 1}). That is, a fixed treatment
More informationDo Reinforcement Learning Models Explain Neural Learning?
Do Reinforcement Learning Models Explain Neural Learning? Svenja Stark Fachbereich 20 - Informatik TU Darmstadt svenja.stark@stud.tu-darmstadt.de Abstract Because the functionality of our brains is still
More informationThe Rescorla Wagner Learning Model (and one of its descendants) Computational Models of Neural Systems Lecture 5.1
The Rescorla Wagner Learning Model (and one of its descendants) Lecture 5.1 David S. Touretzky Based on notes by Lisa M. Saksida November, 2015 Outline Classical and instrumental conditioning The Rescorla
More informationA Decision-Theoretic Approach to Evaluating Posterior Probabilities of Mental Models
A Decision-Theoretic Approach to Evaluating Posterior Probabilities of Mental Models Jonathan Y. Ito and David V. Pynadath and Stacy C. Marsella Information Sciences Institute, University of Southern California
More informationA Neurally-Inspired Model for Detecting and Localizing Simple Motion Patterns in Image Sequences
A Neurally-Inspired Model for Detecting and Localizing Simple Motion Patterns in Image Sequences Marc Pomplun 1, Yueju Liu 2, Julio Martinez-Trujillo 2, Evgueni Simine 2, and John K. Tsotsos 2 1 Department
More informationDoing High Quality Field Research. Kim Elsbach University of California, Davis
Doing High Quality Field Research Kim Elsbach University of California, Davis 1 1. What Does it Mean to do High Quality (Qualitative) Field Research? a) It plays to the strengths of the method for theory
More informationStress Prevention in 6 Steps S T E P 3 A P P R A I S E : C O G N I T I V E R E S T R U C T U R I N G
Stress Prevention in 6 Steps S T E P 3 A P P R A I S E : C O G N I T I V E R E S T R U C T U R I N G 6 steps overview 1. Assess: Raising Awareness 2. Avoid: Unnecessary stress; problem solving 3. Appraise
More informationChapter 2: Intelligent Agents
Chapter 2: Intelligent Agents Outline Last class, introduced AI and rational agent Today s class, focus on intelligent agents Agent and environments Nature of environments influences agent design Basic
More informationPolicy Gradients. CS : Deep Reinforcement Learning Sergey Levine
Policy Gradients CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 1 due today (11:59 pm)! Don t be late! 2. Remember to start forming final project groups Today s Lecture 1.
More informationIS IT A STATE OR A GOAL?
IS IT A STATE OR A GOAL? A STATE IS: A GOAL OR OUTCOME IS: EXPRESSED AMBIGUOUSLY YOU CAN HAVE IT NOW (Immediate Gratification) INFINITE (Too much is never enough!) EXPRESSED FOR SELF AND/OR OTHERS EXPRESSED
More informationKatsunari Shibata and Tomohiko Kawano
Learning of Action Generation from Raw Camera Images in a Real-World-Like Environment by Simple Coupling of Reinforcement Learning and a Neural Network Katsunari Shibata and Tomohiko Kawano Oita University,
More informationStop Smoking Pre Session Guide. Helping YOU Successfully Stop Smoking For Good
Stop Smoking Pre Session Guide Helping YOU Successfully Stop Smoking For Good Important Note: This document and your appointment are not a substitute for appropriate medical advice. If you have concerns
More informationIntroduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018
Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this
More informationArtificial Intelligence Lecture 7
Artificial Intelligence Lecture 7 Lecture plan AI in general (ch. 1) Search based AI (ch. 4) search, games, planning, optimization Agents (ch. 8) applied AI techniques in robots, software agents,... Knowledge
More informationWorksheet # 1 Why We Procrastinate
Worksheet # 1 Why We Procrastinate Directions. Take your best guess and rank the following reasons for why we procrastinate from 1 to 5 starting with 1 being the biggest reason we procrastinate and 5 being
More informationSeminar Thesis: Efficient Planning under Uncertainty with Macro-actions
Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions Ragnar Mogk Department of Computer Science Technische Universität Darmstadt ragnar.mogk@stud.tu-darmstadt.de 1 Introduction This
More informationDikran J. Martin. Psychology 110. Name: Date: Principal Features. "First, the term learning does not apply to (168)
Dikran J. Martin Psychology 110 Name: Date: Lecture Series: Chapter 5 Learning: How We're Changed Pages: 26 by Experience TEXT: Baron, Robert A. (2001). Psychology (Fifth Edition). Boston, MA: Allyn and
More informationConcreteness and Abstraction in Everyday Explanation. Christos Bechlivanidis, David Lagnado Jeff Zemla & Steve Sloman
Concreteness and Abstraction in Everyday Explanation Christos Bechlivanidis, David Lagnado Jeff Zemla & Steve Sloman What goes into a good explanation? What goes into a good explanation? Why there was
More informationArtificial Intelligence
Artificial Intelligence Intelligent Agents Chapter 2 & 27 What is an Agent? An intelligent agent perceives its environment with sensors and acts upon that environment through actuators 2 Examples of Agents
More informationPartially-Observable Markov Decision Processes as Dynamical Causal Models. Finale Doshi-Velez NIPS Causality Workshop 2013
Partially-Observable Markov Decision Processes as Dynamical Causal Models Finale Doshi-Velez NIPS Causality Workshop 2013 The POMDP Mindset We poke the world (perform an action) Agent World The POMDP Mindset
More informationHigh-level Vision. Bernd Neumann Slides for the course in WS 2004/05. Faculty of Informatics Hamburg University Germany
High-level Vision Bernd Neumann Slides for the course in WS 2004/05 Faculty of Informatics Hamburg University Germany neumann@informatik.uni-hamburg.de http://kogs-www.informatik.uni-hamburg.de 1 Contents
More informationIntroduction to Artificial Intelligence 2 nd semester 2016/2017. Chapter 2: Intelligent Agents
Introduction to Artificial Intelligence 2 nd semester 2016/2017 Chapter 2: Intelligent Agents Mohamed B. Abubaker Palestine Technical College Deir El-Balah 1 Agents and Environments An agent is anything
More informationCHAPTER 7: Achievement motivation, attribution theory, self-efficacy and confidence. Practice questions - text book pages
QUESTIONS AND ANSWERS CHAPTER 7: Achievement motivation, attribution theory, self-efficacy and confidence Practice questions - text book pages 111-112 1) Which one of the following best explains achievement
More informationCerebellar Learning and Applications
Cerebellar Learning and Applications Matthew Hausknecht, Wenke Li, Mike Mauk, Peter Stone January 23, 2014 Motivation Introduces a novel learning agent: the cerebellum simulator. Study the successes and
More informationMeta-learning in Reinforcement Learning
Neural Networks 16 (2003) 5 9 Neural Networks letter Meta-learning in Reinforcement Learning Nicolas Schweighofer a, *, Kenji Doya a,b,1 a CREST, Japan Science and Technology Corporation, ATR, Human Information
More informationUsing Heuristic Models to Understand Human and Optimal Decision-Making on Bandit Problems
Using Heuristic Models to Understand Human and Optimal Decision-Making on andit Problems Michael D. Lee (mdlee@uci.edu) Shunan Zhang (szhang@uci.edu) Miles Munro (mmunro@uci.edu) Mark Steyvers (msteyver@uci.edu)
More informationA model to explain the emergence of reward expectancy neurons using reinforcement learning and neural network $
Neurocomputing 69 (26) 1327 1331 www.elsevier.com/locate/neucom A model to explain the emergence of reward expectancy neurons using reinforcement learning and neural network $ Shinya Ishii a,1, Munetaka
More informationSelective Attention as an Example of Building Representations within Reinforcement Learning
University of Colorado, Boulder CU Scholar Psychology and Neuroscience Graduate Theses & Dissertations Psychology and Neuroscience Spring 1-1-2011 Selective Attention as an Example of Building Representations
More informationHierarchical Control over Effortful Behavior by Dorsal Anterior Cingulate Cortex. Clay Holroyd. Department of Psychology University of Victoria
Hierarchical Control over Effortful Behavior by Dorsal Anterior Cingulate Cortex Clay Holroyd Department of Psychology University of Victoria Proposal Problem(s) with existing theories of ACC function.
More informationHow Brain May Weigh the World With Simple Dopamine System
March 19, 1996, Tuesday SCIENCE DESK How Brain May Weigh the World With Simple Dopamine System By SANDRA BLAKESLEE (NYT) 2042 words IN the ever increasing cascade of new information on brain chemistry
More informationIntroduction to Biological Anthropology: Notes 12 Mating: Primate females and males Copyright Bruce Owen 2009 We want to understand the reasons
Introduction to Biological Anthropology: Notes 12 Mating: Primate females and males Copyright Bruce Owen 2009 We want to understand the reasons behind the lifestyles of our non-human primate relatives
More informationAI Programming CS F-04 Agent Oriented Programming
AI Programming CS662-2008F-04 Agent Oriented Programming David Galles Department of Computer Science University of San Francisco 04-0: Agents & Environments What is an Agent What is an Environment Types
More informationPOND-Hindsight: Applying Hindsight Optimization to POMDPs
POND-Hindsight: Applying Hindsight Optimization to POMDPs Alan Olsen and Daniel Bryce alan@olsen.org, daniel.bryce@usu.edu Utah State University Logan, UT Abstract We present the POND-Hindsight entry in
More informationBe Fit for Life Series. Key Number Five Increase Metabolism
Key #5 Increase Metabolism Page 1 of 9 Be Fit for Life Series Key Number Five Increase Metabolism Sit or lie in a safe and comfortable position I want you to close your eyes keep them closed until I ask
More informationTHE PSYCHOLOGY OF HAPPINESS D A Y 3 T H E G O O D L I F E
THE PSYCHOLOGY OF HAPPINESS D A Y 3 T H E G O O D L I F E EXPERIENCE SAMPLING On a scale of 1 (not at all) 10 (extremely) Do you feel: Happy? Relaxed? Awake? AGENDA Grounding Exercise Homework Discussion
More informationIntroduction to Computational Neuroscience
Introduction to Computational Neuroscience Lecture 7: Network models Lesson Title 1 Introduction 2 Structure and Function of the NS 3 Windows to the Brain 4 Data analysis 5 Data analysis II 6 Single neuron
More informationBrief Summary of Liminal thinking Create the change you want by changing the way you think Dave Gray
Brief Summary of Liminal thinking Create the change you want by changing the way you think Dave Gray The word liminal comes from the Latin word limen which means threshold. Liminal thinking is the art
More informationRespect Handout. You receive respect when you show others respect regardless of how they treat you.
RESPECT -- THE WILL TO UNDERSTAND Part Two Heading in Decent People, Decent Company: How to Lead with Character at Work and in Life by Robert Turknett and Carolyn Turknett, 2005 Respect Handout Respect
More informationAgents and State Spaces. CSCI 446: Artificial Intelligence
Agents and State Spaces CSCI 446: Artificial Intelligence Overview Agents and environments Rationality Agent types Specifying the task environment Performance measure Environment Actuators Sensors Search
More informationCognitive and Biological Agent Models for Emotion Reading
Cognitive and Biological Agent Models for Emotion Reading Zulfiqar A. Memon, Jan Treur Vrije Universiteit Amsterdam, Department of Artificial Intelligence De Boelelaan 1081, 1081 HV Amsterdam, the Netherlands
More informationAgents and Environments. Stephen G. Ware CSCI 4525 / 5525
Agents and Environments Stephen G. Ware CSCI 4525 / 5525 Agents An agent (software or hardware) has: Sensors that perceive its environment Actuators that change its environment Environment Sensors Actuators
More informationOrganizing Behavior into Temporal and Spatial Neighborhoods
Organizing Behavior into Temporal and Spatial Neighborhoods Mark Ring IDSIA / University of Lugano / SUPSI Galleria 6928 Manno-Lugano, Switzerland Email: mark@idsia.ch Tom Schaul Courant Institute of Mathematical
More informationAgents. Robert Platt Northeastern University. Some material used from: 1. Russell/Norvig, AIMA 2. Stacy Marsella, CS Seif El-Nasr, CS4100
Agents Robert Platt Northeastern University Some material used from: 1. Russell/Norvig, AIMA 2. Stacy Marsella, CS4100 3. Seif El-Nasr, CS4100 What is an Agent? Sense Agent Environment Act What is an Agent?
More informationIntroduction to Biological Anthropology: Notes 13 Mating: Primate females and males Copyright Bruce Owen 2008 As we have seen before, the bottom line
Introduction to Biological Anthropology: Notes 13 Mating: Primate females and males Copyright Bruce Owen 2008 As we have seen before, the bottom line in evolution is reproductive success reproductive success:
More informationAdaptive Treatment of Epilepsy via Batch Mode Reinforcement Learning
Adaptive Treatment of Epilepsy via Batch Mode Reinforcement Learning Arthur Guez, Robert D. Vincent and Joelle Pineau School of Computer Science, McGill University Massimo Avoli Montreal Neurological Institute
More informationConsider the following aspects of human intelligence: consciousness, memory, abstract reasoning
All life is nucleic acid. The rest is commentary. Isaac Asimov Consider the following aspects of human intelligence: consciousness, memory, abstract reasoning and emotion. Discuss the relative difficulty
More informationComputational Neuroscience. Instructor: Odelia Schwartz
Computational Neuroscience 2017 1 Instructor: Odelia Schwartz From the NIH web site: Committee report: Brain 2025: A Scientific Vision (from 2014) #1. Discovering diversity: Identify and provide experimental
More informationThe Psychology of Street Photography
The Psychology of Street Photography I think street photography is 90% psychology if you can master your own mental psychology, then you will be able to achieve your personal maximum in your photography
More informationCognitive and Computational Neuroscience of Aging
Cognitive and Computational Neuroscience of Aging Hecke Dynamically Selforganized MPI Brains Can Compute Nervously Goettingen CNS Seminar SS 2007 Outline 1 Introduction 2 Psychological Models 3 Physiological
More informationIntelligent Agents. BBM 405 Fundamentals of Artificial Intelligence Pinar Duygulu Hacettepe University. Slides are mostly adapted from AIMA
1 Intelligent Agents BBM 405 Fundamentals of Artificial Intelligence Pinar Duygulu Hacettepe University Slides are mostly adapted from AIMA Outline 2 Agents and environments Rationality PEAS (Performance
More informationLearning to Use Episodic Memory
Learning to Use Episodic Memory Nicholas A. Gorski (ngorski@umich.edu) John E. Laird (laird@umich.edu) Computer Science & Engineering, University of Michigan 2260 Hayward St., Ann Arbor, MI 48109 USA Abstract
More informationRefresh. The science of sleep for optimal performance and well being. Sleep and Exams: Strange Bedfellows
Refresh The science of sleep for optimal performance and well being Unit 7: Sleep and Exams: Strange Bedfellows Can you remember a night when you were trying and trying to get to sleep because you had
More informationLuke A. Rosedahl & F. Gregory Ashby. University of California, Santa Barbara, CA 93106
Luke A. Rosedahl & F. Gregory Ashby University of California, Santa Barbara, CA 93106 Provides good fits to lots of data Google Scholar returns 2,480 hits Over 130 videos online Own Wikipedia page Has
More informationFoundations of Artificial Intelligence
Foundations of Artificial Intelligence 2. Rational Agents Nature and Structure of Rational Agents and Their Environments Wolfram Burgard, Bernhard Nebel and Martin Riedmiller Albert-Ludwigs-Universität
More informationCS 331: Artificial Intelligence Intelligent Agents. Agent-Related Terms. Question du Jour. Rationality. General Properties of AI Systems
General Properties of AI Systems CS 331: Artificial Intelligence Intelligent Agents Sensors Reasoning Actuators Percepts Actions Environmen nt This part is called an agent. Agent: anything that perceives
More informationIntroduction to Mental Skills Training for Successful Athletes John Kontonis Assoc MAPS
Introduction to Mental Skills Training for Successful Athletes John Kontonis Assoc MAPS "You can always become better." - Tiger Woods Successful Athletes 1. Choose and maintain a positive attitude. 2.
More informationSecrets to the Body of Your Life in 2017
Secrets to the Body of Your Life in 2017 YOU CAN HAVE RESULTS OR EXCUSES NOT BOTH. INTRO TO THIS LESSON Welcome to Lesson #3 of your BarStarzz Calisthenics Workshop! For any new comers, make sure you watch
More informationBrain-Based Devices. Studying Cognitive Functions with Embodied Models of the Nervous System
Brain-Based Devices Studying Cognitive Functions with Embodied Models of the Nervous System Jeff Krichmar The Neurosciences Institute San Diego, California, USA http://www.nsi.edu/nomad develop theory
More informationLearning to act by predicting the future
Learning to act by predicting the future Alexey Dosovitskiy and Vladlen Koltun Intel Labs, Santa Clara ICLR 2017, Toulon, France Sensorimotor control Aim: produce useful motor actions based on sensory
More informationIntroduction to Biological Anthropology: Notes 13 Mating: Primate females and males Copyright Bruce Owen 2010 We want to understand the reasons
Introduction to Biological Anthropology: Notes 13 Mating: Primate females and males Copyright Bruce Owen 2010 We want to understand the reasons behind the lifestyles of our non-human primate relatives
More informationMarcus Hutter Canberra, ACT, 0200, Australia
Marcus Hutter Canberra, ACT, 0200, Australia http://www.hutter1.net/ Australian National University Abstract The approaches to Artificial Intelligence (AI) in the last century may be labelled as (a) trying
More informationStress is different for everyone While what happens in the brain and the body is the same for all of us, the precipitating factors are very
1 Stress is different for everyone While what happens in the brain and the body is the same for all of us, the precipitating factors are very individual. What one person experiences as stressful might
More informationPolicy Gradients. CS : Deep Reinforcement Learning Sergey Levine
Policy Gradients CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 1 milestone due today (11:59 pm)! Don t be late! 2. Remember to start forming final project groups Today s
More informationMITOCW watch?v=fmx5yj95sgq
MITOCW watch?v=fmx5yj95sgq The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To
More informationPART - A 1. Define Artificial Intelligence formulated by Haugeland. The exciting new effort to make computers think machines with minds in the full and literal sense. 2. Define Artificial Intelligence
More informationSolutions for Chapter 2 Intelligent Agents
Solutions for Chapter 2 Intelligent Agents 2.1 This question tests the student s understanding of environments, rational actions, and performance measures. Any sequential environment in which rewards may
More informationAgents. This course is about designing intelligent agents Agents and environments. Rationality. The vacuum-cleaner world
This course is about designing intelligent agents and environments Rationality The vacuum-cleaner world The concept of rational behavior. Environment types Agent types 1 An agent is an entity that perceives
More informationPositive Thinking Train Your Mind For Success And Happiness
Positive Thinking Train Your Mind For Success And Happiness Francisco Bujan - 1 - Contents Intro 12 Part 1 - Mind dynamics 13 What is self talk and how to make it work for you? 14 Mind sets 15 How inspiration
More informationIntroduction and Historical Background. August 22, 2007
1 Cognitive Bases of Behavior Introduction and Historical Background August 22, 2007 2 Cognitive Psychology Concerned with full range of psychological processes from sensation to knowledge representation
More informationContents. Foundations of Artificial Intelligence. Agents. Rational Agents
Contents Foundations of Artificial Intelligence 2. Rational s Nature and Structure of Rational s and Their s Wolfram Burgard, Bernhard Nebel, and Martin Riedmiller Albert-Ludwigs-Universität Freiburg May
More informationLecture 2 Agents & Environments (Chap. 2) Outline
Lecture 2 Agents & Environments (Chap. 2) Based on slides by UW CSE AI faculty, Dan Klein, Stuart Russell, Andrew Moore Outline Agents and environments Rationality PEAS specification Environment types
More information