Cost-Sensitive Learning for Biological Motion

1 Olivier Sigaud
Université Pierre et Marie Curie, Paris 6
October 5

2 Table of contents
The problem of movement time
  Statement of the problem
  A discounted reward approach
  Emergent time of motion: Lionel Rigoux's work
Policy compilation: Jeremie Decock's work
  General idea: supervised learning of planned trajectories
  Generalisation results
  Improving control between trials
Neural implementation of the model
Motor sequences
  Biological background
  Kohonen maps
  GRS model

3 The problem of movement time: Statement of the problem
Variable time movements
In the case of a force field, the time of motion changes.

4 The problem of movement time: Statement of the problem
Limitation of the standard model
The optimal feedback control (OFC) framework is the leading explanation for motor control.
In standard OFC ([Todorov & Jordan, 2002]), the movement time t_f is given:
V = C_f + \int_{t_0}^{t_f} (x^T Q x + u^T R u) \, dt
The motivation to reach the goal is represented as a cost for not being there: a compromise between state cost and control cost.
[Guigon et al., 2008]'s proposal: the movement is realized whatever the cost:
V = \int_{t_0}^{t_f} \|u\|^2 \, dt
Reaching the goal (at time t_f) is a constraint.
The remaining movement depends on the remaining time.
If, due to a force field, the hand drifts away, it must come back very fast.

5 The problem of movement time: A discounted reward approach
General idea
Can the movement time emerge from the problem?
Reaching a goal produces a reward (represented as a scalar).
Irrespective of movement cost, we try to reach the goal as fast as possible:
  because after that, we can look for another reward;
  because in a dynamic world the source of reward may not stay there.
Considering movement cost tends to favor slow motion.
The movement time emerges as an equilibrium between these two contradictory pressures.

6 The problem of movement time: A discounted reward approach
Emergence of movement time
The subjective reward is maximum at minimal time.

7 The problem of movement time: A discounted reward approach
Emergence of movement time
The shorter the movement, the more expensive it is.
There is a physical limitation.

8 The problem of movement time: A discounted reward approach
Emergence of movement time
A global optimum of the value emerges.
Predictions:
  If the global value is negative, it's not worth it: no movement.
  If the reward increases, the movement time decreases.
  If the cost increases, the movement time increases.
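A minimal numeric sketch of this equilibrium. The exponentially discounted reward R e^{-γT} and the effort cost c/T for a movement of duration T are illustrative assumptions, not the slides' exact model:

```python
import numpy as np

# Sketch only: reward R*exp(-gamma*T) and effort cost c/T are
# illustrative stand-ins for the model's actual cost of movement.
def optimal_duration(R, c, gamma=1.0):
    T = np.linspace(0.05, 5.0, 1000)           # candidate durations (s)
    value = R * np.exp(-gamma * T) - c / T     # global value of the movement
    if value.max() < 0:
        return None                            # not worth it: no movement
    return T[np.argmax(value)]

print(optimal_duration(R=10.0, c=1.0))   # baseline optimal duration
print(optimal_duration(R=20.0, c=1.0))   # larger reward -> shorter movement
print(optimal_duration(R=10.0, c=2.0))   # larger cost -> longer movement
print(optimal_duration(R=0.1,  c=1.0))   # value negative everywhere -> None
```

The three predictions above fall out of the scan: the argmax shifts earlier with larger R, later with larger c, and disappears when the value never becomes positive.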

9 The problem of movement time: A discounted reward approach
Temporary goal
The model deals with the case of a reward available between t_1 and t_2.
If t_1 is too small or t_2 is too large, the subject won't move.

10 The problem of movement time: A discounted reward approach
Removing t_f: intuition from Dynamic Programming
Optimal value function V*(s): represents the optimal cumulated utility one can get from s.
Bellman equation: V^\pi(s) = R(s, \pi(s)) + \gamma \sum_{s'} p(s' | s, \pi(s)) V^\pi(s')
The agent reaches the reward as fast as possible from anywhere.
If we add immediate costs, we may get rid of \gamma.

11 The problem of movement time: A discounted reward approach
Removing t_f: the Dynamic Programming trick
An infinite horizon is considered.
In Dynamic Programming with an infinite horizon, the global utility can be written:
V = E[\sum_{t=0}^{\infty} \gamma^t r_t], with \gamma \in \, ]0, 1]
Thanks to the infinite horizon trick, there is no need to specify a t_f.
\gamma^t favors reaching the goal quickly: whatever \gamma's value, it is always better to reach it faster.
A small/large \gamma favors an immediate/far-away goal.
The time of remaining movement depends on the state, not on the elapsed time.
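A small value-iteration sketch of this trick (the 1-D chain, rewards and γ are illustrative assumptions). No t_f appears anywhere in the computation; the value, and hence the time-to-go, depends on the state alone:

```python
import numpy as np

# Value iteration on a 1-D chain: reward 1 at the goal, small step cost,
# infinite horizon with discount gamma.
n_states, goal, gamma, step_cost = 20, 10, 0.95, 0.01
V = np.zeros(n_states)
for _ in range(500):
    for s in range(n_states):
        if s == goal:
            V[s] = 1.0                                  # reward at the goal
        else:
            moves = (max(s - 1, 0), min(s + 1, n_states - 1))
            V[s] = max(-step_cost + gamma * V[s2] for s2 in moves)

# V decreases with distance to the goal, so the greedy policy heads
# for the goal as fast as possible from anywhere.
print(np.round(V, 3))
```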

12 The problem of movement time: Emergent time of motion (Lionel Rigoux's work)
General approach
Lionel Rigoux uses the following discounted cost function:
J(x(t)) = \int_t^{\infty} e^{-\gamma_0 (s - t)} [\alpha \|u(s)\|^2 - r(x(s))] \, ds
with r(x) = \delta(x - x^*), where x^* is the target state.
A deterministic calculus-of-variations approach is used to find the optimal trajectory under this cost function.
Add noise and replan after every step.
No learning is involved, but a simulator of the system is required; it could be learned as a forward model (not studied yet).
[Shadmehr et al., 2010]: same idea.
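The plan/act/replan loop this describes, as a sketch. The `plan` function below is a hypothetical placeholder for the variational optimizer minimizing J; the dynamics, gains and noise level are illustrative:

```python
import numpy as np

def plan(x, x_star, horizon=10):
    # Hypothetical stand-in for the calculus-of-variations planner:
    # returns an open-loop control sequence from the current state.
    return np.tile(0.3 * (x_star - x), (horizon, 1))

def step(x, u, noise_std=1e-3):
    # Toy forward model with execution noise (the real system would be
    # the arm simulator mentioned above).
    return x + u + noise_std * np.random.randn(*x.shape)

x, x_star = np.zeros(2), np.array([1.0, 0.5])
for t in range(100):
    u_seq = plan(x, x_star)        # deterministic optimal plan from x
    x = step(x, u_seq[0])          # execute only the first control
    if np.linalg.norm(x - x_star) < 1e-2:
        print(f"goal reached at step {t}")  # arrival time emerges
        break
```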

13 The problem of movement time: Emergent time of motion (Lionel Rigoux's work)
Reproduction of force field experiments
The global aspect of the trajectory matches.

14 The problem of movement time: Emergent time of motion (Lionel Rigoux's work)
Other phenomena
Displacement of the target during motion.

15 The problem of movement time: Emergent time of motion (Lionel Rigoux's work)
Discussion
In Lionel Rigoux's work, the time of motion emerges and most motor control properties still hold.
But:
1. Calculus of variations is very expensive and considers very unlikely solutions.
2. Calculus of variations is deterministic, whereas motion is probably stochastic (inherent noise).
3. Planning must be performed each time, even in a well-known situation.
Solution to 1: task space to joint space constraints.
Solution to 2 and 3: next section.

16 Policy compilation (Jeremie Decock's work): General idea, supervised learning of planned trajectories
Block diagram description of the approach
[Block diagram: the planner K maps the state x_t and the goal x^* to a control u; noise is added; the arm (model) produces x_{t+1}; XCSF learns the mapping ũ = f(x, x^*).]
XCSF learns associations between estimated state, goal and current action.

17 Policy compilation (Jeremie Decock's work): General idea, supervised learning of planned trajectories
Block diagram description of the approach
[Same block diagram: the trained XCSF controller now produces ũ in place of the planner K.]
The XCSF controller is trained with planned trajectories.
The XCSF controller is used instead of planning once the actions are learned.
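A self-contained sketch of policy compilation. XCSF is the approximator in the actual work; a plain least-squares regressor stands in here, and the "planner" generating the training pairs is a hypothetical linear rule:

```python
import numpy as np

rng = np.random.default_rng(0)
# Training set harvested from "planned trajectories": rows are
# [state (2 dims), goal (2 dims)], targets are the planner's controls.
X = rng.uniform(-1.0, 1.0, (5000, 4))
U = 0.3 * (X[:, 2:] - X[:, :2])               # hypothetical planner output
W, *_ = np.linalg.lstsq(X, U, rcond=None)     # compiled policy parameters

def compiled_policy(x, x_star):
    # Used instead of planning once trained: maps (state, goal) to action.
    return np.concatenate([x, x_star]) @ W

print(compiled_policy(np.zeros(2), np.array([0.5, -0.2])))
```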

18 Policy compilation (Jeremie Decock's work): General idea, supervised learning of planned trajectories
Experimental set-up: arm
A 2-dof planar arm with gravity.

19 Policy compilation (Jeremie Decock's work): General idea, supervised learning of planned trajectories
Experimental set-up: muscles
The arm has 6 muscles.

20 Policy compilation (Jeremie Decock's work): Generalisation results
Trajectories with planning

21 Policy compilation (Jeremie Decock's work): Generalisation results
Trajectories with generalization

22 Policy compilation (Jeremie Decock's work): Generalisation results
Corresponding trajectories

24 Policy compilation (Jeremie Decock's work): Generalisation results
Planned trajectories on a larger training set

25 Policy compilation (Jeremie Decock's work): Generalisation results
Generalization capabilities
Generalization to other goals.

26 Policy compilation (Jeremie Decock's work): Improving control between trials
Limitations of the model
Performance improves between trials; the level of improvement depends on the inter-trial time.
Three ways to implement that process (a sketch of the first follows):
1. Perform planning experiments in the head of the agent (using a learned forward model); that is a model-based RL process ([Sutton, 1990]).
2. Improve the compiled policy with a model-free RL process.
3. Eventually, combine both (as suggested by [Daw et al., 2005]).
An internship on that in...
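A Dyna-style sketch of option 1 ([Sutton, 1990]). The tabular representation, action set and update rule are generic RL illustrations, not the specifics of this model:

```python
import random

Q, model = {}, {}     # model[(s, a)] = (r, s_next), filled during real trials
ACTIONS = (0, 1)

def q(s, a):
    return Q.get((s, a), 0.0)

def td_update(s, a, r, s2, alpha=0.1, gamma=0.95):
    target = r + gamma * max(q(s2, b) for b in ACTIONS)
    Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))

def between_trials(n_replays):
    # "Planning in the head": replay learned transitions from the model.
    # More inter-trial time allows more replays, hence more improvement.
    for _ in range(n_replays):
        (s, a), (r, s2) = random.choice(list(model.items()))
        td_update(s, a, r, s2)

model[(0, 1)] = (1.0, 1)   # example transition recorded during a real trial
between_trials(100)
```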

27 Neural implementation of the model
Relevant neural properties
Implementation of the temporal difference algorithm in the striatal dopaminergic neurons (basal ganglia) ([Schultz et al., 1997]).
The basal ganglia are believed to implement an actor-critic architecture (see [Joel et al., 2002]).
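The corresponding textbook actor-critic update, with the TD error δ playing the role ascribed to the dopaminergic signal. The tabular form, sizes and learning rates are illustrative:

```python
import numpy as np

n_states, n_actions = 10, 4
V = np.zeros(n_states)                    # critic (value function)
theta = np.zeros((n_states, n_actions))   # actor (action preferences)

def actor_critic_step(s, a, r, s2, alpha_v=0.1, alpha_p=0.05, gamma=0.95):
    delta = r + gamma * V[s2] - V[s]   # TD error ([Schultz et al., 1997])
    V[s] += alpha_v * delta            # critic update
    theta[s, a] += alpha_p * delta     # actor: reinforce the chosen action
    return delta
```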

28 Neural implementation of the model
Model-based architecture
[Daw et al., 2005] (within an action selection context): provides a cue on when to perform reoptimization (on-line replanning), based on outcome uncertainty.
Integration of a Bayesian inference view in this architecture?

29 Motor sequences: Biological background
Graziano et al. (1)
[Graziano et al., 2005]: exciting specific neurons in the intraparietal sulcus results in specific, ecologically relevant postures.

30 Motor sequences: Biological background
Graziano et al. (2)
The intraparietal sulcus areas are organized with gradients: somatotopic map of the effector, ecological meaning, target of the effector.

31 Motor sequences: Kohonen maps
Kohonen maps model
[Aflalo & Graziano, 2006b]: abstract encoding of the dimensions.
Shows that the gradients emerge (no manikin simulation).

32 Motor sequences: GRS Model
Goal of the study
[Gabalda et al., 2007]: reproduce [Aflalo & Graziano, 2006a], showing that ecological postures can emerge from the interaction with the environment.

33 Motor sequences: GRS Model
Example of sequence
To each gesture corresponds a rewarded area.

34 Motor sequences: GRS Model
Kohonen map initialization
The Kohonen map codes for motor goals.
It is trained on 2 million random postures.
As a result, a few cells code for rewarded postures.
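A minimal Kohonen-map sketch of that initialization. Dimensions, learning rate and sample count are illustrative, and the neighbourhood updates a full SOM would include are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, dim = 384, 4                        # 384 potential goals (next slide)
W = rng.uniform(-1.0, 1.0, (n_units, dim))   # unit prototypes (postures)

for _ in range(20_000):                      # the model uses 2 million samples
    x = rng.uniform(-1.0, 1.0, dim)          # random posture
    winner = np.argmin(np.linalg.norm(W - x, axis=1))
    W[winner] += 0.05 * (x - W[winner])      # pull the winner toward the sample
```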

35 Motor sequences: GRS Model
Links from contexts to goals
The sequence contains four contexts.
The algorithm must associate the correct goals to contexts among 384 potential goals.

36 Motor sequences: GRS Model
Goal selection
Choosing a goal in a context is an action in that context.
The active goal is the one whose link to the context is the strongest.

37 Motor sequences: GRS Model
Posture for target reaching
A task space goal corresponds to a joint configuration.

38 Motor sequences: GRS Model
Moving towards the goal
Low-level control drives the manikin towards its goal.

39 Motor sequences: GRS Model
Reinforcing the goal
When the goal is reached, a reward is received and the link to the cell coding for the current posture is strengthened.
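Goal selection (previous slides) and reward-driven strengthening together, as a sketch. The array sizes come from the slides; the update rule itself is an assumption:

```python
import numpy as np

n_contexts, n_goals = 4, 384
links = np.zeros((n_contexts, n_goals))      # context -> goal link strengths

def select_goal(context):
    # The active goal is the one with the strongest link to the context.
    return int(np.argmax(links[context]))

def reinforce(context, goal, reward, lr=0.1):
    # On success, strengthen the link to the cell coding the reached posture.
    links[context, goal] += lr * reward
```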

40 Motor sequences: GRS Model
Learning the map
When a reward is received, the current cell is trained: it extends its domain.

41 Motor sequences: GRS Model
Learnt map
One can see the emergence of zones coding for relevant postures.

42 Motor sequences: GRS Model
Global view
We get a hierarchical architecture.
Linking contexts and goals is an RL problem.

43 Motor sequences: GRS Model
Other topics
Motor synergies: used to reduce the size of the optimisation problem.
Motor primitives: a repertoire of ready-to-use simple controllers.

44 Motor sequences: GRS Model
Final discussion
RL tools are used at the action selection level because RL theory was about discrete choices.
By contrast, OC tools are used at the motor control level, which is considered continuous.
Actor-critic (AC) methods provide a potential unification (continuous RL methods), and thus the possibility of considering a unique neural substrate (taking the multiple BG loops into account).

45 Motor sequences: GRS Model
Messages
Much progress in maths since the naive early proposals (NAC, eNAC and iNAC).
That progress did not propagate to biological modelling.
Actor-critic methods are efficient for motor control modelling.
Incremental versions might be biologically plausible.
Evolution towards Bayesian inference models.
Model-based actor-critic architectures might integrate cost-sensitive learning and planning.

46 Motor sequences: GRS Model
Any questions?

47 References
Aflalo, T. N. & Graziano, M. S. A. (2006a). Possible origins of the complex topographic organization of motor cortex: reduction of a multidimensional space onto a two-dimensional array. Journal of Neuroscience, 26(23).
Aflalo, T. N. & Graziano, M. S. A. (2006b). Relationship between unconstrained arm movements and single neuron firing in the macaque motor cortex. Journal of Neuroscience, 27(11).
Daw, N., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioural control. Nature Neuroscience, 8.
Gabalda, B., Rigoux, L., & Sigaud, O. (2007). Learning postures through sensorimotor training: a human simulation case study. In Proceedings of the Seventh International Conference on Epigenetic Robotics.
Graziano, M. S., Aflalo, T. N. S., & Cooke, D. F. (2005). Arm movements evoked by electrical stimulation in the motor cortex of monkeys. Journal of Neurophysiology, 94.
Guigon, E., Baraduc, P., & Desmurget, M. (2008). Optimality, stochasticity and variability in motor behavior. Journal of Computational Neuroscience, 24(1):57-68.
Joel, D., Niv, Y., & Ruppin, E. (2002). Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural Networks, 15(4-6).
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate for prediction and reward. Science, 275.

48 References (continued)
Shadmehr, R., Orban de Xivry, J.-J., Xu-Wilson, M., & Shih, T.-Y. (2010). Temporal discounting of reward and the cost of time in motor control. Journal of Neuroscience, 30(31).
Sutton, R. S. (1990). Integrating architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning (ICML'90), San Mateo, CA. Morgan Kaufmann.
Todorov, E. & Jordan, M. I. (2002). Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5(11).
