Cost-Sensitive Learning for Biological Motion
1 Olivier Sigaud, Université Pierre et Marie Curie, Paris 6. October 5, / 42
2 Table of contents
The problem of movement time: statement of problem; a discounted reward approach; emergent time of motion (Lionel Rigoux's work)
Policy compilation (Jeremie Decock's work): general idea (supervised learning of planned trajectories); generalisation results; improving control between trials
Neural implementation of the model
Motor sequences: biological background; Kohonen maps; GRS model
2 / 42
3 The problem of movement time Statement of problem Variable time movements In the case of a force field, the time of motion changes 3 / 42
4 The problem of movement time Statement of problem Limitation of standard model The optimal feedback control (OFC) framework is the leading explanation for motor control. In standard OFC ([Todorov & Jordan, 2002]), the movement time is given:
V = C_f + \int_{t_0}^{t_f} (x^\top Q x + u^\top R u) \, dt
Motivation to reach the goal is represented as a cost for not being there; compromise between state cost and control cost. [Guigon et al., 2008]'s proposal: the movement is realized whatever the cost:
V = \int_{t_0}^{t_f} \|u\|^2 \, dt
Reaching the goal (at time t_f) is a constraint; the remaining movement depends on the remaining time. If, due to a force field, the hand drifts away, it must come back very fast. 4 / 42
5 The problem of movement time A discounted reward approach General idea Can the movement time emerge from the problem? Reaching a goal produces a reward (represented as a scalar). Irrespective of movement cost, we try to reach the goal as fast as possible: because after that we can look for another reward, and because in a dynamic world the source of reward may not stay there. Considering movement cost tends to favor slow motion. The movement time emerges as an equilibrium between these two contradictory pressures 5 / 42
6 The problem of movement time A discounted reward approach Emergence of movement time The subjective reward is maximum at minimal time 6 / 42
7 The problem of movement time A discounted reward approach Emergence of movement time The shorter the movement, the more expensive it is. There is a physical limitation 6 / 42
8 The problem of movement time A discounted reward approach Emergence of movement time A global optimum of the value emerges. Predictions: if the global value is negative, it's not worth it: no movement; if the reward increases, the movement time decreases; if the cost increases, the movement time increases 6 / 42
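The equilibrium on these slides can be sketched numerically. The global value of a movement of duration T is taken as a discounted reward minus an effort cost that grows as the movement gets shorter; the 1/T form of the effort term and all parameter values below are illustrative assumptions, not the model's actual cost. The sketch reproduces the three predictions (no movement when the value is negative, faster for larger reward, slower for larger cost):

```python
import numpy as np

def optimal_duration(reward, cost_scale, gamma=1.0,
                     T=np.linspace(0.05, 5.0, 2000)):
    """Global value of a movement of duration t: discounted reward minus
    an effort cost growing as the movement shortens (hypothetical 1/t form)."""
    value = reward * np.exp(-gamma * T) - cost_scale / T
    return T[np.argmax(value)], value.max()

t_small_r, _ = optimal_duration(reward=1.0, cost_scale=0.1)
t_large_r, _ = optimal_duration(reward=5.0, cost_scale=0.1)
t_large_c, _ = optimal_duration(reward=1.0, cost_scale=0.5)
_, v_max = optimal_duration(reward=0.05, cost_scale=0.5)

assert t_large_r < t_small_r   # higher reward -> shorter movement time
assert t_large_c > t_small_r   # higher cost   -> longer movement time
assert v_max < 0               # value always negative -> no movement
```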
9 The problem of movement time A discounted reward approach Temporary goal The model deals with the case of a reward available between t_1 and t_2. If t_1 is too small or t_2 too large, the subject won't move 7 / 42
10 The problem of movement time A discounted reward approach Removing t_f: intuition from Dynamic Programming The optimal value function V^*(s) represents the optimal cumulated utility one can get from s. Bellman equation:
V^\pi(s) = R(s, \pi(s)) + \gamma \sum_{s'} p(s' \mid s, \pi(s)) V^\pi(s')
The agent reaches the reward as fast as possible from anywhere. If we add immediate costs, we may get rid of γ. 8 / 42
11 The problem of movement time A discounted reward approach Removing t_f: the Dynamic Programming trick An infinite horizon is considered. In Dynamic Programming with infinite horizon, the global utility can be written:
V = E\left[\sum_{t=0}^{\infty} \gamma^t r_t\right], \quad \gamma \in \,]0, 1]
Thanks to the infinite-horizon trick, there is no need to specify a t_f. The γ^t factor favors reaching the goal quickly: whatever γ's value, it is always better to reach it faster. With γ small/large, rather choose an immediate/far-away goal. The time of remaining movement depends on the state, not on the elapsed time. 9 / 42
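A minimal value-iteration sketch, on a toy 1-D chain rather than the arm model, shows how the infinite-horizon discounted criterion makes the goal attractive from anywhere: the value decays geometrically with the distance to the goal, so greedy behavior takes the shortest path whatever γ's value.

```python
import numpy as np

def value_iteration(n_states=10, goal=9, gamma=0.9, n_iter=200):
    """Infinite-horizon value iteration on a 1-D chain: reward 1 on the
    transition into the goal state; gamma**t favors the shortest path."""
    V = np.zeros(n_states)
    for _ in range(n_iter):
        for s in range(n_states):
            if s == goal:
                continue  # absorbing goal, value stays 0
            # actions: step left or right (clipped at the ends of the chain)
            succ = [max(s - 1, 0), min(s + 1, n_states - 1)]
            V[s] = max(1.0 * (sp == goal) + gamma * V[sp] * (sp != goal)
                       for sp in succ)
    return V

V = value_iteration()
# value decays geometrically with distance d to the goal: V(s) = gamma**(d-1)
assert V[8] > V[5] > V[0]
assert abs(V[8] - 1.0) < 1e-9
assert abs(V[0] - 0.9**8) < 1e-6
```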
12 The problem of movement time Emergent time of motion: Lionel Rigoux's work General approach Lionel Rigoux uses the following discounted cost function:
J(x(t)) = \int_t^{\infty} e^{-\gamma_0 (s - t)} \left[ \alpha \|u(s)\|^2 - r(x(s)) \right] ds
with r(x) = \delta(x - x^*), where x^* is the target state. A deterministic variational calculus approach is used to find the optimal trajectory under this cost function; noise is added and the trajectory is replanned after every step. No learning is involved, but a simulator of the system is required; it could be learned as a forward model (not studied yet). [Shadmehr et al., 2010]: same idea. / 42
13 The problem of movement time Emergent time of motion: Lionel Rigoux's work Reproduction of force field experiments The global aspect of the trajectory matches 11 / 42
14 The problem of movement time Emergent time of motion : Lionel Rigoux' work Other phenomena Displacement of target during motion 12 / 42
15 The problem of movement time Emergent time of motion: Lionel Rigoux's work Discussion In Lionel Rigoux's work, the time of motion emerges and most motor control properties still hold. But:
1. Variational calculus is very expensive and considers very unlikely solutions.
2. Variational calculus is deterministic whereas motion is probably stochastic (inherent noise).
3. Planning must be performed each time, even in a well-known situation.
Solution to 1: task space to joint space constraints. Solution to 2 and 3: next section. 13 / 42
16 Policy compilation: Jeremie Decock's work General idea: supervised learning of planned trajectories Block diagram description of the approach [Block diagram: XCSF computes ũ = f(x, x^*) from the state x_t and the goal x^*; the control u plus noise drives the arm (model) with feedback gain K to the next state x_{t+1}.] XCSF learns associations between estimated state, goal and current action 14 / 42
17 Policy compilation: Jeremie Decock's work General idea: supervised learning of planned trajectories Block diagram description of the approach [Same block diagram, with XCSF now in the control loop.] The XCSF controller is trained with planned trajectories, then used instead of planning once the actions are learned 14 / 42
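The compilation idea above can be sketched as plain supervised regression on (state, goal, action) tuples. In this sketch a hypothetical proportional law stands in for the planner and ordinary least squares stands in for XCSF; neither is the actual system used in Decock's work.

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in "planner": for illustration, a proportional law u = K (x* - x)
K = 1.5
def planned_action(x, x_star):
    return K * (x_star - x)

# collect (state, goal) -> action pairs along planned trajectories
X = rng.uniform(-1, 1, size=(500, 2))        # columns: state x, goal x*
U = planned_action(X[:, 0], X[:, 1])

# supervised "compilation" of the policy (least squares stands in for XCSF)
A = np.column_stack([X, np.ones(len(X))])
w, *_ = np.linalg.lstsq(A, U, rcond=None)

# the compiled policy now answers new (state, goal) queries without planning
u_hat = np.array([0.2, 0.8, 1.0]) @ w
assert abs(u_hat - planned_action(0.2, 0.8)) < 1e-8
```

Once trained, querying the regressor is far cheaper than replanning, which is the point of the compilation step.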
18 Policy compilation: Jeremie Decock's work General idea: supervised learning of planned trajectories Experimental set-up: arm A planar arm with 2 degrees of freedom and gravity 15 / 42
19 Policy compilation : Jeremie Decock's work General idea : supervised learning of planned trajectories Experimental set-up : muscles The arm has 6 muscles 16 / 42
20 Policy compilation : Jeremie Decock's work Generalisation results Trajectories with planning 17 / 42
21 Policy compilation : Jeremie Decock's work Generalisation results Trajectories with generalization 18 / 42
22 Policy compilation : Jeremie Decock's work Generalisation results Corresponding trajectories 19 / 42
23 Policy compilation : Jeremie Decock's work Generalisation results Corresponding trajectories 19 / 42
24 Policy compilation : Jeremie Decock's work Generalisation results Planned trajectories on a larger training set 20 / 42
25 Policy compilation : Jeremie Decock's work Generalisation results Generalization capabilities Generalization to other goals 21 / 42
26 Policy compilation: Jeremie Decock's work Improving control between trials Limitations of the model Performance improves between trials; the level of performance improvement depends on the inter-trial time. Three ways to implement that process:
1. Perform planning experiments in the head of the agent (using a learned forward model); that is a model-based RL process ([Sutton, 1990]).
2. Improve the compiled policy with a model-free RL process.
3. Eventually, combine both (as suggested by [Daw et al., 2005]).
An internship on that in / 42
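Option 1 can be sketched with Dyna-Q-style replay in the spirit of [Sutton, 1990]: real experience trains both the value function and a transition model, and between trials the model is replayed to keep improving the policy. This is a toy chain task, not the arm set-up.

```python
import random
random.seed(0)

n_states, goal, gamma, alpha = 6, 5, 0.9, 0.5
Q = {(s, a): 0.0 for s in range(n_states) for a in (-1, +1)}
model = {}                                   # learned deterministic model

def step(s, a):
    s2 = min(max(s + a, 0), n_states - 1)
    return s2, float(s2 == goal)

def q_update(s, a, r, s2):
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in (-1, 1))
                          - Q[(s, a)])

for episode in range(20):
    s = 0
    while s != goal:                         # real experience
        a = random.choice((-1, 1))
        s2, r = step(s, a)
        q_update(s, a, r, s2)
        model[(s, a)] = (s2, r)
        s = s2
    # inter-trial replay: planning "in the head" with the learned model
    for _ in range(50):
        (ms, ma), (ms2, mr) = random.choice(list(model.items()))
        q_update(ms, ma, mr, ms2)

# the greedy policy now moves toward the goal from every state
assert all(Q[(s, 1)] > Q[(s, -1)] for s in range(goal))
```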
27 Neural implementation of the model Relevant neural properties Implementation of the temporal difference algorithm in striatal dopaminergic neurons (basal ganglia) ([Schultz et al., 1997]). The basal ganglia are believed to implement an actor-critic architecture (see [Joel et al., 2002]) 23 / 42
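A minimal actor-critic sketch makes the division of labor concrete: the critic's TD error, playing the role of the putative dopaminergic signal, both trains the value estimate and shifts the actor's action preferences. A toy two-armed bandit stands in for the task; the reward probabilities and learning rates are arbitrary assumptions.

```python
import math, random
random.seed(1)

p_reward = {0: 0.2, 1: 0.8}   # hypothetical payoff probabilities
prefs = [0.0, 0.0]            # actor: action preferences
V = 0.0                       # critic: value of the (single) state
alpha_v, alpha_p = 0.1, 0.1

def softmax_sample(prefs):
    z = [math.exp(p) for p in prefs]
    r, total = random.random() * sum(z), 0.0
    for a, w in enumerate(z):
        total += w
        if r <= total:
            return a
    return len(prefs) - 1

for _ in range(2000):
    a = softmax_sample(prefs)
    r = float(random.random() < p_reward[a])
    delta = r - V                 # TD error (terminal step: no bootstrap)
    V += alpha_v * delta          # critic update
    prefs[a] += alpha_p * delta   # actor update, gated by the same signal

assert prefs[1] > prefs[0]        # actor learns to prefer the richer arm
assert 0.2 < V <= 1.0             # critic tracks the obtained reward rate
```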
28 Neural implementation of the model Model-based architecture [Daw et al., 2005] (within an action selection context): provides a cue on when to perform reoptimization (on-line replanning), based on outcome uncertainty. Integration of a Bayesian inference view into this architecture? 24 / 42
29 Motor sequences Biological background Graziano et al. (1) [Graziano et al., 2005]: exciting specific neurons in the intraparietal sulcus results in specific, ecologically relevant postures 25 / 42
30 Motor sequences Biological background Graziano et al. (2) The intraparietal sulcus area is organized with gradients: somatotopic map of the effector, ecological meaning, target of the effector 26 / 42
31 Motor sequences Kohonen maps Kohonen maps model [Aflalo & Graziano, 2006b]: abstract encoding of the dimensions. Shows that the gradients emerge (no manikin simulation) 27 / 42
32 Motor sequences GRS Model Goal of the study [Gabalda et al., 2007]: reproduce [Aflalo & Graziano, 2006a], showing that ecological postures can emerge from the interaction with the environment 28 / 42
33 Motor sequences GRS Model Example of sequence To each gesture corresponds a rewarded area 29 / 42
34 Motor sequences GRS Model Kohonen map initialization The Kohonen map codes for motor goals. Trained on 2 million random postures. As a result, a few cells code for rewarded postures 30 / 42
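The map training can be sketched with a small 1-D Kohonen map on random two-joint "postures" (toy sizes and schedules; the actual model's dimensions and 2-million-sample training are not reproduced). After training, neighboring cells code for similar postures, which is the topology preservation the model relies on.

```python
import numpy as np
rng = np.random.default_rng(0)

n_cells, dim, n_steps = 20, 2, 5000
W = rng.uniform(0, 1, size=(n_cells, dim))       # cell prototypes
coords = np.arange(n_cells)

for t in range(n_steps):
    x = rng.uniform(0, 1, size=dim)              # a random posture
    bmu = np.argmin(np.linalg.norm(W - x, axis=1))   # best-matching cell
    lr = 0.5 * (1 - t / n_steps)                 # decaying learning rate
    sigma = 3.0 * (1 - t / n_steps) + 0.5        # shrinking neighborhood
    h = np.exp(-((coords - bmu) ** 2) / (2 * sigma**2))
    W += lr * h[:, None] * (x - W)               # neighborhood update

# topology check: adjacent cells end up closer than distant ones, on average
adj = np.linalg.norm(W[1:] - W[:-1], axis=1).mean()
far = np.linalg.norm(W[5:] - W[:-5], axis=1).mean()
assert adj < far
```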
35 Motor sequences GRS Model Links from contexts to goals The sequence contains four contexts. The algorithm must associate the correct goal with each context, among 384 potential goals 31 / 42
36 Motor sequences GRS Model Goal selection Choosing a goal in a context is an action in that context. The active goal is the one whose link to the context is the strongest 32 / 42
37 Motor sequences GRS Model Posture for target reaching A task space goal corresponds to a joint configuration 33 / 42
38 Motor sequences GRS Model Moving toward the goal Low-level control drives the manikin towards its goal 34 / 42
39 Motor sequences GRS Model Reinforcing the goal When the goal is reached, a reward is received and the link to the cell coding for the current posture is strengthened 35 / 42
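The context-goal linkage of the last few slides (strongest link selects the active goal; reward strengthens the link) can be sketched as a table of link strengths. Toy sizes and a hypothetical task structure are used, and ε-greedy exploration is added for the sketch so that untried goals get sampled:

```python
import random
random.seed(0)

n_contexts, n_goals = 4, 8      # the model itself uses 4 contexts, 384 goals
links = [[0.0] * n_goals for _ in range(n_contexts)]
correct_goal = {0: 3, 1: 5, 2: 1, 3: 6}   # hypothetical task structure
epsilon = 0.2

def select_goal(c):
    # the active goal is the one with the strongest link (plus exploration)
    if random.random() < epsilon:
        return random.randrange(n_goals)
    row = links[c]
    return row.index(max(row))

for trial in range(2000):
    c = random.randrange(n_contexts)
    g = select_goal(c)
    if g == correct_goal[c]:
        links[c][g] += 0.1      # reaching the goal strengthens its link

# each context now selects its correct goal greedily
assert all(links[c].index(max(links[c])) == correct_goal[c]
           for c in range(n_contexts))
```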
40 Motor sequences GRS Model Learning the map When a reward is received, the current cell is trained: it extends its domain 36 / 42
41 Motor sequences GRS Model Learnt map One can see the emergence of zones coding for relevant postures 37 / 42
42 Motor sequences GRS Model Global view We get a hierarchical architecture. Linking contexts and goals is an RL problem 38 / 42
43 Motor sequences GRS Model Other topics Motor synergies : used to reduce the size of the optimisation problem Motor primitives : repertoire of ready-to-use simple controllers / 42
44 Motor sequences GRS Model Final discussion RL tools are used at the action selection level because RL theory was about discrete choices. By contrast, optimal control (OC) tools are used at the motor control level, which is considered continuous. Actor-critic (AC) methods provide a potential unification (continuous RL methods), and thus the possibility of considering a unique neural substrate (taking the multiple BG loops into account) 40 / 42
45 Motor sequences GRS Model Messages Much progress in maths since the naive early proposals (NAC, eNAC and iNAC). That progress did not propagate to biological modelling. Actor-critic methods are efficient for motor control modelling; incremental versions might be biologically plausible. Evolution towards Bayesian inference models. Model-based actor-critic architectures might integrate cost-sensitive learning and planning 41 / 42
46 Motor sequences GRS Model Any questions? 42 / 42
47 Motor sequences GRS Model References
Aflalo, T. N. & Graziano, M. S. A. (2006a). Possible origins of the complex topographic organization of motor cortex: reduction of a multidimensional space onto a two-dimensional array. Journal of Neuroscience, 26(23).
Aflalo, T. N. & Graziano, M. S. A. (2006b). Relationship between unconstrained arm movements and single neuron firing in the macaque motor cortex. Journal of Neuroscience, 27(11).
Daw, N., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioural control. Nature Neuroscience, 8.
Gabalda, B., Rigoux, L., & Sigaud, O. (2007). Learning postures through sensorimotor training: a human simulation case study. In Proceedings of the Seventh International Conference on Epigenetic Robotics.
Graziano, M. S., Tyson, N. S., & Cooke, D. F. (2005). Arm movements evoked by electrical stimulation in the motor cortex of monkeys. Journal of Neurophysiology, 94.
Guigon, E., Baraduc, P., & Desmurget, M. (2008). Optimality, stochasticity and variability in motor behavior. Journal of Computational Neuroscience, 24(1):57-68.
Joel, D., Niv, Y., & Ruppin, E. (2002). Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural Networks, 15(4-6).
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate for prediction and reward. Science, 275. / 42
48 Motor sequences GRS Model
Shadmehr, R., Orban de Xivry, J.-J., Xu-Wilson, M., & Shih, T.-Y. (2010). Temporal discounting of reward and the cost of time in motor control. Journal of Neuroscience, 30(31).
Sutton, R. S. (1990). Integrating architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning (ICML'90), San Mateo, CA: Morgan Kaufmann.
Todorov, E. & Jordan, M. I. (2002). Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5(11). / 42
Movement Education and Motor Learning Where Ortho and Neuro Rehab Collide Roderick Henderson, PT, ScD, OCS Wendy Herbert, PT, PhD Janna McGaugh, PT, ScD, COMT Jill Seale, PT, PhD, NCS Objectives 1. Identify
More informationA Model of Reward- and Effort-Based Optimal Decision Making and Motor Control
A Model of Reward- and Effort-Based Optimal Decision Making and Motor Control Lionel Rigoux 1,2, Emmanuel Guigon 1,2 * 1 UPMC Univ Paris 06, UMR 7222, ISIR, Paris, France, 2 CNRS, UMR 7222, ISIR, Paris,
More informationArteSImit: Artefact Structural Learning through Imitation
ArteSImit: Artefact Structural Learning through Imitation (TU München, U Parma, U Tübingen, U Minho, KU Nijmegen) Goals Methodology Intermediate goals achieved so far Motivation Living artefacts will critically
More informationCell Responses in V4 Sparse Distributed Representation
Part 4B: Real Neurons Functions of Layers Input layer 4 from sensation or other areas 3. Neocortical Dynamics Hidden layers 2 & 3 Output layers 5 & 6 to motor systems or other areas 1 2 Hierarchical Categorical
More informationNoise Cancellation using Adaptive Filters Algorithms
Noise Cancellation using Adaptive Filters Algorithms Suman, Poonam Beniwal Department of ECE, OITM, Hisar, bhariasuman13@gmail.com Abstract Active Noise Control (ANC) involves an electro acoustic or electromechanical
More informationEXPLORATION FLOW 4/18/10
EXPLORATION Peter Bossaerts CNS 102b FLOW Canonical exploration problem: bandits Bayesian optimal exploration: The Gittins index Undirected exploration: e-greedy and softmax (logit) The economists and
More informationOrganizing Behavior into Temporal and Spatial Neighborhoods
Organizing Behavior into Temporal and Spatial Neighborhoods Mark Ring IDSIA / University of Lugano / SUPSI Galleria 6928 Manno-Lugano, Switzerland Email: mark@idsia.ch Tom Schaul Courant Institute of Mathematical
More informationREINFORCEMENT LEARNING OF DIMENSIONAL ATTENTION FOR CATEGORIZATION JOSHUA L. PHILLIPS
COMPUTER SCIENCE REINFORCEMENT LEARNING OF DIMENSIONAL ATTENTION FOR CATEGORIZATION JOSHUA L. PHILLIPS Thesis under the direction of Professor David C. Noelle The ability to selectively focus attention
More informationNeural Cognitive Modelling: A Biologically Constrained Spiking Neuron Model of the Tower of Hanoi Task
Neural Cognitive Modelling: A Biologically Constrained Spiking Neuron Model of the Tower of Hanoi Task Terrence C. Stewart (tcstewar@uwaterloo.ca) Chris Eliasmith (celiasmith@uwaterloo.ca) Centre for Theoretical
More informationRepresentation 1. Discussion Question. Roskies: Downplaying the Similarities of Neuroimages to Photographs
Representation 1 Discussion Question In what way are photographs more reliable than paintings? A. They aren t B. The image in the photograph directly reflects what is before the lens C. The image in the
More informationarxiv: v1 [cs.lg] 29 Jun 2016
Actor-critic versus direct policy search: a comparison based on sample complexity Arnaud de Froissard de Broissia, Olivier Sigaud Sorbonne Universités, UPMC Univ Paris 06, UMR 7222, F-75005 Paris, France
More informationNeural Cognitive Modelling: A Biologically Constrained Spiking Neuron Model of the Tower of Hanoi Task
Neural Cognitive Modelling: A Biologically Constrained Spiking Neuron Model of the Tower of Hanoi Task Terrence C. Stewart (tcstewar@uwaterloo.ca) Chris Eliasmith (celiasmith@uwaterloo.ca) Centre for Theoretical
More informationDegree of freedom problem
KINE 4500 Neural Control of Movement Lecture #1:Introduction to the Neural Control of Movement Neural control of movement Kinesiology: study of movement Here we re looking at the control system, and what
More informationUsing Heuristic Models to Understand Human and Optimal Decision-Making on Bandit Problems
Using Heuristic Models to Understand Human and Optimal Decision-Making on andit Problems Michael D. Lee (mdlee@uci.edu) Shunan Zhang (szhang@uci.edu) Miles Munro (mmunro@uci.edu) Mark Steyvers (msteyver@uci.edu)
More informationKINE 4500 Neural Control of Movement. Lecture #1:Introduction to the Neural Control of Movement. Neural control of movement
KINE 4500 Neural Control of Movement Lecture #1:Introduction to the Neural Control of Movement Neural control of movement Kinesiology: study of movement Here we re looking at the control system, and what
More informationAn exploration of the predictors of instruction following in an academic environment
MSc Research Projects Completed by students on the MSc Psychology, MSc Brain Imaging & Cognitive Neuroscience and MSc Computational Neuroscience & Cognitive Robotics 2016-2017 An exploration of the predictors
More informationHuman Paleoneurology and the Evolution of the Parietal Cortex
PARIETAL LOBE The Parietal Lobes develop at about the age of 5 years. They function to give the individual perspective and to help them understand space, touch, and volume. The location of the parietal
More informationIntrinsic Motivation Systems for Autonomous Mental Development
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION 1 Intrinsic Motivation Systems for Autonomous Mental Development Pierre-Yves Oudeyer, Frédéric Kaplan, Verena V. Hafner Sony Computer Science Lab, Paris 6
More informationModels of Imitation and Mirror Neuron Activity. COGS171 FALL Quarter 2011 J. A. Pineda
Models of Imitation and Mirror Neuron Activity COGS171 FALL Quarter 2011 J. A. Pineda Basis for Models Since a majority of mirror neurons have been found in motor areas (IFG and IPL), it is reasonable
More informationDo Reinforcement Learning Models Explain Neural Learning?
Do Reinforcement Learning Models Explain Neural Learning? Svenja Stark Fachbereich 20 - Informatik TU Darmstadt svenja.stark@stud.tu-darmstadt.de Abstract Because the functionality of our brains is still
More informationHST 583 fmri DATA ANALYSIS AND ACQUISITION
HST 583 fmri DATA ANALYSIS AND ACQUISITION Neural Signal Processing for Functional Neuroimaging Neuroscience Statistics Research Laboratory Massachusetts General Hospital Harvard Medical School/MIT Division
More informationPractical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012
Practical Bayesian Optimization of Machine Learning Algorithms Jasper Snoek, Ryan Adams, Hugo LaRochelle NIPS 2012 ... (Gaussian Processes) are inadequate for doing speech and vision. I still think they're
More informationLEAH KRUBITZER RESEARCH GROUP LAB PUBLICATIONS WHAT WE DO LINKS CONTACTS
LEAH KRUBITZER RESEARCH GROUP LAB PUBLICATIONS WHAT WE DO LINKS CONTACTS WHAT WE DO Present studies and future directions Our laboratory is currently involved in two major areas of research. The first
More informationA Biased View of Perceivers. Commentary on `Observer theory, Bayes theory,
A Biased View of Perceivers Commentary on `Observer theory, Bayes theory, and psychophysics,' by B. Bennett, et al. Allan D. Jepson University oftoronto Jacob Feldman Rutgers University March 14, 1995
More informationCOMP150 Behavior-Based Robotics
For class use only, do not distribute COMP150 Behavior-Based Robotics http://www.cs.tufts.edu/comp/150bbr/timetable.html http://www.cs.tufts.edu/comp/150bbr/syllabus.html Project directions and topics
More information