A Simulation of Sutton and Barto's Temporal Difference Conditioning Model

Nick Schmansky
Department of Cognitive and Neural Systems, Boston University
May

Abstract

A simulation of the Sutton and Barto [2] model of classical conditioning is shown to exhibit known timing effects of that conditioning paradigm. The effects include an accurate modelling of the time course of the shift of the conditioned response (CR) away from the onset of the unconditioned stimulus (UCS) and toward the onset of the conditioned stimulus (CS) during an acquisition cycle; a graded CR dependent upon interstimulus interval (ISI) timing; the blocking effect; and second-order conditioning. Key to the success of this model is the inclusion of short-term memory trace variables $\bar{x}_i$ and $\bar{y}$, describing the inputs and output, and the dependence of weight adaptation on a temporal-difference term $y - \bar{y}$.

Introduction

Sutton and Barto [2] introduced a learning network that attempts to explain the effects observed during classical conditioning. The Sutton-Barto model is a simple adaptive element described by Equations (1). A three-input form of the element is shown in Figure 1. At its simplest, the model acts as a perceptron: the output is a summation of weighted inputs. However, the model includes two additional sets of variables that are critical to its behavior. Each input stimulus $x_i$ has an associated eligibility trace $\bar{x}_i$, which acts as a short-term memory of $x_i$ and indicates when, and by how much, the weight $w_i$ associated with $x_i$ is modified. A similar variable $\bar{y}$ exists for the output, where $\bar{y}$ is a weighted average of the element's past activity.

An important aspect of the Sutton-Barto model is the dependence of weight modification on the difference term $y - \bar{y}$ (and $\bar{x}_i$). This is in contrast to Hebbian learning, which depends only on $y$ (and $x_i$). The inclusion of the $y - \bar{y}$ temporal-difference term in the Sutton-Barto model is critical to accounting for the timing results observed in classical conditioning experiments, namely the blocking effect, interstimulus interval variations, and higher-order conditioning effects.
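The contrast with Hebbian learning drawn above can be illustrated with a minimal sketch. It is not the paper's MATLAB code: the function name, the constant output y = 1, the input-side factor held at 1, and the values of c and β are all assumptions chosen only to make the difference visible.

```python
def total_weight_change(rule, steps=10, c=0.1, beta=0.5):
    """Accumulate weight increments while the output is held at y = 1.

    All values here are illustrative assumptions, not the paper's settings.
    """
    y = 1.0        # sustained output, held constant for illustration
    x_term = 1.0   # input-side factor (x_i or its trace), held at 1
    ybar = 0.0     # weighted average of past output
    total = 0.0
    for _ in range(steps):
        if rule == "hebbian":
            total += c * y * x_term             # depends on y alone
        elif rule == "sutton-barto":
            total += c * (y - ybar) * x_term    # depends on y - ybar
        else:
            raise ValueError("unknown rule: " + rule)
        ybar = beta * ybar + (1.0 - beta) * y   # ybar catches up with y
    return total


hebbian_total = total_weight_change("hebbian")
sutton_barto_total = total_weight_change("sutton-barto")
print(hebbian_total, sutton_barto_total)
```

Under these assumptions the Hebbian total grows by c on every step for as long as the output stays on, whereas the temporal-difference total levels off near c / (1 - β), because the increments shrink as the average output catches up with the actual output. Only changes in output drive learning in the second rule.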

\[
\begin{aligned}
\bar{x}_i(t+1) &= \alpha\,\bar{x}_i(t) + x_i(t) \\
\bar{y}(t+1) &= \beta\,\bar{y}(t) + (1-\beta)\,y(t) \\
w_i(t+1) &= w_i(t) + c\,\bigl(y(t) - \bar{y}(t)\bigr)\,\bar{x}_i(t) \\
y(t) &= \sum_{i=1}^{n} w_i(t)\,x_i(t)
\end{aligned}
\tag{1}
\]

[Figure 1 diagram: the inputs UCS, CS1 and CS2, each with its own weight, feed a single output element producing the UR / CR.]

Figure 1: The simulated Sutton-Barto adaptive element described by Equations (1).

In Equations (1), α and β are constants in the range 0 to 1, and c is a positive learning-rate constant. The variables $y(t)$ and $\bar{y}$ must lie in the interval [0, 1], and the eligibility trace $\bar{x}$ is never negative.

A simulation of the Sutton-Barto model was conducted for the network configuration shown in Figure 1, based on Equations (1). The simulation investigated the effects of timing differences between the input stimuli. Two inputs acted as conditioned stimuli, CS1 and CS2, and a third input simulated an unconditioned stimulus, UCS.

Methods

The MATLAB toolkit was used to develop and execute the simulation. Four experiments were conducted, each highlighting known timing effects of classical conditioning.

Simulation (a) - Acquisition of a CR

The first experiment simulated the acquisition of a conditioned response (CR) to CS1 upon pairing with the UCS, following the basic classical conditioning paradigm. The timing relationship between CS1 and the UCS is shown in the plots of Figure 2. The ISI (the time between the onset of CS1 and the onset of the UCS) is held fixed. For this simulation, the parameters were c =., α =.6, β =, w =.6. Note that the weight associated with the UCS is fixed, whereas the weight associated with CS1 adapts according to Equations (1). The experiment consisted of ten trials, each simulating a fixed number of time steps. Ten trials were enough for the CS1 weight to adapt to its asymptote.
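Equations (1) and the acquisition protocol just described can be sketched in a few lines of Python. This is an illustrative reimplementation, not the MATLAB code used for the paper. The stimulus shapes (rectangular pulses), trial length, onsets, ISI, the reset of traces between trials, the `plastic` mask that freezes the UCS weight, and the parameter values α = 0.6, β = 0, c = 0.1 and a UCS weight of 0.6 are all assumptions made for the example.

```python
def clip01(v):
    """Keep the output inside the interval [0, 1]."""
    return min(1.0, max(0.0, v))


def step(w, x, xbar, ybar, plastic, alpha, beta, c):
    """One time step of Equations (1); returns the updated state and y(t)."""
    y = clip01(sum(wi * xi for wi, xi in zip(w, x)))
    delta = c * (y - ybar)                       # temporal-difference term
    w_next = [wi + delta * xb if p else wi       # only plastic weights adapt
              for wi, xb, p in zip(w, xbar, plastic)]
    xbar_next = [alpha * xb + xi for xb, xi in zip(xbar, x)]
    ybar_next = beta * ybar + (1.0 - beta) * y
    return w_next, xbar_next, ybar_next, y


def pulse(steps, onset, length):
    """Rectangular stimulus: 1 for `length` steps from `onset`, else 0."""
    return [1.0 if onset <= t < onset + length else 0.0 for t in range(steps)]


def run_trial(w, inputs, plastic, alpha, beta, c):
    """Run one trial, starting with all traces at zero (an assumption)."""
    xbar, ybar = [0.0] * len(w), 0.0
    for x in inputs:
        w, xbar, ybar, _ = step(w, x, xbar, ybar, plastic, alpha, beta, c)
    return w


# Hypothetical timing: CS1 on for 5 steps, UCS follows with an ISI of 5 steps.
STEPS, CS_ONSET, CS_LENGTH, ISI, UCS_LENGTH = 30, 5, 5, 5, 5
ucs = pulse(STEPS, CS_ONSET + ISI, UCS_LENGTH)
cs1 = pulse(STEPS, CS_ONSET, CS_LENGTH)
inputs = list(zip(ucs, cs1))                     # input order: (UCS, CS1)

w = [0.6, 0.0]                                   # fixed UCS weight, CS1 weight
history = []
for trial in range(10):
    w = run_trial(w, inputs, plastic=[False, True],
                  alpha=0.6, beta=0.0, c=0.1)
    history.append(w[1])
print(" ".join("%.3f" % v for v in history))
```

With these assumed settings the CS1 weight rises on every trial and levels off below the fixed UCS weight, which is the qualitative behavior the acquisition experiment is meant to show. The exact values depend entirely on the assumed timing and parameters.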

Simulation (a) - CR dependency on ISI

In the second experiment, the ISI was varied over a range of values. Ten trials (as described in the prior experiment) were conducted for each ISI setting, yielding one data point (the resulting CS1 weight) per setting. For this simulation, the parameters were c =., α =.9, β =, w =.6.

Simulation (b) - Blocking effects

The third experiment explored the blocking effect, in which a conditioned stimulus, CS2, is unable to alter the CR of the network if it is paired, with identical timing, with a CS1 that has previously been paired with a UCS, so that the CS1 weight has already adapted. Figure 5 shows the time course of the CS1, CS2 and UCS signals during the three phases of the experiment. The first phase, shown in the upper plot of Figure 5, is the standard conditioning paradigm where a CS is paired with the UCS, adapting the CS1 weight to asymptote, a necessary condition for testing the blocking effect. The next phase, shown in the middle plot, attempts to pair a new stimulus, CS2, coincident with the previously trained stimulus CS1. In the third phase, shown in the bottom plot, CS2 is allowed to precede CS1, thus allowing CS2 to appear novel. For this simulation, the parameters were c =., α =.6, β =, w =.6.

Simulation (c) - Second-order conditioning

The fourth experiment explored the timing effects of the higher-order conditioning paradigm. Figure 7 shows the time course of the CS1 and CS2 signals during the two phases of the experiment. The first phase, shown in the upper plot, follows the standard CS-UCS paradigm, where the CS1 weight adapts to asymptote, a necessary condition for second-order conditioning to occur in the next phase. The second phase, shown in the middle plot, attempts to pair a new stimulus, CS2, with CS1, in the absence of the UCS. This second phase is repeated over a number of trials until both the CS1 and CS2 weights reach an asymptote. For this simulation, the parameters were c =., α =.6, β =, w =.6.

Results

Simulation (a) - Acquisition of a CR

The results of the first experiment are shown in the two plots of Figure 2. The top plot displays the first trial, and the bottom plot the tenth (and last) trial. In both plots, the rise and decay (short-term memory) of the CS1 eligibility trace $\bar{x}$ is evident. In the first trial, the trace is active at the same time as the temporal-difference term $y - \bar{y}$, so the CS1 weight adapts. Once the weight is non-zero, the CS1 signal contributes to the output (the CR), moving the onset of the CR earlier (leftward). By the tenth trial, the CR is coincident with the onset of CS1; at that moment the eligibility trace $\bar{x}$ is zero, so the weight ceases to adapt.

[Figure 2, two panels titled "Simulation (a): TD adaptive elements", one for the first trial and one for the last. Legend: UCS, CS1, xbar, yBar, w. Vertical axis: relative activity.]

Figure 2: The first and last of a series of classical conditioning trials. The upper panel is the time course of the adaptive elements in the first of ten pairing trials, and the bottom panel is the time course of the last trial. In each, a CS precedes the UCS in the normal manner. The trace $\bar{x}$ indicates the eligibility for modification of the CS1 weight. Not shown is this adaptive weight, which is zero at the beginning of the trials. In the first trial, the output element initially responds only to the UCS, but by the last trial, adaptation of the weight to its asymptote causes the output to coincide with the onset of CS1. The product of the difference $y - \bar{y}$, where $\bar{y}$ is the expected output level, and $\bar{x}$ determines the rate of weight increase. Thus, adaptation of this weight continues until the output moves leftward to the point where $\bar{x}$ is zero, at which point the weight has reached asymptote.

Simulation (a) - CR dependency on ISI

Figure 3 plots the results of the experiment on CR dependency on ISI. The results show that there is an optimal ISI, whose value depends on the simulation parameters. The efficacy of asymptotic weight adaptation decays after that peak is reached, to the point where no adaptation is possible at the longest ISIs.

[Figure 3, titled "Simulation (a): Variation of ISI". Horizontal axis: ISI (simulation time steps). Vertical axis: asymptotic connection weight.]

Figure 3: The effect of varying the inter-stimulus interval (ISI) in a classical conditioning paradigm. The ISI is the time between the onset of the CS and the onset of the UCS. A test of a particular ISI requires the CS weight to reach asymptote, typically in ten trials (Figure 2 is an example from one such trial set). The plot demonstrates an optimal ISI, beyond which the asymptotic weight decays exponentially to a point where no weight adjustment is possible. The variation is of course dependent on simulation parameters (here, c =., α =.9, w =.6).
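The ISI sweep can be sketched as follows. This is a hedged illustration only: the one-step CS pulse, the five-step UCS pulse, the trial length, the use of 50 trials per setting (rather than ten) to get close to asymptote, and the values α = 0.9, β = 0, c = 0.1 and a UCS weight of 0.6 are assumptions, and the helper names are hypothetical. Because the stimulus durations are assumed, the location of the peak will not match Figure 3.

```python
def pulse(steps, onset, length):
    """Rectangular stimulus: 1 for `length` steps from `onset`, else 0."""
    return [1.0 if onset <= t < onset + length else 0.0 for t in range(steps)]


def run_trial(w, inputs, plastic, alpha, beta, c):
    """One trial of the Sutton-Barto element, traces starting at zero."""
    xbar, ybar = [0.0] * len(w), 0.0
    w = list(w)
    for x in inputs:
        y = min(1.0, max(0.0, sum(wi * xi for wi, xi in zip(w, x))))
        delta = c * (y - ybar)
        w = [wi + delta * xb if p else wi
             for wi, xb, p in zip(w, xbar, plastic)]
        xbar = [alpha * xb + xi for xb, xi in zip(xbar, x)]
        ybar = beta * ybar + (1.0 - beta) * y
    return w


def final_cs_weight(isi, trials=50, alpha=0.9, beta=0.0, c=0.1, w_ucs=0.6):
    """CS weight after repeated CS-UCS pairings at a given ISI."""
    steps, cs_onset = 40, 5                      # assumed trial layout
    cs = pulse(steps, cs_onset, 1)               # brief CS (assumption)
    ucs = pulse(steps, cs_onset + isi, 5)        # UCS starts `isi` steps later
    inputs = list(zip(ucs, cs))                  # input order: (UCS, CS)
    w = [w_ucs, 0.0]
    for _ in range(trials):
        w = run_trial(w, inputs, [False, True], alpha, beta, c)
    return w[1]


weights = {isi: final_cs_weight(isi) for isi in range(1, 10)}
for isi, value in weights.items():
    print(isi, round(value, 3))
```

Under these assumptions the weight is largest at the shortest ISI in the sweep and falls off geometrically with each additional step, mirroring the decay of the eligibility trace between CS onset and UCS onset.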

Simulation (b) - Blocking effects

Figures 4 and 5 show the results of the exploration of blocking effects. Figure 4 plots the adaptation of the CS1 and CS2 weights over the course of the three phases of the experiment. In phase one, the CS1 weight is allowed to reach its asymptote while CS2 is held inactive (thus the CS2 weight cannot adapt and remains zero). In phase two, CS2 is activated coincident with CS1, but the CS2 weight adapts only slightly and quickly reaches asymptote: it is blocked by CS1. In phase three, however, CS2 is allowed to precede CS1, so the CS2 weight begins to increase, owing to its maximal eligibility trace. The CS1 weight decays because its eligibility trace becomes progressively weaker relative to that of CS2. By the end of phase three, CS2 has become the predicting stimulus, and CS1 is now blocked.

[Figure 4, titled "Simulation (b): Blocking". Curves: CS1 weight and CS2 weight. Horizontal axis: trials. Vertical axis: connection weights.]

Figure 4: Results of a simulation sequence exploring the blocking effect. The simulation consists of three phases; the time course of the adaptive elements for each phase is shown in Figure 5. In phase 1, CS1 is paired with a UCS (CS2 is not active), and the weight associated with CS1 reaches asymptote. In phase 2, CS2 is exactly coincident with CS1 during each conditioning trial. However, the weight associated with CS2 adapts only slightly, demonstrating the blocking effect of the CS1 weight. In phase 3, CS2 is allowed to precede CS1, thus allowing adaptation of both weights. By the end of phase 3, CS2 has adapted to become the novel stimulus, and CS1 is blocked.

[Figure 5, three panels titled "Simulation (b): TD adaptive elements", one trial from each phase. Legend: UCS, CS1 and its trace xbar, CS2 and its trace xbar, yBar. Vertical axis: relative activity.]

Figure 5: The time course of the adaptive elements during the three phases of a simulation sequence exploring the blocking effect, the results of which are shown in Figure 4. The first phase, shown in the upper plot, is the standard conditioning paradigm where a CS is paired with a UCS, adapting the CS1 weight to asymptote. The next phase, shown in the middle plot, attempts to pair a new stimulus, CS2, coincident with the previously trained stimulus CS1. As shown in Figure 4, the weight associated with CS2 adapts only slightly during this phase. In the third phase, shown in the bottom plot, CS2 precedes CS1, thus appearing novel and allowing adaptation of both the CS1 and CS2 weights (also shown in Figure 4).
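The first two phases of the blocking experiment can be sketched as below, together with a control run that skips the pretraining. This is an illustration under stated assumptions rather than the paper's implementation: the stimulus timing, the ten trials per phase, the trace reset between trials, and the values α = 0.6, β = 0, c = 0.1 and a UCS weight of 0.6 are all assumed, the helper names are hypothetical, and the third phase (CS2 preceding CS1) is left out to keep the example short.

```python
def pulse(steps, onset, length):
    """Rectangular stimulus: 1 for `length` steps from `onset`, else 0."""
    return [1.0 if onset <= t < onset + length else 0.0 for t in range(steps)]


def run_trial(w, inputs, plastic, alpha, beta, c):
    """One trial of the Sutton-Barto element, traces starting at zero."""
    xbar, ybar = [0.0] * len(w), 0.0
    w = list(w)
    for x in inputs:
        y = min(1.0, max(0.0, sum(wi * xi for wi, xi in zip(w, x))))
        delta = c * (y - ybar)
        w = [wi + delta * xb if p else wi
             for wi, xb, p in zip(w, xbar, plastic)]
        xbar = [alpha * xb + xi for xb, xi in zip(xbar, x)]
        ybar = beta * ybar + (1.0 - beta) * y
    return w


STEPS = 30
params = dict(alpha=0.6, beta=0.0, c=0.1)        # assumed values
plastic = [False, True, True]                    # UCS weight stays fixed
ucs = pulse(STEPS, 10, 5)
cs = pulse(STEPS, 5, 5)
silent = [0.0] * STEPS

phase1 = list(zip(ucs, cs, silent))              # CS1 + UCS, CS2 inactive
phase2 = list(zip(ucs, cs, cs))                  # CS1 and CS2 coincident

# Blocking group: pretrain CS1, then present the CS1 + CS2 compound.
w = [0.6, 0.0, 0.0]                              # weights: (UCS, CS1, CS2)
for _ in range(10):
    w = run_trial(w, phase1, plastic, **params)
for _ in range(10):
    w = run_trial(w, phase2, plastic, **params)
blocked_w2 = w[2]

# Control group: compound training only, with no CS1 pretraining.
control = [0.6, 0.0, 0.0]
for _ in range(10):
    control = run_trial(control, phase2, plastic, **params)
control_w2 = control[2]

print(round(w[1], 3), round(blocked_w2, 3), round(control_w2, 3))
```

With these assumed settings the CS2 weight stays small when CS1 has been pretrained, and grows much larger in the control run. The reason is visible in the update rule: after pretraining, the output at CS onset already nearly matches the output produced by the UCS, so the difference between actual and expected output is close to zero when the traces of both stimuli are eligible.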

Simulation (c) - Second-order conditioning

Figures 6 and 7 plot the results of the experiment on second-order conditioning. Referring to Figure 6, in the first phase CS1 is paired with the UCS so that the CS1 weight reaches asymptote (the time course is shown in the top plot of Figure 7). Second-order conditioning then begins with the termination of the UCS and the activation of CS2 prior to CS1 (the time course is shown in the middle plot of Figure 7). CS2 now begins to increase its weight in response to CS1. However, because CS1 is itself no longer reinforced by the UCS, the CS1 weight decays. The two weights become equal partway through the second phase, but both then decay to zero, due to a lack of hard reinforcement (which takes the form of the fixed weight of the UCS).

[Figure 6, titled "Simulation (c): Second order conditioning". Curves: CS1 weight and CS2 weight. Horizontal axis: trials. Vertical axis: connection weights.]

Figure 6: Results of a simulation sequence exploring the second-order conditioning paradigm. The simulation consists of two phases; the time course of the adaptive elements for each phase is shown in Figure 7. In phase 1, CS1 is paired with a UCS in the standard manner (CS2 is not active), and the weight associated with CS1 reaches asymptote. In phase 2, CS2 is paired with CS1, which acts as a reinforcing stimulus in the absence of the UCS signal. The effect demonstrated in the second phase is an initial adaptation of the CS2 weight, coincident with a decrease in the CS1 weight due to the absence of the UCS to reinforce it. By the end of the second phase, both the CS1 and CS2 weights have decayed to zero.

[Figure 7, three panels titled "Simulation (c): TD adaptive elements". Legend: UCS, CS1 and its trace xbar, CS2 and its trace xbar, yBar. Vertical axis: relative activity.]

Figure 7: The time course of the adaptive elements during two phases of a simulation sequence exploring second-order conditioning, the results of which are shown in Figure 6. The first phase, shown in the upper plot, is the standard CS-UCS paradigm, where the weight associated with CS1 (shown in Figure 6) adapts to asymptote. The second phase, shown in the middle plot, attempts to pair a new stimulus, CS2, with CS1, in the absence of the UCS. Pairing is successful early on, as evidenced by the output element trace in the middle plot, although its output is not nearly as strong as the trace of the CS-UCS pairing shown in the top plot. The bottom plot shows the result of repeated second-order conditioning trials. The output element trace has decayed to zero, indicating that the weights associated with CS1 and CS2 have decayed to zero. Figure 6 demonstrates this effect.
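A minimal sketch of the two-phase second-order protocol follows. It is illustrative only: the stimulus timing (CS2 on for three steps immediately before CS1), the trial counts, the trace reset between trials, and the values α = 0.6, β = 0, c = 0.1 and a UCS weight of 0.6 are assumptions, and the helper names are hypothetical.

```python
def pulse(steps, onset, length):
    """Rectangular stimulus: 1 for `length` steps from `onset`, else 0."""
    return [1.0 if onset <= t < onset + length else 0.0 for t in range(steps)]


def run_trial(w, inputs, plastic, alpha, beta, c):
    """One trial of the Sutton-Barto element, traces starting at zero."""
    xbar, ybar = [0.0] * len(w), 0.0
    w = list(w)
    for x in inputs:
        y = min(1.0, max(0.0, sum(wi * xi for wi, xi in zip(w, x))))
        delta = c * (y - ybar)
        w = [wi + delta * xb if p else wi
             for wi, xb, p in zip(w, xbar, plastic)]
        xbar = [alpha * xb + xi for xb, xi in zip(xbar, x)]
        ybar = beta * ybar + (1.0 - beta) * y
    return w


STEPS = 30
params = dict(alpha=0.6, beta=0.0, c=0.1)        # assumed values
plastic = [False, True, True]                    # UCS weight stays fixed
silent = [0.0] * STEPS
ucs = pulse(STEPS, 10, 5)
cs1 = pulse(STEPS, 5, 5)
cs2 = pulse(STEPS, 2, 3)                         # CS2 precedes CS1

phase1 = list(zip(ucs, cs1, silent))             # CS1 paired with UCS
phase2 = list(zip(silent, cs1, cs2))             # CS2 paired with CS1, no UCS

w = [0.6, 0.0, 0.0]                              # weights: (UCS, CS1, CS2)
for _ in range(10):
    w = run_trial(w, phase1, plastic, **params)

w1_history, w2_history = [], []
for _ in range(40):
    w = run_trial(w, phase2, plastic, **params)
    w1_history.append(w[1])
    w2_history.append(w[2])

peak_trial = w2_history.index(max(w2_history)) + 1
print(peak_trial, round(max(w2_history), 3),
      round(w1_history[-1], 4), round(w2_history[-1], 4))
```

Under these assumptions the CS2 weight rises for the first few phase-two trials while the CS1 weight falls on every trial, and both end close to zero, which is the pattern described for Figure 6. The trial at which the CS2 weight peaks depends on the assumed timing and parameters.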

Discussion

The simulated Sutton-Barto model successfully demonstrates known timing and contextual effects of classical conditioning. It also provides a mechanistic explanation of most aspects of the Rescorla-Wagner theory of classical conditioning. It does so by operating in a lumped-trial manner, where, within a trial, model variables are fully specified, allowing insight into effects that occur across trials. For instance, in the blocking experiment shown in Figure 4, it is evident that the CR to the stimuli cannot exceed a fixed level λ, set by the fixed weight of the UCS. The Sutton-Barto model also successfully demonstrates the notion of reinforcement as the difference between actual and expected output level, in contrast to the less sophisticated behavior of the Hebbian learning rule.

References

[1] D.S. Levine. Introduction to Neural and Cognitive Modeling, 2nd Edition. Lawrence Erlbaum Associates (London), 2000.
[2] R.S. Sutton and A.G. Barto. Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88(2):135-170, 1981.

COMPUTATIONAL MODELS OF CLASSICAL CONDITIONING: A COMPARATIVE STUDY

COMPUTATIONAL MODELS OF CLASSICAL CONDITIONING: A COMPARATIVE STUDY COMPUTATIONAL MODELS OF CLASSICAL CONDITIONING: A COMPARATIVE STUDY Christian Balkenius Jan Morén christian.balkenius@fil.lu.se jan.moren@fil.lu.se Lund University Cognitive Science Kungshuset, Lundagård

More information

The Rescorla Wagner Learning Model (and one of its descendants) Computational Models of Neural Systems Lecture 5.1

The Rescorla Wagner Learning Model (and one of its descendants) Computational Models of Neural Systems Lecture 5.1 The Rescorla Wagner Learning Model (and one of its descendants) Lecture 5.1 David S. Touretzky Based on notes by Lisa M. Saksida November, 2015 Outline Classical and instrumental conditioning The Rescorla

More information

Parameter Invariability in the TD Model. with Complete Serial Components. Jordan Marks. Middlesex House. 24 May 1999

Parameter Invariability in the TD Model. with Complete Serial Components. Jordan Marks. Middlesex House. 24 May 1999 Parameter Invariability in the TD Model with Complete Serial Components Jordan Marks Middlesex House 24 May 1999 This work was supported in part by NIMH grant MH57893, John W. Moore, PI 1999 Jordan S.

More information

Classical Conditioning V:

Classical Conditioning V: Classical Conditioning V: Opposites and Opponents PSY/NEU338: Animal learning and decision making: Psychological, computational and neural perspectives where were we? Classical conditioning = prediction

More information

Shadowing and Blocking as Learning Interference Models

Shadowing and Blocking as Learning Interference Models Shadowing and Blocking as Learning Interference Models Espoir Kyubwa Dilip Sunder Raj Department of Bioengineering Department of Neuroscience University of California San Diego University of California

More information

Chapter 1. Give an overview of the whole RL problem. Policies Value functions. Tic-Tac-Toe example

Chapter 1. Give an overview of the whole RL problem. Policies Value functions. Tic-Tac-Toe example Chapter 1 Give an overview of the whole RL problem n Before we break it up into parts to study individually Introduce the cast of characters n Experience (reward) n n Policies Value functions n Models

More information

An Attention Modulated Associative Network

An Attention Modulated Associative Network Published in: Learning & Behavior 2010, 38 (1), 1 26 doi:10.3758/lb.38.1.1 An Attention Modulated Associative Network Justin A. Harris & Evan J. Livesey The University of Sydney Abstract We present an

More information

Cerebral Cortex. Edmund T. Rolls. Principles of Operation. Presubiculum. Subiculum F S D. Neocortex. PHG & Perirhinal. CA1 Fornix CA3 S D

Cerebral Cortex. Edmund T. Rolls. Principles of Operation. Presubiculum. Subiculum F S D. Neocortex. PHG & Perirhinal. CA1 Fornix CA3 S D Cerebral Cortex Principles of Operation Edmund T. Rolls F S D Neocortex S D PHG & Perirhinal 2 3 5 pp Ento rhinal DG Subiculum Presubiculum mf CA3 CA1 Fornix Appendix 4 Simulation software for neuronal

More information

Dikran J. Martin. Psychology 110. Name: Date: Principal Features. "First, the term learning does not apply to (168)

Dikran J. Martin. Psychology 110. Name: Date: Principal Features. First, the term learning does not apply to (168) Dikran J. Martin Psychology 110 Name: Date: Lecture Series: Chapter 5 Learning: How We're Changed Pages: 26 by Experience TEXT: Baron, Robert A. (2001). Psychology (Fifth Edition). Boston, MA: Allyn and

More information

Learning and Adaptive Behavior, Part II

Learning and Adaptive Behavior, Part II Learning and Adaptive Behavior, Part II April 12, 2007 The man who sets out to carry a cat by its tail learns something that will always be useful and which will never grow dim or doubtful. -- Mark Twain

More information

acquisition associative learning behaviorism B. F. Skinner biofeedback

acquisition associative learning behaviorism B. F. Skinner biofeedback acquisition associative learning in classical conditioning the initial stage when one links a neutral stimulus and an unconditioned stimulus so that the neutral stimulus begins triggering the conditioned

More information

Learning. AP PSYCHOLOGY Unit 5

Learning. AP PSYCHOLOGY Unit 5 Learning AP PSYCHOLOGY Unit 5 Learning Learning is a lasting change in behavior or mental process as the result of an experience. There are two important parts: a lasting change a simple reflexive reaction

More information

TEMPORALLY SPECIFIC BLOCKING: TEST OF A COMPUTATIONAL MODEL. A Senior Honors Thesis Presented. Vanessa E. Castagna. June 1999

TEMPORALLY SPECIFIC BLOCKING: TEST OF A COMPUTATIONAL MODEL. A Senior Honors Thesis Presented. Vanessa E. Castagna. June 1999 TEMPORALLY SPECIFIC BLOCKING: TEST OF A COMPUTATIONAL MODEL A Senior Honors Thesis Presented By Vanessa E. Castagna June 999 999 by Vanessa E. Castagna ABSTRACT TEMPORALLY SPECIFIC BLOCKING: A TEST OF

More information

Machine Learning! R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction! MIT Press, 1998!

Machine Learning! R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction! MIT Press, 1998! Introduction Machine Learning! Literature! Introduction 1 R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction! MIT Press, 1998! http://www.cs.ualberta.ca/~sutton/book/the-book.html! E. Alpaydin:

More information

A Model of Dopamine and Uncertainty Using Temporal Difference

A Model of Dopamine and Uncertainty Using Temporal Difference A Model of Dopamine and Uncertainty Using Temporal Difference Angela J. Thurnham* (a.j.thurnham@herts.ac.uk), D. John Done** (d.j.done@herts.ac.uk), Neil Davey* (n.davey@herts.ac.uk), ay J. Frank* (r.j.frank@herts.ac.uk)

More information

Combining Configural and TD Learning on a Robot

Combining Configural and TD Learning on a Robot Proceedings of the Second International Conference on Development and Learning, Cambridge, MA, June 2 5, 22. Combining Configural and TD Learning on a Robot David S. Touretzky, Nathaniel D. Daw, and Ethan

More information

Lateral Inhibition Explains Savings in Conditioning and Extinction

Lateral Inhibition Explains Savings in Conditioning and Extinction Lateral Inhibition Explains Savings in Conditioning and Extinction Ashish Gupta & David C. Noelle ({ashish.gupta, david.noelle}@vanderbilt.edu) Department of Electrical Engineering and Computer Science

More information

Computational Versus Associative Models of Simple Conditioning i

Computational Versus Associative Models of Simple Conditioning i Gallistel & Gibbon Page 1 In press Current Directions in Psychological Science Computational Versus Associative Models of Simple Conditioning i C. R. Gallistel University of California, Los Angeles John

More information

ALM: An R Package for Simulating Associative Learning Models

ALM: An R Package for Simulating Associative Learning Models ALM: An R Package for Simulating Associative Learning Models Ching-Fan Sheu & Teng-Chang Cheng National Cheng Kung University, Taiwan 9 July 2009 Sheu & Cheng (NCKU) ALM 9 July 2009 1 / 24 Outline 1 Introduction

More information

Backward Inhibitory Learning in Honeybees: A Behavioral Analysis of Reinforcement Processing

Backward Inhibitory Learning in Honeybees: A Behavioral Analysis of Reinforcement Processing Backward Inhibitory Learning in Honeybees: A Behavioral Analysis of Reinforcement Processing Frank Hellstern, 1'3 Rainer Malaka, 2'4 and Martin Hammer 1'5 llnstitut ftir Neurobiologie Freie Universit~it

More information

Model Uncertainty in Classical Conditioning

Model Uncertainty in Classical Conditioning Model Uncertainty in Classical Conditioning A. C. Courville* 1,3, N. D. Daw 2,3, G. J. Gordon 4, and D. S. Touretzky 2,3 1 Robotics Institute, 2 Computer Science Department, 3 Center for the Neural Basis

More information

Packet theory of conditioning and timing

Packet theory of conditioning and timing Behavioural Processes 57 (2002) 89 106 www.elsevier.com/locate/behavproc Packet theory of conditioning and timing Kimberly Kirkpatrick Department of Psychology, Uni ersity of York, York YO10 5DD, UK Accepted

More information

The Influence of the Initial Associative Strength on the Rescorla-Wagner Predictions: Relative Validity

The Influence of the Initial Associative Strength on the Rescorla-Wagner Predictions: Relative Validity Methods of Psychological Research Online 4, Vol. 9, No. Internet: http://www.mpr-online.de Fachbereich Psychologie 4 Universität Koblenz-Landau The Influence of the Initial Associative Strength on the

More information

1. A type of learning in which behavior is strengthened if followed by a reinforcer or diminished if followed by a punisher.

1. A type of learning in which behavior is strengthened if followed by a reinforcer or diminished if followed by a punisher. 1. A stimulus change that increases the future frequency of behavior that immediately precedes it. 2. In operant conditioning, a reinforcement schedule that reinforces a response only after a specified

More information

The Mechanics of Associative Change

The Mechanics of Associative Change The Mechanics of Associative Change M.E. Le Pelley (mel22@hermes.cam.ac.uk) I.P.L. McLaren (iplm2@cus.cam.ac.uk) Department of Experimental Psychology; Downing Site Cambridge CB2 3EB, England Abstract

More information

Cerebellar Substrates for Error Correction in Motor Conditioning

Cerebellar Substrates for Error Correction in Motor Conditioning Neurobiology of Learning and Memory 76, 314 341 (2001) doi:10.1006/nlme.2001.4031, available online at http://www.idealibrary.com on Cerebellar Substrates for Error Correction in Motor Conditioning Mark

More information

Fitting Human Decision Making Models using Python

Fitting Human Decision Making Models using Python PROC. OF THE 15th PYTHON IN SCIENCE CONF. (SCIPY 2016) 1 Fitting Human Decision Making Models using Python Alejandro Weinstein, Wael El-Deredy, Stéren Chabert, Myriam Fuentes Abstract A topic of interest

More information

Associative learning

Associative learning Introduction to Learning Associative learning Event-event learning (Pavlovian/classical conditioning) Behavior-event learning (instrumental/ operant conditioning) Both are well-developed experimentally

More information

Topics in Animal Cognition. Oliver W. Layton

Topics in Animal Cognition. Oliver W. Layton Topics in Animal Cognition Oliver W. Layton October 9, 2014 About Me Animal Cognition Animal Cognition What is animal cognition? Critical thinking: cognition or learning? What is the representation

More information

Chapter 5: Learning and Behavior Learning How Learning is Studied Ivan Pavlov Edward Thorndike eliciting stimulus emitted

Chapter 5: Learning and Behavior Learning How Learning is Studied Ivan Pavlov Edward Thorndike eliciting stimulus emitted Chapter 5: Learning and Behavior A. Learning-long lasting changes in the environmental guidance of behavior as a result of experience B. Learning emphasizes the fact that individual environments also play

More information

Reinforcement learning and the brain: the problems we face all day. Reinforcement Learning in the brain

Reinforcement learning and the brain: the problems we face all day. Reinforcement Learning in the brain Reinforcement learning and the brain: the problems we face all day Reinforcement Learning in the brain Reading: Y Niv, Reinforcement learning in the brain, 2009. Decision making at all levels Reinforcement

More information

Context and Pavlovian conditioning

Context and Pavlovian conditioning Context Brazilian conditioning Journal of Medical and Biological Research (1996) 29: 149-173 ISSN 0100-879X 149 Context and Pavlovian conditioning Departamento de Psicologia, Pontifícia Universidade Católica

More information

Learned changes in the sensitivity of stimulus representations: Associative and nonassociative mechanisms

Learned changes in the sensitivity of stimulus representations: Associative and nonassociative mechanisms Q0667 QJEP(B) si-b03/read as keyed THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 2003, 56B (1), 43 55 Learned changes in the sensitivity of stimulus representations: Associative and nonassociative

More information

Study Plan: Session 1

Study Plan: Session 1 Study Plan: Session 1 6. Practice learning the vocabulary. Use the electronic flashcards from the Classical The Development of Classical : The Basic Principles of Classical Conditioned Emotional Reponses:

More information

Reactive agents and perceptual ambiguity

Reactive agents and perceptual ambiguity Major theme: Robotic and computational models of interaction and cognition Reactive agents and perceptual ambiguity Michel van Dartel and Eric Postma IKAT, Universiteit Maastricht Abstract Situated and

More information

PSY 402. Theories of Learning Chapter 4 Nuts and Bolts of Conditioning (Mechanisms of Classical Conditioning)

PSY 402. Theories of Learning Chapter 4 Nuts and Bolts of Conditioning (Mechanisms of Classical Conditioning) PSY 402 Theories of Learning Chapter 4 Nuts and Bolts of Conditioning (Mechanisms of Classical Conditioning) Classical vs. Instrumental The modern view is that these two types of learning involve similar

More information

Basic characteristics

Basic characteristics Learning Basic characteristics The belief that the universe is lawful and orderly The occurrence of phenomena as a function of the operation of specific variables Objective observation Controlled experiments

More information

Objectives. 1. Operationally define terms relevant to theories of learning. 2. Examine learning theories that are currently important.

Objectives. 1. Operationally define terms relevant to theories of learning. 2. Examine learning theories that are currently important. Objectives 1. Operationally define terms relevant to theories of learning. 2. Examine learning theories that are currently important. Learning Theories Behaviorism Cognitivism Social Constructivism Behaviorism

More information

A Study on Edge Detection Techniques in Retinex Based Adaptive Filter

A Study on Edge Detection Techniques in Retinex Based Adaptive Filter A Stud on Edge Detection Techniques in Retine Based Adaptive Filter P. Swarnalatha and Dr. B. K. Tripath Abstract Processing the images to obtain the resultant images with challenging clarit and appealing

More information

Chapter 5: How Do We Learn?

Chapter 5: How Do We Learn? Chapter 5: How Do We Learn? Defining Learning A relatively permanent change in behavior or the potential for behavior that results from experience Results from many life experiences, not just structured

More information

EMOTION-I Model: A Biologically-Based Theoretical Framework for Deriving Emotional Context of Sensation in Autonomous Control Systems

EMOTION-I Model: A Biologically-Based Theoretical Framework for Deriving Emotional Context of Sensation in Autonomous Control Systems 28 The Open Cybernetics and Systemics Journal, 2007, 1, 28-46 EMOTION-I Model: A Biologically-Based Theoretical Framework for Deriving Emotional Context of Sensation in Autonomous Control Systems David

More information
