Introduction to Learning Associative learning Event-event learning (Pavlovian/classical conditioning) Behavior-event learning (instrumental/ operant conditioning) Both are well-developed experimentally and theoretically Note confusion in usages as procedures, outcomes, and processes 1
Pavlovian conditioning Bell (CS) orient (OR) Food (US) salivate (UR) CS = conditioned stimulus US = unconditioned stimulus OR = orienting response CR = conditioned response UR = unconditioned response salivate (CR) Note limiting nature of defined properties of these events Important consequences of Pavlovian conditioning Behavioral Reflexes vs behavior systems Regulatory systems; anticipatory homeostasis (allostasis) Affective Motivational states and modulation of other behavior Reinforcement power Cognitive Attention Representation 2
Learning theory What are the conditions of learning? What are the contents of learning? What is the relation of learning to behavior? Conditions for learning External (environmental) vs internal (organismic) conditions External Time Reinforcement (do you need it; what is it?) Internal Motivation (drive) 3
Conditions for learning External (environmental) vs internal (organismic) conditions External Time, information Reinforcement (do you need it; what is it?) Internal Motivation (drive) Attention other hypothetical variables Function and evolutionary history Time Conditions for learning Inter-stimulus interval (ISI) 4
Acquisition of Pavlovian associations: interstimulus interval Interstimulus-interval (ISI) function 1 9 8 7 6 Percent CRs 5 4 3 2 1-1 -1-5 5 1 15 2 25 3 CS-US interval (m sec) Condition for learning or content of learning? 1 Eyeblink 1 Licking Pecking % CRs 8 6 4 2 % CRs 8 6 4 Trials to Criterion 4 8 12..2.4.6.8 2 4 6 8 5 1 15 2 25 3 Seconds Seconds Seconds 5 Rear. Conditioned Suppression Flavor Aversion 1. % CRs 4 3 2 Latency (s) 2. 4. 6. Rejection Ratio.8.6.4.2 1 2 3 4 5 6 2 4 6 8 1. 1 2 3 4 5 6 Seconds Seconds Hours 5
Time Important variables Inter-stimulus interval (ISI) Inter-trial interval (ITI) Ratio of ITI/ISI Condition for learning or content of learning? Important variables Time Inter-stimulus interval (ISI) Inter-trial interval (ITI) ITI/ISI Information Contingency/correlation CS US CS US CS US E R I 6
Time Important variables Inter-stimulus interval (ISI) Inter-trial interval (ITI) Information Contingency/correlation Excitation and inhibition Blocking Blocking experiment B: Light food light + tone food tone? C: Light / food light + tone food tone? 7
Many theories Most common & successful are contiguity theories: Learning-induced variations in processing of USs and CSs Reinforcement theories: USs change effectiveness; unpredicted USs more effective Attention theories: CSs change effectiveness; good (or poor) predictors processed better Learning-induced variations in processing of USs Rescorla-Wagner model: V A = α A β 1 (λ 1 -V ΣA...X ) Reinforcing event is error signal (discrepancy between expected and actual value of US) If US underexpected (error>) learning is excitatory If US overexpected (error < ) learning is inhibitory Thus, excitation & inhibition tied to change/contrast rather than absolute events Symmetry of conditions for, and content of, excitation and inhibition Error signal is aggregate error (all sources) Rate parameters for CS and US (constants) determine how fast things happen, and sometimes how big the differences are but otherwise not typically critical 8
Learning-induced variations in processing of USs Rescorla-Wagner model: V A = α A β 1 (λ 1 -V ΣA...X ) Became a standard in behavioral psychology because it did a great job of modeling a large number of odd (counterintuitive) findings as well as obvious ones, with minimum complexity of representation and computation (will demonstrate) later became a standard in neuroscience because it is easy to implement and because midbrain dopamine neurons seemed to show requisite properties to be R-W teaching signals (aggregate prediction error) delivery of unexpected rewards ( positive prediction error ) produces rate increases omission of expected rewards ( negative prediction error ) produces rate decreases as reward becomes expected on the basis of a cue or response, the increases come under control of that cue/response, and fail to occur to reward delivery itself (will say more on this later, too) Learning-induced variations in processing of USs Rescorla-Wagner model: V A = α A β 1 (λ 1 -V ΣA...X ) Apply to blocking Apply to conditioned inhibition procedure (A+, AX-) Apply to overexpectation (A+, B+ ABX+ X?) Apply to contingency (AC+, C-) (AC+, C+) (AC-, C+) Not the last word; many intolerable flaws and wrong predictions Symmetry of excitation and inhibition elusive (e.g., extinction of inhibition and excitation) Aggregate error may not be the whole story; constrained error terms Retrieval effects Evidence for more detailed representation of world ( model-based learning, e.g., Rescorla, 1973) No role for changes in CS processing (attention) 9
Learning-induced variations in processing of CSs Passive (limited resource) models (e.g. Suthreland & Mackintosh, 1971) Active models (must learn to distribute resources) Mackintosh: α increases to better predictors, decreases to worse predictors Pearce-Hall model: V A = α A λ 1, where α A ~ λ 1 -V ΣA...X. (Unsigned) reinforcement error signal determines processing of CS (learning rate parameter). Attends more to bad predictors. Pearce-Hall model Contrast learning and action (controlled and automatic) Apply to blocking Apply to cases in opposition to R-W Unblocking: A++ AX+ X? Hall-Pearce negative transfer & rescue A+ A- A++ A+ A++ A- 1
Pearce-Hall model Works in principle, but usually a problem with speed of process in simple acquisition to one cue, would be gradual loss of α (OK) if added another cue, as in blocking experiment, such that λ V agg =, change in α would be instantaneous if changed or removed reinforcer, change in α would be instantaneous Added dampening factor, such that α A = ϒ λ V agg + (1-ϒ) α A When combined with R-W term for variations in reinforcer effectiveness, accounts for nearly everything Any evidence for an unsigned Pearce-Hall-like error signal? establish learning with one reinforcer and then shift to a bigger or smaller one normalized firing rate Central Nucleus of Amygdala 1.9.8.7 Down shift Up shift.6 block switch first ten trial last ten trials normalized firing rate 1.9.8.7.6 Basolateral Amygdala Down shift Up shift block switch first ten trial last ten trials normalized firing rate.6.4.2 Dopamine (VTA) 1 Down shift Up shift.8 block switch first ten trial last ten trials 1.9 normalized signal strength (undamped version) Selective Downshift Signal.8.7.6 Down shift Up shift block switch first ten trial last ten trials normalized signal strength 1 Down shift Up shift.9.8.7.6 (damped version) Attention Signal block switch first ten trial last ten trials normalized signal strength (Recorla-Wagner signed error) Error Signal 1 Down shift Up shift.8.6.4.2 block switch first ten trial last ten trials 11
More complex theories Contingency theories: Learn individual event relations and compute contingencies Learning based (frequentist & Bayesian) Performance-based (Comparator theory) Learn everything and compute whatever needed Non-arbitrary nature of events Identity of associates matter Engage different neural and behavioral systems for learning Both CS and US individually, and their combination Garcia example 12
Acquisition of Pavlovian associations: cue to consequence Garcia Experiment Bright+Noisy+Tasty -- illness -or- Bright+Noisy+Tasty -- pain 6 5 4 % CR 3 2 tasty bright+noisy 1 illness pain US Content of learning: Pavlovian conditioning as knowledge acquisition Stimulus-stimulus (S-S) vs stimulus-response (S-R) associations Learning how vs. learning what (procedural vs declarative) Model-free vs model-based learning Light means slobber vs Light means food Devaluation experiment ( behavioral syllogism ): Light food men are mortal Food illness Peter is a man Light? Peter is mortal 13
Mediated Performance (devaluation) 4 % time in food cup 3 2 1 T1 T2 1 2 3 4 1 2 Sessions Trials T1 --> flavor1 T2 --> flavor2 flavor2 --> LiCl T1? T2? Holland, 1975;199 Mediated learning Median fluid consumption (ml) 2 16 12 8 4 Dev Non Pretest Test Light Food1 Food1? Light LiCl (Dev) Food1? Tone Food2 Food2? Tone nothing (Non) Food2? Holland, 1981 14
Sensory processing of absent events Mediated Performance (devaluation) 4 % time in food cup 3 2 1 T1 T2 1 2 3 4 1 2 Sessions Trials T1 --> flavor1 T2 --> flavor2 flavor2 --> LiCl T1? T2? continue testing T1 and T2, with unflavored sucrose in cup Sensory processing of absent events Mediated Performance (devaluation) 4 % time in food cup 3 2 1 T1 T2 1 2 3 4 1 2 Sessions Trials T1 --> flavor1 T2 --> flavor2 flavor2 --> LiCl T1? T2? T1 continue testing T1 and T2, with unflavored sucrose in cup 15
Sensory processing of absent events Mediated Performance (devaluation) 4 % time in food cup 3 2 1 T1 T2 1 2 3 4 1 2 Sessions Trials T1 --> flavor1 T2 --> flavor2 flavor2 --> LiCl T1? T2? T2 continue testing T1 and T2, with unflavored sucrose in cup Sensory processing of absent events Mediated Performance (devaluation) % time in food cup 4 3 2 1 T1 T2 Frequency 1 8 6 4 2 Suc Suc + T1 Suc + T2 1 2 3 4 1 2 Sessions Trials Injestive Aversive Evaluative Response T1 --> flavor1 T2 --> flavor2 flavor2 --> LiCl T1? T2? 16
Group Maintain Tone sucrose sucrose//licl tone? Group Devalue Tone sucrose sucrose LiCl tone? Group CTL Tone//sucrose sucrose LiCl, or tone? sucrose//licl Group Maintain Tone sucrose sucrose//licl tone? Group Devalue Tone sucrose sucrose LiCl tone? Group CTL Tone//sucrose sucrose LiCl, or tone? sucrose//licl Response to tone+water Frequency/1s 8. 7. 6. 5. 4. 3. 2. 1.. Maintain Devalue CTL Appetitive Aversive Response Kerfoot, Agarwal, Lee, & Holland, 24 17
FOS expression in nucleus accumbens Accumbens Shell 2 18 16 Maintain Devalue CTL Counts 14 12 1 8 6 4 2 Anterior Posterior Kerfoot, Agarwal, Lee, & Holland, 25 FOS expression in gustatory cortex GC Counts 3. 275. 25. 225. 2. 175. 15. 125. 1. 75. 5. 25.. Maintain Devalue CTL Group Maintain Devalue CTL Kerfoot, Agarwal, Lee, & Holland, 25 18