Instrumental or Operant. Operant Conditioning. History of Instrumental Cond. Classical vs. Instrumental 10/10/2017. Law of Effect

Operant Conditioning
Learning & Memory
Arlo Clark-Foos

Instrumental or Operant
- operates on environment to cause an outcome
- behavior is instrumental in causing outcome
- Priscilla the Fastidious Pig
- Thorndike & Skinner
- https://www.youtube.com/watch?v=lsv992ts6as

Classical vs. Instrumental
- Differences
  - Classical
    - Reflexive, automatic behavior
    - Reinforcement follows CS, regardless of response
  - Instrumental
    - Voluntary behavior
    - Reinforcement only follows the response
- Similarities
  - Negative acceleration, blocking, conditioned inhibition, spontaneous recovery, generalization and discrimination

History of Instrumental Cond.
- Edward Thorndike's (1898) puzzle boxes
  - Initially random acts
  - Decrease in time to escape

Law of Effect
- Law of Effect (S-R Association)
- Annoying vs. Satisfying events
- Believed reinforcer is not part of association!
- S^D → R

Superstitious Behavior
- What makes Sammy dance?
- B.F. Skinner (1938) showed that nearly any behavior a pigeon performs during reinforcement will increase in frequency.

Belongingness
- Breland & Breland (1961)
- Shettleworth (1975)
  - Reinforcing with food only reinforces feeding behaviors

Learned Helplessness
- Seligman & Maier (1967)
  - Rats and yoked shocks
- Later extended to college students and anagrams
- Also extended to depression

Losing Streaks
- Detroit Lions, 2008
- Detroit Lions, 2015?

METHODOLOGY
Studying/Observing Instrumental Learning

Mazes in Research
- Willard Small, 1901: Introduced mazes to animal research
  - Hampton Court, London
- T-Maze
  - Alternation learning
  - Better at win-shift than win-stay
- Radial Arm Maze
  - Random without repetition
  - Memory Load: 16+

Mazes in Research
- Morris Water Maze
  - Cued (Response) Learning
    - Rats can see the platform: S-R Association
  - Place Learning
    - Platform is below surface: Explicit, cognitive memory

Conditioning Takes Time
- Skinner's Free Operant Protocol (vs. Discrete Trials)
- Skinner box (automatizing data collection)
- Cumulative recorder (akin to odometer)
- Secondary Reinforcer

What is Learned?
- Discriminative Stimuli (S^D)
  - S^D (light on) → R (press lever) → O (get food)
  - S^D (light off) → R (press lever) → O (no food)
  - Habit Slips (Slips of Action; Reason, 1975)
- Responses (R)
  - Lashley's rats swimming mazes (different motor responses)
- Outcomes (O)
  - Reinforcers and Punishments

Shaping Behavior
- Shaping
  - Requires skilled trainer
  - Physical rehabilitation and language in autism
  - Bomb/drug detecting dogs
- Chaining
  - Backward chaining
  - Twiggy: https://www.youtube.com/watch?v=dvfxf8o-lhw
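
The light-on/light-off contingency under "What is Learned?" and the odometer-like cumulative recorder can be sketched in a few lines. This is only an illustration: the function names `outcome` and `cumulative_record`, and the idea of counting presses in fixed time bins, are assumptions made for the example and are not taken from the slides.

```python
from itertools import accumulate

def outcome(light_on, lever_pressed):
    """S^D -> R -> O: a press yields food only while the light (S^D) is on."""
    return "food" if (light_on and lever_pressed) else "no food"

def cumulative_record(presses_per_bin):
    """Running total of responses across time bins.

    Like an odometer, the total never decreases; a flat stretch
    means the animal paused, a steep stretch means rapid responding.
    """
    return list(accumulate(presses_per_bin))

# Hypothetical session: lever presses counted in five successive time bins.
session = [0, 2, 3, 0, 1]
record = cumulative_record(session)
```

Plotting `record` against bin number would give the staircase-shaped curve a cumulative recorder draws.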

Human Skills and Habits
- Walking: feedback from vision/muscles?
  1. Lashley (1951): RTs > 100 ms
     - Pianists: 16+ movements per second
  2. Damage to sensory feedback
  3. Sequencing errors
  4. Time to initiate depends on length
- Motor Programs
  - Initiated complete
  - General outline, malleable (Schmidt, 1988)
- Skill Acquisition (Anderson, 1982)
  1. Cognitive Stage
  2. Associative Stage
  3. Autonomous Stage

Reinforcers
- Primary
  - Food, water, sleep, sex, shelter (temp control)
- Secondary
  - Predict arrival of primary
  - Token Economies (Conestogas)
- Drive Reduction Theory (Hull, 1943)
- Primary not always reinforcing
  - Negative contrast
    - Nipple sucking for sugar water
    - Lame treats on Halloween

Punishers
- Determinants of effectiveness
  1. Punishment → variable behavior
     - Hot stove
  2. S^D can encourage cheating
     - Speeding, or my dog and Krispy Kreme
  3. Concurrent reinforcement
     - Class clowns
  4. Intensity matters
     - Child rearing or criminal justice

Differential Reinforcement of Alternative Behaviors (DRA)
- Cinemark (2011)

Building S^D → R → O
- Timing
  - Immediate is best
  - Criminal Justice, Punishment

Self Control
- Immediate vs. Delayed Reward
  - Diets, Studying, etc.
- Precommitment (SI)

Positive vs Negative Reinforcement

Positive vs Negative Punishment
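
The positive/negative and reinforcement/punishment distinctions form a two-by-two grid: "positive" means a stimulus is added, "negative" means one is removed; "reinforcement" means the behavior becomes more frequent, "punishment" means it becomes less frequent. A minimal sketch of that grid follows; the function name `classify` and its two boolean parameters are hypothetical labels chosen for the example.

```python
def classify(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the consequence type from the two standard questions.

    stimulus_added:     True if something is presented, False if removed.
    behavior_increases: True if the behavior becomes more frequent.
    """
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {kind}"
```

For example, a seat-belt buzzer that stops when you buckle up removes a stimulus and increases buckling, so `classify(False, True)` gives "negative reinforcement".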

Continuous vs. Partial Reinforcement Schedules
- Fixed-ratio (FR)
  - Postreinforcement pause
- Variable-ratio (VR)
  - Slot machine (keep playing)
- Fixed-interval (FI)
  - TBPM
- Variable-interval (VI)
  - Waiting is the hardest part

Choosing Between Behaviors
- Concurrent reinforcement schedules
  - Football on Saturdays
  - Matching Law
- Behavioral Economics (Thaler wins Nobel Prize, 2017)
  - Bliss point and Sunfish (observation of behavior)

Why do I watch football?
- Behaviors with no primary reinforcers
- Premack Principle (1959)
  - Rats with water/wheel, Children with candy/pinball
  - For me: Grading/Cleaning
- Response Deprivation Hypothesis
  - Illegal Drugs?

BRAIN SUBSTRATES
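
Looking back at the four partial schedules and the Matching Law on this page, the rules can be written out as a rough sketch. Ratio schedules count responses; interval schedules wait for time to pass and then reinforce the next response. The names `ratio_schedule`, `interval_schedule`, and `matching_law_share` are hypothetical, and drawing the variable requirements from a uniform distribution is an assumption made for the example. The matching law is shown in its simplest two-option form, B1/(B1+B2) = R1/(R1+R2).

```python
import random

def ratio_schedule(next_requirement):
    """Reinforce the response that completes the current response count."""
    remaining = next_requirement()
    def respond(t):
        nonlocal remaining
        remaining -= 1
        if remaining <= 0:
            remaining = next_requirement()  # set up the next requirement
            return True
        return False
    return respond

def interval_schedule(next_wait):
    """Reinforce the first response made once the current wait has elapsed."""
    available_at = next_wait()
    def respond(t):
        nonlocal available_at
        if t >= available_at:
            available_at = t + next_wait()  # clock restarts at reinforcement
            return True
        return False
    return respond

rng = random.Random(0)  # seeded so the sketch is repeatable

fr5 = ratio_schedule(lambda: 5)                       # FR 5: every 5th response
vr5 = ratio_schedule(lambda: rng.randint(1, 9))       # VR 5: averages 5 responses
fi10 = interval_schedule(lambda: 10)                  # FI 10: first response after 10 time units
vi10 = interval_schedule(lambda: rng.uniform(0, 20))  # VI 10: averages 10 time units

def matching_law_share(reinforcer_rate_a, reinforcer_rate_b):
    """Predicted share of responses given to option A under the matching law."""
    return reinforcer_rate_a / (reinforcer_rate_a + reinforcer_rate_b)
```

Each schedule is called once per response with the current time `t` and returns whether that response is reinforced. Under the matching law, an option paying off twice as often as its alternative should attract about two thirds of the responses.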

Basal ganglia: S^D → R
- Dorsal Striatum (caudate nucleus, putamen)
  - Receives highly processed sensory info
  - Projects to M1
  - Lesioned rats fail to learn behaviors in response to stimuli
    - S^D (light) → R (lever press) → O (food)

Prefrontal Cortex: R → O
- Orbitofrontal cortex (OPFC)
  - Receives sensory input (senses and visceral)
  - Projects to dorsal striatum
  - Grape juice neurons (Tremblay & Schultz, 1999)

Habitual and Automatic Behaviors
- Bike riding, playing instruments, running past food in a maze

"I Want You to Want Me" by Cheap Trick
- James Olds (1954)
  - Electrical current in lateral hypothalamus
  - 700 times an hour, physical exhaustion, starvation
- Ventral Tegmental Area (VTA)
  - Pleasure center?
  - Excitement/anticipation?

Wanting in the VTA/SNc
- VTA, SNc
- Dopaminergic System
- Incentive Salience Hypothesis
  - Working for pleasure (want/drive)
  - What if there is no drive (no dopamine)?
  - Addiction, cues, and precommitment
- Motivational value
- Projects to SNc

Endogenous Opioids
- Exogenous Opiates: Opium, Morphine, Heroin
- May mediate hedonic value
  - Increases liking of other stimuli
  - Decreases perception of pain
- Endogenous released in response to primary reinforcers
  - Which and how many activated may determine preference
  - Nipple Suckers
  - Play Halo or Watch Cartoons

Punishment Signaling
- Somatosensory Cortex (S1)
  - Nociceptors
  - Social Rejection
- Insular Cortex (Insula)
  - Dorsal posterior insula
  - Degree of activation correlates with magnitude of punisher
- Dorsal Anterior Cingulate Cortex
  - Motivational value of punishment

Drug Addiction
- Pathological
  - Known harmful consequences
- Concurrent reinforcement
  - Yay drugs & Boo withdrawals
- Dopaminergic System
- Stroke damage to insula can wipe out addiction

"Might as well face it, you're addicted to love"
- Behavioral Addiction
  - Gambling, VR Schedules (Skinner), and Gambler's Fallacy
  - Parkinson's patients and dopamine agonists
- Cognitive and Behavioral Therapies based on Conditioning

Not All Conditioning is Equal
- Partial Reinforcement Effect
- Partial Reinforcement Extinction Effect (PREE)
  - Frustration (Amsel) vs. Sequential (Capaldi) Theories
- Fixed vs. Variable & Ratio vs. Interval
- Child rearing, pet training, gambling, superstition

What explains the PREE? Frustration Theory (Amsel)
- CRF (R+) → Extinction (R-) → Frustration → Punishes Response
- CRF: R+ R+ R+ R+ R+ R+
  - Develop (R-O) expectancy
- PRF: R+ R+ R- R+ R- R-
  - Develop (R-O) and (R-no O) expectancy
- Evidence for Frustration:
  - Behavior of pigeons
  - Children tantrums
- S (frustration) → R → O

What explains the PREE? Sequential Theory (Capaldi)
- Outcome of previous trial serves as a cue for subsequent behavior
- PRF:  R+  R+  R-   R+  R-   R-
        Fm  Fm  NFm  Fm  NFm  NFm
- NFm → R (S-R) strengthened by next R+
- What happens with long ITI? ... Decay
  - Frustration? Memory?
  - Stronger PREE with long ITI

Response Chaining
- Backward Chaining
- Breaks in the chain

Complex Behavior
- Animal intelligence
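
The Sequential Theory slide can be made concrete by counting how often a response is rewarded while the memory of a nonrewarded trial (NFm) is the active cue. The sketch below is one reading of the slide, not Capaldi's formal model: it assumes Fm and NFm stand for the memory of a rewarded and a nonrewarded trial, and the function names are hypothetical.

```python
def memory_cues(outcomes):
    """Cue present on each trial after the first: memory of the previous outcome."""
    return ["Fm" if prev == "R+" else "NFm" for prev in outcomes[:-1]]

def count_nfm_rewarded(outcomes):
    """Number of trials where responding under the NFm cue was rewarded (R+)."""
    cues = memory_cues(outcomes)
    return sum(1 for cue, now in zip(cues, outcomes[1:])
               if cue == "NFm" and now == "R+")

crf = ["R+"] * 6                             # continuous reinforcement
prf = ["R+", "R+", "R-", "R+", "R-", "R-"]   # the partial sequence on the slide
```

Under CRF the count is zero, so responding has never been rewarded in the presence of NFm. Under the PRF sequence it is nonzero. In extinction every trial leaves an NFm memory, so on this account only the PRF-trained animal has a reason to keep responding.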

Striatum and Skill/Habit
- Caudate, putamen, nucleus accumbens
- Organizes somatosensory representations and motor responses for planning and executing goal-oriented behavior.

Double Dissociation
- Broca vs. Wernicke
- Packard et al. (1989)
  - Radial Arm Maze (8 arms)
  - Win-Stay vs. Win-Shift
- Response vs. Place Learning

Habit Learning in Humans
- Parkinson's Disease
  - Impaired dopaminergic system in striatum
- Huntington's Disease
  - Loss of some striatal function

Weather Prediction Game
- Knowlton et al. (1996); (Gabrieli, 1995)
- Poldrack et al. (1999)

Neurophysiological Data
- Mink (1996)
  - Neurons in striatum fire in anticipation of movement
- Schultz (2006)
  - DA neurons from brain stem into striatum
  - Fire with expectation and reception of rewards
  - Blocking and expectation

Addiction and Drug Use
- Dopamine and Reward

Loose Ends
- Stress and Memory
  - Anxiogenics: Response Strategy (Packard & Wingard, 2004)
  - Peripheral or Intra-Basolateral Amygdala (Hippocampus)
  - Yohimbine, RS 79948-197, Vehicle (Placebo)
  - Autopilot
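
The "Blocking and expectation" point under Schultz (2006) can be illustrated with a prediction-error learning rule. The slides do not name a model; the sketch below swaps in the Rescorla-Wagner rule as a standard stand-in, and the cue names, learning rate `alpha`, and trial counts are assumptions chosen for the example.

```python
def rescorla_wagner(trials, alpha=0.2, reward_value=1.0):
    """Update cue strengths from the gap between expected and received reward.

    trials: list of (cues_present, rewarded) pairs.
    Returns a dict mapping each cue to its learned associative strength.
    """
    strength = {}
    for cues, rewarded in trials:
        expected = sum(strength.get(c, 0.0) for c in cues)
        error = (reward_value if rewarded else 0.0) - expected  # prediction error
        for c in cues:
            strength[c] = strength.get(c, 0.0) + alpha * error
    return strength

light_alone = [(("light",), True)] * 50
light_plus_tone = [(("light", "tone"), True)] * 50

# Blocking group: the light already predicts the reward before the tone is added.
blocked = rescorla_wagner(light_alone + light_plus_tone)
# Control group: light and tone are trained together from the start.
control = rescorla_wagner(light_plus_tone)
```

In the blocking group the reward is already fully expected when the tone appears, so the prediction error is near zero and the tone learns almost nothing. In the control group the tone ends up with roughly half of the total strength.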