Dopamine and Reward Prediction Error

Similar documents
Reference Dependence In the Brain

Neuroeconomics Lecture 1

MEASURING BELIEFS AND REWARDS: A NEUROECONOMIC APPROACH ANDREW CAPLIN MARK DEAN PAUL W. GLIMCHER ROBB B. RUTLEDGE

Axiomatic Methods, Dopamine and Reward Prediction. Error

Axiomatic methods, dopamine and reward prediction error Andrew Caplin and Mark Dean

Title: Falsifying the Reward Prediction Error Hypothesis with an Axiomatic Model. Abbreviated title: Testing the Reward Prediction Error Model Class

Testing the Reward Prediction Error Hypothesis with an Axiomatic Model

Axiomatic Neuroeconomics

The Neural Basis of Financial Decision Making

Search and Satisficing

Neuroimaging vs. other methods

NSCI 324 Systems Neuroscience

Measuring Impact. Conceptual Issues in Program and Policy Evaluation. Daniel L. Millimet. Southern Methodist University.

Neurobiological Foundations of Reward and Risk

Appendix III Individual-level analysis

Brain Imaging studies in substance abuse. Jody Tanabe, MD University of Colorado Denver

Experimental Testing of Intrinsic Preferences for NonInstrumental Information

Reinforcement learning and the brain: the problems we face all day. Reinforcement Learning in the brain

Experimental Design Notes

CHAPTER NINE DATA ANALYSIS / EVALUATING QUALITY (VALIDITY) OF BETWEEN GROUP EXPERIMENTS

Is model fitting necessary for model-based fmri?

Beyond Lazy and Unmotivated

How rational are your decisions? Neuroeconomics

Theory Building and Hypothesis Testing. POLI 205 Doing Research in Politics. Theory. Building. Hypotheses. Testing. Fall 2015

Emotion Explained. Edmund T. Rolls

Lecture 7 Part 2 Crossroads of economics and cognitive science. Xavier Gabaix. March 18, 2004

Psyc 3705, Cognition--Introduction Sept. 13, 2013

Rational Choice Theory I: The Foundations of the Theory

590,000 deaths can be attributed to an addictive substance in some way

SLHS1402 The Talking Brain

Outline for the Course in Experimental and Neuro- Finance Elena Asparouhova and Peter Bossaerts

Introduction to Game Theory:

Toward a Mechanistic Understanding of Human Decision Making Contributions of Functional Neuroimaging

Carol and Michael Lowenstein Professor Director, Philosophy, Politics and Economics. Why were you initially drawn to game theory?

Reward Systems: Human

The Cognitive Systems Paradigm

The Discounted Utility Model

in Cognitive Neuroscience

MSc Neuroimaging for Clinical & Cognitive Neuroscience

Supplementary Information

Introduction to designing Microeconomic Systems

METHODOLOGY FOR DISSERTATION

Establishing the Purpose & Forming A Valid Hypothesis. Introduction to Research

On Two Points of View Regarding Revealed Preferences and Behavioral Economics

Unit 6 REVIEW Page 1. Name: Date:

Lecture II: Difference in Difference. Causality is difficult to Show from cross

Understanding Minimal Risk. Richard T. Campbell University of Illinois at Chicago

Quasi-experimental analysis Notes for "Structural modelling".

Instructions for use

Unlike standard economics, BE is (most of the time) not based on rst principles, but on observed behavior of real people.

ECONOMICS OF HEALTH INEQUALITY

MTAT Bayesian Networks. Introductory Lecture. Sven Laur University of Tartu

On the How and Why of Decision Theory. Joint confusions with Bart Lipman

acquisition associative learning behaviorism B. F. Skinner biofeedback

Some Thoughts on the Principle of Revealed Preference 1

Handout on Perfect Bayesian Equilibrium

Part IV, they are too diverse and unsystematic to be considered as chapters of a handbook. Finally, for those scholars who are not familiar with the

Supporting Information. Demonstration of effort-discounting in dlpfc

MS&E 226: Small Data

The Adolescent Developmental Stage

Learning theory provides the basis for behavioral interventions. The USMLE behavioral science section always contains questions relating to learning

Game theory for playing games: sophistication in a negative-externality experiment

Drugs, addiction, and the brain

Myers PSYCHOLOGY. (7th Ed) Chapter 8. Learning. James A. McCubbin, PhD Clemson University. Worth Publishers

Moderating and Mediating Variables in Psychological Research

Association. Operant Conditioning. Classical or Pavlovian Conditioning. Learning to associate two events. We learn to. associate two stimuli

Council on Chemical Abuse Annual Conference November 2, The Science of Addiction: Rewiring the Brain

A Model of Dopamine and Uncertainty Using Temporal Difference

Receptor Theory and Biological Constraints on Value

Effects of Sequential Context on Judgments and Decisions in the Prisoner s Dilemma Game

Lecture 9 Internal Validity

Learning. Association. Association. Unit 6: Learning. Learning. Classical or Pavlovian Conditioning. Different Types of Learning

PSY318 Computational Modeling Tom Palmeri. Spring 2014! Mon 1:10-4:00 Wilson 316

Enhanced Choice Experiments

A Brief Introduction to Bayesian Statistics

The Logic of Causal Inference

The theory and data available today indicate that the phasic

Collaborative Problem Solving: Operationalizing Trauma Informed Care

Basic definition and Classification of Anhedonia. Preclinical and Clinical assessment of anhedonia.

Beliefs and Endogenous Cognitive Levels: an Experimental Study

Academic year Lecture 16 Emotions LECTURE 16 EMOTIONS

Serotonin System May Have Potential as a Target for Cocaine Medications

Strengthening Operant Behavior: Schedules of Reinforcement. reinforcement occurs after every desired behavior is exhibited

Taken From The Brain Top to Bottom //

Introduction to Applied Research in Economics

Introducing clients to a neuroscience based explanation of addiction

NEURO-BRAIN BOOTCAMP Expanding Leadership and Creativity through the Miracle of Modern (Neuro)Science

ISIS NeuroSTIC. Un modèle computationnel de l amygdale pour l apprentissage pavlovien.

Subjects Motivations

Learning. Classical Conditioning. Classical Conditioning

Rent Seeking in Groups

CONSUMER NEUROSCIENCE

4/9/2012. Happiness & Positive Emotion. Making choices choose what makes you happy

Computational Neuroscience. Instructor: Odelia Schwartz

The Neuroscience of Addiction: A mini-review

Survey of Knowledge Base Content

Choosing the Greater of Two Goods: Neural Currencies for Valuation and Decision Making

Why do people sometimes not keep their promises?

Graphical Modeling Approaches for Estimating Brain Networks

Learning. Learning: Problems. Chapter 6: Learning

Transcription:

Dopamine and Reward Prediction Error Professor Andrew Caplin, NYU The Center for Emotional Economics, University of Mainz June 28-29, 2010

The Proposed Axiomatic Method Work joint with Mark Dean, Paul Glimcher, and Robb Rutledge The potential for neuroeconomic researchers lies in complementarities Neuroscienti c measurement and biological understanding Economic modelling and organizational principles Crucial that there is a methodolgical framework which allows communication across disciplines "Utility" should not mean di erent things to neuroscientist than to economist

The Proposed Axiomatic Method The axiomatic approach developed by decision theorists can provide a unifying role in mature theories Not "abstract" axioms connecting non-observables to one another But "testable axioms connecting latent variables to (ideal) experimental data Novelty of neuroeconomics lies in hybrid nature of data and theories Economists introduced methods out of frustration with other treatments of latent variables The same reasons apply with even more force in neuroscience Ideally leads to a progressive experimental agenda: When data fail to satisfy, amend theory When data do satisfy, re ne e.g. to expected utility theory

The Proposed Axiomatic Method Intuitions concerning utility also important in neuroeconomics Dopamine is a neurotransmitter: transmits information from one part of the brain to another Neurobiological studies have associated dopamine with: Choice: Dopamine manipulations a ect choice behavior in animals Preference: Dopamine encodes information on revealed preferences Beliefs: Changes in expectations modify dopamine activity Learning: Dopamine manipulations a ect the way people learn Addiction: Many drugs of addiction act directly on dopamine Understanding dopamine may give valuable insight into economic behavior

Our Approach There remain barriers to incorporating understanding from dopamine into economics Competing theories of what dopamine does No common language between economics and neuroscience Treatment of unobservables We take an axiomatic approach to testing a model of dopamine Provide a complete list of testable predictions Provide a common language between disciplines by de ning unobservables Failure of particular axioms will aid model development Our aim is to systemize current neurobiological understanding Initially this may prove to be as much an export from, as an import to economics

Dopamine and Reward Prediction Error Schultz et al. [1997] Dopamine res only on receipt of unpredicted rewards Otherwise will re at rst predictor of reward If an expected reward is not received, dopamine ring will pause

Reward Prediction Error Reward Prediction Error hypothesis: Dopamine responds to the di erence between experienced and anticipated reward Information on preferences? Information on beliefs? Involved in reinforcement learning? We provide an axiomatic basis for this model of dopamine activity Use these axioms to develop parsimonious, non-parametric tests of the hypothesis Use experimental data from humans to perform these tests

The Data Set We consider the simplest possible environment in which we can think about reward prediction error Consists of prizes and lotteries: Z: A metric space of prizes with typical elements z, w Λ : Set of all simple probability distributions (lotteries) on Z with typical element p, q Λ(z): Set of all probability distributions whose support includes z Let e z be the lottery that gives prize z with certainty

The Data Set In our idealized data set, we assume we observe a function: δ : M! R M = f(z, p)jz 2 Z, p 2 (z)g where δ(z, p) is the amount of dopamine released when a prize z is obtained from a lottery p 2 Λ(z)

A Graphical Representation Dopamine released when prize 2 is obtained p=0.2 Prob of Prize 1 Dopamine Release Dopamine released when prize 1 is obtained

A Formalization of the RPE hypothesis Need to nd some way of de ning predicted reward from lotteries and experienced reward for prizes such that: Contains all the information which determines dopamine response Dopamine response is increasing in experienced and decreasing in predicted reward Dopamine always responds to no surprise in the same way Demand predicted reward of e z equal to the experienced reward of z We say that a dopamine release function has an RPE representation if we can nd functions r : Λ! R and E : r(z ) r(λ) such that: δ(z, p) = E [r(e z ), r(p)] E is strictly increasing in its rst argument and decreasing in its second argument E (x, x) = E (y, y) for all x, y 2 r(z )

Necessary Condition 1: Coherent Prize Dominance Axiom A1: Coherent Prize Dominance for all (z, p), (w, p), (z, q), (w, q) 2 M δ(z, p) > δ(w, p) ) δ(z, q) > δ(w, q)

Necessary Condition 2: Coherent Lottery Dominance Axiom A2: Coherent Lottery Dominance for all (z, p), (w, p), (z, q), (w, q) 2 M δ(z, p) > δ(z, q) ) δ(w, p) > δ(w, q)

Necessary Condition 3: Equivalence of Certainty Axiom A3: No Surprise Equivalence δ(z, e z ) = δ(w, e w ) 8 z, w 2 Z

A Representation Theorem In general, these conditions are necessary, but not su cient for an RPE representation However, in the special case where we look only at lotteries with two prizes they are Theorem 1: If jz j = 2, a dopamine release function δ satis es axioms A1-A3 if and only if it admits an RPE representation Thus, in order to test RPE in case of two prizes, we need only to test A1-A3

Aim Generate observations of δ in order to test axioms Use a data set containing: Two prizes: win $5, lose $5 Five lotteries: p 2 f0, 0.25, 0.5, 0.75.1g Do not observe dopamine directly Use fmri to observe activity in the Nucleus Accumbens Brain area rich in dopaminergic neurons

Experimental Design

Experimental Details 14 subjects (2 dropped for excess movement) Practice Session (outside scanner) of 4 blocks of 16 trials 2 Scanner Sessions of 8 blocks of 16 trials For Scanner Sessions, subjects paid $35 show up fee, + $100 endowment + outcome of each trial In each trial, subject o ered one option from Observation Set and one from a Decoy Set

Constructing Delta De ning Regions of Interest Need to determine which area of the brain is the Nucleus Accumbens Two ways of doing so: Anatomical ROIs: De ned by location Functional ROIs: De ned by response to a particular stimulus We concentrate on anatomical ROI, but use functional ROIs to test results

Constructing Delta Anatomical Regions of Interest [Neto et al. 2008]

Constructing Delta Estimating delta We now need to estimate the function δ using the data Use a between-subject design Treat all data as coming from a single subject Create a single time series for an ROI Average across voxels Convert to percentage change from session baseline Regress time series on dummies for the revelation of each prize/lottery pair δ(x, p) is the estimated coe cient on the dummy which takes the value 1 when prize x is obtained from lottery p

Results

Results Axioms hold Nucleus Accumbens activity in line with RPE model Experienced and predicted reward sensible

Time Paths

Early Period

Late Period

Two Di erent Signals?

Key Results fmri activity in Nucleus Accumbens does satisfy the necessary conditions for an RPE encoder However, this aggregate result may be the amalgamation of two separate signals Vary in temporal lag Vary in magnitude

Where Now? Observing beliefs and rewards? Axioms + experimental results tell us we can assign numbers to events such that NAcc activity encodes RPE according to those numbers Can we use these numbers to make inferences about beliefs and rewards? Are they beliefs and rewards in the sense that people usually use the words? Can we nd any external validity with respect to other observables? Behavior? Obviously rewarding events? Can we then generalize to other situations?

Economic Applications New way of observing beliefs Makes surprise directly observable Insights into mechanisms underlying learning Building blocks of utility

Conclusion We provide evidence that NAcc activity encodes RPE Can recover consistent dopaminergic beliefs and rewards Potential for important new insights into human behavior and state of mind