LETTERS. Neural correlates, computation and behavioural impact of decision confidence


doi:10.1038/nature07200

LETTERS

Neural correlates, computation and behavioural impact of decision confidence

Adam Kepecs1, Naoshige Uchida1,2, Hatim Zariwala1,3 & Zachary F. Mainen1,4

Humans and other animals must often make decisions on the basis of imperfect evidence 1,2. Statisticians use measures such as P values to assign degrees of confidence to propositions, but little is known about how the brain computes confidence estimates about decisions. We explored this issue using behavioural analysis and neural recordings in rats in combination with computational modelling. Subjects were trained to perform an odour categorization task that allowed decision confidence to be manipulated by varying the distance of the test stimulus to the category boundary. To understand how confidence could be computed along with the choice itself, using standard models of decision-making 3–6, we defined a simple measure that quantified the quality of the evidence contributing to a particular decision. Here we show that the firing rates of many single neurons in the orbitofrontal cortex match closely to the predictions of confidence models and cannot be readily explained by alternative mechanisms, such as learning stimulus–outcome associations 7–10. Moreover, when tested using a delayed reward version of the task, we found that rats' willingness to wait for rewards increased with confidence, as predicted by the theoretical model. These results indicate that confidence estimates, previously suggested to require metacognition 11,12 and conscious awareness 13,14, are available even in the rodent brain, can be computed with relatively simple operations, and can drive adaptive behaviour. We suggest that confidence estimation may be a fundamental and ubiquitous component of decision-making.

Rats were trained on a two-choice odour mixture categorization task (Fig. 1). On each trial, a binary mixture of two pure odorants (A, caproic acid; B, 1-hexanol) was delivered at one of several concentration ratios (Fig. 1), which were randomly interleaved from trial-to-trial 15. Choices were rewarded at the left choice port for mixtures A/B > 50/50 and at the right choice port for A/B < 50/50 (Fig. 1).
By varying the distance of the stimulus to the category boundary (50/50) we could vary the difficulty of the decision (Fig. 1c, d). Although the reward contingencies were deterministic, subjects experienced varying degrees of decision uncertainty due to imperfect perception of stimuli and/or knowledge of the category boundary.

To explore the neural correlates of decision confidence, we recorded single neuron activity in the orbitofrontal cortex (OFC; Supplementary Fig. 1), a brain region implicated in decision-making under uncertainty 16–20. We reasoned that neural activity related to the subject's confidence in the outcome of a choice should occur while the subject is anticipating the trial outcome, and therefore focused our analysis on this delay period (Fig. 2a). The firing rates of many OFC neurons were modulated by stimulus difficulty during the anticipation period. Figure 2b, c shows the activity of a neuron that fired more intensely following more difficult decisions. By replotting the same data as a function of the choice accuracy associated with each stimulus type, it can be seen that this neuron fired more vigorously when the likelihood of an upcoming reward was lower (Fig. 2d). A large fraction of OFC neurons, like this example, fired more intensely for stimuli closer to the category boundary (2/563 at P < 0.05, Wilcoxon signed-rank test). A smaller fraction (66/563) showed the opposite tuning, firing at higher intensity for easy stimuli, those far from the category boundary (Fig. 2e, f). The observed modulation of firing rate by stimulus difficulty is consistent with previous findings that the response of many OFC neurons correlates with the expected values associated with reward predictive cues 7–10. Surprisingly, however, when we compared correct and incorrect choices for the same stimulus (for example, the 68/32 mixture), we found that many neurons showed different firing rates even before the outcome was delivered.
Figure 3a, b shows an example of a neuron that tended to fire more when the rat had committed an error than when it was correct, despite the fact that the outcome was not yet revealed to the subject. The same phenomenon could also be seen as a difference in the average behavioural accuracy when the neuron was firing at high compared to low rates (see Supplementary Fig. 2). Similar to this example, a large fraction of neurons fired at a higher rate in incorrect trials ('error trials') compared to correct trials within a given stimulus type (46/37 neurons for 56/44 mixtures and 86/563 for 68/32 mixtures at P < 0.05, permutation test, Fig. 3d–f; Supplementary Figs 2 and 3c). Interestingly, for easier stimuli the difference in firing rates between correct and error trials was larger (Fig. 3; Supplementary Fig. 3d). A second, smaller population of neurons (2/37 for 56/44 mixtures and 5/563 for 68/32 mixtures at P < 0.05, permutation test) had an analogous pattern of activity, but fired more in anticipation of correct rather than incorrect outcomes (Supplementary Fig. 4).

[Figure 1 panels: a, Choice A, Odour, Choice B; c, Left choice (%) versus Odour mixture (% A); d, Accuracy (%) versus Odour mixture (% A).]

Figure 1 | Odour mixture categorization task. a, Schematic of the behavioural paradigm. To initiate a trial, the rat enters the central odour port and after a pseudorandom delay of 0.2–0.5 s a mixture of odours is delivered. Rats respond by moving to the left or right choice port, where a drop of water is delivered after a 0.3–2 s waiting period for correct choices. b, Stimulus design. c, Performance of one rat discriminating between mixtures of caproic acid (A) and 1-hexanol (B) in a single session. Error bars (s.e.m.) are hidden by markers. Colours are used to represent odour mixtures, with different blue and green blends representing different odour mixture ratios. d, Choice accuracy as a function of odour mixture. Data across three rats are plotted as mean ± s.e.m.

1 Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA. 2 Department of Molecular and Cellular Biology and Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138, USA. 3 Allen Institute for Brain Science, Seattle, Washington 98103, USA. 4 Champalimaud Neuroscience Programme, Instituto Gulbenkian de Ciência, 2781-901 Oeiras, Portugal.

©2008 Macmillan Publishers Limited. All rights reserved

[Figure 2 panels: b, Rate (spikes s−1) versus Time from choice port entry (s) for 100/0 and 0/100 (98% correct), 68/32 and 32/68 (92% correct), and 56/44 and 44/56 (68% correct) mixtures; c, Rate (spikes s−1) versus Odour mixture (% A); d, Rate (spikes s−1) versus Accuracy; e, f, Normalized rate versus Odour mixture (% A).]

Figure 2 | Graded representation of stimulus difficulty in orbitofrontal cortex. a, Timing of outcome anticipation period. Entry into the choice port is recorded using the interruption of the photo-beams within each port. The delivery of water is pseudo-randomly delayed, with the earliest onset varying between 0.3 s and 1 s and the latest offset from 0.8 s to 2 s after entry, according to a uniform distribution with varying parameters in each session. The anticipation period ends at the first possible time of reward delivery, and thus ranges from 0.3 s to 1 s across sessions. Firing rates are calculated either during the initial 0.4 s of the anticipation period or the entire period if it was shorter. b, Activity of an example neuronal unit. Raster plots represent neural activity, with each row corresponding to a single trial and each tick mark to a spike. Forty trials are shown in each plot with the post-stimulus time histogram (PSTH) overlaid (smoothed with a Gaussian filter, s.d. = 25 ms). Neural activity is aligned to the timing of entry into the choice port. Blue ticks represent the time of reward delivery. Trials for different stimuli were interleaved in the sessions but grouped into different panels according to stimulus difficulty, with stimuli and performance indicated above. c, Mean firing rate of cell in b as a function of stimulus identity. Rates are calculated during the outcome anticipation period (0.3 s window beginning at the time of entry into the choice port). Error bars, s.e.m. across trials. d, Mean firing rate as a function of mean accuracy grouped by stimulus identity. e, Mean-normalized firing rate as a function of stimulus identity for the population of neurons with higher firing rates in error trials (Wilcoxon test, P < 0.05). f, As e but for the population of neurons with higher firing rates in correct trials (Wilcoxon test, P < 0.05).

These firing patterns appear paradoxical for a prediction made on the basis of overall stimulus–outcome associations. However, reward predictions may be generated by a dynamic learning process based on recent reinforcement history 21–23. To test this idea, we used a more powerful multiple linear regression model to try to predict the firing rate of a given trial based on the history of recent reward outcomes and other externally observable variables (the stimulus and choice direction). This analysis revealed that although a subset of OFC neurons do carry information about past trial events, these account for a relatively small fraction of the firing rate variance compared to what can be explained by the anticipated current trial outcome (Supplementary Fig. 5; for details see Methods). Therefore, the signals we observed in OFC neurons could not be readily explained as reward expectancy based on either a simple average stimulus–reward association or more complex predictions based on reinforcement history.
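The reward-history regression described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `history_regression`, the number of past trials (`n_back`) and the coding of the regressors are assumptions made here for illustration. It fits an ordinary least-squares model of per-trial firing rate on the stimulus, the choice direction and the outcomes of the preceding trials.

```python
import numpy as np

def history_regression(rates, outcomes, stimulus, choice, n_back=3):
    """Least-squares fit of firing rate on stimulus, choice and past outcomes.

    Hypothetical helper (not from the paper). Regressor order:
    intercept, stimulus, choice, outcome at t-1, ..., outcome at t-n_back.
    Returns (coefficients, r_squared).
    """
    rates = np.asarray(rates, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    stimulus = np.asarray(stimulus, dtype=float)
    choice = np.asarray(choice, dtype=float)
    n = len(rates)

    # The first n_back trials lack a full outcome history, so they are dropped.
    columns = [np.ones(n - n_back), stimulus[n_back:], choice[n_back:]]
    for lag in range(1, n_back + 1):
        columns.append(outcomes[n_back - lag:n - lag])  # outcome of trial t-lag
    X = np.column_stack(columns)
    y = rates[n_back:]

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    r_squared = 1.0 - residuals.var() / y.var()
    return beta, r_squared
```

Under this sketch, a small `r_squared` for a neuron would indicate that externally observable variables and recent reward history explain little of its trial-to-trial firing rate variance.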
In principle, the probability of a correct trial outcome could be estimated based on a subjective measure of confidence about the decision. We hypothesized that a useful confidence metric could be calculated by measuring the reliability and consistency of the values of the internal variables that contributed to the decision. To explore this idea, we constructed a simple model for the categorization task based on the comparison of the perceived stimulus value and the recalled category boundary (Fig. 4a; see Methods for details). In this model, the choice depends on whether the stimulus sample, $s_i$, is smaller or larger than the category boundary, $b_i$. This comparison yielded an average choice function similar to that observed behaviourally (Fig. 4b; compare Fig. 1c). To estimate the confidence about this choice, we propose to measure the quality of the evidence in this model using the distance between the stimulus and memory samples, $d_i = |s_i - b_i|$; the larger the distance, the more reliable should be the decision. We found that after a simple transformation, $d_i$ can indeed provide a veridical prediction of the likelihood of a successful outcome, decision confidence, $\delta_i = f(d_i)$, or the likelihood of failure, decision uncertainty, $\sigma_i = 1 - \delta_i$ (Fig. 4c). Similar algorithms can also yield useful confidence estimates in other decision models. For example, in a two-alternative 'race' model, an instance of a class of models based on the accumulation of evidence 4–6, decision confidence can be calculated from the difference between two decision variables at the time a decision is reached (Supplementary Fig. 6; Supplementary Information). These modelling results demonstrate that confidence estimates derived solely from the decision variables in the current trial can provide good estimates of the expected decision outcome across trials.

[Figure 3 panels: a, c, 56/44 and 44/56 mixtures; b, d, 68/32 and 32/68 mixtures; a, b, Rate (spikes s−1) and c, d, Normalized rate versus Time from choice port entry (s), Error and Correct; e, f, Count versus Outcome preference.]

Figure 3 | Orbitofrontal neurons anticipate trial outcome. a, b, Firing rate of a single neuron aligned to the time of entry into the choice port. Trials are grouped by stimulus difficulty (a, 44/56 and 56/44 odour mixture ratio; b, 32/68 and 68/32) and trial outcome (correct, orange and cyan; error, red and blue). Shading represents s.e.m.; note there are few 68/32 error trials. Only activity occurring before the onset of water delivery and choice port exit is averaged into the PSTH. After the outcome anticipation period (0.5 s in this session) the PSTH curves are dashed, signifying a time period when in some trials rats experienced reward delivery, although post-reward firing is never actually included. Note that the separation between correct and error trials begins before entry into the choice port but after the animal leaves the odour sampling port. c, d, Mean-normalized firing of negative outcome selective neurons (those with increased firing rate in error trials during the anticipation period) is plotted the same way as a, b. Shading represents s.e.m. across neurons. Dashed curves as in a, b. e, f, Outcome preference for the population of OFC cells during the outcome anticipation period. Outcome preference is calculated using ROC analysis (see Methods). Colour bars represent significant selectivity (permutation test, P < 0.05); red indicates neurons with increased firing rates in incorrect ('error') trials (negative outcome selectivity, 46/37 neurons); green indicates neurons with increased firing rates in correct trials (positive outcome selectivity, 22/37 neurons); grey bars, not significant.

We next looked for specific predictions, patterns of firing rates that would arise from theoretical confidence estimates. We noticed that, when plotted as a function of stimulus type and trial outcome, decision uncertainty, $\sigma_i$, shows a characteristic and somewhat counterintuitive pattern, namely opposing V-shaped curves for correct and error choices (Fig. 4d): (1) for correct choices, $\sigma_i$ decreases with distance from the category boundary; (2) for a given stimulus, error trials are associated with higher $\sigma_i$ than correct trials; (3) the difference in $\sigma_i$ for error and correct trials increases as the stimulus becomes easier. These patterns are robust to model details and do not depend on the relative contributions of stimulus versus memory noise or on the precise choice of the transform function, $f$ (Supplementary Fig. 7). In addition, the same pattern of confidence estimates is produced by decision models based on integration of evidence (Supplementary Fig. 6).

The dependence of OFC neuronal activity on stimulus type and trial outcome closely matched the predictions of confidence estimates derived from decision models (Fig. 4e–h). First, individual OFC neurons showed the predicted dependence on the distance of the stimulus to the category boundary as well as the predicted difference between correct and error trials (Fig. 4e). A similar pattern held at the population level (Fig. 4g, 33/563 negatively-tuned neurons, all stimuli pooled at P < 0.05, permutation test; see also Supplementary Figs 3, 8). These patterns were qualitatively different from those expected from left/right modulation of stimulus selectivity (Supplementary Fig. 3). Second, the probability of a correct trial outcome varied with the firing rate of individual neurons (Fig. 4f), and at the population level (Fig. 4h), as predicted (Fig. 4c). This analysis also showed that the highest firing rates were associated with near chance performance (50% reward probability), as expected if these neurons signalled a lack of confidence rather than incorrect performance (0% reward probability; see Methods for details). The opposite patterns held for the positive outcome selective OFC population (5/563 neurons for all stimuli pooled at P < 0.05, permutation test; Supplementary Fig. 4).

It is possible for the experimenter observing OFC neurons to predict individual trial outcomes, but can rats use such information behaviourally? We tested the ability of rats to provide a behavioural report of confidence using a modified version of the task in which we encouraged rats to give up waiting for uncertain rewards by increasing the delay to reward delivery and permitting subjects to reinitiate a trial (Fig. 5a). While waiting at the choice port, the decision whether to stay and wait for a possible reward or to go and reinitiate the trial could benefit from an estimate of the confidence in the original decision. Indeed, we found that rats preferentially aborted uncertain trials. Like the neural responses in OFC, these response patterns closely agreed with the predictions of the decision confidence model (Figs 5b, c and 4d). Therefore rats not only show a neural correlate of decision confidence but they can use such information in subsequent decisions to guide adaptive behaviour.

[Figure 4 panels: a, Stimulus distribution (P_S) and Boundary distribution (P_B); b, % choice A versus Odour mixture (% A); c, Accuracy (%) versus Uncertainty (σ); d, Uncertainty (σ) versus Odour mixture (% A), Error and Correct; e, Rate (spikes s−1) versus Odour mixture (% A); f, Accuracy (%) versus Rate (spikes s−1); g, Normalized rate versus Odour mixture (% A); h, Accuracy (%) versus Normalized rate.]

Figure 4 | Confidence estimation in a decision model and by OFC neurons. a, Schematic of a model for category decisions. Each odour mixture stimulus, as well as the memory for the category boundary, is encoded as a distribution of values. In each trial a stimulus, $s_i$, and a memory of the boundary, $b_i$, are drawn from their respective distributions. A choice is calculated by comparing the two samples ($s_i$, $b_i$), and a confidence value is estimated by calculating their distance ($|s_i - b_i|$). Incorrect choices result from noise, represented in the model by the width of the stimulus and category boundary distributions. See Methods for details. b, Example psychometric function of the model, replicating the high choice accuracy of rats for pure odours and decreased accuracy for mixtures near the imposed category boundary. c, Mean accuracy of model choices as a function of decision uncertainty. The uncertainty estimate, $\sigma$, is transformed from the distance between the stimulus and boundary samples ($\sigma_i = 1 - \tanh(|s_i - b_i|)$; see Methods). d, Mean decision uncertainty estimates generated by the model as a function of stimulus and trial outcome. Note that the model (or subject) has access only to a stimulus sample and not the stimulus type (for example, 56/44) (see Supplementary Information for an explanation of the pattern of uncertainty estimates). e, Firing rate of an example neuron (same unit as Fig. 3a, b) during the outcome anticipation period as a function of odour stimulus and trial outcome. Error bars are s.e.m. across trials. f, Mean choice accuracy as a function of the firing rate for the same unit in e. Firing rates were binned and the mean accuracy was calculated for each range of firing rates. Error bars represent standard errors based on the binomial distribution of outcomes. g, Mean normalized firing rate of negative outcome selective population (negative outcome preference index across trials with all stimuli pooled at P < 0.05, permutation test) during the anticipation period. h, Mean accuracy as a function of the firing rate for the same neuron population as in g. Firing rates were binned for individual neurons and the mean accuracy was calculated for each range of firing rates. These curves were normalized to a maximal firing rate of 1 and averaged. Error bars represent s.e.m. across neurons.

The patterns of neural activity and behaviour we observed suggest that when a decision is made the brain not only makes a choice but also generates an evaluation about the quality of evidence that contributed to the decision. We liken this to the way P values are assigned to statistical statements. Our interpretation of the data rests on two results: first, we defined a mechanism for computing confidence in simple decision models and showed that this produced a close fit to a non-trivial pattern of neural and behavioural data; second, we ruled out alternative models for the data, principally ones based on learning.

Confidence estimates based on internal decision variables provide useful information that is not readily gained by observing the past relationships between externally observable stimulus, response and outcome variables. Intuitively, this is possible because the observable result of a decision, the choice, is only a partial distillation of the information entering the internal decision process. Computing decision confidence essentially requires calculating 'how close a call' was the choice or how well the evidence was in agreement. When decision noise arises from sources internal to the brain, this process is inherently subjective (accessible only to the subject). More formally, decision confidence can be expressed as the variance measured across the set of decision variables contributing to a single trial (see Supplementary Information).
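The category-decision model of Fig. 4a can be sketched in a few lines to make its three predictions concrete. This is a minimal simulation under stated assumptions, not the authors' code: the Gaussian noise widths (`stim_sd`, `boundary_sd`), the `scale` factor applied before the tanh transform, and the function name are all hypothetical choices made here, since the text does not give these values. Stimuli are expressed as % A with the boundary at 50.

```python
import numpy as np

def simulate_confidence_model(stimulus_means, boundary_mean=50.0, stim_sd=15.0,
                              boundary_sd=10.0, scale=0.05, n_trials=20000, seed=0):
    """Simulate choices and uncertainty estimates for each stimulus (% A).

    All numeric defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    results = {}
    for mu in stimulus_means:
        s = rng.normal(mu, stim_sd, n_trials)                  # stimulus sample s_i
        b = rng.normal(boundary_mean, boundary_sd, n_trials)   # boundary sample b_i
        chose_a = s > b                                        # choice by comparison
        correct = chose_a == (mu > boundary_mean)
        d = np.abs(s - b)                                      # distance d_i = |s_i - b_i|
        sigma = 1.0 - np.tanh(scale * d)                       # uncertainty sigma_i
        results[mu] = {
            "accuracy": correct.mean(),
            "sigma_correct": sigma[correct].mean(),
            "sigma_error": sigma[~correct].mean(),
        }
    return results

if __name__ == "__main__":
    for mu, r in simulate_confidence_model([56, 68, 80]).items():
        print(mu, r)
```

With these assumed parameters, mean uncertainty falls with distance from the boundary on correct trials and rises on error trials, so the gap between the two widens for easier stimuli, which is the opposing V-shaped pattern described for Fig. 4d.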
Two different classes of decision model yielded very similar results, suggesting a degree of generality to our description. Nevertheless, it will be important to examine the properties of other methods for estimating confidence.

A variety of results suggests that a key function of OFC is to generate reward predictions based on stimulus–reward associations 7–10. Our data support and extend this idea by showing that OFC neurons signal outcome predictions derived from a different source, specifically, from internal variables contributing to a perceptual decision on a given trial. In addition to predicting expected rewards, OFC has also been implicated in signalling outcome risk or variance 16–20. Because in a two-alternative psychophysical decision task the expected reward and its variance are closely related, our data are consistent with both functions and further experiments will be needed to distinguish between these alternatives. It also remains to be determined whether OFC neurons drive the reinitiation behaviour displayed by rats (Fig. 5) or other behaviours contingent on confidence estimates.

[Figure 5 panels: a, Odour port, Choice port, Reward/error tone; b (one rat) and c (4 rats), Probability of restart versus Odour mixture (% A), Error and Correct.]

Figure 5 | Behavioural use of decision confidence. a, Schematic of the reinitiation task. Reward delivery was pseudo-randomly delayed between 2 and 8 s (uniform distribution) after the rat's choice was registered. Incorrect choices were signalled with an error tone delivered at the end of the 8 s delay. There was a minimum delay of 2 s from the time of the choice before rats could initiate a new trial. b, Probability of reinitiation for a single rat plotted as a function of odour stimulus and trial outcome. Error bars represent s.e.m. across trials. Entry into the odour port within 2 s of aborting was considered a reinitiation. c, Mean probability of reinitiation for 4 rats as a function of odour stimulus and trial outcome. Error bars represent s.e.m. across rats.
Indeed, decision confidence signals could be useful for a variety of functions, including controlling exploration 24,25, modulating learning rates 26 and focusing attention 27,28.

Bayesian theory suggests that uncertainty estimates must be incorporated into neural computations for optimal behaviour 29. Humans and other primates clearly have the ability to assess and act on the degree of uncertainty or confidence in their beliefs about the world 1,11,30, but it has been argued that this might be a sophisticated metacognitive capacity requiring self-awareness 13,14 and neural architecture specific to primates. Our results show that rodents possess the ability to act on their degree of belief in a decision 12 and demonstrate that estimating the confidence in a choice is little more complex than calculating the choice itself. It is likely that confidence estimates for memories or other beliefs 11,30 could be derived in an analogous fashion. We suggest that the computation of subjective confidence may be a core component of decision-making that, like subjective value signals 7,21–23, is important to a wide range of behaviours and their neural substrates.

METHODS SUMMARY

Male Long-Evans hooded rats were trained to perform an odour categorization task for water reward. Behavioural testing was controlled by custom software written in Matlab (Mathworks) using data acquisition hardware (National Instruments) to record the port signals and control the valves of the olfactometer and water-delivery 15. Rats were implanted with custom-made microdrives in the left orbitofrontal cortex (3.5 mm anterior to bregma and 2.5 mm lateral to midline). Extracellular recordings were obtained with six independently movable tetrodes using the Cheetah system (Neuralynx) and single units were isolated by manually clustering spike features with MClust (A. D. Redish). We focused our analysis on the reward anticipation period while rats remained at one of the choice ports. This excluded spikes that occurred during or after water valve actuation on correct trials; on error trials, no feedback was present.
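The outcome preference index reported in Fig. 3e, f is an ROC-based measure, and a minimal sketch of one way to compute such an index is given here. It is an illustration rather than the authors' implementation: the function names, the half-weight given to tied firing rates, the label-shuffling scheme and the default number of shuffles are assumptions made for this example.

```python
import numpy as np

def outcome_preference(rates_correct, rates_error):
    """Return 2 * (ROC area - 0.5): +1 if every correct-trial rate exceeds
    every error-trial rate, -1 for the reverse, 0 if indistinguishable."""
    c = np.asarray(rates_correct, dtype=float)[:, None]
    e = np.asarray(rates_error, dtype=float)[None, :]
    # ROC area as the probability that a correct-trial rate exceeds an
    # error-trial rate; ties count one half (an assumption for discrete data).
    roc_area = (c > e).mean() + 0.5 * (c == e).mean()
    return 2.0 * (roc_area - 0.5)

def permutation_p_value(rates_correct, rates_error, n_shuffles=1000, seed=0):
    """Two-sided P value from shuffling outcome labels across trials."""
    rng = np.random.default_rng(seed)
    observed = abs(outcome_preference(rates_correct, rates_error))
    pooled = np.concatenate([rates_correct, rates_error]).astype(float)
    n_correct = len(rates_correct)
    exceed = 0
    for _ in range(n_shuffles):
        shuffled = rng.permutation(pooled)
        op = outcome_preference(shuffled[:n_correct], shuffled[n_correct:])
        if abs(op) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_shuffles + 1)
```

A neuron that fires more on error trials yields a negative index under this sketch, matching the sign convention of negative outcome selectivity used in the text.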
To determine how well neurl ctivity predicted the upcoming outcome (rewrd/no rewrd), we used receiver operting chrcteristics (ROC) nlysis to clculte n outcome preference index (OP) tht mesures how well n idel oserver cn predict the outcome from the knowledge of the firing rte from tril to tril. This index vries from 2 to with the sign denoting whether neuron fires more for rewrded (correct, ) or unrewrded (error, 2) decisions: OP~2(ROC re {:5); ROC re ~ ð? P(f correct ~f )P(f error vf )df where fcorrect nd f error refer to the distriution of firing rtes during the rewrd nticiption period in correct nd error trils respectively. Sttisticl significnce ws evluted using permuttion test, where tril order ws pseudo-rndomly shuffled 2 times to yield P vlue. All procedures involving nimls were crried out in ccordnce with Ntionl Institutes of Helth stndrds nd were pproved y the Cold Spring Hror Lortory Institutionl Animl Cre nd Use Committee. Full Methods nd ny ssocited references re ville in the online version of the pper t www.nture.com/nture. Received 28 Ferury; ccepted 26 June 28. Pulished online August 28. 28 Mcmilln Pulishers Limited. All rights reserved. Khnemn, D., Slovic, P. & Tversky, A. Judgment under Uncertinty: Heuristics nd Bises (Cmridge Univ. Press, 982). 2. Glimcher, P. W. Decisions, Uncertinty, nd the Brin: The Science of Neuroeconomics (MIT Press, 23). 3. Kim, J. N. & Shdlen, M. N. Neurl correltes of decision in the dorsolterl prefrontl cortex of the mcque. Nture Neurosci. 2, 76 85 (999).

4. Bogacz, R. et al. The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced-choice tasks. Psychol. Rev. 113, 700–765 (2006).
5. Mazurek, M. E., Roitman, J. D., Ditterich, J. & Shadlen, M. N. A role for neural integrators in perceptual decision making. Cereb. Cortex 13, 1257–1269 (2003).
6. Ratcliff, R. & Smith, P. L. A comparison of sequential sampling models for two-choice reaction time. Psychol. Rev. 111, 333–367 (2004).
7. Schoenbaum, G., Chiba, A. A. & Gallagher, M. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nature Neurosci. 1, 155–159 (1998).
8. Tremblay, L. & Schultz, W. Relative reward preference in primate orbitofrontal cortex. Nature 398, 704–708 (1999).
9. Padoa-Schioppa, C. & Assad, J. A. Neurons in the orbitofrontal cortex encode economic value. Nature 441, 223–226 (2006).
10. Wallis, J. D. Orbitofrontal cortex and its contribution to decision-making. Annu. Rev. Neurosci. 30, 31–56 (2007).
11. Smith, J. D., Shields, W. E. & Washburn, D. A. The comparative psychology of uncertainty monitoring and metacognition. Behav. Brain Sci. 26, 317–339, 340–373 (2003).
12. Foote, A. L. & Crystal, J. D. Metacognition in the rat. Curr. Biol. 17, 551–555 (2007).
13. Persaud, N., McLeod, P. & Cowey, A. Post-decision wagering objectively measures awareness. Nature Neurosci. 10, 257–261 (2007).
14. Koch, C. & Preuschoff, K. Betting the house on consciousness. Nature Neurosci. 10, 140–141 (2007).
15. Uchida, N. & Mainen, Z. F. Speed and accuracy of olfactory discrimination in the rat. Nature Neurosci. 6, 1224–1229 (2003).
16. Bechara, A., Damasio, H., Tranel, D. & Damasio, A. R. Deciding advantageously before knowing the advantageous strategy. Science 275, 1293–1295 (1997).
17. Critchley, H. D., Mathias, C. J. & Dolan, R. J. Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron 29, 537–545 (2001).
18. Grinband, J., Hirsch, J. & Ferrera, V. P. A neural representation of categorization uncertainty in the human brain. Neuron 49, 757–763 (2006).
19. Hsu, M. et al. Neural systems responding to degrees of uncertainty in human decision-making. Science 310, 1680–1683 (2005).
20. Tobler, P. N., O'Doherty, J. P., Dolan, R. J. & Schultz, W. Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. J. Neurophysiol. 97, 1621–1632 (2007).
21. Barraclough, D. J., Conroy, M. L. & Lee, D. Prefrontal cortex and decision making in a mixed-strategy game. Nature Neurosci. 7, 404–410 (2004).
22. Sugrue, L. P., Corrado, G. S. & Newsome, W. T. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004).
23. Lau, B. & Glimcher, P. W. Dynamic response-by-response models of matching behavior in rhesus monkeys. J. Exp. Anal. Behav. 84, 555–579 (2005).
24. Stephens, D. W. & Krebs, J. R. Foraging Theory (Princeton Univ. Press, 1986).
25. Behrens, T. E., Woolrich, M. W., Walton, M. E. & Rushworth, M. F. Learning the value of information in an uncertain world. Nature Neurosci. 10, 1214–1221 (2007).
26. Yu, A. J. & Dayan, P. Uncertainty, neuromodulation, and attention. Neuron 46, 681–692 (2005).
27. Dayan, P., Kakade, S. & Montague, P. R. Learning and selective attention. Nature Neurosci. 3 (Suppl), 1218–1223 (2000).
28. Luck, S. J., Hillyard, S. A., Mouloua, M. & Hawkins, H. L. Mechanisms of visual-spatial attention: Resource allocation or uncertainty reduction? J. Exp. Psychol. Hum. Percept. Perform. 22, 725–737 (1996).
29. Knill, D. C. & Pouget, A. The Bayesian brain: The role of uncertainty in neural coding and computation. Trends Neurosci. 27, 712–719 (2004).
30. Hampton, R. R. Rhesus monkeys know when they remember. Proc. Natl Acad. Sci. USA 98, 5359–5362 (2001).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank J. Paton, A. Pouget, S. Raghavachari, G. Turner and members of the Mainen laboratory for comments on the manuscript. Support was provided by the National Institutes of Health (NIDCD) (Z.F.M.), the Center for the Neural Mechanisms of Cognition at Cold Spring Harbor Laboratory (Z.F.M.), and the Swartz Foundation (A.K., N.U., Z.F.M.).

Author Information Reprints and permissions information is available at www.nature.com/reprints. Correspondence and requests for materials should be addressed to A.K. (kepecs@cshl.edu) or Z.F.M. (zmainen@igc.gulbenkian.pt).

doi:10.1038/nature07200

METHODS
Here we describe the behavioural and physiological methods used in this study and explain the analyses presented in the main text.

Behavioural task. The behavioural box contains a panel of three ports: the central port for odour delivery ('odour port'), and two ports on each side ('choice ports') for water delivery (Fig. 1). Entry and exit from the ports was detected based on an infrared photo-beam located inside each port. Odours were mixed with pure air to produce a 1:20 dilution at a flow rate of 1 l min−1 using a custom-built olfactometer 15. Rats self-initiated each experimental trial by introducing their snout into a central port where odour was delivered (Fig. 1). After a variable delay, drawn from a uniform random distribution of 0.2–0.5 s, a binary mixture of two pure odorants, caproic acid and 1-hexanol, was delivered at one of 4–6 concentration ratios (100/0, 68/32, 56/44, 44/56, 32/68, 0/100; Fig. 1) in pseudorandom order within a session. After a variable odour sampling time up to 1 s, rats responded by withdrawing from the central port, which terminated the delivery of odour, and moved to the left or right choice port (Fig. 1). Choices were rewarded according to the dominant component of the mixture, that is, at the left port for mixtures A/B < 50/50 and at the right port for A/B > 50/50 (Fig. 1). We introduced a variable reward delay period after entry into the choice port. For correct choices, reward was delivered at least 0.3 s after entry into the choice port and sometimes up to 2 s (in individual sessions the delays were uniformly distributed with the onset ranging from 0.3–0.8 s and the offset to 2 s). Outcome selectivity calculations used firing rates calculated over the first 0.4 s of the reward anticipation period. In a few sessions the reward anticipation was 0.3 s (e.g. Fig. 2c, d); in those sessions the entire reward anticipation period was used. This task allowed us to control the distance of each stimulus to the category boundary and hence systematically manipulate the difficulty of individual categorization problems (Fig. 1d).
Intuitively, this task is analogous to categorizing colours along a continuous spectrum (for example, blue/green, Fig. 1). For colour blends in the middle, the answer depends on a semi-arbitrary convention of colour category boundaries. Similarly, our training protocol enforced the 50/50 odour category boundary, which is semi-arbitrary, as the pure odours do not have equal intensity.

Reinitiation task. In this version of the task, the delay to reward was increased to between 2 and 8 s (uniform random distribution). Errors were signalled with an auditory beep at 8 s and punished with an additional 4 s time-out. After a 2 s mandatory wait from the entry into a choice port and before water or auditory feedback was provided, subjects were allowed to abort trials by exiting the water port. Entry into the odour port within 2 s of aborting was considered as a reinitiation. The stimulus ensemble consisted of 75% easy (95/5, 80/20 mixtures: 92 ± 4% accuracy, s.e.m. across rats) and 25% difficult (53/47, 51/49 mixtures: 55 ± 2% accuracy) stimuli so that rats could expect to encounter an easier stimulus after reinitiating a new trial. The expectation of a rat to receive reward by staying at the choice port should be proportional to its confidence about the first choice (Fig. 4d) while the expectation to receive reward by reinitiating a new trial should be fixed (because the new stimulus is not predictable). Therefore the relative value of reinitiating is predicted to increase as confidence drops, with approximately the same dependence on stimulus and outcome as given by the model (Fig. 4c). The exact value depends on the actual delays and the subject's temporal discounting function.

Neural data collection and analysis. Rats were implanted with custom-made microdrives in the left orbitofrontal cortex (3.5 mm anterior to bregma and 2.5 mm lateral to midline) as described previously 31 (Supplementary Fig. 1). Extracellular recordings were obtained using six independently adjustable tetrodes. Electrodes were advanced each recording day to sample an independent population of cells across sessions.
The placement of electrodes was estimated by depth and confirmed with histology. Neural and behavioural data were synchronized by acquiring time-stamps from the behavioural system along with the electrophysiological signals. Data analysis was performed using Matlab (Mathworks). For Fig. 2e, f, confidence-modulated neurons were selected by performing a non-parametric Wilcoxon signed-rank test on firing rates during the reward anticipation period for correct versus error trials. Neurons with significant (P < 0.05) firing rate differences were separated into two populations based on whether their mean firing rate was higher for correct or error trials. We then plotted the maximum normalized firing rate averaged for each neural population as a function of stimulus mixture ratio. We used this selection criterion because, by not using information about the stimulus, it does not impose a specific shape on the tuning curves. Other selection criteria, such as significant rate–accuracy correlations (for example, Fig. 2d), yielded similar results.

Multiple linear regression analysis. We considered the possibility that a prediction of upcoming trial outcome might be made on the basis of recent reward history 32–35 and other observable task variables. For example, if the average performance fluctuated due to changes in attention or motivation and OFC neurons tracked the recent history of trial outcomes, it could lead to a differential prediction of correct versus error trials when averaged over the entire session. In this scenario, outcome selectivity would arise because the present trial's expected outcome is correlated with the recent trials' outcomes. Although we did not observe prominent performance fluctuations, we wanted to test this and related possibilities directly. We used multiple linear regression in an attempt to predict the firing rate of a given trial based on the history of recent reward outcomes and experimental variables (stimulus type and choice direction).
Specifically we fitted the firing rates during the reward anticipation period to the following model:

$$\mathrm{RATE}_{t=0} = \beta_1 S_{t=0} + \beta_2 C_{t=0} - \sum_{k=0}^{-3} \beta^{L}_{t=k}\, O^{L}_{t=k} - \sum_{k=0}^{-3} \beta^{R}_{t=k}\, O^{R}_{t=k} + c$$

where $S_{t=0}$ represents the stimulus difficulty of the current trial ($t = 0$), which is assumed to be learned through long-term experience with a given stimulus; $C_{t=0}$ represents the choice of sides (left or right, L or R) in the current trial, which is known to influence the firing rate of OFC neurons 31,36. The variable $O^{\mathrm{SIDE}}_{t=k}$ represents outcomes of the current trial and past three trials ($t = -1, -2, -3$), separated according to the side where the reward was received, again to account for the known selectivity of rodent OFC neurons 31,36. The coefficients $\beta_1$ and $\beta_2$ measure the influence of the stimulus difficulty and the choice, $\beta^{L}_{t=k}$ and $\beta^{R}_{t=k}$ measure the influence of current and past trial outcomes, and $c$ captures the mean rate not accounted for by other variables. The model was fitted using a least-square error criterion with singular value decomposition (SVD). In some cases the problems were ill-conditioned and therefore we also tried ridge regression to obtain more stable solutions. For this analysis, the optimal regularization parameter was chosen by generalized cross-validation 37. The results of both analyses essentially agreed and therefore we report the results from SVD estimated regression models. The statistical significance of regression coefficients was determined using a permutation test by pseudo-randomly shuffling trial order for the variable of interest 38. The data were shuffled 1,000 times to yield a P value for the permutation test. Supplementary Fig. 5 shows the coefficients of this model fit to the neuron shown in Fig. 3. Error bars show standard deviations estimated using leave-one-out bootstrap 37 and filled circles show significant values at P < 0.05 based on a permutation test. This neuron had significant selectivity for the upcoming outcome, $\beta^{L,R}_{t=0}$, for both choice sides, as well as for the previous outcome, $\beta^{L,R}_{t=-1}$, to a much smaller degree, while the influence of past outcomes, $\beta^{L,R}_{t=-2,-3}$, was not significant.
Leaving out all past outcomes, $\beta^{L,R}_{t=-1,-2,-3} = 0$, did not significantly increase the prediction error (P < 0.05, permutation test). This analysis was repeated on the population of 33 neurons (Fig. 4g, h) that were deemed to be negative outcome selective (pooling trials across all stimuli) based on ROC analysis at P < 0.05. Supplementary Fig. 5 shows the number of neurons (grey bars) and the mean value of significant regression coefficients (circles, P > 0.05). Overall, 2 neurons had significant $\beta^{L,R}_{t=0}$ coefficients for the current outcome and 7 neurons had significant $\beta^{L,R}_{t=-1}$ coefficients for the outcome of the previous trial for at least one side. Only four neurons carried past outcome information for at least one side for all three trials back. Comparison of the average value of the significant coefficients for current and past trial outcomes (Supplementary Fig. 5, circles) shows that even when past trial outcomes had significant coefficients the average value of their weights was only half those for the current trial. We also performed an analysis to test whether including the history of recent outcomes improves the model fit. To do this, we compared the full model to one in which the coefficients $\beta^{L,R}_{t=-1,-2,-3}$ were set to zero and used a permutation test to compare the mean prediction errors for the full and reduced model. To obtain a conservative estimate (that is, allow the best chance for inclusion of history terms to increase performance) we did not compensate for the increased complexity of the full model. This analysis showed that for only 2 of 6 neurons did the inclusion of past outcome information, $\beta^{L,R}_{t=-1,-2,-3}$, significantly reduce the prediction error (P < 0.05, permutation test). Moreover, the reduction in error was small, with an average < 3% improvement for the full compared to the reduced (current-trial-only) model. In summary, we conclude that although a subset of OFC neurons do carry information about past outcomes, past trial events account for a relatively small fraction of the firing rate variance compared to what can be explained by the anticipated current trial outcome.
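The structure of the regression described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab code: it solves the (optionally ridge-penalised) normal equations by Gauss-Jordan elimination instead of SVD, codes outcomes as 1 (rewarded) and 0 (unrewarded), codes the choice as 0 (left) or 1 (right), and lets the fitted weights absorb the signs of the outcome terms. The function names `solve_linear`, `fit_rates`, `design_row` and `simulate_session` are hypothetical.

```python
import random


def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_rates(rows, rates, ridge=0.0):
    """Least-squares weights for rates ~ rows; ridge > 0 penalises every weight (assumption)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, rates)) for i in range(p)]
    return solve_linear(xtx, xty)


def design_row(difficulty, sides, outcomes, t, n_back=3):
    """Regressors for trial t: S, C, O^L at lags 0..n_back, O^R at lags 0..n_back, constant."""
    row = [difficulty[t], float(sides[t])]      # S_{t=0} and C_{t=0} (0 = left, 1 = right)
    for side in (0, 1):
        for k in range(n_back + 1):
            same_side = sides[t - k] == side
            row.append(outcomes[t - k] if same_side else 0.0)
    row.append(1.0)                             # constant term c
    return row


def simulate_session(n_trials, weights, seed=0):
    """Hypothetical noise-free session, used only to check that the fit recovers known weights."""
    rng = random.Random(seed)
    difficulty = [rng.choice([0.0, 0.5, 1.0]) for _ in range(n_trials)]
    sides = [rng.randint(0, 1) for _ in range(n_trials)]
    outcomes = [1.0 if rng.random() < 0.7 else 0.0 for _ in range(n_trials)]
    rows = [design_row(difficulty, sides, outcomes, t) for t in range(3, n_trials)]
    rates = [sum(w * x for w, x in zip(weights, r)) for r in rows]
    return rows, rates
```

Setting the six history weights to zero and refitting gives the reduced (current-trial-only) model, so the two prediction errors can be compared as in the text.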
Outcome selectivity analysis. Orbitofrontal cortex is known to signal outcome expectations 39–42, and an apparent prediction of outcome might arise from a combination of stimulus and side selectivity. If firing rates encoded the stimulus difficulty (Fig. 2) and in addition were modulated by the choice side 31,35 one

would expect (1) outcome preference would be inverted across choice sides, and (2) outcome selectivity would be equal or weaker for easier compared to more difficult stimuli. A cartoon of this scenario is shown in Supplementary Fig. 3a, with both an additive and multiplicative component to the choice side modulation. In contrast, the uncertainty model makes the opposite predictions (Supplementary Fig. 3b and Fig. 4d). Although the average tuning curves for negative outcome selective neurons are similar to what is expected for a representation of uncertainty (Fig. 4g), we wanted to test these predictions on a neuron-by-neuron basis. We used the outcome preference index (OP) to measure whether the firing rates are higher or lower for error trials, and the unsigned version of this measure, the outcome selectivity index (OS = |OP|), to measure how strongly firing rates signal different outcomes. These measures are based on signal detection theory and quantify the difference between the firing rates for error and correct trials (see Methods Summary for details). Statistical significance was estimated using a 200-fold permutation test 43 at P < 0.05. Note that for these analyses trials had to be subdivided according to several stimulus types and for many neurons there were few error trials available to reliably compare conditions. An insufficient number of error trials can result in either spurious selectivity values due to noise and/or low significance values. First we tested whether the direction of outcome preference was concordant across sides (that is, regular arrows in Supplementary Fig. 3a, b). We used 3 out of 563 neurons for which there were more than 5 error trials for each of 32/68 and 68/32 stimuli. From these neurons 6 showed outcome selectivity across all stimuli, but only 9 were significantly selective for both 32/68 and 68/32 mixtures when considered separately. 85% (6/9) of neurons had concordant outcome preference values, and the preference values were significantly correlated across sides (r² = 0.66, P < 0.05; Supplementary Fig. 3c).
Next we tested whether outcome selectivity was stronger for easy stimuli (32/68 and 68/32 mixtures) compared to more difficult ones (44/56 and 56/44 mixtures; see dashed arrows in Supplementary Fig. 3a, b). Out of 37 neurons with 56/44 trials, 3 were selective across all stimuli but only 23 were significant for both easy and difficult mixtures when considered separately. For 91% (21/23) of these neurons, outcome selectivity was stronger for easier stimuli (Supplementary Fig. 3d). These analyses support the uncertainty model (Supplementary Fig. 3b) and are not consistent with the hypothesis that choice side-modulation of stimulus encoding neurons produces an apparent outcome selectivity (Supplementary Fig. 3a).

Next we conducted an additional analysis to show how well individual neurons conform to the firing patterns expected for decision confidence across the entire recorded OFC population. We used OP to measure whether the firing rates are higher or lower for error versus correct trials across 32/68, 44/56, 56/44 and 68/32 stimuli. In addition, we calculated a stimulus difficulty selectivity index (DI) to measure whether firing rates are higher or lower for correct choices in difficult trials (32/68, 44/56, 56/44 and 68/32 stimuli) compared to easy trials (100/0 and 0/100 stimuli). Again, both measures are derived from the area under the ROC from signal detection theory and statistical significance was estimated using a 200-fold permutation test at P < 0.05. Supplementary Fig. 8 shows DI as a function of OP across the entire population. Out of 563 neurons, 83 were significant for both measures, 85 for OP alone, 5 for DI alone and 29 were not significant at P < 0.05. The selectivity measures were correlated (CC = 0.75 at P < 0.05) across the entire population. This analysis shows that across the population without any preselection there is a good correlation between outcome preference (selectivity for correct/error choices) and stimulus difficulty preference (selectivity for more/less difficult stimuli) as expected for a decision confidence signal.
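The ROC-based indices used in these analyses can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the ROC area is estimated by comparing every correct-trial rate with every error-trial rate, ties are counted as one half (the Methods Summary formula does not say how ties are handled), and the permutation test shuffles the correct/error labels and compares |OP| against the shuffled values. The function names are hypothetical.

```python
import random


def roc_area(correct_rates, error_rates):
    """Estimate P(f_error < f_correct) over all trial pairs; ties count 1/2 (assumption)."""
    wins = 0.0
    for fc in correct_rates:
        for fe in error_rates:
            if fe < fc:
                wins += 1.0
            elif fe == fc:
                wins += 0.5
    return wins / (len(correct_rates) * len(error_rates))


def outcome_preference(correct_rates, error_rates):
    """OP = 2 (ROC_area - 0.5): +1 if rates are always higher on correct trials, -1 on error trials."""
    return 2.0 * (roc_area(correct_rates, error_rates) - 0.5)


def outcome_selectivity(correct_rates, error_rates):
    """OS = |OP|, the unsigned strength of outcome selectivity."""
    return abs(outcome_preference(correct_rates, error_rates))


def permutation_p_value(correct_rates, error_rates, n_shuffles=200, rng=None):
    """Fraction of label shuffles whose |OP| is at least as large as the observed |OP|."""
    rng = rng or random.Random(0)
    observed = outcome_selectivity(correct_rates, error_rates)
    pooled = list(correct_rates) + list(error_rates)
    n_correct = len(correct_rates)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if outcome_selectivity(pooled[:n_correct], pooled[n_correct:]) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)    # add-one correction keeps p above zero
```

The same `roc_area` function could serve for a difficulty index by passing firing rates from difficult and easy correct trials instead of correct and error trials.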
Interpretation of negative outcome selectivity: error signal or uncertainty? The observed selectivity of neural activity for the upcoming outcome might arise if, after executing a choice, extra sensory or memory information enters decision-making circuits and causes the realization that an error occurred even before obtaining feedback. According to this interpretation the negative outcome selective population of OFC neurons would signal error 44 instead of uncertainty. In contrast, the highest observed firing rates were associated with near chance level performance and not errors (Fig. 4g, f). To test this more rigorously, we asked whether an ideal observer could obtain better performance than the experimental subject if it could switch choices based on the firing rate after the choice and before feedback is provided. In all but one negative outcome selective neuron (1/33), the highest firing rates (top 5% of trials) were associated with chance level performance (within the 95% confidence interval). Therefore negative outcome selectivity does not imply that OFC neurons are actually able to predict error trials but rather that high firing rates predict near chance level performance consistent with an uncertainty signal.

Confidence model. We model the stimulus as the log ratio of the odour mixture with additive Gaussian noise:

$$s_i = \log\frac{[A]}{[B]} + \gamma_{\mathrm{stim}}$$

in each trial $i$, where $\gamma_{\mathrm{stim}} \in N(0, \sigma_{\mathrm{stim}})$. The boundary is fixed at 0 with additive noise, $b_i = \gamma_{\mathrm{bound}}$, where $\gamma_{\mathrm{bound}} \in N(0, \sigma_{\mathrm{bound}})$. The choice is computed by comparing stimulus and boundary,

$$\mathrm{choice}_i = \{\mathrm{left} \mid s_i < b_i;\ \mathrm{right} \mid s_i \geq b_i\}$$

The distance between the stimulus and boundary, $d_i = |s_i - b_i|$, provides an estimate of decision confidence. Other distance metrics, such as Euclidian distance, are also suitable. This distance measure can be calibrated and linearized to produce a veridical estimate of outcome probabilities 45. We did not attempt to systematically calibrate confidence but found that sigmoid functions provide a good approximation (see also Supplementary Information).
Therefore we define decision confidence, $\delta_i = f(d_i) = \tanh(d_i)$, and its opposite, decision uncertainty, as $\sigma_i = 1 - \delta_i$. For the simulations in Fig. 4 we chose the stimulus and boundary noise to be equal, $\sigma_{\mathrm{bound}} = \sigma_{\mathrm{stim}} = 0.5$, but we note that the results are dependent only on the total noise (sum of the variances) not their relative contribution (see Supplementary Fig. 7). Therefore, the model has a single effective parameter, $\sigma_{\mathrm{noise}} = \sqrt{\sigma_{\mathrm{bound}}^{2} + \sigma_{\mathrm{stim}}^{2}}$, that determines the slope of the psychometric function, leaving no free parameters with respect to confidence estimates (Supplementary Fig. 7).

31. Feierstein, C. E. et al. Representation of spatial goals in rat orbitofrontal cortex. Neuron 51, 495–507 (2006).
32. Barraclough, D. J., Conroy, M. L. & Lee, D. Prefrontal cortex and decision making in a mixed-strategy game. Nature Neurosci. 7, 404–410 (2004).
33. Sugrue, L. P., Corrado, G. S. & Newsome, W. T. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004).
34. Lau, B. & Glimcher, P. W. Dynamic response-by-response models of matching behavior in rhesus monkeys. J. Exp. Anal. Behav. 84, 555–579 (2005).
35. Daw, N. D., O'Doherty, J. P., Dayan, P., Seymour, B. & Dolan, R. J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).
36. Roesch, M. R., Taylor, A. R. & Schoenbaum, G. Encoding of time-discounted rewards in orbitofrontal cortex is independent of value representation. Neuron 51, 509–520 (2006).
37. Hansen, P. C. Rank-deficient and Discrete Ill-posed Problems: Numerical Aspects of Linear Inversion (SIAM, 1998).
38. Davison, A. C. & Hinkley, D. V. Bootstrap Methods and Their Application (Cambridge Univ. Press, 1997).
39. Hikosaka, K. & Watanabe, M. Delay activity of orbital and lateral prefrontal neurons of the monkey varying with different rewards. Cereb. Cortex 10, 263–271 (2000).
40. Wallis, J. D. & Miller, E. K. Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task. Eur. J. Neurosci. 18, 2069–2081 (2003).
41. Gottfried, J. A., O'Doherty, J. & Dolan, R. J. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 301, 1104–1107 (2003).
42. Simmons, J. M., Ravel, S., Shidara, M. & Richmond, B. J. A comparison of reward-contingent neuronal activity in monkey orbitofrontal cortex and ventral striatum: guiding actions toward rewards. Ann. NY Acad. Sci. 1121, 376–394 (2007).
43. Efron, B. & Tibshirani, R. An Introduction to the Bootstrap (Chapman & Hall, 1993).
44. Laubach, M., Wessberg, J. & Nicolelis, M. A. Cortical ensemble activity increasingly predicts behaviour outcomes during learning of a motor task. Nature 405, 567–571 (2000).
45. Keren, G. On the calibration of probability judgments: Some critical comments and alternative perspectives. J. Behav. Decis. Making 10, 269–278 (1997).
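The confidence model described in the Methods can be simulated directly. The sketch below is a minimal Python illustration under stated assumptions: the stimulus is the natural log of the mixture ratio, both noise terms are Gaussian with standard deviation 0.5, the left port is rewarded when odour A is below 50%, and the names `simulate_trial` and `summarise` are hypothetical. Pure odours (100/0 and 0/100) are excluded because their log ratio is unbounded.

```python
import math
import random


def simulate_trial(percent_a, rng, sigma_stim=0.5, sigma_bound=0.5):
    """Simulate one trial for a mixture with percent_a % of odour A (0 < percent_a < 100)."""
    # Noisy stimulus: log ratio of the mixture plus Gaussian noise.
    s = math.log(percent_a / (100.0 - percent_a)) + rng.gauss(0.0, sigma_stim)
    # Noisy category boundary centred on 0 (the 50/50 mixture).
    b = rng.gauss(0.0, sigma_bound)
    choice = "left" if s < b else "right"
    rewarded_side = "left" if percent_a < 50 else "right"
    confidence = math.tanh(abs(s - b))      # delta = tanh(d) with d = |s - b|
    return choice == rewarded_side, confidence


def summarise(percent_a, n_trials=20000, seed=0):
    """Return (accuracy, mean confidence on correct trials, mean confidence on error trials)."""
    rng = random.Random(seed)
    correct_conf, error_conf = [], []
    for _ in range(n_trials):
        correct, confidence = simulate_trial(percent_a, rng)
        (correct_conf if correct else error_conf).append(confidence)
    accuracy = len(correct_conf) / n_trials
    mean_correct = sum(correct_conf) / len(correct_conf)
    mean_error = sum(error_conf) / len(error_conf)
    return accuracy, mean_correct, mean_error
```

Calling `summarise(68)` and `summarise(56)` compares an easier and a harder mixture. Under these assumptions, mean confidence on error trials is lower for the easier mixture, which is the pattern the uncertainty model predicts for error trials.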

Neural correlates, computation and behavioural impact of decision confidence
Kepecs A., Uchida N., Zariwala H. and Mainen Z.F.

Confidence estimates in integrator models of decision-making

Computing decision confidence requires that, in addition to a binary choice, the decision process yields a graded value measuring the reliability and consistency of the internal variables contributing to a decision. This can be achieved for other classes of models including models based on the integration of evidence that are able to also account for other features of behaviour, such as reaction times 1–3. To demonstrate the generality of the model predictions (Fig. 4c,d) we simulated a version of the integrator model, the race model. In race models, separate decision variables accumulate evidence for different options and the decision taken is determined by which decision variable reaches a threshold first (Supplementary Fig. 6). To simulate this, in each trial the stimulus is a normally distributed random variable $s(t) \sim N(\mu_{\mathrm{stim}}, \sigma_{\mathrm{stim}})$, where the sign of $\mu_{\mathrm{stim}}$ sets the direction of correct choice and the signal-to-noise ratio $\mu_{\mathrm{stim}}/\sigma_{\mathrm{stim}}$ sets the difficulty of discrimination. In the simplest version of the race model there are two independent decision variables that accumulate evidence for and against the hypothesis that $\mu_{\mathrm{stim}} > 0$. Each decision variable, $e(t)$, accumulates evidence for one direction. When one of the decision variables, $e^{+}(t)$ or $e^{-}(t)$, reaches a predetermined threshold, $\theta$, the race is terminated and a decision is generated in favour of the decision variable crossing threshold first. Therefore at decision time, $t_\theta$, $e^{+}(t_\theta) = \theta$ or $e^{-}(t_\theta) = \theta$. We simulated this model with the following parameters: $\mu_{\mathrm{stim}} \sim U(-0.2, 0.2)$, $\sigma_{\mathrm{stim}} =$ , $\theta =$ , and $dt =$ . Supplementary Fig. 6c shows the fraction of choices in favour of the + hypothesis, $\mu_{\mathrm{stim}} > 0$, as a function of stimulus, $\mu_{\mathrm{stim}}$. This psychometric curve is qualitatively similar to that of rats (compare to Fig. 1c). An estimate of decision confidence can be computed in a race model by measuring the distance between the two decision variables at the time the race is terminated.
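A race of this kind can be sketched in Python as follows. This is an illustration under loudly stated assumptions, not the authors' simulation: the accumulation equations are not legible in this copy, so each accumulator is assumed to integrate the rectified momentary evidence, and the values `sigma_stim=1.0`, `theta=10.0` and `dt=1.0` are hypothetical placeholders. The function names are also hypothetical.

```python
import random


def race_trial(mu_stim, rng, sigma_stim=1.0, theta=10.0, dt=1.0):
    """Run one race; return (chose_plus, delta_e) with delta_e = |e_plus - e_minus| at threshold."""
    e_plus = 0.0
    e_minus = 0.0
    while e_plus < theta and e_minus < theta:
        s = rng.gauss(mu_stim, sigma_stim)      # momentary evidence s(t)
        e_plus += max(s, 0.0) * dt              # assumed: evidence for mu_stim > 0
        e_minus += max(-s, 0.0) * dt            # assumed: evidence for mu_stim < 0
    # Only one accumulator changes per step, so exactly one has crossed the threshold.
    return e_plus >= theta, abs(e_plus - e_minus)


def summarise(mu_stim, n_trials=4000, seed=0, theta=10.0):
    """Return (accuracy, mean delta_e/theta on correct trials, mean delta_e/theta on errors)."""
    rng = random.Random(seed)
    correct, errors = [], []
    for _ in range(n_trials):
        chose_plus, delta_e = race_trial(mu_stim, rng, theta=theta)
        is_correct = chose_plus == (mu_stim > 0)
        (correct if is_correct else errors).append(delta_e / theta)
    accuracy = len(correct) / n_trials
    return accuracy, sum(correct) / len(correct), sum(errors) / len(errors)
```

With these placeholder values, `summarise(0.2)` returns the accuracy together with the normalized distance between the accumulators on correct and on error trials, which is the quantity the surrounding text uses as a confidence estimate.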
This was originally proposed by Vickers 4, who termed it the balance of evidence. To see that the distance between decision variables can provide a reasonable estimate of confidence, we plot choice accuracy as a function of this distance, $\Delta e = |e^{+}(t_\theta) - e^{-}(t_\theta)|$ (Supplementary Fig. 6d, dashed line). This distance, $\Delta e$, can be normalized, $\Delta e/\theta$, to yield the balance of evidence measure 4. Here instead we sought to compute an estimate, $\delta$, that is closer to the veridical confidence and reflects the actual outcome probability. For a perfectly calibrated or veridical confidence estimate, $\delta$ would correspond to the probability of a correct outcome, from chance level ($\delta = 0$) to perfect ($\delta = 1$) performance. The consideration of the theoretically appropriate calibration method is beyond our scope. For a given signal-to-noise ratio the correct calibration function may be derived by considering the error rate as a function of the decision threshold 1–3. Here we used an approximation, $\delta = f(\Delta e) = 2/(1 + e^{-c(\Delta e/\theta)}) - 1$, with $c = 1/3$, which provides excellent performance across multiple stimuli (see Supplementary Fig. 6). The role of calibration is illustrated in