Spontaneous Cortical Activity Reveals Hallmarks of an Optimal Internal Model of the Environment. Berkes, Orban, Lengyel, Fiser.



Statistically optimal perception and learning: from behavior to neural representations. Fiser, Berkes, Orban & Lengyel, Trends in Cognitive Sciences (2010).
Spontaneous Cortical Activity Reveals Hallmarks of an Optimal Internal Model of the Environment. Berkes, Orban, Lengyel & Fiser, Science (2011).
Presented by Jonathan Pillow, Computational & Theoretical Neuroscience Journal Club, June 22, 2011.

Background
General question: How does the brain represent probability distributions? (And why would it want to?)
Two basic schemes:
1) Probabilistic Population Coding (PPC) (last week)
2) Sampling Hypothesis (today, building on two weeks ago)

(1) Probabilistic Population Codes (Pouget, Beck, Ma, Latham & colleagues; Ma et al., Nat Neurosci 2006)
Setup: the i-th neuron's response to a (1D) stimulus is determined by its tuning curve, and the stimulus prior is flat. Basic idea: Poisson-like neural noise is consistent with a scheme for representing distributions by a sum of log tuning curves, weighted by neural responses; the posterior is built from these "log TC" terms. This makes it easy to do cue combination, readout, etc.

(2) Generative Models / Sampling Hypothesis (Olshausen & Field 1996, sparse coding model / ICA)
Setup: a sparse prior over neural responses; each neuron's receptive field relates its response linearly to the (high-dimensional) stimulus, defining the encoding distribution and hence a posterior over responses. Olshausen & Field's idea: neurons aim to represent the stimulus S with few spikes (MAP estimation); this achieves sparsity but doesn't represent probability. Berkes, Orban, Lengyel & Fiser (today): neurons represent P(S | r) with samples.
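Under these assumptions (independent Poisson noise, flat prior), the PPC posterior can be read out directly from spike counts via the response-weighted sum of log tuning curves. A minimal sketch, with made-up Gaussian tuning curves and parameters:

```python
import numpy as np

# Sketch of PPC posterior readout (assumptions: Gaussian tuning curves,
# independent Poisson noise, flat prior; all numbers are illustrative).
# p(s | r) is proportional to exp( sum_i r_i * log f_i(s) - sum_i f_i(s) ).
s_grid = np.linspace(-10, 10, 401)           # stimulus values (1D)
centers = np.linspace(-8, 8, 17)             # preferred stimuli of 17 neurons

def tuning(s):
    """Firing rates f_i(s) for all neurons at stimulus s, shape (17,)."""
    return 20.0 * np.exp(-0.5 * ((s - centers) / 2.0) ** 2) + 0.1

rng = np.random.default_rng(0)
s_true = 1.5
r = rng.poisson(tuning(s_true))              # observed spike counts

F = np.stack([tuning(s) for s in s_grid])    # rates, shape (401, 17)
log_post = r @ np.log(F).T - F.sum(axis=1)   # responses weight log tuning curves
log_post -= log_post.max()                   # numerical stability
post = np.exp(log_post)
post /= post.sum()                           # normalized posterior over s_grid
s_hat = s_grid[np.argmax(post)]              # MAP estimate, should be near s_true
```

The key PPC property appears in the `r @ np.log(F).T` line: the spike counts act as weights on fixed log-tuning-curve basis functions, so the posterior is encoded linearly in the responses.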

What does it mean to sample?
Example: say we want to represent the belief that the probability of a reward on this trial is 1/3.
PPC scheme: the neuron fires 33 spikes (out of a max rate of 100).
Sampling scheme: a binary neuron responds with probability 1/3 (or spikes stochastically 1/3 of the time).
Note that response variability has very different interpretations:
PPC: different spike counts represent different distributions.
Sampling: variability is required to represent a distribution; variable responses across trials represent the same distribution. (But note you need many neurons or multiple trials to represent the distribution accurately!)
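A minimal sketch of the sampling scheme for this example (trial counts and seed are arbitrary): a binary neuron spikes with probability 1/3, and only the spike frequency across many trials recovers the represented probability:

```python
import numpy as np

# A binary neuron represents P(reward) = 1/3 by spiking stochastically with
# that probability. Any single trial is just 0 or 1; the represented
# distribution only emerges from the empirical frequency across trials.
rng = np.random.default_rng(1)
p_reward = 1.0 / 3.0

for n_trials in (10, 100, 10000):
    spikes = rng.random(n_trials) < p_reward   # one binary sample per trial
    print(n_trials, spikes.mean())             # estimate sharpens with trials
```

This is the "multiple trials to represent the distribution accurately" point from the slide: the estimate with 10 trials is coarse, while the estimate with 10,000 trials is close to 1/3.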

Other differences: the two schemes propose different semantics of neural responses.
- PPC: a neuron represents a bump in the posterior distribution. Sampling: a neuron represents the presence of a given feature in the image.
- PPC: has nonlinear tuning curves (an extra layer of abstraction). Sparse Coding Model (SCM): projective fields are linearly related to the image being represented.
- PPC: low-dimensional stimulus (e.g., the orientation of a bar). SCM: high-dimensional stimulus (e.g., an image patch).
- PPC: responses are parameters of the represented distribution. Sampling: responses are random variables drawn from the distribution to be represented.

Fiser et al., TICS 2010. Stated goals: explain why the brain should use probabilistic representations, and unify inference and learning.

what probabilistic representations are good for

learning in a probabilistic model: posterior, likelihood, prior
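Written out, the relation on this slide is Bayes' rule (notation assumed here: y for latent features, x for the observed image):

```latex
\underbrace{P(y \mid x)}_{\text{posterior}}
  \;=\;
  \frac{\overbrace{P(x \mid y)}^{\text{likelihood}} \;\overbrace{P(y)}^{\text{prior}}}{P(x)}
```

Inference computes the posterior for a given image; learning adjusts the prior and likelihood so that inference matches the statistics of the environment.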

statistical learning in humans

probabilistic inference and learning with neurons: the sparse coding model. (Figure: neural rates carry a sparse prior; receptive fields link rates to the image; together these define the posterior over rates given the image.)
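One way to make "inference = sampling" concrete is to draw posterior samples in a toy linear-Gaussian sparse coding model. This is an illustrative sketch only (random-walk Metropolis, made-up dimensions and constants), not the circuit dynamics proposed in the papers:

```python
import numpy as np

# Toy sparse coding posterior: p(r | image) proportional to
# N(image; A r, sigma^2 I) * Laplace(r; lam). A, sigma, lam, sizes are made up.
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 4))            # 8-pixel image, 4 feature "neurons"
sigma, lam = 0.5, 1.0
r_true = np.array([2.0, 0.0, 0.0, -1.0])   # sparse causes
image = A @ r_true + sigma * rng.standard_normal(8)

def log_post(r):
    """Unnormalized log posterior: Gaussian likelihood + Laplace (sparse) prior."""
    resid = image - A @ r
    return -resid @ resid / (2 * sigma**2) - lam * np.abs(r).sum()

r = np.zeros(4)
samples = []
for t in range(5000):                      # random-walk Metropolis sampler
    prop = r + 0.2 * rng.standard_normal(4)
    if np.log(rng.random()) < log_post(prop) - log_post(r):
        r = prop                           # accept proposal
    samples.append(r)

samples = np.array(samples[1000:])         # discard burn-in
r_mean = samples.mean(axis=0)              # posterior mean recovers sparse causes
```

On the sampling hypothesis, the trajectory `samples` is what neural activity would look like: each time step is one joint activity pattern, and the full set of visited patterns represents the posterior, including its uncertainty.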

How to represent a 2D distribution: PPC needs 2 populations; Sampling needs just 2 neurons!
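A sketch of the sampling side of this claim: two units whose joint activity over time is drawn from a correlated 2D Gaussian represent the full joint distribution, including the correlation, with no extra neurons (all numbers here are made up):

```python
import numpy as np

# Two "neurons" sample a correlated 2D Gaussian over time. The pair's
# trajectory carries the whole joint distribution: means, variances, and
# the correlation between the two represented features.
rng = np.random.default_rng(3)
mean = np.array([5.0, 3.0])
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])               # target joint over 2 features

samples = rng.multivariate_normal(mean, cov, size=20000)  # (time, 2 neurons)

emp_mean = samples.mean(axis=0)            # recovered means
emp_corr = np.corrcoef(samples.T)[0, 1]    # recovered correlation, near 0.8
```

A PPC, by contrast, would need a separate population per dimension (or a population tiling the 2D space), which is the exponential-scaling point made in Table I below.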

spontaneous activity resembles evoked activity; it is structured and correlated

In darkness, posterior = prior. (They updated the sparse coding model to include a luminance term b.) Therefore, spontaneous activity = sampling from the prior.

Comparison of the two schemes for encoding distributions (Table I: comparing characteristics of the two main modeling approaches to probabilistic neural representations):
- Neurons correspond to: PPCs, parameters; Sampling-based, variables.
- Network dynamics required (beyond the first layer): PPCs, deterministic; Sampling, stochastic (self-consistent).
- Representable distributions: PPCs, must correspond to a particular parametric form; Sampling, can be arbitrary.
- Critical factor in accuracy of encoding a distribution: PPCs, number of neurons; Sampling, time allowed for sampling.
- Instantaneous representation of uncertainty: PPCs, complete (the whole distribution is represented at any time); Sampling, partial (a sequence of samples is required).
- Number of neurons needed for representing multimodal distributions: PPCs, scales exponentially with the number of dimensions; Sampling, scales linearly with the number of dimensions.
- Implementation of learning: PPCs, unknown; Sampling, well-suited.

Berkes et al., Science 2011: an empirical test of the sampling hypothesis.
Main result: spontaneous activity and the average stimulus-evoked activity have the same distribution. The prior (spontaneous activity) should match the evoked activity ("sampling") averaged over the distribution of natural images, i.e., the average evoked activity.
Prediction: if two features co-occur in natural scenes, the spontaneous activity of the corresponding neurons should become correlated.
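The paper quantifies this kind of match by binning multielectrode activity into binary patterns and comparing pattern distributions with KL divergence. A hedged sketch of that style of analysis on toy data (the binning details, regularization, and sizes here are assumptions, not the paper's exact procedure):

```python
import numpy as np
from collections import Counter

# Binarize multi-channel activity into binary "words" per time bin, estimate
# each condition's pattern distribution, and compare with KL divergence.
def pattern_dist(binary_activity):
    """binary_activity: (time_bins, channels) array of 0/1 -> {word: freq}."""
    words = [tuple(row) for row in binary_activity]
    n = len(words)
    return {w: c / n for w, c in Counter(words).items()}

def kl_bits(p, q, eps=1e-6):
    """KL(p || q) in bits; eps regularizes patterns unseen under q."""
    return sum(pw * np.log2(pw / q.get(w, eps)) for w, pw in p.items())

rng = np.random.default_rng(4)
# Toy data: "movie-evoked" (M) and "spontaneous" (S) conditions drawn from
# the same underlying pattern statistics, so their divergence should be ~0.
probs = rng.dirichlet(np.ones(16))                    # over 4-channel words
words = [tuple(int(b) for b in f"{i:04b}") for i in range(16)]
M = np.array([words[i] for i in rng.choice(16, size=50000, p=probs)])
S = np.array([words[i] for i in rng.choice(16, size=50000, p=probs)])

d = kl_bits(pattern_dist(M), pattern_dist(S))         # near 0 for matched stats
```

In the paper's Figure, this divergence (computed between movie-evoked and spontaneous pattern distributions) is large in young animals and shrinks with age; in this toy example the two conditions are matched by construction, so `d` is close to zero.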

(Figure: experimental design. Multiunit recordings during visual stimulation at decreasing contrast, down to no stimulus. Axes: feature 1 (neuron 1) vs. feature 2 (neuron 2). With a stimulus, activity samples the posterior (EA); with no stimulus, posterior = prior (SA).)

(Figure B: developmental logic. Over the course of development, the prior (spontaneous activity, SA) comes to match the average posterior (average evoked activity, aEA) in response to natural images. YOUNG: SA and aEA are DIFFERENT for both natural and artificial stimuli. ADULT: SA and aEA are SIMILAR for natural stimuli but remain DIFFERENT for artificial stimuli. That is, visual experience is required to learn the correct prior. Axes: feature 1 (neuron 1) vs. feature 2 (neuron 2).)

(Figure: data analysis methods for the basic conclusion. A: 16-electrode recordings are binned in 2 ms windows into binary activity patterns; M = movie-evoked, S = spontaneous; pattern frequencies are estimated per condition. B: KL divergence between the M and S pattern distributions (bits/sec) decreases with postnatal age (days 29-30, 44-45, 83-92, 129-151). C: scatter of pattern frequencies, S vs. M, at P29 and at P129; frequencies lie much closer to the diagonal in the older animal.)

(Figure: contributions of spatial and temporal correlations. M = movie-evoked, S = spontaneous, ~ = shuffled. A: comparisons of M vs. ~M and S vs. ~S show more spatial structure with age. B: the M vs. S difference grows for shuffled data. C: temporal correlations, M vs. ~M and S vs. ~S, across time lags (2-1000 ms). D: KL divergence (bits/sec) across postnatal age groups 29-30, 44-45, 83-92, and 129-151 days, with significance markers.)

Also, activity becomes less sparse during development, arguing against the redundancy reduction hypothesis (Barlow). (Fig. S2: sparseness of neural representations over development, for (a) movie-evoked and (b) spontaneous activity, across postnatal ages 29-30, 44-45, 83-92, and 129-151 days.)

Finding is specific to natural stimuli. (Figure: M = movie-evoked, S = spontaneous, N = noise-evoked, G = gratings. A: KL divergence from S (bits/sec) across postnatal ages 29-30, 44-45, 83-92, 129-151 days for N, M, and G; the divergence is smallest for M. B: multidimensional-scaling projection of all neural activity distributions; for young animals (faintest colors), SA was significantly dissimilar from the evoked conditions. C: divergence for N, M, and G across the same age groups.)

Conclusions
Table S2: approaches to neural activity analysis, classified by how they treat evoked activity (stimulus-dependent P(r|y) vs. stimulus-averaged ∫P(r|y)P(y)dy) and spontaneous activity (ignored vs. analyzed as P(r)):
- classical neural coding (e.g., Rieke et al., 1997): information transmission
- Luczak et al., 2009: coding robustness
- Schneidman et al., 2006: error correction
- this paper: probabilistic inference

Conclusions
Sampling is great: an interesting, surprising account of what spontaneous activity means.
Key prediction: spontaneous activity matches average evoked activity, and the match gets better during development (the learning story: the prior is being tuned to the average posterior).
Open question: are there other stories consistent with the fact that the average evoked activity has the same distribution as spontaneous activity?