Group Redundancy Measures Reveal Redundancy Reduction in the Auditory Pathway

Group Redundancy Measures Reveal Redundancy Reduction in the Auditory Pathway

Gal Chechik, Amir Globerson, Naftali Tishby
Institute of Computer Science and Engineering and The Interdisciplinary Center for Neural Computation, Hebrew University of Jerusalem, Israel
ggal@cs.huji.ac.il

Michael J. Anderson, Eric D. Young
Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA

Israel Nelken
Department of Physiology, Hadassah Medical School, and The Interdisciplinary Center for Neural Computation, Hebrew University of Jerusalem, Israel

Abstract

The way groups of auditory neurons interact to code acoustic information is investigated using an information-theoretic approach. By identifying the case of stimulus-conditioned independent neurons, we develop redundancy measures that allow enhanced information estimation for groups of neurons. These measures are then applied to study collaborative coding efficiency in two processing stations of the auditory pathway: the inferior colliculus (IC) and the primary auditory cortex (A1). Under two different coding paradigms, we show differences in both information content and group redundancy between IC and cortical auditory neurons. These results provide, for the first time, direct evidence for redundancy reduction along the ascending auditory pathway, as hypothesized by Barlow (1959). The redundancy effects under the single-spikes coding paradigm are significant only for groups larger than ten cells, and cannot be revealed by standard redundancy measures that use only pairs of cells. Our results suggest that redundancy-reduction transformations are not limited to low-level sensory processing (aimed at reducing redundancy in input statistics) but are applied even at cortical sensory stations.

1 Introduction

How do groups of sensory neurons interact to code information, and how do these interactions change along the ascending sensory pathways? One view is that sensory systems are composed of a series of processing stations, representing more and more complex aspects of sensory inputs. Among the computational principles proposed to govern the mapping between these stations are the transformation to sparse and distributed representations, information maximization, and redundancy reduction [1, 7]. These ideas provide a computational framework for investigating changes in the neural code along the sensory hierarchy. To investigate such changes in practice, one has to develop methods to quantify interactions among groups of neurons, and to compare these measures across the processing stations of a sensory system. Interactions and high-order correlations between neurons have mostly been investigated within single brain areas at the level of pairs of cells, showing both synergistic and redundant interactions [5, 6, 4]. The current study focuses on developing redundancy measures for larger groups of neurons and comparing these measures in different processing stations. To this end we use an information-theoretic approach and identify a case for which redundancy may be measured reliably even for groups of neurons. We then apply our measures to electrophysiological recordings from two auditory stations, the inferior colliculus and the primary auditory cortex, using complex acoustic stimuli that are critical for the investigation of auditory cortical coding [8].

2 Redundancy measures for groups of neurons

To investigate high-order correlations and interactions within groups of neurons, we start by defining information measures for groups of cells and then develop information-redundancy measures for such groups. The properties of these measures are then further discussed for the specific case of stimulus-conditioned independence.
Formally, the statistical dependence between two variables X and Y is quantified in terms of their mutual information (MI) [11, 3]. This well-known quantity, now widely used in the analysis of neural data, is defined by

I(X; Y) = D_{KL}[P(X,Y) \| P(X)P(Y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}    (1)

and measures how far the joint distribution P(X,Y) deviates from the factorization by the marginal distributions P(X)P(Y) (D_{KL} is the Kullback-Leibler divergence [3]). For larger groups of cells, an important generalized measure quantifies the information that several variables provide about each other. This multi-information measure is defined as

I(X_1, \ldots, X_n) = D_{KL}[P(X_1, \ldots, X_n) \| P(X_1) \cdots P(X_n)] = \sum_{x_1, \ldots, x_n} p(x_1, \ldots, x_n) \log \frac{p(x_1, \ldots, x_n)}{p(x_1) \cdots p(x_n)}.    (2)

As with mutual information, the multi-information measures how far the joint distribution deviates from the factorization by the marginals, and is always non-negative. We now turn to developing measures for group redundancy. Consider first the simple

case of a pair of neurons (X_1, X_2) conveying information about the stimulus S, for which the redundancy-synergy index ([2, 4]) is defined by

RS_{pairs}(X_1, X_2, S) = I(X_1, X_2; S) - [I(X_1; S) + I(X_2; S)]    (3)

Intuitively, RS_{pairs} measures the amount of information about the stimulus S gained by observing the joint distribution of X_1 and X_2, compared with observing the two cells independently. In the extreme case where X_1 = X_2, the two cells are completely redundant and provide the same information about the stimulus, yielding RS_{pairs} = I(X_1, X_2; S) - I(X_1; S) - I(X_2; S) = -I(X_1; S), which is always non-positive. On the other hand, positive RS_{pairs} values testify to a synergistic interaction between X_1 and X_2 ([2, 4]).

For larger groups of neurons, two different redundancy-synergy measures should be considered. The first quantifies the residual information obtained from a group of N neurons compared to all its subgroups of size N-1. As with inclusion-exclusion calculations, this measure takes the form of a telescopic sum

RS_{N|N-1} = I(X^N; S) - \sum_{\{X^{N-1}\}} I(X^{N-1}; S) + \cdots + (-1)^{N-1} \sum_{\{X_i\}} I(X_i; S)    (4)

where \{X^k\} denotes all the subgroups of size k out of the N available neurons. This measure may take positive (synergistic), vanishing (independent), or negative (redundant) values. Unfortunately, it involves 2^N information terms, making its calculation unfeasible even for moderate N values^1. A different RS measure quantifies the information embodied in the joint distribution of N neurons compared to that provided by N independent single neurons, and is defined by

RS^1_N = I(X_1, \ldots, X_N; S) - \sum_{i=1}^{N} I(X_i; S)    (5)

Interestingly, this synergy-redundancy measure may be rewritten as the difference between two multi-information terms:

RS^1_N = I(X_1, \ldots, X_N; S) - \sum_{i=1}^{N} I(X_i; S)
       = [H(X_1, \ldots, X_N) - H(X_1, \ldots, X_N \mid S)] - \sum_{i=1}^{N} [H(X_i) - H(X_i \mid S)]
       = I(X_1; \ldots; X_N \mid S) - I(X_1; \ldots; X_N)    (6)

where H(X) = -\sum_x p(x) \log p(x) is the entropy of X^2.
We conclude that the index RS^1_N separates into two terms: the first, I(X_1; \ldots; X_N \mid S), is always non-negative and measures the coding synergy; the second, -I(X_1; \ldots; X_N), is always non-positive and quantifies the redundancy among the group of neurons. The current work focuses on the latter redundancy term, I(X_1; \ldots; X_N).

^1 Our results below suggest that some redundancy effects become significant only for groups larger than 10-15 cells.

^2 When comparing redundancy in different processing stations, one must consider the effects of the baseline information conveyed by each set of neurons. We therefore use the normalized redundancy (compare with [10], p. 315, and [2]), defined by

RS^1_N (normalized) = \left( I(X_1, \ldots, X_N; S) - \sum_{i=1}^{N} I(X_i; S) \right) / I(X_1, \ldots, X_N; S).
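The decomposition in equation 6 can be checked numerically. The sketch below (a toy random joint distribution over two binary neurons and two stimuli, not the paper's data) computes RS^1_N directly from equation 5 and verifies that it equals the difference between the conditional and unconditional multi-information terms:

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a probability array; zero entries are skipped."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def multi_information(joint):
    """I(X_1; ...; X_N) = sum_i H(X_i) - H(X_1, ..., X_N), as in equation (2)."""
    h_joint = entropy(joint.ravel())
    h_marg = sum(
        entropy(joint.sum(axis=tuple(j for j in range(joint.ndim) if j != i)))
        for i in range(joint.ndim)
    )
    return h_marg - h_joint

def rs_index(p_sx):
    """RS^1_N = I(X_1..X_N; S) - sum_i I(X_i; S), for p_sx indexed [s, x1, ..., xN]."""
    p_s = p_sx.reshape(p_sx.shape[0], -1).sum(axis=1)
    p_x = p_sx.sum(axis=0)
    # I(X; S) = H(X) + H(S) - H(X, S)
    i_joint = entropy(p_x.ravel()) + entropy(p_s) - entropy(p_sx.ravel())
    i_singles = 0.0
    for i in range(1, p_sx.ndim):
        other = tuple(j for j in range(1, p_sx.ndim) if j != i)
        p_si = p_sx.sum(axis=other)          # joint distribution of (S, X_i)
        i_singles += entropy(p_si.sum(axis=0)) + entropy(p_s) - entropy(p_si.ravel())
    return i_joint - i_singles

# Toy joint over (S, X1, X2); verify equation (6):
# RS^1_N = I(X1; X2 | S) - I(X1; X2).
rng = np.random.default_rng(0)
p = rng.random((2, 2, 2))
p /= p.sum()
p_s = p.sum(axis=(1, 2))
mi_cond = sum(p_s[s] * multi_information(p[s] / p_s[s]) for s in range(2))
mi_uncond = multi_information(p.sum(axis=0))
assert np.isclose(rs_index(p), mi_cond - mi_uncond)
```

For stimulus-conditioned independent neurons the first term vanishes, so rs_index reduces to -I(X_1; ...; X_N) and can only be non-positive, as the text argues.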

The formulation of RS^1_N in equation 6 proves highly useful in the case where neural activities are independent given the stimulus, P(\vec{X} \mid S) = \prod_{i=1}^{N} P(X_i \mid S). In such a scenario, the first (synergy) term vanishes, limiting neural interactions to the redundant regime. More importantly, under the independence assumption we only have to estimate the marginal distributions P(X_i \mid S = s) for each stimulus s, instead of the full joint distribution P(\vec{X} \mid S = s). This requires estimating an exponentially smaller number of parameters, which, given our small sample sizes, yields more accurate information estimates. The approximation hence allows us to investigate redundancy among considerably larger groups of neurons than the two or three neurons considered in the literature. How reasonable is the conditional-independence approximation? It is fully accurate in the case of non-simultaneous recordings, which is indeed the case for our data. The approximation is also reasonable whenever neuronal activity is mostly determined by the presented stimulus and to a lesser extent by interactions with nearby neurons, although the experimental evidence in this regard is mixed (see e.g. [6]). To summarize, the stimulus-conditioned independence assumption limits us to interactions in the redundant regime, but allows us to compare the extent of redundancy among large groups of cells in different brain areas.

3 Experimental Methods

To investigate redundancy in the auditory pathway, we analyze extracellular recordings from two brain areas of halothane-anesthetized cats: 16 cells from the inferior colliculus (IC), the third processing station of the ascending auditory pathway, and 19 cells from the primary auditory cortex (A1), the fifth station. Neural activity was recorded non-simultaneously from a total of 6 different animals responding to a fixed set of stimuli.
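The parameter saving from the conditional-independence factorization described above can be sketched as follows. The spike-count data, binning, and helper names here are illustrative assumptions, not the paper's actual estimator; the point is that only n_cells * n_stimuli * n_bins numbers are fitted, rather than n_bins**n_cells:

```python
import numpy as np

def conditional_marginals(counts, n_bins):
    """Estimate P(X_i = b | S = s) for one cell from spike counts of shape
    (n_stimuli, n_trials), with counts clipped into n_bins bins."""
    n_stimuli = counts.shape[0]
    p = np.zeros((n_stimuli, n_bins))
    for s in range(n_stimuli):
        binned = np.clip(counts[s], 0, n_bins - 1)
        p[s] = np.bincount(binned, minlength=n_bins) / counts.shape[1]
    return p

def factorized_joint(marginals, p_s):
    """Assemble P(S, X_1, ..., X_N) = P(S) * prod_i P(X_i | S):
    exponentially fewer parameters than the full joint distribution."""
    joint = p_s.copy()
    for m in marginals:
        extra = joint.ndim - 1               # cell axes already included
        m_b = m.reshape(m.shape[0], *([1] * extra), m.shape[1])
        joint = joint[..., None] * m_b
    return joint

# Toy data: 3 cells, 4 equiprobable stimuli, 50 trials each (hypothetical
# Poisson responses standing in for recorded spike counts).
rng = np.random.default_rng(1)
lam = rng.uniform(1.0, 4.0, size=(3, 4, 1))
counts = rng.poisson(lam, size=(3, 4, 50))
n_bins = 6
marginals = [conditional_marginals(counts[i], n_bins) for i in range(3)]
p_s = np.full(4, 0.25)
joint = factorized_joint(marginals, p_s)
assert joint.shape == (4, n_bins, n_bins, n_bins)
```

Because the cells were recorded non-simultaneously, this product form is exactly what the data can support: the full joint across cells was never observed.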
Because cortical auditory neurons respond considerably differently to simplified and to complex stimuli [8], we refrain from using artificial, over-simplified acoustic stimuli and instead use a set of stimuli based on bird vocalizations, which contain complex real-life acoustic features. A representative example is shown in Figure 1.

Figure 1: A representative stimulus containing a short bird vocalization recorded in a natural environment. The set of stimuli consisted of similar natural and modified recordings. A. Signal in the time domain. B. Signal in the frequency domain.

4 Experimental Results

To estimate the information content of neural activity in practice, one must assume some representation of neural activity and sensory inputs. In this paper we consider two extreme cases: coding acoustics with single spikes, and coding stimulus identity with spike counts.

4.1 Coding acoustics with single spikes

This section focuses on the relation between single spikes and the short windows of the acoustic stimulus that shortly precede them (which we denote as frames). As the set of possible frames is very large and no frame actually repeats itself, we must first pre-process the stimuli to reduce frame dimensionality. To this end, we first transform the stimuli into the frequency domain (roughly approximating the cochlear transformation) and then extract overlapping windows of 50 milliseconds length with 1 millisecond spacing. This set is clustered into 32 representatives, which capture the different acoustic features in our stimulus set. This representation allows us to estimate the joint distribution of cell activity and stimuli (under the stimulus-conditioned independence assumption) for groups of cells of different sizes. Figure 2 shows the mutual information as a function of the number of cells for both A1 and IC neurons, compared with the average information conveyed by single cells. The difference between these two curves measures the redundancy RS^1_N of equation 6. The information conveyed by IC neurons saturates already at 15 cells, exhibiting significant redundancy for groups larger than 10 cells. More importantly, single A1 neurons provide less information (note the y-axis scale), but their information sums almost linearly.

Figure 2: Information about stimulus frames as a function of group size. A. Primary auditory cortex. B. Inferior colliculus. The information calculation was repeated for several subgroups of each size and with several random-seed initializations. The dark linear curve depicts the average information provided by neurons treated independently, (k/N) \sum_{i=1}^{N} I(X_i), while the curved line depicts the average information from the joint distribution of sets of neurons, Mean[I(X_1, \ldots, X_k; S)]. All information estimates were corrected for small-sample bias by shuffling methods [9].
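The frame pre-processing described in Section 4.1 (50 ms spectral windows at 1 ms spacing, clustered into 32 representatives) can be sketched as below. The sampling rate, FFT size, toy chirp signal, and plain k-means are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def spectrogram_frames(signal, sr, win_ms=50, hop_ms=1, nfft=256):
    """Extract overlapping log-magnitude spectral windows ('frames'):
    50 ms windows with 1 ms spacing, roughly approximating a cochlear front end."""
    win = int(sr * win_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = [np.abs(np.fft.rfft(signal[s:s + win], nfft))
              for s in range(0, len(signal) - win, hop)]
    return np.log1p(np.array(frames))

def kmeans(frames, k=32, n_iter=50, seed=0):
    """Plain k-means clustering of frames into k acoustic representatives."""
    rng = np.random.default_rng(seed)
    centers = frames[rng.choice(len(frames), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((frames[:, None, :] - centers[None]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = frames[labels == j].mean(axis=0)
    return centers, labels

# Toy 1-second chirp standing in for a bird vocalization.
sr = 4000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * (500.0 + 1500.0 * t) * t)
frames = spectrogram_frames(signal, sr)
centers, labels = kmeans(frames, k=32)
```

Each spike is then associated with the cluster label of the frame preceding it, giving a 32-valued discrete stimulus variable whose joint distribution with spiking can be estimated.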
4.2 Coding stimuli by spike counts

We now turn to a second coding paradigm and calculate the information conveyed by A1 and IC spike counts about the identity of the presented stimulus. To this end, we calculate a histogram of spike counts and estimate the counts distribution obtained from repeated presentations of the stimuli. Figure 3A depicts the distribution of MI obtained from single cells, and Figure 3B presents the distribution of redundancy values among all cell pairs (equation 3) in IC and A1. As in the case of coding with single spikes, single A1 cells convey on average less information about the stimulus. However, they are also more independent, making it possible to gain more information from groups of neurons. IC neurons, on the other hand, provide more information when considered separately but are more redundant.
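The spike-count information estimate can be sketched as follows. The toy Poisson data and the simple stimulus-shuffling loop are illustrative assumptions; they stand in for, but are not identical to, the bias-correction procedure of [9]:

```python
import numpy as np

def mi_counts(counts, n_bins):
    """I(spike count; stimulus identity) in nats from a counts histogram.
    counts has shape (n_stimuli, n_trials); stimuli assumed equiprobable."""
    n_s = counts.shape[0]
    binned = np.clip(counts, 0, n_bins - 1)
    joint = np.zeros((n_s, n_bins))
    for s in range(n_s):
        joint[s] = np.bincount(binned[s], minlength=n_bins)
    joint /= joint.sum()
    ps = joint.sum(axis=1, keepdims=True)
    px = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (ps * px)[nz])))

def shuffle_corrected_mi(counts, n_bins, n_shuffles=200, seed=0):
    """Subtract the mean MI of stimulus-shuffled surrogates, a simple
    stand-in for small-sample bias correction."""
    rng = np.random.default_rng(seed)
    flat = counts.ravel()
    bias = np.mean([
        mi_counts(rng.permutation(flat).reshape(counts.shape), n_bins)
        for _ in range(n_shuffles)
    ])
    return mi_counts(counts, n_bins) - bias

# Toy data: 3 stimuli driving one cell at distinct Poisson rates, 40 trials each.
rng = np.random.default_rng(2)
rates = np.array([2.0, 6.0, 12.0])
counts = rng.poisson(rates[:, None], size=(3, 40))
mi = shuffle_corrected_mi(counts, n_bins=16)   # nats; at most ln(3) for 3 stimuli
```

In the paper, the counting window and the number of histogram bins were additionally optimized per cell and per stimulus, which this sketch omits.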

To illustrate the high information provided by both sets, we trained a neural-network classifier that predicts the identity of the presented stimulus from the spike counts of a limited set of neurons. Figure 4 shows that both sets of neurons achieve considerable prediction accuracy, but IC neurons reach an average accuracy of more than 90 percent already with five cells, while the average prediction accuracy using cortical neurons rises continuously^3.

Figure 3: A. Single-cell information: distribution of MI across cells. A1 cells (light bars) convey slightly less information on average than IC neurons. B. Pairwise redundancy: distribution of the normalized redundancy. A1 pairs (light bars) are concentrated near zero, while IC pairs have significantly higher redundancy values. Spike counts were collected over a window that maximizes overall MI. The number of bins in the counts histogram was optimized separately for each stimulus and each cell. Information estimates were corrected for small-sample bias by shuffling methods [9].

Figure 4: Prediction accuracy of stimulus identity as a function of the number of cells used by the classifier. Error bars denote standard deviation across several subgroups of the same size. For each subgroup, a one-hidden-layer neural network was trained separately for each stimulus, using some stimulus presentations as a training set and the rest for testing. Performance is reported for the testing set.

5 Discussion

We have developed information-theoretic measures of redundancy among groups of neurons and applied them to investigate collaborative coding efficiency in the auditory modality. Under two different coding paradigms, we show differences in both information content and group redundancy between IC and cortical auditory

^3 The probability of accurate prediction is exponentially related to the input-output mutual information, via the relation P_correct = exp(-missing nats), yielding MI_nats = ln(number of stimuli) + ln(P_correct).
Classification thus provides lower bounds on information content.

neurons. Single IC neurons carry more information about the presented stimulus, but are also more redundant. Auditory cortical neurons, on the other hand, carry less information but are more independent, allowing information to be summed almost linearly when considering groups of a few tens of neurons. These results provide, for the first time, direct evidence for redundancy reduction along the ascending auditory pathway, as hypothesized by Barlow [1]. The redundancy effects under the single-spikes coding paradigm are significant only for groups larger than ten cells, and cannot be revealed by standard redundancy measures that use only pairs of cells. Our results suggest that redundancy-reduction transformations are not limited to low-level sensory processing (aimed at reducing redundancy in input statistics) but are applied even at cortical sensory stations. We suggest that an essential experimental prerequisite for revealing these effects is the use of complex acoustic stimuli whose processing occurs at the cortical stations. The above findings agree with the view that along the ascending sensory pathways the number of neurons increases, their firing rates decrease, and neurons become tuned to more complex and independent features. Together, these suggest that the neural representation is mapped into a representation with higher effective dimensionality. Interestingly, recent advances in kernel methods have shown that nonlinear mapping into higher-dimensional, over-complete representations can be useful for learning complex classifications. It is therefore possible that such mappings provide easier readout and more efficient learning in the brain.

References

[1] H.B. Barlow. Sensory mechanisms, the reduction of redundancy, and intelligence. In Mechanisation of Thought Processes, pages 535-539. Her Majesty's Stationery Office, London, 1959.
[2] N. Brenner, S.P. Strong, R. Koberle, R. de Ruyter van Steveninck, and W. Bialek.
Synergy in a neural code. Neural Computation, 12(7):1531-1552, 2000.
[3] T.M. Cover and J.A. Thomas. Elements of Information Theory. Wiley, New York, 1991.
[4] I. Gat and N. Tishby. Synergy and redundancy among brain cells of behaving monkeys. In M.J. Kearns, S.A. Solla, and D.A. Cohn, editors, Advances in Neural Information Processing Systems, volume 11, Cambridge, MA, 1999. MIT Press.
[5] T.J. Gawne and B.J. Richmond. How independent are the messages carried by adjacent inferior temporal cortical neurons? J. Neuroscience, 13(7):2758-2771, 1993.
[6] P.M. Gochin, M. Colombo, G.A. Dorfman, G.L. Gerstein, and C.G. Gross. Neural ensemble coding in inferior temporal cortex. J. Neurophysiol., 71:2325-2337, 1994.
[7] J.P. Nadal, N. Brunel, and N. Parga. Nonlinear feedforward networks with stochastic outputs: infomax implies redundancy reduction. Network: Computation in Neural Systems, 9:207-217, 1998.
[8] I. Nelken, Y. Rotman, and O. Bar-Yosef. Specialization of the auditory system for the analysis of natural sounds. In J. Brugge and P.F. Poon, editors, Central Auditory Processing and Neural Modeling. Plenum, New York, 1997.
[9] L.M. Optican, T.J. Gawne, B.J. Richmond, and P.J. Joseph. Unbiased measures of transmitted information and channel capacity from multivariate neuronal data. Biol. Cybern., 65:305-310, 1991.
[10] E.T. Rolls and A. Treves. Neural Networks and Brain Function. Oxford Univ. Press, 1998.
[11] C.E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:379-423, 623-656, 1948.