INFERRING AUDITORY NEURONAL RESPONSE PROPERTIES FROM NETWORK MODELS

Sven E. Anderson, Peter L. Rauske, and Daniel Margoliash
sven@data.uchicago.edu, pete@data.uchicago.edu, dan@data.uchicago.edu
Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL 60637

ABSTRACT

Relating cell response to stimulus parameters is an important analytic method by which neural systems are understood. We inferred neurally encoded stimulus parameters by training artificial neural networks to predict single-cell responses to auditory stimuli. A relatively simple time-delay architecture modeled each cell. For three cells, models successfully predict responses to complex song stimuli based on optimization using much simpler artificial stimuli. For these models, the average error in predicting song response is less than 40% of the cell's response variance. Model parameters are directly comparable to stimulus parameters and thereby estimate a neuron's spectro-temporal receptive field. We show that variation in model parameters can be used to assess cell sensitivity to stimulus parameters.

INTRODUCTION

There is considerable evidence that the functional properties of neurons in a sensory system vary along several peripheral-to-central hierarchical axes. In the auditory system, peripheral neurons typically respond to simple stimuli such as noise and tone bursts. At higher levels of the auditory system, neurons can exhibit complex responses to species-specific combinations of spectral and temporal elements. Ironically, "top-down" analysis may be easier than analysis of species-specific processing in neurons at intermediate levels, because neurons that exhibit well-characterized responses to simple stimuli may exhibit complex responses to complex stimuli. Techniques for elucidating cell properties that underlie complex responses include reverse correlation (De Boer and De Jongh, 1978), spectro-temporal receptive fields (Aertsen et al., 1980), and stimulus reconstruction (Bialek et al., 1991). Complex stimulus encoding can also be assessed using artificial neural network architectures that are minimal and/or homologous with stimulus structure.

We modeled neurons located in the thalamic zebra finch nucleus ovoidalis (OV). OV is tonotopically organized, and many of its neurons respond most strongly to a characteristic frequency (CF) with a sustained response (Bigalke-Kunz et al., 1987). Other neurons have tonic, phasic/tonic, and inhibitory responses to stimuli. Response to complex stimuli such as song is quite complex, but is sometimes predictable from response to simple stimuli (Bankes and Margoliash, 1993). Our modeling of OV neurons addresses several general questions: (1) Can a model predict response to complex stimuli (e.g., bird song) from response to simple artificial stimuli? (2) Are all cells having similar response profiles equally well modeled? (3) Do model parameters relate to stimulus parameters and thereby reveal cell sensitivities?

METHODS

Single-unit neuronal data were collected extracellularly from three urethane-anesthetized zebra finches (Taeniopygia guttata) (Diekamp and Margoliash, 1991). Data were collected during presentation of 5-10 repetitions of broadband noise, tone bursts, harmonic stacks, frequency modulations, and amplitude-modulated noise, as well as the bird's own song (BOS) and conspecific songs. For each cell, a stimulus set typically comprised responses to 10 repetitions of approximately 400 stimuli. We modeled only a subset comprising 14 cells that had simple tuning curves displaying tonic or phasic/tonic responses.

The modeled responses were 3-point averages of 6.4 ms peri-stimulus time histograms (PSTHs), scaled linearly onto the interval [0.1, 0.9]. Stimuli were represented by separate spectral and amplitude codes. The spectral units comprised FFT bins of Hanning-windowed segments of 12.8 ms, stepped at 6.4 ms. Spectral range was limited to 500-8000 Hz, resulting in 43 spectral inputs per 6.4 ms frame. RMS amplitude (30-90 dB) was represented by either a thermometer or an interpolation code over 11 input units. This choice did not affect model performance, but only the interpolation code induced weights that qualitatively fit a cell's amplitude sensitivity derived from responses to amplitude variation of white-noise stimuli.
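
As a concrete illustration of this input coding, the sketch below (Python/NumPy, not the authors' original code) converts one stimulus waveform into the spectral and amplitude codes described above. The function name stimulus_codes, the pooling of the in-band FFT bins down to 43 channels, and the dB calibration of the amplitude code are assumptions made for the sketch; the paper does not spell out these details.

```python
import numpy as np

def stimulus_codes(waveform, fs, n_spectral=43, n_amp=11,
                   win_ms=12.8, step_ms=6.4, f_lo=500.0, f_hi=8000.0,
                   db_lo=30.0, db_hi=90.0):
    """Spectral + amplitude input codes for one stimulus (hypothetical helper).

    Returns (spectral, amplitude) arrays with one row per 6.4 ms frame.
    """
    win = int(round(win_ms * 1e-3 * fs))
    step = int(round(step_ms * 1e-3 * fs))
    window = np.hanning(win)
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)

    spectral, amplitude = [], []
    for start in range(0, len(waveform) - win + 1, step):
        frame = waveform[start:start + win]

        # Hanning-windowed FFT magnitude, restricted to 500-8000 Hz.
        mag = np.abs(np.fft.rfft(frame * window))[band]
        # ASSUMPTION: pool the in-band FFT bins down to 43 spectral inputs;
        # the paper does not state how its 43 inputs were obtained.
        edges = np.linspace(0, mag.size, n_spectral + 1).astype(int)
        spectral.append([mag[a:b].mean() for a, b in zip(edges[:-1], edges[1:])])

        # RMS level in dB (30-90 dB range, assumed calibration),
        # thermometer-coded over 11 amplitude units.
        rms_db = 20.0 * np.log10(np.sqrt(np.mean(frame ** 2)) + 1e-12)
        level = np.clip((rms_db - db_lo) / (db_hi - db_lo), 0.0, 1.0)
        amplitude.append((np.arange(n_amp) < level * n_amp).astype(float))

    return np.asarray(spectral), np.asarray(amplitude)
```

The corresponding target for each frame would be the 3-point-averaged PSTH value, rescaled onto [0.1, 0.9] as described above.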

Numerous network architectures were investigated, including time-delay neural networks (TDNNs) with several hidden units (Lang et al., 1990), as well as linear feedforward and recurrent networks. Only the TDNN and recurrent networks fit the data well, and only the performance of the former extrapolated from artificial stimuli to accurately predict responses to natural stimuli. We trained TDNNs having a single output unit, one to several hidden units, and delays of 0-186 ms. We hereafter refer to the architecture having a single hidden unit as the canonical architecture.

Networks were trained using a gradient-descent technique so that the output unit predicted a cell's firing rate (PSTH) in response to an acoustic stimulus. An example is shown in Figure 1. Twenty percent of the training data was reserved for cross-validation and was therefore not used during training. Network parameter optimization was halted when the smoothed average sum-squared error for the validation set increased.

Figure 1: Response (solid line) and model prediction (dashed line) of cell o1331, plotted above the spectrogram of one song fragment (R² = 0.77). The model was trained on artificial stimuli only.

Cells vary widely in their overall responsiveness to stimuli. To evaluate model performance across cells we adopted the coefficient of determination,

R^2 = 1 - \frac{\sum_{t=0}^{T} [y(t) - m(t)]^2}{\sum_{t=0}^{T} [y(t) - \bar{y}]^2},

where y(t) is the cell response, \bar{y} is the mean cell response, m(t) is the model prediction, and T is the stimulus duration. This measure varies between -\infty and 1.0 and indicates the size of the error relative to the variation inherent in a cell's response.
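
The sketch below (Python/NumPy; hypothetical names, not the authors' Cray C-90 code) shows one way to realize the canonical architecture, a single sigmoid hidden unit driven by the 43 spectral and 11 amplitude inputs of each frame, with a tapped delay line of hidden-to-output weights covering roughly 0-186 ms in 6.4 ms steps (about 30 taps, an assumed count), together with the coefficient of determination defined above. The analytic gradients implement plain batch gradient descent on summed squared error.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def delay_embed(h, n_delays):
    """Stack delayed copies of the hidden trace: H[t, d] = h[t - d] (zero-padded)."""
    T = len(h)
    H = np.zeros((T, n_delays))
    for d in range(n_delays):
        H[d:, d] = h[:T - d]
    return H

class CanonicalTDNN:
    """Single hidden unit with a hidden-to-output delay line (a sketch, not the authors' code)."""

    def __init__(self, n_spec=43, n_amp=11, n_delays=30, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0, 0.1, n_spec + n_amp)   # input -> hidden weights
        self.b = 0.0
        self.v = rng.normal(0, 0.1, n_delays)         # hidden -> output delay weights
        self.c = 0.0
        self.n_delays = n_delays

    def forward(self, X):
        """X: (T, n_spec + n_amp) stimulus frames; returns predicted rate, hidden trace, delays."""
        h = sigmoid(X @ self.w + self.b)              # hidden unit activity per frame
        H = delay_embed(h, self.n_delays)             # delayed copies of the hidden trace
        y = sigmoid(H @ self.v + self.c)              # output unit (predicted PSTH)
        return y, h, H

    def grad_step(self, X, target, lr=0.01):
        """One batch gradient-descent step on summed squared error."""
        y, h, H = self.forward(X)
        err = y - target
        dz_out = 2.0 * err * y * (1.0 - y)            # gradient at the output pre-activation
        dv = H.T @ dz_out
        dc = dz_out.sum()
        # Back-propagate through the delay line to the hidden trace.
        dh = np.zeros_like(h)
        for d in range(self.n_delays):
            dh[:len(h) - d] += dz_out[d:] * self.v[d]
        dz_h = dh * h * (1.0 - h)
        dw = X.T @ dz_h
        db = dz_h.sum()
        self.w -= lr * dw
        self.v -= lr * dv
        self.b -= lr * db
        self.c -= lr * dc
        return np.sum(err ** 2)

def r_squared(response, prediction):
    """Coefficient of determination as defined in the text (ranges from -inf to 1)."""
    ss_err = np.sum((response - prediction) ** 2)
    ss_tot = np.sum((response - response.mean()) ** 2)
    return 1.0 - ss_err / ss_tot
```

A training loop would call grad_step repeatedly on the training stimuli, monitor the smoothed error on the reserved 20% validation set, and stop when that error begins to rise, as described above.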

RESULTS

The canonical architecture captured most response variance in all 14 cells: for each cell, networks trained on a subset of song and artificial stimuli predicted the responses to all songs and artificial stimuli with an average R² > 0.5. For three cells, networks trained on artificial stimuli predicted responses to song (R² > 0.6). Successful modeling of a cell's response properties is often measured by the model's ability to interpolate properly between examples in the training data set and thereby predict responses to similar stimuli. In this case, we have found network models that extrapolate to predict responses to an entirely different and more complex type of stimulus. An example of extrapolated response is shown in Figure 1. The extrapolation from simple to complex stimulus types demonstrates that the responses of these three cells to complex stimuli are highly predictable from their responses to much simpler stimuli. For one cell, a model trained on only song stimuli predicted responses to artificial stimuli (R² = 0.55). Network models thus demonstrate how response to one stimulus type is predictable from response to another type, increasing our certainty that the model has accurately estimated specific parameters governing cell response.

Figure 2: Prediction of response to song when trained with artificial stimuli or with song and artificial stimuli. [Bar chart of R² for each of the 14 modeled cells under the two training conditions.]

While models extrapolated well for 3 cells, our results also suggest that not all OV neurons are equally simple. The extrapolation results (Figure 2) vary greatly across the set of modeled cells. A model may fit two cells equally well, yet extrapolation can be vastly different (e.g., o1032 and pn112). There is also variation in the complexity of the networks that extrapolate. We find that models tend to encode the low-frequency components of response; rapid, highly nonlinear responses are better fit by networks having 2-4 hidden units. For 8 cells, a greater number of hidden units is associated with poorer extrapolation from artificial stimuli to song. For 5 cells, models having 2 hidden units extrapolate as well as the canonical architecture, and in one case a network having two hidden units extrapolates to song substantially better (R² = 0.40 vs. 0.06).

Cell response types can be further delineated by network parameters. Networks represent spectral, amplitude, and temporal sensitivities based on responses to both simple and complex stimuli. We therefore compared networks with a similarly general technique that estimates spectro-temporal receptive fields (STRFs) (Aertsen et al., 1980).

STRF analysis compares the average spectrogram preceding individual spikes with an average preceding random events. To speed the analysis here, STRF plots were created by using the PSTH values to weight spectrograms (128-point FFTs, stepped at 3.2 ms). Comparison with spike-triggered STRFs revealed no important differences between the two procedures. Figure 3 shows examples of two STRFs, one created from tone bursts, the other from song. The song-derived STRF has a much broader peak, and post-stimulus inhibition is more apparent in the tone-derived STRF.

Figure 3: STRF plots for cell o1521 derived from tone bursts (left) and song (right).

We compared best frequencies derived for song stimuli from networks and from STRFs. For most network models, the weights connecting input and hidden units approximate the tuning curve estimated from responses to a set of sinusoidal stimuli. Best frequencies were determined from network weights by selecting the largest spectral weight. For STRFs, best frequencies were taken to be the frequency associated with the maximum peak value. The r² correlation coefficient for best frequencies determined from song by STRFs and by network weights was 0.85. Network and STRF best frequencies correlated with the characteristic frequency determined from tone-burst data with r² = 0.40 and 0.52, respectively. This much lower correlation was due to 3-4 bad matches. We have thus far not determined the source of the disagreement between best frequencies determined by these analyses and CFs derived from tone-burst data.
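
A minimal sketch of this PSTH-weighted STRF estimate and of the two best-frequency read-outs compared above is given below (Python/NumPy). The helper names, the 128 ms span of spectrogram frames preceding each PSTH bin, and the omission of the random-event baseline subtraction are assumptions of the sketch, not specifications from the paper.

```python
import numpy as np

def psth_weighted_strf(waveform, fs, psth, psth_dt=0.0064,
                       n_fft=128, step_s=0.0032, max_lag_s=0.128):
    """STRF estimate: spectrogram windows preceding each PSTH bin, weighted by
    that bin's firing rate (a sketch of the procedure described in the text)."""
    step = int(round(step_s * fs))
    window = np.hanning(n_fft)
    n_frames = (len(waveform) - n_fft) // step + 1
    spec = np.array([np.abs(np.fft.rfft(window * waveform[i * step:i * step + n_fft]))
                     for i in range(n_frames)])            # (frames, frequencies)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

    n_lags = int(round(max_lag_s / step_s))                # frames per preceding window (assumed)
    strf = np.zeros((n_lags, spec.shape[1]))
    total = 0.0
    for k, rate in enumerate(psth):
        t_frame = int(round(k * psth_dt / step_s))         # spectrogram frame aligned with bin k
        if n_lags <= t_frame <= len(spec):
            strf += rate * spec[t_frame - n_lags:t_frame]  # slice preceding the response bin
            total += rate
    strf /= max(total, 1e-12)
    return strf, freqs                                     # (lags before response) x frequencies

def best_frequency_from_strf(strf, freqs):
    """Frequency at the STRF's maximum value."""
    _, f_idx = np.unravel_index(np.argmax(strf), strf.shape)
    return freqs[f_idx]

def best_frequency_from_weights(w_spectral, channel_freqs):
    """Frequency of the largest input->hidden spectral weight of a network model."""
    return channel_freqs[np.argmax(w_spectral)]
```

Best frequencies read out in these two ways from song-derived STRFs and from network weights are what the correlations reported above compare.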

Time-delay weights from the hidden units to the output unit are often initially positive for delays of 25-50 ms, followed by negative weights for 25-50 ms. One difficulty with the STRF analysis is that it does not indicate a causal relationship, nor does it reveal whether single parameters or conjunctions of parameters (e.g., two peaks in an STRF) are associated with cell response. Causality can be demonstrated in a predictive model via variation of model parameters and measurement of the subsequent performance deficits (Bankes and Margoliash, 1993). Similarly, parametric variation allows us to assess cell sensitivities. For example, temporal integration in the three best-modeled cells was examined by training networks having a range of delays from 0-186 ms connecting hidden units to the output unit. For each cell, there is a duration beyond which performance does not improve with increasingly long integration periods. This occurs at about 100 ms (90 ms after correction for the 7-10 ms response latency). Thus, for the variance captured by the models, we conclude that these cells have integration windows of no more than 90 ms.

SUMMARY AND CONCLUSIONS

This modeling study demonstrates that the majority of the response variance of the modeled OV cells can be related to stimulus properties. For several cells, a model that is trained to predict responses to artificial stimuli also predicts responses to song. Thus, even for the relatively simple neurons modeled here, responses to one stimulus type are not independent of responses to another. Model parameters directly reflect important cell properties (e.g., best frequency). These observations demonstrate that network models of cell firing rates can be used to determine important physiological characteristics not apparent from responses to complex natural stimuli.

ACKNOWLEDGMENTS

The authors are indebted to B. Diekamp for collection of the physiological data. Simulations were conducted using the Cray C-90 of the Pittsburgh Supercomputing Center (NSF IBN-940002P). S. Anderson was supported by NIH 1F32-MH10525-01.

REFERENCES

Aertsen, A. M. H. J., Johannesma, P. I. M., and Hermes, D. J. (1980). Spectro-temporal receptive fields of auditory neurons in the grassfrog: II. Analysis of the stimulus-event relation for tonal stimuli. Biol. Cybernetics, 38, 235-248.

Bankes, S. C. and Margoliash, D. (1993). Parametric modeling of the temporal dynamics of neuronal responses using connectionist architectures. J. Neurophysiol., 69, 980-991.

Bialek, W., Rieke, F., de Ruyter van Steveninck, R., and Warland, D. (1991). Reading a neural code. Science, 252, 1854-1857.

Bigalke-Kunz, B., Rübsamen, R., and Dörrscheidt, G. J. (1987). Tonotopic organization and functional characterization of the auditory thalamus in a songbird, the European starling. Journal of Comparative Physiology A, 161, 255-265.

De Boer, E. and De Jongh, H. (1978). On cochlear encoding: Potentialities and limitations of the reverse-correlation technique. J. Acoust. Soc. Am., 63, 115-135.

Diekamp, B. and Margoliash, D. (1991). Auditory responses in the nucleus ovoidalis are not so simple. In Soc. Neurosci. Abstr., volume 17, page 446.

Lang, K. J., Waibel, A. H., and Hinton, G. E. (1990). A time-delay neural network architecture for isolated word recognition. Neural Networks, 3, 23-43.