HST.723J, Spring 2005 Theme 3 Report

Similar documents
Binaural Hearing. Why two ears? Definitions

Sound localization psychophysics

21/01/2013. Binaural Phenomena. Aim. To understand binaural hearing Objectives. Understand the cues used to determine the location of a sound source

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening

Binaural Hearing. Steve Colburn Boston University

Neural correlates of the perception of sound source separation

Modeling Physiological and Psychophysical Responses to Precedence Effect Stimuli

The neural code for interaural time difference in human auditory cortex

Publication VI. c 2007 Audio Engineering Society. Reprinted with permission.

Processing in The Superior Olivary Complex

The Auditory Nervous System

Representation of sound in the auditory nerve

Physiological measures of the precedence effect and spatial release from masking in the cat inferior colliculus.

J Jeffress model, 3, 66ff

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 THE DUPLEX-THEORY OF LOCALIZATION INVESTIGATED UNDER NATURAL CONDITIONS

Binaural processing of complex stimuli

Neural Correlates and Mechanisms of Spatial Release From Masking: Single-Unit and Population Responses in the Inferior Colliculus

3-D Sound and Spatial Audio. What do these terms mean?

Hearing in the Environment

Comment by Delgutte and Anna. A. Dreyer (Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston, MA)

Spatial hearing and sound localization mechanisms in the brain. Henri Pöntynen February 9, 2016

A developmental learning rule for coincidence. tuning in the barn owl auditory system. Wulfram Gerstner, Richard Kempter J.

The Central Auditory System

ABSTRACT. Figure 1. Stimulus configurations used in discrimination studies of the precedence effect.

Binaurally-coherent jitter improves neural and perceptual ITD sensitivity in normal and electric hearing

Accurate Sound Localization in Reverberant Environments Is Mediated by Robust Encoding of Spatial Cues in the Auditory Midbrain

William A. Yost and Sandra J. Guzman Parmly Hearing Institute, Loyola University Chicago, Chicago, Illinois 60201

Isolating mechanisms that influence measures of the precedence effect: Theoretical predictions and behavioral tests

Lab 4: Compartmental Model of Binaural Coincidence Detector Neurons

Trading Directional Accuracy for Realism in a Virtual Auditory Display

Temporal coding in the sub-millisecond range: Model of barn owl auditory pathway

The use of interaural time and level difference cues by bilateral cochlear implant users

Role of F0 differences in source segregation

Lecture 7 Hearing 2. Raghav Rajan Bio 354 Neurobiology 2 February 04th All lecture material from the following links unless otherwise mentioned:

3-D SOUND IMAGE LOCALIZATION BY INTERAURAL DIFFERENCES AND THE MEDIAN PLANE HRTF. Masayuki Morimoto Motokuni Itoh Kazuhiro Iida

HCS 7367 Speech Perception

INTRODUCTION J. Acoust. Soc. Am. 103 (2), February /98/103(2)/1080/5/$ Acoustical Society of America 1080

The role of low frequency components in median plane localization

Temporal coding in the auditory brainstem of the barn owl

Nucleus Laminaris. Yuan WangJason Tait Sanchez, and Edwin W Rubel

The role of periodicity in the perception of masked speech with simulated and real cochlear implants

NEW ROLES FOR SYNAPTIC INHIBITION IN SOUND LOCALIZATION

HEARING AND PSYCHOACOUSTICS

Creating a sense of auditory space

What Is the Difference between db HL and db SPL?

Hearing II Perceptual Aspects

Juha Merimaa b) Institut für Kommunikationsakustik, Ruhr-Universität Bochum, Germany

Auditory System & Hearing

The Effects of Hearing Loss on Binaural Function

Systems Neuroscience Oct. 16, Auditory system. http:

P. M. Zurek Sensimetrics Corporation, Somerville, Massachusetts and Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

The role of spatial cues for processing speech in noise

Coincidence Detection in Pitch Perception. S. Shamma, D. Klein and D. Depireux

A dendritic model of coincidence detection in the avian brainstem

IN EAR TO OUT THERE: A MAGNITUDE BASED PARAMETERIZATION SCHEME FOR SOUND SOURCE EXTERNALIZATION. Griffin D. Romigh, Brian D. Simpson, Nandini Iyer

Sensitivity to Interaural Time Differences in the Medial Superior Olive of a Small Mammal, the Mexican Free-Tailed Bat

Study of sound localization by owls and its relevance to humans

Effect of spectral content and learning on auditory distance perception

Lauer et al Olivocochlear efferents. Amanda M. Lauer, Ph.D. Dept. of Otolaryngology-HNS

On the influence of interaural differences on onset detection in auditory object formation. 1 Introduction

How is the stimulus represented in the nervous system?

Behavioral/Systems/Cognitive. Yi Zhou, 1 Laurel H. Carney, 2 and H. Steven Colburn 1 1

Influence of acoustic complexity on spatial release from masking and lateralization

Sound Localization PSY 310 Greg Francis. Lecture 31. Audition

The perceptual consequences of binaural hearing

B. G. Shinn-Cunningham Hearing Research Center, Departments of Biomedical Engineering and Cognitive and Neural Systems, Boston, Massachusetts 02215

Brian D. Simpson Veridian, 5200 Springfield Pike, Suite 200, Dayton, Ohio 45431

CODING OF AUDITORY SPACE

An Auditory System Modeling in Sound Source Localization

Processing Interaural Cues in Sound Segregation by Young and Middle-Aged Brains DOI: /jaaa

SPHSC 462 HEARING DEVELOPMENT. Overview Review of Hearing Science Introduction

INTRODUCTION J. Acoust. Soc. Am. 100 (4), Pt. 1, October /96/100(4)/2352/13/$ Acoustical Society of America 2352

Speech segregation in rooms: Effects of reverberation on both target and interferer

Linking dynamic-range compression across the ears can improve speech intelligibility in spatially separated noise

Abstract. 1. Introduction. David Spargo 1, William L. Martens 2, and Densil Cabrera 3

Lateralized speech perception in normal-hearing and hearing-impaired listeners and its relationship to temporal processing

Release from speech-on-speech masking by adding a delayed masker at a different location

Learning to localise sounds with spiking neural networks

Dynamic-range compression affects the lateral position of sounds

Development of Sound Localization Mechanisms in the Mongolian Gerbil Is Shaped by Early Acoustic Experience

Effect of Reverberation Context on Spatial Hearing Performance of Normally Hearing Listeners

Neural Recording Methods

Neural Responses in the Inferior Colliculus to Binaural Masking Level Differences Created by Inverting the Noise in One Ear

Proceedings of Meetings on Acoustics

Auditory Phase Opponency: A Temporal Model for Masked Detection at Low Frequencies

The Structure and Function of the Auditory Nerve

Essential feature. Who are cochlear implants for? People with little or no hearing. substitute for faulty or missing inner hair

Evidence for a neural source of the precedence effect in sound localization

Spectral processing of two concurrent harmonic complexes

Infant Hearing Development: Translating Research Findings into Clinical Practice. Auditory Development. Overview

Signals, systems, acoustics and the ear. Week 5. The peripheral auditory system: The ear as a signal processor

Binaural detection with narrowband and wideband reproducible noise maskers. IV. Models using interaural time, level, and envelope differences

Evidence specifically favoring the equalization-cancellation theory of binaural unmasking

INTRODUCTION. 494 J. Acoust. Soc. Am. 103 (1), January /98/103(1)/494/13/$ Acoustical Society of America 494

INTRODUCTION. 475 J. Acoust. Soc. Am. 103 (1), January /98/103(1)/475/19/$ Acoustical Society of America 475

I. INTRODUCTION. NL-5656 AA Eindhoven, The Netherlands. Electronic mail:

HCS 7367 Speech Perception

Temporal weighting functions for interaural time and level differences. IV. Effects of carrier frequency

Auditory Scene Analysis

Review Auditory Processing of Interaural Timing Information: New Insights

Transcription:

HST.723J, Spring 2005 Theme 3 Report Madhu Shashanka shashanka@cns.bu.edu Introduction The theme of this report is binaural interactions. Binaural interactions of sound stimuli enable humans (and other species) to localize sounds in space. For example, the arrival times of a sound at the two ears are only microseconds apart (almost two orders of magnitude lesser than the duration of an action potential, see Grothe (2003)), but birds and mammals can use these time differences to localize low-frequency sounds. This report is a discussion of five papers which deal with various issues involved in the understanding of such binaural interactions. MacPherson and Middlebrooks (2002) examine cues that help in sound localization and revisit the duplex theory in the context of complex sounds. Tollin and Yin (2003) investigate the precedence effect - an auditory spatial illusion - in cats. Freyman et al. (2001) report results of their experiments to determine the extent to which perceived separation of speech and interference improves speech recognition in free field. Reyes et al. (1996) and Brand et al. (2002) are physiological studies which deal with in vitro recordings from avian Nucleus Laminaris (NL) neurons and in vivo recordings from MSO neurons of the Mongolian gerbil, respectively. Psychophysical studies The acoustical cues for sound localization were explored as early as the end of the 19th century by the British physicist Lord Rayleigh, among others. It has been known since then that interaural time differences (ITDs) provide important cues to the lateral positions of low-frequency sources while interaural level differences (ILDs) are important for sources with higher frequencies (> 500 Hz). MacPherson and Middlebrooks (2002) tested this duplex theory of sound localization in the context of complex sounds (broadband, low-pass and high-pass) using virtual auditory space techniques and found that their results nicely accorded with the theory. Directional Transfer Functions (DTFs) were measured for each listener and the derived directional impulse responses were used in the synthesis of the virtual free-field targets. The authors manipulated ITD and ILD cues present in the stimuli by imposing a whole waveform delay or attenuating the signal at one of the ears. To assess the salience of the manipulated cue in a manner which would permit meaningful comparisons between the weights, they derived dimensionless weights relating bias in responses to imposed bias in the underlying cue. They measured the correspondence between the physical cue and the lateral angle, and used this relation to convert any shift in the angular response to a quantity expressed in units of the manipulated cue. The listener s weighting of the manipulated cue was computed 1

as the slope of the linear fit between the observed cue bias and the imposed cue bias. Thus, a cue weight close to 0 indicated a weak cue while a cue weight close to 1 meant that the listener derived judgements primarily from the manipulated cue (see figure 1). They found that ITD was weighted more heavily than ILD in the wideband condition. In the low-pass condition, ITD weights were substantially higher than the ILD weights. In the high-pass condition, ITD was weighted less heavily than ILD for all listeners (figure 2 left panel). Both onset and envelope cues were found to be important in high-frequency ITD sensitivity (figure 2 right panel). They also manipulated the interaural level spectra (ILS) and found that ILS shape was less important than interaural energy difference. In another experiment, they investigated the importance of monaural spectral cues independently of ITD and ILD and found that they only had a minor influence on lateral angle localization. They concluded that the duplex theory does serve as a useful description of the relative potency of ITD and ILD cues in low and high frequency regimes. The precedence effect (PE) describes several spatial perceptual phenomena that are thought to be responsible for the ability to localize sounds in reverberant environments. They occur when similar sounds are presented from two different locations and separated by a delay. Physiological studies in animals have explored the neural bases for PE using stimuli expected to evoke it but little is known if the animals really perceive the effect as humans do. Tollin and Yin (2003) measured behavioral localization capabilities of cats under stimulus conditions similar to those that had been used in prior psychophysical studies in humans and that had also been used for physiological studies in experimental animals. Cats were trained to saccade to apparent locations of auditory/visual stimuli and cats eye positions were tracked using the scleral search coil technique. A fundamental assumption here is that the cats were looking at the perceived location of the acoustic stimuli. The authors examined three psychophysical aspects of PE - summing localization (percept of a phantom sound between the two sound sources), localization dominance (loss of ability to localize the second sound) and echo threshold (the shortest delay at which a reflection is localized near its source). For interstimulus delays (ISDs) between ±400µs, the apparent azimuthal location depended on ISD, which is consistent with summing localization (figure 3, right-top panel). Summing localization did not operate in the vertical plane. If the ISDs were greater (and 10ms), the later arriving stimulus had little influence on the ability of cats to indicate the location of the leading stimulus and this supports the localization dominance hypothesis. There was a small but significant undershoot which indicated that there was a small residual effect of the lagging stimulus that pulled the apparent location of the paired stimuli toward the spatial location of the lagging source. For ISDs about 15 ms and greater, the cats often made saccades to the lagging source on some trials and this type of response is consistent with the hypothesis that at these ISDs, the cats perceived two source locations but could not consistently identify which was actually the leading source. An important difference between this study and studies with human observers is that noise-burst stimuli were repeated several times, whereas stimuli used in many human experiments were often presented only once. Also, as the authors do point out, one can never be sure what the cats actually perceived. But they argue that their experiments and data strongly suggest that the cats do experience the three major perceptual phenomena of PE. We know that reflections have a minimal effect on the perceived locations of the pre- 2

sented stimuli due to the precedence effect. It was suggested in the 1950s that perceived differences in location between speech and interference might contribute to improved speech understanding. It was later shown that there was an advantage in the precedence effect condition when the masker was speech but not when it was steady noise. It was proposed that perceived that spatial separation provided a cue that helped listeners segregate the speech of the two talkers but this cue was unnecessary for segregating the target speech from continuous noise. Freyman et al. (2001) tried to determine more precisely the characteristics of a masker that are necessary for it to produce informational masking of speech. They explored whether unintelligible speech, or noise with the temporal and spectral fluctuations of speech (in addition to speech) might also create perceived separation advantages. Target sentences were 320 nonsense sentences which had a subject, verb, object syntax and performance was scored on these three keywords. The target-interference combination could either be F-F (target and interference from the front loudspeaker) or F-RF (target from the front loudspeaker and interference from both loudspeakers, with the right leading the front by 4 ms). With single-talker or two-talker speech as interference, the F-RF condition produced better results at each SNR (figure 4, left panels). The authors did the experiment with high-pass filtered speech to be sure that data were not affected by lowfrequency release from masking effects and found smaller but similar differences (figure 4, right panel). They found similar results when they used reversed speech (figure 5, left panel) or Dutch speech (figure 5, right panel) as maskers. This suggests that perceived spatial separation can lead to improved speech recognition even when the interference is not understood by the listener. When the interference was modulated noise, the results for F-F and F-RF configurations were similar and the authors concluded that modulations in talker maskers were not responsible for the improvement in F-RF conditions in the first experiment. Physiological studies Reyes et al. (1996) used an in vitro brain slice preparation (chick) to study the properties of Nucleus Laminaris (NL) neurons. NL contains third-order neurons that are innervated bilaterally by afferents from Nucleus Magnocellularis (NM) and the firing of NL neurons is modulated by differences in arrival times of sound to the ears. The authors wanted to examine the possibility that modulation is caused by the postsynaptic membrane properties of NL neurons and does not involve inhibitory circuitry. Under computer control, neurons were stimulated intracellularly with current that resembled synaptic currents arriving in NL during acoustic stimulation (see figure 6). The synaptic current reaching the soma of NL neurons during binaural stimulation was mimicked by constructing two sets of trains, representing inputs from ipsilateral and contralateral NM. To simulate ITDs, the onset of one of the trains was delayed systematically before summation and injection. Their results showed that there was a trade-off between binaural modulation of firing rate versus phase-locking and maintenance of high firing rates. High firing rates and good phase locking in NL cells at high frequencies were best achieved by having a small number of large synaptic inputs from NM while sensitivity to differences in arrival times of the NM inputs from the two sides was best when there was a large number of small synaptic inputs. To explain these conflicting requirements, the authors mention two possibilities - neurons 3

with higher CFs receiving fewer and larger post synaptic currents, and by synchronizing NM afferents. Another important finding was that rapidly activating outward rectification was crucial for phase locking and coincidence detection. The model uses perfect phase locking and does not consider the jitter. Hence, it is a best case analysis. The authors do not consider the role of inhibition. They use a point neuron model, possibly ignoring effects of bipolar dendrite filtering. Brand et al. (2002) address the role of inhibitory inputs to the MSO in ITD processing. They performed in vivo recordings from the MSO of the Mongolian gerbil, a small rodent with good low-frequency hearing. They used iontophoretic application of glycine and its antagonist strychnine to assess the role of inhibition. In the Jeffress model, it is generally assumed that response maxima of ITD functions are independent of the frequency-tuning characteristics of the neurons. However, Brand et al. (2002) found data which does not support this view. They found that maximal response was evoked by an ITD outside the physiologically relevant range (figure 8a) and ITD functions were steepest within the physiologically relevant range. Furthermore, the ITD that evoked the peak response was dependent on neuronal tuning for sound frequency (figure 8b). Application of strychnine produced a significant increase in discharge rate and shifted the peak of the ITD functions towards zero ITD (figure 9). The steepest slope of the function fell outside the physiologically relevant range of ITDs. The functions became nonmonotonic within the physiological range, and hence, ambiguous for ITD. From these results, the authors argued that slopes of ITD functions were critical for localizing sounds in azimuth and precise inhibition was necessary. They also presented simulation results using a point neuron model for which the time constants of inhibitory post synaptic currents had to be extremely short (0.1 ms). However, Zhou et al. (2005) argue that such brief time constants are inconsistent with in vitro observations from MSO neurons and propose an alternative mechanism which predicts the above results while being consistent with the known anatomy and physiology of MSO neurons. Discussion and Conclusions Tollin and Yin (2003) confirmed that stimuli used in physiological studies to understand neural bases of PE that were expected to evoke the PE illusion in experimental animals, in fact, did invoke the illusion (in cats). One should be very careful while drawing inferences about one species from physiological/behavioral studies conducted in animals of another species. One can argue that it is impossible to say whether cats perceived a fused sound or not, and whether the undershoot in the case of paired-source stimuli was simply because of an ambiguous acoustic image. One cannot rule out these arguments entirely but the experimental paradigm and results strongly suggest otherwise. Reyes et al. (1996) conducted in vitro recordings in NL neurons in the chick. Though they tried to mimic the NM inputs, not all aspects of the in vivo cell function can be adequately mimicked by an in vitro stimulation protocol. They concluded that modulation of firing was not critically dependent on inhibitory inputs. Brand et al. (2002) argued that precisely timed inhibition was a critical part of the mechanism by which ITDs are encoded in the MSO. As Grothe (2003) points out, the role of synaptic inhibition in mammals and birds are quite different, again emphasizing the fact that one should be aware of species differences. 4

Figure 1: Illustration of cue bias weight computation. The cue bias weight is the slope of a linear fit to the data. (From MacPherson and Middlebrooks (2002), figure 4) Figure 2: Left. ITD and ILD bias weights. Right. Effect of onset ramp duration on highpass ITD and ILD bias weights. (From MacPherson and Middlebrooks (2002), figures 5 and 6) 5

Figure 3: Mean final eye positions averaged across 5 cats to single sources (left) and to paired sources as a function of ISD (right). (From Tollin and Yin (2003), figure 3) Figure 4: Mean percent correct key word scores and ± one standard error (From Freyman et al. (2001), figures 3 and 4) 6

Figure 5: Mean percent correct key word scores and ± one standard error (From Freyman et al. (2001), figures 8 and 9) Figure 6: Computer controlled injection of current (From Reyes et al. (1996), figure 1) 7

Figure 7: Variation of evoked firing rate with (left) PSC amplitude, and (right) time-delayed stimulation (From Reyes et al. (1996), figures 5 and 8) Figure 8: a. ITD functions of gerbil MSO neurons (peaks are outside the physiologically relevant range - blue area) b. Distribution of best ITDs as a function of best frequencies of the neurons (From Brand et al. (2002), figure 2) Figure 9: c. ITD functions of a neuron (BF 1060 Hz) during control condition and strychnine application d. Average IPDs under control conditions (blue) and during application of strychnine (red) (From Brand et al. (2002), figure 3) 8

References Brand, A., Behrend, O., Marquardt, T., McAlpine, D., and Grothe, B. (2002). Precise inhibition is essential for microsecond interaural time difference coding. Nature, 417:543 547. Freyman, R. L., Balakrishnan, U., and Helfer, K. S. (2001). Spatial release from informational masking in speech recognition. J. Acoust. Soc. Am., 109:2112 2122. Grothe, B. (2003). New roles for synaptic inhibition in sound localization. Nature Rev. Neurosci., 4:540 550. MacPherson, E. A. and Middlebrooks, J. C. (2002). Listener weighting of cues for lateral angle: The duplex theory of sound localization revisited. J. Acoust. Soc. Am., 111:2219 2236. Reyes, A. D., Rubel, E. W., and Spain, W. J. (1996). In vitro analysis of optimal stimuli for phase locking and time-delayed modulation of firing in avian nucleus laminaris neurons. J. Neurosci., 16:993 1007. Tollin, D. J. and Yin, T. C. T. (2003). Psychophysical investigation of an auditory spatial illusion in cats: The precedence effect. J. Neurophysiol., 90:2149 2162. Zhou, Y., Carney, L. E., and Colburn, H. S. (2005). A model for itd sensitivity in the mso: Interaction of excitatory and inhibitory synaptic inputs, channel dynamics, and cellular morphology. J. Neurosci., 25(12):3046 3058. 9