On the influence of interaural differences on onset detection in auditory object formation. 1 Introduction

Similar documents
Sound localization psychophysics

Auditory Scene Analysis

Binaural Hearing. Why two ears? Definitions

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening

Auditory scene analysis in humans: Implications for computational implementations.

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 THE DUPLEX-THEORY OF LOCALIZATION INVESTIGATED UNDER NATURAL CONDITIONS

What Is the Difference between db HL and db SPL?

Binaural interference and auditory grouping

INTRODUCTION J. Acoust. Soc. Am. 100 (4), Pt. 1, October /96/100(4)/2352/13/$ Acoustical Society of America 2352

21/01/2013. Binaural Phenomena. Aim. To understand binaural hearing Objectives. Understand the cues used to determine the location of a sound source

Computational Perception /785. Auditory Scene Analysis

INTRODUCTION J. Acoust. Soc. Am. 103 (2), February /98/103(2)/1080/5/$ Acoustical Society of America 1080

Influence of acoustic complexity on spatial release from masking and lateralization

Trading Directional Accuracy for Realism in a Virtual Auditory Display

Hearing in the Environment

Publication VI. c 2007 Audio Engineering Society. Reprinted with permission.

Comment by Delgutte and Anna. A. Dreyer (Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston, MA)

Neural correlates of the perception of sound source separation

Auditory temporal order and perceived fusion-nonfusion

Infant Hearing Development: Translating Research Findings into Clinical Practice. Auditory Development. Overview

Combating the Reverberation Problem

Sound Localization in Multisource Environments: The Role of Stimulus Onset Asynchrony and Spatial Uncertainty

HEARING AND PSYCHOACOUSTICS

Hearing II Perceptual Aspects

Speech segregation in rooms: Effects of reverberation on both target and interferer

Systems Neuroscience Oct. 16, Auditory system. http:

HCS 7367 Speech Perception

Temporal offset judgments for concurrent vowels by young, middle-aged, and older adults

Physiological measures of the precedence effect and spatial release from masking in the cat inferior colliculus.

Topic 4. Pitch & Frequency

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

Frequency refers to how often something happens. Period refers to the time it takes something to happen.

Effect of spectral content and learning on auditory distance perception

Auditory Scene Analysis: phenomena, theories and computational models

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics

I. INTRODUCTION. NL-5656 AA Eindhoven, The Netherlands. Electronic mail:

Minimum Audible Angles Measured with Simulated Normally-Sized and Oversized Pinnas for Normal-Hearing and Hearing- Impaired Test Subjects

IN EAR TO OUT THERE: A MAGNITUDE BASED PARAMETERIZATION SCHEME FOR SOUND SOURCE EXTERNALIZATION. Griffin D. Romigh, Brian D. Simpson, Nandini Iyer

Categorical Perception

Combination of binaural and harmonic. masking release effects in the detection of a. single component in complex tones

Binaural processing of complex stimuli

Linguistic Phonetics Fall 2005

Spectral processing of two concurrent harmonic complexes

Sound Localization PSY 310 Greg Francis. Lecture 31. Audition

Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking

Study of perceptual balance for binaural dichotic presentation

HST.723J, Spring 2005 Theme 3 Report

Quantifying temporal ventriloquism in audio-visual synchrony perception

3-D Sound and Spatial Audio. What do these terms mean?

Binaural Hearing. Steve Colburn Boston University

Role of F0 differences in source segregation

Processing Interaural Cues in Sound Segregation by Young and Middle-Aged Brains DOI: /jaaa

The masking of interaural delays

B. G. Shinn-Cunningham Hearing Research Center, Departments of Biomedical Engineering and Cognitive and Neural Systems, Boston, Massachusetts 02215

Using 3d sound to track one of two non-vocal alarms. IMASSA BP Brétigny sur Orge Cedex France.

Brian D. Simpson Veridian, 5200 Springfield Pike, Suite 200, Dayton, Ohio 45431

Robust Neural Encoding of Speech in Human Auditory Cortex

P. M. Zurek Sensimetrics Corporation, Somerville, Massachusetts and Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

William A. Yost and Sandra J. Guzman Parmly Hearing Institute, Loyola University Chicago, Chicago, Illinois 60201

The basic hearing abilities of absolute pitch possessors

Effects of Age and Hearing Loss on the Processing of Auditory Temporal Fine Structure

Localization in the Presence of a Distracter and Reverberation in the Frontal Horizontal Plane. I. Psychoacoustical Data

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music)

The extent to which a position-based explanation accounts for binaural release from informational masking a)

Effect of microphone position in hearing instruments on binaural masking level differences

Some methodological aspects for measuring asynchrony detection in audio-visual stimuli

Linguistic Phonetics. Basic Audition. Diagram of the inner ear removed due to copyright restrictions.

Variation in spectral-shape discrimination weighting functions at different stimulus levels and signal strengths

Congruency Effects with Dynamic Auditory Stimuli: Design Implications

Effect of mismatched place-of-stimulation on the salience of binaural cues in conditions that simulate bilateral cochlear-implant listening

Audio Quality Assessment

The role of tone duration in dichotic temporal order judgment (TOJ)

The role of low frequency components in median plane localization

J Jeffress model, 3, 66ff

= + Auditory Scene Analysis. Week 9. The End. The auditory scene. The auditory scene. Otherwise known as

Loudness Processing of Time-Varying Sounds: Recent advances in psychophysics and challenges for future research

Juha Merimaa b) Institut für Kommunikationsakustik, Ruhr-Universität Bochum, Germany

Hearing. Juan P Bello

Lateralized speech perception in normal-hearing and hearing-impaired listeners and its relationship to temporal processing

Tactile Communication of Speech

2201 J. Acoust. Soc. Am. 107 (4), April /2000/107(4)/2201/8/$ Acoustical Society of America 2201

HCS 7367 Speech Perception

AUDITORY GROUPING AND ATTENTION TO SPEECH

Evidence specifically favoring the equalization-cancellation theory of binaural unmasking

Abstract. 1. Introduction. David Spargo 1, William L. Martens 2, and Densil Cabrera 3

Release from speech-on-speech masking by adding a delayed masker at a different location

Yonghee Oh, Ph.D. Department of Speech, Language, and Hearing Sciences University of Florida 1225 Center Drive, Room 2128 Phone: (352)

Issues faced by people with a Sensorineural Hearing Loss

EFFECTS OF TEMPORAL FINE STRUCTURE ON THE LOCALIZATION OF BROADBAND SOUNDS: POTENTIAL IMPLICATIONS FOR THE DESIGN OF SPATIAL AUDIO DISPLAYS

Release from informational masking in a monaural competingspeech task with vocoded copies of the maskers presented contralaterally

BINAURAL DICHOTIC PRESENTATION FOR MODERATE BILATERAL SENSORINEURAL HEARING-IMPAIRED

A NOVEL HEAD-RELATED TRANSFER FUNCTION MODEL BASED ON SPECTRAL AND INTERAURAL DIFFERENCE CUES

PARAMETERS OF ECHOIC MEMORY

Even though a large body of work exists on the detrimental effects. The Effect of Hearing Loss on Identification of Asynchronous Double Vowels

SPHSC 462 HEARING DEVELOPMENT. Overview Review of Hearing Science Introduction

3-D SOUND IMAGE LOCALIZATION BY INTERAURAL DIFFERENCES AND THE MEDIAN PLANE HRTF. Masayuki Morimoto Motokuni Itoh Kazuhiro Iida

The role of periodicity in the perception of masked speech with simulated and real cochlear implants

Representation of sound in the auditory nerve

Acoustics Research Institute

2/25/2013. Context Effect on Suprasegmental Cues. Supresegmental Cues. Pitch Contour Identification (PCI) Context Effect with Cochlear Implants

Transcription:

On the influence of interaural differences on onset detection in auditory object formation Othmar Schimmel Eindhoven University of Technology, P.O. Box 513 / Building IPO 1.26, 56 MD Eindhoven, The Netherlands, e-mail: o.v.schimmel@tm.tue.nl, Armin Kohlrausch Eindhoven University of Technology, P.O. Box 513 / Building IPO 1.33, 56 MD Eindhoven, The Netherlands, and Philips Research Laboratories, Prof. Holstlaan 4 / Building WO 2, 5656 AA Eindhoven, The Netherlands, e-mail: armin.kohlrausch@philips.com The research reported here explored a possible influence of interaural differences of a broadband target sound of several different durations on detection of its onset in a diotic broadband masker sound. For two positions in the lateral plane, five target/marker duration combinations and various signal-to-noise ratios, the subject s ability to accurately align a lateralized Gaussian noise target s onset to the meter of a regular series of marker pulses in a Gaussian noise masker was investigated. It was found that decreasing the signalto-noise ratio between target and masker yields a coherent degradation in onset positioning accuracy, used as a quantitative measure for detection of the onset, for all the lateralization conditions and target/marker duration combinations applied. 1 Introduction Perceptual stream segregation, or auditory object formation, is defined as the human listener s ability to group signal components, and consequentially separate discrete sources from a complex mixture of sounds. Grouping of signal components is established in the human hearing system by a number of cues that adhere to Gestaltlike principles underlying this auditory organization, like onset/offset synchrony, harmonicity (the ratio between the frequencies of the constituent signal components), common modulation, and spatial localization [1]. These so-called primitive organization principles are assumed to be innate to the human hearing system, and therefore subject independent. Of particular interest to initiate the grouping of signal components are rapid increases in intensity, if they have a level above detection threshold and occur simultaneously in several decomposed spectral parts of an acoustic mixture, the onsets. Such synchronous onsets are likely to come from the same physical source, and are combined to form a structure that corresponds to a perceptual stream or auditory object. Once a stream is formed on the basis of a detected onset, it is further monitored on frequency content and modulations. Harmonicity, frequency modulation, amplitude modulation, and directional information over time then continue to support the structure, until the energy in the concerning spectral parts decreases below perception threshold and the auditory object is supposed to have disappeared. The onset of a sound is commonly defined as the beginning of a discrete event in an acoustical signal, providing information about when exactly something is happening. The percept of an onset is caused by a noticeable change in intensity, pitch or timbre of a sound [2]. In studying the effects of rate and asynchrony of tone onsets on the discriminability of onset order, Bregman et al. [3, 4] found that the larger the onset asynchrony and the more sudden the onset, the stronger the segregation from concurrent sounds in a mixture. Theoretically an onset can be detected at the moment a certain level of energy, or change in energy over frequency parts, is reached to make it stand out from an acoustical complex, and a new sound can actually be observed. Less obvious is the role of spatial localization in forming a new auditory object. Interaural differences in time and intensity enable localization of sound sources, and assist the human auditory system in segregating sounds from and within complex acoustic scenes, due to a spatial release from masking, commonly referred to as the binaural masking level difference [5, 6]. Differences in location of concurrent sound sources provide the physical cues to support efficient listening in so-called cocktailparty conditions [7, 8]. Since directional information is part of every sound when listening binaurally, and it is already present at the onset of a sound, it is not a priori clear what its contribution to grouping is. Do the directional cues in interaural disparities support grouping of signal components and thus make it easier to detect the onset, or is grouping of signal components by onset detection required for deriving localization information? An established effect of onset on sound source localization is the precedence effect, i.e. the observation that localization of a sound in a reverberant environment is mainly determined by the interaural cues of the firstarriving direct sound to the neglect of later-arriving reflections [9, 1]. Zurek [11] observed that interaural sensitivity to changes in both time and intensity follows a non-monotonic course after the abrupt onset of a sound, and from his experiments the precedence effect can be 159

Forum Acusticum 25 Budapest seen as a result of a temporary degradation of interaural sensitivity, between approximately.5 and 1 ms after an immediate onset. In assessing a model of auditory object formation, Woods and Colburn [12] assumed object formation to take place before allocation of perceptual features like direction of a sound source. This implies object formation and localization to be independent processes. Drennan et al. [13] found, in their research on the role of spatial location on perceptual segregation, auditory object formation to be depending on interaural differences in time and intensity. Although binaural cues may not be required to segregate a perceptual stream when monaural cues are strong enough, they may contribute concurrently to segregation, and a parallel process can not be excluded. In general, onsets are shown to have an influence on both auditory object formation and spatial localization. The research presented here was performed as continuation of our previous research [14] that explored the influence of interaural differences on forming a new auditory object, i.e. detecting its onset. The question was addressed whether the availability of different directional cues for two sounds improves their onset detection, and thereby affects the ability to segregate one sound from another, compared to a situation without any difference in directional information in the two sounds. The focus of the current research in the same paradigm was on exploring the possible effects of duration of a lateralized sound on the ability to detect its onset. 2 Experiment 2.1 Method To determine a sound s onset detection, an experimental procedure was adapted from speech perception research, based on perceptual center location [15, 16]. The perceptual center procedure comprises judging the temporal position of a target sound, and/or moving a target sound along the time axis of an acoustical stimulus, relative to isochronous reference sounds in the stimulus, in order to Level (db) 76 7 t 5 1 15 2 25 Figure 1: Example of an acoustic stimulus, in which the noise target (white bar) is randomly positioned at about +2 ms from the temporal center at 1 ms. find the desired temporal alignment, or perceptual center. These P-centers are related to the acoustic makeup of the stimulus, and in speech they are determined both by vowel onset and duration of preceding consonants. Since perceptually regular lists have regular P-centers, the assumption in the adaptation here is that perceptual isochrony can be achieved with isochronous onsets of the stimulus components. The stimulus structure used in this experiment consisted of a regular series of five marker pulses of which the middle marker pulse was omitted, against a continuous masker (see Figure 1 for a graphical representation of a stimulus). The subject s task was to temporally align the spatialized target (in white) in the masker (in light gray), randomly positioned between the second and the third marker pulse at the start of the procedure, to the meter of these clearly audible marker pulses (in dark gray). The time difference ( t) between the adjusted temporal position of the onset of the target, from the bisection of the time between the two flanking marker pulse onsets (the dotted line at 1 ms) was recorded, and served as a quantitative measure for the positioning accuracy. From these timing data, the positioning accuracy and bias of differently spatialized targets could be obtained. A monotonic relationship between positioning accuracy and the underlying dimension of onset detection is assumed. In accordance with Marcus [15], the step size for timing adjustments in this experiment was set to 5 ms, which corresponds to about the minimum sensitivity for perceptual center changes found. The subject controlled the experiment from the computer keyboard. After an initial playback of the stimulus with a randomly positioned target, the target position could be adjusted by pressing f (for forward ) or b (for backward ) any number of times the target was to be shifted by the step size. Confirmation by pressing Enter would start the playback of the stimulus with the target at the new temporal position. This procedure could be repeated until the subject was satisfied with the target s fit to the meter of the marker pulses. The final time instant of the target sound on the time line of the stimulus was then recorded as the perceived moment of occurrence. The experiment was performed in an acoustically isolated listening room at the Philips Research Laboratories, using a computer running MATLAB software to digitally generate the stimuli, and automate the experiment and data collection. The stimuli were converted to the analog domain by a Marantz CDA-94 two-channel 16-bit digitalto-analog converter at a sampling rate of 44.1 khz, and presented to the subjects over Beyer Dynamic DT99 Pro headphones. A within-subject experimental design with counterbalanced complete randomization was applied. Each condition was presented twice to each subject, and the results were averaged across subjects. 151

Forum Acusticum 25 Budapest 2.2 Stimuli Three types of sound were used: a continuous noise masker with a level of 7 db SPL, a number of 5-ms or 35-ms marker pulses with a level of 76 db SPL, and a lateralized noise target of different durations, at a number of different levels. All types were noises with independent Gaussian probability distributions and had immediate onsets and offsets, to enable detection at the actual start of the stimulus constituents, and to enable their unambiguous fixation on the stimulus time line. The applied target durations were 5 ms, 5 ms or 35 ms. The 5-ms target/ 5-ms marker combination was established earlier, under otherwise equal experimental conditions [14], and these data were pooled with the results of the current experiment. The noise masker and the marker pulses were identical at both ears, to establish lateralization in the center. The noise target was presented with different interaural parameters to yield lateralization at two different positions, one in the center ([C]) and one to the far right ([R]). This latter case was achieved by usage and controlled manipulation of the interaural time difference (ITD) or interaural level difference (ILD). The value of the interaural time difference in the ITD [R] condition was set at 29 samples at 44.1 khz, to approximate a 66-µs lag at the left ear. The value of the interaural level difference in the ILD [R] condition was set at 18 db: target level +9 db at the right ear, and target level -9 db at the left ear. Five different sound pressure levels were applied to the noise target: four, from to +3 db relative to the previously established detection threshold of the noise target, measured per lateralization condition per target duration per subject (cf. [14]), and one with the same level as the noise masker level, 7 db. By choosing target levels relative to the corresponding threshold, the sensation levels of the target are equal in all different conditions and data can be pooled across subjects. The measurements at a target level equal to the noise masker level, measured both with and without noise masker, served as control conditions for assessing the temporal positioning accuracy at high signal-to-noise ratios. 2.3 Subjects Five male subjects participated voluntarily in the experiment. Their ages ranged from 26 to 38 years, with an average age of 3.8 years. All subjects had previous experience in listening tests, and reported normal hearing. Training for this particular experiment was thought not to be required, since subjects were allowed to reiterate the procedure for each condition as long as necessary to reach a high confidence level about the result, before continuing with the next condition. 2.4 Results Figure 2 gives an overview of the results per lateralization condition. Indicated on the abscissae is the target level with decreasing signal-to-noise ratio from left to right. To the left are displayed the two control conditions at a target level equal to the 7 db SPL noise masker [without noise masker, indicated as target level 7 db SPL ( ), and with noise masker, indicated as target level 7 db SPL (+)], to the right are the four target levels relative to the detection threshold in the noise masker for that particular condition (target levels to ). The ordinates show the time deviation from the temporal center. The five lines connecting the different symbols represent the mean results of each target/marker duration combination. Negative values signify onsets positioned earlier than the temporal center, positive values signify onsets positioned later than the temporal center. Please note that the target level 7 db SPL [+] is missing for the target/marker combination, since this target level is below detection threshold of the 5-ms noise target in the 7-dB noise masker. With regard to these mean time deviations from the temporal center, the following is shown in Figure 2. A possible bias in the time deviations of all five target/marker duration combinations in the monaural [C] condition (upper panel) is constrained to a range of about ±3 ms, as already established in the control condition without the noise masker, i.e. at an infinite signal-to-noise ratio [7 db SPL ( )]. The means of the four target/marker combinations with the 5 and 5-ms marker durations in general stay close to the temporal center with a decrease in signal-to-noise ratio, although mainly at the negative side, at all target levels that include the noise masker, except at the lowest target level ( db SL). The means of the 5/35 target/marker combination are predominantly at the positive side, and have larger time deviations than the other target/marker combinations. In both binaural [R] conditions (middle and lower panel), the results in the control condition without the noise masker and the upper limits are very similar to the [C] condition. However, the four target/marker combinations with the 5 and 5-ms marker durations exhibit a considerable shift in mean toward the negative at decreasing signal-to-noise ratios. For example, the mean time deviation in positioning a 35-ms target with 5-ms markers at 2 db above detection threshold is more than twice (ILD [R]) or four times (ITD [R]) the magnitude of the time deviation at the control condition without the noise masker. Again, the means of the 5/35 target/marker combination are mainly at the positive side, and have time deviations within the same range as in the monaural [C] condition. In general, the mean time deviations of the various target/marker duration combinations are seen to vary more between the different target levels in the ILD [R] condition than in the ITD [R] condition. 1511

Forum Acusticum 25 Budapest 1 5 5 1 7 db SPL ( ) 1 5 5 5/35 3 1 7 db SPL ( ) 1 5 5 5/35 3 5/35 3 1 7 db SPL ( ) 7 db SPL (+) 7 db SPL (+) 7 db SPL (+) [C] ITD [R] ILD [R] db SL db SL db SL Figure 2: Means of the time deviation from the temporal center, per lateralization condition, N=1. Analysis of variance on the pooled raw data reveals, next to the expected significant differences at the 5% level between the lateralization conditions (F (2,81) = 8.2, p =., R 2 =.2) and target levels (F (5,81) = 4., p =.1, R 2 =.2), the differences between the target/marker duration combinations to be significant and account for 9% of the variance in the data (F (5,81) = 22.39, p =., R 2 =.9). Furthermore, the interactions of lateralization condition with target level and target/marker combination are also significant (respectively F (1,81) = 2.28, p =.12, R 2 =.2 and F (8,81) = 2.3, p =.19, R 2 =.2). Four out of five significant effects do not exceed a correlation ratio of.2, indicating only minor contributions to the variance in the data. In a post-hoc analysis, the results in the [C] condition are found to be significantly different from those of both ITD [R] and ILD [R], which do not differ significantly from each other. Also not surprisingly, most target levels with low signal-to-noise ratio are found to be different at the 5% level from the two control conditions with high signal-to-noise ratio. With regard to the target/marker duration combinations, with a correlation ratio of.9 accounting for the largest part of the variance next to the error, the ones with short durations (, and ) are not significantly different from each other. However, all three do differ significantly from the combinations with the longest durations (5/35 and 3), which in their turn also differ from each other. Since the interest was mainly in the accuracy of positioning the target s onset to the regular meter of marker pulse onsets, regardless of a possible bias, the standard deviations were used for further analysis. In analogy to the previous figure, Figure 3 gives an overview per lateralization condition of the standard deviations of the time deviations, for the five applied target/marker duration combinations. As can be seen in all three panels of Figure 3, in the control condition without the noise masker [7 db SPL ( )] the standard deviations of all target/marker duration combinations are generally the smallest. With introduction of the noise masker, the standard deviations of all target/marker duration combinations increase with a decrease in signal-to-noise ratio, showing similar reductions in accuracy of target s onset positioning for all three lateralization conditions. However, the precision in the ability to align the target s onset of the 35-ms target condition (3) is seen to decline more rapidly with a decrease in signal-to-noise ratio in the ITD [R] condition, and reach the maximum standard deviations found in both binaural conditions. The other four target/marker combinations stay within the comparable ranges, exhibiting equivalent positioning accuracy of the target at low signal-to-noise ratios for all lateralization conditions. 1512

Forum Acusticum 25 Budapest 3 Discussion At the highest signal-to-noise ratio [7 db SPL ( )], the means are consistently closer to zero for the target/marker combinations with short durations (, and ) than for the ones with the longer durations (5/35 and 3), suggesting a shift in P-center for the applied 35-ms target and markers. For the long marker combination (5/35), the target is positioned on average later (about 27 ms), and for the long target combination (3), the target is positioned on average earlier (about 25 ms) than the temporal center. Occuring in the control condition with the highest signal-to-noise ratio, this leads to the conclusion that the P-center of the 35-ms target and marker is shifted by approximately the same amount away from the corresponding perceptual center of the 5-ms and 5-ms durations. Supported by the statistical analysis on the pooled raw data, the overall results of these two target/marker combinations are seen as different from each other and from the other three target/marker combinations. This change in P-center of the 35-ms sounds then accounts for the larger time deviations to either side of the temporal center found. It remains, however, a surprising observation that the P-center of these longer sounds is no longer associated with the sound s onset, despite immediate onsets and an infinite signal-to-noise ratio. When comparing between the monaural and both binaural conditions, from high to low signal-to-noise ratios, the shift in means toward the negative for the four target/marker combinations with short marker durations (,, and 3) may reveal a change in the so-called onset effect (cf. [17]). In the case of perceptual ambiguity, due to the inability to perceive a clear onset cue at low signal-to-noise ratios, the binaural hearing system is to look for fine-structure cues in the ongoing part of the signal to establish or support the aggravated lateralization. The required processing of the binaural hearing system to resolve and integrate over time the interaural differences in the acoustical input may interfere with perceiving the exact onset of its constituent elements. In line with Drennan et al. [13], this leads to the conclusion that realization of a new auditory object under these circumstances is dependent on processing of binaural cues. Target/marker combinations with short durations (, and ) are found to yield similar accuracy in temporal positioning of the noise target s onset, i.e. the or combinations do not lead to more precise onset detection than the combination. Given the established sensitivity for P-center variation [15], this finding shows that, for durations of 5 ms, subjects indeed focus on the onset of both marker and target, supporting our previous conclusions with regard to the influence of interaural differences on onset detection of the 5-ms target and markers [14]. 1 75 5 25 7 db SPL ( ) 1 75 5 25 5/35 3 7 db SPL ( ) 1 75 5 25 5/35 3 5/35 3 7 db SPL ( ) 7 db SPL (+) 7 db SPL (+) 7 db SPL (+) [C] ITD [R] ILD [R] db SL db SL db SL Figure 3: Standard deviations of the time deviation from the temporal center, per lateralization condition, N=1. 1513

Forum Acusticum 25 Budapest 4 Conclusion Different methods for lateralization of a target sound were applied with different though coherent results in influencing the onset detectability of a Gaussian noise target of several durations and levels in a continuous Gaussian noise masker. It was found that the ability to accurately align the onset of a noise target to a regular meter of marker pulses is influenced by the target s directional information (time deviations are generally larger in the binaural conditions), level (time deviations increase with decreasing signal-to-noise ratio) and duration (time deviations are smaller for short durations). The obtained data show a decrease in positioning accuracy, but no significant bias, at low signal-to-noise ratios for the short target durations. The longer target duration yielded a shift in the perceived onset, indicating a more complex interaction of the target s features on determining its onset. The significant differences between lateralization conditions, between target levels, and between target/marker duration combinations suggest binaural processing to unmask the noise target from the noise masker, due to interaural disparities and weak onset cues, to be responsible for deteriorated performance in detecting and temporally positioning the noise target s onset at low signal-to-noise ratios. With regard to auditory object formation, this reduced accuracy indicates that for long sounds there is less advantage in binaural hearing for a more complex task such as precise onset detection than for simply detection. References [1] A.S. Bregman, Auditory scene analysis: The perceptual organization of sound, MIT Press, Cambridge (MA) (199) [2] A. Klapuri, Sound onset detection by applying psychoacoustic knowledge. Proc. ICASSP 1999, pp. 389 392. (1999) [3] A.S. Bregman, P. Ahad, J. Kim and L. Melnerich, Resetting the pitch-analysis system: 1. Effects of rise times of tones in noise backgrounds or harmonics in a complex tone, Perception and Psychophysics, Vol. 56. pp. 155 162. (1994) [4] A.S. Bregman, P.A. Ahad and J. Kim, Resetting the pitch-analysis system: 2. Role of sudden onset and offsets in the perception of individual components in a cluster of overlapping tones, Journal of the Acoustical Society of America, Vol. 96. pp. 2694 273. (1994) [5] I.J. Hirsh, The influence of interaural phase on interaural summation and inhibition, Journal of the Acoustical Society of America, Vol. 2. pp. 536 544. (1948) [6] J.C.R. Licklider, The influence of interaural phase relations upon the masking of speech by white noise, Journal of the Acoustical Society of America, Vol. 2. pp. 15 159. (1948) [7] E.C. Cherry, Some experiments on the recognition of speech, with one and with two ears, Journal of the Acoustical Society of America, Vol. 25. pp. 975 979. (1953) [8] W.A. Yost, The cocktail party problem: forty years later. Binaural and spatial hearing in real and virtual environments, edited by R.H. Gilkey and T.R. Anderson, Erlbaum, Mahwah (NJ), pp. 329 347. (1997) [9] H. Wallach, E.B. Newman and M.R. Rosenzweig, The precedence effect in sound localization, American Journal of Psychology, Vol. 62. pp. 315 336. (1949) [1] R.Y. Litovsky, S.H. Colburn, W.A. Yost and S.J. Guzman, The precedence effect, Journal of the Acoustical Society of America, Vol. 16. pp. 1633 1654. (1999) [11] P.M. Zurek, The precedence effect and its possible role in the avoidance of interaural ambiguities, Journal of the Acoustical Society of America, Vol. 67. pp. 952 964. (198) [12] W.S. Woods and H.S. Colburn, Test of a model of auditory object formation using intensity and interaural time difference discrimination, Journal of the Acoustical Society of America, Vol. 91. pp. 2894 292. (1992) [13] W.R. Drennan, S. Gatehouse and C. Lever, Perceptual segregation of competing speech sounds: The role of spatial location, Journal of the Acoustical Society of America, Vol. 114. pp. 2178 2189. (23) [14] O. Schimmel and A. Kohlrausch, Principles of auditory object formation: On the influence of spatial information on onset detection, Proc. Joint Congress CFA/DAGA 24, Strasbourg, Vol. 2. pp. 885 886. (24) [15] S.M. Marcus. Perceptual centers, Ph.D. dissertation, King s College, Cambridge (UK). (1976) [16] B. Pompino-Marschall, On the psychoacoustic nature of the P-center phenomenon, Journal of Phonetics Vol. 17. pp. 175-192. (1989) [17] P.M. Zurek, A note on onset effects in binaural hearing, Journal of the Acoustical Society of America, Vol. 93. pp. 12 121. (1993) 1514