Auditory Scene Analysis: phenomena, theories and computational models

Size: px
Start display at page:

Download "Auditory Scene Analysis: phenomena, theories and computational models"

Transcription

1 Auditory Scene Analysis: phenomena, theories and computational models July 1998 Dan Ellis International Computer Science Institute, Berkeley CA Outline The computational theory of ASA Cues & grouping Expectations & inference Big issues ASA - Dan Ellis 1998jul11-1

2 Auditory Scene Analysis What does our sense of hearing do? - recover useful information... about objects of interest... in a wide range of circumstances Measuring objects in an auditory scene: ASA - Dan Ellis 1998jul11-2

3 Subjective analysis of auditory scenes f/hz City Horn1 (10/10) S9 horn 2 S10 car horn S4 horn1 S6 double horn S2 first double horn S7 horn S7 horn2 S3 1st horn S5 Honk S8 car horns S1 honk, honk Crash (10/10) S7 gunshot S8 large object crash S6 slam S9 door Slam? S2 crash S4 crash S10 door slamming S5 Trash can S3 crash (not car) S1 slam Horn2 (5/10) S9 horn 5 S8 car horns S2 horn during crash S6 doppler horn S7 horn3 Truck (7/10) S8 truck engine S2 truck accelerating S5 Acceleration S1 rev up/passing S6 acceleration S3 closeup car S10 wheels on road Horn3 (5/10) S7 horn4 S9 horn 3 S8 car horns S3 2nd horn S10 car horn Subjects identify structures in dense scenes with high agreement ASA - Dan Ellis 1998jul11-3

4 Outline The computational theory of ASA - ASA and CASA - The grouping paradigm - Marr s three levels of explanation Cues & grouping Expectations & inference Big issues ASA - Dan Ellis 1998jul11-4

5 Auditory Scene Analysis (ASA) The organization of sound scenes according to their inferred sources Real-world sounds rarely occur in isolation a useful sense of hearing must be able to segregate mixtures - people (and...) do this very well; unexpectedly difficult to model - depends on: subjective definition of relevant sources regularity/constraints of real-world sounds Studied via experimental psychology - characterize rules for organizing simple pieces (tones, noise bursts, clicks) i.e. reductive approach ASA - Dan Ellis 1998jul11-5

6 Computational Auditory Scene Analysis (CASA) Psychological rules suggest computer implementation -.. but many practical problems arise! Motivations: Practical applications - real-world interactive systems - indexing of media databases - hearing prostheses Crossover opportunities - unknown signal/information processing principles? Benefits for theory - implementations are very revealing ASA - Dan Ellis 1998jul11-6

7 The grouping paradigm Standard theory of ASA (Bregman, Darwin &c): - sound mixture is broken up into small elements e.g. time-frequency cells - each element has a number of feature dimensions (amplitude, ITD, period) - elements are grouped together according to their features to form larger structures - resulting groups have overall attributes (pitch, location) (from Darwin 1996) ASA - Dan Ellis 1998jul11-7

8 Marr s levels-of-explanation of information processing Three distinct aspects to info. processing Computational Theory Algorithm Implementation what and why ; the overall goal how ; an approach to meeting the goal practical realization of the process. Sound source organization Auditory grouping Feature calculation & binding Why bother? - to help organize understanding - avoid confusion/wasted effort use as an analysis tool... ASA - Dan Ellis 1998jul11-8

9 Level 1: Computational theory The underlying regularities that make the problem possible - i.e. the ecological facts Implicit definition of what is a source? : Independence of attributes between sources Continuity of attributes for each source + other source-specific constraints ASA - Dan Ellis 1998jul11-9

10 Level 2: Algorithm A particular approach to exploiting the constraints of the computational theory - both process & representation Audition: the elements-then-grouping approach - could have been otherwise e.g. templates Often the focus of analysis - but: debate is muddled without a clear computational theory ASA - Dan Ellis 1998jul11-10

11 Level 3: Implementation A specific realization of the algorithm - computer programs - neurons -... Can be analyzed separately? - provided epiphenomena are correctly assigned Needs context of algorithm, computational theory You cannot understand stereopsis simply by thinking about neurons ASA - Dan Ellis 1998jul11-11

12 The advantage of the appropriate level Computational theory - determines the purpose of the process; provides focus necessary for analysis e.g. biosonar: benefit of hyperresolution Algorithm - abstraction that is still specific, transferable e.g. autocorrelation for pitch Implementation - explain epiphenomena e.g. subjective octave from refractory period ASA - Dan Ellis 1998jul11-12

13 An example: Neural inhibition Computational theory Frequencydomain processing X(f) f Algorithm Discrete-time filtering (subtraction) Implementation Neurons with GABAergic inhibitions ASA - Dan Ellis 1998jul11-13

14 Summary 1 Acoustic scenes are very complex.. but the auditory system extracts useful information Grouping is the main focus of Auditory Scene Analysis.. but it fits into a larger Marrian framework ASA - Dan Ellis 1998jul11-14

15 Outline The computational theory of ASA Cues & grouping - Cue analysis - Simple scenes - Models - Complications: interaction, ambiguity, time Expectations & inference Big issues ASA - Dan Ellis 1998jul11-15

16 Cues to grouping Common onset/offset/modulation ( fate ) Common periodicity ( pitch ) Common onset Periodicity Computational theory Algorithm Implementation Acoustic consequences tend to be synchronized Group elements that start in a time range Onset detector cells Synchronized osc s? (Nonlinear) cyclic processes are common? Place patterns? Autocorrelation? Delay-and-mult? Modulation spect Spatial location (ITD, ILD, spectral cues) Sequential cues... Source-specific cues... ASA - Dan Ellis 1998jul11-16

17 Simple grouping E.g. isolated tones freq time Computational theory Algorithm Implementation common onset common period (harmonicity) locate elements (tracks) group by shared features? exhaustive search evolution in time ASA - Dan Ellis 1998jul11-17

18 Computer models of grouping Bregman at face value (e.g. Brown 1992): input mixture Front end signal features (maps) Object formation discrete objects Grouping rules Source groups freq time onset period frq.mod - feature maps - periodicity cue - common-onset boost - resynthesis ASA - Dan Ellis 1998jul11-18

19 Grouping model results Able to extract voiced speech: brn1h.aif frq/hz brn1h.fi.aif frq/hz time/s time/s Periodicity is the primary cue - how to handle aperiodic energy? Limitations - resynthesis via filter-mask - only periodic targets - robustness of discrete objects ASA - Dan Ellis 1998jul11-19

20 Complications for grouping: 1: Cues in conflict Mistuned harmonic (Moore, Darwin..): freq - harmonic usually groups by onset & periodicity - can alter frequency and/or onset time - degree of grouping from overall pitch match Gradual, various results: pitch shift time 3% mistuning - heard as separate tone, still affects pitch ASA - Dan Ellis 1998jul11-20

21 Complications for grouping: 2: The effect of time Added harmonics: freq time - onset cue initially segregates; periodicity eventually fuses The effect of time - some cues take time to become apparent - onset cue becomes increasingly distant... What is the impetus for fission? - e.g. double vowels - depends on what you expect..? ASA - Dan Ellis 1998jul11-21

22 Summary 2 Known grouping cues make sense Simple examples are straightforward Models can be implemented directly.. but problematic situations abound ASA - Dan Ellis 1998jul11-22

23 Outline The computational theory of ASA Cues & grouping Expectations & inference - Old-plus-new - Streaming - Restoration & illusions - Top-down models Big issues ASA - Dan Ellis 1998jul11-23

24 The effect of context Context can create an expectation : i.e. a bias towards a particular interpretation e.g. Bregman s old-plus-new principle: A change in a signal will be interpreted as an added source whenever possible freq/khz time/s - a different division of the same energy depending on what preceded it ASA - Dan Ellis 1998jul11-24

25 Streaming Successive tone events form separate streams freq. TRT: ms 1 khz f: ±2 octaves Order, rhythm &c within, not between, streams time Computational theory Algorithm Implementation Consistency of properties for successive source events expectation window for known streams (widens with time) competing time-frequency affinity weights... ASA - Dan Ellis 1998jul11-25

26 Restoration & illusions Direct evidence may be masked or distorted make best guess using available information E.g. the continuity illusion : f/hz ptshort time/s - tones alternates with noise bursts - noise is strong enough to mask tone... so listener discriminate presence - continuous tone distinctly perceived for gaps ~100s of ms Inference acts at low, preconscious level ASA - Dan Ellis 1998jul11-26

27 Speech restoration Speech provides very strong bases for inference (coarticulation, grammar, semantics): frq/hz nsoffee.aif Phonemic restoration time/s Temporal compound (1998jul10) Temporal compounds Sinewave speech (duplex?) f/bark S1 env.pf: time / ms ASA - Dan Ellis 1998jul11-27

28 Models of top-down processing Perception as a search for plausible explanations Prediction-driven CASA (PDCASA): input mixture Front end signal features Hypothesis management prediction errors Compare & reconcile hypotheses Noise components Periodic components predicted features Predict & combine An approach as well as an implementation... Key features: - complete explanation of all scene energy - vocabulary of periodic/noise/transient elements - multiple hypotheses - explanation hierarchy ASA - Dan Ellis 1998jul11-28

29 PDCASA for old-plus-new Incremental analysis t1 t2 t3 Input signal Time t1: initial element created Time t2: Additional element required Time t3: Second element finished ASA - Dan Ellis 1998jul11-29

30 PDCASA for the continuity illusion Subjects hear the tone as continuous... if the noise is a plausible masker f/hz ptshort Data-driven analysis gives just visible portions: Prediction-driven can infer masking: ASA - Dan Ellis 1998jul11-30

31 PDCASA analysis of a complex scene f/hz City Horn1 (10/10) Horn2 (5/10) Horn3 (5/10) Horn4 (8/10) Horn5 (10/10) f/hz Wefts f/hz Noise2,Click Weft5 Wefts6,7 Weft8 Wefts9 12 Crash (10/10) f/hz Noise Squeal (6/10) Truck (7/10) time/s db ASA - Dan Ellis 1998jul11-31

32 Marrian analysis of PDCASA Marr invoked to separate high-level function from low-level details Computational theory Algorithm Objects persist predictably Observations interact irreversibly Build hypotheses from generic elements Update by prediction-reconciliation Implementation??? It is not enough to be able to describe the response of single cells, nor predict the results of psychophysical experiments. Nor is it enough even to write computer programs that perform approximately in the desired way: One has to do all these things at once, and also be very aware of the computational theory... ASA - Dan Ellis 1998jul11-32

33 Summary 3 Perceptual processing is highly context-dependent Auditory system will use prior knowledge to fill-in gaps (subconsciously) Prediction-reconciliation models can encompass this behavior ASA - Dan Ellis 1998jul11-33

34 Outline The computational theory of ASA Cues & grouping Expectations & inference Big issues - the state of ASA and CASA - outstanding issues - discussion points ASA - Dan Ellis 1998jul11-34

35 The current state of ASA and CASA ASA - detailed descriptions of in vitro tests - some quite subtle effects explained (DV beats) but: how to extend to complex scenarios? CASA - numerous models, some convergence (mainly periodicity-based) - best results sound impressive (least plausible systems!) - applications in speech recognition? but: domains limited, poor robustness ASA - Dan Ellis 1998jul11-35

36 Big issues in CASA: Plausibility - correct level for human correspondence? - which phenomena are important to match? - how to implement symbolic-style processing? Top-down vs. bottom-up - different approaches to ambiguity, latency - how far down for top-down? - how far up for high level? - choice between extraction & inference? Integrating multiple cues (e.g. binaural) Other debates: - what is the real goal? - resynthesis - evaluation ASA - Dan Ellis 1998jul11-36

37 Big issues in ASA & CASA: Knowledge: how to acquire, represent & store... - short-term: context - long-term: memories - abstract: classes, generalities Attention: - what does it mean in these models? - limitation or important principle? ASA - Dan Ellis 1998jul11-37

38 Conclusions Real-world sounds are complex; scene-analysis is required We know certain cues & some rules, but real situations raise contradictions Current models handle obvious cases; robustness & generality are hard Many issues remain ASA - Dan Ellis 1998jul11-38

39 Discussion points Are Marr s levels important? Useful? Can you study levels in isolation? What do restoration phenomena imply about internal representations? Do we have an adequate account of an ASA algorithm? e.g. where do hypotheses come from? How important/challenging are phenomena like duplex perception, sinewave speech etc.? ASA - Dan Ellis 1998jul11-39

Automatic audio analysis for content description & indexing

Automatic audio analysis for content description & indexing Automatic audio analysis for content description & indexing Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 4 5 Auditory Scene Analysis (ASA) Computational

More information

Computational Auditory Scene Analysis: Principles, Practice and Applications

Computational Auditory Scene Analysis: Principles, Practice and Applications Computational Auditory Scene Analysis: Principles, Practice and Applications Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 4 5 Auditory Scene Analysis

More information

Computational Auditory Scene Analysis: An overview and some observations. CASA survey. Other approaches

Computational Auditory Scene Analysis: An overview and some observations. CASA survey. Other approaches CASA talk - Haskins/NUWC - Dan Ellis 1997oct24/5-1 The importance of auditory illusions for artificial listeners 1 Dan Ellis International Computer Science Institute, Berkeley CA

More information

Sound, Mixtures, and Learning

Sound, Mixtures, and Learning Sound, Mixtures, and Learning Dan Ellis Laboratory for Recognition and Organization of Speech and Audio (LabROSA) Electrical Engineering, Columbia University http://labrosa.ee.columbia.edu/

More information

Auditory scene analysis in humans: Implications for computational implementations.

Auditory scene analysis in humans: Implications for computational implementations. Auditory scene analysis in humans: Implications for computational implementations. Albert S. Bregman McGill University Introduction. The scene analysis problem. Two dimensions of grouping. Recognition

More information

Recognition & Organization of Speech and Audio

Recognition & Organization of Speech and Audio Recognition & Organization of Speech and Audio Dan Ellis Electrical Engineering, Columbia University http://www.ee.columbia.edu/~dpwe/ Outline 1 2 3 4 5 Introducing Tandem modeling

More information

Lecture 4: Auditory Perception. Why study perception?

Lecture 4: Auditory Perception. Why study perception? EE E682: Speech & Audio Processing & Recognition Lecture 4: Auditory Perception 1 2 3 4 5 6 Motivation: Why & how Auditory physiology Psychophysics: Detection & discrimination Pitch perception Speech perception

More information

INTRODUCTION J. Acoust. Soc. Am. 103 (2), February /98/103(2)/1080/5/$ Acoustical Society of America 1080

INTRODUCTION J. Acoust. Soc. Am. 103 (2), February /98/103(2)/1080/5/$ Acoustical Society of America 1080 Perceptual segregation of a harmonic from a vowel by interaural time difference in conjunction with mistuning and onset asynchrony C. J. Darwin and R. W. Hukin Experimental Psychology, University of Sussex,

More information

Using Source Models in Speech Separation

Using Source Models in Speech Separation Using Source Models in Speech Separation Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

Computational Perception /785. Auditory Scene Analysis

Computational Perception /785. Auditory Scene Analysis Computational Perception 15-485/785 Auditory Scene Analysis A framework for auditory scene analysis Auditory scene analysis involves low and high level cues Low level acoustic cues are often result in

More information

Sound, Mixtures, and Learning

Sound, Mixtures, and Learning Sound, Mixtures, and Learning Dan Ellis oratory for Recognition and Organization of Speech and Audio () Electrical Engineering, Columbia University http://labrosa.ee.columbia.edu/

More information

Auditory Scene Analysis

Auditory Scene Analysis 1 Auditory Scene Analysis Albert S. Bregman Department of Psychology McGill University 1205 Docteur Penfield Avenue Montreal, QC Canada H3A 1B1 E-mail: bregman@hebb.psych.mcgill.ca To appear in N.J. Smelzer

More information

Hearing in the Environment

Hearing in the Environment 10 Hearing in the Environment Click Chapter to edit 10 Master Hearing title in the style Environment Sound Localization Complex Sounds Auditory Scene Analysis Continuity and Restoration Effects Auditory

More information

Recognition & Organization of Speech and Audio

Recognition & Organization of Speech and Audio Recognition & Organization of Speech and Audio Dan Ellis Electrical Engineering, Columbia University Outline 1 2 3 4 5 Sound organization Background & related work Existing projects

More information

Lecture 3: Perception

Lecture 3: Perception ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 3: Perception 1. Ear Physiology 2. Auditory Psychophysics 3. Pitch Perception 4. Music Perception Dan Ellis Dept. Electrical Engineering, Columbia University

More information

Modulation and Top-Down Processing in Audition

Modulation and Top-Down Processing in Audition Modulation and Top-Down Processing in Audition Malcolm Slaney 1,2 and Greg Sell 2 1 Yahoo! Research 2 Stanford CCRMA Outline The Non-Linear Cochlea Correlogram Pitch Modulation and Demodulation Information

More information

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES Varinthira Duangudom and David V Anderson School of Electrical and Computer Engineering, Georgia Institute of Technology Atlanta, GA 30332

More information

Hearing II Perceptual Aspects

Hearing II Perceptual Aspects Hearing II Perceptual Aspects Overview of Topics Chapter 6 in Chaudhuri Intensity & Loudness Frequency & Pitch Auditory Space Perception 1 2 Intensity & Loudness Loudness is the subjective perceptual quality

More information

J Jeffress model, 3, 66ff

J Jeffress model, 3, 66ff Index A Absolute pitch, 102 Afferent projections, inferior colliculus, 131 132 Amplitude modulation, coincidence detector, 152ff inferior colliculus, 152ff inhibition models, 156ff models, 152ff Anatomy,

More information

Binaural Hearing. Why two ears? Definitions

Binaural Hearing. Why two ears? Definitions Binaural Hearing Why two ears? Locating sounds in space: acuity is poorer than in vision by up to two orders of magnitude, but extends in all directions. Role in alerting and orienting? Separating sound

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception Long-term spectrum of speech HCS 7367 Speech Perception Connected speech Absolute threshold Males Dr. Peter Assmann Fall 212 Females Long-term spectrum of speech Vowels Males Females 2) Absolute threshold

More information

Auditory Scene Analysis. Dr. Maria Chait, UCL Ear Institute

Auditory Scene Analysis. Dr. Maria Chait, UCL Ear Institute Auditory Scene Analysis Dr. Maria Chait, UCL Ear Institute Expected learning outcomes: Understand the tasks faced by the auditory system during everyday listening. Know the major Gestalt principles. Understand

More information

Robustness, Separation & Pitch

Robustness, Separation & Pitch Robustness, Separation & Pitch or Morgan, Me & Pitch Dan Ellis Columbia / ICSI dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/ 1. Robustness and Separation 2. An Academic Journey 3. Future COLUMBIA

More information

21/01/2013. Binaural Phenomena. Aim. To understand binaural hearing Objectives. Understand the cues used to determine the location of a sound source

21/01/2013. Binaural Phenomena. Aim. To understand binaural hearing Objectives. Understand the cues used to determine the location of a sound source Binaural Phenomena Aim To understand binaural hearing Objectives Understand the cues used to determine the location of a sound source Understand sensitivity to binaural spatial cues, including interaural

More information

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening AUDL GS08/GAV1 Signals, systems, acoustics and the ear Pitch & Binaural listening Review 25 20 15 10 5 0-5 100 1000 10000 25 20 15 10 5 0-5 100 1000 10000 Part I: Auditory frequency selectivity Tuning

More information

Sound localization psychophysics

Sound localization psychophysics Sound localization psychophysics Eric Young A good reference: B.C.J. Moore An Introduction to the Psychology of Hearing Chapter 7, Space Perception. Elsevier, Amsterdam, pp. 233-267 (2004). Sound localization:

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception Babies 'cry in mother's tongue' HCS 7367 Speech Perception Dr. Peter Assmann Fall 212 Babies' cries imitate their mother tongue as early as three days old German researchers say babies begin to pick up

More information

= + Auditory Scene Analysis. Week 9. The End. The auditory scene. The auditory scene. Otherwise known as

= + Auditory Scene Analysis. Week 9. The End. The auditory scene. The auditory scene. Otherwise known as Auditory Scene Analysis Week 9 Otherwise known as Auditory Grouping Auditory Streaming Sound source segregation The auditory scene The auditory system needs to make sense of the superposition of component

More information

The role of periodicity in the perception of masked speech with simulated and real cochlear implants

The role of periodicity in the perception of masked speech with simulated and real cochlear implants The role of periodicity in the perception of masked speech with simulated and real cochlear implants Kurt Steinmetzger and Stuart Rosen UCL Speech, Hearing and Phonetic Sciences Heidelberg, 09. November

More information

Cognitive Processes PSY 334. Chapter 2 Perception

Cognitive Processes PSY 334. Chapter 2 Perception Cognitive Processes PSY 334 Chapter 2 Perception Object Recognition Two stages: Early phase shapes and objects are extracted from background. Later phase shapes and objects are categorized, recognized,

More information

Recognition & Organization of Speech and Audio

Recognition & Organization of Speech and Audio Recognition & Organization of Speech and Audio Dan Ellis Electrical Engineering, Columbia University http://www.ee.columbia.edu/~dpwe/ Outline 1 2 3 4 Introducing Robust speech recognition

More information

Auditory gist perception and attention

Auditory gist perception and attention Auditory gist perception and attention Sue Harding Speech and Hearing Research Group University of Sheffield POP Perception On Purpose Since the Sheffield POP meeting: Paper: Auditory gist perception:

More information

On the influence of interaural differences on onset detection in auditory object formation. 1 Introduction

On the influence of interaural differences on onset detection in auditory object formation. 1 Introduction On the influence of interaural differences on onset detection in auditory object formation Othmar Schimmel Eindhoven University of Technology, P.O. Box 513 / Building IPO 1.26, 56 MD Eindhoven, The Netherlands,

More information

Consonant Perception test

Consonant Perception test Consonant Perception test Introduction The Vowel-Consonant-Vowel (VCV) test is used in clinics to evaluate how well a listener can recognize consonants under different conditions (e.g. with and without

More information

Auditory Scene Analysis in Humans and Machines

Auditory Scene Analysis in Humans and Machines Auditory Scene Analysis in Humans and Machines Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

Hearing. Juan P Bello

Hearing. Juan P Bello Hearing Juan P Bello The human ear The human ear Outer Ear The human ear Middle Ear The human ear Inner Ear The cochlea (1) It separates sound into its various components If uncoiled it becomes a tapering

More information

Role of F0 differences in source segregation

Role of F0 differences in source segregation Role of F0 differences in source segregation Andrew J. Oxenham Research Laboratory of Electronics, MIT and Harvard-MIT Speech and Hearing Bioscience and Technology Program Rationale Many aspects of segregation

More information

16.400/453J Human Factors Engineering /453. Audition. Prof. D. C. Chandra Lecture 14

16.400/453J Human Factors Engineering /453. Audition. Prof. D. C. Chandra Lecture 14 J Human Factors Engineering Audition Prof. D. C. Chandra Lecture 14 1 Overview Human ear anatomy and hearing Auditory perception Brainstorming about sounds Auditory vs. visual displays Considerations for

More information

HEARING AND PSYCHOACOUSTICS

HEARING AND PSYCHOACOUSTICS CHAPTER 2 HEARING AND PSYCHOACOUSTICS WITH LIDIA LEE I would like to lead off the specific audio discussions with a description of the audio receptor the ear. I believe it is always a good idea to understand

More information

HST.723J, Spring 2005 Theme 3 Report

HST.723J, Spring 2005 Theme 3 Report HST.723J, Spring 2005 Theme 3 Report Madhu Shashanka shashanka@cns.bu.edu Introduction The theme of this report is binaural interactions. Binaural interactions of sound stimuli enable humans (and other

More information

Robust Neural Encoding of Speech in Human Auditory Cortex

Robust Neural Encoding of Speech in Human Auditory Cortex Robust Neural Encoding of Speech in Human Auditory Cortex Nai Ding, Jonathan Z. Simon Electrical Engineering / Biology University of Maryland, College Park Auditory Processing in Natural Scenes How is

More information

Recognition & Organization of Speech & Audio

Recognition & Organization of Speech & Audio Recognition & Organization of Speech & Audio Dan Ellis http://labrosa.ee.columbia.edu/ Outline 1 2 3 Introducing Projects in speech, music & audio Summary overview - Dan Ellis 21-9-28-1 1 Sound organization

More information

What Is the Difference between db HL and db SPL?

What Is the Difference between db HL and db SPL? 1 Psychoacoustics What Is the Difference between db HL and db SPL? The decibel (db ) is a logarithmic unit of measurement used to express the magnitude of a sound relative to some reference level. Decibels

More information

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception ISCA Archive VOQUAL'03, Geneva, August 27-29, 2003 Jitter, Shimmer, and Noise in Pathological Voice Quality Perception Jody Kreiman and Bruce R. Gerratt Division of Head and Neck Surgery, School of Medicine

More information

IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 15, NO. 5, SEPTEMBER A Computational Model of Auditory Selective Attention

IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 15, NO. 5, SEPTEMBER A Computational Model of Auditory Selective Attention IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 15, NO. 5, SEPTEMBER 2004 1151 A Computational Model of Auditory Selective Attention Stuart N. Wrigley, Member, IEEE, and Guy J. Brown Abstract The human auditory

More information

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics 2/14/18 Can hear whistle? Lecture 5 Psychoacoustics Based on slides 2009--2018 DeHon, Koditschek Additional Material 2014 Farmer 1 2 There are sounds we cannot hear Depends on frequency Where are we on

More information

Representation of sound in the auditory nerve

Representation of sound in the auditory nerve Representation of sound in the auditory nerve Eric D. Young Department of Biomedical Engineering Johns Hopkins University Young, ED. Neural representation of spectral and temporal information in speech.

More information

General Soundtrack Analysis

General Soundtrack Analysis General Soundtrack Analysis Dan Ellis oratory for Recognition and Organization of Speech and Audio () Electrical Engineering, Columbia University http://labrosa.ee.columbia.edu/

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 THE DUPLEX-THEORY OF LOCALIZATION INVESTIGATED UNDER NATURAL CONDITIONS

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 THE DUPLEX-THEORY OF LOCALIZATION INVESTIGATED UNDER NATURAL CONDITIONS 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 27 THE DUPLEX-THEORY OF LOCALIZATION INVESTIGATED UNDER NATURAL CONDITIONS PACS: 43.66.Pn Seeber, Bernhard U. Auditory Perception Lab, Dept.

More information

Auditory System & Hearing

Auditory System & Hearing Auditory System & Hearing Chapters 9 and 10 Lecture 17 Jonathan Pillow Sensation & Perception (PSY 345 / NEU 325) Spring 2015 1 Cochlea: physical device tuned to frequency! place code: tuning of different

More information

group by pitch: similar frequencies tend to be grouped together - attributed to a common source.

group by pitch: similar frequencies tend to be grouped together - attributed to a common source. Pattern perception Section 1 - Auditory scene analysis Auditory grouping: the sound wave hitting out ears is often pretty complex, and contains sounds from multiple sources. How do we group sounds together

More information

Acoustics Research Institute

Acoustics Research Institute Austrian Academy of Sciences Acoustics Research Institute Modeling Modelingof ofauditory AuditoryPerception Perception Bernhard BernhardLaback Labackand andpiotr PiotrMajdak Majdak http://www.kfs.oeaw.ac.at

More information

Binaural interference and auditory grouping

Binaural interference and auditory grouping Binaural interference and auditory grouping Virginia Best and Frederick J. Gallun Hearing Research Center, Boston University, Boston, Massachusetts 02215 Simon Carlile Department of Physiology, University

More information

Topic 4. Pitch & Frequency

Topic 4. Pitch & Frequency Topic 4 Pitch & Frequency A musical interlude KOMBU This solo by Kaigal-ool of Huun-Huur-Tu (accompanying himself on doshpuluur) demonstrates perfectly the characteristic sound of the Xorekteer voice An

More information

Fundamentals of Cognitive Psychology, 3e by Ronald T. Kellogg Chapter 2. Multiple Choice

Fundamentals of Cognitive Psychology, 3e by Ronald T. Kellogg Chapter 2. Multiple Choice Multiple Choice 1. Which structure is not part of the visual pathway in the brain? a. occipital lobe b. optic chiasm c. lateral geniculate nucleus *d. frontal lobe Answer location: Visual Pathways 2. Which

More information

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds Hearing Lectures Acoustics of Speech and Hearing Week 2-10 Hearing 3: Auditory Filtering 1. Loudness of sinusoids mainly (see Web tutorial for more) 2. Pitch of sinusoids mainly (see Web tutorial for more)

More information

ELL 788 Computational Perception & Cognition July November 2015

ELL 788 Computational Perception & Cognition July November 2015 ELL 788 Computational Perception & Cognition July November 2015 Module 8 Audio and Multimodal Attention Audio Scene Analysis Two-stage process Segmentation: decomposition to time-frequency segments Grouping

More information

Systems Neuroscience Oct. 16, Auditory system. http:

Systems Neuroscience Oct. 16, Auditory system. http: Systems Neuroscience Oct. 16, 2018 Auditory system http: www.ini.unizh.ch/~kiper/system_neurosci.html The physics of sound Measuring sound intensity We are sensitive to an enormous range of intensities,

More information

Sonic Spotlight. SmartCompress. Advancing compression technology into the future

Sonic Spotlight. SmartCompress. Advancing compression technology into the future Sonic Spotlight SmartCompress Advancing compression technology into the future Speech Variable Processing (SVP) is the unique digital signal processing strategy that gives Sonic hearing aids their signature

More information

Lecture 9: Speech Recognition: Front Ends

Lecture 9: Speech Recognition: Front Ends EE E682: Speech & Audio Processing & Recognition Lecture 9: Speech Recognition: Front Ends 1 2 Recognizing Speech Feature Calculation Dan Ellis http://www.ee.columbia.edu/~dpwe/e682/

More information

3-D Sound and Spatial Audio. What do these terms mean?

3-D Sound and Spatial Audio. What do these terms mean? 3-D Sound and Spatial Audio What do these terms mean? Both terms are very general. 3-D sound usually implies the perception of point sources in 3-D space (could also be 2-D plane) whether the audio reproduction

More information

Lauer et al Olivocochlear efferents. Amanda M. Lauer, Ph.D. Dept. of Otolaryngology-HNS

Lauer et al Olivocochlear efferents. Amanda M. Lauer, Ph.D. Dept. of Otolaryngology-HNS Lauer et al. 2012 Olivocochlear efferents Amanda M. Lauer, Ph.D. Dept. of Otolaryngology-HNS May 30, 2016 Overview Structural organization Responses Hypothesized roles in hearing Olivocochlear efferent

More information

Yonghee Oh, Ph.D. Department of Speech, Language, and Hearing Sciences University of Florida 1225 Center Drive, Room 2128 Phone: (352)

Yonghee Oh, Ph.D. Department of Speech, Language, and Hearing Sciences University of Florida 1225 Center Drive, Room 2128 Phone: (352) Yonghee Oh, Ph.D. Department of Speech, Language, and Hearing Sciences University of Florida 1225 Center Drive, Room 2128 Phone: (352) 294-8675 Gainesville, FL 32606 E-mail: yoh@phhp.ufl.edu EDUCATION

More information

INTRODUCTION J. Acoust. Soc. Am. 100 (4), Pt. 1, October /96/100(4)/2352/13/$ Acoustical Society of America 2352

INTRODUCTION J. Acoust. Soc. Am. 100 (4), Pt. 1, October /96/100(4)/2352/13/$ Acoustical Society of America 2352 Lateralization of a perturbed harmonic: Effects of onset asynchrony and mistuning a) Nicholas I. Hill and C. J. Darwin Laboratory of Experimental Psychology, University of Sussex, Brighton BN1 9QG, United

More information

AUDITORY GROUPING AND ATTENTION TO SPEECH

AUDITORY GROUPING AND ATTENTION TO SPEECH AUDITORY GROUPING AND ATTENTION TO SPEECH C.J.Darwin Experimental Psychology, University of Sussex, Brighton BN1 9QG. 1. INTRODUCTION The human speech recognition system is superior to machine recognition

More information

Combating the Reverberation Problem

Combating the Reverberation Problem Combating the Reverberation Problem Barbara Shinn-Cunningham (Boston University) Martin Cooke (Sheffield University, U.K.) How speech is corrupted by reverberation DeLiang Wang (Ohio State University)

More information

Processing Interaural Cues in Sound Segregation by Young and Middle-Aged Brains DOI: /jaaa

Processing Interaural Cues in Sound Segregation by Young and Middle-Aged Brains DOI: /jaaa J Am Acad Audiol 20:453 458 (2009) Processing Interaural Cues in Sound Segregation by Young and Middle-Aged Brains DOI: 10.3766/jaaa.20.7.6 Ilse J.A. Wambacq * Janet Koehnke * Joan Besing * Laurie L. Romei

More information

Infant Hearing Development: Translating Research Findings into Clinical Practice. Auditory Development. Overview

Infant Hearing Development: Translating Research Findings into Clinical Practice. Auditory Development. Overview Infant Hearing Development: Translating Research Findings into Clinical Practice Lori J. Leibold Department of Allied Health Sciences The University of North Carolina at Chapel Hill Auditory Development

More information

Binaural processing of complex stimuli

Binaural processing of complex stimuli Binaural processing of complex stimuli Outline for today Binaural detection experiments and models Speech as an important waveform Experiments on understanding speech in complex environments (Cocktail

More information

The functional importance of age-related differences in temporal processing

The functional importance of age-related differences in temporal processing Kathy Pichora-Fuller The functional importance of age-related differences in temporal processing Professor, Psychology, University of Toronto Adjunct Scientist, Toronto Rehabilitation Institute, University

More information

2/25/2013. Context Effect on Suprasegmental Cues. Supresegmental Cues. Pitch Contour Identification (PCI) Context Effect with Cochlear Implants

2/25/2013. Context Effect on Suprasegmental Cues. Supresegmental Cues. Pitch Contour Identification (PCI) Context Effect with Cochlear Implants Context Effect on Segmental and Supresegmental Cues Preceding context has been found to affect phoneme recognition Stop consonant recognition (Mann, 1980) A continuum from /da/ to /ga/ was preceded by

More information

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music)

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music) Topic 4 Pitch & Frequency (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music) A musical interlude KOMBU This solo by Kaigal-ool of Huun-Huur-Tu

More information

Spectral processing of two concurrent harmonic complexes

Spectral processing of two concurrent harmonic complexes Spectral processing of two concurrent harmonic complexes Yi Shen a) and Virginia M. Richards Department of Cognitive Sciences, University of California, Irvine, California 92697-5100 (Received 7 April

More information

SoundRecover2 the first adaptive frequency compression algorithm More audibility of high frequency sounds

SoundRecover2 the first adaptive frequency compression algorithm More audibility of high frequency sounds Phonak Insight April 2016 SoundRecover2 the first adaptive frequency compression algorithm More audibility of high frequency sounds Phonak led the way in modern frequency lowering technology with the introduction

More information

Hearing the Universal Language: Music and Cochlear Implants

Hearing the Universal Language: Music and Cochlear Implants Hearing the Universal Language: Music and Cochlear Implants Professor Hugh McDermott Deputy Director (Research) The Bionics Institute of Australia, Professorial Fellow The University of Melbourne Overview?

More information

Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking

Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking Vahid Montazeri, Shaikat Hossain, Peter F. Assmann University of Texas

More information

Analysis of in-vivo extracellular recordings. Ryan Morrill Bootcamp 9/10/2014

Analysis of in-vivo extracellular recordings. Ryan Morrill Bootcamp 9/10/2014 Analysis of in-vivo extracellular recordings Ryan Morrill Bootcamp 9/10/2014 Goals for the lecture Be able to: Conceptually understand some of the analysis and jargon encountered in a typical (sensory)

More information

Binaural Hearing. Steve Colburn Boston University

Binaural Hearing. Steve Colburn Boston University Binaural Hearing Steve Colburn Boston University Outline Why do we (and many other animals) have two ears? What are the major advantages? What is the observed behavior? How do we accomplish this physiologically?

More information

EEL 6586, Project - Hearing Aids algorithms

EEL 6586, Project - Hearing Aids algorithms EEL 6586, Project - Hearing Aids algorithms 1 Yan Yang, Jiang Lu, and Ming Xue I. PROBLEM STATEMENT We studied hearing loss algorithms in this project. As the conductive hearing loss is due to sound conducting

More information

Acoustics, signals & systems for audiology. Psychoacoustics of hearing impairment

Acoustics, signals & systems for audiology. Psychoacoustics of hearing impairment Acoustics, signals & systems for audiology Psychoacoustics of hearing impairment Three main types of hearing impairment Conductive Sound is not properly transmitted from the outer to the inner ear Sensorineural

More information

Linguistic Phonetics Fall 2005

Linguistic Phonetics Fall 2005 MIT OpenCourseWare http://ocw.mit.edu 24.963 Linguistic Phonetics Fall 2005 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 24.963 Linguistic Phonetics

More information

FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED

FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED Francisco J. Fraga, Alan M. Marotta National Institute of Telecommunications, Santa Rita do Sapucaí - MG, Brazil Abstract A considerable

More information

Theme 4: Pitch and Temporal Coding

Theme 4: Pitch and Temporal Coding Theme 4: Pitch and Temporal Coding Pitch is that attribute of auditory sensation in terms of which sounds may be ordered on a scale extending from low to high. Pitch depends mainly on the frequency content

More information

Chapter 16: Chapter Title 221

Chapter 16: Chapter Title 221 Chapter 16: Chapter Title 221 Chapter 16 The History and Future of CASA Malcolm Slaney IBM Almaden Research Center malcolm@ieee.org 1 INTRODUCTION In this chapter I briefly review the history and the future

More information

Binaural Hearing for Robots Introduction to Robot Hearing

Binaural Hearing for Robots Introduction to Robot Hearing Binaural Hearing for Robots Introduction to Robot Hearing 1Radu Horaud Binaural Hearing for Robots 1. Introduction to Robot Hearing 2. Methodological Foundations 3. Sound-Source Localization 4. Machine

More information

Effects of Location, Frequency Region, and Time Course of Selective Attention on Auditory Scene Analysis

Effects of Location, Frequency Region, and Time Course of Selective Attention on Auditory Scene Analysis Journal of Experimental Psychology: Human Perception and Performance 2004, Vol. 30, No. 4, 643 656 Copyright 2004 by the American Psychological Association 0096-1523/04/$12.00 DOI: 10.1037/0096-1523.30.4.643

More information

Separate What and Where Decision Mechanisms In Processing a Dichotic Tonal Sequence

Separate What and Where Decision Mechanisms In Processing a Dichotic Tonal Sequence Journal of Experimental Psychology: Human Perception and Performance 1976, Vol. 2, No. 1, 23-29 Separate What and Where Decision Mechanisms In Processing a Dichotic Tonal Sequence Diana Deutsch and Philip

More information

Oticon Agil. Connectivity Matters - A Whitepaper. Agil Improves ConnectLine Sound Quality

Oticon Agil. Connectivity Matters - A Whitepaper. Agil Improves ConnectLine Sound Quality Oticon Agil Connectivity Matters - A Whitepaper Agil Improves ConnectLine Sound Quality M. Lisa Sjolander, AuD Oticon A/S, Smørum, Denmark Agil Improves ConnectLine Sound Quality Wireless technology is

More information

Discrete Signal Processing

Discrete Signal Processing 1 Discrete Signal Processing C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University http://www.cs.nctu.edu.tw/~cmliu/courses/dsp/ ( Office: EC538 (03)5731877 cmliu@cs.nctu.edu.tw

More information

Coincidence Detection in Pitch Perception. S. Shamma, D. Klein and D. Depireux

Coincidence Detection in Pitch Perception. S. Shamma, D. Klein and D. Depireux Coincidence Detection in Pitch Perception S. Shamma, D. Klein and D. Depireux Electrical and Computer Engineering Department, Institute for Systems Research University of Maryland at College Park, College

More information

Speech perception in individuals with dementia of the Alzheimer s type (DAT) Mitchell S. Sommers Department of Psychology Washington University

Speech perception in individuals with dementia of the Alzheimer s type (DAT) Mitchell S. Sommers Department of Psychology Washington University Speech perception in individuals with dementia of the Alzheimer s type (DAT) Mitchell S. Sommers Department of Psychology Washington University Overview Goals of studying speech perception in individuals

More information

Voice Detection using Speech Energy Maximization and Silence Feature Normalization

Voice Detection using Speech Energy Maximization and Silence Feature Normalization , pp.25-29 http://dx.doi.org/10.14257/astl.2014.49.06 Voice Detection using Speech Energy Maximization and Silence Feature Normalization In-Sung Han 1 and Chan-Shik Ahn 2 1 Dept. of The 2nd R&D Institute,

More information

Implant Subjects. Jill M. Desmond. Department of Electrical and Computer Engineering Duke University. Approved: Leslie M. Collins, Supervisor

Implant Subjects. Jill M. Desmond. Department of Electrical and Computer Engineering Duke University. Approved: Leslie M. Collins, Supervisor Using Forward Masking Patterns to Predict Imperceptible Information in Speech for Cochlear Implant Subjects by Jill M. Desmond Department of Electrical and Computer Engineering Duke University Date: Approved:

More information

Combination of binaural and harmonic. masking release effects in the detection of a. single component in complex tones

Combination of binaural and harmonic. masking release effects in the detection of a. single component in complex tones Combination of binaural and harmonic masking release effects in the detection of a single component in complex tones Martin Klein-Hennig, Mathias Dietz, and Volker Hohmann a) Medizinische Physik and Cluster

More information

Perceptual pitch shift for sounds with similar waveform autocorrelation

Perceptual pitch shift for sounds with similar waveform autocorrelation Pressnitzer et al.: Acoustics Research Letters Online [DOI./.4667] Published Online 4 October Perceptual pitch shift for sounds with similar waveform autocorrelation Daniel Pressnitzer, Alain de Cheveigné

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION 1.1 BACKGROUND Speech is the most natural form of human communication. Speech has also become an important means of human-machine interaction and the advancement in technology has

More information

Categorical Perception

Categorical Perception Categorical Perception Discrimination for some speech contrasts is poor within phonetic categories and good between categories. Unusual, not found for most perceptual contrasts. Influenced by task, expectations,

More information

Reply to Reconsidering evidence for the suppression model of the octave illusion, by C. D. Chambers, J. B. Mattingley, and S. A.

Reply to Reconsidering evidence for the suppression model of the octave illusion, by C. D. Chambers, J. B. Mattingley, and S. A. Psychonomic Bulletin & Review 2004, 11 (4), 667 676 Reply to Reconsidering evidence for the suppression model of the octave illusion, by C. D. Chambers, J. B. Mattingley, and S. A. Moss DIANA DEUTSCH University

More information

Competing Frameworks in Perception

Competing Frameworks in Perception Competing Frameworks in Perception Lesson II: Perception module 08 Perception.08. 1 Views on perception Perception as a cascade of information processing stages From sensation to percept Template vs. feature

More information

Competing Frameworks in Perception

Competing Frameworks in Perception Competing Frameworks in Perception Lesson II: Perception module 08 Perception.08. 1 Views on perception Perception as a cascade of information processing stages From sensation to percept Template vs. feature

More information