Acoustics Research Institute


Austrian Academy of Sciences, Acoustics Research Institute. Modeling of Auditory Perception. Bernhard Laback and Piotr Majdak. http://www.kfs.oeaw.ac.at Bernhard.Laback@oeaw.ac.at Institut für Schallforschung, Österreichische Akademie der Wissenschaften; A-1040 Wien, Wohllebengasse 12-14. Tel: +43 1/51581-2511, Fax: +43 1/51581-2530, email: piotr@majdak.com

Physiological Basics. Four main stages of the auditory system: outer ear; middle ear; inner ear (cochlea) and auditory nerve; central processing stages. [Ellis et al., 1991] [Courtesy of the House Ear Institute]

Outer Ear. Pinna: sound amplification; direction-dependent filtering at high frequencies (> 4 kHz); important for sound localization in vertical planes (front/back, up/down dimensions); required for externalization of sounds. Auditory canal: the transfer function of pinna plus auditory canal has its maximum between 1.5 and 5 kHz.

Middle Ear. Located in the air-filled tympanic cavity. Connects the eardrum with the oval window of the cochlea via malleus, incus, and stapes. Function: impedance matching between sound waves in air and in liquid. Best transmission from 0.5 to 4 kHz. [Gelfand, 1997]

Inner Ear (Cochlea). Tube-like, coiled structure, 35 mm long. Three compartments: scala vestibuli (SV), scala tympani (ST), and scala media (SM). Basilar membrane (BM) with organ of Corti: sensory structures. The oval window connects the stapes with SV at the base; ST and SV are connected at the apex. Electric potential difference (-40 mV) between SV/ST and SM at BM. [Kießling et al., 1997]

Vibration of Basilar Membrane (BM). A periodic pressure wave from the oval window causes a pressure difference between SV and ST. Traveling wave (von Békésy, Nobel Prize in 1961): for high frequencies the maximum displacement lies towards the base, for low frequencies towards the apex. This determines the characteristic frequency (CF) of each place and involves a frequency-dependent delay (cochlear delay). The vibration shape is essential for spectral coding. [Ruggero & Temchin, 2007] [WADA Lab., 2005]

Micromechanics of Cochlea (Organ of Corti). BM is covered by the tectorial membrane (TM). 1 row of inner hair cells, IHC (total: 3500); 3-4 rows of outer hair cells, OHC (total: 25000). Thin hairs on top of the hair cells (stereocilia) are connected to the TM. Shear forces cause deflection of the stereocilia, followed by depolarization of the hair cells and a neural action potential (spike). [Mammano 2004]

Activity of Outer Hair Cells. Displacement of stereocilia causes contraction (length variation) of OHCs [from???], in phase with the signal. Active amplification of cochlear vibration, only at low levels (saturation at 50 dB SPL). [Dallos et al., 1986] [WADA Lab., 2005]

Tuning Curve of Basilar Membrane (BM). Tuning curve: tone level required to cause a constant BM displacement at a fixed place (CF: 18 kHz), as a function of tone frequency. Filled circles: living animal. Empty circles: animal in poor condition (elevated absolute threshold). Filled squares: post mortem. Figure labels: -10 dB bandwidth; OHC damage. [Sellick et al., 1982]

Input-Output Function of Cochlea. BM response at a fixed place (CF = 8 kHz) for different tone frequencies. On-frequency (at CF): compressive response at medium levels, linear response at low and high levels. Off-frequency (below and above CF): linear response over the entire dynamic range. [Robles et al., 1986]
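
The on-frequency shape described above (linear at low levels, compressive at medium levels, linear again at high levels) can be reproduced with a static two-path sketch: a linear path in parallel with a "broken-stick" compressive path. This is only an illustration in the spirit of dual-resonance models, not the authors' implementation; the function names and all parameter values (gains, knee level, exponent) are hypothetical and chosen for readability.

```python
import numpy as np

def bm_output_db(level_db, lin_gain_db=0.0, nl_gain_db=40.0,
                 knee_db=30.0, exponent=0.25):
    """Static on-frequency I/O sketch; all parameter values are hypothetical."""
    x = 10.0 ** (np.asarray(level_db, dtype=float) / 20.0)  # input amplitude
    g = 10.0 ** (lin_gain_db / 20.0)      # gain of the linear path
    a = 10.0 ** (nl_gain_db / 20.0)       # low-level gain of the nonlinear path
    x_knee = 10.0 ** (knee_db / 20.0)     # amplitude where compression starts
    b = a * x_knee ** (1.0 - exponent)    # makes both branches meet at the knee
    # Broken-stick nonlinearity: linear below the knee, power law above it.
    nonlinear = np.minimum(a * x, b * x ** exponent)
    # The two paths are summed; at high levels the linear path dominates.
    return 20.0 * np.log10(g * x + nonlinear)

def io_slope(level_db, step=1.0):
    """Local growth rate of the output in dB per dB of input."""
    upper = bm_output_db(level_db + step / 2.0)
    lower = bm_output_db(level_db - step / 2.0)
    return (upper - lower) / step

if __name__ == "__main__":
    for level in (10.0, 60.0, 110.0):
        print(level, "dB input -> slope", round(float(io_slope(level)), 2), "dB/dB")
```

A slope near 1 dB/dB indicates linear growth and a slope well below 1 indicates compression. Setting `nl_gain_db` to a very low value leaves only the linear path, which corresponds to the off-frequency case.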

Neural Response. Spontaneous firing: firing without a stimulus. Three classes of neurons: high spontaneous rate (18-250 spikes/s), medium spontaneous rate (0.5-18 spikes/s), low spontaneous rate (< 0.5 spikes/s). Neurons with high spontaneous rates have low thresholds. Thresholds differ by up to 80 dB between classes.

Dynamic Range and Timing. Dynamic range of neurons: 20 to 60 dB. Phase locking: synchronization of the spike pattern with the signal waveform; half-wave rectification. Phase locking decreases for signals > 1 kHz, due to the neural refractory time (approx. 1 ms). [Figure: acoustic signal and inner hair cell response] [Kiang, 1968]

Neural Adaptation. Typical neural response pattern: Onset: high spike rate (and strong phase locking). Ongoing portion: saturation at a lower spike rate. Offset: strong reduction and subsequent increase up to the spontaneous spike rate. Effects: contrast enhancement; accentuation of dynamic changes of the input. [Figure: stimulus amplitude (BM deflection) and neuronal response (spike rate)]
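
The onset, ongoing, and offset behavior listed above can be illustrated with a single divisive feedback loop, in which the input is divided by a low-pass filtered copy of the output. This is a minimal sketch under assumptions: the function name, the time constant `tau`, and the input floor are hypothetical, and a single loop exaggerates the onset peak compared with measured neural responses.

```python
import numpy as np

def adaptation_loop(x, fs, tau=0.05, floor=1e-5):
    """One divisive feedback loop (hypothetical parameters).

    In the steady state the output equals the square root of the input,
    while fast input changes pass through almost linearly.
    """
    a = np.exp(-1.0 / (tau * fs))      # one-pole low-pass coefficient
    state = np.sqrt(floor)             # start at the resting (silence) state
    x = np.maximum(np.asarray(x, dtype=float), floor)
    y = np.empty_like(x)
    for n, value in enumerate(x):
        y[n] = value / state                   # divide by the smoothed output
        state = a * state + (1.0 - a) * y[n]   # update the smoothed output
    return y

if __name__ == "__main__":
    fs = 2000  # sampling rate of the envelope signal in Hz
    stimulus = np.concatenate([np.full(200, 1e-5),    # silence
                               np.ones(600),          # stimulus on
                               np.full(1200, 1e-5)])  # silence again
    response = adaptation_loop(stimulus, fs)
    print("onset:", response[200], "ongoing:", response[799],
          "just after offset:", response[800], "recovered:", response[-1])
```

The printed values show a large response at onset, a much lower value during the ongoing portion, a drop below the resting value right after the offset, and a recovery towards the resting value.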

Central Processing Stages. Evaluate all incoming information: neural spike patterns as a function of CF; integration of information across CFs; (integration of information across ears). Incorporate a priori information. Involve learning and context effects. The central detector is assumed to optimally extract the information relevant in a particular task/situation. [Port, 2007]

Basic Auditory Functions. Absolute threshold: audibility of frequency components. Frequency selectivity: separation of frequency components; reflected in spectral (frequency) masking effects. Temporal resolution: reflected in temporal masking effects. Intensity discrimination, loudness perception, and modulation perception. Pitch perception.

Modeling of Auditory Perception: Why? Predict the results from a variety of experiments within one framework. Explain the functioning of the auditory system. Help generate hypotheses that can be explicitly stated and quantitatively tested.

Types of Auditory Models. Biophysical: detailed processes of components and structures. Physiological: functionality of components and structures. Mathematical/Statistical: abstract representation of auditory processing. Perceptual/Effective: effective signal processing of the stages of the auditory system, based on perceptual experiments.

The Computational Auditory Signal Processing and Perception (CASP) Model. Simulates the experimentally observed input-output behavior in humans. No explicit modeling of precise biophysical mechanisms. Currently focusing on modeling of: masking phenomena (spectral and temporal); intensity discrimination; modulation perception. Based on Dau et al. (1996, 1997) and Jepsen et al. (2008).

Basic Experiment for CASP: Masking Task. Detection of a target signal (T) in the presence of a masker signal (M). Three-alternative forced choice (3-AFC) with random position of T. Listener task: which of the three intervals contained T? Example trial: M, M+T, M.

Model Structure [Jepsen et al., 2008]

Outer- and Middle-Ear Transfer Function. Outer ear: typical human headphone-to-eardrum sound pressure gain. Middle ear: stapes peak velocity (m/s) as a function of frequency for SPL = 0 dB. [Lopez-Poveda and Meddis, 2001]

Dual-Resonance Nonlinear (DRNL) Filterbank. Mimics the complex nonlinear BM response: compressive for frequencies ±10% around CF, linear outside CF. [Lopez-Poveda and Meddis, 2001] [Figure: on-frequency response (f = CF), CF = 4 kHz; Jepsen et al., 2008]

Hair Cell and Auditory Nerve. Hair cell: transforms the mechanical BM oscillation into a receptor potential; half-wave rectification + 1st-order low-pass filter (1000 Hz); preserves fine structure at low CFs and envelope at high CFs. Squaring expansion: mimics spike rate-vs.-level functions of auditory nerve fibres. Adaptation: dynamic changes in gain in response to changes in input level; arises at the level of the auditory nerve; implemented as a chain of five nonlinear feedback loops (Dau et al., 1996).
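
The hair-cell steps named above (half-wave rectification, 1st-order low-pass at 1000 Hz, squaring) can be sketched in a few lines. This is an illustrative sketch, not the CASP code: the function name is hypothetical, and the one-pole recursive filter is just one simple way to realize a 1st-order low-pass. The adaptation stage is not included.

```python
import numpy as np

def hair_cell_stage(bm_signal, fs, cutoff_hz=1000.0):
    """Half-wave rectification, one-pole low-pass, then squaring (sketch)."""
    rectified = np.maximum(np.asarray(bm_signal, dtype=float), 0.0)
    # One-pole low-pass as an assumed realization of the 1st-order filter.
    alpha = 1.0 - np.exp(-2.0 * np.pi * cutoff_hz / fs)
    smoothed = np.empty_like(rectified)
    state = 0.0
    for n, value in enumerate(rectified):
        state += alpha * (value - state)
        smoothed[n] = state
    return smoothed ** 2   # squaring expansion

if __name__ == "__main__":
    fs = 48000
    t = np.arange(int(0.05 * fs)) / fs
    half = len(t) // 2     # skip the filter's start-up phase
    for freq in (200.0, 8000.0):
        out = hair_cell_stage(np.sin(2.0 * np.pi * freq * t), fs)[half:]
        print(freq, "Hz: valley/peak ratio =", out.min() / out.max())
```

For the 200-Hz tone the output falls back to nearly zero in every cycle, so the fine structure survives. For the 8-kHz tone the output never returns to zero, because the low-pass filter smooths the individual cycles and mainly the envelope remains.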

Modulation Processing and Filterbank. Model of frequency-dependent modulation sensitivity: 1st-order low-pass filter. Model of modulation-frequency selectivity: modulation filterbank. [Dau et al., 1997]

Model of Limited Amplitude Resolution. Gaussian-distributed internal noise, added to each channel at the output of the modulation filterbank. Noise variance adjusted to predict human intensity discrimination for a 1-kHz tone at an SPL of 60 dB.

Decision Device: Optimal Detector. Assumption: the listener creates a template of the target. Template = Int(M + T) - Int(M), with T > masked threshold. Int(x) = internal representation of x: a 3-dimensional pattern with axes time, frequency, and modulation frequency. Simulation to obtain the difference representation: Int(each interval of AFC task) - Int(M), which contains either T + IN or IN alone (IN = internal noise). Decision based on the cross-correlation coefficient between template and difference representation: the interval with the largest value is assumed to contain the target. Simulation of the experimental procedure.
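
The decision rule can be sketched with toy internal representations. Everything here is an assumption made for illustration: the arrays stand in for Int(x) and are not produced by an auditory model, the function names are hypothetical, and a mean-removed, normalized correlation coefficient is used as one possible reading of "cross-correlation coefficient".

```python
import numpy as np

def correlation(a, b):
    """Normalized correlation between two patterns (assumed definition)."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0.0 else 0.0

def make_template(int_masker_plus_target, int_masker):
    """Template = Int(M + T) - Int(M), built from a clearly audible target."""
    return int_masker_plus_target - int_masker

def choose_interval(intervals, int_masker, template, noise_sd, rng):
    """Return the index of the interval that best matches the template."""
    scores = []
    for representation in intervals:
        # Internal noise limits the resolution of every observation.
        noisy = representation + rng.normal(0.0, noise_sd, representation.shape)
        difference = noisy - int_masker      # contains T + IN or IN alone
        scores.append(correlation(template, difference))
    return int(np.argmax(scores))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (20, 4, 3)   # toy axes: time, frequency, modulation frequency
    int_masker = rng.random(shape)
    target = np.zeros(shape)
    target[8:12, 1, 0] = 1.0                 # toy target contribution
    template = make_template(int_masker + target, int_masker)
    trial = [int_masker, int_masker + target, int_masker]   # M, M+T, M
    print("chosen interval:",
          choose_interval(trial, int_masker, template, 0.05, rng))
```

Raising `noise_sd` relative to the target contribution makes wrong choices more frequent, which is how the internal noise limits detection performance in this sketch.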

Model Evaluation. Simulation of psychoacoustic experiments: using the experimental stimuli; applying the experimental paradigm. Presentation of results: dark filled circles: CASP model; gray filled circles: original model (Dau et al., 1997), main difference: linear filterbank instead of DRNL; open symbols: experimental data.

Intensity Discrimination. Stimuli: 1-kHz sinusoid, broadband noise. Several reference levels. Poor prediction for pure tones, possibly because nonlinear phase effects across CFs are not evaluated. Data from [Houtsma et al., 1980]. Figure from [Jepsen et al., 2008].

Tone-in-Noise Simultaneous Masking. Target: 2-kHz sinusoid, variable duration. Masker: band-limited Gaussian noise, 500 ms. [Jepsen et al., 2008]

Spectral Masking Patterns. Target: sinusoid or narrow-band (NB) noise, 200 ms. Masker: sinusoid or NB noise at 1 kHz, 200 ms. Two masker levels: 45 and 85 dB SPL. Data from [Moore et al., 1998]. Figure from [Jepsen et al., 2008].

Forward Masking I. Target: 4-kHz sinusoid, 10 ms. Masker: broadband Gaussian noise (0.02-8 kHz), 200 ms. Variable masker-target intervals. [Jepsen et al., 2008]

Forward Masking II. Target: 4-kHz sinusoid, 10 ms. Masker: variable levels, 0- or 30-ms separation; on-frequency: 4-kHz sinusoid; off-frequency: 2.4-kHz sinusoid. [Jepsen et al., 2008]

Modulation Detection. Stimuli: narrow-band carrier: band-limited Gaussian noise, centered at 5 kHz, bandwidth = 3, 31, 314 Hz; broadband carrier: Gaussian noise; modulator: pure tone, different frequencies. Data from [Dau et al., 1997] and [Viemeister, 1979]. Figure from [Jepsen et al., 2008].

Perception of Sound Quality: Design. Subjective data: low-bit-rate speech, Mean Opinion Score (MOS). Objective speech quality: early version of the CASP model (auditory processing model based on Dau et al., 1996); compares the internal representations of coded and reference signal; distance measure based on correlation coefficient: qc; employs a band weighting function. All data from [Hansen and Kollmeier, 2000].

Perception of Sound Quality: Results. Good prediction of subjective data (r = 0.94). Best prediction with: strong weighting of high-frequency bands; CASP parameters optimized for psychoacoustic data. Data from [Hansen and Kollmeier, 2000].

Power of the Model. Combines information across frequency channels, modulation channels, and time. Allows modeling of complex spectro-temporal interactions in hearing, such as: co-modulation masking release; two-tone suppression; temporal or spectral integration effects (including multiple-looks processing). Able to deal with arbitrarily complex stimuli.

CASP: Applications and Possible Extensions. Modeling of hearing loss: allows studying the consequences of dysfunction of specific mechanisms/processing stages, such as damage of outer hair cells or lack of phase locking. Can be used as a front-end for models of other auditory functions, such as: pitch perception; sound localization and binaural signal detection; auditory scene analysis (ASA); speech recognition; subjective evaluation (sound quality assessment).

Summary & Conclusions. The processing of the auditory system is highly nonlinear and complex. CASP: perceptual model for the peripheral auditory system; modular and easily extendable; able to simulate many auditory phenomena. CASP implementation: available in MATLAB upon request (Morten Jepsen), http://caspmodel.sourceforge.net/. An open-source MATLAB implementation of the model will soon be available: AMToolbox @ sourceforge.net