A Brief History of Auditory Models


Leonardo C. Araújo, Tairone N. Magalhaes, Damares P. M. Souza, Hani C. Yehia, Maurício A. Loureiro

CEFALA - Center for Research on Speech, Acoustics, Language and Music
Universidade Federal de Minas Gerais (UFMG)
CPDEE - Centro de Pesquisa e Desenvolvimento em Engenharia Elétrica, room 214
Av. Antônio Carlos 6627, 31270-010 Belo Horizonte, MG
{leoca, tairone, damares, hani, mauricio}@cefala.org

Abstract. This work presents a brief description of the human auditory system, together with the history of our understanding of auditory function, its main features, and the classic models used to represent it. First, a historical view of the hearing apparatus is presented. After that, the physiology of the peripheral auditory system is described. The process of acoustic propagation through the outer, middle and inner ear is explained, as well as the mechanism that transforms the motion of the cochlear inner hair cells into neural spikes. Next, Flanagan's mathematical representation (based on physiological data acquired by von Békésy) of the passive relation between the sound that reaches the outer ear and the motion of the cochlear basilar membrane is presented. Flanagan's model is followed by Lyon's model of the cochlea, Meddis' model of the inner hair cell, and Patterson's Auditory Image Model. Finally, the IPEM Toolbox is introduced as an example of a music analysis system that incorporates an auditory model to perform acoustic analysis of sound based on human perception.

1. Introduction

Over the past half century, the auditory system has been the focus of intensive research. Knowledge from physiology, psychology and engineering has made it possible to create models that, to a certain extent, describe and mimic the hearing mechanisms. Mathematical and computational models are used to build quantitative simulations that describe a system and bring insights about the system itself or its behavior under certain conditions.
Auditory models can serve several purposes, such as helping to track down problems in the hearing and speech fields. They are also useful in music research: because they simulate the mechanisms of sound perception, they can be applied to musical timbre research, as in [Cosi et al., 1994] and [Toiviainen et al., 1995].

2. Hearing Physiology

The peripheral auditory system has three main parts: the outer, middle and inner ear. The outer ear is the visible portion of the ear. It includes the pinna (also called the auricle), the ear canal and the eardrum. The pinna collects sounds and directs them down into the ear canal. Its shape also helps us to extract directional information from sounds. The ear canal is a tube about 2.5 cm long that acts as a quarter-wavelength resonator, so it enhances frequencies around 3400 Hz, the region of maximum sensitivity of human hearing. This region is easy to see in the well-known equal-loudness curves.
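The quarter-wavelength figure quoted above can be checked with a few lines of arithmetic. The sketch below treats the ear canal as an ideal tube closed at one end, whose first resonance is f = c / (4 L). The speed of sound (343 m/s, air at about 20 °C) is an assumption that is not given in the text, and the function name is purely illustrative.

```python
# Quarter-wavelength resonance of an ideal tube closed at one end (the eardrum).
# Assumption: speed of sound in air of 343 m/s (about 20 degrees Celsius).
SPEED_OF_SOUND_M_S = 343.0
EAR_CANAL_LENGTH_M = 0.025  # 2.5 cm, the length quoted in the text


def quarter_wave_resonance(length_m: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Return the first resonance f = c / (4 * L) in Hz."""
    return c / (4.0 * length_m)


if __name__ == "__main__":
    f0 = quarter_wave_resonance(EAR_CANAL_LENGTH_M)
    print(f"First ear-canal resonance: {f0:.0f} Hz")
```

With these values the resonance comes out at roughly 3.4 kHz, which agrees with the figure in the text. Taking c = 340 m/s instead gives 3400 Hz.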

Sound waves impinging on the outer ear propagate down the external meatus to the eardrum, which is set into vibration. These vibrations are then transmitted to the tiny ossicles of the middle ear. The ossicular chain is composed of three bones: the malleus, the incus and the stapes. The malleus, or hammer, is attached to the eardrum. It makes contact with the incus, or anvil, which in turn connects with the stapes, or stirrup, through a small joint. The main function of the ossicles is to perform an impedance transformation from the air of the outer ear to the liquid inside the cochlea (inner ear). Because the density of the propagation medium, and therefore its impedance, changes greatly between the two, a transformation is needed to match these very different impedances. The lever action of the ossicles alone provides a force amplification of about 1.3 [von Békésy, 1960]. A further amplification arises from the change in effective area: the eardrum area is much greater than that of the stirrup, on the order of 15:1 according to Békésy's measurements [von Békésy, 1960].

Figure 1: Outer, middle and inner ear

The inner ear is where the cochlea is located, just behind the oval window, which is connected to the stapes footplate. The inner ear also comprises the vestibular apparatus and the auditory nerve terminations. The cochlea is a snail-shaped organ filled with almost incompressible fluids. It is divided by two membranes: Reissner's membrane and the basilar membrane. When the oval window is set into motion, a pressure difference in the fluid propagates down the cochlea. Thus an inward motion of the ossicles results in an outward movement of the round window. The basilar membrane (Figure 2) is set in motion by this pressure difference, but the pattern of its movement varies, depending on its mechanical properties at each position.
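As a rough check of the middle-ear figures quoted earlier in this section, the two effects can be combined into a single pressure gain. Pressure is force per unit area, so a force gain of about 1.3 acting on an area about 15 times smaller multiplies the pressure by about 1.3 x 15. The sketch below is a back-of-the-envelope calculation only: it assumes an ideal lossless lever and piston, and the function names are illustrative.

```python
import math

LEVER_RATIO = 1.3   # ossicular lever force gain quoted in the text
AREA_RATIO = 15.0   # eardrum-to-stapes area ratio quoted in the text


def middle_ear_pressure_gain(lever_ratio: float, area_ratio: float) -> float:
    """Ideal pressure gain: the force gain times the reduction in area."""
    return lever_ratio * area_ratio


def to_decibels(pressure_ratio: float) -> float:
    """Express a pressure ratio in decibels (20 * log10)."""
    return 20.0 * math.log10(pressure_ratio)


if __name__ == "__main__":
    gain = middle_ear_pressure_gain(LEVER_RATIO, AREA_RATIO)
    print(f"Pressure gain: {gain:.1f}x, i.e. about {to_decibels(gain):.1f} dB")
```

Under these idealised assumptions the middle ear raises the pressure at the oval window by a factor of about 20, or roughly 26 dB.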
At its basal end, the basilar membrane is narrow and stiff, while at its apical end it is wider and more compliant. Because of these differences, the position of the peak in the pattern of vibration differs according to the frequency of stimulation. The base responds best to high frequencies, while the apex responds best to low frequencies. Mechanical motion of the basilar membrane leads to displacements of the stereocilia of the inner hair cells, which are located between the basilar membrane and the tectorial membrane, in a structure called the organ of Corti.

Figure 2: Section of the cochlea. (Adapted from Moore, 1995).

3. Auditory Models

The crucial problem in modelling is deciding on the optimal level of detail and on the possible simplifications. The amount of detail and the computational (and, why not, mathematical) burden work against each other. Increasing the level of detail leads to complex and computationally expensive models. Furthermore, all the effort put into modelling those details might be wasted if the final user of the model does not care about them. This kind of representation subdues the dynamic characteristics and the nonlinearity of the signal. It also incorporates other phenomena of the perception mechanism, such as critical bands, frequency masking and temporal masking. Auditory models are continually being upgraded, and many different models are available on the internet. The principal models are described here.

3.1. Flanagan's Model

Flanagan, based on the physiological data measured by Békésy, proposed a mathematical and also a computational model of the auditory mechanism [Flanagan, 1960] [Flanagan, 1962] [Flanagan, 1972]. His model is divided into two blocks: the first implements the physiological function of the middle ear and the second that of the basilar membrane. The input to the first block, $p(t)$, is the pressure at the eardrum. Its output, $x(t)$, is the stapes displacement, which is the input to the next stage. The final output, $y_l(t)$, is the basilar membrane displacement at a distance $l$ from the stapes. The approximating functions are given in their frequency-domain (Laplace) representation by the functions $G(s)$ and $F_l(s)$. These functions must be fitted to the available physiological data. Flanagan assumes the ear to be mechanically passive and linear over the frequency and amplitude ranges of interest, so rational functions of frequency with left half-plane (stable) poles can be used to approximate the physiological data. Being modelled by rational functions also makes a lumped-constant electrical circuit implementation feasible.
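To make the idea of a stable rational approximation concrete, the sketch below evaluates a rational function with left half-plane poles along the imaginary axis. It is only an illustration of the technique: the pole positions (one real pole and one complex-conjugate pair placed around 1500 Hz) are hypothetical values chosen for the example, not Flanagan's fitted parameters, and all function names are assumptions.

```python
import math


def evaluate_rational(s, poles, zeros=(), gain=1.0):
    """Evaluate H(s) = gain * prod(s - z) / prod(s - p) at a complex s."""
    numerator = complex(gain)
    for z in zeros:
        numerator *= s - z
    denominator = complex(1.0)
    for p in poles:
        denominator *= s - p
    return numerator / denominator


def is_stable(poles):
    """A rational function is stable when every pole has a negative real part."""
    return all(p.real < 0.0 for p in poles)


def unity_dc_gain(poles):
    """Gain that normalises an all-pole function so that H(0) = 1."""
    product = complex(1.0)
    for p in poles:
        product *= -p
    return product.real  # conjugate pairs make the product real


# Hypothetical pole layout: one real pole plus a complex-conjugate pair.
OMEGA_0 = 2.0 * math.pi * 1500.0  # rad/s, illustrative value only
POLES = (
    complex(-OMEGA_0, 0.0),
    complex(-0.5 * OMEGA_0, OMEGA_0),
    complex(-0.5 * OMEGA_0, -OMEGA_0),
)


def magnitude_at(frequency_hz, poles=POLES):
    """Magnitude of the normalised all-pole function at s = j * 2 * pi * f."""
    s = complex(0.0, 2.0 * math.pi * frequency_hz)
    return abs(evaluate_rational(s, poles, gain=unity_dc_gain(poles)))


if __name__ == "__main__":
    print("stable:", is_stable(POLES))
    for f in (0.0, 500.0, 1500.0, 5000.0, 15000.0):
        print(f"{f:8.0f} Hz  |H| = {magnitude_at(f):.4f}")
```

Fitting such a function to measured data then amounts to adjusting the poles, zeros and gain until the computed response matches the physiological curves.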
Flanagan approximates $G(s)$ by a third-degree function composed of a real pole and a pair of complex-conjugate poles. $F_l(s)$ is approximated by a second-order pair of complex-conjugate poles, a real pole, a real zero and a delay factor. The transfer function is built this way to resemble the resonant properties of the membrane, approximating it as a filter bank that is constant-Q (constant percentage bandwidth) in character.

3.2. Lyon's Model

Richard F. Lyon [Lyon and Mead, 1988] developed an analog electronic model of the cochlea based on knowledge of how the cochlea works. His approach was to model the fluid-dynamic wave medium of the cochlea by a cascade of filters based on the observed properties of the medium. The action of the active outer hair cells was modelled by a set of automatic gain controls (AGC), which simulate the dynamic compression of the intensity range on the basilar membrane. A neural spike at the base of the auditory nerve is only generated when the stereocilia of the inner hair cell are bent one way. When the cilia are bent the other way, no spikes are generated. The inner hair cells therefore act like half-wave rectifiers, and this is how they are modelled in Lyon's cochlear model. The output of the model is the firing probability over time for the neurones of the auditory nerve. This output is called a cochleagram.

Figure 3

Figure 4

Malcolm Slaney implemented an auditory model based on Lyon's model. He developed a toolbox for MATLAB called the Auditory Toolbox, which includes several functions for auditory modelling, including different models of cochlear processing and hair cell transduction, and additional functions for spectral analysis and correlogram generation. The correlogram is a function that summarizes the periodicity information of the cochleagram to give a better representation of an auditory image. It can be visualized as a movie, and the Auditory Toolbox includes a function for this visualization.

3.3. Meddis Inner Hair Cell

Ray Meddis [Meddis, 1986] developed a model of the inner hair cell based on its physiology. In this model, a permeability function controls the release of neurotransmitter into the synaptic cleft. The spike probability of specific neurones of the auditory nerve is a function of the amount of neurotransmitter in the cleft. Interspike intervals are also taken into account: spikes cannot occur at intervals shorter than 1 ms. His inner hair cell model became popular among auditory researchers. The auditory image model (AIM) (see section 3.4), a famous model of auditory processing developed by Roy Patterson et al.
[Patterson et al., 1995] includes a stage of neural transduction using the Meddis hair cell. The Meddis model was upgraded by Sumner et al. [Sumner et al., 2002], who created a revised model of the inner hair cell and auditory-nerve complex. The revised model improves on the previous one in the range of phenomena it simulates and is more consistent with recent developments in hair-cell physiology. It better represents the response of medium and low spontaneous rate fibers, whereas the former model only simulates the response of high spontaneous rate fibers at the fiber's best frequency.

Figure 5: Example of cochleagram using a recorded clarinet sound (C4) as input to the model.

3.4. Auditory Image Model (AIM)

The auditory image model includes several alternative modules covering the stages of processing in the peripheral auditory system. These stages are (1) middle ear filtering, (2) spectral analysis, (3) neural encoding and (4) time-interval stabilization. The middle ear filtering is a simple linear filter that enhances the middle frequencies. The spectral analysis can be done with a functional (gammatone, see footnote 1) or a physiological (nonlinear transmission line) auditory filter. The output of this stage is the estimated basilar membrane motion (BMM) produced by the input signal. The neural encoding stage converts the BMM into a neural activity pattern (NAP). Two modules are provided for generating the NAP: a bank of Meddis inner hair cells and a bank of two-dimensional adaptive threshold units, which rectify and compress the BMM and then apply adaptation in time and suppression across frequency. The last stage summarizes the temporal activity at the output of the NAP stage, based on the idea that periodic sounds give rise to static perceptions in the human listener. Two modules are included: strobed temporal integration (STI) and a correlogram.

3.5. IPEM Toolbox

The IPEM Toolbox is an example of a music analysis system that incorporates an auditory model to perform perception-based acoustic analysis of sound. The auditory model (Auditory Peripheral Module) was developed by Van Immerseel and Martens (1992) and implements some features that are important for music perception. It comprises outer ear filtering, an array of band-pass filters that simulates the cochlea, and a hair cell model (HCM) that incorporates half-wave rectification and dynamic range compression.
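The two hair-cell operations named above, half-wave rectification and dynamic range compression, can be sketched in a few lines. This is a deliberately simplified illustration and not the IPEM hair cell model or Lyon's AGC: the power-law compression and its exponent of 0.3 are assumptions standing in for whatever compression a real model uses, and the function names are hypothetical.

```python
def half_wave_rectify(samples):
    """Keep positive excursions and set negative ones to zero."""
    return [max(0.0, x) for x in samples]


def power_law_compress(samples, exponent=0.3):
    """Compress non-negative samples with y = x ** exponent.

    Assumption: a static power law is used here only as a stand-in for
    dynamic range compression; the exponent is an illustrative value.
    """
    return [x ** exponent for x in samples]


def hair_cell_sketch(samples, exponent=0.3):
    """Rectify first so the compression stage only sees values >= 0."""
    return power_law_compress(half_wave_rectify(samples), exponent)


if __name__ == "__main__":
    quiet_and_loud = [0.01, -0.01, 1.0, -1.0]
    print(hair_cell_sketch(quiet_and_loud))
```

In this sketch a 100:1 difference between a quiet and a loud positive peak shrinks to about 4:1 after compression, while negative half-cycles produce no output at all.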
The HCM also introduces distortion products that correspond to the beating frequencies. There is also a low-pass filter that extracts the envelope of each channel. This filter is consistent with the loss of synchronization at the primary auditory nerve.

1 The gammatone filter bank is a constant bandwidth filter bank. The gammatone auditory filter can be described by its impulse response: $y_{\mathrm{tone}}(t) = a\,t^{n-1} e^{-2\pi b t} \cos(2\pi f_c t + \phi)$, for $t > 0$, where $a$ sets the amplitude, $b$ determines the duration of the impulse response, $n$ is the filter order, $f_c$ is the center frequency and $\phi$ is the phase.
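The impulse response in footnote 1 is straightforward to sample numerically. The sketch below evaluates a * t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*f_c*t + phi) for t > 0 and returns zero otherwise. The default order n = 4, the 1 kHz center frequency, the value b = 125 Hz and the 16 kHz sample rate are assumptions chosen for illustration; none of them is specified in the text.

```python
import math


def gammatone_impulse_response(t, fc, b, n=4, a=1.0, phi=0.0):
    """Gammatone impulse response at time t (seconds); zero for t <= 0.

    Assumption: n = 4 is used as an illustrative default order.
    """
    if t <= 0.0:
        return 0.0
    envelope = a * t ** (n - 1) * math.exp(-2.0 * math.pi * b * t)
    return envelope * math.cos(2.0 * math.pi * fc * t + phi)


def sample_impulse_response(fc, b, duration_s, sample_rate_hz, **kwargs):
    """Sample the impulse response on a uniform time grid starting at t = 0."""
    n_samples = round(duration_s * sample_rate_hz)
    return [
        gammatone_impulse_response(k / sample_rate_hz, fc, b, **kwargs)
        for k in range(n_samples)
    ]


if __name__ == "__main__":
    # Illustrative parameters only: 1 kHz channel, b = 125 Hz, 50 ms at 16 kHz.
    h = sample_impulse_response(fc=1000.0, b=125.0, duration_s=0.05,
                                sample_rate_hz=16000.0)
    peak = max(abs(v) for v in h)
    print(f"{len(h)} samples, peak magnitude {peak:.3e}")
```

Convolving a signal with one such response per channel, each with its own center frequency, gives a simple functional filter bank of the kind the footnote describes.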

This toolbox contains modules which deal with different aspects of perception and can provide good estimates of sensory, perceptual and cognitive parameters of the sound.

4. Conclusions

The research carried out up to now has shed light on the mechanisms that govern the human auditory sense. However, many issues involving the function of the auditory periphery are still unresolved. Recent publications suggest that factors such as displacements of the stereocilia due to Brownian motion, together with stochastic resonance, enhance the detection of weak signals in the middle frequency range. The ultimate mechanism of perception and the transmission of neural information to the brain are still a cloudy subject. These processes are treated as black-box systems (we are unaware of what is going on inside), but we are still able to observe and measure subjects' behavior in response to prescribed auditory stimuli. This approach, together with new measurement techniques, might give us the information needed to draw conclusions and to understand to what extent auditory physiology and psychological behavior can be brought into harmony.

References

Cosi, P., Poli, G. D., and Lauzzana, G. (1994). Auditory modelling and self-organizing neural networks for timbre classification. Journal of New Music Research, 23:71-98.

Flanagan, J. L. (1960). Models for approximating basilar membrane displacement. Bell System Technical Journal, 39:1163-1191.

Flanagan, J. L. (1962). Models for approximating basilar membrane displacement II. Bell System Technical Journal, 41:959-1009.

Flanagan, J. L. (1972). Speech Analysis, Synthesis and Perception. Springer-Verlag, 2nd edition.

Lyon, R. F. and Mead, C. (1988). An analog electronic cochlea. IEEE Transactions on Acoustics, Speech and Signal Processing, 36(7):1119-1134.

Meddis, R. (1986). Simulation of mechanical to neural transduction in the auditory receptor. Journal of the Acoustical Society of America, 79(3):702-711.

Patterson, R. D., Allerhand, M. H., and Giguere, C. (1995). Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform. Journal of the Acoustical Society of America, 98:1890-1894.

Sumner, C. J., Lopez-Poveda, E. A., O'Mard, L. P., and Meddis, R. (2002). A revised model of the inner-hair cell and auditory-nerve complex. Journal of the Acoustical Society of America, 111(5):2178-2188.

Toiviainen, P., Kaipainen, M., and Louhivuori, J. (1995). Musical timbre: Similarity ratings correlate with computational feature space distances. Journal of New Music Research, 24:282-298.

von Békésy, G. (1960). Experiments in Hearing. McGraw-Hill, 1st edition.