Lecture 3: Perception

Similar documents
Lecture 4: Auditory Perception. Why study perception?

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds

PHYS 1240 Sound and Music Professor John Price. Cell Phones off Laptops closed Clickers on Transporter energized

Chapter 11: Sound, The Auditory System, and Pitch Perception

Acoustics Research Institute

Sound Waves. Sensation and Perception. Sound Waves. Sound Waves. Sound Waves

COM3502/4502/6502 SPEECH PROCESSING

Hearing Sound. The Human Auditory System. The Outer Ear. Music 170: The Ear

Music 170: The Ear. Tamara Smyth, Department of Music, University of California, San Diego (UCSD) November 17, 2016

Hearing. Juan P Bello

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening

Signals, systems, acoustics and the ear. Week 5. The peripheral auditory system: The ear as a signal processor

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics

Systems Neuroscience Oct. 16, Auditory system. http:

Required Slide. Session Objectives

Deafness and hearing impairment

Digital Speech and Audio Processing Spring

Topic 4. Pitch & Frequency

Acoustics, signals & systems for audiology. Psychoacoustics of hearing impairment

ID# Final Exam PS325, Fall 1997

Receptors / physiology

ENT 318 Artificial Organs Physiology of Ear

Spectrograms (revisited)

L2: Speech production and perception Anatomy of the speech organs Models of speech production Anatomy of the ear Auditory psychophysics

BCS 221: Auditory Perception BCS 521 & PSY 221

Hearing. istockphoto/thinkstock

Intro to Audition & Hearing

PSY 215 Lecture 10 Topic: Hearing Chapter 7, pages

Psychoacoustical Models WS 2016/17

HCS 7367 Speech Perception

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music)

16.400/453J Human Factors Engineering /453. Audition. Prof. D. C. Chandra Lecture 14

ID# Exam 2 PS 325, Fall 2003

HEARING. Structure and Function

Healthy Organ of Corti. Loss of OHCs. How to use and interpret the TEN(HL) test for diagnosis of Dead Regions in the cochlea

Representation of sound in the auditory nerve

Linguistic Phonetics Fall 2005

Linguistic Phonetics. Basic Audition. Diagram of the inner ear removed due to copyright restrictions.

ID# Exam 2 PS 325, Fall 2009

The Structure and Function of the Auditory Nerve

Auditory System Feedback

Chapter 1: Introduction to digital audio

Outline. 4. The Ear and the Perception of Sound (Psychoacoustics) A.1 Outer Ear Amplifies Sound. Introduction

Auditory Physiology PSY 310 Greg Francis. Lecture 30. Organ of Corti

Hearing. Figure 1. The human ear (from Kessel and Kardon, 1979)

Thresholds for different mammals

Hearing: Physiology and Psychoacoustics

PSY 214 Lecture # (11/9/2011) (Sound, Auditory & Speech Perception) Dr. Achtman PSY 214

Auditory System. Barb Rohrer (SEI )

THE MECHANICS OF HEARING

The Production of Speech Speed of Sound Air (at sea level, 20 C) 343 m/s (1230 km/h) V=( T) m/s

SPHSC 462 HEARING DEVELOPMENT. Overview Review of Hearing Science Introduction

Robustness, Separation & Pitch

Loudness. Loudness is not simply sound intensity!

J Jeffress model, 3, 66ff

HEARING AND PSYCHOACOUSTICS

Sound and Hearing. Decibels. Frequency Coding & Localization 1. Everything is vibration. The universe is made of waves.

ClaroTM Digital Perception ProcessingTM

Binaural Hearing. Why two ears? Definitions

A computer model of medial efferent suppression in the mammalian auditory system

Hearing. and other senses

Auditory System & Hearing

Hearing. PSYCHOLOGY (8th Edition, in Modules) David Myers. Module 14. Hearing. Hearing

Hearing I: Sound & The Ear

Prelude Envelope and temporal fine. What's all the fuss? Modulating a wave. Decomposing waveforms. The psychophysics of cochlear

The Ear. The ear can be divided into three major parts: the outer ear, the middle ear and the inner ear.

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

Hearing Lecture notes (1): Introductory Hearing

SUBJECT: Physics TEACHER: Mr. S. Campbell DATE: 15/1/2017 GRADE: DURATION: 1 wk GENERAL TOPIC: The Physics Of Hearing

PSY 214 Lecture 16 (11/09/2011) (Sound, auditory system & pitch perception) Dr. Achtman PSY 214

Sound Waves. Sound and Sensa3on. Chapter 9. Sound waves are composed of compression and rarefac3on of air molecules. Domain

Auditory Perception: Sense of Sound /785 Spring 2017

Computational Models of Mammalian Hearing:

Hearing I: Sound & The Ear

Nature of the Sound Stimulus. Sound is the rhythmic compression and decompression of the air around us caused by a vibrating object.

Outline. The ear and perception of sound (Psychoacoustics) A.1 Outer Ear Amplifies Sound. Introduction

Topics in Linguistic Theory: Laboratory Phonology Spring 2007

Auditory Physiology PSY 310 Greg Francis. Lecture 29. Hearing

PSY 310: Sensory and Perceptual Processes 1

Hearing II Perceptual Aspects

Lecture 6 Hearing 1. Raghav Rajan Bio 354 Neurobiology 2 January 28th All lecture material from the following links unless otherwise mentioned:

Mechanical Properties of the Cochlea. Reading: Yost Ch. 7

What you re in for. Who are cochlear implants for? The bottom line. Speech processing schemes for

Auditory System & Hearing

Sensation and Perception

Computational Perception /785. Auditory Scene Analysis

Hearing Lectures. Acoustics of Speech and Hearing. Subjective/Objective (recap) Loudness Overview. Sinusoids through ear. Facts about Loudness

Essential feature. Who are cochlear implants for? People with little or no hearing. substitute for faulty or missing inner hair

A truly remarkable aspect of human hearing is the vast

SOLUTIONS Homework #3. Introduction to Engineering in Medicine and Biology ECEN 1001 Due Tues. 9/30/03

MECHANISM OF HEARING

Binaural Hearing for Robots Introduction to Robot Hearing

Who are cochlear implants for?

Auditory Physiology Richard M. Costanzo, Ph.D.

Temporal Adaptation. In a Silicon Auditory Nerve. John Lazzaro. CS Division UC Berkeley 571 Evans Hall Berkeley, CA

Ganglion Cells Blind Spot Cornea Pupil Visual Area of the Bipolar Cells Thalamus Rods and Cones Lens Visual cortex of the occipital lobe

Learning Targets. Module 20. Hearing Explain how the ear transforms sound energy into neural messages.

Sound and its characteristics. The decibel scale. Structure and function of the ear. Békésy s theory. Molecular basis of hair cell function.

Sensation and Perception

Issues faced by people with a Sensorineural Hearing Loss

Transcription:

ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 3: Perception 1. Ear Physiology 2. Auditory Psychophysics 3. Pitch Perception 4. Music Perception Dan Ellis Dept. Electrical Engineering, Columbia University dpwe@ee.columbia.edu http://www.ee.columbia.edu/~dpwe/e4896/ E4896 Music Signal Processing (Dan Ellis) 213-2-4-1 /24

1. Ear Physiology The ear is a very sensitive transducer of air pressure variations into nerve firings just above Brownian motion!? Middle ear Auditory nerve Outer ear Inner ear (cochlea) Midbrain Cortex The cochlea is largely understood the brain is more difficult E4896 Music Signal Processing (Dan Ellis) 213-2-4-2 /24

The Ear Impedance matching & transduction Ear canal Pinna Middle ear bones Cochlea (inner ear) Eardrum (tympanum) pinna acts as horn eardrum + bones match impedance cochlea transduces to nerve firings E4896 Music Signal Processing (Dan Ellis) 213-2-4-3 /24

The Cochlea Complex resonant structure Oval window (from ME bones) Travelling wave Basilar Membrane (BM) Cochlea 16 khz Resonant frequency 5 Hz Position 35mm http://www.wadalab.mech.tohoku.ac.jp/fem_bm-e.html active feedback to maintain near-ringing state efferent fibers? E4896 Music Signal Processing (Dan Ellis) 213-2-4 - /24

Hair Cells Transduce mechanical motion to nerve firings Cochlea Tectorial membrane Inner Hair Cell (IHC) Basilar membrane Auditory nerve Outer Hair Cell (OHC) 3, IHCs driving 2, nerves easily damaged E4896 Music Signal Processing (Dan Ellis) 213-2-4-5 /24

Auditory Nerve IHC fires near maximum displacement Local BM displacement Typical nerve signal (mv) 5 time / ms cannot fire every cycle some noise Firing count Cycle angle E4896 Music Signal Processing (Dan Ellis) 213-2-4-6 /24

Nerve Responses Onset enhancement Frequency selectivity Spike count 1 Dynamic range db SPL Tone burst One fiber: ~ 25 db dynamic range Time 1 ms 8 6 4 Spikes/sec 3 2 1 2 1 Hz 1 khz 1 khz (log) frequency Intensity / db SPL E4896 Music Signal Processing (Dan Ellis) 213-2-4-7 /24 2 4 6 8 1 Hearing dynamic range > 1 db

Auditory Nerve Ensemble Ensemble of nerves provide full information Secker-Walker & Searle 9 similar to constant-q log-intensity spectrogram freq / 8ve re 1 Hz PatSla rectsmoo on bbctmp2 (21-2-18) 5 4 3 2 1 1 2 3 4 5 6 time / ms E4896 Music Signal Processing (Dan Ellis) 213-2-4-8 /24

Auditory Models Filterbank + nonlinearity varying (but broad) bandwidth IHC Sound Outer/middle ear filtering Cochlea filterbank IHC 6 SlaneyPatterson 12 chans/oct from 18 Hz, BBC1tmp (21218) 5 channel 4 3 2 1.1.2.3.4.5 time / s E4896 Music Signal Processing (Dan Ellis) 213-2-4-9 /24

2. Auditory Psychophysics Extensive study of relationship between physical (Φ) and psychological (Ψ) values perception is not direct! Common across all perceptual modalities proprioceptive force, body positioning vision hearing Φ - Ψ distinction is important! E4896 Music Signal Processing (Dan Ellis) 213-2-4-1/24

Loudness perception Perception of physical parameter just noticeable difference - jnd magnitude scaling Weber s law: Loudness log(l) = log 1 (L) = db(i) = I I log(l) log(i) L I.3.3 log(i).3 db(i) 33.3 log 1 (L) Log(loudness rating) Hartmann(1993) Classroom loudness scaling data 2.6 1.4-2 -1 Hartmann 1 Sound 9level tracks 19+2 E4896 Music Signal Processing (Dan Ellis) 213-2-4-11/24 2.4 2.2 2. 1.8 1.6 Textbook figure: L α I.3 Power law fit: L α I.22

Equal Loudness Fletcher-Munson curves (1937) 12 Intensity / db SPL 8 4 match intensity to specific 1 khz tone loudness growth 1 1 1, freq / Hz E4896 Music Signal Processing (Dan Ellis) 213-2-4-12/24

Masking Limited dynamic range in cochlea effect within frequency critical bands Intensity / db absolute threshold masking tone masked threshold basis of MPEG Audio log freq Masking tone Forward/backward temporal effects level / db 2 18 16 14 12 1 8 Elevated masking threshold skirt 2 6 15 4 2 5 1 15 5 2 25 E4896 Music Signal Processing (Dan Ellis) 213-2-4-13/24 time / ms 1 freq / Bark tracks 23-25

Limits of Hearing Test what listeners can discriminate A B X Roughly... timing: 2 ms difference, 2 ms ordering tuning: ~1% spectral profile: single components ~ 2 db phase? tones vs. noise... time two-interval forced-choice : X = A or B? E4896 Music Signal Processing (Dan Ellis) 213-2-4-14/24

3. Pitch Perception Complex (non-sinusoidal) tones give a single, fused percept despite harmonics resolved by cochlea 7 freq. chan. 6 5 4 3 2 1.5.1 time/s percept is of a single pitch.. but pitch does NOT rely on the fundamental track 37 E4896 Music Signal Processing (Dan Ellis) 213-2-4-15/24

Place models of pitch Hypothesis: Pitch results from activation pattern AN excitation broader HF channels cannot resolve harmonics resolved harmonics Correlate with harmonic sieve : Pitch strength frequency channel Duifhuis et al. 82 frequency channel support: low harmonics are important but: pitch of noisy signals E4896 Music Signal Processing (Dan Ellis) 213-2-4-16/24

Time models of pitch Autocorrelation neatly unifies pitch phenomena freq per-channel autocorrelation autocorrelation time Summary autocorrelation Meddis & Hewitt 91 1 2 3 common period (pitch) lag / ms but: high-frequency modulation evokes weak pitch E4896 Music Signal Processing (Dan Ellis) 213-2-4-17/24

Competing Cues Perhaps brains use both place & time cues common perceptual strategy: opportunistic combination of information e.g. Probabilistic combination arg max Pr( x) arg max Pr( x 1)Pr( x 2 ) Pr( ) if x1, x2 are independent given θ E4896 Music Signal Processing (Dan Ellis) 213-2-4-18/24

4. Music Perception Hearing music involves instruments notes rhythm Can study with subjective experiments Grey 75 E4896 Music Signal Processing (Dan Ellis) 213-2-4-19/24

Scene Analysis Detect separate events common onset common harmonicity freq / Hz 8 6 4 2 Pierce 83 1 2 3 4 5 6 7 8 9 time / s instruments & timbre E4896 Music Signal Processing (Dan Ellis) 213-2-4-2/24

Musical intervals relate to harmonic proximity Consonance Pitch Helix Warren et al. 23 E4896 Music Signal Processing (Dan Ellis) 213-2-4-21/24

Rhythm Sensitive to periodicity speech? breathing? brain? Onsets + autocorrelation? variations in tapping 4/4 vs 3/4 4 3 2 1 4 3 2 1 1 2 3 4 5 6 7 8 9 1 1 2 3 4 5 6 7 8 9 1 4 2-2 1 2 3 4 5 6 7 8 9 1 8 6 4 2 1 1 2 3 4 5 6 7 8 9 1 4 2-2 1 2 3 4 5 6 7 8 9 1 8 6 4 2 1 1 2 3 4 5 6 7 8 9 1 5 1 2 3 4 5 6 7 8 9 1-1 1 2 3 4 5 6 7 8 9 1 E4896 Music Signal Processing (Dan Ellis) 213-2-4-22/24

Sequences Perceptual effects of sequences e.g. streaming TRT: 6-15 ms frequency 1 khz Δf: 2 octaves time track 36 Music is built of sequences different sensitivities E4896 Music Signal Processing (Dan Ellis) 213-2-4-23/24

Summary Ear converts air pressure to nerve firings spectral analysis Brain does a lot with scarce information dealing with the real world Music is a complex signal multiple sources harmonic structures temporal patterns E4896 Music Signal Processing (Dan Ellis) 213-2-4-24/24

References H. Duifhuis, L. F. Willems, & R. J. Sluyter, Measurement of pitch in speech: An implementation of Goldstein s theory of pitch perception, J. Acoust. Soc. Am., 71(6):1568-158, 1982. H. Fletcher & W. A. Munson, Relation between loudness and masking, J. Acoust. Soc. Am., 9:1-1, 1937. B. Gold, N. Morgan, & D. Ellis, Speech and Audio Signal Processing (2nd ed.), Wiley, 211. J. M. Grey, An Exploration of Musical Timbre, Ph.D. Dissertation, Stanford CCRMA, No. STAN-M-2, 1975. W. M. Hartmann, Auditory demonstrations on compact disk for large N, J. Acoust. Soc. Am., 93(1):1-16, 1993. R. Meddis & M. J. Hewitt, Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: Pitch identification, J. Acoust. Soc. Am., 89(6):2866-2882, 1991. B. C. J. Moore, An Introduction to the Psychology of Hearing (4th ed.), Academic Press, 1997. J. R. Pierce, The Science of Musical Sound, Scientific American Press, 1983. H. E. Secker-Walker & C. L. Searle, Time-domain analysis of auditory-nervefiber firing rates, J. Acoust. Soc. Am., 88(3):1427-1436, 199. J. D. Warren, S. Uppenkamp, R. D. Patterson, & T. D. Griffiths, Separating pitch chroma and pitch height in the human brain, PNAS 1(17):138-42, 23. E4896 Music Signal Processing (Dan Ellis) 213-2-4-25/24