Robust Neural Encoding of Speech in Human Auditory Cortex


Robust Neural Encoding of Speech in Human Auditory Cortex Nai Ding, Jonathan Z. Simon Electrical Engineering / Biology University of Maryland, College Park

Auditory Processing in Natural Scenes: How is the stable perception of sound generated from degraded acoustics?

Magnetoencephalography (MEG): MEG measures spatially synchronized dendritic current.

Outline
- Cortical Encoding of Speech in MEG: Representation of Spectro-temporal Features
- Cortical Code despite Energetic Masking: Speech in Stationary Noise
- Cortical Code despite Informational Masking: Segregation of Simultaneous Speakers

MEG Response to Speech
[Figure: speech stimulus spectrogram (frequency vs. time) and the corresponding MEG response over time]

MEG Response to Speech
Large-scale synchronized cortical activity is phase locked to slow temporal modulations of speech.
[Figure: STRF model linking the speech spectrogram to the MEG response; predictive power (correlation, 0 to 0.2) plotted across frequency, 1 to 50 Hz]
Ding & Simon (in press) J. Neurophysiol.
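The STRF analysis above treats the neural response as a linear function of the stimulus spectrogram. A minimal sketch of that forward model follows; the function name, array shapes, and causal-lag convention are our own illustrative assumptions, not the authors' code.

```python
import numpy as np

def predict_response(spectrogram, strf):
    """Predict a neural response with a linear spectro-temporal filter.

    spectrogram: (n_freq, n_time) stimulus representation
    strf:        (n_freq, n_lags) spectro-temporal response function
    Returns a length-n_time predicted response (causal lags only).
    """
    n_freq, n_time = spectrogram.shape
    _, n_lags = strf.shape
    pred = np.zeros(n_time)
    for lag in range(n_lags):
        # each STRF lag weights the spectrogram `lag` samples in the past
        pred[lag:] += strf[:, lag] @ spectrogram[:, :n_time - lag]
    return pred
```

With this convention, an impulse in one frequency channel reproduces that channel's row of the STRF in the predicted response, which is a convenient sanity check on the lag bookkeeping.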

Neural Reconstruction
The temporal envelope of speech can be reconstructed from the MEG response.
[Figure: stimulus speech envelope overlaid with the envelope reconstructed from the MEG response, 2-second excerpt; subject R1747]
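Envelope reconstruction of this kind is commonly implemented as a linear decoder mapping time-lagged sensor signals back onto the stimulus envelope. A hypothetical ridge-regression sketch is below; the helper names, lag count, and regularization value are our assumptions, not details taken from the talk.

```python
import numpy as np

def lag_matrix(resp, n_lags):
    """Stack time-lagged copies of each response channel: (n_time, n_ch * n_lags)."""
    n_ch, n_time = resp.shape
    X = np.zeros((n_time, n_ch * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * n_ch:(lag + 1) * n_ch] = resp[:, :n_time - lag].T
    return X

def train_decoder(resp, envelope, n_lags=20, ridge=1e-3):
    """Fit ridge-regularized weights mapping lagged channels to the envelope."""
    X = lag_matrix(resp, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ envelope)

def reconstruct(resp, w, n_lags=20):
    """Apply trained decoder weights to a response to recover the envelope."""
    return lag_matrix(resp, n_lags) @ w
```

On simulated data where the response is a causally filtered copy of the envelope, this decoder recovers the envelope almost exactly, which is the behavior the reconstruction figures rely on.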

Outline
- Cortical Encoding of Speech in MEG: Representation of Spectro-temporal Features
- Neural Coding under Energetic Masking: Speech in Stationary Noise
- Neural Coding under Informational Masking: Segregation of Simultaneous Speakers

Speech Embedded in Noise
Conditions: clean speech, and speech at SNRs of +6 dB, -2 dB, and -9 dB. Intelligibility: 100%, 70%, 5%.
[Figure: spectrograms and temporal envelopes for each condition, 1-second excerpts]
10 participants; 2 minutes of stimulus in each condition.
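Presenting speech at a fixed SNR only requires scaling the noise to the right power before adding it. A small sketch of such a mixing helper (our own illustration, assuming power is measured as the mean square of the waveform):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the speech-to-noise power ratio equals snr_db, then add."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # gain that makes 10*log10(p_speech / (gain**2 * p_noise)) == snr_db
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise
```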

Neural Reconstruction of Speech
The temporal envelope of the underlying speech is reconstructed neurally from the cortical response.
[Figure: neural reconstruction overlaid on the speech envelope at +6 dB and -6 dB SNR, 1-second excerpts; reconstruction accuracy (correlation, 0 to 0.2) across SNR conditions (clean, +6 dB, +2 dB, ...)]

Contrast Gain Control
Neural compensation for the noise-induced loss of stimulus contrast.
[Figure: amplitude-intensity function (response growth of about 30 dB over a 12 dB stimulus range) and amplitude growth rate across SNR conditions (clean, +6 dB, +2 dB, ...)]

Adaptive Encoding of Modulations
Noise contains more energy at higher modulation rates, and therefore interferes with speech more at high modulation rates.
[Figure: modulation spectrum of the stimulus (0 to 15 Hz, 18 dB range) as noise increases, and the corresponding response spectrum (coherence, 0 to 0.2)]
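The modulation spectrum referred to here is the power spectrum of the temporal envelope at low frequencies. A sketch of how it might be computed (our own helper, assuming an envelope already extracted at sampling rate `fs`):

```python
import numpy as np

def modulation_spectrum(envelope, fs, f_max=15.0):
    """Power of the envelope's slow temporal modulations, up to f_max Hz."""
    env = envelope - np.mean(envelope)              # remove the DC component
    power = np.abs(np.fft.rfft(env)) ** 2 / len(env)
    freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
    keep = freqs <= f_max
    return freqs[keep], power[keep]
```

For example, an envelope modulated at 4 Hz produces a modulation spectrum peaking at 4 Hz, squarely in the range plotted on these slides.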

Adaptive Encoding of Modulations
The neural sensitivity profile shifts away from the modulation rates heavily corrupted by noise.
[Figure: modulation spectrum of the stimulus and response spectrum (coherence, 0 to 15 Hz) as noise increases; the response cutoff frequency (6 to 9 Hz) plotted across SNR conditions (clean, +6 dB, +2 dB, ...)]

Outline
- Cortical Encoding of Speech in MEG: Representation of Spectro-temporal Features
- Neural Coding under Energetic Masking: Speech in Stationary Noise
- Neural Coding under Informational Masking: Segregation of Simultaneous Speakers

Diotic Speech Segregation
Two speakers, one male and one female, were mixed and presented diotically. The subjects were instructed to focus on one or the other speaker. The MEG response is modeled using two STRFs, one for each speaker.
[Figure: the speech mixture drives two streams, Stream 1 and Stream 2, whose outputs combine into the MEG signal]
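The two-STRF model described here amounts to a sum of two linear filters, one applied to each speaker's spectrogram. A sketch of that assumed form (our own illustration; names and shapes are not from the talk):

```python
import numpy as np

def strf_filter(spec, strf):
    """Linear spectro-temporal filter: convolve over time, sum over frequency."""
    n_time = spec.shape[1]
    out = np.zeros(n_time)
    for lag in range(strf.shape[1]):
        out[lag:] += strf[:, lag] @ spec[:, :n_time - lag]
    return out

def predict_mixture_response(spec1, spec2, strf1, strf2):
    """Model the response to a two-speaker mixture as the sum of two
    stream-specific STRF outputs, one per speaker."""
    return strf_filter(spec1, strf1) + strf_filter(spec2, strf2)
```

Because the model is linear, silencing one speaker's spectrogram leaves exactly the other stream's filter output, which is what lets the two streams be fit and examined separately.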

Neural Unmixing of Concurrent Speakers
The neurally decoded envelope is more correlated with the attended speaker than with the unattended speaker in >90% of single trials (p << 0.001).
[Figure: STRFs for the attended and unattended streams (0.2 to 3 kHz, lags 0 to 200 ms); correlation (0 to 0.2) with the attended vs. unattended speaker's envelope]
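The single-trial result rests on a simple decision rule: correlate the decoded envelope with each speaker's envelope and pick the larger. A sketch of that rule (our own helper, not the authors' code):

```python
import numpy as np

def decode_attention(decoded_env, env_a, env_b):
    """Label a trial by which speaker's envelope the decoded envelope matches better."""
    r_a = np.corrcoef(decoded_env, env_a)[0, 1]
    r_b = np.corrcoef(decoded_env, env_b)[0, 1]
    return 'A' if r_a > r_b else 'B'
```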

Summary
1. Neural processing adapts to noise.
2. Simultaneous speakers can be neurally segregated and processed differently.
3. Cortical encoding is precise yet dynamic: it is modulated by both stimulus acoustics (bottom-up) and attention (top-down), leading to a robust encoding of speech in natural scenes.

Acknowledgements
We thank Stephen David, David Poeppel, Mary Howard, Shihab Shamma, and Monita Chatterjee for discussions.
SfN poster: 172.11/KK6 (Sunday, 10-11)
Contact: Nai Ding, gahding@umd.edu; Jonathan Z. Simon, jzsimon@umd.edu

Thank you!


STRF from MEG and LFP
[Figure: MEG STRF (0.2 to 3.3 kHz, lags 0 to 0.3 s) compared with STRFs from LFP recordings in ferret primary auditory cortex (A1)]
(in collaboration with Stephen David and Shihab Shamma)