Neural Representations of the Cocktail Party in Human Auditory Cortex


Neural Representations of the Cocktail Party in Human Auditory Cortex
Jonathan Z. Simon
Department of Biology; Department of Electrical & Computer Engineering; Institute for Systems Research, University of Maryland
http://www.isr.umd.edu/labs/cssl/simonlab
MPI Leipzig, 5 June 2014

Acknowledgements Grad Students Francisco Cervantes Alex Presacco Krishna Puvvada Past Grad Students Nayef Ahmar Claudia Bonin Maria Chait Marisel Villafane Delgado Kim Drnec Nai Ding Victor Grau-Serrat Ling Ma Raul Rodriguez Juanjuan Xiang Kai Sum Li Jiachen Zhuo Undergraduate Students Abdulaziz Al-Turki Nicholas Asendorf Sonja Bohr Elizabeth Camenga Corinne Cameron Julien Dagenais Katya Dombrowski Kevin Hogan Kevin Kahn Andrea Shome Madeleine Varmer Ben Walsh Collaborators Students Murat Aytekin Julian Jenkins David Klein Huan Luo Past Postdocs Dan Hertz Yadong Wang Collaborators Catherine Carr Monita Chatterjee Alain de Cheveigné Didier Depireux Mounya Elhilali Jonathan Fritz Cindy Moss David Poeppel Shihab Shamma Funding NIH R01 DC 008342 NIH R01 DC 007657 NIH R01 DC 005660 NIH R01 DC 000436 NIH R01 AG 036424 NIH R01 AG 027573 NIH R01 EB 004750 NIH R03 DC 004382 USDA 20096512005791

Introduction Magnetoencephalography (MEG) Auditory Objects Neural Representations of Auditory Objects in Cortex: Decoding Neural Representations of Auditory Objects in Cortex: Encoding Neural Representations of Speech in Noise

Magnetoencephalography
Non-invasive, passive, silent neural recordings
Simultaneous whole-head recording (~200 sensors)
Sensitivity: high: ~100 fT (10⁻¹³ Tesla); low: ~10⁴–10⁶ neurons
Temporal resolution: ~1 ms
Spatial resolution: coarse (~1 cm) and ambiguous

Neural Signals & MEG
[Figure: current flow through scalp, skull, CSF, and tissue, with the magnetic dipolar field projection recorded at the surface as B (MEG) and V (EEG); photo by Fritz Goro]
Direct electrophysiological measurement: not hemodynamic, real-time
No unique solution for a distributed source
Measures spatially synchronized cortical activity
Fine temporal resolution (~1 ms)
Moderate spatial resolution (~1 cm)

Time Course of MEG Responses
Auditory evoked responses: MEG response patterns time-locked to stimulus events
Robust; strongly lateralized
[Examples: pure tone, broadband noise]

MEG Component Analysis
Data-driven spatial filtering; many available methods: ICA, PCA, DSS
Generate spatial filters and their outputs ("components")
DSS: Denoising Source Separation, Särelä & Valpola (2005)
DSS components ordered by reproducibility; the 1st component is maximally reproducible, i.e. most stimulus-driven

DSS Example
The most reproducible filter & component optimally filters out the trial-to-trial variable signal (= neural noise)
The filter can be applied to other signals, e.g. single trials
Särelä & Valpola (2005); de Cheveigné & Simon, J. Neurosci. Methods (2008)
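As a rough illustration of the DSS idea (not the authors' exact pipeline), reproducibility-ordered components can be obtained by solving a generalized eigenvalue problem that maximizes the power of the trial-averaged (evoked) data relative to the power of the raw data; the function and variable names below are hypothetical:

```python
import numpy as np
from scipy.linalg import eigh

def dss_trial_average(epochs):
    """Denoising Source Separation biased toward trial-reproducible activity.

    epochs: array of shape (n_trials, n_channels, n_times).
    Returns (W, components): spatial filters (columns of W) and their
    outputs, ordered so the first component is maximally reproducible.
    """
    n_trials, n_ch, n_t = epochs.shape
    # Covariance of the raw single-trial data: total (evoked + noise) power.
    X = epochs.transpose(1, 0, 2).reshape(n_ch, -1)
    c_raw = X @ X.T / X.shape[1]
    # Covariance of the trial average: only reproducible power survives.
    avg = epochs.mean(axis=0)
    c_avg = avg @ avg.T / n_t
    # Maximize evoked-to-total power ratio: generalized eigenproblem.
    evals, evecs = eigh(c_avg, c_raw)
    order = np.argsort(evals)[::-1]        # most reproducible first
    W = evecs[:, order]
    components = np.einsum('ck,tcs->tks', W, epochs)
    return W, components
```

As the slide notes, the first filter can then be applied to other signals, e.g. single trials.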

DSS Example: Spectral
Frequency spectrum before DSS vs. frequency spectrum after DSS
Ding & Simon, J Neurophysiol (2009)

Phase-Locking in MEG to Slow Acoustic Modulations
AM at 3 Hz evokes a 3 Hz phase-locked response: MEG activity is precisely phase-locked to temporal modulations of sound
[Figure: response spectrum (subject R0747), 0–10 Hz, with peaks at 3 Hz and 6 Hz]
Ding & Simon, J Neurophysiol (2009); Wang et al., J Neurophysiol (2012)
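Phase-locking like this shows up as a sharp peak at the modulation frequency in the response spectrum; a minimal sketch (hypothetical function name, not the analysis actually used):

```python
import numpy as np

def response_spectrum(response, fs):
    """Amplitude spectrum of a (trial-averaged) response sampled at fs Hz."""
    n = len(response)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amps = np.abs(np.fft.rfft(response)) / n
    return freqs, amps
```

For a response phase-locked to 3 Hz AM, the largest non-DC peak falls at 3 Hz, with a possible harmonic at 6 Hz as in the slide.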

MEG Responses to Speech Modulations Auditory Model

MEG Responses Predicted by STRF Model (up to ~10 Hz)
Linear kernel = STRF, the Spectro-Temporal Response Function
Ding & Simon, J Neurophysiol (2012)
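A common way to estimate such a linear kernel (though not necessarily the exact method used here) is ridge regression on a time-lagged spectrogram; a self-contained sketch with hypothetical names:

```python
import numpy as np

def fit_strf(spec, resp, n_lags, lam=1.0):
    """Estimate a spectro-temporal response function by ridge regression.

    spec: auditory spectrogram, shape (n_freq, n_times)
    resp: neural response, shape (n_times,)
    n_lags: number of time lags in the kernel
    Returns the STRF, shape (n_freq, n_lags).
    """
    n_freq, n_t = spec.shape
    # Design matrix: each row holds the spectrogram history at one time point.
    X = np.zeros((n_t, n_freq * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * n_freq:(lag + 1) * n_freq] = spec[:, :n_t - lag].T
    # Ridge solution: (X'X + lam*I)^{-1} X'y.
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ resp)
    return w.reshape(n_lags, n_freq).T
```

The predicted response is then `X @ w`; prediction accuracy is usually reported as the correlation between predicted and measured responses.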

Neural Reconstruction of Speech Envelope
MEG responses (up to ~10 Hz) are passed through a decoder to reconstruct the speech envelope
[Figure: stimulus speech envelope vs. reconstructed stimulus speech envelope, 2 s excerpt]
Reconstruction accuracy comparable to single-unit & ECoG recordings
Ding & Simon, J Neurophysiol (2012); Zion-Golumbic et al., Neuron (2013)
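The decoding direction can be sketched as a linear backward model: a time-lagged combination of all MEG channels is regressed onto the stimulus envelope (names and details below are hypothetical, not the published implementation):

```python
import numpy as np

def _lagged(resp, n_lags):
    """Design matrix of channel activity at the current and future lags."""
    n_ch, n_t = resp.shape
    X = np.zeros((n_t, n_ch * n_lags))
    for lag in range(n_lags):
        # Response at time t+lag helps reconstruct the stimulus at time t.
        X[:n_t - lag, lag * n_ch:(lag + 1) * n_ch] = resp[:, lag:].T
    return X

def train_decoder(resp, env, n_lags, lam=1.0):
    """Fit decoder weights mapping MEG channels to the speech envelope."""
    X = _lagged(resp, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ env)

def reconstruct(resp, w, n_lags):
    """Apply the decoder to (held-out) responses."""
    return _lagged(resp, n_lags) @ w
```

Reconstruction accuracy is then the correlation between the reconstructed and actual envelopes on held-out data.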

Auditory Objects
What is an auditory object?
A perceptual construct (not neural, not acoustic)
Commonalities with visual objects
Several potential formal definitions

Auditory Object Definition
Griffiths & Warren definition:
Corresponds with something in the sensory world
Object information is separate from information of the rest of the sensory world
Abstracted: object information is generalized over particular sensory experiences

Auditory Objects at the Cocktail Party Alex Katz, The Cocktail Party


Experiments: stationary noise, speech noise-vocoding, competing speech

Speech Stream as an Auditory Object
Corresponds with something in the sensory world
Information separate from information of the rest of the sensory world, e.g. other speech streams or noise
Abstracted: object information generalized over particular sensory experiences, e.g. different sound mixtures

Neural Representation of an Auditory Object
The neural representation is of something in the sensory world
When other sounds are mixed in, the neural representation is of that auditory object, not the entire acoustic scene
The neural representation is invariant under broad changes in the specific acoustics

Selective Neural Encoding

Unselective vs. Selective Neural Encoding

Selective Neural Encoding

Stream-Specific Representation
[Figure: attended speech envelopes reconstructed from MEG, for a representative subject and for the grand average over subjects, when attending to speaker 1 vs. attending to speaker 2]
Identical stimuli!
Ding & Simon, PNAS (2013)

Single Trial Speech Reconstruction Ding & Simon, PNAS (2013)


Overall Speech Reconstruction
[Figure: bar chart of reconstruction correlation (~0.1–0.2); the attended-speech reconstruction matches the attended speech, and the background reconstruction matches the background]
Distinct neural representations for different speech streams

Invariance Under Acoustic Changes


Gain-Control Models
Stream-based, object-based, or stimulus-based gain control?
[Figure: reconstruction correlation (~0.1–0.2) for attended vs. background as a function of speaker relative intensity (−8, −5, 0, +5, +8 dB), for the neural results and for each model]
Stream-based, not stimulus-based! The neural representation is invariant to acoustic changes.

Neural Representation of an Auditory Object
The neural representation is of something in the sensory world
When other sounds are mixed in, the neural representation is of the auditory object, not the entire acoustic scene
The neural representation is invariant under broad changes in the specific acoustics

Forward STRF Model Spectro-Temporal Response Function (STRF)

STRF Results
[Figure: attended and background STRFs, frequency 0.2–3 kHz vs. time 0–300 ms, with the background TRF]
STRF separable in time and frequency
300 Hz–2 kHz dominant carriers
M50_STRF: positive peak; M100_STRF: negative peak
M100_STRF strongly modulated by attention, but not M50_STRF
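The separability claim can be quantified: a fully time-frequency separable STRF is an outer product of a temporal and a spectral profile, i.e. rank 1, so the fraction of its power captured by the top singular value measures separability. A minimal sketch (hypothetical function, not necessarily the authors' metric):

```python
import numpy as np

def separability_index(strf):
    """Fraction of STRF power in its best rank-1 (frequency x time)
    approximation; 1.0 means fully time-frequency separable."""
    s = np.linalg.svd(strf, compute_uv=False)
    return s[0] ** 2 / np.sum(s ** 2)
```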

Neural Sources
M100_STRF source near (the same as?) the M100 source: Planum Temporale
M50_STRF source is anterior and medial to M100 (the same as M50?): Heschl's Gyrus
[Figure: source locations in left and right hemispheres, anterior/posterior and medial axes, 5 mm scale]

Cortical Object-Processing Hierarchy
[Figure: attentional modulation (attended vs. background responses, 0–400 ms) and influence of relative intensity (−8, −5, 0, +5, +8 dB and clean)]
M100_STRF strongly modulated by attention, but not M50_STRF
M100_STRF invariant against acoustic changes
Objects are well represented neurally at 100 ms, but not at 50 ms

Speech in Noise
Ding & Simon, J Neuroscience (2013)

Speech in Noise: Results
[Figure: neural reconstruction of the underlying speech envelope at +6 dB and −6 dB SNR (1 s excerpts); panel B: reconstruction accuracy (correlation, 0–0.2) across SNR (quiet, +6, +2, −3, −6, −9 dB); panel C: reconstruction accuracy (0–0.3) vs. intelligibility (0–100%)]

Noise-Vocoded Speech
Speech in noise at +3 dB SNR
Ding, Chatterjee & Simon, NeuroImage (2014)

Noise-Vocoded Speech: Results
Cortical entrainment to natural speech is robust to noise; cortical entrainment to vocoded speech is not
Not explainable by passive envelope-tracking mechanisms: noise vocoding does not directly affect the stimulus envelope

Noise-Vocoded Speech: Results

Not Just Speech
Competing tone streams (Xiang et al., J Neuroscience, 2010); tone stream in a masker cloud (Elhilali et al., PLoS Biology, 2009)
[Figures: frequency vs. time schematics of the two paradigms]

Summary
The cortical representations of speech found here:
are consistent with being neural representations of auditory perceptual objects
are very robust to noise (tracking intelligibility)
rely on spectro-temporal fine structure
are explicitly temporal representations
Object representation emerges at 100 ms latency (Planum Temporale), but not by 50 ms (Heschl's Gyrus)

Thank You