Sound Analysis Research at LabROSA


Transcription

1 Sound Analysis Research at LabROSA
Dan Ellis
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
  1. Speech
  2. Music
  3. Environmental Sound
LabROSA Overview - Dan Ellis /17

2 LabROSA Overview
[Diagram. Areas: Information Extraction, Machine Learning, Signal Processing. Domains: Music, Environment, Speech. Projects: Eigenrhythms, Personal audio, Meeting turns, FDLP]

3 1. Speech Analysis / Recognition
with Sambarta Bhattacharjee
Speech recognizers work for read speech, poorly for spontaneous
  e.g. % errors 30%
Transform spontaneous speech to read?
[Figure: Read and Spontaneous panels; spontaneous speech pole freq plotted against read speech pole freq, slope < 1 (reduction)]

4 Meeting Recordings
with Jerry Liu and ICSI
Multi-mic recordings for speaker turns
  every voice reaches every mic... (?)
  ... but with differing coupling filters (delays, gains)
Find turns with minimal assumptions
  e.g. ad-hoc sensor setups (multiple PDAs)
  differences to remove effect of source signal
  - no spectral models, < 1xRT

5 Speaker Turns from Timing Diffs
Find best timing skew between mic pairs
Find clusters in high-confidence points
Fit Gaussians to each cluster, assign that class to all frames within radius
[Figure panels: ICSI0: good points; All pts: nearest class; All pts: closest dimension]
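The first step on this slide, finding the best timing skew between a pair of mics, can be illustrated with a plain cross-correlation search. This is only a minimal sketch under stated assumptions: the function name `best_skew`, the lag range, and the synthetic signals are hypothetical, and the slide does not say how the skew was actually estimated.

```python
import math
import random

def best_skew(mic_a, mic_b, max_lag):
    """Return (lag, score): the lag in samples (positive means mic_b lags
    mic_a) that maximizes the normalized cross-correlation, and that peak."""
    best_lag, best_score = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Pair mic_a[n] with mic_b[n + lag] wherever both indices are valid.
        start = max(0, -lag)
        stop = min(len(mic_a), len(mic_b) - lag)
        if stop <= start:
            continue
        a = mic_a[start:stop]
        b = mic_b[start + lag:stop + lag]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score

# Synthetic example (assumption): the same source reaches mic_b 3 samples
# later and at half the gain, i.e. a different coupling filter per mic.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(200)]
mic_a = source
mic_b = [0.0, 0.0, 0.0] + [0.5 * s for s in source[:-3]]

lag, score = best_skew(mic_a, mic_b, max_lag=10)
print(lag, round(score, 3))
```

Because the correlation is normalized, the gain difference between the mics does not affect the peak. The peak value could serve as the confidence used to pick "good points" before clustering, though that choice is also an assumption here.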

6 2. Music Signal Analysis
A lot of music data available
  e.g. 60G of MP3 00 hr of audio/k tracks
What can we do with it?
  implicit definition of music
Quality vs. quantity
  Speech recognition lesson: x data, 1/th annotation, twice as useful
Motivating Applications
  music similarity / classification
  computer (assisted) music generation
  insight into music

7 Transcription as Classification
with Graham Poliner
Signal models typically used for transcription
  harmonic spectrum, superposition
But... trade domain knowledge for data
  transcription as pure classification problem:
  Audio -> Trained classifier -> p("c0" | Audio), p("c#0" | Audio), p("d0" | Audio), p("d#0" | Audio), p("e0" | Audio), p("f0" | Audio)
  single N-way discrimination for melody
  per-note classifiers for polyphonic transcription
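The two decision rules named on this slide can be contrasted in a few lines: one N-way choice per frame for melody, and independent per-note decisions for polyphonic transcription. This is a hedged sketch only. The posterior values, the 0.5 threshold, and the function names are hypothetical, and the classifier that would produce the posteriors is not modelled.

```python
NOTES = ["c0", "c#0", "d0", "d#0", "e0", "f0"]

def melody_note(posteriors):
    """Single N-way discrimination: the one note with the largest
    posterior p(note | audio) in this frame."""
    return max(NOTES, key=lambda note: posteriors.get(note, 0.0))

def active_notes(posteriors, threshold=0.5):
    """Per-note classifiers: every note whose own output clears the
    threshold, so several notes can be active at once."""
    return [note for note in NOTES if posteriors.get(note, 0.0) >= threshold]

# Hypothetical classifier outputs for two frames (made-up numbers).
melody_frame = {"c0": 0.05, "c#0": 0.02, "d0": 0.70,
                "d#0": 0.03, "e0": 0.15, "f0": 0.05}
poly_frame = {"c0": 0.90, "c#0": 0.10, "d0": 0.20,
              "d#0": 0.10, "e0": 0.80, "f0": 0.30}

print(melody_note(melody_frame))
print(active_notes(poly_frame))
```

In the N-way case the outputs compete, so exactly one note is chosen per frame. In the per-note case each output is judged on its own, which is what allows chords.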

8 Classifier Transcription Results
Trained on MIDI syntheses (32 songs)
  SMO SVM (Weka)
Tested on ISMIR MIREX 03 set
  foreground/background separation
Frame-level pitch concordance
  system     jazz3    overall
  fg+bg      71.%     44.3%
  just fg    6.1%     4.4%

9 Eigenrhythms: Drum Pattern Space
with John Arroyo
Pop songs built on repeating drum loop
  bass drum, snare, hi-hat
  small variations on a few basic patterns
Eigen-analysis (PCA) to capture variations?
  by analyzing lots of (MIDI) data
Applications
  music categorization
  beat box synthesis

10 Eigenrhythms
Need + Eigenvectors for good coverage of 0 training patterns ( dims)
Top patterns:
[Figure: top patterns]
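As a rough illustration of the eigen-analysis idea, the sketch below finds the leading eigenvector of the covariance of a few toy drum patterns using power iteration. Everything specific here is an assumption made for the example: the 4-step grid, the four toy patterns, and the use of power iteration rather than whatever PCA routine was actually used.

```python
import math

def mean_pattern(patterns):
    """Average hit strength at each grid position."""
    n = len(patterns)
    return [sum(col) / n for col in zip(*patterns)]

def covariance(patterns):
    """Covariance matrix of the patterns about their mean."""
    mu = mean_pattern(patterns)
    centered = [[x - m for x, m in zip(p, mu)] for p in patterns]
    dims, n = len(mu), len(patterns)
    return [[sum(row[i] * row[j] for row in centered) / n
             for j in range(dims)] for i in range(dims)]

def top_eigenvector(matrix, iterations=100):
    """Leading eigenvector and eigenvalue by power iteration."""
    dims = len(matrix)
    v = [1.0 / (i + 1) for i in range(dims)]  # arbitrary non-symmetric start
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(dims)) for i in range(dims)]
    return v, sum(a * b for a, b in zip(v, mv))

# Toy patterns (assumption): one drum on a 4-step grid. The first hit is
# always present; the second hit falls on either step 3 or step 4.
patterns = [
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]

vector, eigenvalue = top_eigenvector(covariance(patterns))
print([round(x, 3) for x in vector], round(eigenvalue, 3))
```

In this toy set the only variation is whether the second hit lands on step 3 or step 4, so the leading eigenvector has opposite-signed weights on those two steps and zero weight elsewhere. Any pattern in the set is then the mean pattern plus a single coefficient times that eigenvector.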

11 Eigenrhythms for Classification
Projections in Eigenspace / LDA space
[Figure: PCA(1,2) projection (16% corr) and LDA(1,2) projection (33% corr); genres: blues, country, disco, hiphop, house, newwave, rock, pop, punk, rnb]
way Genre classification (nearest nbr):
  PCA3: % correct
  LDA4: 36% correct
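Nearest-neighbour classification in a projected space, as named on this slide, amounts to labelling a new pattern with the genre of the closest training point. The sketch below is a minimal version; the 2-D coordinates and the three labelled points are made up for illustration and are not taken from the slide's data.

```python
import math

def nearest_neighbor_label(query, labeled_points):
    """Return the label of the training point closest to `query`
    in Euclidean distance."""
    best_label, best_dist = None, math.inf
    for point, label in labeled_points:
        dist = math.dist(query, point)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical projections of training patterns onto two components.
training = [
    ((0.9, 0.1), "rock"),
    ((-0.8, 0.2), "disco"),
    ((0.1, -0.9), "hiphop"),
]

print(nearest_neighbor_label((0.7, 0.0), training))
```

The same routine applies whether the coordinates come from a PCA or an LDA projection; only the projection that produces the points changes.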

12 3. Other Sounds: Clap Detection
with Nathan Lesser from interactivemetronome.com
Rhythmic clapping may help neural development
  sensori-motor planning
  focus and attention
Interactive metronome devices give feedback on synchrony
  sensor-based
Classroom deployment?
  acoustic-based?
  for multiple simultaneous users??

13 Clap Range Discrimination
Absolute level varies
Decay slopes ~ same
  reverberation (RT60 ~ 900 ms)
Initial burst for near-field
  direct sound
[Figure: Near-field (327MUDD nf0:4) and Far-field (327MUDD ff0:4) panels; amplitude, energy (4 ms) / dB, and freq / kHz against time / s]
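The observation that decay slopes match while absolute levels differ can be checked with a straight-line fit to the energy envelope in dB: RT60 is the time for the level to fall by 60 dB, so it equals -60 divided by the fitted slope in dB per second. The sketch below is illustrative only; the synthetic envelopes, their offsets, and the function names are assumptions.

```python
def decay_slope_db_per_s(times_s, levels_db):
    """Least-squares slope of level (dB) against time (s)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_l = sum(levels_db) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_s, levels_db))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den

def rt60_from_slope(slope_db_per_s):
    """Time for the level to fall by 60 dB at the given (negative) slope."""
    return -60.0 / slope_db_per_s

# Synthetic envelopes (assumption): 4 ms energy frames, the same
# reverberant decay of 60 dB per 0.9 s, but different absolute levels.
times = [0.004 * k for k in range(50)]
true_slope = -60.0 / 0.9
near_db = [-10.0 + true_slope * t for t in times]
far_db = [-30.0 + true_slope * t for t in times]

rt60_near = rt60_from_slope(decay_slope_db_per_s(times, near_db))
rt60_far = rt60_from_slope(decay_slope_db_per_s(times, far_db))
print(round(rt60_near, 3), round(rt60_far, 3))
```

Both envelopes give the same RT60 because the fit ignores the offset. That is why the slope alone cannot separate near from far claps, and why the slide points to the initial direct-sound burst instead.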

14 Personal Audio
with Keansub Lee
Easy to record everything you hear
  ~0GB / 64 kbps
Very hard to find anything
  how to scan?
  how to visualize?
  how to index?
Starting point: Collect data
  ~ 60 hours (8 days, ~7. hr/day)
  hand-mark 139 segments (26 min/seg avg.)
  assign to 16 classes (8 have multiple instances)

15 Features for Long Recordings
Feature frames = 1 min (not 2 ms!)
Characterize variation within each frame...
... and structure within coarse auditory bands
[Figure panels (freq / bark against time / min): Average Linear Energy; Normalized Energy Deviation; Average Log Energy (dB); Log Energy Deviation (dB); Average Spectral Entropy (bits); Spectral Entropy Deviation (bits)]
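The statistics named in the figure panels can be sketched for one auditory band within one long frame. The definitions below are assumptions chosen to match the panel names (mean and standard deviation of linear and log energy, and entropy of the energy distribution across sub-bands); the slide does not give the exact formulas, and the input values are invented.

```python
import math
import statistics

def long_frame_features(energies):
    """Summarize the short-time energies of one band inside one long frame."""
    avg_lin = statistics.fmean(energies)
    # Floor at a tiny value so log10 is defined for silent frames.
    log_db = [10.0 * math.log10(max(e, 1e-12)) for e in energies]
    return {
        "avg_linear_energy": avg_lin,
        "normalized_energy_deviation": statistics.pstdev(energies) / avg_lin,
        "avg_log_energy_db": statistics.fmean(log_db),
        "log_energy_deviation_db": statistics.pstdev(log_db),
    }

def spectral_entropy_bits(sub_band_energies):
    """Entropy (bits) of the energy distribution across the sub-bands
    of one coarse band; flat spectra score high, peaky spectra low."""
    total = sum(sub_band_energies)
    probs = [e / total for e in sub_band_energies if e > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical short-time energies for one band over one long frame.
features = long_frame_features([1.0, 1.0, 4.0, 4.0])
flat = spectral_entropy_bits([1.0, 1.0, 1.0, 1.0])
peaky = spectral_entropy_bits([1.0, 0.0, 0.0, 0.0])
print(features, flat, peaky)
```

The deviation terms are what distinguish a steady background from a frame of the same average level that contains bursts. The entropy term separates noise-like content from tonal content within a band.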

16 Personal Audio Applications
Visualization / browsing / diary inference
  link in other information sources
  - diary
  - NoteTaker interface: what was I hearing?

17 LabROSA Summary
LabROSA
  signal processing + machine learning + information extraction
Applications
  Speech: Recognition, Organization
  Music: Transcription, Recommendation
  Environment: Detection, Description
Also... signal separation, compression, dolphins...

LabROSA Research Overview

LabROSA Research Overview LabROSA Research Overview Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/ 1.

More information

Using the Soundtrack to Classify Videos

Using the Soundtrack to Classify Videos Using the Soundtrack to Classify Videos Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

Using Source Models in Speech Separation

Using Source Models in Speech Separation Using Source Models in Speech Separation Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

General Soundtrack Analysis

General Soundtrack Analysis General Soundtrack Analysis Dan Ellis oratory for Recognition and Organization of Speech and Audio () Electrical Engineering, Columbia University http://labrosa.ee.columbia.edu/

More information

Lecture 3: Perception

Lecture 3: Perception ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 3: Perception 1. Ear Physiology 2. Auditory Psychophysics 3. Pitch Perception 4. Music Perception Dan Ellis Dept. Electrical Engineering, Columbia University

More information

Lecture 9: Speech Recognition: Front Ends

Lecture 9: Speech Recognition: Front Ends EE E682: Speech & Audio Processing & Recognition Lecture 9: Speech Recognition: Front Ends 1 2 Recognizing Speech Feature Calculation Dan Ellis http://www.ee.columbia.edu/~dpwe/e682/

More information

Computational Perception /785. Auditory Scene Analysis

Computational Perception /785. Auditory Scene Analysis Computational Perception 15-485/785 Auditory Scene Analysis A framework for auditory scene analysis Auditory scene analysis involves low and high level cues Low level acoustic cues are often result in

More information

Psychoacoustical Models WS 2016/17

Psychoacoustical Models WS 2016/17 Psychoacoustical Models WS 2016/17 related lectures: Applied and Virtual Acoustics (Winter Term) Advanced Psychoacoustics (Summer Term) Sound Perception 2 Frequency and Level Range of Human Hearing Source:

More information

Sound Texture Classification Using Statistics from an Auditory Model

Sound Texture Classification Using Statistics from an Auditory Model Sound Texture Classification Using Statistics from an Auditory Model Gabriele Carotti-Sha Evan Penn Daniel Villamizar Electrical Engineering Email: gcarotti@stanford.edu Mangement Science & Engineering

More information

Robustness, Separation & Pitch

Robustness, Separation & Pitch Robustness, Separation & Pitch or Morgan, Me & Pitch Dan Ellis Columbia / ICSI dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/ 1. Robustness and Separation 2. An Academic Journey 3. Future COLUMBIA

More information

SoundSense: Scalable Sound Sensing for People-Centric Applications on Mobile Phones

SoundSense: Scalable Sound Sensing for People-Centric Applications on Mobile Phones SoundSense: Scalable Sound Sensing for People-Centric Applications on Mobile Phones Hong Lu, Wei Pan, Nicholas D. Lane, Tanzeem Choudhury, Andrew T. Campbell Dept. of Computer Science, Dartmouth College

More information

Hearing the Universal Language: Music and Cochlear Implants

Hearing the Universal Language: Music and Cochlear Implants Hearing the Universal Language: Music and Cochlear Implants Professor Hugh McDermott Deputy Director (Research) The Bionics Institute of Australia, Professorial Fellow The University of Melbourne Overview?

More information

whether or not the fundamental is actually present.

whether or not the fundamental is actually present. 1) Which of the following uses a computer CPU to combine various pure tones to generate interesting sounds or music? 1) _ A) MIDI standard. B) colored-noise generator, C) white-noise generator, D) digital

More information

Auditory Scene Analysis

Auditory Scene Analysis 1 Auditory Scene Analysis Albert S. Bregman Department of Psychology McGill University 1205 Docteur Penfield Avenue Montreal, QC Canada H3A 1B1 E-mail: bregman@hebb.psych.mcgill.ca To appear in N.J. Smelzer

More information

Recognition & Organization of Speech & Audio

Recognition & Organization of Speech & Audio Recognition & Organization of Speech & Audio Dan Ellis http://labrosa.ee.columbia.edu/ Outline 1 2 3 Introducing Projects in speech, music & audio Summary overview - Dan Ellis 21-9-28-1 1 Sound organization

More information

Sound Interfaces Engineering Interaction Technologies. Prof. Stefanie Mueller HCI Engineering Group

Sound Interfaces Engineering Interaction Technologies. Prof. Stefanie Mueller HCI Engineering Group Sound Interfaces 6.810 Engineering Interaction Technologies Prof. Stefanie Mueller HCI Engineering Group what is sound? if a tree falls in the forest and nobody is there does it make sound?

More information

EECS 433 Statistical Pattern Recognition

EECS 433 Statistical Pattern Recognition EECS 433 Statistical Pattern Recognition Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1 / 19 Outline What is Pattern

More information

Using Speech Models for Separation

Using Speech Models for Separation Using Speech Models for Separation Dan Ellis Comprising the work of Michael Mandel and Ron Weiss Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY

More information

Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information

Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information C. Busso, Z. Deng, S. Yildirim, M. Bulut, C. M. Lee, A. Kazemzadeh, S. Lee, U. Neumann, S. Narayanan Emotion

More information

The effect of wearing conventional and level-dependent hearing protectors on speech production in noise and quiet

The effect of wearing conventional and level-dependent hearing protectors on speech production in noise and quiet The effect of wearing conventional and level-dependent hearing protectors on speech production in noise and quiet Ghazaleh Vaziri Christian Giguère Hilmi R. Dajani Nicolas Ellaham Annual National Hearing

More information

J Jeffress model, 3, 66ff

J Jeffress model, 3, 66ff Index A Absolute pitch, 102 Afferent projections, inferior colliculus, 131 132 Amplitude modulation, coincidence detector, 152ff inferior colliculus, 152ff inhibition models, 156ff models, 152ff Anatomy,

More information

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening

AUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening AUDL GS08/GAV1 Signals, systems, acoustics and the ear Pitch & Binaural listening Review 25 20 15 10 5 0-5 100 1000 10000 25 20 15 10 5 0-5 100 1000 10000 Part I: Auditory frequency selectivity Tuning

More information

Sound, Mixtures, and Learning

Sound, Mixtures, and Learning Sound, Mixtures, and Learning Dan Ellis Laboratory for Recognition and Organization of Speech and Audio (LabROSA) Electrical Engineering, Columbia University http://labrosa.ee.columbia.edu/

More information

An Auditory-Model-Based Electrical Stimulation Strategy Incorporating Tonal Information for Cochlear Implant

An Auditory-Model-Based Electrical Stimulation Strategy Incorporating Tonal Information for Cochlear Implant Annual Progress Report An Auditory-Model-Based Electrical Stimulation Strategy Incorporating Tonal Information for Cochlear Implant Joint Research Centre for Biomedical Engineering Mar.7, 26 Types of Hearing

More information

Auditory gist perception and attention

Auditory gist perception and attention Auditory gist perception and attention Sue Harding Speech and Hearing Research Group University of Sheffield POP Perception On Purpose Since the Sheffield POP meeting: Paper: Auditory gist perception:

More information

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception ISCA Archive VOQUAL'03, Geneva, August 27-29, 2003 Jitter, Shimmer, and Noise in Pathological Voice Quality Perception Jody Kreiman and Bruce R. Gerratt Division of Head and Neck Surgery, School of Medicine

More information

Source and Description Category of Practice Level of CI User How to Use Additional Information. Intermediate- Advanced. Beginner- Advanced

Source and Description Category of Practice Level of CI User How to Use Additional Information. Intermediate- Advanced. Beginner- Advanced Source and Description Category of Practice Level of CI User How to Use Additional Information Randall s ESL Lab: http://www.esllab.com/ Provide practice in listening and comprehending dialogue. Comprehension

More information

Topic 4. Pitch & Frequency

Topic 4. Pitch & Frequency Topic 4 Pitch & Frequency A musical interlude KOMBU This solo by Kaigal-ool of Huun-Huur-Tu (accompanying himself on doshpuluur) demonstrates perfectly the characteristic sound of the Xorekteer voice An

More information

Outline. Teager Energy and Modulation Features for Speech Applications. Dept. of ECE Technical Univ. of Crete

Outline. Teager Energy and Modulation Features for Speech Applications. Dept. of ECE Technical Univ. of Crete Teager Energy and Modulation Features for Speech Applications Alexandros Summariza(on Potamianos and Emo(on Tracking in Movies Dept. of ECE Technical Univ. of Crete Alexandros Potamianos, NatIONAL Tech.

More information

Auditory scene analysis in humans: Implications for computational implementations.

Auditory scene analysis in humans: Implications for computational implementations. Auditory scene analysis in humans: Implications for computational implementations. Albert S. Bregman McGill University Introduction. The scene analysis problem. Two dimensions of grouping. Recognition

More information

Linguistic Phonetics Fall 2005

Linguistic Phonetics Fall 2005 MIT OpenCourseWare http://ocw.mit.edu 24.963 Linguistic Phonetics Fall 2005 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 24.963 Linguistic Phonetics

More information

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics

! Can hear whistle? ! Where are we on course map? ! What we did in lab last week. ! Psychoacoustics 2/14/18 Can hear whistle? Lecture 5 Psychoacoustics Based on slides 2009--2018 DeHon, Koditschek Additional Material 2014 Farmer 1 2 There are sounds we cannot hear Depends on frequency Where are we on

More information

Communication with low-cost hearing protectors: hear, see and believe

Communication with low-cost hearing protectors: hear, see and believe 12th ICBEN Congress on Noise as a Public Health Problem Communication with low-cost hearing protectors: hear, see and believe Annelies Bockstael 1,3, Lies De Clercq 2, Dick Botteldooren 3 1 Université

More information

Sound Production. Phonotaxis in crickets. What is sound? Recognition and Localization. What happens over time at a single point in space?

Sound Production. Phonotaxis in crickets. What is sound? Recognition and Localization. What happens over time at a single point in space? Behaviour Sound Production Phonotaxis in crickets Recognition and Localization scraper close scraper file open Males open and close wings rhythmically. On each closing stroke, scraper contacts file causing

More information

Accessible Computing Research for Users who are Deaf and Hard of Hearing (DHH)

Accessible Computing Research for Users who are Deaf and Hard of Hearing (DHH) Accessible Computing Research for Users who are Deaf and Hard of Hearing (DHH) Matt Huenerfauth Raja Kushalnagar Rochester Institute of Technology DHH Auditory Issues Links Accents/Intonation Listening

More information

Error Detection based on neural signals

Error Detection based on neural signals Error Detection based on neural signals Nir Even- Chen and Igor Berman, Electrical Engineering, Stanford Introduction Brain computer interface (BCI) is a direct communication pathway between the brain

More information

The Effect of Analysis Methods and Input Signal Characteristics on Hearing Aid Measurements

The Effect of Analysis Methods and Input Signal Characteristics on Hearing Aid Measurements The Effect of Analysis Methods and Input Signal Characteristics on Hearing Aid Measurements By: Kristina Frye Section 1: Common Source Types FONIX analyzers contain two main signal types: Puretone and

More information

EEG Signal Description with Spectral-Envelope- Based Speech Recognition Features for Detection of Neonatal Seizures

EEG Signal Description with Spectral-Envelope- Based Speech Recognition Features for Detection of Neonatal Seizures EEG Signal Description with Spectral-Envelope- Based Speech Recognition Features for Detection of Neonatal Seizures Temko A., Nadeu C., Marnane W., Boylan G., Lightbody G. presented by Ladislav Rampasek

More information

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds Hearing Lectures Acoustics of Speech and Hearing Week 2-10 Hearing 3: Auditory Filtering 1. Loudness of sinusoids mainly (see Web tutorial for more) 2. Pitch of sinusoids mainly (see Web tutorial for more)

More information

Infant Hearing Development: Translating Research Findings into Clinical Practice. Auditory Development. Overview

Infant Hearing Development: Translating Research Findings into Clinical Practice. Auditory Development. Overview Infant Hearing Development: Translating Research Findings into Clinical Practice Lori J. Leibold Department of Allied Health Sciences The University of North Carolina at Chapel Hill Auditory Development

More information

Oral Presentation #6 Clinical Analysis of Speech Rhythms in Language Development using MATLAB

Oral Presentation #6 Clinical Analysis of Speech Rhythms in Language Development using MATLAB Oral Presentation #6 Clinical Analysis of Speech Rhythms in Language Development using MATLAB Ben Christ, Madeline Girard, Zeynep Sayar, Cathleen Trespasz Problem Statement Preliminary research has been

More information

GfK Verein. Detecting Emotions from Voice

GfK Verein. Detecting Emotions from Voice GfK Verein Detecting Emotions from Voice Respondents willingness to complete questionnaires declines But it doesn t necessarily mean that consumers have nothing to say about products or brands: GfK Verein

More information

Linguistic Phonetics. Basic Audition. Diagram of the inner ear removed due to copyright restrictions.

Linguistic Phonetics. Basic Audition. Diagram of the inner ear removed due to copyright restrictions. 24.963 Linguistic Phonetics Basic Audition Diagram of the inner ear removed due to copyright restrictions. 1 Reading: Keating 1985 24.963 also read Flemming 2001 Assignment 1 - basic acoustics. Due 9/22.

More information

BMMC (UG SDE) IV SEMESTER

BMMC (UG SDE) IV SEMESTER UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION BMMC (UG SDE) IV SEMESTER GENERAL COURSE IV COMMON FOR Bsc ELECTRONICS, COMPUTER SCIENCE, INSTRUMENTATION & MULTIMEDIA BASICS OF AUDIO & VIDEO MEDIA QUESTION

More information

Technical Discussion HUSHCORE Acoustical Products & Systems

Technical Discussion HUSHCORE Acoustical Products & Systems What Is Noise? Noise is unwanted sound which may be hazardous to health, interfere with speech and verbal communications or is otherwise disturbing, irritating or annoying. What Is Sound? Sound is defined

More information

Power Instruments, Power sources: Trends and Drivers. Steve Armstrong September 2015

Power Instruments, Power sources: Trends and Drivers. Steve Armstrong September 2015 Power Instruments, Power sources: Trends and Drivers Steve Armstrong September 2015 Focus of this talk more significant losses Severe Profound loss Challenges Speech in quiet Speech in noise Better Listening

More information

Two Modified IEC Ear Simulators for Extended Dynamic Range

Two Modified IEC Ear Simulators for Extended Dynamic Range Two Modified IEC 60318-4 Ear Simulators for Extended Dynamic Range Peter Wulf-Andersen & Morten Wille The international standard IEC 60318-4 specifies an occluded ear simulator, often referred to as a

More information

Biomedical Engineering laboratory - research topics

Biomedical Engineering laboratory - research topics Biomedical Engineering laboratory - research topics Maayan Ventures & BME Biomedical Engineering Knowledge Center 20. February 2008. Background Biomedical Engineering: started in 1995 Host institute: BME

More information

Advanced Audio Interface for Phonetic Speech. Recognition in a High Noise Environment

Advanced Audio Interface for Phonetic Speech. Recognition in a High Noise Environment DISTRIBUTION STATEMENT A Approved for Public Release Distribution Unlimited Advanced Audio Interface for Phonetic Speech Recognition in a High Noise Environment SBIR 99.1 TOPIC AF99-1Q3 PHASE I SUMMARY

More information

Visi-Pitch IV is the latest version of the most widely

Visi-Pitch IV is the latest version of the most widely APPLICATIONS Voice Disorders Motor Speech Disorders Voice Typing Fluency Selected Articulation Training Hearing-Impaired Speech Professional Voice Accent Reduction and Second Language Learning Importance

More information

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES Varinthira Duangudom and David V Anderson School of Electrical and Computer Engineering, Georgia Institute of Technology Atlanta, GA 30332

More information

Classroom Acoustics Research

Classroom Acoustics Research Classroom Acoustics Research Lianna Curiale, Fatima Yashmin, Elaine Angelopulos, Stacy Amar and Anatoly Syutkin. Project supervisor: Gabriel Bulgarea Introduction: [American Speech-Language-Hearing Association;

More information

Computational Models of Mammalian Hearing:

Computational Models of Mammalian Hearing: Computational Models of Mammalian Hearing: Frank Netter and his Ciba paintings An Auditory Image Approach Dick Lyon For Tom Dean s Cortex Class Stanford, April 14, 2010 Breschet 1836, Testut 1897 167 years

More information

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music)

Topic 4. Pitch & Frequency. (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music) Topic 4 Pitch & Frequency (Some slides are adapted from Zhiyao Duan s course slides on Computer Audition and Its Applications in Music) A musical interlude KOMBU This solo by Kaigal-ool of Huun-Huur-Tu

More information

Providing Effective Communication Access

Providing Effective Communication Access Providing Effective Communication Access 2 nd International Hearing Loop Conference June 19 th, 2011 Matthew H. Bakke, Ph.D., CCC A Gallaudet University Outline of the Presentation Factors Affecting Communication

More information

HOW TO USE THE SHURE MXA910 CEILING ARRAY MICROPHONE FOR VOICE LIFT

HOW TO USE THE SHURE MXA910 CEILING ARRAY MICROPHONE FOR VOICE LIFT HOW TO USE THE SHURE MXA910 CEILING ARRAY MICROPHONE FOR VOICE LIFT Created: Sept 2016 Updated: June 2017 By: Luis Guerra Troy Jensen The Shure MXA910 Ceiling Array Microphone offers the unique advantage

More information

Chapter 1: Introduction to digital audio

Chapter 1: Introduction to digital audio Chapter 1: Introduction to digital audio Applications: audio players (e.g. MP3), DVD-audio, digital audio broadcast, music synthesizer, digital amplifier and equalizer, 3D sound synthesis 1 Properties

More information

A Sleeping Monitor for Snoring Detection

A Sleeping Monitor for Snoring Detection EECS 395/495 - mhealth McCormick School of Engineering A Sleeping Monitor for Snoring Detection By Hongwei Cheng, Qian Wang, Tae Hun Kim Abstract Several studies have shown that snoring is the first symptom

More information

Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise

Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise 4 Special Issue Speech-Based Interfaces in Vehicles Research Report Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise Hiroyuki Hoshino Abstract This

More information

What you re in for. Who are cochlear implants for? The bottom line. Speech processing schemes for

What you re in for. Who are cochlear implants for? The bottom line. Speech processing schemes for What you re in for Speech processing schemes for cochlear implants Stuart Rosen Professor of Speech and Hearing Science Speech, Hearing and Phonetic Sciences Division of Psychology & Language Sciences

More information

Prelude Envelope and temporal fine. What's all the fuss? Modulating a wave. Decomposing waveforms. The psychophysics of cochlear

Prelude Envelope and temporal fine. What's all the fuss? Modulating a wave. Decomposing waveforms. The psychophysics of cochlear The psychophysics of cochlear implants Stuart Rosen Professor of Speech and Hearing Science Speech, Hearing and Phonetic Sciences Division of Psychology & Language Sciences Prelude Envelope and temporal

More information

Carnegie Mellon University Annual Progress Report: 2011 Formula Grant

Carnegie Mellon University Annual Progress Report: 2011 Formula Grant Carnegie Mellon University Annual Progress Report: 2011 Formula Grant Reporting Period January 1, 2012 June 30, 2012 Formula Grant Overview The Carnegie Mellon University received $943,032 in formula funds

More information

Evaluating Auditory Contexts and Their Impacts on Hearing Aid Outcomes with Mobile Phones

Evaluating Auditory Contexts and Their Impacts on Hearing Aid Outcomes with Mobile Phones Evaluating Auditory Contexts and Their Impacts on Hearing Aid Outcomes with Mobile Phones Syed Shabih Hasan, Octav Chipara Department of Computer Science/Aging Mind and Brain Initiative (AMBI) Yu-Hsiang

More information

TOLERABLE DELAY FOR SPEECH PROCESSING: EFFECTS OF HEARING ABILITY AND ACCLIMATISATION

TOLERABLE DELAY FOR SPEECH PROCESSING: EFFECTS OF HEARING ABILITY AND ACCLIMATISATION TOLERABLE DELAY FOR SPEECH PROCESSING: EFFECTS OF HEARING ABILITY AND ACCLIMATISATION Tobias Goehring, PhD Previous affiliation (this project): Institute of Sound and Vibration Research University of Southampton

More information

Overview 6/27/16. Rationale for Real-time Text in the Classroom. What is Real-Time Text?

Overview 6/27/16. Rationale for Real-time Text in the Classroom. What is Real-Time Text? Access to Mainstream Classroom Instruction Through Real-Time Text Michael Stinson, Rochester Institute of Technology National Technical Institute for the Deaf Presentation at Best Practice in Mainstream

More information

ipod Noise Exposure Assessment in Simulated Environmental Conditions

ipod Noise Exposure Assessment in Simulated Environmental Conditions ipod Noise Exposure Assessment in Simulated Environmental Conditions Kyle N. Acker Advised by: Robert Novak Michael Heinz Background Since the 80s and the invention of the personal audio player, there

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION 1.1 BACKGROUND Speech is the most natural form of human communication. Speech has also become an important means of human-machine interaction and the advancement in technology has

More information

HAT Process: Determining HAT for a Student

HAT Process: Determining HAT for a Student HAT Process: Determining HAT for a Student (Approved DSR or request Considerations for IEP Team) Audio & HI / TC determine who will be on team to determine HAT Team completes Needs Identification section

More information

Music and Hearing in the Older Population: an Audiologist's Perspective

Music and Hearing in the Older Population: an Audiologist's Perspective Music and Hearing in the Older Population: an Audiologist's Perspective Dwight Ough, M.A., CCC-A Audiologist Charlotte County Hearing Health Care Centre Inc. St. Stephen, New Brunswick Anatomy and Physiology

More information

Chapter 3. Sounds, Signals, and Studio Acoustics

Chapter 3. Sounds, Signals, and Studio Acoustics Chapter 3 Sounds, Signals, and Studio Acoustics Sound Waves Compression/Rarefaction: speaker cone Sound travels 1130 feet per second Sound waves hit receiver Sound waves tend to spread out as they travel

More information

Introduction to Audio Forensics 8 th MyCERT SIG 25/04/2006

Introduction to Audio Forensics 8 th MyCERT SIG 25/04/2006 Introduction to Audio Forensics 8 th MyCERT SIG 25/04/2006 By: Mohd Zabri Adil Talib zabriadil.talib@niser.org.my (C) 2006 - CFL NISER 1 Agenda Audio forensics definition Audio forensics origin Technologies

More information

Speech (Sound) Processing

Speech (Sound) Processing 7 Speech (Sound) Processing Acoustic Human communication is achieved when thought is transformed through language into speech. The sounds of speech are initiated by activity in the central nervous system,

More information

The following information relates to NEC products offered under our GSA Schedule GS-35F- 0245J and other Federal Contracts.

The following information relates to NEC products offered under our GSA Schedule GS-35F- 0245J and other Federal Contracts. The following information relates to NEC products offered under our GSA Schedule GS-35F- 0245J and other Federal Contracts. NEC Unified Solutions, Inc., based upon its interpretation of the Section 508

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Speech Communication Session 4aSCb: Voice and F0 Across Tasks (Poster

More information
