General Soundtrack Analysis

Size: px
Start display at page:

Download "General Soundtrack Analysis"

Transcription

1 General Soundtrack Analysis Dan Ellis oratory for Recognition and Organization of Speech and Audio () Electrical Engineering, Columbia University Outline introduction Broadcast soundtrack monitoring Example technologies Summary FBIS - Dan Ellis

2 1 : Sound Organization freq / Hz alj time / sec Analysis Voice1 Music1 Stab Music2 Music2 (quiet) Voice2 Analyzing and describing complex sounds: - continuous sound mixture distinct objects & events Human listeners as the prototype - strong subjective impression when listening -..but hard to see in signal FBIS - Dan Ellis

3 The information in sound freq / Hz Steps 1 Steps time / s Hearing confers evolutionary advantage - optimized to get useful information from sound Enormous detail is available in familiar sounds - ecological influence on our sense of hearing FBIS - Dan Ellis

4 Audio Information Extraction Text Information Retrieval IR AutomaticSpeech Recognition ASR Text Information Extraction IE Blind Source Separation BSS Spoken Document Retrieval SDR Independent Component Analysis ICA Audio Information Extraction AIE Music Information Retrieval Music IR Computational Auditory Scene Analysis CASA Audio Fingerprinting Audio Content-based Retrieval Audio CBR Unsupervised Clustering Multimedia Information Retrieval MMIR Domain - text... speech... music... general audio Operation - recognize... index/retrieve... organize FBIS - Dan Ellis

5 Summary DOMAINS Broadcast Movies Lectures Music Meetings Personal recordings Location monitoring HCI Object-based structure discovery & learning Speech recognition Speaker description Nonspeech recognition Scene Analysis Audio-visual integration Music analysis APPLICATIONS Multimedia access & search Personal media management Machine perception & awareness Prostheses / human augmentation Automatic judgments/recommendation FBIS - Dan Ellis

6 Outline introduction Broadcast soundtrack monitoring - Available information - Information filtering - Intelligent analysis Example technologies Summary FBIS - Dan Ellis

7 2 Broadcast soundtrack monitoring: What information is available? Video and soundtrack reinforce, complement - hard things in one domain can be easy in other Information at different scales: generic sound events dialog style words program genre speaker emotion story coverage activity level schedule changes specific speaker ID seconds days timescale FBIS - Dan Ellis

8 Information filtering Maximizing analyst utility: information selection Automatic support for triage : - segmentation - inferring schedules - identify repeated segments - generic categorization/labeling - task-specific flagging FBIS - Dan Ellis

9 Automatic labeling/flagging Computer as vigilant monitor Models Monitoring computer Signal Flagged events Range of tasks - generic unusual patterns - specific trigger events Issues - false alarms vs. misses - combinations of simple detectors for higher-level classes e.g. program type FBIS - Dan Ellis

10 Schedule summarization Learn the patterns of a particular broadcaster: - 24/7 monitoring - automatic segmentation into programs - automatic classification of genres 6 12 M T W Th F S Su News Talk Sports Music Other Support information extraction - program types, repetitions, summarization - schedule changes activity? FBIS - Dan Ellis

11 Outline introduction Broadcast soundtrack monitoring Example technologies - Sound enhancement - Locating repetitions - eling soundtracks Summary FBIS - Dan Ellis

12 3 Example technologies: Sound enhancement Time scale modification - speed up for rapid skimming - slow down for close listening Noise reduction freq / Hz Original speech Sped up by 5% Slowed down to 5% Noise reduced at pk - 2dB time / sec FBIS - Dan Ellis

13 Repetition detection Matching signals is expensive - but matching strings is fast Represent audio as simple string - recognition into arbitrary classes - match detection becomes string matching spee cell Models perc viol anim mach soundtrack Recognizer v p a v c a String match detection target Recognizer p a c p v a s Monitor for many signatures FBIS - Dan Ellis

14 Pr(mus) freq / Hz Soundtrack labeling Broad class: speech-music alj time / sec Specific events: alarms freq / Hz 4 2 Speech + alarm 1 Pr(alarm) time / sec FBIS - Dan Ellis

15 Speaking style Recognizing information other than words Meeting Recorder project: Locate overlaps Spkr A backchannel (signals desire to regain floor?) floor seizure mr speaker active Spkr B speake cedes f Spkr C interruptions Spkr D breath noise Spkr E crosstalk Table top time / secs level/db 4 2 C A O M JL mr Speaker emotion - depends on good baseline FBIS - Dan Ellis

16 Outline introduction Broadcast soundtrack monitoring Example technologies Summary FBIS - Dan Ellis

17 4 Summary: Soundtrack analysis Soundtrack carries information - useful and detailed - complementary to image - rapid processing Open source monitoring - Need to find the interesting bits - Short-time: specific detectors - Long-time: schedule, genre classification Current techniques - Signal enhancement - Detectors for speech, music, alarms etc. - Classification of interaction style, emotion FBIS - Dan Ellis

18 Extra slides FBIS - Dan Ellis

19 Automatic Speech Recognition (ASR) Standard speech recognition structure: Feature calculation sound D A T A Acoustic model parameters Word models s ah t Language model p("sat" "the","cat") p("saw" "the","cat") Acoustic classifier HMM decoder feature vectors phone probabilities phone / word sequence Understanding/ application... State of the art word-error rates (WERs): - 2% (dictation) - 3% (telephone conversations) Can use multiple streams... FBIS - Dan Ellis

20 The Meeting Recorder project (with ICSI, UW, SRI, IBM) Microphones in conventional meetings - for summarization/retrieval/behavior analysis - informal, overlapped speech Data collection (ICSI, UW,...): - 1 hours collected, ongoing transcription - headsets + tabletop + PDA FBIS - Dan Ellis

21 Crosstalk cancellation Baseline speaker activity detection is hard: Spkr A backchannel (signals desire to regain floor?) floor seizure mr speaker active Spkr B speaker B cedes floor Spkr C interruptions Spkr D breath noise Spkr E crosstalk Table top time / secs level/db 4 2 Noisy crosstalk model: m = C s + n Estimate subband C Aa from A s peak energy -... including pure delay (1 ms frames) -... then linear inversion FBIS - Dan Ellis

22 Alarm sound detection Alarm sounds have particular structure - people know them when they hear them Isolate alarms in sound mixtures freq / Hz time / sec representation of energy in time-frequency - formation of atomic elements - grouping by common properties (onset &c.) - classify by attributes... Key: recognize despite background FBIS - Dan Ellis

23 Audio Information Retrieval (with Manuel Reyes) Searching in a database of audio - speech.. use ASR - text annotations.. search them - sound effects library? e.g. Muscle Fish SoundFisher browser - define multiple perceptual feature dimensions - search by proximity in (weighted) feature space Segment feature analysis Sound segment database Feature vectors Seach/ comparison Results Segment feature analysis Query example - features are global for each soundfile, no attempt to separate mixtures FBIS - Dan Ellis

24 Audio Retrieval: Results Musclefish corpus - most commonly reported set Features - mfcc, brightness, bandwidth, pitch... - no temporal sequence structure Results: - 28 examples, 16 classes, 84% correct - confusions: Musical instrs. 136 (14) Instr Spch Env Anim Mech Speech 17 (7) 2 Eviron. 2 6 (1) Animals () Mechanical 1 15 (2) FBIS - Dan Ellis

25 CASA for audio retrieval When audio material contains mixtures, global features are insufficient Retrieval based on element/object analysis: Generic element analysis Object formation Continuous audio archive Element representations Objects + properties Seach/ comparison Results Generic element analysis (Object formation) Query example rushing water Word-to-class mapping Symbolic query Properties alone - features are calculated over grouped subsets FBIS - Dan Ellis

Recognition & Organization of Speech & Audio

Recognition & Organization of Speech & Audio Recognition & Organization of Speech & Audio Dan Ellis http://labrosa.ee.columbia.edu/ Outline 1 2 3 Introducing Projects in speech, music & audio Summary overview - Dan Ellis 21-9-28-1 1 Sound organization

More information

Recognition & Organization of Speech and Audio

Recognition & Organization of Speech and Audio Recognition & Organization of Speech and Audio Dan Ellis Electrical Engineering, Columbia University http://www.ee.columbia.edu/~dpwe/ Outline 1 2 3 4 5 Introducing Tandem modeling

More information

Recognition & Organization of Speech and Audio

Recognition & Organization of Speech and Audio Recognition & Organization of Speech and Audio Dan Ellis Electrical Engineering, Columbia University Outline 1 2 3 4 5 Sound organization Background & related work Existing projects

More information

Recognition & Organization of Speech and Audio

Recognition & Organization of Speech and Audio Recognition & Organization of Speech and Audio Dan Ellis Electrical Engineering, Columbia University http://www.ee.columbia.edu/~dpwe/ Outline 1 2 3 4 Introducing Robust speech recognition

More information

Robustness, Separation & Pitch

Robustness, Separation & Pitch Robustness, Separation & Pitch or Morgan, Me & Pitch Dan Ellis Columbia / ICSI dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/ 1. Robustness and Separation 2. An Academic Journey 3. Future COLUMBIA

More information

Using the Soundtrack to Classify Videos

Using the Soundtrack to Classify Videos Using the Soundtrack to Classify Videos Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

Using Source Models in Speech Separation

Using Source Models in Speech Separation Using Source Models in Speech Separation Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

Sound, Mixtures, and Learning

Sound, Mixtures, and Learning Sound, Mixtures, and Learning Dan Ellis Laboratory for Recognition and Organization of Speech and Audio (LabROSA) Electrical Engineering, Columbia University http://labrosa.ee.columbia.edu/

More information

Lecture 9: Speech Recognition: Front Ends

Lecture 9: Speech Recognition: Front Ends EE E682: Speech & Audio Processing & Recognition Lecture 9: Speech Recognition: Front Ends 1 2 Recognizing Speech Feature Calculation Dan Ellis http://www.ee.columbia.edu/~dpwe/e682/

More information

Computational Auditory Scene Analysis: An overview and some observations. CASA survey. Other approaches

Computational Auditory Scene Analysis: An overview and some observations. CASA survey. Other approaches CASA talk - Haskins/NUWC - Dan Ellis 1997oct24/5-1 The importance of auditory illusions for artificial listeners 1 Dan Ellis International Computer Science Institute, Berkeley CA

More information

Sound Analysis Research at LabROSA

Sound Analysis Research at LabROSA Sound Analysis Research at LabROSA Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

LabROSA Research Overview

LabROSA Research Overview LabROSA Research Overview Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/ 1.

More information

Speech as HCI. HCI Lecture 11. Human Communication by Speech. Speech as HCI(cont. 2) Guest lecture: Speech Interfaces

Speech as HCI. HCI Lecture 11. Human Communication by Speech. Speech as HCI(cont. 2) Guest lecture: Speech Interfaces HCI Lecture 11 Guest lecture: Speech Interfaces Hiroshi Shimodaira Institute for Communicating and Collaborative Systems (ICCS) Centre for Speech Technology Research (CSTR) http://www.cstr.ed.ac.uk Thanks

More information

Sound, Mixtures, and Learning

Sound, Mixtures, and Learning Sound, Mixtures, and Learning Dan Ellis oratory for Recognition and Organization of Speech and Audio () Electrical Engineering, Columbia University http://labrosa.ee.columbia.edu/

More information

Automatic audio analysis for content description & indexing

Automatic audio analysis for content description & indexing Automatic audio analysis for content description & indexing Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 4 5 Auditory Scene Analysis (ASA) Computational

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception Babies 'cry in mother's tongue' HCS 7367 Speech Perception Dr. Peter Assmann Fall 212 Babies' cries imitate their mother tongue as early as three days old German researchers say babies begin to pick up

More information

Review of SPRACH/Thisl meetings Cambridge UK, 1998sep03/04

Review of SPRACH/Thisl meetings Cambridge UK, 1998sep03/04 Review of SPRACH/Thisl meetings Cambridge UK, 1998sep03/04 Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 4 5 SPRACH overview The SPRACH Broadcast

More information

Advanced Audio Interface for Phonetic Speech. Recognition in a High Noise Environment

Advanced Audio Interface for Phonetic Speech. Recognition in a High Noise Environment DISTRIBUTION STATEMENT A Approved for Public Release Distribution Unlimited Advanced Audio Interface for Phonetic Speech Recognition in a High Noise Environment SBIR 99.1 TOPIC AF99-1Q3 PHASE I SUMMARY

More information

Modulation and Top-Down Processing in Audition

Modulation and Top-Down Processing in Audition Modulation and Top-Down Processing in Audition Malcolm Slaney 1,2 and Greg Sell 2 1 Yahoo! Research 2 Stanford CCRMA Outline The Non-Linear Cochlea Correlogram Pitch Modulation and Demodulation Information

More information

Consonant Perception test

Consonant Perception test Consonant Perception test Introduction The Vowel-Consonant-Vowel (VCV) test is used in clinics to evaluate how well a listener can recognize consonants under different conditions (e.g. with and without

More information

Lecture 3: Perception

Lecture 3: Perception ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 3: Perception 1. Ear Physiology 2. Auditory Psychophysics 3. Pitch Perception 4. Music Perception Dan Ellis Dept. Electrical Engineering, Columbia University

More information

Noise-Robust Speech Recognition Technologies in Mobile Environments

Noise-Robust Speech Recognition Technologies in Mobile Environments Noise-Robust Speech Recognition echnologies in Mobile Environments Mobile environments are highly influenced by ambient noise, which may cause a significant deterioration of speech recognition performance.

More information

Outline. Teager Energy and Modulation Features for Speech Applications. Dept. of ECE Technical Univ. of Crete

Outline. Teager Energy and Modulation Features for Speech Applications. Dept. of ECE Technical Univ. of Crete Teager Energy and Modulation Features for Speech Applications Alexandros Summariza(on Potamianos and Emo(on Tracking in Movies Dept. of ECE Technical Univ. of Crete Alexandros Potamianos, NatIONAL Tech.

More information

Computational Perception /785. Auditory Scene Analysis

Computational Perception /785. Auditory Scene Analysis Computational Perception 15-485/785 Auditory Scene Analysis A framework for auditory scene analysis Auditory scene analysis involves low and high level cues Low level acoustic cues are often result in

More information

Accessible Computing Research for Users who are Deaf and Hard of Hearing (DHH)

Accessible Computing Research for Users who are Deaf and Hard of Hearing (DHH) Accessible Computing Research for Users who are Deaf and Hard of Hearing (DHH) Matt Huenerfauth Raja Kushalnagar Rochester Institute of Technology DHH Auditory Issues Links Accents/Intonation Listening

More information

A Consumer-friendly Recap of the HLAA 2018 Research Symposium: Listening in Noise Webinar

A Consumer-friendly Recap of the HLAA 2018 Research Symposium: Listening in Noise Webinar A Consumer-friendly Recap of the HLAA 2018 Research Symposium: Listening in Noise Webinar Perry C. Hanavan, AuD Augustana University Sioux Falls, SD August 15, 2018 Listening in Noise Cocktail Party Problem

More information

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds

Hearing Lectures. Acoustics of Speech and Hearing. Auditory Lighthouse. Facts about Timbre. Analysis of Complex Sounds Hearing Lectures Acoustics of Speech and Hearing Week 2-10 Hearing 3: Auditory Filtering 1. Loudness of sinusoids mainly (see Web tutorial for more) 2. Pitch of sinusoids mainly (see Web tutorial for more)

More information

Computational Auditory Scene Analysis: Principles, Practice and Applications

Computational Auditory Scene Analysis: Principles, Practice and Applications Computational Auditory Scene Analysis: Principles, Practice and Applications Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 4 5 Auditory Scene Analysis

More information

Hearing in the Environment

Hearing in the Environment 10 Hearing in the Environment Click Chapter to edit 10 Master Hearing title in the style Environment Sound Localization Complex Sounds Auditory Scene Analysis Continuity and Restoration Effects Auditory

More information

Assistive Technology for Regular Curriculum for Hearing Impaired

Assistive Technology for Regular Curriculum for Hearing Impaired Assistive Technology for Regular Curriculum for Hearing Impaired Assistive Listening Devices Assistive listening devices can be utilized by individuals or large groups of people and can typically be accessed

More information

Your hearing. Your solution.

Your hearing. Your solution. Your hearing. Your solution. Name Date Your audiogram Loud Soft db 10 0 10 20 30 40 50 60 70 80 90 100 110 120 Low pitched High pitched 125 250 500 1000 2000 4000 8000 Hz Normal hearing Mild hearing loss

More information

Online Speaker Adaptation of an Acoustic Model using Face Recognition

Online Speaker Adaptation of an Acoustic Model using Face Recognition Online Speaker Adaptation of an Acoustic Model using Face Recognition Pavel Campr 1, Aleš Pražák 2, Josef V. Psutka 2, and Josef Psutka 2 1 Center for Machine Perception, Department of Cybernetics, Faculty

More information

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES Varinthira Duangudom and David V Anderson School of Electrical and Computer Engineering, Georgia Institute of Technology Atlanta, GA 30332

More information

Auditory Scene Analysis: phenomena, theories and computational models

Auditory Scene Analysis: phenomena, theories and computational models Auditory Scene Analysis: phenomena, theories and computational models July 1998 Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 4 The computational

More information

Sound Interfaces Engineering Interaction Technologies. Prof. Stefanie Mueller HCI Engineering Group

Sound Interfaces Engineering Interaction Technologies. Prof. Stefanie Mueller HCI Engineering Group Sound Interfaces 6.810 Engineering Interaction Technologies Prof. Stefanie Mueller HCI Engineering Group what is sound? if a tree falls in the forest and nobody is there does it make sound?

More information

On-The-Fly Student Notes from Video Lecture. Using ASR

On-The-Fly Student Notes from Video Lecture. Using ASR On-The-Fly Student es from Video Lecture Using ASR Dipali Ramesh Peddawad 1 1computer Science and Engineering, Vishwakarma Institute Technology Maharashtra, India ---------------------------------------------------------------------***---------------------------------------------------------------------

More information

Acoustic Signal Processing Based on Deep Neural Networks

Acoustic Signal Processing Based on Deep Neural Networks Acoustic Signal Processing Based on Deep Neural Networks Chin-Hui Lee School of ECE, Georgia Tech chl@ece.gatech.edu Joint work with Yong Xu, Yanhui Tu, Qing Wang, Tian Gao, Jun Du, LiRong Dai Outline

More information

GfK Verein. Detecting Emotions from Voice

GfK Verein. Detecting Emotions from Voice GfK Verein Detecting Emotions from Voice Respondents willingness to complete questionnaires declines But it doesn t necessarily mean that consumers have nothing to say about products or brands: GfK Verein

More information

Providing Effective Communication Access

Providing Effective Communication Access Providing Effective Communication Access 2 nd International Hearing Loop Conference June 19 th, 2011 Matthew H. Bakke, Ph.D., CCC A Gallaudet University Outline of the Presentation Factors Affecting Communication

More information

EECS 433 Statistical Pattern Recognition

EECS 433 Statistical Pattern Recognition EECS 433 Statistical Pattern Recognition Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1 / 19 Outline What is Pattern

More information

Audioconference Fixed Parameters. Audioconference Variables. Measurement Method: Perceptual Quality. Measurement Method: Physiological

Audioconference Fixed Parameters. Audioconference Variables. Measurement Method: Perceptual Quality. Measurement Method: Physiological The Good, the Bad and the Muffled: the Impact of Different Degradations on Internet Speech Anna Watson and M. Angela Sasse Department of CS University College London, London, UK Proceedings of ACM Multimedia

More information

Source and Description Category of Practice Level of CI User How to Use Additional Information. Intermediate- Advanced. Beginner- Advanced

Source and Description Category of Practice Level of CI User How to Use Additional Information. Intermediate- Advanced. Beginner- Advanced Source and Description Category of Practice Level of CI User How to Use Additional Information Randall s ESL Lab: http://www.esllab.com/ Provide practice in listening and comprehending dialogue. Comprehension

More information

Broadband Wireless Access and Applications Center (BWAC) CUA Site Planning Workshop

Broadband Wireless Access and Applications Center (BWAC) CUA Site Planning Workshop Broadband Wireless Access and Applications Center (BWAC) CUA Site Planning Workshop Lin-Ching Chang Department of Electrical Engineering and Computer Science School of Engineering Work Experience 09/12-present,

More information

Overview 6/27/16. Rationale for Real-time Text in the Classroom. What is Real-Time Text?

Overview 6/27/16. Rationale for Real-time Text in the Classroom. What is Real-Time Text? Access to Mainstream Classroom Instruction Through Real-Time Text Michael Stinson, Rochester Institute of Technology National Technical Institute for the Deaf Presentation at Best Practice in Mainstream

More information

Tandem modeling investigations

Tandem modeling investigations Tandem modeling investigations Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 What makes Tandem successful? Can we make Tandem better? Does Tandem

More information

Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain

Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain F. Archetti 1,2, G. Arosio 1, E. Fersini 1, E. Messina 1 1 DISCO, Università degli Studi di Milano-Bicocca, Viale Sarca,

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION 1.1 BACKGROUND Speech is the most natural form of human communication. Speech has also become an important means of human-machine interaction and the advancement in technology has

More information

Auditory Scene Analysis

Auditory Scene Analysis 1 Auditory Scene Analysis Albert S. Bregman Department of Psychology McGill University 1205 Docteur Penfield Avenue Montreal, QC Canada H3A 1B1 E-mail: bregman@hebb.psych.mcgill.ca To appear in N.J. Smelzer

More information

The effect of wearing conventional and level-dependent hearing protectors on speech production in noise and quiet

The effect of wearing conventional and level-dependent hearing protectors on speech production in noise and quiet The effect of wearing conventional and level-dependent hearing protectors on speech production in noise and quiet Ghazaleh Vaziri Christian Giguère Hilmi R. Dajani Nicolas Ellaham Annual National Hearing

More information

An active unpleasantness control system for indoor noise based on auditory masking

An active unpleasantness control system for indoor noise based on auditory masking An active unpleasantness control system for indoor noise based on auditory masking Daisuke Ikefuji, Masato Nakayama, Takanabu Nishiura and Yoich Yamashita Graduate School of Information Science and Engineering,

More information

Assistive Listening Technology: in the workplace and on campus

Assistive Listening Technology: in the workplace and on campus Assistive Listening Technology: in the workplace and on campus Jeremy Brassington Tuesday, 11 July 2017 Why is it hard to hear in noisy rooms? Distance from sound source Background noise continuous and

More information

Effects of speaker's and listener's environments on speech intelligibili annoyance. Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag

Effects of speaker's and listener's environments on speech intelligibili annoyance. Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag JAIST Reposi https://dspace.j Title Effects of speaker's and listener's environments on speech intelligibili annoyance Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag Citation Inter-noise 2016: 171-176 Issue

More information

Sound Texture Classification Using Statistics from an Auditory Model

Sound Texture Classification Using Statistics from an Auditory Model Sound Texture Classification Using Statistics from an Auditory Model Gabriele Carotti-Sha Evan Penn Daniel Villamizar Electrical Engineering Email: gcarotti@stanford.edu Mangement Science & Engineering

More information

Binaural Hearing for Robots Introduction to Robot Hearing

Binaural Hearing for Robots Introduction to Robot Hearing Binaural Hearing for Robots Introduction to Robot Hearing 1Radu Horaud Binaural Hearing for Robots 1. Introduction to Robot Hearing 2. Methodological Foundations 3. Sound-Source Localization 4. Machine

More information

An Overview of Simultaneous Remote Interpretation By Telephone. 8/22/2017 Copyright 2017 ZipDX LLC

An Overview of Simultaneous Remote Interpretation By Telephone. 8/22/2017 Copyright 2017 ZipDX LLC An Overview of Simultaneous Remote Interpretation By Telephone 1 It s A Multilingual World Professional interpreters make it possible for people to meet in a multilingual setting In-person gatherings of

More information

C H A N N E L S A N D B A N D S A C T I V E N O I S E C O N T R O L 2

C H A N N E L S A N D B A N D S A C T I V E N O I S E C O N T R O L 2 C H A N N E L S A N D B A N D S Audibel hearing aids offer between 4 and 16 truly independent channels and bands. Channels are sections of the frequency spectrum that are processed independently by the

More information

Gender Based Emotion Recognition using Speech Signals: A Review

Gender Based Emotion Recognition using Speech Signals: A Review 50 Gender Based Emotion Recognition using Speech Signals: A Review Parvinder Kaur 1, Mandeep Kaur 2 1 Department of Electronics and Communication Engineering, Punjabi University, Patiala, India 2 Department

More information

SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING

SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING CPC - G10L - 2017.08 G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING processing of speech or voice signals in general (G10L 25/00);

More information

Using Speech Models for Separation

Using Speech Models for Separation Using Speech Models for Separation Dan Ellis Comprising the work of Michael Mandel and Ron Weiss Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY

More information

CONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS

CONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS CONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS Mitchel Weintraub and Leonardo Neumeyer SRI International Speech Research and Technology Program Menlo Park, CA, 94025 USA ABSTRACT

More information

Dutch Multimodal Corpus for Speech Recognition

Dutch Multimodal Corpus for Speech Recognition Dutch Multimodal Corpus for Speech Recognition A.G. ChiŃu and L.J.M. Rothkrantz E-mails: {A.G.Chitu,L.J.M.Rothkrantz}@tudelft.nl Website: http://mmi.tudelft.nl Outline: Background and goal of the paper.

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception Long-term spectrum of speech HCS 7367 Speech Perception Connected speech Absolute threshold Males Dr. Peter Assmann Fall 212 Females Long-term spectrum of speech Vowels Males Females 2) Absolute threshold

More information

Robust Neural Encoding of Speech in Human Auditory Cortex

Robust Neural Encoding of Speech in Human Auditory Cortex Robust Neural Encoding of Speech in Human Auditory Cortex Nai Ding, Jonathan Z. Simon Electrical Engineering / Biology University of Maryland, College Park Auditory Processing in Natural Scenes How is

More information

Audioconference Fixed Parameters. Audioconference Variables. Measurement Method: Physiological. Measurement Methods: PQ. Experimental Conditions

Audioconference Fixed Parameters. Audioconference Variables. Measurement Method: Physiological. Measurement Methods: PQ. Experimental Conditions The Good, the Bad and the Muffled: the Impact of Different Degradations on Internet Speech Anna Watson and M. Angela Sasse Dept. of CS University College London, London, UK Proceedings of ACM Multimedia

More information

INTRODUCTION TO AUDIOLOGY Hearing Balance Tinnitus - Treatment

INTRODUCTION TO AUDIOLOGY Hearing Balance Tinnitus - Treatment INTRODUCTION TO AUDIOLOGY Hearing Balance Tinnitus - Treatment What is Audiology? Audiology refers to the SCIENCE OF HEARING AND THE STUDY OF THE AUDITORY PROCESS (Katz, 1986) Audiology is a health-care

More information

The power to connect us ALL.

The power to connect us ALL. Provided by Hamilton Relay www.ca-relay.com The power to connect us ALL. www.ddtp.org 17E Table of Contents What Is California Relay Service?...1 How Does a Relay Call Work?.... 2 Making the Most of Your

More information

Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking

Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking Vahid Montazeri, Shaikat Hossain, Peter F. Assmann University of Texas

More information

The MIT Mobile Device Speaker Verification Corpus: Data Collection and Preliminary Experiments

The MIT Mobile Device Speaker Verification Corpus: Data Collection and Preliminary Experiments The MIT Mobile Device Speaker Verification Corpus: Data Collection and Preliminary Experiments Ram H. Woo, Alex Park, and Timothy J. Hazen MIT Computer Science and Artificial Intelligence Laboratory 32

More information

Audiovisual to Sign Language Translator

Audiovisual to Sign Language Translator Technical Disclosure Commons Defensive Publications Series July 17, 2018 Audiovisual to Sign Language Translator Manikandan Gopalakrishnan Follow this and additional works at: https://www.tdcommons.org/dpubs_series

More information

ITU-T. FG AVA TR Version 1.0 (10/2013) Part 3: Using audiovisual media A taxonomy of participation

ITU-T. FG AVA TR Version 1.0 (10/2013) Part 3: Using audiovisual media A taxonomy of participation International Telecommunication Union ITU-T TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU FG AVA TR Version 1.0 (10/2013) Focus Group on Audiovisual Media Accessibility Technical Report Part 3: Using

More information

Auditory Scene Analysis. Dr. Maria Chait, UCL Ear Institute

Auditory Scene Analysis. Dr. Maria Chait, UCL Ear Institute Auditory Scene Analysis Dr. Maria Chait, UCL Ear Institute Expected learning outcomes: Understand the tasks faced by the auditory system during everyday listening. Know the major Gestalt principles. Understand

More information

Presenter. Community Outreach Specialist Center for Sight & Hearing

Presenter. Community Outreach Specialist Center for Sight & Hearing Presenter Heidi Adams Heidi Adams Community Outreach Specialist Center for Sight & Hearing Demystifying Assistive Listening Devices Based on work by Cheryl D. Davis, Ph.D. WROCC Outreach Site at Western

More information

Voluntary Product Accessibility Template (VPAT)

Voluntary Product Accessibility Template (VPAT) (VPAT) Date: Product Name: Product Version Number: Organization Name: Submitter Name: Submitter Telephone: APPENDIX A: Suggested Language Guide Summary Table Section 1194.21 Software Applications and Operating

More information

CSE 118/218 Final Presentation. Team 2 Dreams and Aspirations

CSE 118/218 Final Presentation. Team 2 Dreams and Aspirations CSE 118/218 Final Presentation Team 2 Dreams and Aspirations Smart Hearing Hearing Impairment A major public health issue that is the third most common physical condition after arthritis and heart disease

More information

Hearing the Universal Language: Music and Cochlear Implants

Hearing the Universal Language: Music and Cochlear Implants Hearing the Universal Language: Music and Cochlear Implants Professor Hugh McDermott Deputy Director (Research) The Bionics Institute of Australia, Professorial Fellow The University of Melbourne Overview?

More information

Power Instruments, Power sources: Trends and Drivers. Steve Armstrong September 2015

Power Instruments, Power sources: Trends and Drivers. Steve Armstrong September 2015 Power Instruments, Power sources: Trends and Drivers Steve Armstrong September 2015 Focus of this talk more significant losses Severe Profound loss Challenges Speech in quiet Speech in noise Better Listening

More information

Director of Testing and Disability Services Phone: (706) Fax: (706) E Mail:

Director of Testing and Disability Services Phone: (706) Fax: (706) E Mail: Angie S. Baker Testing and Disability Services Director of Testing and Disability Services Phone: (706)737 1469 Fax: (706)729 2298 E Mail: tds@gru.edu Deafness is an invisible disability. It is easy for

More information

Summary Table Voluntary Product Accessibility Template. Supports. Please refer to. Supports. Please refer to

Summary Table Voluntary Product Accessibility Template. Supports. Please refer to. Supports. Please refer to Date Aug-07 Name of product SMART Board 600 series interactive whiteboard SMART Board 640, 660 and 680 interactive whiteboards address Section 508 standards as set forth below Contact for more information

More information

SPEECH PERCEPTION IN A 3-D WORLD

SPEECH PERCEPTION IN A 3-D WORLD SPEECH PERCEPTION IN A 3-D WORLD A line on an audiogram is far from answering the question How well can this child hear speech? In this section a variety of ways will be presented to further the teacher/therapist

More information

Auditory gist perception and attention

Auditory gist perception and attention Auditory gist perception and attention Sue Harding Speech and Hearing Research Group University of Sheffield POP Perception On Purpose Since the Sheffield POP meeting: Paper: Auditory gist perception:

More information

Hear Better With FM. Get more from everyday situations. Life is on

Hear Better With FM. Get more from everyday situations. Life is on Hear Better With FM Get more from everyday situations Life is on We are sensitive to the needs of everyone who depends on our knowledge, ideas and care. And by creatively challenging the limits of technology,

More information

EXPECT More from Oticon AGIL. a new survey proves it!

EXPECT More from Oticon AGIL. a new survey proves it! EXPECT More from Oticon AGIL a new survey proves it! Wondering how good Oticon Agil is? * The survey was performed in Canada, Germany, the United Kingdom and the United States. The survey collected data

More information

Summary Table Voluntary Product Accessibility Template. Supporting Features. Supports. Supports. Supports. Supports

Summary Table Voluntary Product Accessibility Template. Supporting Features. Supports. Supports. Supports. Supports Date: March 31, 2016 Name of Product: ThinkServer TS450, TS550 Summary Table Voluntary Product Accessibility Template Section 1194.21 Software Applications and Operating Systems Section 1194.22 Web-based

More information

Adaptation of Classification Model for Improving Speech Intelligibility in Noise

Adaptation of Classification Model for Improving Speech Intelligibility in Noise 1: (Junyoung Jung et al.: Adaptation of Classification Model for Improving Speech Intelligibility in Noise) (Regular Paper) 23 4, 2018 7 (JBE Vol. 23, No. 4, July 2018) https://doi.org/10.5909/jbe.2018.23.4.511

More information

Enhanced Feature Extraction for Speech Detection in Media Audio

Enhanced Feature Extraction for Speech Detection in Media Audio INTERSPEECH 2017 August 20 24, 2017, Stockholm, Sweden Enhanced Feature Extraction for Speech Detection in Media Audio Inseon Jang 1, ChungHyun Ahn 1, Jeongil Seo 1, Younseon Jang 2 1 Media Research Division,

More information

Wireless Technology - Improving Signal to Noise Ratio for Children in Challenging Situations

Wireless Technology - Improving Signal to Noise Ratio for Children in Challenging Situations Wireless Technology - Improving Signal to Noise Ratio for Children in Challenging Situations Astrid Haastrup MA, Senior Audiologist, GN ReSound ASHS 14-16 th November 2013 Disclosure Statement Employee

More information

SpeechZone 2. Author Tina Howard, Au.D., CCC-A, FAAA Senior Validation Specialist Unitron Favorite sound: wind chimes

SpeechZone 2. Author Tina Howard, Au.D., CCC-A, FAAA Senior Validation Specialist Unitron Favorite sound: wind chimes SpeechZone 2 Difficulty understanding speech in noisy environments is the biggest complaint for those with hearing loss. 1 In the real world, important speech doesn t always come from in front of the listener.

More information

Speech to Text Wireless Converter

Speech to Text Wireless Converter Speech to Text Wireless Converter Kailas Puri 1, Vivek Ajage 2, Satyam Mali 3, Akhil Wasnik 4, Amey Naik 5 And Guided by Dr. Prof. M. S. Panse 6 1,2,3,4,5,6 Department of Electrical Engineering, Veermata

More information

16.400/453J Human Factors Engineering /453. Audition. Prof. D. C. Chandra Lecture 14

16.400/453J Human Factors Engineering /453. Audition. Prof. D. C. Chandra Lecture 14 J Human Factors Engineering Audition Prof. D. C. Chandra Lecture 14 1 Overview Human ear anatomy and hearing Auditory perception Brainstorming about sounds Auditory vs. visual displays Considerations for

More information

Errol Davis Director of Research and Development Sound Linked Data Inc. Erik Arisholm Lead Engineer Sound Linked Data Inc.

Errol Davis Director of Research and Development Sound Linked Data Inc. Erik Arisholm Lead Engineer Sound Linked Data Inc. An Advanced Pseudo-Random Data Generator that improves data representations and reduces errors in pattern recognition in a Numeric Knowledge Modeling System Errol Davis Director of Research and Development

More information

Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise

Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise 4 Special Issue Speech-Based Interfaces in Vehicles Research Report Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise Hiroyuki Hoshino Abstract This

More information

solutions to keep kids and teens connected Guide for Roger TM Life is on Bridging the understanding gap

solutions to keep kids and teens connected Guide for Roger TM Life is on Bridging the understanding gap Life is on We are sensitive to the needs of everyone who depends on our knowledge, ideas and care. And by creatively challenging the limits of technology, we develop innovations that help people hear,

More information

As of: 01/10/2006 the HP Designjet 4500 Stacker addresses the Section 508 standards as described in the chart below.

As of: 01/10/2006 the HP Designjet 4500 Stacker addresses the Section 508 standards as described in the chart below. Accessibility Information Detail Report As of: 01/10/2006 the HP Designjet 4500 Stacker addresses the Section 508 standards as described in the chart below. Family HP DesignJet 4500 stacker series Detail

More information

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception ISCA Archive VOQUAL'03, Geneva, August 27-29, 2003 Jitter, Shimmer, and Noise in Pathological Voice Quality Perception Jody Kreiman and Bruce R. Gerratt Division of Head and Neck Surgery, School of Medicine

More information

Roger TM in challenging listening situations. Hear better in noise and over distance

Roger TM in challenging listening situations. Hear better in noise and over distance Roger TM in challenging listening situations Hear better in noise and over distance Bridging the understanding gap Today s hearing aid technology does an excellent job of improving speech understanding.

More information

Sonic Spotlight. SmartCompress. Advancing compression technology into the future

Sonic Spotlight. SmartCompress. Advancing compression technology into the future Sonic Spotlight SmartCompress Advancing compression technology into the future Speech Variable Processing (SVP) is the unique digital signal processing strategy that gives Sonic hearing aids their signature

More information

For hearing instruments. 25/04/2015 Assistive Listening Devices - Leuven

For hearing instruments. 25/04/2015 Assistive Listening Devices - Leuven For hearing instruments 1 Why wireless accessories? Significant benefits in challenging listening situations 40% more speech understanding on the phone* Hear TV directly in full stereo quality Facilitate

More information

Speech Spatial Qualities

Speech Spatial Qualities Speech Spatial Qualities Advice about answering the questions The following questions inquire about aspects of your ability and experience hearing and listening in different situations. For each question,

More information

Everything you need to stay connected

Everything you need to stay connected Everything you need to stay connected GO WIRELESS Make everyday tasks easier Oticon Opn wireless accessories are a comprehensive and easy-to-use range of devices developed to improve your listening and

More information

COM3502/4502/6502 SPEECH PROCESSING

COM3502/4502/6502 SPEECH PROCESSING COM3502/4502/6502 SPEECH PROCESSING Lecture 4 Hearing COM3502/4502/6502 Speech Processing: Lecture 4, slide 1 The Speech Chain SPEAKER Ear LISTENER Feedback Link Vocal Muscles Ear Sound Waves Taken from:

More information