Oregon Graduate Institute of Science and Technology,
1 SPEAKER RECOGNITION AT OREGON GRADUATE INSTITUTE
June & 6, 1997
Sarel van Vuuren and Narendranath Malayath, Hynek Hermansky and Pieter Vermeulen
Oregon Graduate Institute, Portland, Oregon
2 Oregon Graduate Institute
1. Speaker Recognition at OGI
- Research Group
- Goals
2. Competitive System
- Architecture
- Results
3. Initial Robust System
- Architecture
- Preliminary Results and Conclusions
- Planned Extensions
3 People
- Faculty: Hynek Hermansky, Pieter Vermeulen
- Post Doc: Nobu Kanedera, Carlos Avendano
- PhD Students: Sarel van Vuuren, Sangita Tibrewala, Narendranath Malayath
Speech processing by emulating relevant properties of speech perception
Collaboration with
- CSLU at OGI
- ICSI Berkeley
- IIT Madras
- IDIAP Martigny
- KTH Stockholm
4 Activities
- Speaker identification
- Acoustic modeling for ASR
- Enhancement of degraded speech and speech processing for handicapped
- Human speech perception
5 Speaker Recognition at OGI
Speech Signal
- linguistic message
- speaker characteristics
- environment
Task
- find out how these information sources are coded into the signal
Applications
- speaker ID
- speaker independent ASR
- voice mimic
6 Requirements of a Speaker Verification System
- Invariant to channel
- Invariant to session
- Invariant to noise
- Minimal training data
- Minimal verification data
- Adapt to speaker styles
7 Goals
Be familiar with state of the art
- Build an up to date competitive system following the state of the art
- Analyze and understand abilities and limitations
- Contribute to research system
- Incorporate ideas from research system
8 Goals
Research novel ideas
- Knowledge driven
- Analyze and understand
- Report results
- Contribute to state of the art system
- Incorporate knowledge from state of the art system
Address robustness
- Invariance vs modeling
- Channels and noise
Address data requirements
- Training
- Verification
9 Initial Robust System
[Block diagram. Labels: Preprocessing; Similar Representation; Rep. (L+E); Rep. (L+S+E); Speaker Specific Mapping; +/- difference; Distance; Frame Integration; Features; Likelihood Estimator. Information sources: L: Linguistic, S: Speaker, E: Environment.]
Residue invariant to extraneous information and noise
- Preprocessing: segmentation, such as silence removal, voiced segments
- Representation: differing speaker information, such as low order PLP vs high order PLP
- Speaker Specific Mapping: such as Neural Net or Pseudo Inverse
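The preprocessing step on slide 9 names silence removal without saying how it is done. A minimal sketch, assuming a simple energy threshold relative to the loudest frame (the frame length, hop, and 30 dB margin are illustrative choices, not values from the slides, and the test signal is synthetic):

```python
import math
import random

def frame_signal(samples, frame_len, hop):
    """Split a sample list into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_energy_db(frame):
    """Mean-square energy of one frame in dB (small offset avoids log(0))."""
    energy = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(energy + 1e-12)

def remove_silence(samples, frame_len=200, hop=80, margin_db=30.0):
    """Keep frames whose energy is within margin_db of the loudest frame.

    Hypothetical helper: the threshold rule is an assumption, not the
    method used in the OGI system.
    """
    frames = frame_signal(samples, frame_len, hop)
    energies = [log_energy_db(f) for f in frames]
    floor = max(energies) - margin_db
    kept = [f for f, e in zip(frames, energies) if e >= floor]
    return frames, kept

def demo_signal(seed=0):
    """Synthetic 8 kHz signal: quiet noise, a 200 Hz tone, quiet noise."""
    rng = random.Random(seed)
    quiet_head = [rng.uniform(-0.001, 0.001) for _ in range(800)]
    tone = [math.sin(2.0 * math.pi * 200.0 * n / 8000.0) for n in range(800)]
    quiet_tail = [rng.uniform(-0.001, 0.001) for _ in range(800)]
    return quiet_head + tone + quiet_tail

frames, kept = remove_silence(demo_signal())
print(len(frames), "frames in total,", len(kept), "kept after silence removal")
```

Only frames that overlap the tone survive; frames made up entirely of the quiet segments fall more than 30 dB below the loudest frame and are dropped.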
10 Initial Robust System
[Block diagram repeated from slide 9]
- Speaker Specific Distance Measure: Euclidean, likelihood estimator, Bhattacharyya
- Frame Integrator: average, voting
- Likelihood Estimator
- Adding other information (pitch, formants)
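To make the options on slide 10 concrete, here is a small sketch of two of the listed distance measures and the two frame integrators. It assumes diagonal-covariance Gaussians for the Bhattacharyya distance and a fixed acceptance threshold; both are illustrative assumptions, and all function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    dist = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)                      # averaged variance
        dist += 0.125 * (m1 - m2) ** 2 / v       # mean separation term
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))  # variance mismatch term
    return dist

def accept_by_average(frame_distances, threshold):
    """Accept the claim if the mean per-frame distance is small enough."""
    return sum(frame_distances) / len(frame_distances) <= threshold

def accept_by_voting(frame_distances, threshold):
    """Accept the claim if a strict majority of frames is below threshold."""
    votes = sum(1 for d in frame_distances if d <= threshold)
    return 2 * votes > len(frame_distances)

# One outlier frame dominates the average but not the vote.
distances = [0.1, 0.2, 5.0]
print(accept_by_average(distances, 1.0), accept_by_voting(distances, 1.0))
```

The example illustrates one design trade-off between the two integrators: averaging is sensitive to a single badly matched frame, while voting ignores how large an individual mismatch is.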
11 Initial Robust Implementation
[Block diagram. Labels: PLP Representation; Remove Silence; PLP-7; PLP-4; Speaker Specific NN; +/-; Euclidean; Frame Average.]
- Preprocessing: Silence deletion
- Representation: PLP-7 vs PLP-4
- Speaker Specific Mapping: Neural Net
- Distance Measure: Euclidean
- Frame Integrator: Average
- Likelihood Estimator: None
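The chain on slide 11 (map the low order representation to the high order one, take the Euclidean residue per frame, average over frames) can be sketched as follows. This is not the OGI implementation: the neural net is swapped for a linear least-squares map (the "Pseudo Inverse" alternative named on slide 9, here with a small ridge term for numerical safety), PLP analysis is replaced by synthetic 4- and 7-dimensional vectors, and every function name is hypothetical.

```python
import math
import random

def solve_linear(a, b):
    """Solve a @ x = b (a square, b a matrix) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def fit_mapping(low, high, ridge=1e-6):
    """Least-squares W with low @ W ~ high: W = (X^T X + ridge I)^-1 X^T Y."""
    d_in, d_out = len(low[0]), len(high[0])
    xtx = [[sum(x[i] * x[j] for x in low) + (ridge if i == j else 0.0)
            for j in range(d_in)] for i in range(d_in)]
    xty = [[sum(x[i] * y[k] for x, y in zip(low, high))
            for k in range(d_out)] for i in range(d_in)]
    return solve_linear(xtx, xty)

def apply_mapping(w, frame):
    """Map one low order frame to the high order space."""
    return [sum(frame[i] * w[i][k] for i in range(len(frame)))
            for k in range(len(w[0]))]

def verification_score(w, low, high):
    """Frame-averaged Euclidean residue; smaller means a better match."""
    total = 0.0
    for x, y in zip(low, high):
        pred = apply_mapping(w, x)
        total += math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)))
    return total / len(low)

def make_frames(rng, true_map, n, noise=0.05):
    """Synthetic stand-in for paired (low order, high order) feature frames."""
    low = [[rng.gauss(0.0, 1.0) for _ in range(len(true_map))] for _ in range(n)]
    high = [[v + rng.gauss(0.0, noise) for v in apply_mapping(true_map, x)]
            for x in low]
    return low, high

def demo(seed=0):
    rng = random.Random(seed)
    # Each synthetic "speaker" is a different random 4 -> 7 linear relation.
    spk_a = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(4)]
    spk_b = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(4)]
    w_a = fit_mapping(*make_frames(rng, spk_a, 200))   # enrol speaker A
    same = verification_score(w_a, *make_frames(rng, spk_a, 50))
    other = verification_score(w_a, *make_frames(rng, spk_b, 50))
    return same, other

same, other = demo()
print("true speaker residue:", same, "impostor residue:", other)
```

In this toy setting the mapping trained on speaker A leaves a much smaller residue on A's own test frames than on another speaker's frames, which is the discrimination effect the verification decision relies on.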
12 Preliminary Studies
- Map from speaker independent to speaker rich representation
- Evidence of discrimination
- Evidence of low data requirements for verification
- No handset robustness: mapping not invariant due to training methodology
13 Results - GMM baseline
[DET curves: handset training; 3 sec test; female. Panels: training handset, non training handset. Annotated with mdcf, hdcf and eer.]
14 Results - GMM baseline
[DET curves: handset training; 10 sec test; female. Panels: training handset, non training handset. Annotated with mdcf, hdcf and eer.]
15 Results - GMM baseline
[DET curves: handset training; 30 sec test; female. Panels: training handset, non training handset. Annotated with mdcf, hdcf and eer.]
16 Results - PLP system
[DET curves: handset training; 3 sec test; female. Panels: training handset, non training handset. Annotated with mdcf and eer.]
17 Results - PLP system
[DET curves: handset training; 10 sec test; female. Panels: training handset, non training handset. Annotated with mdcf and eer.]
18 Results - PLP system
[DET curves: handset training; 30 sec test; female. Panels: training handset, non training handset. Annotated with mdcf and eer.]
19 Results - Subspace system
[DET curves: handset training; 3 sec test; female. Panels: training handset (mdcf and eer), non training handset (eer).]
20 Results - Subspace system
[DET curves: handset training; 10 sec test; female. Panels: training handset (mdcf and eer), non training handset (eer).]
21 Results - Subspace system
[DET curves: handset training; 30 sec test; female. Panels: training handset (mdcf and eer), non training handset (eer).]
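The results slides summarise each DET curve by an equal error rate ("eer") and cost values ("mdcf", "hdcf"). The slides do not expand these abbreviations; the sketch below assumes "mdcf" is a minimum detection cost and uses the cost weights commonly quoted for NIST speaker recognition evaluations (C_miss = 10, C_fa = 1, P_target = 0.01), which are an assumption here rather than something the slides state. The scores are made up for illustration.

```python
def error_rates(target_scores, impostor_scores, threshold):
    """Miss and false-alarm rates when accepting scores >= threshold."""
    p_miss = sum(1 for s in target_scores if s < threshold) / len(target_scores)
    p_fa = sum(1 for s in impostor_scores if s >= threshold) / len(impostor_scores)
    return p_miss, p_fa

def det_points(target_scores, impostor_scores):
    """(threshold, p_miss, p_fa) at every observed score, plus reject-all."""
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    thresholds.append(float("inf"))
    return [(t,) + error_rates(target_scores, impostor_scores, t)
            for t in thresholds]

def equal_error_rate(target_scores, impostor_scores):
    """Error rate at the operating point where p_miss and p_fa are closest."""
    _, p_miss, p_fa = min(det_points(target_scores, impostor_scores),
                          key=lambda p: abs(p[1] - p[2]))
    return 0.5 * (p_miss + p_fa)

def min_detection_cost(target_scores, impostor_scores,
                       c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Smallest C_miss*P_miss*P_target + C_fa*P_fa*(1-P_target) over thresholds."""
    return min(c_miss * pm * p_target + c_fa * pf * (1.0 - p_target)
               for _, pm, pf in det_points(target_scores, impostor_scores))

# Hypothetical scores: higher means "more likely the claimed speaker".
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(targets, impostors)
mdcf = min_detection_cost(targets, impostors)
print("EER:", eer, "min DCF:", mdcf)
```

Plotting the `(p_fa, p_miss)` pairs returned by `det_points` on normal-deviate axes gives the DET curve itself; the sketch stops at the summary numbers.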
22 Future Work: Speaker Verification
Understand each component
- Preprocessing
- Representation
- Environment Invariant Mapping
- Distance Measure