CHAPTER 1 INTRODUCTION
1.1 BACKGROUND

Speech is the most natural form of human communication, and it has also become an important means of human-machine interaction as advances in technology have modernized the ways of communication. Speech processing is the study of speech signals and of the methods used to process them. Since these signals are usually handled in digital form, speech processing is regarded as a special case of digital signal processing applied to speech. It covers the enhancement, compression, synthesis and recognition of speech signals.

In recent times, digital signal processing has gained importance and wide application, since its techniques are more sophisticated and advanced than their analog equivalents. The ease and speed of representing, storing, retrieving and processing speech data have contributed to the development of efficient and effective speech processing techniques.

1.2 OBJECTIVE OF THE PRESENT WORK

The speech processing systems used to communicate or store speech signals are usually designed for noise-free environments, but in the real world the presence of background interference, in the form of additive background noise and channel noise, drastically degrades the performance of these systems, causing inaccurate information exchange and listener
fatigue. Speech enhancement is a field of digital speech processing that aims to improve the intelligibility and/or perceptual quality of the speech signal, much as audio noise reduction does for general audio signals. Restoring the desired speech signal from a mixture of speech and background noise is among the oldest, and still elusive, goals of speech processing and communication system research. Because the presence of noise degrades the performance of various speech processing systems, speech enhancement is commonly incorporated as a preprocessing step in systems operating in noisy environments.

Removing the various types of noise is difficult because of their random nature and the inherent complexity of speech. Noise reduction techniques usually involve a trade-off between the amount of noise removed and the speech distortion introduced by processing the signal. Computational complexity and ease of implementation of the noise reduction algorithm are also of concern, especially in applications related to portable devices such as mobile phones and digital hearing aids.

The main objective of a speech enhancement or noise reduction technique is to improve one or more perceptual aspects of speech, such as its quality or intelligibility. This matters in a variety of contexts: environments with interfering background noise (e.g. offices, streets, automobiles and factories), speech recognition systems, hands-free car kits, hearing aid devices, etc. The ultimate goal of this work is to eliminate the additive noise present in the speech signal and restore the speech signal to its original form.
1.3 SPEECH ENHANCEMENT TECHNIQUES

Most methods have been developed with one or another auditory, perceptual or statistical constraint placed on the speech and noise signals. In real-world situations, however, it is very difficult to reliably predict the characteristics of the interfering noise or the exact characteristics of the speech waveform. Hence speech enhancement methods are suboptimal and reduce the noise in the signal only to some extent. As a consequence, some portions of the speech signal are distorted during the noise reduction process, and there is a trade-off between distortion in the processed speech and the amount of noise suppressed. The effectiveness of a speech enhancement system is therefore measured by how well it performs in light of this trade-off.

Speech enhancement systems are classified in a number of ways, based on the criteria used or the application of the system. In general, they are classified by i) the number of input channels (one, two or multiple), ii) the domain of processing (time or frequency), iii) the type of algorithm (adaptive or non-adaptive) and iv) additional constraints (speech production or perception). The speech signal is acquired from single- or multiple-channel sensors, and non-stationarity of the noise process further complicates the enhancement effort. The approach commonly referred to as a multichannel system consists of more than one channel: one channel carries the noisy signal to be processed, while another carries a reference signal. Adaptive noise cancellation is one such powerful speech enhancement technique, based on the availability of an auxiliary channel, known as the reference path, in which a correlated sample or reference of the contaminating noise is present. Such a system is relatively easy to build compared with a single-channel system.
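The adaptive noise cancellation idea above can be sketched with a normalized LMS filter. This is a minimal illustration, not the algorithm proposed in this thesis; the filter order and step size below are illustrative assumptions.

```python
import numpy as np

def nlms_noise_canceller(primary, reference, order=8, mu=0.2):
    """Dual-channel adaptive noise canceller (normalized LMS sketch).

    primary   : noisy speech, i.e. speech plus filtered noise
    reference : noise-only channel correlated with the noise in `primary`

    The filter learns to predict the noise component of the primary
    channel from the reference channel; the prediction error is the
    enhanced speech.
    """
    w = np.zeros(order)
    out = np.zeros(len(primary))
    for n in range(order - 1, len(primary)):
        x = reference[n - order + 1:n + 1][::-1]  # most recent sample first
        y = w @ x                                 # noise estimate
        e = primary[n] - y                        # error = enhanced sample
        w += mu * e * x / (x @ x + 1e-8)          # NLMS weight update
        out[n] = e
    return out
```

Normalizing the step size by the reference power keeps the adaptation stable regardless of the noise level, which is why NLMS is a common baseline for such dual-channel systems.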
In the most common real-time scenario, a second channel is not available. A single microphone input (single channel) makes speech enhancement difficult, since speech and noise are present in the same channel, and separating the two requires relatively good models of both. Single-channel systems are easy to build and comparatively less expensive than multiple-input systems, yet they constitute one of the most difficult speech enhancement situations: no reference signal for the noise is available, and the clean speech cannot be preprocessed before it is affected by the noise. Single-channel systems instead exploit the different statistics of speech and the unwanted noise. The performance of these methods is limited in the presence of non-stationary noise, as most of them assume that the noise is stationary during speech intervals, and it also degrades drastically at low signal-to-noise ratios.

The best trade-off between speech distortion and noise reduction in a perceptual sense is achieved using properties closely related to human perception and masking effects. Masking is a fundamental aspect of the human auditory system and a basic element of perceptual coding systems, and it is also exploited in enhancement systems. Masking is defined either as the process by which the threshold of audibility of one sound is raised by the presence of another sound, or as the amount by which that threshold is raised. The auditory threshold is the minimum Sound Pressure Level (SPL) necessary to detect a pure tone in a quiet environment, as a function of the frequency of the tone. The hearing threshold level in such conditions is representative of the average of values obtained from different people; below this level the human ear cannot perceive sound. Masking is typically expressed in decibels (dB), and its effect is known as the masked or masking threshold. Masking comes into play when
the activity caused by one signal is not detected because of the activity caused by the presence of another, neighboring signal. Masking occurs in both the frequency and time domains. The masking effect of a signal is greatest within a critical bandwidth of the signal, although there are some effects in other critical bands too. Masking in the frequency domain is typically determined for a noise masking a tone or a tone masking a noise. Temporal masking includes simultaneous masking, premasking (also called backward masking) and postmasking (also called forward masking). Simultaneous masking occurs when a masker is present throughout the duration of a relatively long signal. Postmasking occurs when a masking signal masks a signal occurring after it, and is caused by a reduction in the sensitivity of recently stimulated cells. A simplified explanation of the mechanism underlying the simultaneous masking phenomenon is as follows: the presence of a strong noise or tone masker creates an excitation of sufficient strength on the basilar membrane at the critical band location to effectively block the transmission of a weaker signal. Hence a speech enhancement algorithm needs to be developed that takes into account both the non-stationarity of the noise and the perceptual properties of the human ear.

1.4 SUBJECTIVE AND OBJECTIVE QUALITY MEASURES FOR PERFORMANCE ANALYSIS

In this research, dual-channel and single-channel speech enhancement algorithms are proposed to enhance speech degraded by additive background noise. In general, the environment is categorized as a stationary or non-stationary noisy environment. Most real-world noises are non-stationary in nature; speech degradation by such noise often arises from sources like air conditioning units, fans,
cars, city streets, factory environments, helicopters, computer systems, restaurants, etc. The core objective of this research is to develop a speech enhancement algorithm that works well in any real-world environment. The performance of speech enhancement techniques is analyzed using the subjective and objective quality measures explained in this section.

The two main aspects of speech enhancement algorithms are speech intelligibility and quality (Yi Hu and Philipos C. Loizou 2006a, 2006b, 2007, 2008); these are quantified using objective and subjective measures. Objective quality measures predict the perceived speech quality by computing a numerical distance, or distortion, between the original and the processed speech, while speech intelligibility measures the effectiveness of speech. Objective quality measures are evaluated automatically from the speech signal, its spectrum, or parameters derived from them. Since they do not require listening tests, these measures give an immediate estimate of the perceptual quality of a speech enhancement algorithm.

Subjective quality measures are obtained from listening tests in which human participants rate the quality of the speech on a predetermined opinion scale, expressed in terms of how pleasant the signal sounds or how much effort is required of the listeners to understand the message. Subjective distortion measures provide the most accurate assessment of performance, since the degree of perceptual quality and intelligibility is ultimately determined by the human auditory system. In general, the performance of a speech enhancement algorithm with reference to speech
intelligibility and quality is evaluated using these subjective and objective distortion measures.

Objective Quality Measures

The objective quality measures (Bayya and Vis 1996), (Philipos C. Loizou 2007) are primarily based on the idea that speech quality can be modeled in terms of differences in loudness, or simply differences in the spectral envelopes, between the original and processed signals. Objective speech quality measures are categorized into:

A. Signal to Noise Ratio (SNR)
B. Itakura-Saito (IS) distance measure
C. Perceptual Evaluation of Speech Quality (PESQ)
D. Mean Square Error (MSE)

Signal to Noise Ratio (SNR)

The SNR is the ratio of signal power to noise power, expressed in decibels (dB), and is given in Equation (1.1) as

SNR_dB = 10 log_10 [ Σ_n s(n)^2 / Σ_n (s(n) - ŝ(n))^2 ]    (1.1)

where s(n) is the clean speech signal and ŝ(n) is the enhanced speech signal.
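As a minimal sketch, the overall SNR of Equation (1.1) can be computed directly from the clean and enhanced waveforms:

```python
import numpy as np

def snr_db(clean, enhanced):
    """Overall SNR of Equation (1.1): clean-signal power over the power
    of the residual error between clean and enhanced signals, in dB."""
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    error = clean - enhanced
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(error ** 2))
```

Note that this global measure needs the clean reference signal, so it is usable only in simulation, where the noise is added artificially.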
Itakura-Saito (IS) Distance Measure

The Itakura-Saito distance is a measure of the perceptual difference between an original spectrum P(ω) and an approximation P̂(ω) of that spectrum. The distance is defined as:

D_IS(P(ω), P̂(ω)) = (1/2π) ∫_{-π}^{π} [ P(ω)/P̂(ω) - log( P(ω)/P̂(ω) ) - 1 ] dω    (1.2)

The IS distance captures the spectral difference between the enhanced and clean signals; lower values indicate better quality.

Perceptual Evaluation of Speech Quality (PESQ)

The PESQ measure (Rix et al 2001) is one of the most commonly used measures for predicting the subjective opinion score of degraded or enhanced speech, and is recommended by the International Telecommunication Union (ITU) for speech quality assessment. In the PESQ measure, a reference signal and the enhanced signal are first aligned in both time and level. For normal subjective test material the PESQ score ranges from 1 to 5, with a higher score indicating better quality.

Mean Square Error

The Mean Square Error (MSE) metric is frequently used in signal processing and is defined in Equation (1.3) as:

MSE = (1/L) Σ_{i=1}^{L} [ S(i) - Ŝ(i) ]^2    (1.3)
where S(i) denotes the power spectrum of the clean speech signal and Ŝ(i) denotes the power spectrum of the enhanced speech signal.

Subjective Quality Measures

ITU-T P.835 Standard

The ITU-T standard P.835 was designed for evaluating the subjective quality of speech in noise and is particularly appropriate for the evaluation of noise suppression algorithms. The parameters used to evaluate the subjective quality of a speech signal are CSIG, the scale of signal distortion; CBAK, the scale of background intrusiveness; and COVL, the scale of the mean opinion score. The details of these parameters are given in Tables 1.1, 1.2 and 1.3.

a. Speech signal alone, using the five-point scale of signal distortion (CSIG) shown in Table 1.1.

Table 1.1 Scale of signal distortion

Rating  Speech quality
5       Very natural, no degradation
4       Fairly natural, little degradation
3       Somewhat natural, somewhat degraded
2       Fairly unnatural, fairly degraded
1       Very unnatural, very degraded

b. Background noise alone, using the five-point scale of background intrusiveness (CBAK) given in Table 1.2.
Table 1.2 Scale of background intrusiveness

Rating  Background intrusiveness
5       Not noticeable
4       Somewhat noticeable
3       Noticeable, somewhat intrusive
2       Fairly conspicuous, somewhat intrusive
1       Very conspicuous, very intrusive

c. Overall effect, using the scale of the mean opinion score (COVL) given in Table 1.3.

Table 1.3 Scale of mean opinion score

Rating  Overall quality
5       Excellent
4       Good
3       Fair
2       Poor
1       Unsatisfactory

These values are obtained by linearly combining the existing objective measures through the relations given in Equation (1.4):

CSIG = 3.093 - 1.029 LLR + 0.603 PESQ - 0.009 WSS
CBAK = 1.634 + 0.478 PESQ - 0.007 WSS + 0.063 segSNR
COVL = 1.594 + 0.805 PESQ - 0.512 LLR - 0.007 WSS    (1.4)

where LLR is the Log-Likelihood Ratio and WSS is the Weighted Spectral Slope distance.
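A minimal sketch of computing the composite scores from the component measures, assuming the regression coefficients published by Hu and Loizou (2008); the resulting scores are clipped to the 1-5 opinion scale:

```python
def composite_scores(llr, pesq, wss, seg_snr):
    """Composite objective measures, assuming the Hu & Loizou (2008)
    regression weights. Inputs are the LLR, PESQ, WSS and segmental
    SNR values obtained for one utterance."""
    csig = 3.093 - 1.029 * llr + 0.603 * pesq - 0.009 * wss
    cbak = 1.634 + 0.478 * pesq - 0.007 * wss + 0.063 * seg_snr
    covl = 1.594 + 0.805 * pesq - 0.512 * llr - 0.007 * wss
    clip = lambda v: max(1.0, min(5.0, v))  # keep scores on the 1-5 scale
    return clip(csig), clip(cbak), clip(covl)
```

Because the weights were fitted against subjective ratings, these composite scores correlate with listener opinion better than any single component measure alone.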
Log-Likelihood Ratio

The Log-Likelihood Ratio (LLR) measure is also referred to as the Itakura distance measure. It is based on the dissimilarity between the all-pole models of the original (clean) and enhanced speech, and is computed between sets of linear prediction coefficients over synchronous frames of the original and enhanced speech. The LLR measure is given by

d_LLR(a_d, a_s) = log( (a_d^T R_s a_d) / (a_s^T R_s a_s) )    (1.5)

In Equation (1.5), a_s and a_d are the linear prediction coefficient vectors of the clean and the degraded or enhanced speech segments respectively, and R_s denotes the autocorrelation matrix of the clean speech segment for which the optimal predictor coefficients a_s have been computed.

Weighted Spectral Slope distance (WSS)

The WSS distance measure is based on critical-band filter analysis (an auditory model) in which overlapping filters of progressively larger bandwidth are used to estimate the smoothed short-time speech spectrum (Yi Hu and Philipos C. Loizou 2006a, 2006b, 2008). The measure computes a weighted difference between the spectral slopes in each band, where the magnitude of each weight reflects whether the band is near a spectral peak or valley and whether that peak is the largest in the spectrum.
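The per-frame LLR of Equation (1.5) can be sketched as follows. The LPC order of 10 is a typical choice for 8 kHz speech and is an assumption here, as is solving the normal equations directly rather than by Levinson-Durbin recursion:

```python
import numpy as np

def _autocorr(x, order):
    # Autocorrelation lags r[0..order] of one frame.
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])

def _lpc(x, order):
    """LPC coefficient vector [1, a1..ap] from the autocorrelation
    normal equations (autocorrelation method)."""
    r = _autocorr(x, order)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def llr(clean_frame, enhanced_frame, order=10):
    """Log-likelihood ratio of Equation (1.5) for one synchronous frame pair."""
    a_s = _lpc(clean_frame, order)
    a_d = _lpc(enhanced_frame, order)
    r = _autocorr(clean_frame, order)
    R_s = np.array([[r[abs(i - j)] for j in range(order + 1)]
                    for i in range(order + 1)])
    return np.log((a_d @ R_s @ a_d) / (a_s @ R_s @ a_s))
```

Since a_s minimizes the quadratic form a^T R_s a over monic coefficient vectors, the ratio is at least one and the LLR is non-negative, with zero meaning identical all-pole models.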
Test Samples

The test samples used to examine the performance of the proposed speech enhancement algorithms are taken from the SpEAR database of CSLU (Center for Spoken Language Understanding), the NOIZEUS database, and samples recorded in a room environment.

SpEAR: Speech Enhancement and Assessment Resource. This database provides speech corrupted by different classes of noise, such as white stationary, pink stationary, car and cellular noise; the speech is sampled at 8 kHz (16 bits per sample).

NOIZEUS: A noisy speech corpus for the evaluation of speech enhancement algorithms (Varga and Steeneken 1993). The database contains 30 IEEE sentences (produced by three male and three female speakers) corrupted by eight different real-world noises at different SNRs. The noise was taken from the AURORA database and includes suburban train, babble, car, exhibition hall, restaurant, street, airport and train station noise. The sentences were originally sampled at 25 kHz and downsampled to 8 kHz, and the noise signals are added to the speech signals at SNRs of 0 dB, 5 dB and 10 dB.

In the subsequent chapters, dual-channel and single-channel speech enhancement algorithms are proposed, and the proposed methods are analyzed on these test samples using the performance measures discussed above.

1.5 APPLICATIONS OF SPEECH ENHANCEMENT

The following are some of the applications where speech enhancement plays a vital role in improving the performance of speech processing systems.
Speech enhancement has found many applications, particularly with the growth of Automatic Speech Recognition (ASR) and mobile communications. The performance of ASR systems degrades badly in adverse environments with very low SNR, and the recognition rate improves when a speech enhancement algorithm is applied to the degraded speech. In mobile communication, the speech signal is degraded by different types of noise in the communication channel, so speech enhancement is needed at the receiver. A feature common to most speech enhancement methods is the estimation of the power spectrum of the clean speech from the power spectrum of the noisy speech and the spectrum of the noise.

Hearing instruments have come a long way from the first analog electroacoustic devices to state-of-the-art, full-fledged digital hearing systems. Besides providing the amplification necessary for the successful rehabilitation of a hearing-impaired person, modern hearing instruments encompass a variety of functions that improve the user experience, including noise suppression, feedback control and wireless communication. Under noisy conditions, hearing-impaired persons typically have greater difficulty in understanding speech than those with normal hearing; this disadvantage translates into a requirement of an additional 2.5 to 12 dB of SNR improvement to reach speech discrimination scores similar to those of normal hearing. As the capabilities of digital signal processing migrate to smaller devices, it is natural to consider its use as a front-end speech enhancement technique for future generations of hearing aids. Undisputedly, the most important function is the restoration of speech intelligibility in acoustically adverse conditions, which requires sophisticated acoustic signal (speech) processing algorithms.
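The power-spectrum estimation common to these front-end methods is, in its simplest form, spectral subtraction. The following is a minimal sketch, assuming for illustration that the leading frames of the recording are speech-free so the noise spectrum can be estimated from them, and using non-overlapping rectangular frames to keep the code short:

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=256, noise_frames=8):
    """Basic power spectral subtraction: estimate the noise power
    spectrum from the first `noise_frames` frames (assumed noise-only),
    subtract it from every frame's power spectrum, and resynthesize
    with the noisy phase."""
    n_frames = len(noisy) // frame_len
    frames = noisy[:n_frames * frame_len].reshape(n_frames, frame_len)
    spectra = np.fft.rfft(frames, axis=1)
    noise_power = np.mean(np.abs(spectra[:noise_frames]) ** 2, axis=0)
    power = np.abs(spectra) ** 2
    # Spectral floor avoids negative power and limits musical noise.
    clean_power = np.maximum(power - noise_power, 0.01 * power)
    enhanced = np.fft.irfft(np.sqrt(clean_power) * np.exp(1j * np.angle(spectra)),
                            n=frame_len, axis=1)
    return enhanced.ravel()
```

Production systems use overlap-add windowing and adaptive noise tracking instead of a fixed leading-segment estimate; the residual fluctuations of the subtracted spectrum are what listeners perceive as musical noise.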
The rapid evolution and convergence of wireless communication and consumer electronics technologies have advanced the information industry. However, using a handheld cellular phone in a car is restricted and poses a potential risk to the driver, so hands-free devices have been developed to overcome this problem. The driver's speech, though, is corrupted severely by ambient noise, which affects subsequent operations such as speech coding and ASR for voice dialing. To solve this problem, hands-free car kits must provide some means of noise reduction at the front end of the speech coder and recognizer.

Regarding speech recognizers, the objective of ASR systems is to recognize human speech, such as words and sentences, using algorithms executed by a computer without human intervention. Under noisy conditions, however, recognition performance degrades significantly because of the statistical mismatch between the noisy speech features and the clean-trained acoustic model of the recognition system. The mismatch occurs when the testing condition differs from the training condition, as acoustic interferences such as additive background noise change the statistics of the speech. It is necessary to address this problem so that recognition accuracy can be improved to a level applicable to real-world problems. Speech enhancement techniques attempt to reduce the effect of the mismatched acoustic environment by estimating the clean speech features.

These are some of the real-time speech processing applications where speech enhancement is necessary for better performance, and the proposed speech enhancement techniques address the problems of the existing methods.
1.6 ORGANIZATION OF THE THESIS

The thesis is organized as follows:

Chapter 2 reviews the existing dual-channel and single-channel speech enhancement algorithms and the noise estimation methods related to the present work. Section 2.2 reviews the existing speech enhancement methods and various noise estimation algorithms, Section 2.3 discusses the drawbacks of the existing speech enhancement algorithms, and Section 2.4 establishes the need for the present work.

Chapter 3 presents the dual-channel algorithms. A Hadamard Least Mean Square algorithm with DCT preprocessing and a Hadamard Recursive Least Square algorithm with DCT preprocessing are proposed in this chapter; time-domain and frequency-domain plots are obtained and the results are analyzed.

In Chapter 4, single-channel algorithms are proposed. In Section 4.2, an enhancement technique using a Partial Differential Equation (PDE) is proposed for stationary noisy conditions, operating on time-domain speech samples. Section 4.3 deals with speech enhancement using variance and a modified gain function, where the modified gain function is calculated over both time index and frequency bin to compensate for speech distortion; performance is analyzed by comparison with the existing method.

Chapter 5 provides the speech enhancement algorithms using a subband approach. Section 5.2 presents a subband spectral subtraction method using an adaptive noise estimation algorithm, which is used for
non-stationary noisy environments to improve the quality of the enhanced speech signal while reducing musical noise compared with the conventional spectral subtraction method. In Section 5.3, a Subband Two Step Decision Directed (SBTSDD) approach with an adaptive weighting factor and a perceptual gain factor is proposed; this section considers the masking property and the perceptual quality of the human ear, using masking thresholds to obtain the enhanced signal. Results for the proposed methods are analyzed and compared with those of Chapter 4, and time-domain and frequency-domain output plots are given.

Further, in Chapter 6, speech enhancement for a digital hearing aid is discussed based on the results obtained by the proposed methods. The conclusion and suggestions for future work are presented in Chapter 7.

1.7 SUMMARY

In this chapter, speech processing systems and speech enhancement were introduced, and an overview of speech enhancement techniques and their applications was given briefly. The next chapter deals with the literature review and the need for the present work.
White Paper: micon Directivity and Directional Speech Enhancement www.siemens.com Eghart Fischer, Henning Puder, Ph. D., Jens Hain Abstract: This paper describes a new, optimized directional processing
More informationTechnical Discussion HUSHCORE Acoustical Products & Systems
What Is Noise? Noise is unwanted sound which may be hazardous to health, interfere with speech and verbal communications or is otherwise disturbing, irritating or annoying. What Is Sound? Sound is defined
More informationFIR filter bank design for Audiogram Matching
FIR filter bank design for Audiogram Matching Shobhit Kumar Nema, Mr. Amit Pathak,Professor M.Tech, Digital communication,srist,jabalpur,india, shobhit.nema@gmail.com Dept.of Electronics & communication,srist,jabalpur,india,
More informationFrequency Tracking: LMS and RLS Applied to Speech Formant Estimation
Aldebaro Klautau - http://speech.ucsd.edu/aldebaro - 2/3/. Page. Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation ) Introduction Several speech processing algorithms assume the signal
More informationSpeech Enhancement Based on Spectral Subtraction Involving Magnitude and Phase Components
Speech Enhancement Based on Spectral Subtraction Involving Magnitude and Phase Components Miss Bhagat Nikita 1, Miss Chavan Prajakta 2, Miss Dhaigude Priyanka 3, Miss Ingole Nisha 4, Mr Ranaware Amarsinh
More informationResearch Article The Acoustic and Peceptual Effects of Series and Parallel Processing
Hindawi Publishing Corporation EURASIP Journal on Advances in Signal Processing Volume 9, Article ID 6195, pages doi:1.1155/9/6195 Research Article The Acoustic and Peceptual Effects of Series and Parallel
More informationAmbiguity in the recognition of phonetic vowels when using a bone conduction microphone
Acoustics 8 Paris Ambiguity in the recognition of phonetic vowels when using a bone conduction microphone V. Zimpfer a and K. Buck b a ISL, 5 rue du Général Cassagnou BP 734, 6831 Saint Louis, France b
More informationCONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS
CONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS Mitchel Weintraub and Leonardo Neumeyer SRI International Speech Research and Technology Program Menlo Park, CA, 94025 USA ABSTRACT
More informationThe Effect of Analysis Methods and Input Signal Characteristics on Hearing Aid Measurements
The Effect of Analysis Methods and Input Signal Characteristics on Hearing Aid Measurements By: Kristina Frye Section 1: Common Source Types FONIX analyzers contain two main signal types: Puretone and
More informationEvidence base for hearing aid features:
Evidence base for hearing aid features: { the ʹwhat, how and whyʹ of technology selection, fitting and assessment. Drew Dundas, PhD Director of Audiology, Clinical Assistant Professor of Otolaryngology
More informationAccessibility Standards Mitel MiVoice 8528 and 8568 Digital Business Telephones
Accessibility Standards Mitel products are designed with the highest standards of accessibility. Below is a table that outlines how Mitel MiVoice 8528 and 8568 digital business telephones conform to section
More informationNear-End Perception Enhancement using Dominant Frequency Extraction
Near-End Perception Enhancement using Dominant Frequency Extraction Premananda B.S. 1,Manoj 2, Uma B.V. 3 1 Department of Telecommunication, R. V. College of Engineering, premanandabs@gmail.com 2 Department
More informationJuan Carlos Tejero-Calado 1, Janet C. Rutledge 2, and Peggy B. Nelson 3
PRESERVING SPECTRAL CONTRAST IN AMPLITUDE COMPRESSION FOR HEARING AIDS Juan Carlos Tejero-Calado 1, Janet C. Rutledge 2, and Peggy B. Nelson 3 1 University of Malaga, Campus de Teatinos-Complejo Tecnol
More informationNoise-Robust Speech Recognition Technologies in Mobile Environments
Noise-Robust Speech Recognition echnologies in Mobile Environments Mobile environments are highly influenced by ambient noise, which may cause a significant deterioration of speech recognition performance.
More informationSUPPRESSION OF MUSICAL NOISE IN ENHANCED SPEECH USING PRE-IMAGE ITERATIONS. Christina Leitner and Franz Pernkopf
2th European Signal Processing Conference (EUSIPCO 212) Bucharest, Romania, August 27-31, 212 SUPPRESSION OF MUSICAL NOISE IN ENHANCED SPEECH USING PRE-IMAGE ITERATIONS Christina Leitner and Franz Pernkopf
More informationSpeech recognition in noisy environments: A survey
T-61.182 Robustness in Language and Speech Processing Speech recognition in noisy environments: A survey Yifan Gong presented by Tapani Raiko Feb 20, 2003 About the Paper Article published in Speech Communication
More informationImplementation of Spectral Maxima Sound processing for cochlear. implants by using Bark scale Frequency band partition
Implementation of Spectral Maxima Sound processing for cochlear implants by using Bark scale Frequency band partition Han xianhua 1 Nie Kaibao 1 1 Department of Information Science and Engineering, Shandong
More informationMasker-signal relationships and sound level
Chapter 6: Masking Masking Masking: a process in which the threshold of one sound (signal) is raised by the presentation of another sound (masker). Masking represents the difference in decibels (db) between
More informationLATERAL INHIBITION MECHANISM IN COMPUTATIONAL AUDITORY MODEL AND IT'S APPLICATION IN ROBUST SPEECH RECOGNITION
LATERAL INHIBITION MECHANISM IN COMPUTATIONAL AUDITORY MODEL AND IT'S APPLICATION IN ROBUST SPEECH RECOGNITION Lu Xugang Li Gang Wang Lip0 Nanyang Technological University, School of EEE, Workstation Resource
More informationADVANCES in NATURAL and APPLIED SCIENCES
ADVANCES in NATURAL and APPLIED SCIENCES ISSN: 1995-0772 Published BYAENSI Publication EISSN: 1998-1090 http://www.aensiweb.com/anas 2016 December10(17):pages 275-280 Open Access Journal Improvements in
More informationA. SEK, E. SKRODZKA, E. OZIMEK and A. WICHER
ARCHIVES OF ACOUSTICS 29, 1, 25 34 (2004) INTELLIGIBILITY OF SPEECH PROCESSED BY A SPECTRAL CONTRAST ENHANCEMENT PROCEDURE AND A BINAURAL PROCEDURE A. SEK, E. SKRODZKA, E. OZIMEK and A. WICHER Institute
More informationAssistive Listening Technology: in the workplace and on campus
Assistive Listening Technology: in the workplace and on campus Jeremy Brassington Tuesday, 11 July 2017 Why is it hard to hear in noisy rooms? Distance from sound source Background noise continuous and
More informationWhat you re in for. Who are cochlear implants for? The bottom line. Speech processing schemes for
What you re in for Speech processing schemes for cochlear implants Stuart Rosen Professor of Speech and Hearing Science Speech, Hearing and Phonetic Sciences Division of Psychology & Language Sciences
More informationThe Use of a High Frequency Emphasis Microphone for Musicians Published on Monday, 09 February :50
The Use of a High Frequency Emphasis Microphone for Musicians Published on Monday, 09 February 2009 09:50 The HF microphone as a low-tech solution for performing musicians and "ultra-audiophiles" Of the
More informationBest Practice Protocols
Best Practice Protocols SoundRecover for children What is SoundRecover? SoundRecover (non-linear frequency compression) seeks to give greater audibility of high-frequency everyday sounds by compressing
More informationC H A N N E L S A N D B A N D S A C T I V E N O I S E C O N T R O L 2
C H A N N E L S A N D B A N D S Audibel hearing aids offer between 4 and 16 truly independent channels and bands. Channels are sections of the frequency spectrum that are processed independently by the
More information2/16/2012. Fitting Current Amplification Technology on Infants and Children. Preselection Issues & Procedures
Fitting Current Amplification Technology on Infants and Children Cindy Hogan, Ph.D./Doug Sladen, Ph.D. Mayo Clinic Rochester, Minnesota hogan.cynthia@mayo.edu sladen.douglas@mayo.edu AAA Pediatric Amplification
More informationAdaptation of Classification Model for Improving Speech Intelligibility in Noise
1: (Junyoung Jung et al.: Adaptation of Classification Model for Improving Speech Intelligibility in Noise) (Regular Paper) 23 4, 2018 7 (JBE Vol. 23, No. 4, July 2018) https://doi.org/10.5909/jbe.2018.23.4.511
More informationEffects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking
INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking Vahid Montazeri, Shaikat Hossain, Peter F. Assmann University of Texas
More informationSound localization psychophysics
Sound localization psychophysics Eric Young A good reference: B.C.J. Moore An Introduction to the Psychology of Hearing Chapter 7, Space Perception. Elsevier, Amsterdam, pp. 233-267 (2004). Sound localization:
More informationAUDL GS08/GAV1 Signals, systems, acoustics and the ear. Pitch & Binaural listening
AUDL GS08/GAV1 Signals, systems, acoustics and the ear Pitch & Binaural listening Review 25 20 15 10 5 0-5 100 1000 10000 25 20 15 10 5 0-5 100 1000 10000 Part I: Auditory frequency selectivity Tuning
More informationSpeech perception in individuals with dementia of the Alzheimer s type (DAT) Mitchell S. Sommers Department of Psychology Washington University
Speech perception in individuals with dementia of the Alzheimer s type (DAT) Mitchell S. Sommers Department of Psychology Washington University Overview Goals of studying speech perception in individuals
More informationLinguistic Phonetics Fall 2005
MIT OpenCourseWare http://ocw.mit.edu 24.963 Linguistic Phonetics Fall 2005 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 24.963 Linguistic Phonetics
More informationEffects of speaker's and listener's environments on speech intelligibili annoyance. Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag
JAIST Reposi https://dspace.j Title Effects of speaker's and listener's environments on speech intelligibili annoyance Author(s)Kubo, Rieko; Morikawa, Daisuke; Akag Citation Inter-noise 2016: 171-176 Issue
More informationSystems for Improvement of the Communication in Passenger Compartments
Systems for Improvement of the Communication in Passenger Compartments Tim Haulick/ Gerhard Schmidt thaulick@harmanbecker.com ETSI Workshop on Speech and Noise in Wideband Communication 22nd and 23rd May
More informationPower Instruments, Power sources: Trends and Drivers. Steve Armstrong September 2015
Power Instruments, Power sources: Trends and Drivers Steve Armstrong September 2015 Focus of this talk more significant losses Severe Profound loss Challenges Speech in quiet Speech in noise Better Listening
More informationEssential feature. Who are cochlear implants for? People with little or no hearing. substitute for faulty or missing inner hair
Who are cochlear implants for? Essential feature People with little or no hearing and little conductive component to the loss who receive little or no benefit from a hearing aid. Implants seem to work
More informationModulation and Top-Down Processing in Audition
Modulation and Top-Down Processing in Audition Malcolm Slaney 1,2 and Greg Sell 2 1 Yahoo! Research 2 Stanford CCRMA Outline The Non-Linear Cochlea Correlogram Pitch Modulation and Demodulation Information
More informationINTRODUCTION TO PURE (AUDIOMETER & TESTING ENVIRONMENT) TONE AUDIOMETERY. By Mrs. Wedad Alhudaib with many thanks to Mrs.
INTRODUCTION TO PURE TONE AUDIOMETERY (AUDIOMETER & TESTING ENVIRONMENT) By Mrs. Wedad Alhudaib with many thanks to Mrs. Tahani Alothman Topics : This lecture will incorporate both theoretical information
More informationSingle channel noise reduction in hearing aids
Single channel noise reduction in hearing aids Recordings for perceptual evaluation Inge Brons Rolph Houben Wouter Dreschler Introduction Hearing impaired have difficulty understanding speech in noise
More informationLIST OF FIGURES. Figure No. Title Page No. Fig. l. l Fig. l.2
LIST OF FIGURES Figure No. Title Page No. Fig. l. l Fig. l.2 Fig. l.3 A generic perceptual audio coder. 3 Schematic diagram of the programme of the present work on perceptual audio coding. 18 Schematic
More informationThe role of periodicity in the perception of masked speech with simulated and real cochlear implants
The role of periodicity in the perception of masked speech with simulated and real cochlear implants Kurt Steinmetzger and Stuart Rosen UCL Speech, Hearing and Phonetic Sciences Heidelberg, 09. November
More informationLinguistic Phonetics. Basic Audition. Diagram of the inner ear removed due to copyright restrictions.
24.963 Linguistic Phonetics Basic Audition Diagram of the inner ear removed due to copyright restrictions. 1 Reading: Keating 1985 24.963 also read Flemming 2001 Assignment 1 - basic acoustics. Due 9/22.
More informationIMPROVING CHANNEL SELECTION OF SOUND CODING ALGORITHMS IN COCHLEAR IMPLANTS. Hussnain Ali, Feng Hong, John H. L. Hansen, and Emily Tobey
2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) IMPROVING CHANNEL SELECTION OF SOUND CODING ALGORITHMS IN COCHLEAR IMPLANTS Hussnain Ali, Feng Hong, John H. L. Hansen,
More informationBINAURAL DICHOTIC PRESENTATION FOR MODERATE BILATERAL SENSORINEURAL HEARING-IMPAIRED
International Conference on Systemics, Cybernetics and Informatics, February 12 15, 2004 BINAURAL DICHOTIC PRESENTATION FOR MODERATE BILATERAL SENSORINEURAL HEARING-IMPAIRED Alice N. Cheeran Biomedical
More informationCONTACTLESS HEARING AID DESIGNED FOR INFANTS
CONTACTLESS HEARING AID DESIGNED FOR INFANTS M. KULESZA 1, B. KOSTEK 1,2, P. DALKA 1, A. CZYŻEWSKI 1 1 Gdansk University of Technology, Multimedia Systems Department, Narutowicza 11/12, 80-952 Gdansk,
More informationWIDEXPRESS. no.30. Background
WIDEXPRESS no. january 12 By Marie Sonne Kristensen Petri Korhonen Using the WidexLink technology to improve speech perception Background For most hearing aid users, the primary motivation for using hearing
More informationUsing Source Models in Speech Separation
Using Source Models in Speech Separation Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/
More informationGENERALIZATION OF SUPERVISED LEARNING FOR BINARY MASK ESTIMATION
GENERALIZATION OF SUPERVISED LEARNING FOR BINARY MASK ESTIMATION Tobias May Technical University of Denmark Centre for Applied Hearing Research DK - 2800 Kgs. Lyngby, Denmark tobmay@elektro.dtu.dk Timo
More informationProviding Effective Communication Access
Providing Effective Communication Access 2 nd International Hearing Loop Conference June 19 th, 2011 Matthew H. Bakke, Ph.D., CCC A Gallaudet University Outline of the Presentation Factors Affecting Communication
More informationSupporting Features. Criteria. Remarks and Explanations
Date: vember 23, 2016 Name of Product: Contact for More Information: section508@necam.com Summary Table Voluntary Product Accessibility 1194.23(a) Telecommunications products or systems which provide a
More informationSummary Table Voluntary Product Accessibility Template. Supporting Features Not Applicable Not Applicable. Supports with Exceptions.
Plantronics/ Clarity Summary Table Voluntary Product Accessibility Template Criteria Section 1194.21 Software Applications and Operating Systems Section 1194.22 Web-based intranet and Internet Information
More informationCriteria Supporting Features Remarks and Explanations
Date: August 31, 2009 Name of Product: (Models DTL-2E, DTL-6DE, DTL- 12D, DTL-24D, DTL32D, DTL-8LD) Contact for More Information: section508@necam.com or 214-262-7095 Summary Table Voluntary Product Accessibility
More informationProceedings of Meetings on Acoustics
Proceedings of Meetings on Acoustics Volume 19, 13 http://acousticalsociety.org/ ICA 13 Montreal Montreal, Canada - 7 June 13 Engineering Acoustics Session 4pEAa: Sound Field Control in the Ear Canal 4pEAa13.
More informationInfant Hearing Development: Translating Research Findings into Clinical Practice. Auditory Development. Overview
Infant Hearing Development: Translating Research Findings into Clinical Practice Lori J. Leibold Department of Allied Health Sciences The University of North Carolina at Chapel Hill Auditory Development
More informationWho are cochlear implants for?
Who are cochlear implants for? People with little or no hearing and little conductive component to the loss who receive little or no benefit from a hearing aid. Implants seem to work best in adults who
More information3M Center for Hearing Conservation
3M Center for Hearing Conservation Key Terms in Occupational Hearing Conservation Absorption A noise control method featuring sound-absorbing materials that are placed in an area to reduce the reflection
More informationTopic 4. Pitch & Frequency
Topic 4 Pitch & Frequency A musical interlude KOMBU This solo by Kaigal-ool of Huun-Huur-Tu (accompanying himself on doshpuluur) demonstrates perfectly the characteristic sound of the Xorekteer voice An
More informationSUMMARY TABLE VOLUNTARY PRODUCT ACCESSIBILITY TEMPLATE
Date: 1 August 2009 Voluntary Accessibility Template (VPAT) This Voluntary Product Accessibility Template (VPAT) describes accessibility of Polycom s Polycom CX200, CX700 Desktop IP Telephones against
More informationSUMMARY TABLE VOLUNTARY PRODUCT ACCESSIBILITY TEMPLATE
Date: 2 November 2010 Updated by Alan Batt Name of Product: Polycom CX600 IP Phone for Microsoft Lync Company contact for more Information: Ian Jennings, ian.jennings@polycom.com Note: This document describes
More informationSupporting Features Remarks and Explanations
Date: August 27, 2009 Name of Product: (Models ITL- 2E, ITL-6DE, ITL- 12D, ITL-24D, ITL-32D, ITL-8LD, ITL-320C) Contact for More Information: section508@necam.com or 214-262-7095 Summary Table Voluntary
More informationIndividualizing Noise Reduction in Hearing Aids
Individualizing Noise Reduction in Hearing Aids Tjeerd Dijkstra, Adriana Birlutiu, Perry Groot and Tom Heskes Radboud University Nijmegen Rolph Houben Academic Medical Center Amsterdam Why optimize noise
More informationIMPROVING THE PATIENT EXPERIENCE IN NOISE: FAST-ACTING SINGLE-MICROPHONE NOISE REDUCTION
IMPROVING THE PATIENT EXPERIENCE IN NOISE: FAST-ACTING SINGLE-MICROPHONE NOISE REDUCTION Jason A. Galster, Ph.D. & Justyn Pisa, Au.D. Background Hearing impaired listeners experience their greatest auditory
More informationSlow compression for people with severe to profound hearing loss
Phonak Insight February 2018 Slow compression for people with severe to profound hearing loss For people with severe to profound hearing loss, poor auditory resolution abilities can make the spectral and
More information