An Investigation of MDVP Parameters for Voice Pathology Detection on Three Different Databases

Size: px
Start display at page:

Download "An Investigation of MDVP Parameters for Voice Pathology Detection on Three Different Databases"

Transcription

1 INTERSPEECH 2015 An Investigation of MDVP Parameters for Voice Pathology Detection on Three Different Databases Ahmed Al-nasheri 1, Zulfiqar Ali 1,2, Ghulam Muhammad 1, and Mansour Alsulaiman 1 1 Digital Speech Processing Group, Department of Computer Engineering, College of Computer and Information Sciences King Saud University, Riyadh 11543, Saudi Arabia 2 Centre for Intelligent Signal and Imaging Research (CISIR), Department of Electrical and Electronic Engineering Universiti Tekhnologi PETRONAS, Tronoh 31750, Perak, Malaysia a.alnasheri@yahoo.com, {zuali, ghulam, msuliman}@ksu.edu.sa Abstract In this paper, an investigation of Multi-Dimensional Voice Program (MDVP) parameters to automatically detect voice pathology in three different databases was conducted. MDVP parameters are very popular acoustic analysis among physician / clinician to detect voice pathology. The main objective of the paper is to find out the most prominent MDVP parameters irrespective to the databases used. In this study, three different databases from three distinct languages were used. The databases are Arabic voice pathology database (AVPD), Massachusetts Eye and Ear Infirmary (MEEI) (English database), and Saarbruecken Voice Database (SVD) (German database). Only the sustained vowel /a/ was used in the study. Fisher discrimination ratio (FDR) was applied to rank the parameters. Support vector machine (SVM) was used to perform the detection process. The experimental results demonstrated that there was clear difference of the performance of the MDVP parameters using these databases. The highly ranked parameters were also different from one database to another. The accuracies that achieved are varied from one database to another with the same number of MVPD parameters. The best accuracies obtained by using the three highest MDVP parameters arranged according to FDR were 99.68%, 88.21% and for SVD, MEEI and AVPD, respectively. Index Terms: MDVP parameters, AVPD, SVD, MEEI, SVM, Fisher discrimination ratio, voice pathology detection. 1. Introduction Voice pathologies affect the vocal folds, and these pathologies (disorder) produce irregular vibrations in the vocal folds due to the malfunctioning of the voice box. Vocal fold pathologies exhibit variations in a vibratory cycle of the vocal folds due to their incomplete closure. The voice disorder also changes the shape of the vocal tract and produces irregularities in spectral properties [1]. In addition, voice disorders affect the vocal fold vibration depending on the type of disorder and the location of the disease on the vocal folds that make them produce different basic tones. The number of dysphonic patients having different types of voice disorders has been increased significantly. In the United States, approximately 7.5 million people have vocal difficulty [2]. It has been found that 15% of the total visitors to the King Abdul Aziz University Hospital, Riyadh complain from a voice disorder [3]. The complications caused by a voice problem in a teaching professional are significantly greater than in a nonteaching professional. Studies revealed that, in the U.S., the prevalence of voice disorders during a lifetime is 57.7% for teachers and 28.8% for non-teachers [4]. Approximately, 33% of male and female teachers in the Riyadh area suffer from voice disorders [5]. At Communication and Swallowing Disorders Unit, King Abdul Aziz University Hospital, a high volume of voice disorder cases is examined (almost 760 cases per annum) in individuals with various professional and etiological backgrounds. The use of computers to detect or identify pathological problems in speech is a non-invasive method, which is advancing with time. In the last decade, much research has been done on the automatic detection of vocal fold disorders, and these tasks continue to require further investigation due to the lack of standard automatic diagnosing approaches/equipment for voice disorders. Pathology detection is the first crucial step to correctly diagnose and manage voice disorders. The use of the objective assessment that includes acoustical analysis is independent of human bias and can assess the voice quality reliably by relating certain parameters to vocal fold behavior. On the other hand, subjective measurement of voice quality is based on individual experience, which may vary. Automatic voice pathology detection can be accomplished by various types of features, which can be obtained by the longterm and short-term signal analysis. The long-term parameters can be derived by acoustic analysis [6], [7] of speech, and the short term parameters can be calculated by linear predictive coefficients (LPC) [8], [9], linear predictive cepstral coefficients (LPCC) [10], Mel-frequency cepstral coefficients (MFCC) [11], [12] etc. Different pattern matching techniques such as Gaussian mixture model (GMM) [13], [14], hidden Markov model (HMM) [15], support vector machine (SVM) [16], artificial neural networks (ANN) [17] etc. have been used to differentiate between disorder and normal samples. Multiple long-term acoustic features, namely, pitch, shimmer, jitter, APQ (amplitude perturbation quotient), PPQ (pitch perturbation quotient), HNR (harmonic to noise ratio), NNE (normalized noise energy), VTI (voice turbulence index), SPI (soft phonation index), FATR (frequency amplitude tremor), and the glottal to noise excitation ratio (GNE) are frequently used to diagnose voice pathology (referred in [14] as [2]-[12]). Furthermore, jitter and shimmer capture the vocal fold vibration characteristics for both pathological and normal people, and both parameters are widely used for clinical and scientific diagnosis [18]. Seven acoustic parameters, including shimmer Copyright 2015 ISCA 2952 September 6-10, 2015, Dresden, Germany

2 and jitter, are extracted by means of an iterative residual signal estimator in Rosa et al. [19], and jitter provided 54.8% pathology detection accuracy between 21 pathologies. Thirtythree different long-term acoustic parameters with their definitions, derived from Multi-Dimensional Voice Program (MDVP), are listed in Arjmandi et al. [20]. Twenty-two acoustic parameters are selected from the list, and they were extracted from the voice samples of the Massachusetts Eye and Ear Infirmary (MEEI) database. In this study, 50 dysphonic patients and 50 normal persons were used for the detection. The 22 parameters are calculated for each sample and fed to six different classifiers to compare their accuracies. Two feature reduction techniques are also used before applying classification methods. Binary classifier SVM has shown the best results compared to other classifiers, and its recognition rate is 94.26%. In [21], MFCC and six acoustic parameters jitter, shimmer, NHR, SPI, APQ and RAP (Relative Average Perturbation) are extracted, and the results are compared with the NN-based voice pathology detection system [22]. Sáenz- Lechón et al compared between their proposed parameters based on wavelet transform and some of the MDVP parameters to discriminate between pathological and normal voices [23]. To ensure the reliability of the acoustic MDVP parameters, some of these parameters were compared to the same parameters which are extracted using Praat and the result showed that there was no significant different between the two computer software [24]. Recently, MPEG-7 audio descriptors and multi-directional regression based features are used in voice pathology detection and found to have good accuracy [26, 27]. Another recent study investigates the most discriminative frequency region for voice pathology detection [28]. In general, MDVP parameters have a good ability to discriminate between normal and pathological voices like other tools that used to extract acoustic parameters such as WPCVox [25]. In this paper, the well-known MDVP parameters are used in three different databases, which are AVPD, MEEI, and SVD, to detect voice pathology. MDVP parameters are commonly used by the physician / clinician to assess the voice pathology; however, MDVP is commercial software, and it includes parameters of voice quality measurement. The rest of the paper is organized as follows: section 2 provides overview of the speech databases. Section 3 presents the methodology and tools using in this study; Section 4 gives experimental setup; Section 5 presents results and discussion, and finally, Section 6 draws some conclusions. 2. Databases 2.1. Arabic voice pathology database (AVPD) The speech samples collected in different sessions at Communication and Swallowing Disorders Unit [4], King Abdul Aziz University Hospital, Riyadh, Saudi Arabia, by experienced phoneticians in a soundproof room using a standardized recording protocol. The database collection is one of the major tasks of the undergoing project funded by National Plans for Science and Technology (NPST), Saudi Arabia, for the duration of two years. The protocol of the database designed such that it should avoid different shortcomings of MEEI database [24]. The project records the sustained vowels as well as the speech of patients having vocal fold disorders and the same of normal persons after clinically assessing the persons. The recording has different types of texts; three vowels with onset and offset information, isolated words containing Arabic digits and some common words, and continuous speech. The selected text covers all Arabic phonemes. All speakers record three utterances of each vowel /a/, /u/ and /i/, while, isolated words and continuous speech are recorded once to avoid making a burden on patients. The sampling frequency of the database is 50 khz, and the speech is recorded by using Kay Pentax computerized speech lab (CSL Model 4300) Massachusetts Eye and Ear Infirmary (MEEI) voice disorder database This database developed by MEEI Voice and Speech Lab. It includes more than 1,400 voiced samples of sustained vowel /a/ and first 12 seconds of Rainbow Passage. It is openly commercialized by Kay Elemetrics [29]. It was recorded in two different environments. The sampling frequency for normal samples is 50 khz while that of the pathological samples is 25 khz or 50 khz. It is used in most of the studies of voice pathology detection and classification Saarbruecken voice database (SVD) The SVD is a freely downloadable database [30]. It has been recorded by the Institute of Phonetics of Saarland University. This database contains sustained vowels /a/, /i/ and /u/ with different intonations: normal, low, high and low-high-low and a spoken sentence in German Guten Morgen, wie geht es Ihnen? that mean in English Good morning, how are you? is included. Very few studies of voice pathology detection have been done on this database. 3. Method Various samples of normal and three distinct voice disorders were taken from three different databases as shown in Table 1. We chose these disorders because only these disorders are common in the three databases. The numbers of male and female speakers are shown, respectively, inside the parenthesis in Table 1. Table 1. Normal and pathological samples from three different databases. Data base AVPD MEEI SVD Normal 118 (93-25) 53 (21, 32) 262 ( ) Pathological Cysts Paraly. Polyp (7 6) (16-16) (14-16) (6-4) (38-32) (8-7) (5-1) (121-73) (19-25) 244 In the experiments, we only used samples for sustained vowel /a/. Each voice sample has a total of 33 MDVP acoustic parameters. The MDVP parameters of these samples were extracted using MDVP [31] kaypentax. This program extracts 33 parameters, but we only selected 22 parameters out of 33 parameters because the rest of the parameters do not reflect voice quality, or they are not reported for some voices. In this study, fisher discriminative ratio (FDR) was used between two classes, one from normal and the other from pathological, for the whole parameters in each database individually. The purpose of using FDR is to find which features contribute higher to detection of pathologies, and it can be calculated as shown in (1). 2 ( N P ) FDR where, i 1,2,3,...22 (1) i 2 2 N P 2953

3 where µn and µp represent the mean for classes of normal and pathological samples, respectively, while σn and σp represent the same but as variances, and where i represents the feature number. In the experiments, the features were fed to support vector machine for the decision making whether the subject is normal or pathological. To ensure from the accuracy we got, every experiment was repeated ten times and then we took the average to report the results. 4. Experiment For each database, four different experiments were performed with various numbers of parameters. First, we selected the 22 parameters that were used and defined in [31]. We performed the experiments with the 22 parameters from these databases individually. After that, we used FDR and sorted these parameters in descending order according to the fisher ratio score. We chose the first top 10 parameters, and performed the experiments with these parameters. Next, we chose the three top parameters from each database and performed the experiments with them. To develop a general system independent of the databases, we chose the four common parameters from the first 10 top parameters and performed the experiment with them from each database individually. 5. Results and Discussion The results of the performed experiments for pathology detection are expressed in terms of accuracy (ACC), sensitivity (SN: the proportion of pathological samples that positively identified), specificity (SP: the proportion of normal samples that negatively identified) and area under the Receiver Operating Characteristic (ROC) curve called AUC, which are shown in Table 2. As we can see from this table, the accuracies are varied from one database to another with the same number of MDVP parameters. The variation in the accuracies from one database to another may refer to different reasons such as (i) the severity levels of voice disorders, which are not the same in the three databases as we can notice from Table 2 how much sensitivity (that belongs to pathological samples) is varied from one database to another, (ii) the recording environment and the regulation of the recording are not the same in the three databases, (iii) in the same database the recoding environments for pathological and normal samples are not the same as in MEEI database, and (iv) the number of the taken samples from these databases in this study are not the same. In case of 22-parameters, the highest accuracy is 76.36% with MEEI database, which is comparable to the result in [32] where Arjmandi uses the same database and the same classifier, but different pathological samples. The accuracies with the two other databases, SVD and AVPD, are comparable (little less) to the accuracy obtained by MEEI. To the best of our knowledge, this is the first instance of evaluating the MDVP parameters in these two databases, and it needs more investigation. In case of 10 top parameters, the accuracy is increased by 11% with the MEEI samples, and the accuracy remained almost constant in case of SVD and AVPD. The reason behind that may be due to the top 10 parameters are not the same in all databases, and their FDR values are different too. In case of 3 top MDVP parameters, the accuracies are dramatically increased especially with the samples that were taken from SVD database. The best accuracies are 99.68%, 88.21%, and 72.53% for SVD, MEEI, and AVPD, respectively. The top three features that gave high accuracy with SVD are vam (peak amplitude variation in period-to-period), APQ (amplitude perturbation quotient), and PFR (phonatory fundamental frequency). vam has FDR value of which is far greater than FDR values of and that belong to PFR and APQ, respectively. Table 2. Result of different MDVP Parameters from three databases Parameter s 22- Parameters 10- Parameters 3-Parameters 4 - Common Databas e ACC % SN% SP% AUC AVPD MEEI SVD AVPD MEEI SVD AVPD MEEI SVD AVPD MEEI SVD Figure 1: Scatter plot for the three to MDVP Parameters from SVD database. From the scatter plot in Figure 1, we can see vam parameter has the best ability to discriminate the pathological samples more than the others two parameters, and it can be illustrated in Figure 2 that the probability density functions (pdf) of normal and pathological samples using the vam parameter have almost no overlapping. Figure 3 illustrates the ROC curve of the 3 top parameters from the three databases. It shows that the best performance is obtained with the features extracted from SVD database. One of the reasons behind that is in SVD database, the pathological samples have high severity, so it has a clear difference between normal and pathological samples. The 95% confidence interval (C.I.) is [ ], and 1-tail p-value is zero (<0.05) describing the significance of the data in the two classes. 2954

4 Figure 2: PDF for vam Parameter. from these databases produce the same ability to detect pathological voices or not and to avoid unfairly over-fitting that results from error estimation. By performing cross database experiment, it will be easy to see how well the trained system on one database could classify samples of another database. As shown in Table 3 the cross database experiments are performed between two databases: one database for training and the other one for testing and the accuracy is depicted in that table with increased accuracy in the presence of SVD database in the training. In addition, three more experiments are performed between three databases, where two are used for training and the other one for testing. The accuracies are comparable in the three experiments. Table 3: Cross database experiments accuracies (%) with 4-common parameters. In general, the result we got is comparable to previous study in case of using MEEI database and it represents a good result if we compare our initial work on AVPD database with the initial use of SVD in the field of voice pathology detection as in [33]. Training Testing MEEI SVD AVPD MEEI SVD AVPD MEEI + SVD SVD + AVPD MEEI AVPD Figure 3: ROC Curve for 3-Top Features for the three Databases. In case of the using the four common MDVP parameters between the three databases, the accuracy is decreased by 7% in case of MEEI and the other two accuracies for AVD and AVPD almost remained unchanged (with respect to the accuracies obtained by 10 parameters). The reason behind that is that every database is independent of one to another, and the 4 common parameters have different values according to the FDR score. The four common parameters between these databases are Jitta, Jitt, Rap, and PPQ. Jitta: is an absolute jitter that gives an evaluation in microseconds of the period-to-period variability of the pitch within the analyzed voice sample, Jitt: it gives an evaluation of the variability of the pitch period within the analyzed voice sample in percent, Rap: is a relative average perturbation that gives an evaluation of the variability of the pitch period within the analyzed voice sample at smoothing factor 3 periods, and PPQ: is a pitch perturbation quotient that gives an evaluation in percent of the long-term variability of the pitch period within the analyzed voice sample at smoothing factor of three periods. In each type of experiment mentioned before, we made a comparison between the performance of MDVP parameters to detect voice pathology for these databases and how much their samples contribute to discriminate between pathological and normal voices. In this study, some additional experiments were performed using cross database between the three databases to make sure if the extracted MDVP parameters From the results of all the experiments, we find the followings. Training with MEEI does not make the system robust, because the system is confused whether it is classifying the environments or normal / pathology. Training with SVD makes the system more robust than the systems trained with other database, because SVD samples are clearly distinguishable between normal and pathology. Training with two databases offsets some shortcomings of training with one database. There is a need to develop more robust features that can successfully differentiate between normal and pathological samples irrespective of the databases. Moreover, the features should be able to classify low severe pathological samples as in the case of AVPD. 6. Conclusion In this work, we evaluate MDVP parameters on three different databases AVPD, MEEI, and SVD with four different experiments individually. The accuracies of the detection are varied from one database to another with the same number of MDVP parameters. The best accuracies we got are %, 88.21%, and 72.53% for the samples that were taken from SVD, MEEI, and AVPD respectively. Some of the MDVP parameters show high ability of discrimination between normal and pathological subjects such as vam, APQ, and PFR. In a future work, we will try to find out some features that can correctly detect voice pathologies independent of the databases. 7. Acknowledgment This project was funded by the National Plan for Science, Technology and Innovation (MAARIFAH), King Abdulaziz City for Science and Technology, Kingdom of Saudi Arabia, Award Number (12-MED ). 2955

5 8. References [1] G. Muhammad, Z. Ali, M. Alsulaiman, and K. Almutib. "Vocal Fold Disorder Detection by applying LBP Operator on Dysphonic Speech Signal, RAICMS, , [2] National Institute on Deafness and Other Communication Disorders: Voice, Speech, and Language: Quick Statistics, 2014.Availableat Accessed on March, [3] Research Chair of Voicing and Swallowing Disorders. Available at Accessed on March, [4] N. Roy, R.M. Merrill, S. Thibeault, R.A. Parsa, S.D. Gray, and E.M. Smith, Prevalence of voice disorders in teachers and the general population, J Speech Lang Hear Res., vol.47, no. 2, pp , Apr [5] K.H. Almalki, Voice Disorders Among Saudi Teachers in Riyadh City, Saudi Journal of Oto-Rhinolaryngology Head and Neck Surgery, [6] B. Boyanov, and S. Hadjitodorov, Acoustic analysis of pathological voices. a voice analysis system for the screening of laryngeal diseases, Proceedings of IEEE International Conference on Engineering in Medicine and Biology Society, vol.16, pp.74-82, [7] C. E. Martinez, and L H. Rufiner, Acoustic analysis of speech for detection of laryngeal pathologies, Proceedings of 22nd Annual IEEE International Conference on Engineering in Medicine and Biology Society, vol. 3, pp , [8] B. S. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and recognition" J. Acoustic. Soc. Amer., vol. 54, no. 6, pp , [9] L. Xugang, and D. Jianwu, An investigation of dependencies between frequency components and speaker characteristics for text-independent speaker identification, Speech Communication 07, vol. 50, no. 4, pp , Oct [10] M. A. Anusuya, S. K. Katti, Front end analysis of speech recognition: a review, International Journal of Speech Technology, vol. 14, pp , Dec [11] L. Rabiner and B.H. Juang, Fundamentals of speech recognition. Englewood Cliffs, NJ: Prentice-Hall, [12] Z. Ali, M. Aslam., and M.E. Ana María, A speaker identification system using MFCC features with VQ technique, Proceedings of 3rd IEEE International Symposium on Intelligent Information Technology Application, pp , [13] W.J.J. Roberts, and J.P. Willmore, "Automatic speaker recognition using Gaussian mixture models", Proceedings of Information, Decision and Control, IDC 99, pp , [14] J.I. Godino-Llorente, P. Gomes-Vilda and M. Blanco-Velasco, "Dimensionality reduction of a pathological voice quality assessment system based on Gaussian mixture models and shortterm cepstral parameters", IEEE Transactions on Biomedical Engineering, vol. 53, no. 10, pp Oct [15] L.E. Baum and T. Petrie, Statistical inference for probabilistic functions of finite state Markov Chains, Ann. Math. Stat., vol. 37, pp , [16] S. Abe, Support Vector Machines for Pattern Classification. Springer-Verlag, Berlin Heidelberg New York, 2005 [17] T. Ritchings, M. McGillion, and C. Moore, Pathological voice quality assessment using artificial neural networks, Med. Eng. Phys., vol. 24, no. 8, pp , Sept [18] M. Brockmann, M.J. Drinnan, C. Storck, and P.N. Carding, "Reliable jitter and shimmer measurements in voice clinics: The relevance of vowel, gender, vocal intensity, and fundamental frequency effects in a typical clinical task," Journal of Voice, vol. 25, no. 1, pp , [19] M. Rosa, J.C. Pereira, and M. Grellet, Adaptive estimation of residue signal for voice pathology diagnosis, IEEE Trans. Biomed. Eng., vol. 47, no. 1, pp , Jan [20] M.K. Arjmandi, M. Pooyan, M. Mikaili, M. Vali, and A. Moqarehzadeh, Identification of voice disorders using long-time features and support vector machine with different feature reduction methods, Journal of Voice, vol. 25, no. 6, pp , Nov [21] J. Wang and C. Jo, "Vocal folds disorder detection using pattern recognition method", Proceedings of 29th Annual International Conference of the IEEE EMBS, pp , Lyon, France, [22] T. Li, C. Jo, and S. Wang, Classification of pathological voice including severely noisy cases, Proceedings of 8th International Conference on Spoken Language Processing, I, Jeju, Korea, pp , [23] N. Sáenz-Lechón, J.I. Godino-Llorente, V.. Osma-Ruiz, and P. Gómez-Vilda, Methodological issues in the development of automatic systems for voice pathology detection, Biomedical Signal Processing and Control, vol. 1, no. 2, pp , April [24] H. Oğuz, M. A. Kiliç, and M. A. Şafak, "Comparison of results in two acoustic analysis programs: PRAAT and MDVP," Turkish Journal of Medical Sciences 41.5, pp , [25] J.I. Godino-Llorente, V. Osma-Ruiz, N. Sáenz-Lechón, I. Cobeta- Marco, R. González-Herranz, C. Ramírez-Calvo "Acoustic analysis of voice using WPCVox: a comparative study with Multi- Dimensional Voice Program," European Archives of Oto-Rhino- Laryngology, 265.4, , [26] G. Muhammad and M. Melhem, Pathological Voice Detection and Binary Classification Using MPEG-7 Audio Features," Biomedical Signal Processing and Controls, 11, pp. 1 9, [27] G. Muhammad, T. Mesallam, K. Almalki, M. Farahat, A. Mahmood, and M. Alsulaiman, Multi Directional Regression (MDR) Based Features for Automatic Voice Disorder Detection, Journal of Voice, Vol. 26, No. 6, pp. 817.e e27, [28] A. A-Nasheri, Z. Ali, G. Muhammad, and M. Alsulaiman, Voice Pathology Detection Using Auto-Correlation of Different Filters Bank, 11th ACS/IEEE International Conference on Computer Systems and Applications, Doha, Qatar, November 10-13, [29] Kay Elemetrics Corp., Disordered Voice Database, Version 1.03 (CD-ROM), MEEI, Voice and Speech Lab, Boston, MA (October 1994). [30] W.J. Barry, P utzer, M., Saarbrucken Voice Database, Institute of Phonetics, Univ. of Saarland, [31] Kay Elemetrics, Multi-Dimensional Voice Program (MDVP) [Computer Program], [32] M. K. Arjmandi, M. Pooyan, M. Mikaili, M. Vali, A. Moqarehzadeh, "Identification of voice disorders using long-time features and support vector machine with different feature reduction methods," Journal of Voice, 25 (6), pp. e275-e289, [33] D. Martínez, E. Lleida, A. Ortega, A. Miguel, and J. Villalba "Voice Pathology Detection on the Saarbruecken Voice Database with Calibration and Fusion of Scores Using MultiFocal Toolkit," Advances in Speech and Language Technologies for Iberian Languages. Springer Berlin Heidelberg, ,

Performance of Gaussian Mixture Models as a Classifier for Pathological Voice

Performance of Gaussian Mixture Models as a Classifier for Pathological Voice PAGE 65 Performance of Gaussian Mixture Models as a Classifier for Pathological Voice Jianglin Wang, Cheolwoo Jo SASPL, School of Mechatronics Changwon ational University Changwon, Gyeongnam 64-773, Republic

More information

Voice Pathology Recognition and Classification using Noise Related Features

Voice Pathology Recognition and Classification using Noise Related Features Voice Pathology Recognition and Classification using Noise Related Features HAMDI Rabeh 1, HAJJI Salah 2, CHERIF Adnane 3 Faculty of Mathematical, Physical and Natural Sciences of Tunis, University of

More information

Telephone Based Automatic Voice Pathology Assessment.

Telephone Based Automatic Voice Pathology Assessment. Telephone Based Automatic Voice Pathology Assessment. Rosalyn Moran 1, R. B. Reilly 1, P.D. Lacy 2 1 Department of Electronic and Electrical Engineering, University College Dublin, Ireland 2 Royal Victoria

More information

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception

Jitter, Shimmer, and Noise in Pathological Voice Quality Perception ISCA Archive VOQUAL'03, Geneva, August 27-29, 2003 Jitter, Shimmer, and Noise in Pathological Voice Quality Perception Jody Kreiman and Bruce R. Gerratt Division of Head and Neck Surgery, School of Medicine

More information

SPEECH TO TEXT CONVERTER USING GAUSSIAN MIXTURE MODEL(GMM)

SPEECH TO TEXT CONVERTER USING GAUSSIAN MIXTURE MODEL(GMM) SPEECH TO TEXT CONVERTER USING GAUSSIAN MIXTURE MODEL(GMM) Virendra Chauhan 1, Shobhana Dwivedi 2, Pooja Karale 3, Prof. S.M. Potdar 4 1,2,3B.E. Student 4 Assitant Professor 1,2,3,4Department of Electronics

More information

Performance Evaluation of Machine Learning Algorithms in the Classification of Parkinson Disease Using Voice Attributes

Performance Evaluation of Machine Learning Algorithms in the Classification of Parkinson Disease Using Voice Attributes Performance Evaluation of Machine Learning Algorithms in the Classification of Parkinson Disease Using Voice Attributes J. Sujatha Research Scholar, Vels University, Assistant Professor, Post Graduate

More information

METHOD. The current study was aimed at investigating the prevalence and nature of voice

METHOD. The current study was aimed at investigating the prevalence and nature of voice CHAPTER - 3 METHOD The current study was aimed at investigating the prevalence and nature of voice problems in call center operators with the following objectives: To determine the prevalence of voice

More information

Nonlinear signal processing and statistical machine learning techniques to remotely quantify Parkinson's disease symptom severity

Nonlinear signal processing and statistical machine learning techniques to remotely quantify Parkinson's disease symptom severity Nonlinear signal processing and statistical machine learning techniques to remotely quantify Parkinson's disease symptom severity A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig 9 March 2011 Project

More information

Gender Based Emotion Recognition using Speech Signals: A Review

Gender Based Emotion Recognition using Speech Signals: A Review 50 Gender Based Emotion Recognition using Speech Signals: A Review Parvinder Kaur 1, Mandeep Kaur 2 1 Department of Electronics and Communication Engineering, Punjabi University, Patiala, India 2 Department

More information

Research Article Automatic Speaker Recognition for Mobile Forensic Applications

Research Article Automatic Speaker Recognition for Mobile Forensic Applications Hindawi Mobile Information Systems Volume 07, Article ID 698639, 6 pages https://doi.org//07/698639 Research Article Automatic Speaker Recognition for Mobile Forensic Applications Mohammed Algabri, Hassan

More information

Analysis of Speech Recognition Techniques for use in a Non-Speech Sound Recognition System

Analysis of Speech Recognition Techniques for use in a Non-Speech Sound Recognition System Analysis of Recognition Techniques for use in a Sound Recognition System Michael Cowling, Member, IEEE and Renate Sitte, Member, IEEE Griffith University Faculty of Engineering & Information Technology

More information

Advanced Audio Interface for Phonetic Speech. Recognition in a High Noise Environment

Advanced Audio Interface for Phonetic Speech. Recognition in a High Noise Environment DISTRIBUTION STATEMENT A Approved for Public Release Distribution Unlimited Advanced Audio Interface for Phonetic Speech Recognition in a High Noise Environment SBIR 99.1 TOPIC AF99-1Q3 PHASE I SUMMARY

More information

EMOTION DETECTION FROM VOICE BASED CLASSIFIED FRAME-ENERGY SIGNAL USING K-MEANS CLUSTERING

EMOTION DETECTION FROM VOICE BASED CLASSIFIED FRAME-ENERGY SIGNAL USING K-MEANS CLUSTERING EMOTION DETECTION FROM VOICE BASED CLASSIFIED FRAME-ENERGY SIGNAL USING K-MEANS CLUSTERING Nazia Hossain 1, Rifat Jahan 2, and Tanjila Tabasum Tunka 3 1 Senior Lecturer, Department of Computer Science

More information

Comparative Analysis of Vocal Characteristics in Speakers with Depression and High-Risk Suicide

Comparative Analysis of Vocal Characteristics in Speakers with Depression and High-Risk Suicide International Journal of Computer Theory and Engineering, Vol. 7, No. 6, December 205 Comparative Analysis of Vocal Characteristics in Speakers with Depression and High-Risk Suicide Thaweewong Akkaralaertsest

More information

EMOTIONS are one of the most essential components of

EMOTIONS are one of the most essential components of 1 Hidden Markov Model for Emotion Detection in Speech Cyprien de Lichy, Pratyush Havelia, Raunaq Rewari Abstract This paper seeks to classify speech inputs into emotion labels. Emotions are key to effective

More information

UNOBTRUSIVE MONITORING OF SPEECH IMPAIRMENTS OF PARKINSON S DISEASE PATIENTS THROUGH MOBILE DEVICES

UNOBTRUSIVE MONITORING OF SPEECH IMPAIRMENTS OF PARKINSON S DISEASE PATIENTS THROUGH MOBILE DEVICES UNOBTRUSIVE MONITORING OF SPEECH IMPAIRMENTS OF PARKINSON S DISEASE PATIENTS THROUGH MOBILE DEVICES T. Arias-Vergara 1, J.C. Vásquez-Correa 1,2, J.R. Orozco-Arroyave 1,2, P. Klumpp 2, and E. Nöth 2 1 Faculty

More information

A Review on Sleep Apnea Detection from ECG Signal

A Review on Sleep Apnea Detection from ECG Signal A Review on Sleep Apnea Detection from ECG Signal Soumya Gopal 1, Aswathy Devi T. 2 1 M.Tech Signal Processing Student, Department of ECE, LBSITW, Kerala, India 2 Assistant Professor, Department of ECE,

More information

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation Aldebaro Klautau - http://speech.ucsd.edu/aldebaro - 2/3/. Page. Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation ) Introduction Several speech processing algorithms assume the signal

More information

A Review on Dysarthria speech disorder

A Review on Dysarthria speech disorder A Review on Dysarthria speech disorder Miss. Yogita S. Mahadik, Prof. S. U. Deoghare, 1 Student, Dept. of ENTC, Pimpri Chinchwad College of Engg Pune, Maharashtra, India 2 Professor, Dept. of ENTC, Pimpri

More information

Speech (Sound) Processing

Speech (Sound) Processing 7 Speech (Sound) Processing Acoustic Human communication is achieved when thought is transformed through language into speech. The sounds of speech are initiated by activity in the central nervous system,

More information

Visi-Pitch IV is the latest version of the most widely

Visi-Pitch IV is the latest version of the most widely APPLICATIONS Voice Disorders Motor Speech Disorders Voice Typing Fluency Selected Articulation Training Hearing-Impaired Speech Professional Voice Accent Reduction and Second Language Learning Importance

More information

CONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS

CONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS CONSTRUCTING TELEPHONE ACOUSTIC MODELS FROM A HIGH-QUALITY SPEECH CORPUS Mitchel Weintraub and Leonardo Neumeyer SRI International Speech Research and Technology Program Menlo Park, CA, 94025 USA ABSTRACT

More information

COMPARISON BETWEEN GMM-SVM SEQUENCE KERNEL AND GMM: APPLICATION TO SPEECH EMOTION RECOGNITION

COMPARISON BETWEEN GMM-SVM SEQUENCE KERNEL AND GMM: APPLICATION TO SPEECH EMOTION RECOGNITION Journal of Engineering Science and Technology Vol. 11, No. 9 (2016) 1221-1233 School of Engineering, Taylor s University COMPARISON BETWEEN GMM-SVM SEQUENCE KERNEL AND GMM: APPLICATION TO SPEECH EMOTION

More information

CLOSED PHASE ESTIMATION FOR INVERSE FILTERING THE ORAL AIRFLOW WAVEFORM*

CLOSED PHASE ESTIMATION FOR INVERSE FILTERING THE ORAL AIRFLOW WAVEFORM* CLOSED PHASE ESTIMATION FOR INVERSE FILTERING THE ORAL AIRFLOW WAVEFORM* Jón Guðnason 1, Daryush D. Mehta 2,3, Thomas F. Quatieri 3 1 Center for Analysis and Design of Intelligent Agents, Reykyavik University,

More information

TESTS OF ROBUSTNESS OF GMM SPEAKER VERIFICATION IN VoIP TELEPHONY

TESTS OF ROBUSTNESS OF GMM SPEAKER VERIFICATION IN VoIP TELEPHONY ARCHIVES OF ACOUSTICS 32, 4 (Supplement), 187 192 (2007) TESTS OF ROBUSTNESS OF GMM SPEAKER VERIFICATION IN VoIP TELEPHONY Piotr STARONIEWICZ Wrocław University of Technology Institute of Telecommunications,

More information

Heart Murmur Recognition Based on Hidden Markov Model

Heart Murmur Recognition Based on Hidden Markov Model Journal of Signal and Information Processing, 2013, 4, 140-144 http://dx.doi.org/10.4236/jsip.2013.42020 Published Online May 2013 (http://www.scirp.org/journal/jsip) Heart Murmur Recognition Based on

More information

Shaheen N. Awan 1, Nancy Pearl Solomon 2, Leah B. Helou 3, & Alexander Stojadinovic 2

Shaheen N. Awan 1, Nancy Pearl Solomon 2, Leah B. Helou 3, & Alexander Stojadinovic 2 Shaheen N. Awan 1, Nancy Pearl Solomon 2, Leah B. Helou 3, & Alexander Stojadinovic 2 1 Bloomsburg University of Pennsylvania; 2 Walter Reed National Military Medical Center; 3 University of Pittsburgh

More information

Using Source Models in Speech Separation

Using Source Models in Speech Separation Using Source Models in Speech Separation Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation

The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation Martin Graciarena 1, Luciana Ferrer 2, Vikramjit Mitra 1 1 SRI International,

More information

Single-Channel Sound Source Localization Based on Discrimination of Acoustic Transfer Functions

Single-Channel Sound Source Localization Based on Discrimination of Acoustic Transfer Functions 3 Single-Channel Sound Source Localization Based on Discrimination of Acoustic Transfer Functions Ryoichi Takashima, Tetsuya Takiguchi and Yasuo Ariki Graduate School of System Informatics, Kobe University,

More information

Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain

Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain F. Archetti 1,2, G. Arosio 1, E. Fersini 1, E. Messina 1 1 DISCO, Università degli Studi di Milano-Bicocca, Viale Sarca,

More information

Adaptation of Classification Model for Improving Speech Intelligibility in Noise

Adaptation of Classification Model for Improving Speech Intelligibility in Noise 1: (Junyoung Jung et al.: Adaptation of Classification Model for Improving Speech Intelligibility in Noise) (Regular Paper) 23 4, 2018 7 (JBE Vol. 23, No. 4, July 2018) https://doi.org/10.5909/jbe.2018.23.4.511

More information

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES

USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES USING AUDITORY SALIENCY TO UNDERSTAND COMPLEX AUDITORY SCENES Varinthira Duangudom and David V Anderson School of Electrical and Computer Engineering, Georgia Institute of Technology Atlanta, GA 30332

More information

Noise-Robust Speech Recognition Technologies in Mobile Environments

Noise-Robust Speech Recognition Technologies in Mobile Environments Noise-Robust Speech Recognition echnologies in Mobile Environments Mobile environments are highly influenced by ambient noise, which may cause a significant deterioration of speech recognition performance.

More information

ACOUSTIC AND PERCEPTUAL PROPERTIES OF ENGLISH FRICATIVES

ACOUSTIC AND PERCEPTUAL PROPERTIES OF ENGLISH FRICATIVES ISCA Archive ACOUSTIC AND PERCEPTUAL PROPERTIES OF ENGLISH FRICATIVES Allard Jongman 1, Yue Wang 2, and Joan Sereno 1 1 Linguistics Department, University of Kansas, Lawrence, KS 66045 U.S.A. 2 Department

More information

International Journal of Pure and Applied Mathematics

International Journal of Pure and Applied Mathematics International Journal of Pure and Applied Mathematics Volume 118 No. 9 2018, 253-257 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu CLASSIFICATION

More information

Deep Neural Networks for Voice Quality Assessment based on the GRBAS Scale

Deep Neural Networks for Voice Quality Assessment based on the GRBAS Scale INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Deep Neural Networks for Voice Quality Assessment based on the GRBAS Scale Simin Xie 1, 2, Nan Yan 2, Ping Yu 3, Manwa L. Ng 4, Lan Wang 2,Zhuanzhuan

More information

Evaluation of the neurological state of people with Parkinson s disease using i-vectors

Evaluation of the neurological state of people with Parkinson s disease using i-vectors INTERSPEECH 2017 August 20 24, 2017, Stockholm, Sweden Evaluation of the neurological state of people with Parkinson s disease using s N. Garcia 1, J. R. Orozco-Arroyave 1,2, L. F. D Haro 3, Najim Dehak

More information

MACHINE LEARNING BASED APPROACHES FOR PREDICTION OF PARKINSON S DISEASE

MACHINE LEARNING BASED APPROACHES FOR PREDICTION OF PARKINSON S DISEASE Abstract MACHINE LEARNING BASED APPROACHES FOR PREDICTION OF PARKINSON S DISEASE Arvind Kumar Tiwari GGS College of Modern Technology, SAS Nagar, Punjab, India The prediction of Parkinson s disease is

More information

Voice Analysis in Individuals with Chronic Obstructive Pulmonary Disease

Voice Analysis in Individuals with Chronic Obstructive Pulmonary Disease ORIGINAL ARTICLE Voice Analysis in Individuals with Chronic 10.5005/jp-journals-10023-1081 Obstructive Pulmonary Disease Voice Analysis in Individuals with Chronic Obstructive Pulmonary Disease 1 Anuradha

More information

Voice analysis for detecting patients with Parkinson s disease using the hybridization of the best acoustic features

Voice analysis for detecting patients with Parkinson s disease using the hybridization of the best acoustic features International Journal on Electrical Engineering and Informatics - Volume 8, umber, March 206 Voice analysis for detecting patients with Parkinson s disease using the hybridization of the best acoustic

More information

EXPLORATORY ANALYSIS OF SPEECH FEATURES RELATED TO DEPRESSION IN ADULTS WITH APHASIA

EXPLORATORY ANALYSIS OF SPEECH FEATURES RELATED TO DEPRESSION IN ADULTS WITH APHASIA EXPLORATORY ANALYSIS OF SPEECH FEATURES RELATED TO DEPRESSION IN ADULTS WITH APHASIA Stephanie Gillespie 1, Elliot Moore 1, Jacqueline Laures-Gore 2, Matthew Farina 2 1 Georgia Institute of Technology,

More information

Interjudge Reliability in the Measurement of Pitch Matching. A Senior Honors Thesis

Interjudge Reliability in the Measurement of Pitch Matching. A Senior Honors Thesis Interjudge Reliability in the Measurement of Pitch Matching A Senior Honors Thesis Presented in partial fulfillment of the requirements for graduation with distinction in Speech and Hearing Science in

More information

Some Studies on of Raaga Emotions of Singers Using Gaussian Mixture Model

Some Studies on of Raaga Emotions of Singers Using Gaussian Mixture Model International Journal for Modern Trends in Science and Technology Volume: 03, Special Issue No: 01, February 2017 ISSN: 2455-3778 http://www.ijmtst.com Some Studies on of Raaga Emotions of Singers Using

More information

468 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 53, NO. 3, MARCH 2006

468 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 53, NO. 3, MARCH 2006 468 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 53, NO. 3, MARCH 2006 Telephony-Based Voice Pathology Assessment Using Automated Speech Analysis Rosalyn J. Moran*, Richard B. Reilly, Senior Member,

More information

Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise

Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise 4 Special Issue Speech-Based Interfaces in Vehicles Research Report Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise Hiroyuki Hoshino Abstract This

More information

Identification and Validation of Repetitions/Prolongations in Stuttering Speech using Epoch Features

Identification and Validation of Repetitions/Prolongations in Stuttering Speech using Epoch Features Identification and Validation of Repetitions/Prolongations in Stuttering Speech using Epoch Features G. Manjula Research Scholar, Department of Electronics and Communication Engineering, Associate Professor,

More information

Combination of Bone-Conducted Speech with Air-Conducted Speech Changing Cut-Off Frequency

Combination of Bone-Conducted Speech with Air-Conducted Speech Changing Cut-Off Frequency Combination of Bone-Conducted Speech with Air-Conducted Speech Changing Cut-Off Frequency Tetsuya Shimamura and Fumiya Kato Graduate School of Science and Engineering Saitama University 255 Shimo-Okubo,

More information

Computational Perception /785. Auditory Scene Analysis

Computational Perception /785. Auditory Scene Analysis Computational Perception 15-485/785 Auditory Scene Analysis A framework for auditory scene analysis Auditory scene analysis involves low and high level cues Low level acoustic cues are often result in

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Speech Communication Session 4aSCb: Voice and F0 Across Tasks (Poster

More information

Critical Review: Are laryngeal manual therapies effective in improving voice outcomes of patients with muscle tension dysphonia?

Critical Review: Are laryngeal manual therapies effective in improving voice outcomes of patients with muscle tension dysphonia? Critical Review: Are laryngeal manual therapies effective in improving voice outcomes of patients with muscle tension dysphonia? María López M.Cl.Sc (SLP) Candidate University of Western Ontario: School

More information

Correlating Speech and Voice Features of Transgender Women with Ratings of Femininity and Gender

Correlating Speech and Voice Features of Transgender Women with Ratings of Femininity and Gender University of Rhode Island DigitalCommons@URI Open Access Master's Theses 2018 Correlating Speech and Voice Features of Transgender Women with Ratings of Femininity and Gender Kimberly Dahl University

More information

Mammogram Analysis: Tumor Classification

Mammogram Analysis: Tumor Classification Mammogram Analysis: Tumor Classification Literature Survey Report Geethapriya Raghavan geeragh@mail.utexas.edu EE 381K - Multidimensional Digital Signal Processing Spring 2005 Abstract Breast cancer is

More information

2013), URL

2013), URL Title Acoustical analysis of voices produced by Cantonese patients of unilateral vocal fold paralysis: acoustical analysis of voices by Cantonese UVFP Author(s) Yan, N; Wang, L; Ng, ML Citation The 2013

More information

Sound Texture Classification Using Statistics from an Auditory Model

Sound Texture Classification Using Statistics from an Auditory Model Sound Texture Classification Using Statistics from an Auditory Model Gabriele Carotti-Sha Evan Penn Daniel Villamizar Electrical Engineering Email: gcarotti@stanford.edu Mangement Science & Engineering

More information

EMOTION CLASSIFICATION: HOW DOES AN AUTOMATED SYSTEM COMPARE TO NAÏVE HUMAN CODERS?

EMOTION CLASSIFICATION: HOW DOES AN AUTOMATED SYSTEM COMPARE TO NAÏVE HUMAN CODERS? EMOTION CLASSIFICATION: HOW DOES AN AUTOMATED SYSTEM COMPARE TO NAÏVE HUMAN CODERS? Sefik Emre Eskimez, Kenneth Imade, Na Yang, Melissa Sturge- Apple, Zhiyao Duan, Wendi Heinzelman University of Rochester,

More information

2012, Greenwood, L.

2012, Greenwood, L. Critical Review: How Accurate are Voice Accumulators for Measuring Vocal Behaviour? Lauren Greenwood M.Cl.Sc. (SLP) Candidate University of Western Ontario: School of Communication Sciences and Disorders

More information

NMF-Density: NMF-Based Breast Density Classifier

NMF-Density: NMF-Based Breast Density Classifier NMF-Density: NMF-Based Breast Density Classifier Lahouari Ghouti and Abdullah H. Owaidh King Fahd University of Petroleum and Minerals - Department of Information and Computer Science. KFUPM Box 1128.

More information

Australian Journal of Basic and Applied Sciences

Australian Journal of Basic and Applied Sciences ISSN:1991-8178 Australian Journal of Basic and Applied Sciences Journal home page: www.ajbasweb.com Improved Accuracy of Breast Cancer Detection in Digital Mammograms using Wavelet Analysis and Artificial

More information

Acoustic Analysis Before and After Voice Therapy for Laryngeal Pathology.

Acoustic Analysis Before and After Voice Therapy for Laryngeal Pathology. Acoustic Analysis Before and After Voice Therapy for Laryngeal Pathology. Chhetri SS, Gautam R ABSTRACT Background Department of ENT-HNS Kathmandu Medical College and Teaching Hospital Sinamangal, Kathmandu,

More information

A Consumer-friendly Recap of the HLAA 2018 Research Symposium: Listening in Noise Webinar

A Consumer-friendly Recap of the HLAA 2018 Research Symposium: Listening in Noise Webinar A Consumer-friendly Recap of the HLAA 2018 Research Symposium: Listening in Noise Webinar Perry C. Hanavan, AuD Augustana University Sioux Falls, SD August 15, 2018 Listening in Noise Cocktail Party Problem

More information

Exploiting visual information for NAM recognition

Exploiting visual information for NAM recognition Exploiting visual information for NAM recognition Panikos Heracleous, Denis Beautemps, Viet-Anh Tran, Hélène Loevenbruck, Gérard Bailly To cite this version: Panikos Heracleous, Denis Beautemps, Viet-Anh

More information

A REVIEW ON CLASSIFICATION OF BREAST CANCER DETECTION USING COMBINATION OF THE FEATURE EXTRACTION MODELS. Aeronautical Engineering. Hyderabad. India.

A REVIEW ON CLASSIFICATION OF BREAST CANCER DETECTION USING COMBINATION OF THE FEATURE EXTRACTION MODELS. Aeronautical Engineering. Hyderabad. India. Volume 116 No. 21 2017, 203-208 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu A REVIEW ON CLASSIFICATION OF BREAST CANCER DETECTION USING COMBINATION OF

More information

Discrete Signal Processing

Discrete Signal Processing 1 Discrete Signal Processing C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University http://www.cs.nctu.edu.tw/~cmliu/courses/dsp/ ( Office: EC538 (03)5731877 cmliu@cs.nctu.edu.tw

More information

Speech and Sound Use in a Remote Monitoring System for Health Care

Speech and Sound Use in a Remote Monitoring System for Health Care Speech and Sound Use in a Remote System for Health Care M. Vacher J.-F. Serignat S. Chaillol D. Istrate V. Popescu CLIPS-IMAG, Team GEOD Joseph Fourier University of Grenoble - CNRS (France) Text, Speech

More information

2/25/2013. Context Effect on Suprasegmental Cues. Supresegmental Cues. Pitch Contour Identification (PCI) Context Effect with Cochlear Implants

2/25/2013. Context Effect on Suprasegmental Cues. Supresegmental Cues. Pitch Contour Identification (PCI) Context Effect with Cochlear Implants Context Effect on Segmental and Supresegmental Cues Preceding context has been found to affect phoneme recognition Stop consonant recognition (Mann, 1980) A continuum from /da/ to /ga/ was preceded by

More information

be investigated online at the following web site: Synthesis of Pathological Voices Using the Klatt Synthesiser. Speech

be investigated online at the following web site: Synthesis of Pathological Voices Using the Klatt Synthesiser. Speech References [1] Antonanzas, N. The inverse filter program developed by Norma Antonanzas can be investigated online at the following web site: www.surgery.medsch.ucla.edu/glottalaffairs/software_of_the_boga.htm

More information

Vital Responder: Real-time Health Monitoring of First- Responders

Vital Responder: Real-time Health Monitoring of First- Responders Vital Responder: Real-time Health Monitoring of First- Responders Ye Can 1,2 Advisors: Miguel Tavares Coimbra 2, Vijayakumar Bhagavatula 1 1 Department of Electrical & Computer Engineering, Carnegie Mellon

More information

Mammogram Analysis: Tumor Classification

Mammogram Analysis: Tumor Classification Mammogram Analysis: Tumor Classification Term Project Report Geethapriya Raghavan geeragh@mail.utexas.edu EE 381K - Multidimensional Digital Signal Processing Spring 2005 Abstract Breast cancer is the

More information

Enhanced Feature Extraction for Speech Detection in Media Audio

Enhanced Feature Extraction for Speech Detection in Media Audio INTERSPEECH 2017 August 20 24, 2017, Stockholm, Sweden Enhanced Feature Extraction for Speech Detection in Media Audio Inseon Jang 1, ChungHyun Ahn 1, Jeongil Seo 1, Younseon Jang 2 1 Media Research Division,

More information

Improving the Speech Intelligibility By Cochlear Implant Users

Improving the Speech Intelligibility By Cochlear Implant Users University of Wisconsin Milwaukee UWM Digital Commons Theses and Dissertations December 2016 Improving the Speech Intelligibility By Cochlear Implant Users Behnam Azimi University of Wisconsin-Milwaukee

More information

Timing Patterns of Speech as Potential Indicators of Near-Term Suicidal Risk

Timing Patterns of Speech as Potential Indicators of Near-Term Suicidal Risk International Journal of Multidisciplinary and Current Research Research Article ISSN: 2321-3124 Available at: http://ijmcr.com Timing Patterns of Speech as Potential Indicators of Near-Term Suicidal Risk

More information

FIR filter bank design for Audiogram Matching

FIR filter bank design for Audiogram Matching FIR filter bank design for Audiogram Matching Shobhit Kumar Nema, Mr. Amit Pathak,Professor M.Tech, Digital communication,srist,jabalpur,india, shobhit.nema@gmail.com Dept.of Electronics & communication,srist,jabalpur,india,

More information

Burst-based Features for the Classification of Pathological Voices

Burst-based Features for the Classification of Pathological Voices INTERSPEECH 2013 Burst-based Features for the Classification of Pathological Voices Julie Mauclair Lionel Koenig Marina Robert, Peggy Gatignol Université Paris Descartes, 45 rue des Saints Pères 75006

More information

INTELLIGENT LIP READING SYSTEM FOR HEARING AND VOCAL IMPAIRMENT

INTELLIGENT LIP READING SYSTEM FOR HEARING AND VOCAL IMPAIRMENT INTELLIGENT LIP READING SYSTEM FOR HEARING AND VOCAL IMPAIRMENT R.Nishitha 1, Dr K.Srinivasan 2, Dr V.Rukkumani 3 1 Student, 2 Professor and Head, 3 Associate Professor, Electronics and Instrumentation

More information

Overview. Acoustics of Speech and Hearing. Source-Filter Model. Source-Filter Model. Turbulence Take 2. Turbulence

Overview. Acoustics of Speech and Hearing. Source-Filter Model. Source-Filter Model. Turbulence Take 2. Turbulence Overview Acoustics of Speech and Hearing Lecture 2-4 Fricatives Source-filter model reminder Sources of turbulence Shaping of source spectrum by vocal tract Acoustic-phonetic characteristics of English

More information

Reviewing the connection between speech and obstructive sleep apnea

Reviewing the connection between speech and obstructive sleep apnea DOI 10.1186/s12938-016-0138-5 BioMedical Engineering OnLine RESEARCH Open Access Reviewing the connection between speech and obstructive sleep apnea Fernando Espinoza Cuadros 1*, Rubén Fernández Pozo 1,

More information

Research Proposal on Emotion Recognition

Research Proposal on Emotion Recognition Research Proposal on Emotion Recognition Colin Grubb June 3, 2012 Abstract In this paper I will introduce my thesis question: To what extent can emotion recognition be improved by combining audio and visual

More information

Effects of a New Voicing Parameter on Pathological Voice Discrimination by SVM

Effects of a New Voicing Parameter on Pathological Voice Discrimination by SVM Effects of a New Voicing Parameter on Pathological Voice Discrimination by SVM Asma Belhaj *, Aicha Bouzid, Noureddine Ellouze Laboratory of Signal, Images and Information Technology, Tunis El Manar University,

More information

Parkinson s Disease Diagnosis by k-nearest Neighbor Soft Computing Model using Voice Features

Parkinson s Disease Diagnosis by k-nearest Neighbor Soft Computing Model using Voice Features Parkinson s Disease Diagnosis by k-nearest Neighbor Soft Computing Model using Voice Features Chandra Prakash Rathore 1, Rekh Ram Janghel 2 1 Department of Information Technology, Dr. C. V. Raman University,

More information

Comparison of Lip Image Feature Extraction Methods for Improvement of Isolated Word Recognition Rate

Comparison of Lip Image Feature Extraction Methods for Improvement of Isolated Word Recognition Rate , pp.57-61 http://dx.doi.org/10.14257/astl.2015.107.14 Comparison of Lip Image Feature Extraction Methods for Improvement of Isolated Word Recognition Rate Yong-Ki Kim 1, Jong Gwan Lim 2, Mi-Hye Kim *3

More information

LATERAL INHIBITION MECHANISM IN COMPUTATIONAL AUDITORY MODEL AND IT'S APPLICATION IN ROBUST SPEECH RECOGNITION

LATERAL INHIBITION MECHANISM IN COMPUTATIONAL AUDITORY MODEL AND IT'S APPLICATION IN ROBUST SPEECH RECOGNITION LATERAL INHIBITION MECHANISM IN COMPUTATIONAL AUDITORY MODEL AND IT'S APPLICATION IN ROBUST SPEECH RECOGNITION Lu Xugang Li Gang Wang Lip0 Nanyang Technological University, School of EEE, Workstation Resource

More information

Ambiguity in the recognition of phonetic vowels when using a bone conduction microphone

Ambiguity in the recognition of phonetic vowels when using a bone conduction microphone Acoustics 8 Paris Ambiguity in the recognition of phonetic vowels when using a bone conduction microphone V. Zimpfer a and K. Buck b a ISL, 5 rue du Général Cassagnou BP 734, 6831 Saint Louis, France b

More information

BFI-Based Speaker Personality Perception Using Acoustic-Prosodic Features

BFI-Based Speaker Personality Perception Using Acoustic-Prosodic Features BFI-Based Speaker Personality Perception Using Acoustic-Prosodic Features Chia-Jui Liu, Chung-Hsien Wu, Yu-Hsien Chiu* Department of Computer Science and Information Engineering, National Cheng Kung University,

More information

VIRTUAL ASSISTANT FOR DEAF AND DUMB

VIRTUAL ASSISTANT FOR DEAF AND DUMB VIRTUAL ASSISTANT FOR DEAF AND DUMB Sumanti Saini, Shivani Chaudhry, Sapna, 1,2,3 Final year student, CSE, IMSEC Ghaziabad, U.P., India ABSTRACT For the Deaf and Dumb community, the use of Information

More information

Assessment of Reliability of Hamilton-Tompkins Algorithm to ECG Parameter Detection

Assessment of Reliability of Hamilton-Tompkins Algorithm to ECG Parameter Detection Proceedings of the 2012 International Conference on Industrial Engineering and Operations Management Istanbul, Turkey, July 3 6, 2012 Assessment of Reliability of Hamilton-Tompkins Algorithm to ECG Parameter

More information

Parkinson s Disease : A Case Study

Parkinson s Disease : A Case Study Parkinson s Disease : A Case Study Ms. Ahilya V.Salunkhe 1, Prof. Ms. Pallavi S.Deshpande 2 1,2 Department of Electronics & Telecommunication Engineering,Smt. Kashibai Navale College of Engineering, Vadgaon

More information

Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information

Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information C. Busso, Z. Deng, S. Yildirim, M. Bulut, C. M. Lee, A. Kazemzadeh, S. Lee, U. Neumann, S. Narayanan Emotion

More information

Speaker Independent Isolated Word Speech to Text Conversion Using Auto Spectral Subtraction for Punjabi Language

Speaker Independent Isolated Word Speech to Text Conversion Using Auto Spectral Subtraction for Punjabi Language International Journal of Scientific and Research Publications, Volume 7, Issue 7, July 2017 469 Speaker Independent Isolated Word Speech to Text Conversion Using Auto Spectral Subtraction for Punjabi Language

More information

PATTERN ELEMENT HEARING AIDS AND SPEECH ASSESSMENT AND TRAINING Adrian FOURCIN and Evelyn ABBERTON

PATTERN ELEMENT HEARING AIDS AND SPEECH ASSESSMENT AND TRAINING Adrian FOURCIN and Evelyn ABBERTON PATTERN ELEMENT HEARING AIDS AND SPEECH ASSESSMENT AND TRAINING Adrian FOURCIN and Evelyn ABBERTON Summary This paper has been prepared for a meeting (in Beijing 9-10 IX 1996) organised jointly by the

More information

Divide-and-Conquer based Ensemble to Spot Emotions in Speech using MFCC and Random Forest

Divide-and-Conquer based Ensemble to Spot Emotions in Speech using MFCC and Random Forest Published as conference paper in The 2nd International Integrated Conference & Concert on Convergence (2016) Divide-and-Conquer based Ensemble to Spot Emotions in Speech using MFCC and Random Forest Abdul

More information

Polly Lau Voice Research Lab, Department of Speech and Hearing Sciences, The University of Hong Kong

Polly Lau Voice Research Lab, Department of Speech and Hearing Sciences, The University of Hong Kong Perception of synthesized voice quality in connected speech by Cantonese speakers Edwin M-L. Yiu a) Voice Research Laboratory, Department of Speech and Hearing Sciences, The University of Hong Kong, 5/F

More information

A Judgment of Intoxication using Hybrid Analysis with Pitch Contour Compare (HAPCC) in Speech Signal Processing

A Judgment of Intoxication using Hybrid Analysis with Pitch Contour Compare (HAPCC) in Speech Signal Processing A Judgment of Intoxication using Hybrid Analysis with Pitch Contour Compare (HAPCC) in Speech Signal Processing Seong-Geon Bae #1 1 Professor, School of Software Application, Kangnam University, Gyunggido,

More information

Closing and Opening Phase Variability in Dysphonia Adrian Fourcin(1) and Martin Ptok(2)

Closing and Opening Phase Variability in Dysphonia Adrian Fourcin(1) and Martin Ptok(2) Closing and Opening Phase Variability in Dysphonia Adrian Fourcin() and Martin Ptok() () University College London 4 Stephenson Way, LONDON NW PE, UK () Medizinische Hochschule Hannover C.-Neuberg-Str.

More information

CONTENT BASED CLINICAL DEPRESSION DETECTION IN ADOLESCENTS

CONTENT BASED CLINICAL DEPRESSION DETECTION IN ADOLESCENTS 7th European Signal Processing Conference (EUSIPCO 9) Glasgow, Scotland, August 4-8, 9 CONTENT BASED CLINICAL DEPRESSION DETECTION IN ADOLESCENTS Lu-Shih Alex Low, Namunu C. Maddage, Margaret Lech, Lisa

More information

What you re in for. Who are cochlear implants for? The bottom line. Speech processing schemes for

What you re in for. Who are cochlear implants for? The bottom line. Speech processing schemes for What you re in for Speech processing schemes for cochlear implants Stuart Rosen Professor of Speech and Hearing Science Speech, Hearing and Phonetic Sciences Division of Psychology & Language Sciences

More information

VERY EARLY DETECTION OF AUTISM SPECTRUM DISORDERS BASED ON ACOUSTIC ANALYSIS OF PRE-VERBAL VOCALIZATIONS OF 18-MONTH OLD TODDLERS

VERY EARLY DETECTION OF AUTISM SPECTRUM DISORDERS BASED ON ACOUSTIC ANALYSIS OF PRE-VERBAL VOCALIZATIONS OF 18-MONTH OLD TODDLERS VERY EARLY DETECTION OF AUTISM SPECTRUM DISORDERS BASED ON ACOUSTIC ANALYSIS OF PRE-VERBAL VOCALIZATIONS OF 18-MONTH OLD TODDLERS João F. Santos 1, Nirit Brosh 2, Tiago H. Falk 1, Lonnie Zwaigenbaum 3,

More information

Differences of the Voice Parameters Between the Population of Different Hearing Tresholds: Findings by Using the Multi-Dimensional Voice Program

Differences of the Voice Parameters Between the Population of Different Hearing Tresholds: Findings by Using the Multi-Dimensional Voice Program Original Article Clinical and Experimental Otorhinolaryngology Vol. 10, No. 3: 278-282, September 2017 http://dx.doi.org/10.21053/ceo.2015.01900 pissn 1976-8710 eissn 2005-0720 Differences of the Voice

More information

Acoustic Signal Processing Based on Deep Neural Networks

Acoustic Signal Processing Based on Deep Neural Networks Acoustic Signal Processing Based on Deep Neural Networks Chin-Hui Lee School of ECE, Georgia Tech chl@ece.gatech.edu Joint work with Yong Xu, Yanhui Tu, Qing Wang, Tian Gao, Jun Du, LiRong Dai Outline

More information

11 Music and Speech Perception

11 Music and Speech Perception 11 Music and Speech Perception Properties of sound Sound has three basic dimensions: Frequency (pitch) Intensity (loudness) Time (length) Properties of sound The frequency of a sound wave, measured in

More information