Mark John Francis Gales
Thesis, Book Chapters and Magazines

1. S. Furui, L. Deng, M. Gales, H. Ney, and K. Tokuda, Fundamental technologies in modern speech recognition [from the guest editors], IEEE Signal Processing Magazine, vol. 29, pp. , November.
2. M. J. F. Gales, S. Watanabe, and E. Fosler-Lussier, Structured discriminative models for speech recognition: An overview, IEEE Signal Processing Magazine, vol. 29, pp. , November.
3. S. Matsoukas, T. Ng, L. Nguyen, F. Diehl, M. J. F. Gales, X. Liu, P. C. Woodland, J.-L. Gauvain, L. Lamel, A. Messaoudi, et al., Handbook of natural language processing and machine translation. DARPA Global Autonomous Language Exploitation, ch. Optimizing Speech-To-Text System Combination for Machine Translation. Springer.
4. M. Tomalin, W. Byrne, A. de Gispert, M. Gales, X. Liu, P. Woodland, et al., Handbook of natural language processing and machine translation. DARPA Global Autonomous Language Exploitation, ch. Improving Speech Transcription for Machine Translation. Springer.
5. M. J. F. Gales, Robust Speech Recognition of Uncertain or Missing Data, ch. Model-Based Approaches to Handling Uncertainty. Springer.
6. M. J. F. Gales, Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods, ch. Augmented Statistical Models: Using Dynamic Kernels for Acoustic Models.
7. M. J. F. Gales and S. J. Young, The application of hidden Markov models in speech recognition, Foundations and Trends in Signal Processing, vol. 1, no. 3, pp. .
8. S. J. Young, G. Evermann, M. J. F. Gales, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.4). Cambridge University Engineering Department, December.
9. M. J. F. Gales, Model-Based Techniques for Noise Robust Speech Recognition. PhD thesis, Cambridge University Engineering Department, September.
Journal Papers and Letters

1. Y. Wang, M. J. F. Gales, K. M. Knill, K. Kyriakopoulos, A. Malinin, R. C. van Dalen, and M. Rashid, Towards automatic assessment of spontaneous spoken English, Speech Communication, vol. 104, pp. .
2. G. Degottex, P. Lanchantin, and M. J. F. Gales, A log domain pulse model for parametric speech synthesis, IEEE/ACM Trans. Audio, Speech & Language Processing, vol. 26, no. 1, pp. .
3. C. Wu, M. J. F. Gales, A. Ragni, P. Karanasou, and K. C. Sim, Improving interpretability and regularization in deep learning, IEEE/ACM Trans. Audio, Speech & Language Processing, vol. 26, no. 2, pp. .
4. P. Karanasou, C. Wu, M. J. F. Gales, and P. C. Woodland, I-vectors and structured neural networks for rapid adaptation of acoustic models, IEEE/ACM Trans. Audio, Speech & Language Processing, vol. 25, no. 4, pp. .
5. X. Liu, X. Chen, Y. Wang, M. J. F. Gales, and P. C. Woodland, Two efficient lattice rescoring methods using recurrent neural network language models, IEEE/ACM Trans. Audio, Speech & Language Processing, vol. 24, no. 8, pp. .
6. X. Chen, X. Liu, Y. Wang, M. J. F. Gales, and P. C. Woodland, Efficient training and evaluation of recurrent neural network language models for automatic speech recognition, IEEE/ACM Trans. Audio, Speech & Language Processing, vol. 24, no. 11, pp. .
7. T. Yoshioka and M. J. F. Gales, Environmentally robust ASR front-end for deep neural network acoustic models, Computer Speech & Language, vol. 31, no. 1, pp. .
8. L. Chen, N. Braunschweiler, and M. J. F. Gales, Speaker and expression factorization for audiobook data: Expressiveness and transplantation, IEEE/ACM Trans. Audio, Speech & Language Processing, vol. 23, no. 4, pp. .
9. X. Liu, M. J. F. Gales, and P. C. Woodland, Paraphrastic language models, Computer Speech & Language, vol. 28, no. 6, pp. .
10. V. Wan, J. Latorre, K. Yanagisawa, N. Braunschweiler, L. Chen, M. J. F. Gales, and M. Akamine, Building HMM-TTS voices on diverse data, J. Sel. Topics Signal Processing, vol. 8, no. 2, pp. .
11. L. Chen, M. J. F. Gales, N. Braunschweiler, M. Akamine, and K. Knill, Integrated expression prediction and speech synthesis from text, J. Sel. Topics Signal Processing, vol. 8, no. 2, pp. .
12. X. Liu, M. J. F. Gales, and P. C. Woodland, Use of contexts in language model interpolation and adaptation, Computer Speech & Language, vol. 27, no. 1, pp. .
13. R. C. van Dalen and M. J. F. Gales, Importance sampling to compute likelihoods of noise-corrupted speech, Computer Speech & Language, vol. 27, no. 1, pp. .
14. X. Liu, M. J. F. Gales, and P. C. Woodland, Language model cross adaptation for LVCSR system combination, Computer Speech & Language, vol. 27, no. 4, pp. .
15. R. Maia, M. Akamine, and M. J. F. Gales, Complex cepstrum for statistical parametric speech synthesis, Speech Communication, vol. 55, no. 5, pp. .
16. S. Zhang and M. J. F. Gales, Structured SVMs for automatic speech recognition, IEEE Trans. Audio, Speech & Language Processing, vol. 21, no. 3, pp. .
17. F. Diehl, M. J. F. Gales, M. Tomalin, and P. C. Woodland, Morphological decomposition in Arabic ASR systems, Computer Speech & Language, vol. 26, no. 4, pp. .
18. H. Zen, M. J. F. Gales, Y. Nankaku, and K. Tokuda, Product of experts for statistical parametric speech synthesis, IEEE Trans. Audio, Speech & Language Processing, vol. 20, no. 3, pp. .
19. H. Zen, N. Braunschweiler, S. Buchholz, M. J. F. Gales, K. Knill, S. Krstulovic, and J. Latorre, Statistical parametric speech synthesis based on speaker and language factorization, IEEE Trans. Audio, Speech & Language Processing, vol. 20, no. 6, pp. .
20. Y. Wang and M. J. F. Gales, Speaker and noise factorization for robust speech recognition, IEEE Trans. Audio, Speech & Language Processing, vol. 20, no. 7, pp. .
21. J. Park, F. Diehl, M. J. F. Gales, M. Tomalin, and P. C. Woodland, The efficient incorporation of MLP features into automatic speech recognition systems, Computer Speech & Language, vol. 25, no. 3, pp. .
22. Z. Roupakia and M. J. F. Gales, Kernel eigenvoices (revisited) for large-vocabulary speech recognition, IEEE Signal Process. Lett., vol. 18, no. 12, pp. .
23. D. K. Kim and M. J. F. Gales, Noisy constrained maximum-likelihood linear regression for noise-robust speech recognition, IEEE Trans. Audio, Speech & Language Processing, vol. 19, no. 2, pp. .
24. R. C. van Dalen and M. J. F. Gales, Extended VTS for noise-robust speech recognition, IEEE Trans. Audio, Speech & Language Processing, vol. 19, no. 4, pp. .
25. H. Xu, M. J. F. Gales, and K. K. Chin, Joint uncertainty decoding with predictive methods for noise robust speech recognition, IEEE Trans. Audio, Speech & Language Processing, vol. 19, no. 6, pp. .
26. M. J. F. Gales and F. Flego, Discriminative classifiers with adaptive kernels for noise robust speech recognition, Computer Speech & Language, vol. 24, no. 4, pp. .
27. K. Yu, M. J. F. Gales, L. Wang, and P. C. Woodland, Unsupervised training and directed manual transcription for LVCSR, Speech Communication, vol. 52, no. 7-8, pp. .
28. S. Zhang, A. Ragni, and M. J. F. Gales, Structured log linear models for noise robust speech recognition, IEEE Signal Process. Lett., vol. 17, no. 11, pp. .
29. C. Breslin and M. J. F. Gales, Directed decision trees for generating complementary systems, Speech Communication, vol. 51, no. 3, pp. .
30. K. Yu, M. J. F. Gales, and P. C. Woodland, Unsupervised adaptation with discriminative mapping transforms, IEEE Trans. Audio, Speech & Language Processing, vol. 17, no. 4, pp. .
31. C. Longworth and M. J. F. Gales, Combining derivative and parametric kernels for speaker verification, IEEE Trans. Audio, Speech & Language Processing, vol. 17, no. 4, pp. .
32. H. Liao and M. J. F. Gales, Issues with uncertainty decoding for noise robust automatic speech recognition, Speech Communication, vol. 50, no. 4, pp. .
33. K. C. Sim and M. J. F. Gales, Discriminative semi-parametric trajectory model for speech recognition, Computer Speech & Language, vol. 21, no. 4, pp. .
34. X. Liu and M. J. F. Gales, Automatic model complexity control using marginalized discriminative growth functions, IEEE Trans. Audio, Speech & Language Processing, vol. 15, no. 4, pp. .
35. K. Yu and M. J. F. Gales, Bayesian adaptive inference and adaptive training, IEEE Trans. Audio, Speech & Language Processing, vol. 15, no. 6, pp. .
36. M. I. Layton and M. J. F. Gales, Acoustic modelling using continuous rational kernels, VLSI Signal Processing, vol. 48, no. 1-2, pp. .
37. M. J. F. Gales and S. S. Airey, Product of Gaussians for speech recognition, Computer Speech & Language, vol. 20, no. 1, pp. .
38. M. J. F. Gales and M. I. Layton, Training augmented models using SVMs, IEICE Transactions, vol. 89-D, no. 3, pp. .
39. K. C. Sim and M. J. F. Gales, Minimum phone error training of precision matrix models, IEEE Trans. Audio, Speech & Language Processing, vol. 14, no. 3, pp. .
40. M. J. F. Gales, D. Y. Kim, P. C. Woodland, H. Y. Chan, D. Mrva, R. Sinha, and S. E. Tranter, Progress in the CU-HTK broadcast news transcription system, IEEE Trans. Audio, Speech & Language Processing, vol. 14, no. 5, pp. .
41. K. Yu and M. J. F. Gales, Discriminative cluster adaptive training, IEEE Trans. Audio, Speech & Language Processing, vol. 14, no. 5, pp. .
42. T. Hain, P. C. Woodland, G. Evermann, M. J. F. Gales, X. Liu, G. L. Moore, D. Povey, and L. Wang, Automatic transcription of conversational telephone speech, IEEE Trans. Speech and Audio Processing, vol. 13, no. 6, pp. .
43. A. I. Rosti and M. J. F. Gales, Factor analysed hidden Markov models for speech recognition, Computer Speech & Language, vol. 18, no. 2, pp. .
44. M. J. F. Gales, Transformation streams and the HMM error model, Computer Speech & Language, vol. 16, no. 2, pp. .
45. S. S. Chen, E. Eide, M. J. F. Gales, R. A. Gopinath, D. Kanvesky, and P. A. Olsen, Automatic transcription of broadcast news, Speech Communication, vol. 37, no. 1-2, pp. .
46. M. J. F. Gales, Maximum likelihood multiple subspace projections for hidden Markov models, IEEE Trans. Speech and Audio Processing, vol. 10, no. 2, pp. .
47. M. J. F. Gales, Cluster adaptive training of hidden Markov models, IEEE Trans. Speech and Audio Processing, vol. 8, no. 4, pp. .
48. M. J. F. Gales, K. Knill, and S. J. Young, State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs, IEEE Trans. Speech and Audio Processing, vol. 7, no. 2, pp. .
49. M. J. F. Gales, Semi-tied covariance matrices for hidden Markov models, IEEE Trans. Speech and Audio Processing, vol. 7, no. 3, pp. .
50. M. J. F. Gales, Maximum likelihood linear transformations for HMM-based speech recognition, Computer Speech & Language, vol. 12, no. 2, pp. .
51. M. J. F. Gales, Predictive model-based compensation schemes for robust speech recognition, Speech Communication, vol. 25, no. 1-3, pp. .
52. M. J. F. Gales and P. C. Woodland, Mean and variance adaptation within the MLLR framework, Computer Speech & Language, vol. 10, no. 4, pp. .
53. M. J. F. Gales and S. J. Young, Robust continuous speech recognition using parallel model combination, IEEE Trans. Speech and Audio Processing, vol. 4, no. 5, pp. .
54. M. J. F. Gales and S. J. Young, Robust speech recognition in additive and convolutional noise using parallel model combination, Computer Speech & Language, vol. 9, no. 4, pp. .
55. M. J. F. Gales and S. J. Young, Cepstral parameter compensation for HMM recognition in noise, Speech Communication, vol. 12, no. 3, pp. .
5 Conference and Workshop Papers 1. Y. Wang, X. Chen, M. J. F. Gales, A. Ragni, and J. H. M. Wong, Phonetic and graphemic systems for multi-genre broadcast transcription, in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, Calgary, AB, Canada, April 15-20, 2018, pp , Y. Wang, C. Zhang, M. J. F. Gales, and P. C. Woodland, Speaker adaptation and adaptive training for jointly optimised tandem systems, in Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018., pp , M. Wan, G. Degottex, and M. J. F. Gales, Waveform-based speaker representations for speech synthesis, in Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018., pp , K. Kyriakopoulos, K. Knill, and M. J. F. Gales, A deep learning approach to assessing non-native pronunciation of English using phone distances, in Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018., pp , K. Knill, M. J. F. Gales, K. Kyriakopoulos, A. Malinin, A. Ragni, Y. Wang, and A. Caines, Impact of ASR performance on free speaking language assessment, in Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018., pp , A. Ragni and M. J. F. Gales, Automatic speech recognition system development in the wild, in Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018., pp , O. Chen, A. Ragni, M. J. F. Gales, and X. Chen, Active memory networks for language modeling, in Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018., pp , A. Malinin and M. J. F. 
Gales, Predictive uncertainty estimation via prior networks, in Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada., pp , A. Malinin, A. Ragni, K. Knill, and M. J. F. Gales, Incorporating uncertainty into deep learning for spoken language assessment, in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 2: Short Papers, pp , J. H. M. Wong and M. J. F. Gales, Multi-task ensembles with teacher-student training, in 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp , X. Chen, X. Liu, A. Ragni, Y. Wang, and M. J. F. Gales, Future word contexts in neural network language models, in 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp , A. Malinin, K. Knill, and M. J. F. Gales, A hierarchical attention based model for off-topic spontaneous spoken response detection, in 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp , M. Wan, G. Degottex, and M. J. F. Gales, Integrated speaker-adaptive speech synthesis, in 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp , A. Ragni, C. Wu, M. J. F. Gales, J. Vasilakes, and K. M. Knill, Stimulated training for automatic speech recognition and keyword search in limited resource conditions, in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp , A. Ragni, D. Saunders, P. Zahemszky, J. Vasilakes, M. J. F. Gales, and K. M. 
Knill, Morph-toword transduction for accurate and efficient automatic speech recognition and keyword search, in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp ,
6 16. X. Chen, A. Ragni, J. Vasilakes, X. Liu, K. Knill, and M. J. F. Gales, Recurrent neural network language models for keyword search, in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp , J. H. M. Wong and M. J. F. Gales, Student-teacher training with diverse decision tree ensembles, in Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp , X. Chen, A. Ragni, X. Liu, and M. J. F. Gales, Investigating bidirectional recurrent neural network language models for speech recognition, in Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp , C. Wu and M. J. F. Gales, Deep activation mixture model for speech recognition, in Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp , K. M. Knill, M. J. F. Gales, K. Kyriakopoulos, A. Ragni, and Y. Wang, Use of graphemic lexicons for spoken language assessment, in Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp , K. Kyriakopoulos, M. J. F. Gales, and K. Knill, Automatic characterisation of the pronunciation of non-native English speakers using phone distance features, in 7th ISCA International Workshop on Speech and Language Technology in Education, SLaTE 2017, Stockholm, Sweden, August 2017, pp , A. Malinin, K. Knill, A. Ragni, Y. Wang, and M. J. F. Gales, An attention based model for offtopic spontaneous spoken response detection: An initial study, in 7th ISCA International Workshop on Speech and Language Technology in Education, SLaTE 2017, Stockholm, Sweden, August 2017, pp , M. J. F. Gales, K. M. Knill, and A. 
Ragni, Low-resource speech recognition and keyword-spotting, in Speech and Computer - 19th International Conference, SPECOM 2017, Hatfield, UK, September 12-16, 2017, Proceedings, pp. 3 19, A. Malinin, R. C. van Dalen, K. Knill, Y. Wang, and M. J. F. Gales, Off-topic response detection for spontaneous spoken English assessment, in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers, C. Wu, P. Karanasou, and M. J. F. Gales, Combining i-vector representation and structured neural networks for rapid adaptation, in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp , J. Yang, C. Zhang, A. Ragni, M. J. F. Gales, and P. C. Woodland, System combination with loglinear models, in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp , L. Wang, C. Zhang, P. C. Woodland, M. J. F. Gales, P. Karanasou, P. Lanchantin, X. Liu, and Y. Qian, Improved DNN-based segmentation for multi-genre broadcast audio, in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp , X. Chen, X. Liu, Y. Qian, M. J. F. Gales, and P. C. Woodland, CUED-RNNLM - an opensource toolkit for efficient training and evaluation of recurrent neural network language models, in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp , C. Wu, P. Karanasou, M. J. F. Gales, and K. C. Sim, Stimulated deep neural network for speech recognition, in Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp , J. Yang, A. Ragni, M. J. F. Gales, and K. M. 
Knill, Log-linear system combination using structured support vector machines, in Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp ,
7 31. S. Kundu, K. C. Sim, and M. J. F. Gales, Incorporating a generative front-end layer to deep neural network for noise robust automatic speech recognition, in Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp , J. H. M. Wong and M. J. F. Gales, Sequence student-teacher training of deep neural networks, in Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp , A. Ragni, E. Dakin, X. Chen, M. J. F. Gales, and K. M. Knill, Multi-language neural network language models, in Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp , P. Lanchantin, M. J. F. Gales, P. Karanasou, X. Liu, Y. Qian, L. Wang, P. C. Woodland, and C. Zhang, Selection of multi-genre broadcast data for the training of automatic speech recognition systems, in Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp , D. J. Litman, S. J. Young, M. J. F. Gales, K. Knill, K. Ottewell, R. C. van Dalen, and D. Vandyke, Towards using conversations with spoken dialogue systems in the automated assessment of nonnative speakers of English, in Proceedings of the SIGDIAL 2016 Conference, The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, September 2016, Los Angeles, CA, USA, pp , G. Degottex, P. Lanchantin, and M. J. F. Gales, A pulse model in log-domain for a uniform synthesizer, in The 9th ISCA Speech Synthesis Workshop, Sunnyvale, CA, USA, September 2016, pp , R. C. van Dalen, J. Yang, H. Wang, A. Ragni, C. Zhang, and M. J. F. 
Gales, Structured discriminative models using deep neural-network features, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp , X. Chen, X. Liu, M. J. F. Gales, and P. C. Woodland, Investigation of back-off based interpolation between recurrent neural network and n-gram language models, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp , J. Cui, B. Kingsbury, B. Ramabhadran, A. Sethy, K. Audhkhasi, X. Cui, E. Kislal, L. Mangu, M. Nußbaum-Thom, M. Picheny, Z. Tüske, P. Golik, R. Schlüter, H. Ney, M. J. F. Gales, K. M. Knill, A. Ragni, H. Wang, and P. C. Woodland, Multilingual representations for low resource speech recognition and keyword search, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp , S. Tan, K. C. Sim, and M. J. F. Gales, Improving the interpretability of deep neural networks with stimulated learning, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp , P. C. Woodland, X. Liu, Y. Qian, C. Zhang, M. J. F. Gales, P. Karanasou, P. Lanchantin, and L. Wang, Cambridge University transcription systems for the multi-genre broadcast challenge, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp , P. Lanchantin, M. J. F. Gales, P. Karanasou, X. Liu, Y. Qian, L. Wang, P. C. Woodland, and C. Zhang, The development of the Cambridge University alignment systems for the multi-genre broadcast challenge, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp , P. Karanasou, M. J. F. Gales, P. Lanchantin, X. Liu, Y. Qian, L. Wang, P. C. Woodland, and C. 
Zhang, Speaker diarisation and longitudinal linking in multi-genre broadcast data, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp , P. Bell, M. J. F. Gales, T. Hain, J. Kilgour, P. Lanchantin, X. Liu, A. McParland, S. Renals, O. Saz, M. Wester, and P. C. Woodland, The MGB challenge: Evaluating multi-genre broadcast media recognition, in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp ,
8 45. C. Wu and M. J. F. Gales, Multi-basis adaptive neural network for rapid adaptation in speech recognition, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , A. Ragni, M. J. F. Gales, and K. M. Knill, A language space representation for speech recognition, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , T. Drugman, Y. Stylianou, L. Chen, X. Chen, and M. J. F. Gales, Robust excitation-based features for automatic speech recognition, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , R. C. van Dalen, K. M. Knill, P. Tsiakoulis, and M. J. F. Gales, Improving multiple-crowd-sourced transcriptions using a speech recogniser, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , M. J. F. Gales, K. M. Knill, and A. Ragni, Unicode-based graphemic systems for limited resource languages, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , X. Chen, X. Liu, M. J. F. Gales, and P. C. Woodland, Improving the training and evaluation efficiency of recurrent neural network language models, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , X. Liu, X. Chen, M. J. F. Gales, and P. C. Woodland, Paraphrastic recurrent neural network language models, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , X. Chen, X. Liu, M. J. F. Gales, and P. C. 
Woodland, Recurrent neural network language model training with noise contrastive estimation for speech recognition, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp , G. Mendels, E. Cooper, V. Soto, J. Hirschberg, M. J. F. Gales, K. M. Knill, A. Ragni, and H. Wang, Improving speech recognition and keyword search for low resource languages using web data, in INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp , P. Lanchantin, C. Veaux, M. J. F. Gales, S. King, and J. Yamagishi, Reconstructing voices within the multiple-average-voice-model framework, in INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp , R. C. van Dalen and M. J. F. Gales, Annotating large lattices with the exact word error, in INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp , P. Karanasou, M. J. F. Gales, and P. C. Woodland, I-vector estimation using informative priors for adaptation of deep neural networks, in INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp , X. Liu, F. Flego, L. Wang, C. Zhang, M. J. F. Gales, and P. C. Woodland, The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation, in INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp , X. Chen, T. Tan, X. Liu, P. Lanchantin, M. Wan, M. J. F. Gales, and P. C. 
Woodland, Recurrent neural network language model adaptation for multi-genre broadcast speech recognition, in INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp , H. Wang, A. Ragni, M. J. F. Gales, K. M. Knill, P. C. Woodland, and C. Zhang, Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages, in INTER- SPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp ,
9 60. R. C. van Dalen, K. M. Knill, and M. J. F. Gales, Automatically grading learners English using a Gaussian process, in ISCA International Workshop on Speech and Language Technology in Education, SLaTE 2015, Leipzig, Germany, September 4-5, 2015, pp. 7 12, V. Wan, J. Latorre, K. Yanagisawa, M. J. F. Gales, and Y. Stylianou, Cluster adaptive training of average voice models, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , P. Lanchantin, M. J. F. Gales, S. King, and J. Yamagishi, Multiple-average-voice-based speech synthesis, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , L. Chen, N. Braunschweiler, and M. J. F. Gales, Speaker dependent expression predictor from text: Expressiveness and transplantation, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , J. Yang, R. C. van Dalen, S. Zhang, and M. J. F. Gales, Infinite structured support vector machines for speech recognition, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , X. Liu, M. J. F. Gales, and P. C. Woodland, Paraphrastic neural network language models, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , X. Liu, Y. Wang, X. Chen, M. J. F. Gales, and P. C. Woodland, Efficient lattice rescoring using recurrent neural network language models, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , T. Yoshioka, X. Chen, and M. J. F. Gales, Impact of single-microphone dereverberation on DNNbased meeting transcription systems, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , T. 
Yoshioka, A. Ragni, and M. J. F. Gales, Investigation of unsupervised adaptation of DNN acoustic models with filter bank input, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp , K. Knill, M. J. F. Gales, A. Ragni, and S. P. Rath, Language independent and unsupervised acoustic models for speech recognition and keyword spotting, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp , X. Chen, Y. Wang, X. Liu, M. J. F. Gales, and P. C. Woodland, Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp , A. Ragni, K. M. Knill, S. P. Rath, and M. J. F. Gales, Data augmentation for low resource languages, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp , S. P. Rath, K. M. Knill, A. Ragni, and M. J. F. Gales, Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp , X. Chen, M. J. F. Gales, K. M. Knill, C. Breslin, L. Chen, K. K. Chin, and V. Wan, An initial investigation of long-term adaptation for meeting transcription, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp , B. Kolluru, V. Wan, J. Latorre, K. Yanagisawa, and M. J. F. Gales, Generating multiple-accent pronunciations for TTS using joint sequence model interpolation, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp , K. 
Yanagisawa, L. Chen, and M. J. F. Gales, Noise-robust TTS speaker adaptation with statistics smoothing, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp ,
76. P. Karanasou, Y. Wang, M. J. F. Gales, and P. C. Woodland, Adaptation of deep neural network acoustic models using factorised i-vectors, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp. –.
77. J. Latorre, K. Yanagisawa, V. Wan, B. Kolluru, and M. J. F. Gales, Speech intonation for TTS: study on evaluation methodology, in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp. –.
78. M. J. F. Gales, K. M. Knill, A. Ragni, and S. P. Rath, Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED, in 4th Workshop on Spoken Language Technologies for Under-resourced Languages, SLTU 2014, St. Petersburg, Russia, May 14-16, 2014, pp. –.
79. K. Knill, M. J. F. Gales, S. P. Rath, P. C. Woodland, C. Zhang, and S. Zhang, Investigation of multilingual deep neural networks for spoken term detection, in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, Olomouc, Czech Republic, December 8-12, 2013, pp. –.
80. J. Latorre, M. J. F. Gales, K. Knill, and M. Akamine, Training a supra-segmental parametric F0 model without interpolating F0, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
81. S. Zhang and M. J. F. Gales, Kernelized log linear models for continuous speech recognition, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
82. R. C. van Dalen, A. Ragni, and M. J. F. Gales, Efficient decoding with generative score-spaces using the expectation semiring, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
83. Y. Wang and M. J. F. Gales, Tandem system adaptation using multiple linear feature transforms, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
84. R. Maia, M. Akamine, and M. J. F. Gales, Complex cepstrum analysis based on the minimum mean squared error, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
85. L. Chen, M. J. F. Gales, N. Braunschweiler, M. Akamine, and K. Knill, Integrated automatic expression prediction and speech synthesis from text, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
86. J. Mamou, J. Cui, X. Cui, M. J. F. Gales, B. Kingsbury, K. Knill, L. Mangu, D. Nolden, M. Picheny, B. Ramabhadran, R. Schlüter, A. Sethy, and P. C. Woodland, System combination and score normalization for spoken term detection, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
87. B. Kingsbury, J. Cui, X. Cui, M. J. F. Gales, K. Knill, J. Mamou, L. Mangu, D. Nolden, M. Picheny, B. Ramabhadran, R. Schlüter, A. Sethy, and P. C. Woodland, A high-performance Cantonese keyword search system, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
88. X. Liu, M. J. F. Gales, and P. C. Woodland, Paraphrastic language models and combination with neural network language models, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
89. M. S. Seigel, P. C. Woodland, and M. J. F. Gales, A confidence-based approach for improving keyword hypothesis scores, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. –.
90. P. Lanchantin, P. Bell, M. J. F. Gales, T. Hain, X. Liu, Y. Long, J. Quinnell, S. Renals, O. Saz, M. S. Seigel, P. Swietojanski, and P. C. Woodland, Automatic transcription of multi-genre media archives, in Proceedings of the First Workshop on Speech, Language and Audio in Multimedia, Marseille, France, August 22-23, 2013, pp. –.
91. Y. Wang and M. J. F. Gales, An explicit independence constraint for factorised adaptation in speech recognition, in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. –.
92. Y. Long, M. J. F. Gales, P. Lanchantin, X. Liu, M. S. Seigel, and P. C. Woodland, Improving lightly supervised training for broadcast transcription, in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. –.
93. R. Maia, M. J. F. Gales, Y. Stylianou, and M. Akamine, Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesis, in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. –.
94. V. Wan, R. Anderson, A. Blokland, N. Braunschweiler, L. Chen, B. Kolluru, J. Latorre, R. Maia, B. Stenger, K. Yanagisawa, Y. Stylianou, M. Akamine, M. J. F. Gales, and R. Cipolla, Photorealistic expressive text to talking head synthesis, in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. –.
95. J. Yang, R. C. van Dalen, and M. J. F. Gales, Infinite support vector machines in speech recognition, in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. –.
96. X. Liu, M. J. F. Gales, and P. C. Woodland, Cross-domain paraphrasing for improving language modelling using out-of-domain data, in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. –.
97. K. Yanagisawa, J. Latorre, V. Wan, M. J. F. Gales, and S. King, Noise robustness in HMM-TTS speaker adaptation, in The Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, Barcelona, Spain, August 31-September 2, 2013, pp. –.
98. F. Eyben, S. Buchholz, N. Braunschweiler, J. Latorre, V. Wan, M. J. F. Gales, and K. Knill, Unsupervised clustering of emotion and voice styles for expressive TTS, in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2012, Kyoto, Japan, March 25-30, 2012, pp. –.
99. A. Ragni and M. J. F. Gales, Inference algorithms for generative score-spaces, in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2012, Kyoto, Japan, March 25-30, 2012, pp. –.
100. R. Maia, M. Akamine, and M. J. F. Gales, Complex cepstrum as phase information in statistical parametric speech synthesis, in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2012, Kyoto, Japan, March 25-30, 2012, pp. –.
101. F. Flego and M. J. F. Gales, Factor analysis based VTS discriminative adaptive training, in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2012, Kyoto, Japan, March 25-30, 2012, pp. –.
102. L. Chen, M. J. F. Gales, V. Wan, J. Latorre, and M. Akamine, Exploring rich expressive information from audiobook data using cluster adaptive training, in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pp. –.
103. J. Latorre, V. Wan, M. J. F. Gales, L. Chen, K. K. Chin, K. Knill, and M. Akamine, Speech factorization for HMM-TTS based on cluster adaptive training, in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pp. –.
104. V. Wan, J. Latorre, K. K. Chin, L. Chen, M. J. F. Gales, H. Zen, K. Knill, and M. Akamine, Combining multiple high quality corpora for improving HMM-TTS, in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pp. –.
105. Y. Wang and M. J. F. Gales, Model-based approaches to adaptive training in reverberant environments, in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pp. –.
106. M. J. F. Gales and F. Flego, Model-based approaches for degraded channel modelling in robust ASR, in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pp. –.
107. X. Liu, M. J. F. Gales, and P. C. Woodland, Paraphrastic language models, in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pp. –.
108. Z. Roupakia, A. Ragni, and M. J. F. Gales, Rapid nonlinear speaker adaptation for large-vocabulary continuous speech recognition, in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pp. –.
109. M. J. F. Gales, A. Ragni, A. Zhang, and R. C. van Dalen, Structured discriminative models for speech recognition, in 2012 Symposium on Machine Learning in Speech and Language Processing, MLSLP 2012, Portland, Oregon, USA, September 14, 2012.
110. P. Bell, M. J. F. Gales, P. Lanchantin, X. Liu, Y. Long, S. Renals, P. Swietojanski, and P. C. Woodland, Transcription of multi-genre media archives using out-of-domain data, in 2012 IEEE Spoken Language Technology Workshop (SLT), Miami, FL, USA, December 2-5, 2012, pp. –.
111. S. Zhang and M. J. F. Gales, Extending noise robust structured support vector machines to larger vocabulary tasks, in 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2011, Waikoloa, HI, USA, December 11-15, 2011, pp. –.
112. Y. Wang and M. J. F. Gales, Improving reverberant VTS for hands-free robust speech recognition, in 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2011, Waikoloa, HI, USA, December 11-15, 2011, pp. –.
113. A. Ragni and M. J. F. Gales, Derivative kernels for noise robust ASR, in 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2011, Waikoloa, HI, USA, December 11-15, 2011, pp. –.
114. R. C. van Dalen and M. J. F. Gales, A variational perspective on noise-robust speech recognition, in 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2011, Waikoloa, HI, USA, December 11-15, 2011, pp. –.
115. H. Zen and M. J. F. Gales, Decision tree-based context clustering based on cross validation and hierarchical priors, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
116. Y. Wang and M. J. F. Gales, Speaker and noise factorisation on the AURORA4 task, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
117. J. Latorre, M. J. F. Gales, S. Buchholz, K. Knill, M. Tamura, Y. Ohtani, and M. Akamine, Continuous F0 in the source-excitation generation for HMM-based TTS: do we need voiced/unvoiced classification?, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
118. A. Ragni and M. J. F. Gales, Structured discriminative models for noise robust continuous speech recognition, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
119. F. Flego and M. J. F. Gales, Factor analysis based VTS and JUD noise estimation and compensation, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
120. X. Liu, M. J. F. Gales, J. L. Hieronymus, and P. C. Woodland, Investigation of acoustic units for LVCSR systems, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
121. L. Chen, M. J. F. Gales, and K. K. Chin, Constrained discriminative mapping transforms for unsupervised speaker adaptation, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
122. K. K. Chin, H. Xu, M. J. F. Gales, C. Breslin, and K. Knill, Rapid joint speaker and noise compensation for robust speech recognition, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pp. –.
123. F. Diehl, M. J. F. Gales, X. Liu, M. Tomalin, and P. C. Woodland, Word boundary modelling and full covariance Gaussians for Arabic speech-to-text systems, in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. –.
124. S. Zhang and M. J. F. Gales, Structured support vector machines for noise robust continuous speech recognition, in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. –.
125. C. Breslin, K. K. Chin, M. J. F. Gales, and K. Knill, Integrated online speaker clustering and adaptation, in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. –.
126. R. Maia, H. Zen, K. Knill, M. J. F. Gales, and S. Buchholz, Multipulse sequences for residual signal modeling, in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. –.
127. T. Li, P. C. Woodland, F. Diehl, and M. J. F. Gales, Graphone model interpolation and Arabic pronunciation generation, in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. –.
128. N. Pilkington, H. Zen, and M. J. F. Gales, Gaussian process experts for voice conversion, in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. –.
129. X. Liu, M. J. F. Gales, and P. C. Woodland, Improving LVCSR system combination using neural network language model cross adaptation, in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. –.
130. H. Zen, M. J. F. Gales, Y. Nankaku, and K. Tokuda, Statistical parametric speech synthesis based on product of experts, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010, March 2010, Sheraton Dallas Hotel, Dallas, Texas, USA, pp. –.
131. M. Tomalin, F. Diehl, M. J. F. Gales, J. Park, and P. C. Woodland, Recent improvements to the Cambridge Arabic speech-to-text systems, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010, March 2010, Sheraton Dallas Hotel, Dallas, Texas, USA, pp. –.
132. X. Liu, M. J. F. Gales, J. L. Hieronymus, and P. C. Woodland, Language model combination and adaptation using weighted finite state transducers, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010, March 2010, Sheraton Dallas Hotel, Dallas, Texas, USA, pp. –.
133. M. J. F. Gales and K. Yu, Canonical state models for automatic speech recognition, in INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010, pp. –.
134. X. Liu, M. J. F. Gales, and P. C. Woodland, Language model cross adaptation for LVCSR system combination, in INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010, pp. –.
135. R. C. van Dalen and M. J. F. Gales, Asymptotically exact noise-corrupted speech likelihoods, in INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010, pp. –.
More informationSUPPRESSION OF MUSICAL NOISE IN ENHANCED SPEECH USING PRE-IMAGE ITERATIONS. Christina Leitner and Franz Pernkopf
2th European Signal Processing Conference (EUSIPCO 212) Bucharest, Romania, August 27-31, 212 SUPPRESSION OF MUSICAL NOISE IN ENHANCED SPEECH USING PRE-IMAGE ITERATIONS Christina Leitner and Franz Pernkopf
More informationMammogram Analysis: Tumor Classification
Mammogram Analysis: Tumor Classification Literature Survey Report Geethapriya Raghavan geeragh@mail.utexas.edu EE 381K - Multidimensional Digital Signal Processing Spring 2005 Abstract Breast cancer is
More informationExploiting visual information for NAM recognition
Exploiting visual information for NAM recognition Panikos Heracleous, Denis Beautemps, Viet-Anh Tran, Hélène Loevenbruck, Gérard Bailly To cite this version: Panikos Heracleous, Denis Beautemps, Viet-Anh
More informationdoi: / _59(
doi: 10.1007/978-3-642-39188-0_59(http://dx.doi.org/10.1007/978-3-642-39188-0_59) Subunit modeling for Japanese sign language recognition based on phonetically depend multi-stream hidden Markov models
More informationSpeaker Independent Isolated Word Speech to Text Conversion Using Auto Spectral Subtraction for Punjabi Language
International Journal of Scientific and Research Publications, Volume 7, Issue 7, July 2017 469 Speaker Independent Isolated Word Speech to Text Conversion Using Auto Spectral Subtraction for Punjabi Language
More informationSpeech to Text Wireless Converter
Speech to Text Wireless Converter Kailas Puri 1, Vivek Ajage 2, Satyam Mali 3, Akhil Wasnik 4, Amey Naik 5 And Guided by Dr. Prof. M. S. Panse 6 1,2,3,4,5,6 Department of Electrical Engineering, Veermata
More informationLecture 9: Speech Recognition: Front Ends
EE E682: Speech & Audio Processing & Recognition Lecture 9: Speech Recognition: Front Ends 1 2 Recognizing Speech Feature Calculation Dan Ellis http://www.ee.columbia.edu/~dpwe/e682/
More informationMULTI-MODAL FETAL ECG EXTRACTION USING MULTI-KERNEL GAUSSIAN PROCESSES. Bharathi Surisetti and Richard M. Dansereau
MULTI-MODAL FETAL ECG EXTRACTION USING MULTI-KERNEL GAUSSIAN PROCESSES Bharathi Surisetti and Richard M. Dansereau Carleton University, Department of Systems and Computer Engineering 25 Colonel By Dr.,
More informationResearch Proposal on Emotion Recognition
Research Proposal on Emotion Recognition Colin Grubb June 3, 2012 Abstract In this paper I will introduce my thesis question: To what extent can emotion recognition be improved by combining audio and visual
More informationAdvanced Methods and Tools for ECG Data Analysis
Advanced Methods and Tools for ECG Data Analysis Gari D. Clifford Francisco Azuaje Patrick E. McSharry Editors ARTECH HOUSE BOSTON LONDON artechhouse.com Preface XI The Physiological Basis of the Electrocardiogram
More informationFormulating Emotion Perception as a Probabilistic Model with Application to Categorical Emotion Classification
Formulating Emotion Perception as a Probabilistic Model with Application to Categorical Emotion Classification Reza Lotfian and Carlos Busso Multimodal Signal Processing (MSP) lab The University of Texas
More informationVITHEA: On-line word naming therapy in Portuguese for aphasic patients exploiting automatic speech recognition
VITHEA: On-line word naming therapy in Portuguese for aphasic patients exploiting automatic speech recognition Anna Pompili, Pedro Fialho and Alberto Abad L 2 F - Spoken Language Systems Lab, INESC-ID
More informationA New Paradigm for the Evaluation of Forensic Evidence. Geoffrey Stewart Morrison. p p(e H )
A New Paradigm for the Evaluation of Forensic Evidence Geoffrey Stewart Morrison p(e H ) p p(e H ) d Abstract In Europe there has been a great deal of concern about the logically correct way to evaluate
More informationHeart Murmur Recognition Based on Hidden Markov Model
Journal of Signal and Information Processing, 2013, 4, 140-144 http://dx.doi.org/10.4236/jsip.2013.42020 Published Online May 2013 (http://www.scirp.org/journal/jsip) Heart Murmur Recognition Based on
More informationMULTIMODAL FUSION FOR CUED SPEECH LANGUAGE RECOGNITION
MULTIMODAL FUSION FOR CUED SPEECH LANGUAGE RECOGNITION Savvas Argyropoulos 1,2, Dimitrios Tzovaras 2, and Michael G. Strintzis 1,2 1 Electrical and Computer Engineering Dept. Aristotle University of Thessaloniki
More informationOutline. Hierarchical Hidden Markov Models for HIV-Transmission Behavior Outcomes. Motivation. Why Hidden Markov Model? Why Hidden Markov Model?
Hierarchical Hidden Markov Models for HIV-Transmission Behavior Outcomes Li-Jung Liang Department of Medicine Statistics Core Email: liangl@ucla.edu Joint work with Rob Weiss & Scott Comulada Outline Motivation
More informationQuick detection of QRS complexes and R-waves using a wavelet transform and K-means clustering
Bio-Medical Materials and Engineering 26 (2015) S1059 S1065 DOI 10.3233/BME-151402 IOS Press S1059 Quick detection of QRS complexes and R-waves using a wavelet transform and K-means clustering Yong Xia
More informationAudio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain
Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain F. Archetti 1,2, G. Arosio 1, E. Fersini 1, E. Messina 1 1 DISCO, Università degli Studi di Milano-Bicocca, Viale Sarca,
More informationRobust system for patient specific classification of ECG signal using PCA and Neural Network
International Research Journal of Engineering and Technology (IRJET) e-issn: 395-56 Volume: 4 Issue: 9 Sep -7 www.irjet.net p-issn: 395-7 Robust system for patient specific classification of using PCA
More informationReach and grasp by people with tetraplegia using a neurally controlled robotic arm
Leigh R. Hochberg et al. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm Nature, 17 May 2012 Paper overview Ilya Kuzovkin 11 April 2014, Tartu etc How it works?
More informationAnalysis of Speech Recognition Techniques for use in a Non-Speech Sound Recognition System
Analysis of Recognition Techniques for use in a Sound Recognition System Michael Cowling, Member, IEEE and Renate Sitte, Member, IEEE Griffith University Faculty of Engineering & Information Technology
More informationarxiv: v1 [cs.cv] 13 Mar 2018
RESOURCE AWARE DESIGN OF A DEEP CONVOLUTIONAL-RECURRENT NEURAL NETWORK FOR SPEECH RECOGNITION THROUGH AUDIO-VISUAL SENSOR FUSION Matthijs Van keirsbilck Bert Moons Marian Verhelst MICAS, Department of
More informationComputational Perception /785. Auditory Scene Analysis
Computational Perception 15-485/785 Auditory Scene Analysis A framework for auditory scene analysis Auditory scene analysis involves low and high level cues Low level acoustic cues are often result in
More informationRecognition & Organization of Speech and Audio
Recognition & Organization of Speech and Audio Dan Ellis Electrical Engineering, Columbia University Outline 1 2 3 4 5 Sound organization Background & related work Existing projects
More informationA New Paradigm for the Evaluation of Forensic Evidence
A New Paradigm for the Evaluation of Forensic Evidence and its implementation in forensic voice comparison Geoffrey Stewart Morrison Ewald Enzinger p(e H ) p p(e H ) d Abstract & Biographies In Europe
More informationSome Studies on of Raaga Emotions of Singers Using Gaussian Mixture Model
International Journal for Modern Trends in Science and Technology Volume: 03, Special Issue No: 01, February 2017 ISSN: 2455-3778 http://www.ijmtst.com Some Studies on of Raaga Emotions of Singers Using
More informationSpeech Processing / Speech Translation Case study: Transtac Details
Speech Processing 11-492/18-492 Speech Translation Case study: Transtac Details Phraselator: One Way Translation Commercial System VoxTec Rapid deployment Modules of 500ish utts Transtac: Two S2S System
More informationA Review on Dysarthria speech disorder
A Review on Dysarthria speech disorder Miss. Yogita S. Mahadik, Prof. S. U. Deoghare, 1 Student, Dept. of ENTC, Pimpri Chinchwad College of Engg Pune, Maharashtra, India 2 Professor, Dept. of ENTC, Pimpri
More informationSPEECH EMOTION RECOGNITION: ARE WE THERE YET?
SPEECH EMOTION RECOGNITION: ARE WE THERE YET? CARLOS BUSSO Multimodal Signal Processing (MSP) lab The University of Texas at Dallas Erik Jonsson School of Engineering and Computer Science Why study emotion
More informationEMOTION DETECTION FROM VOICE BASED CLASSIFIED FRAME-ENERGY SIGNAL USING K-MEANS CLUSTERING
EMOTION DETECTION FROM VOICE BASED CLASSIFIED FRAME-ENERGY SIGNAL USING K-MEANS CLUSTERING Nazia Hossain 1, Rifat Jahan 2, and Tanjila Tabasum Tunka 3 1 Senior Lecturer, Department of Computer Science
More informationDevelopment of novel algorithm by combining Wavelet based Enhanced Canny edge Detection and Adaptive Filtering Method for Human Emotion Recognition
International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 12, Issue 9 (September 2016), PP.67-72 Development of novel algorithm by combining
More informationLIE DETECTION SYSTEM USING INPUT VOICE SIGNAL K.Meena 1, K.Veena 2 (Corresponding Author: K.Veena) 1 Associate Professor, 2 Research Scholar,
International Journal of Pure and Applied Mathematics Volume 117 No. 8 2017, 121-125 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu doi: 10.12732/ijpam.v117i8.25
More informationCOMBINING CATEGORICAL AND PRIMITIVES-BASED EMOTION RECOGNITION. University of Southern California (USC), Los Angeles, CA, USA
COMBINING CATEGORICAL AND PRIMITIVES-BASED EMOTION RECOGNITION M. Grimm 1, E. Mower 2, K. Kroschel 1, and S. Narayanan 2 1 Institut für Nachrichtentechnik (INT), Universität Karlsruhe (TH), Karlsruhe,
More informationSign Language MT. Sara Morrissey
Sign Language MT Sara Morrissey Introduction Overview Irish Sign Language Problems for SLMT SLMT Data MaTrEx for SLs Future work Introduction (1) Motivation SLs are poorly resourced and lack political,
More informationREZUMAT TEZA DE DOCTORAT
Investeşte în oameni! FONDUL SOCIAL EUROPEAN Programul Operaţional Sectorial Dezvoltarea Resurselor Umane 2007 2013 Axa prioritară: 1 Educaţia şi formarea profesională în sprijinul creşterii economice
More informationEDGE DETECTION OF THE SCOLIOTIC VERTEBRAE USING X-RAY IMAGES
Journal of Engineering Science and Technology 4 th EURECA 2015 Special Issue February (2016) 166-175 School of Engineering, Taylor s University EDGE DETECTION OF THE SCOLIOTIC VERTEBRAE USING X-RAY IMAGES
More informationType II Fuzzy Possibilistic C-Mean Clustering
IFSA-EUSFLAT Type II Fuzzy Possibilistic C-Mean Clustering M.H. Fazel Zarandi, M. Zarinbal, I.B. Turksen, Department of Industrial Engineering, Amirkabir University of Technology, P.O. Box -, Tehran, Iran
More informationThe 29th Fuzzy System Symposium (Osaka, September 9-, 3) Color Feature Maps (BY, RG) Color Saliency Map Input Image (I) Linear Filtering and Gaussian
The 29th Fuzzy System Symposium (Osaka, September 9-, 3) A Fuzzy Inference Method Based on Saliency Map for Prediction Mao Wang, Yoichiro Maeda 2, Yasutake Takahashi Graduate School of Engineering, University
More informationMedical Knowledge Attention Enhanced Neural Model. for Named Entity Recognition in Chinese EMR
Medical Knowledge Attention Enhanced Neural Model for Named Entity Recognition in Chinese EMR Zhichang Zhang, Yu Zhang, Tong Zhou College of Computer Science and Engineering, Northwest Normal University,
More information