A METHOD TO ENHANCE INFORMATIZED CAPTION FROM IBM WATSON API USING SPEAKER PRONUNCIATION TIME-DB
Yong-Sik Choi, YunSik Son and Jin-Woo Jung
Department of Computer Science and Engineering, Dongguk University, Seoul, Korea

ABSTRACT

IBM Watson API is a speech recognition API system which can automatically generate not only recognized words from a voice signal but also the speaker ID and the timing information of each word, including the starting time and the ending time. The performance of IBM Watson API is very good on well-recorded voice signals from clearly speaking trained speakers, but it is not good enough when there are some noises in the recorded voice signal. This situation is easily found with movie sounds that include not only a speaking voice signal but also background music or special sound effects. This paper deals with a novel method to enhance the informatized caption from IBM Watson API, based on a speaker pronunciation time-DB, to resolve this noisy signal problem. To do this, the proposed method uses the original caption information as an additional input. By comparing the original caption with the output of IBM Watson API, the error words can be automatically detected and correctly modified. And by using the speaker pronunciation time-DB, which contains the average pronunciation time of each word for each speaker, the timing information of each error word can be estimated. In this way, more precisely enhanced informatized captions can be generated based on the IBM Watson API. The usefulness of the proposed method is verified with two case studies with noisy voice signals.

KEYWORDS

Informatized Caption, Speaker Pronunciation Time, IBM Watson API, Speech to Text Translation

1. INTRODUCTION

By the waves of the 4th industrial revolution, artificial intelligence has become one of the most promising technologies nowadays. There are many research areas and research results in artificial intelligence. One of them is natural language processing by speech recognition.
Typical speech recognition technologies include speech-to-text conversion. Among captions in which speech is converted into characters, captions including timing information and speaker ID information are referred to as informatized captions [1, 2]. Such an informatized caption can be generated by using IBM Watson API [3]. However, the IBM Watson API is susceptible to recognition errors when there are some noises in the voice signal, and this situation is easily found with movie sounds that include not only a speaking voice signal but also background music or special sound effects. In order to solve this noisy voice problem, there has been a method of predicting the timing information of an informatized caption based on a linear estimation formula proportional to the number of letters in each word [2]. But this linear estimation method based on the number of letters is not good enough when there are some silent syllables. Therefore, a novel method to enhance the informatized caption from IBM Watson API based on a speaker pronunciation time-DB is proposed in this paper. DOI: 10.5121/ijnlc
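The letter-count estimation of the earlier method [2] can be sketched as follows (an illustrative Python sketch based on the paper's description, not the code of [2]; function and variable names are assumptions):

```python
# Sketch of the prior linear estimation idea [2] (illustrative names):
# a segment of speech is split across its words in proportion to each
# word's letter count, which over-allocates time to silent syllables.

def allocate_by_letters(words, seg_start, seg_end):
    """Split [seg_start, seg_end] among words by letter count."""
    total_letters = sum(len(w) for w in words)
    times, t = [], seg_start
    for w in words:
        dur = (seg_end - seg_start) * len(w) / total_letters
        times.append((w, round(t, 3), round(t + dur, 3)))
        t += dur
    return times

print(allocate_by_letters(["do", "you", "have", "a", "box"], 0.0, 1.5))
```

Because the split depends only on spelling, a word with many silent letters receives too much time, which is the weakness the proposed SPT-DB method addresses.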
2. SPEAKER PRONUNCIATION TIME-DB (SPT-DB)

2.1. STRUCTURE

Figure 1. Structure of SPT-DB

SPT-DB consists of a node for each speaker (S_p), as shown in Fig. 1. Each speaker node is linked to nodes holding the average pronunciation time (D_pk) of each word (W_pk). The word nodes of a speaker are arranged in ascending order of average pronunciation time and are connected to each other, with a null value at the end. When SPT-DB searches for a word spoken by a speaker, it searches based on the pronunciation time.

2.2. ASSUMPTION

Before proceeding with the study, the following assumption is made on SPT-DB.

[Assumption] SPT-DB is already configured for each speaker.

3. PROPOSED ALGORITHM

3.1. ALGORITHM FOR MODIFYING INCORRECTLY RECOGNIZED WORDS BASED ON SPT-DB

Figure 2. Original caption T(X) and informatized caption T_s^+(X)

Basically, the original caption, T(X), and the informatized caption from the speech recognition result, T_s^+(X), are input together. Here, S_X and E_X mean the start time and the end time of pronunciation for the word X, respectively.
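The SPT-DB structure described above can be sketched as follows (a minimal illustrative sketch, not the authors' implementation; class and method names are assumptions):

```python
# Minimal sketch of the SPT-DB structure (illustrative names): one node
# per speaker S_p holding the average pronunciation time D_pk of every
# word W_pk, kept in ascending order of time so lookups can scan by
# pronunciation time, as described in Section 2.1.
import bisect

class SPTDB:
    def __init__(self):
        self.speakers = {}  # speaker id -> list of (avg_time, word), ascending

    def add(self, speaker, word, avg_time):
        entries = self.speakers.setdefault(speaker, [])
        bisect.insort(entries, (avg_time, word))   # keep ascending order

    def spt(self, speaker, word):
        """Average pronunciation time SPT(s, w), or None if unknown."""
        for t, w in self.speakers.get(speaker, []):
            if w == word:
                return t
        return None

db = SPTDB()
db.add(0, "box", 0.42)
db.add(0, "grandma", 0.61)
print(db.spt(0, "box"))   # -> 0.42
```

A linked list with a null terminator, as in Fig. 1, would behave the same way; a sorted Python list is used here only for brevity.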
[Step 1] Judge whether there is an incorrectly recognized word by comparing T(X) with T_s^+(X). If there is no incorrectly recognized word, terminate. If there is an incorrectly recognized word, go to the next step.

[Step 2] Judge how many consecutive incorrectly recognized words there are in the sequence and pass the parameters to the corresponding case.

[Step 3] Modify the words based on the start and end points returned by the case, using the SPT-DB.

[Step 4] If there is an incorrectly recognized word among the following words, repeat Steps 1 to 3, and terminate if there is no incorrectly recognized word.

CASE 1: THERE IS ONLY ONE INCORRECTLY RECOGNIZED WORD

Figure 3. One incorrectly recognized word

[Step 1] Find the point at which a signal of a specific volume (dB) T or more starts between E_a and S_c, and determine S_b.

[Step 2] If there is a minimum time t' in S_b to S_c at which the signal intensity falls below the volume T and then remains below T until S_c, E_b = t' is determined. If there is no t' satisfying the above condition, E_b = S_c.

[Step 3] Return the start time and the end time.

CASE 2: THERE ARE TWO INCORRECTLY RECOGNIZED WORDS

Figure 4. Two incorrectly recognized words

[Step 1] Find the point at which a signal of a specific volume (dB) T or more starts between E_a and S_c, and determine S_b1.

[Step 2] If there is a minimum time t' in S_b1 to S_c at which the signal intensity falls below the volume T and then remains below T until S_c, E_b2 = t' is determined. If there is no t' satisfying the above condition, E_b2 = S_c.

[Step 3] The ending point of the first word is obtained by adding to its start time the length of the detected interval multiplied by the ratio of the average pronunciation time of the first word to the sum of the average pronunciation times of the two words. This is summarized as follows.
E_b1 = S_b1 + (E_b2 - S_b1) x SPT(s, b1) / (SPT(s, b1) + SPT(s, b2))

where SPT(s, w) denotes the average pronunciation time of word w by speaker s stored in SPT-DB.

[Step 4] Return the start time and the end time.

CASE 3: THERE ARE THREE OR MORE INCORRECTLY RECOGNIZED WORDS

Figure 5. Three or more incorrectly recognized words

[Step 1] Find the point at which a signal of a specific volume (dB) T or more starts between E_a and S_c, and determine S_w1.

[Step 2] If there is a minimum time t' in S_w1 to S_c at which the signal intensity falls below the volume T and then remains below T until S_c, E_wn = t' is determined. If there is no t' satisfying the above condition, E_wn = S_c.

[Step 3] The ending point of each word is obtained by adding to its start time the length of the detected interval multiplied by the ratio of the average pronunciation time of the current word to the sum of the average pronunciation times of the incorrectly recognized words. This is summarized as follows.

E_wi = S_wi + (E_wn - S_w1) x SPT(s, wi) / (SPT(s, w1) + ... + SPT(s, wn))

[Step 4] Return the start time and the end time.

4. CASE STUDY I

The case was tested on English listening assessment data. Fig. 6 shows a problem from the English listening evaluation for the university entrance examination. Fig. 7 and Fig. 8 show the results of speech recognition using the IBM Watson API. Table 1 and Table 2 list the time information of the captions, expressed as [start time, end time]. Using the IBM Watson API, speech recognition in an environment with no constraints results in high accuracy, as shown in Fig. 7. Table 1 shows that the incorrectly recognized word is (B, 10) and the accuracy of speech recognition is 97.56%. However, in a noisy environment like Fig. 8, the accuracy dropped significantly. Table 2 shows that the incorrectly recognized words are (A, ), (A, 7), (A, 7), (A, 3), (B, 2), (B, 3), (B, 4), (B, 4), (B, 9), (B, 0) and (C, 3), and the accuracy of speech recognition is 73.17%. For reference, the original voice source was synthesized with raining sound using Adobe Audition CC 2017 to create a noisy environment. If we apply the proposed algorithm to the noisy signal, we can obtain the result shown in Fig. 9 and Table 3.
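The accuracy figures reported above follow from a simple word-level count. As a sketch (the 41-word total for Case Study I is inferred from the reported percentages, not stated explicitly in the paper):

```python
# Word accuracy as (correct words) / (total words). With 41 words in the
# Case Study I caption (inferred from the percentages), 1 error gives
# 97.56% and 11 errors give 73.17%.

def word_accuracy(total_words, wrong_words):
    return round(100.0 * (total_words - wrong_words) / total_words, 2)

print(word_accuracy(41, 1))    # -> 97.56  (clean audio)
print(word_accuracy(41, 11))   # -> 73.17  (rain noise)
```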
The accuracy of speech recognition is 100% with the help of the original caption, and each word includes its own start time and end time.
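The boundary search and SPT-proportional split of Section 3 can be sketched together as follows (a simplified illustrative sketch assuming a per-frame dB envelope; the function names and frame representation are assumptions, not the authors' implementation):

```python
# Illustrative sketch of Section 3 (assumed names, simplified logic):
# Steps 1-2 locate the error region [S_b, E_b] with a dB threshold T,
# then (Cases 2 and 3) the region is divided among the n error words in
# proportion to their average pronunciation times SPT(s, w) from SPT-DB.

def find_region(env, e_a, s_c, T):
    """Start of speech at/above T after E_a; end where it stays below T."""
    s_b = next((i for i in range(e_a, s_c + 1) if env[i] >= T), s_c)
    t = s_c
    while t - 1 > s_b and env[t - 1] < T:   # trailing run below T
        t -= 1
    return s_b, t

def split_by_spt(start, end, spt_times):
    """Divide [start, end] among words in proportion to SPT values."""
    total = sum(spt_times)
    bounds, t = [], float(start)
    for d in spt_times:
        e = t + (end - start) * d / total
        bounds.append((round(t, 3), round(e, 3)))
        t = e
    return bounds

env = [10, 10, 40, 45, 42, 12, 11, 10, 50]        # volume per frame (dB)
s_b, e_b = find_region(env, e_a=0, s_c=7, T=30)   # -> (2, 5)
print(split_by_spt(s_b, e_b, [0.3, 0.6]))         # -> [(2.0, 3.0), (3.0, 5.0)]
```

With one error word the split step is skipped and the detected region is used directly, matching Case 1.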
Figure 6. Original caption of Case Study I

Figure 7. Recognition of original voice without noise by IBM Watson system

Table 1. Informatized caption from original voice without noise by IBM Watson system

A (Speaker): Dad I want to send this book to grandma do you have a box
B (Speaker): Yeah I've got this one to put photo albums and but it's a bit small
C (Speaker): The box looks big enough for the book Can I use it
Figure 8. Recognition of mixed voice with rain noise by IBM Watson system

Table 2. Informatized caption from mixed voice with rain noise by IBM Watson system

A (Speaker): Yeah I want to send this perfect grandma do you have a plot
B (Speaker 0): Yeah I found someone to put photo albums and bought it's a bit small
C (Speaker): The box looked big enough for the book
D (Speaker): Can I use it

Figure 9. Speech recognition result modified by proposed algorithm
Table 3. Informatized caption modified by the proposed algorithm

A (Speaker): Dad I want to send this book to Grandma Do you have a box
B (Speaker): Yeah I've got this one to put photo albums in but it's a bit small
C (Speaker): The box looks big enough for the book Can I use it

5. CASE STUDY II

The case was tested on English listening assessment data. Fig. 10 shows a problem from the English listening evaluation for the university entrance examination. Fig. 11 and Fig. 12 show the results of speech recognition using the IBM Watson API. Table 4 and Table 5 list the time information of the captions, expressed as [start time, end time]. Using the IBM Watson API, speech recognition in an environment with no constraints results in high accuracy, as shown in Fig. 11. Table 4 shows that the incorrectly recognized word is (D, 1) and the accuracy of speech recognition is 97.29%. However, in a noisy environment like Fig. 12, the accuracy dropped significantly. Table 5 shows that there are three incorrectly recognized words, all in sentence A, and the accuracy of speech recognition is 91.89%. For reference, the original voice source was synthesized with raining sound using Adobe Audition CC 2017 to create a noisy environment. If we apply the proposed algorithm to the noisy signal, we can obtain the result shown in Fig. 13 and Table 6. The accuracy of speech recognition is 100% with the help of the original caption, and each word includes its own start time and end time.

Figure 10. Original caption of Case Study II
8 Figure. Recognition of original voice without noise b IBM Watson sstem Table 4. Informatized caption from original voice without noise b IBM Watson sstem Sentenc e Hone I heard the Smit famil move out to the countrsi I reall env the A Speak er 0 B Speak er C Speak er 0 D Speak er h d de m Reall wh is that I think we can sta health if we live in the countr Who Can ou be more specifi m c
Figure 12. Recognition of mixed voice with rain noise by IBM Watson system

Table 5. Informatized caption from mixed voice with rain noise by IBM Watson system

A (Speaker 0): I heard the Smith family moved out to the countryside envy them
B (Speaker): Really why is that
C (Speaker 0): I think we can stay healthy if we live in the country
D (Speaker): Can you be more specific

Figure 13. Speech recognition result modified by proposed algorithm
Table 6. Informatized caption modified by the proposed algorithm

A (Speaker 0): Honey I heard the Smith family moved out to the countryside I really envy them
B (Speaker): Really why is that
C (Speaker 0): I think we can stay healthy if we live in the country
D (Speaker): Hmm Can you be more specific

6. CONCLUSIONS

In this paper, a novel method to enhance the informatized caption from IBM Watson API based on a speaker pronunciation time-DB is addressed, to find and modify incorrectly recognized words from IBM Watson API. SPT-DB contains the average pronunciation time information of each word for each speaker and is used to correct the errors in the informatized caption obtained through the IBM Watson API. The usefulness of the proposed method is verified with two case studies with noisy voice signals. However, the proposed algorithm also has some limitations, such as that SPT-DB should be created in advance, because it is assumed that the information of the corresponding words already exists in SPT-DB. Further study will be conducted to modify incorrectly recognized words while performing speech recognition and simultaneously to update the SPT-DB in real time.

ACKNOWLEDGEMENTS

The following are results of a study on the "Leaders in INdustry-university Cooperation+" Project, supported by the Ministry of Education and National Research Foundation of Korea, partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (205RDAA ), and also partially supported by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean government (MOTIE: Ministry of Trade, Industry & Energy, HRD Program for Embedded Software R&D) (No. N00884).
REFERENCES

[1] CheonSun Kim, "Introduction to IBM Watson with case studies," Broadcasting and Media Magazine, Vol. 22, No. 1, pp.
[2] Yong-Sik Choi, Hyun-Min Park, Yun-Sik Son and Jin-Woo Jung, "Informatized Caption Enhancement based on IBM Watson API," Proceedings of KIIS Autumn Conference 2017, Vol. 27, No. 2, pp.
[3] IBM Watson Developer's Page,

AUTHORS

Yong-Sik Choi has been an M.S. candidate at Dongguk University, Korea, since 2017. His current research interests include machine learning and intelligent human-robot interaction.

YunSik Son received the B.S. degree from the Dept. of Computer Science and Engineering, Dongguk University, Seoul, Korea, in 2004, and the M.S. and Ph.D. degrees from the Dept. of Computer Science and Engineering, Dongguk University, Seoul, Korea, in 2006 and 2009, respectively. He was a research professor at the Dept. of Brain and Cognitive Engineering, Korea University, Seoul, Korea. Currently, he is an assistant professor at the Dept. of Computer Science and Engineering, Dongguk University, Seoul, Korea. His research areas include secure software, programming languages, compiler construction, and mobile/embedded systems.

Jin-Woo Jung received the B.S. and M.S. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), Korea, in 1997 and 1999, respectively, and received the Ph.D. degree in electrical engineering and computer science from KAIST, Korea. Since 2006, he has been with the Department of Computer Science and Engineering at Dongguk University, Korea, where he is currently a Professor. During 2001~2002, he worked as a visiting researcher at the Department of Mechano-Informatics, University of Tokyo, Japan. During 2004~2006, he worked as a researcher in the Human-friendly Welfare Robot System Research Center at KAIST, Korea. During 2014, he worked as a visiting scholar at the Department of Computer and Information Technology, Purdue University, USA.
His current research interests include human behaviour recognition, multiple robot cooperation and intelligent human-robot interaction.
More informationHow the LIVELab helped us evaluate the benefits of SpeechPro with Spatial Awareness
How the LIVELab helped us evaluate the benefits of SpeechPro with Author Donald Hayes, Ph.D. Director, Clinical Research Favorite sound: Blues guitar A Sonova brand Figure The LIVELab at McMaster University
More informationMicrocontroller and Sensors Based Gesture Vocalizer
Microcontroller and Sensors Based Gesture Vocalizer ATA-UR-REHMAN SALMAN AFGHANI MUHAMMAD AKMAL RAHEEL YOUSAF Electrical Engineering Lecturer Rachna College of Engg. and tech. Gujranwala Computer Engineering
More informationSource and Description Category of Practice Level of CI User How to Use Additional Information. Intermediate- Advanced. Beginner- Advanced
Source and Description Category of Practice Level of CI User How to Use Additional Information Randall s ESL Lab: http://www.esllab.com/ Provide practice in listening and comprehending dialogue. Comprehension
More informationHoughton Mifflin Harcourt Avancemos Spanish correlated to the. NCSSFL ACTFL Can-Do Statements (2015), Novice Low, Novice Mid, and Novice High
Houghton Mifflin Harcourt Avancemos Spanish 1 2018 correlated to the NCSSFL ACTFL Can-Do Statements (2015), Novice Low, Novice Mid, and Novice High Novice Low Interpersonal Communication I can communicate
More informationINTELLIGENT LIP READING SYSTEM FOR HEARING AND VOCAL IMPAIRMENT
INTELLIGENT LIP READING SYSTEM FOR HEARING AND VOCAL IMPAIRMENT R.Nishitha 1, Dr K.Srinivasan 2, Dr V.Rukkumani 3 1 Student, 2 Professor and Head, 3 Associate Professor, Electronics and Instrumentation
More informationITU-T. FG AVA TR Version 1.0 (10/2013) Part 3: Using audiovisual media A taxonomy of participation
International Telecommunication Union ITU-T TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU FG AVA TR Version 1.0 (10/2013) Focus Group on Audiovisual Media Accessibility Technical Report Part 3: Using
More informationHoughton Mifflin Harcourt Avancemos Spanish 1b correlated to the. NCSSFL ACTFL Can-Do Statements (2015), Novice Low, Novice Mid and Novice High
Houghton Mifflin Harcourt Avancemos Spanish 1b 2018 correlated to the NCSSFL ACTFL Can-Do Statements (2015), Novice Low, Novice Mid and Novice High Novice Low Interpersonal Communication I can communicate
More informationVolume 2. Lecture Notes in Electrical Engineering 215. Kim Chung Eds. IT Convergence and Security IT Convergence and Security 2012.
Lecture Notes in Electrical Engineering 215 Kuinam J. Kim Kyung-Yong Chung Editors IT Convergence and Security 2012 Volume 2 The proceedings approaches the subject matter with problems in technical convergence
More informationHEARING SCREENING Your baby passed the hearing screening. Universal Newborn
Parents are important partners. They have the greatest impact upon their young child and their active participation is crucial. Mark Ross (1975) Universal Newborn HEARING SCREENING Your baby passed the
More informationCaptioning Your Video Using YouTube Online Accessibility Series
Captioning Your Video Using YouTube This document will show you how to use YouTube to add captions to a video, making it accessible to individuals who are deaf or hard of hearing. In order to post videos
More informationTASC CONFERENCES & TRAINING EVENTS
TASC is sponsored by the Administration on Developmental Disabilities (ADD), the Center for Mental Health Services (CMHS), the Rehabilitation Services Administration (RSA), the Social Security Administration
More informationHoughton Mifflin Harcourt Avancemos Spanish 1a correlated to the. NCSSFL ACTFL Can-Do Statements (2015), Novice Low, Novice Mid and Novice High
Houghton Mifflin Harcourt Avancemos Spanish 1a 2018 correlated to the NCSSFL ACTFL Can-Do Statements (2015), Novice Low, Novice Mid and Novice High Novice Low Interpersonal Communication I can communicate
More informationA Communication tool, Mobile Application Arabic & American Sign Languages (ARSL) Sign Language (ASL) as part of Teaching and Learning
A Communication tool, Mobile Application Arabic & American Sign Languages (ARSL) Sign Language (ASL) as part of Teaching and Learning Fatima Al Dhaen Ahlia University Information Technology Dep. P.O. Box
More information한국어 Hearing in Noise Test(HINT) 문장의개발
KISEP Otology Korean J Otolaryngol 2005;48:724-8 한국어 Hearing in Noise Test(HINT) 문장의개발 문성균 1 문형아 1 정현경 1 Sigfrid D. Soli 2 이준호 1 박기현 1 Development of Sentences for Korean Hearing in Noise Test(KHINT) Sung-Kyun
More informationDiagnosis via Model Based Reasoning
Diagnosis via Model Based Reasoning 1 Introduction: Artificial Intelligence and Model Based Diagnosis 1.1 Diagnosis via model based reasoning 1.2 The diagnosis Task: different approaches Davies example
More informationA Smart Texting System For Android Mobile Users
A Smart Texting System For Android Mobile Users Pawan D. Mishra Harshwardhan N. Deshpande Navneet A. Agrawal Final year I.T Final year I.T J.D.I.E.T Yavatmal. J.D.I.E.T Yavatmal. Final year I.T J.D.I.E.T
More informationChapter 17. Speech Recognition and Understanding Systems
Chapter 17. Speech Recognition and Understanding Systems The Quest for Artificial Intelligence, Nilsson, N. J., 2009. Lecture Notes on Artificial Intelligence, Spring 2016 Summarized by Jang, Ha-Young
More informationSummary. About Pilot Waverly Labs is... Core Team Pilot Translation Kit FAQ Pilot Speech Translation App FAQ Testimonials Contact Useful Links. p.
Press kit Summary About Pilot Waverly Labs is... Core Team Pilot Translation Kit FAQ Pilot Speech Translation App FAQ Testimonials Contact Useful Links p. 3 p.4 p. 5 p. 6 p. 8 p. 10 p. 12 p. 14 p. 16 p.
More informationThe Effect of Sensor Errors in Situated Human-Computer Dialogue
The Effect of Sensor Errors in Situated Human-Computer Dialogue Niels Schuette Dublin Institute of Technology niels.schutte @student.dit.ie John Kelleher Dublin Institute of Technology john.d.kelleher
More informationHand Gestures Recognition System for Deaf, Dumb and Blind People
Hand Gestures Recognition System for Deaf, Dumb and Blind People Channaiah Chandana K 1, Nikhita K 2, Nikitha P 3, Bhavani N K 4, Sudeep J 5 B.E. Student, Dept. of Information Science & Engineering, NIE-IT,
More informationUsing Source Models in Speech Separation
Using Source Models in Speech Separation Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/
More informationSpeech Technology at Work IFS
Speech Technology at Work IFS SPEECH TECHNOLOGY AT WORK Jack Hollingum Graham Cassford Springer-Verlag Berlin Heidelberg GmbH Jack Hollingum IFS (Publications) Ud 35--39 High Street Kempston Bedford MK42
More informationThe power to connect us ALL.
Provided by Hamilton Relay www.ca-relay.com The power to connect us ALL. www.ddtp.org 17E Table of Contents What Is California Relay Service?...1 How Does a Relay Call Work?.... 2 Making the Most of Your
More informationA Study on a Automatic Audiometry Aid by PSM
International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 11, Number 8 (2018), pp. 1263-1272 International Research Publication House http://www.irphouse.com A Study on a Automatic
More informationThis American Life Transcript. Prologue. Broadcast June 25, Episode #411: First Contact. So, Scott, you were born without hearing, right?
Scott Krepel Interview from TAL #411 1 This American Life Transcript Prologue Broadcast June 25, 2010 Episode #411: First Contact Is that Marc? Yes, that s Marc speaking for Scott. So, Scott, you were
More informationProsody Rule for Time Structure of Finger Braille
Prosody Rule for Time Structure of Finger Braille Manabi Miyagi 1-33 Yayoi-cho, Inage-ku, +81-43-251-1111 (ext. 3307) miyagi@graduate.chiba-u.jp Yasuo Horiuchi 1-33 Yayoi-cho, Inage-ku +81-43-290-3300
More informationHuman cogition. Human Cognition. Optical Illusions. Human cognition. Optical Illusions. Optical Illusions
Human Cognition Fang Chen Chalmers University of Technology Human cogition Perception and recognition Attention, emotion Learning Reading, speaking, and listening Problem solving, planning, reasoning,
More informationComparing Two ROC Curves Independent Groups Design
Chapter 548 Comparing Two ROC Curves Independent Groups Design Introduction This procedure is used to compare two ROC curves generated from data from two independent groups. In addition to producing a
More informationA Wearable Hand Gloves Gesture Detection based on Flex Sensors for disabled People
A Wearable Hand Gloves Gesture Detection based on Flex Sensors for disabled People Kunal Purohit 1, Prof. Kailash Patidar 2, Mr. Rishi Singh Kushwah 3 1 M.Tech Scholar, 2 Head, Computer Science & Engineering,
More information1. INTRODUCTION. Vision based Multi-feature HGR Algorithms for HCI using ISL Page 1
1. INTRODUCTION Sign language interpretation is one of the HCI applications where hand gesture plays important role for communication. This chapter discusses sign language interpretation system with present
More informationCOMPUTER PLAY IN EDUCATIONAL THERAPY FOR CHILDREN WITH STUTTERING PROBLEM: HARDWARE SETUP AND INTERVENTION
034 - Proceeding of the Global Summit on Education (GSE2013) COMPUTER PLAY IN EDUCATIONAL THERAPY FOR CHILDREN WITH STUTTERING PROBLEM: HARDWARE SETUP AND INTERVENTION ABSTRACT Nur Azah Hamzaid, Ammar
More informationA Study on the Degree of Pronunciation Improvement by a Denture Attachment Using an Weighted-α Formant
A Study on the Degree of Pronunciation Improvement by a Denture Attachment Using an Weighted-α Formant Seong-Geon Bae 1 1 School of Software Application, Kangnam University, Gyunggido, Korea. 1 Orcid Id:
More informationThe Research of Early Child Scientific Activity according to the Strength Intelligence
, pp.162-166 http://dx.doi.org/10.14257/astl.2016. The Research of Early Child Scientific Activity according to the Strength Intelligence Eun-gyung Yun 1, Sang-hee Park 2 1 Dept. of Early Child Education,
More informationA Study on Edge Detection Techniques in Retinex Based Adaptive Filter
A Stud on Edge Detection Techniques in Retine Based Adaptive Filter P. Swarnalatha and Dr. B. K. Tripath Abstract Processing the images to obtain the resultant images with challenging clarit and appealing
More information