Assistive Technologies


Revista Informatica Economică nr. 2(46)/2008 135

Ion SMEUREANU, Narcisa ISĂILĂ
Academy of Economic Studies, Bucharest
smeurean@ase.ro, isaila_narcisa@yahoo.com

Speech recognition and speech synthesis hold a special place among assistive technologies because they can serve many different users, including persons with visual, language or mobility disabilities. Software developers have worked on speech recognition and text-to-speech for many years, as computing has changed rapidly and accessibility has become the main requirement in creating assistive software applications.

Keywords: speech recognition, text-to-speech, assistive technologies, accessibility.

Introduction

For more than a decade, the computer has established itself as an indispensable tool for persons with disabilities, offering them a new perspective and a new way of living. For these users, the products on the market provide additional access to the computer and are designed for each type of disability. For example, assistive technology for persons with visual disabilities includes screen enlargers, screen readers, screen review utilities, speech recognition systems, speech synthesizers, refreshable Braille displays, Braille embossers, text readers and word prediction programs. Assistive products for persons with mobility disabilities include speech recognition systems, on-screen editing programs driven by alternative input devices (sip-and-puff switches, sticks, joysticks, trackballs), alternative keyboards, keyboard filters and touch screens. Persons with learning impairments can use word prediction programs, reading comprehension programs, speech synthesizers and speech recognition programs.

The Windows operating system and the Office package offer many accessibility features that ease access to different elements using the keyboard or mouse.
In the operating system, notable accessibility features are provided by the Magnifier utility (which enlarges parts of the screen), Screen Review (which reads on-screen information aloud) and Narrator (which helps users working with Control Panel programs, Notepad, WordPad or the Internet Explorer browser). The Accessibility Options application in Control Panel provides many settings for contrast, filter options, colors and navigation, all meant to increase access to each element of the application in use. The Office package includes accessibility features such as zoom, contrast between graphic elements, page layout options that make document content easier to see (Reading Layout), and the development of accessible Web pages (Microsoft FrontPage) by adding text to images, formatting styles and creating image maps.

1. Speech Recognition Systems

Also called voice recognition programs, these systems let users enter data by voice instead of the keyboard or mouse. The main characteristics of a speech recognition system are:
- vocabulary size (the number of words recognized)
- isolated or continuous speech
- noise conditions
- the number of speakers
- the recognition rate
- processing time (on-line, delayed, off-line)
- the area of applicability

Speech recognition is a complex process built from several demanding components. One part of the system, the physical part, converts sound into an electrical signal and conditions it for input to the next part. The second part, the logical part, is the computer with its sound card and the software needed for all the required processing.
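The recognition rate listed among the characteristics above can be made concrete with a small calculation. The sketch below is an illustration only, not a method taken from the article: it assumes the rate is measured at word level as 100% minus the word error rate, computed with the standard edit distance. The function names `edit_distance` and `recognition_rate` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def recognition_rate(reference, hypothesis):
    """Word-level recognition rate in percent (assumed: 100 * (1 - WER))."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    errors = edit_distance(ref, hyp)
    return max(0.0, 1.0 - errors / len(ref)) * 100.0


if __name__ == "__main__":
    spoken = "open the file menu"
    recognized = "open a file menu"
    print(f"{recognition_rate(spoken, recognized):.1f}%")
```

With one substituted word out of four, the example reports a rate of 75%. Other definitions (for instance, counting only correctly recognized isolated words) are equally possible.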

[Fig.1. General structure of a speech recognition system. Blocks: acoustic signal, AMP, FTJ, CAN, pre-processing, parameter extraction, coding, recognition, databases, recognized unit.]

To understand human speech, every speech recognition system relies on four closely connected components:

1) Dividing the speech into words, a process performed by the speech recognition engine that determines how accurate the recognition is. The options are:
a) discrete (isolated-word) speech, in which the speaker inserts a short pause after each pronounced word so that the system can find the beginning and the end of the word. The advantage is the low computing power required, but the system becomes unusable if the pauses between words are not respected.
b) vocabulary-based word identification, which lets the user speak naturally without pauses between words, but the system can misrecognize speech if some of the words used are not in the vocabulary.
c) continuous speech, which offers the best recognition accuracy by processing every pronounced word. Because the system has no markers separating the words, it needs more time to find the beginning and the end of each word.

2) The vocabulary, or word list, that the speech recognition system can identify at a given moment. A rich vocabulary improves speech recognition, but enlarging the vocabulary does not guarantee better accuracy if many words have the same meaning.

3) Word matching, which means searching the voice database for entries that correspond to the audio signal so that the signal can be transcribed. There are two methods:
a) whole-word matching, which searches the database for the word that fits the audio signal. This requires fewer search paths, but the system must store a template for every word, which makes it heavier to use.
b) phoneme matching, which looks up phonemes in the dictionary of the speaker's language. The advantage is the reduced storage space; the disadvantage is the greater computing power required.

4) Speaker dependence, which is a key element in designing and implementing a speech recognition system. The system can be:
a) speaker-independent, in which case considerable resources are consumed because the system must cope with all varieties and dialects of human speech.
b) speaker-dependent, which uses minimal resources but requires training the system for a few hours, after which accuracy reaches about ninety percent. Users with mobility disabilities prefer these systems because they are easy to use.
c) speaker-adapted, meaning that the system is trained for the same speaker.

A speech recognition system must strike a balance among these four elements and remain independent of the characteristics of its users' voices.

2. Text-to-speech (TTS)

Starting from written text imposes different constraints: the vocabulary is theoretically unlimited, yet the pronounced sentences must sound as natural as possible and follow ordinary intonation. Because whole sentences cannot be stored, a limited ensemble of linguistic units must be chosen which, through concatenation, allows speech to be synthesized from written text. A text-to-speech system contains two units:
1. a text processing unit, which performs all the content analysis operations needed to convert text into audio codes;
2. a synthesizer, which contains the hardware support for the synthesis process.

To create and apply the rules for converting text into a vocal signal, the following elements are analyzed:
a) the phonemes (the sound units that make up words) used to produce the computerized voice;
b) the quality of the vocal signal, which depends on the rules for finding and converting the text to an audio signal; intonation, emotions and some pronunciation problems complicate the production of the vocal signal. To remove these obstacles, control tags can be used, or the text version can be associated with the phonetic transcription of the word;
c) text-to-speech through diphone concatenation;
d) the use of phoneme pairs to produce each sound. Every word is evaluated and the system then joins the phonemes to pronounce the word. A system that concatenates diphones is language-specific because phonemes differ from one language to another.

[Fig.2. Elements of TTS system for Romanian language. Blocks: text in ASCII format, text pre-processing, phonemes set, grapheme-phoneme conversion, grapheme-phoneme conversion rules, prosodic effects preparation, phoneme to diphone conversion, diphones database, prosodic effects, parameters to synthesizer, intonation rules, hardware synthesizer, vocal signal.]

Speech synthesizers are the highest level of text-to-speech systems. They do the same thing as text-to-speech but work with the following parameter sets:
- source characteristics: basic frequency, volume, intensity
- acoustic characteristics of the voice: the frequencies of high sonority, the band-pass width, the prediction coefficients, the energy of the signals in the filters.
MBROLA is an example of a speech synthesizer; it converts text to sound using a phoneme dictionary of the Romanian language. Various accessibility technologies are used to implement speech recognition and text-to-speech.
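The grapheme-to-phoneme and phoneme-to-diphone steps of Section 2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system described in the article: the rule table covers only a handful of Romanian spelling rules, the phoneme symbols are SAMPA-like labels chosen for the example, `_` is assumed to mark silence, and the function names are hypothetical.

```python
# Assumed one-to-one letter mappings (SAMPA-like symbols, illustrative only).
SIMPLE_RULES = {
    "ă": "@", "â": "1", "î": "1",
    "ș": "S", "ş": "S", "ț": "ts", "ţ": "ts",
}


def graphemes_to_phonemes(word):
    """Convert a word to a phoneme list with a few context-dependent rules."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in ("c", "g") and nxt == "h":
            # "ch" / "gh" keep the hard sound; the "h" itself is silent.
            phonemes.append("k" if ch == "c" else "g")
            i += 2
            continue
        if ch == "c":
            phonemes.append("tS" if nxt in ("e", "i") else "k")
        elif ch == "g":
            phonemes.append("dZ" if nxt in ("e", "i") else "g")
        else:
            phonemes.append(SIMPLE_RULES.get(ch, ch))
        i += 1
    return phonemes


def phonemes_to_diphones(phonemes, silence="_"):
    """Pair each phoneme with its successor, padding with silence at both ends."""
    padded = [silence] + list(phonemes) + [silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


if __name__ == "__main__":
    phones = graphemes_to_phonemes("casa")
    print(phones)
    print(phonemes_to_diphones(phones))
```

Each diphone name would then serve as a key into a diphone database of recorded units. A sequence of n phonemes yields n + 1 diphones, which is why the inventory depends on the phoneme set of the language.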

[Fig.3. Difference between TTS system and speech synthesizer. Blocks: text (or) concepts, sequence parameters generation, synthesizer, speech.]

3. SAPI Model (Speech Application Programming Interface)

SAPI is specific to the Windows operating system and supports speech recognition and the conversion of text into a vocal signal. SAPI is exposed as COM (Component Object Model) interfaces and has two distinct levels:
- high-level SAPI, which gives access to the basic speech recognition services through the Voice Command object and to simple text-to-speech output through the Voice Text object;
- low-level SAPI, which gives broad access to the Windows speech recognition services through the SR Engine Enumerator objects (SR Sharing) and to the conversion of text into a vocal signal through the TTS Enumerator and TTS Engine objects.

The low level of SAPI services is useful for advanced use of the speech recognition and TTS services.

4. Application for reading information on the Internet (searching by word) using web services, or for speech synthesis from text, Word and Excel files

The technologies used to build such an application are:
- Microsoft .NET Framework 2.0 package
- MySQL 4.1 (in which the database is built)
- MyODBC 3.5.1 (for the connection to MySQL)
- programming languages for database access: C, C++, Java, PHP
- audio programs for recording, processing and editing wave files
- MBROLA synthesizer.

The user interface contains:
- levels of text pre-processing
- reading information from files
- accessing and reading information from web services.

Text pre-processing requires:
- connecting to the database of phonemes and waves
- using the phoneme dictionary of the Romanian language (from the MBROLA synthesizer)
- pre-processing the text, which can contain letters and numbers;
- converting the text to .pho format, which can be read by MBROLA;
- converting to wave format.
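Two of the pre-processing steps above, handling numbers in the text and producing a .pho file, can be sketched as follows. This is a hedged illustration rather than the application's actual code: digits are simply spelled out one by one, every phoneme gets the same assumed duration and a flat pitch, and the function names are hypothetical. The output follows the general layout of MBROLA .pho input, where each line holds a phoneme, its duration in milliseconds and optional (position %, pitch Hz) pairs, `_` denotes silence and `;` starts a comment.

```python
import re

# Romanian names of the ten digits.
DIGIT_WORDS = {
    "0": "zero", "1": "unu", "2": "doi", "3": "trei", "4": "patru",
    "5": "cinci", "6": "șase", "7": "șapte", "8": "opt", "9": "nouă",
}


def expand_digits(text):
    """Spell out each digit separately (a simplification: 12 -> 'unu doi')."""
    spelled = re.sub(r"\d", lambda m: f" {DIGIT_WORDS[m.group()]} ", text)
    return " ".join(spelled.split())  # collapse the extra spaces


def to_pho(phonemes, duration_ms=80, pitch_hz=120, pause_ms=100):
    """Build .pho text with assumed fixed durations and a flat pitch contour."""
    lines = ["; sketch: fixed durations, flat pitch", f"_ {pause_ms}"]
    for phoneme in phonemes:
        # One pitch target placed at 50% of the phoneme's duration.
        lines.append(f"{phoneme} {duration_ms} 50 {pitch_hz}")
    lines.append(f"_ {pause_ms}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(expand_digits("sala 12"))
    print(to_pho(["k", "a", "s", "a"]), end="")
```

A real system would read whole numbers as words, take durations and pitch from prosody rules, and use the phoneme names defined by the chosen MBROLA voice. The resulting file would then be passed to the synthesizer to obtain the wave output.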

Reading information from Word or Excel files requires adding the COM references needed to work with them. A file is read by typing its name and extension into the Audio text box. The application's interface also includes the option to print the text in Braille format, with a print preview before printing, because learning through two channels, hearing and touch, helps retain the information for a long period of time.

Conclusions

When creating software products, developers must take into account the compatibility between their products and accessibility standards. Accessibility, for persons with disabilities, means the use of tools based on speech synthesis. Reproducing a natural voice remains an open path for research and a challenge for software designers.