Towards Human-Centered Optimization of Mobile Sign Language Video Communication

Jessica J. Tran
Electrical Engineering, DUB Group
University of Washington
Seattle, WA 98195 USA
jjtran@uw.edu

Abstract

The mainstream adoption of mobile video communication, especially among deaf and hard-of-hearing people, relies heavily on cellular network capacity. Video compression lowers the rate at which video content is transmitted; however, intelligibility may be sacrificed. Currently, there is no standard method for evaluating video intelligibility, nor a good communication model on which to base such an evaluation. I am developing a better theoretical model, the Human Signal Intelligibility (HSI) model, to evaluate the intelligibility of lowered video quality, with the aims of reducing bandwidth consumption and extending cell phone battery life. The goal of my dissertation is to advance mobile sign language video communication so that it does not rely on higher cellular network bandwidth capacities. I will conduct this work by (1) identifying the components in the HSI model that make up the intelligibility of a communication signal and separating those from the comprehensibility of that signal; and (2) using this model to identify how low video quality can get before the intelligibility of video content is sacrificed. Thus far, I have evaluated the use of mobile video communication among deaf and hard-of-hearing teenagers; developed two new power-saving algorithms; and quantified the battery savings and evaluated user perception when those algorithms are applied.

Introduction

Mobile video is becoming a technology of choice for communication among deaf and hard-of-hearing people in the United States because their native language is American Sign Language (ASL), a visual language. Most major U.S. cellular networks no longer provide unlimited data plans, or throttle network speeds in response to high data consumption. This limits the mainstream adoption of mobile video communication because video users consume network bandwidth faster than average data users. Currently, cellular phone companies do not subsidize the extra cost of mobile video communication incurred by deaf and hard-of-hearing people. The goal of my dissertation research is to use video compression algorithms to reduce bandwidth consumption and increase battery duration for mobile sign language video communication. My research will answer how low video quality can get, in terms of bitrate and frame rate, before intelligibility is compromised, and will quantify how much battery life can be extended. These findings will make mobile video communication more accessible while providing intelligible content, reducing bandwidth consumption, and extending cell phone battery life.
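To make the bandwidth stakes concrete, here is a minimal sketch (my illustration, not from the paper) of how much data a sign language call consumes at the bitrates evaluated later in this work; the ten-minute call length is an assumption.

```python
# Illustrative only: data consumed per call at candidate bitrates.
BITRATES_KBPS = [15, 30, 60, 120]  # bitrates studied later in this work
CALL_MINUTES = 10                  # assumed call length

for kbps in BITRATES_KBPS:
    # kilobits/s -> bytes/s, times call duration in seconds, to megabytes
    megabytes = kbps * 1000 / 8 * CALL_MINUTES * 60 / 1e6
    print(f"{kbps:3d} kbps -> {megabytes:5.2f} MB per {CALL_MINUTES}-minute call")
```

At 15 kbps a ten-minute call costs roughly 1 MB, versus about 9 MB at 120 kbps, which is why lowering the transmission rate matters so much under capped or throttled data plans.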

Related Work

Measuring human intelligibility and comprehension of a signal (audio or video) is challenging because it relies on multiple factors, such as common ground [4], signal quality, and the environment in which the signal is transmitted. Researchers have attempted to link higher objective signal quality to greater intelligibility of content, assuming that the user has sufficient knowledge and perceptual ability for comprehension [3,5]. One such measure of signal quality is peak signal-to-noise ratio (PSNR), which measures the quality of image reconstruction after lossy compression (a sketch of the computation follows this section). Huynh-Thu and Ghanbari [10] demonstrated that PSNR is a reasonable measurement when used across the same content; however, PSNR does not necessarily reflect comprehension, which is what matters most for sign language communication.

Other researchers have focused on measuring signal intelligibility with the intent that if one finds the signal intelligible, then comprehension of content may follow [6,7]. Often, intelligibility and comprehensibility are used interchangeably to validate signal quality. Furthermore, existing communication models focus only on the communication channel itself [9], without considering the environment or the human sender and receiver. Models that have attempted to do so are poorly defined and do not clearly identify the components of video intelligibility and comprehensibility [1,2].
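For reference, a minimal sketch of the standard PSNR computation for 8-bit frames (my illustration; NumPy is assumed):

```python
import numpy as np

def psnr(reference: np.ndarray, reconstructed: np.ndarray) -> float:
    """Peak signal-to-noise ratio in dB between two 8-bit frames.

    Standard definition: PSNR = 10 * log10(MAX^2 / MSE), with MAX = 255.
    """
    diff = reference.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # frames are identical
    return 10.0 * np.log10(255.0 ** 2 / mse)
```

Because PSNR averages pixel-level error over the whole frame, two videos can score identically while differing greatly in how legible the signer's hands and face are, which is why it is an unreliable proxy for sign language comprehension.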

Proposed Solution

A clear theoretical model for evaluating signal intelligibility and signal comprehensibility is lacking. I am developing a better theoretical model, the Human Signal Intelligibility (HSI) model, which distinguishes the components of signal intelligibility from those of signal comprehensibility when evaluating communication signals. The HSI model will (1) extend Shannon's theory of communication [9] to include the human and environmental influences on signal intelligibility and signal comprehensibility, and (2) identify the components that make up the intelligibility of a communication signal and separate those from the comprehensibility of that signal, as shown in Figure 1.

Figure 1: Block diagram of the Human Signal Intelligibility model. Note that the components comprising intelligibility are a subset of those comprising comprehension.

The distinction between signal intelligibility and signal comprehension is important because intelligibility does not imply comprehension. Intelligibility depends on signal quality, specifically on how the signal is captured, transmitted, received, and perceived by the receiver. Comprehension relies on signal quality and on the human receiver having the prerequisite knowledge to process the information. In the HSI model, the components that comprise intelligibility are therefore a subset of the components that comprise comprehension (sketched below). Identifying where intelligibility breaks down determines how much video quality can be reduced.

For the purposes of this work, I focus on mobile sign language video. Motivated by the HSI model, I will investigate how low sign language video quality can get before intelligibility is sacrificed. Specifically, I will reduce the frame rate and the bitrate at which video is transmitted, because these directly impact bandwidth consumption and the battery power needed to capture, transmit, and receive video.
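One way to read the subset relationship in Figure 1 is as follows; the component names here are my paraphrase for illustration, not an authoritative list from the model:

```python
# Illustrative encoding of the HSI model's structure; component names
# are paraphrased from Figure 1, not quoted from the paper.
INTELLIGIBILITY = {
    "signal capture",        # camera and encoder settings
    "transmission channel",  # bitrate, frame rate, packet loss
    "signal reception",      # decoder and display
    "human perception",      # receiver's senses and viewing environment
}

# Comprehension requires everything intelligibility does, plus the
# receiver's prerequisite knowledge (e.g., fluency in ASL).
COMPREHENSION = INTELLIGIBILITY | {"prerequisite knowledge"}

# The model's defining property: intelligibility is a proper subset.
assert INTELLIGIBILITY < COMPREHENSION
```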

Progress and Future Work

My Master's work created two new power-saving algorithms, quantified their battery savings, and evaluated user perception when those algorithms were applied [11,13]. The major finding was that the algorithm that extended battery life the most (reducing both the frame rate and the frame size of the video content) resulted in the least perceived change in video quality. I have also led a pilot field study with deaf and hard-of-hearing teenagers investigating use of an experimental smartphone application, MobileASL, that transmits video at extremely low bandwidths [8]. My colleagues and I found that mobile video communication was preferred over text messaging; however, participants reported that the short battery life limited their use of the technology. This and other findings further motivate the need for longer battery duration for successful mobile video communication.

Future work will include a full factorial mixed-methods study investigating how lowering video quality by reducing the rate at which video is transmitted (specifically, bitrates of 15, 30, 60, and 120 kbps crossed with frame rates of 1, 5, 10, and 15 fps) affects the intelligibility of sign language video and resource consumption. In a web study, I anticipate finding two specific frame rate and bitrate pairs: one where video quality begins to affect intelligibility too negatively, and one where increasing resource allocation no longer provides significant gains (i.e., a point of diminishing returns). A laboratory study will then use a subset of parameter settings between these two pairs to evaluate intelligibility in conversation. In pairs, participants will be video recorded signing to each other over MobileASL at lowered video qualities. In post-hoc analysis, intelligibility will be measured by counting the number of repair requests, the average number of turns associated with repair requests, and the number of conversational breakdowns (see the sketches below). Finally, battery life will be quantified once transmission rates are known.
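The condition grid for the factorial study is small enough to enumerate directly; the levels below are from the study design above, while the code framing is mine:

```python
from itertools import product

BITRATES_KBPS = [15, 30, 60, 120]
FRAME_RATES_FPS = [1, 5, 10, 15]

# Full factorial design: every bitrate crossed with every frame rate.
conditions = list(product(BITRATES_KBPS, FRAME_RATES_FPS))
assert len(conditions) == 16

for kbps, fps in conditions:
    print(f"bitrate = {kbps:3d} kbps, frame rate = {fps:2d} fps")
```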

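And a minimal sketch of how the conversational measures might be tallied; the event labels and transcript format are assumptions for illustration, not the study's actual coding scheme:

```python
from collections import Counter

# Hypothetical annotation: one label per conversational event.
# "repair" marks a request to repeat or clarify a sign;
# "breakdown" marks an exchange abandoned without resolution.
transcript = ["turn", "turn", "repair", "turn", "turn",
              "breakdown", "turn", "repair", "turn"]

counts = Counter(transcript)
print(f"repair requests: {counts['repair']}")
print(f"conversational breakdowns: {counts['breakdown']}")
print(f"turns per repair request: {counts['turn'] / counts['repair']:.1f}")
```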
Contributions

I foresee my research contributing to effective mobile sign language video communication and to the theory of video signal intelligibility and comprehension. This work builds on my prior work presented at ASSETS 2010 [11] and ASSETS 2011 [12], which demonstrates the continued importance of making mobile communication more accessible to everyone.

Goals for the Consortium

My proposed dissertation research was approved by my thesis committee in June 2012. Although the main goals are set, much remains to be determined in the specifics of the study design, such as how to control the variables that make up the HSI model and how to measure the effects of varied video quality on intelligibility. Validation of the HSI model relies on my proposed web and laboratory studies. Feedback on the HSI model, and learning best practices for the study design, will increase the validity and robustness of the model. At the DC, I would like to explore future directions and applications of the Human Signal Intelligibility model.

Conclusion

While I have made progress in my dissertation research, I believe the ASSETS 2012 Doctoral Consortium is the ideal community in which to present the Human Signal Intelligibility model and my empirical study designs. I am eager to receive feedback to make the model more robust, to meet fellow researchers in the accessibility field, and to contribute to thoughtful discussions.

References

[1] Barnlund, D. (2008). A transactional model of communication. Communication Theory, 47-57.
[2] Berlo, D. (1960). The Process of Communication. Holt, Rinehart, & Winston.
[3] Ciaramello, F. and Hemami, S. (2011). Quality vs. intelligibility: studying human preferences for American Sign Language video. SPIE Human Vision and Electronic Imaging, 7865.
[4] Clark, H. and Brennan, S. (1991). Grounding in communication. Perspectives on Socially Shared Cognition. American Psychological Association, 127-149.
[5] Feghali, R., Speranza, F., Wang, D., and Vincent, A. (2007). Video quality metric for bit rate control via joint adjustment of quantization and frame rate. IEEE Transactions on Broadcasting, 53(1), 441-446.
[6] Harrigan, K. (1995). The SPECIAL system: self-paced education with compressed interactive audio learning. Journal of Research on Computing in Education, 27(3), 361-370.
[7] Hooper, S., Miller, C., Rose, S., and Veletsianos, G. (2007). The effects of digital video quality on learner comprehension in an American Sign Language assessment environment. Sign Language Studies, 8(1), 42-58.
[8] Kim, J., Tran, J., Johnson, T., Ladner, R., Riskin, E., and Wobbrock, J. (2011). Effect of MobileASL on communication among deaf users. Extended Abstracts of the ACM Conference on Human Factors in Computing Systems (CHI '11), 2185-2190.
[9] Shannon, C.E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27, 379-423, 623-656.
[10] Huynh-Thu, Q. and Ghanbari, M. (2008). Scope of validity of PSNR in image/video quality assessment. Electronics Letters, 44(13), 800-801.
[11] Tran, J., Johnson, T., Kim, J., Rodriguez, R., Yin, S., Riskin, E., Ladner, R., and Wobbrock, J. (2010). A web-based user survey for evaluating power saving strategies for deaf users of MobileASL. Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '10), Orlando, FL, 115-122.
[12] Tran, J., Kim, J., Chon, J., Riskin, E., Ladner, R., and Wobbrock, J. (2011). Evaluating quality and comprehension of real-time sign language video on mobile phones. Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '11), Dundee, Scotland, UK, 115-122.
[13] Tran, J. (2010). Power Saving Strategies for Two-Way, Real-Time Video-Enabled Cellular Phones. M.S. thesis, Department of Electrical Engineering, University of Washington, Seattle.

About the Author

Jessica J. Tran is a Ph.D. candidate in Electrical Engineering (EE) at the University of Washington (UW), Seattle. She is tri-advised by Professor Eve A. Riskin (EE), Professor Richard E. Ladner (CSE), and Associate Professor Jacob O. Wobbrock (iSchool). She earned her M.S.E.E. in 2010 and her B.S.E.E. in 2008 from UW. Her research interests are in digital signal processing (video compression) and human-computer interaction. Her research contributes to the MobileASL project and focuses on making mobile video communication more accessible to deaf and hard-of-hearing people while considering the tradeoff between video quality and intelligibility. Website: http://bit.ly/jjtran