Modeling the Use of Space for Pointing in American Sign Language Animation

Jigar Gohel, Sedeeq Al-khazraji, Matt Huenerfauth
Rochester Institute of Technology, Golisano College of Computing and Information Sciences
jg2252@rit.edu; sha6709@rit.edu; matt.huenerfauth@rit.edu

Abstract

Based on motion recordings of humans, we model grammatically appropriate locations in space for a virtual human to point during American Sign Language animations.

Keywords

Deaf and Hard of Hearing, Emerging Assistive Technologies, Research & Development

Introduction

American Sign Language (ASL) is a primary means of communication for over 500,000 people in the United States (Mitchell et al., 2006), and standardized English reading-skills testing has revealed that many deaf adults in the U.S. have lower English literacy than their hearing peers (Traxler 2000). This can lead to accessibility barriers when these users encounter English text online that is written at too high a reading level. Given the challenges of re-recording videos of human signers whenever the information content of a website changes, animations are a more maintainable way to provide information in ASL online. In ASL, signers associate items under discussion with locations around their body (McBurney 2002; Meier 1990), which the signer may point to later in the discourse to refer to these items again. For instance, if a signer were discussing a favorite book, she might mention the title of the book once and then point at a location in space around her body. For the remainder of the conversation, she would not mention the title of the book again; instead, she would point to this location in space to refer to it. In this work, we model and predict the most natural locations for these spatial reference points (SRPs), based on recordings of human signers' movements. We evaluated ASL animations generated from the model in a user-based study.

Problem Statement

We would like to automate the creation of ASL animations as much as possible (to make it easier to provide ASL content online), yet prior sign language animation systems do not automatically predict where signers should place SRPs in space so that the animation appears natural. Linguists debate whether ASL signers constrain their selection of SRP locations to a semicircular arc region around their torso or use a wider variety of locations (McBurney 2002; Meier 1990), but in any case, there is no finite set of points in space where SRPs may be established.

Given this complexity, rather than defining a simplistic rule for where an animated human should establish SRPs, we investigate a data-driven prediction model. Specifically, in this work, we have extracted and modeled the locations of all SRPs established by human signers who were recorded in our pre-existing ASL motion-capture corpus (Lu and Huenerfauth 2012b).

Literature Review

To justify why designers of ASL animations must accurately model the use of SRPs, Huenerfauth and Lu (2012c) conducted studies in which native ASL signers evaluated ASL animations with and without the establishment of SRPs. The authors found that adding these pointing-to-locations-in-space phenomena to ASL animations led to a significant improvement in users' comprehension of the animations and to more positive subjective ratings of the animations' quality. While early work on sign language animation used rule-based approaches, as more sign language corpora have become available, recent work has turned to data-driven modeling (Lu and Huenerfauth 2010; Morrissey and Way 2005; Segouat and Braffort 2009), e.g., using ASL corpus data to predict the movements of a signer's hands during inflecting verbs (Lu and Huenerfauth 2012a). However, no previous research has predicted SRP locations using data-driven modeling.

Modeling Methodology

From 2009 to 2013, our lab recorded 246 unscripted, multi-sentence, single-signer ASL performances (Lu and Huenerfauth 2010) from 8 human signers during 11 recording sessions, and the first 98 of these recordings were released with linguistic annotation of the timing of individual words and SRPs (Lu and Huenerfauth 2012b). In this work, using these 98 annotated recordings, we have extracted the right-hand index-fingertip location and the 3D vector along which the finger is pointing, during the time periods in the corpus when the human signer was pointing to establish an SRP location in the 3D space around the body.

We reduced the dimensionality of the modeling task as follows: we imagined a 2D plane in front of the human signer (at a distance of twice the human's arm length), and we identified where the 3D finger-pointing vector would intersect with this 2D plane, thereby producing a 2D coordinate on the plane that represents where the human was pointing. (See the side-view image in Figure 1.)

Fig. 1. Clusters Projected onto 3D Virtual Human Space (see Figure 2 for Detail)

To combine data from multiple human signers of different body sizes, we normalized based on: the human's shoulder width (in the left-right axis), the human's shoulder height (in the up-down axis), and the human's arm length (to determine the position of the 2D intersection plane in the forward-backward axis).
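To make the projection and normalization steps concrete, the following is a minimal Python sketch; the paper does not publish its pipeline, so the function name, axis convention, and example values below are illustrative assumptions, not the authors' code. It intersects a recorded fingertip position and pointing direction with a plane two arm-lengths in front of the signer, then normalizes the resulting 2D coordinate by shoulder width and shoulder height.

```python
import numpy as np

def project_and_normalize(fingertip, direction, shoulder_width,
                          shoulder_height, arm_length):
    """Project a pointing ray onto a plane 2 arm-lengths in front of the
    signer, then normalize the 2D intersection point by body dimensions.

    Assumed axis convention for this sketch: x = left-right, y = up-down,
    z = forward (away from the signer's chest).
    """
    fingertip = np.asarray(fingertip, dtype=float)
    direction = np.asarray(direction, dtype=float)

    # The plane sits at z = 2 * arm_length; find t where the ray crosses it.
    plane_z = 2.0 * arm_length
    if abs(direction[2]) < 1e-9:
        return None  # ray is parallel to the plane; no intersection
    t = (plane_z - fingertip[2]) / direction[2]
    if t < 0:
        return None  # pointing away from the plane
    hit = fingertip + t * direction

    # Normalize left-right by shoulder width and up-down relative to
    # shoulder height, so data from different signers are comparable.
    x_norm = hit[0] / shoulder_width
    y_norm = (hit[1] - shoulder_height) / shoulder_height
    return np.array([x_norm, y_norm])

# Hypothetical example: fingertip 0.4 m in front of the chest, pointing
# slightly to the right and upward.
point_2d = project_and_normalize(
    fingertip=[0.10, 1.35, 0.40],
    direction=[0.30, 0.10, 1.00],
    shoulder_width=0.40, shoulder_height=1.45, arm_length=0.65)
print(point_2d)
```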

To model this 2D data representing where the human signers pointed when they set up individual SRPs, we use a Gaussian Mixture Model (GMM), implemented in R, to identify clusters in the signing space where signers tended to place items. A GMM with three clusters was the best fit to the data.

Fig. 2. Visualization of the Shape of GMM Clusters (see Figure 1 for scale)

While Figure 1 visualized the clusters overlaid on a virtual human character, to convey scale, Figure 2 provides a more detailed view of the clusters, to better convey their shape. In Figure 2, the color of the data points (and the redundant dotted-line boundaries) indicates cluster membership. The centroid of each cluster is indicated by a lighter yellow dot within each cluster. A slightly smaller proportion of points fell in cluster 2 (red).
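The paper states that the GMM was implemented in R; as a rough stand-in, this Python sketch uses scikit-learn's GaussianMixture to fit mixtures with one to five components to the normalized 2D points, pick the component count by BIC (the paper reports that three clusters fit best, though its exact model-selection criterion is not stated here), and sample a new pointing location from the fitted mixture, as is done for the GMM animation condition in the study below. The placeholder data array is an assumption for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# points_2d: normalized (x, y) plane-intersection coordinates, one row per
# SRP pointing event extracted from the corpus (placeholder data here).
rng = np.random.default_rng(0)
points_2d = rng.normal(size=(200, 2))

# Fit GMMs with 1..5 components and keep the one with the lowest BIC.
candidates = [GaussianMixture(n_components=k, covariance_type='full',
                              random_state=0).fit(points_2d)
              for k in range(1, 6)]
best = min(candidates, key=lambda m: m.bic(points_2d))
print('selected number of clusters:', best.n_components)

# Cluster membership and centroids (cf. Figure 2).
labels = best.predict(points_2d)
centroids = best.means_

# Sample a new pointing target from the mixture, e.g. to place an SRP
# for a virtual human in a generated animation.
new_point, _ = best.sample(1)
print('sampled SRP location on the plane:', new_point[0])
```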

Evaluation Study

A user-based study was conducted to evaluate animations produced using this model, with participation from fluent ASL signers who self-identified as Deaf/deaf, recruited at Rochester Institute of Technology. Of the 30 participants recruited for the study, 20 were male and 10 were female, and their ages ranged from 19 to 30 (average: 24). Of the 30 participants, 18 had used ASL since birth. Eleven participants grew up in a household with parents who knew ASL, 6 have parents or family members who are deaf, 28 use ASL at college, and all reported Deaf community connections (e.g., friends, a significant other, social groups).

During the study, each participant viewed animations of ASL in which a virtual human character performed 16 short (5-10 sentence) stories, originally created as stimuli in (Lu and Huenerfauth 2012a). Each story was produced in the following three versions:

ARTIST: The first version was designed by an ASL-expert 3D artist using a commercial animation tool, as described in (Lu and Huenerfauth 2012a). These were intended as an upper baseline (topline), i.e., high-quality animations carefully produced by a skilled human.

NO-POINTING: In this version, all pointing to SRP locations (to refer to previously mentioned people or things) was replaced with the virtual human simply re-stating the name of the person or thing throughout the story. These were intended as a lower baseline, i.e., as would be produced by an animation system that could not select SRP locations.

GMM: The ARTIST version of the story was edited so that all of the SRP pointing in the story (originally specified by the artist) was instead replaced by pointing to locations sampled from our GMM model.

Each participant saw all 16 stories, with a third of the stories in each of the 3 conditions, counterbalanced across the study. After viewing each animation, the participant answered 4 comprehension questions about the information in that story (64 questions in total) and was asked three 1-to-10 scalar-response subjective questions about three aspects of the animation's quality: its grammatical correctness, how easy it was to understand, and how natural its movement appeared.

We hypothesized that (H1) the subjective ratings and the comprehension-question scores for the GMM condition would be higher than for the NO-POINTING baseline animations, and (H2) the ARTIST and GMM conditions would have statistically equivalent scores. To evaluate H1, we conducted a Mann-Whitney test on the subjective scores and a t-test on the comprehension-question scores, but we did not observe a statistically significant difference (p > 0.05) between the GMM model and the NO-POINTING baseline condition; H1 was not supported. To evaluate H2, a Two One-Sided Test (TOST) revealed equivalence between the ARTIST and GMM conditions for both the comprehension-question scores (comparison margin = 10% response accuracy) and the subjective questions (comparison margin = 1.0 units on the 1-to-10 scale); H2 was supported. This result indicates that animations with pointing based on the GMM model were perceived as similar in quality to the ARTIST animations and led to similar comprehension-question scores.
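As an illustration of the equivalence analysis, the sketch below runs a two one-sided tests (TOST) procedure on hypothetical rating arrays for the ARTIST and GMM conditions, using the 1.0-point margin reported for the 1-to-10 subjective scales. The data and the choice of statsmodels' ttost_ind are assumptions for illustration, not the paper's analysis code.

```python
import numpy as np
from statsmodels.stats.weightstats import ttost_ind

# Hypothetical 1-to-10 subjective ratings for two conditions (placeholder data).
rng = np.random.default_rng(1)
artist_scores = rng.normal(loc=7.2, scale=1.5, size=30)
gmm_scores = rng.normal(loc=7.0, scale=1.5, size=30)

# TOST: the conditions are considered equivalent if the difference in means
# lies within the pre-specified margin of +/- 1.0 scale units.
p_overall, lower_test, upper_test = ttost_ind(
    artist_scores, gmm_scores, low=-1.0, upp=1.0)

print(f'TOST p-value: {p_overall:.3f}')
if p_overall < 0.05:
    print('Conditions are statistically equivalent within the margin.')
else:
    print('Equivalence within the margin is not supported.')
```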

Conclusion and Future Work

We have presented a method for modeling where human signers tend to point in 3D space when performing spatial reference in ASL, and we have demonstrated how the model can predict pointing locations when generating ASL animations. Fluent ASL signers rated these animations as similar in quality to animations in which the pointing had been selected by an ASL-expert animation artist. This result contributes to our lab's overall goal of automating the synthesis of ASL animations, which could lead to greater availability of ASL content on websites and improve information accessibility for ASL users. In future work, we plan to model additional phenomena from our motion-capture recordings to synthesize ASL animations.

Works Cited

Huenerfauth, Matt, and Pengfei Lu. "Effect of Spatial Reference and Verb Inflection on the Usability of American Sign Language Animations." Universal Access in the Information Society, vol. 11, no. 2, 2012c, pp. 169-184.

Lu, Pengfei, and Matt Huenerfauth. "Collecting a Motion-Capture Corpus of American Sign Language for Data-Driven Generation Research." Proceedings of the 1st SLPAT Workshop, at HLT-NAACL 2010, Los Angeles, CA, 2010.

Lu, Pengfei, and Matt Huenerfauth. "CUNY American Sign Language Motion-Capture Corpus: First Release." Proceedings of the 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, The 8th International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey, 2012b.

Lu, Pengfei, and Matt Huenerfauth. "Learning a Vector-Based Model of American Sign Language Inflecting Verbs from Motion-Capture Data." Proceedings of the 3rd SLPAT Workshop, at NAACL-HLT 2012, Montreal, Quebec, Canada, 2012a.

McBurney, Susan Lloyd. "Pronominal Reference in Signed and Spoken Language: Are Grammatical Categories Modality-Dependent?" Modality and Structure in Signed and Spoken Languages, 2002, pp. 329-369.

Meier, Richard P. "Person Deixis in American Sign Language." Theoretical Issues in Sign Language Research, vol. 1, 1990, pp. 175-190.

Mitchell, Ross E., et al. "How Many People Use ASL in the United States? Why Estimates Need Updating." Sign Language Studies, vol. 6, no. 3, 2006, pp. 306-335.

Morrissey, Sara, and Andy Way. "An Example-Based Approach to Translating Sign Language." Proceedings of the Workshop on Example-Based Machine Translation, 2005, pp. 109-116.

Segouat, Jérémie, and Annelies Braffort. "Toward the Study of Sign Language Coarticulation: Methodology Proposal." 2009 Second International Conferences on Advances in Computer-Human Interactions (ACHI '09), IEEE, 2009.

Traxler, Carol Bloomquist. "The Stanford Achievement Test: National Norming and Performance Standards for Deaf and Hard-of-Hearing Students." Journal of Deaf Studies and Deaf Education, vol. 5, no. 4, 2000, pp. 337-348.