Comparing Native Signers' Perception of American Sign Language Animations and Videos via Eye Tracking
Hernisa Kacorri, Allen Harper, Matt Huenerfauth · 2013 · Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS) · doi:10.1145/2513383.2513441
Summary
This paper investigates whether eye tracking can serve as an alternative or complementary evaluation method for assessing the quality of synthesized American Sign Language (ASL) animations. Computer-generated ASL animations offer accessibility benefits for deaf individuals with limited English literacy — a significant population, given that the majority of deaf high school graduates read at approximately a fourth-grade level. While videos of human signers can present ASL content, animations are more practical for dynamically generated or frequently updated information on websites and documents. The researchers conducted a study with 11 native ASL signers who viewed three types of stimuli: video recordings of a human signer, animations with facial expressions (the "Model" condition), and animations without facial expressions (the "Non" condition). Each stimulus presented a short ASL story, after which participants answered subjective evaluation questions (grammaticality, understandability, naturalness of movement) and comprehension questions. Two eye-tracking metrics were measured: proportional fixation time on the face area of interest (FacePFT), and the frequency of gaze transitions between the face and hands/body areas (TransFH). The study tested five hypotheses about how these metrics relate to stimulus type and participant responses.
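As a rough illustration of the two metrics described above, the following sketch computes them from a list of fixations that have already been labeled with an area of interest. It is not the authors' implementation. The `Fixation` type, the AOI labels ("face", "hands", "other"), the choice to divide face fixation time by total stimulus duration, the per-second normalization of transitions, and the decision to ignore fixations outside both areas when counting transitions are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fixation:
    """One fixation, labeled with a hypothetical area-of-interest name."""
    start: float  # seconds from stimulus onset
    end: float    # seconds from stimulus onset
    aoi: str      # "face", "hands", or "other" (labels assumed for this sketch)


def face_pft(fixations: List[Fixation], stimulus_duration: float) -> float:
    """Proportional fixation time on the face AOI.

    Assumption: the denominator is the total stimulus duration.
    """
    if stimulus_duration <= 0:
        raise ValueError("stimulus_duration must be positive")
    face_time = sum(f.end - f.start for f in fixations if f.aoi == "face")
    return face_time / stimulus_duration


def trans_fh(fixations: List[Fixation], stimulus_duration: float) -> float:
    """Face <-> hands/body gaze transitions per second of stimulus.

    Assumption: fixations outside both AOIs are skipped, so a
    face -> other -> hands sequence counts as one transition.
    """
    if stimulus_duration <= 0:
        raise ValueError("stimulus_duration must be positive")
    ordered = sorted(fixations, key=lambda f: f.start)
    sequence = [f.aoi for f in ordered if f.aoi in ("face", "hands")]
    transitions = sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    return transitions / stimulus_duration
```

Under these assumptions, a higher `face_pft` and a lower `trans_fh` for a given stimulus would correspond to the more face-focused gaze pattern that the study associates with video of a human signer.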
Key findings
When viewing videos of human signers, participants spent significantly more time fixating on the face and made fewer gaze transitions between the face and hands than when viewing animations. This supports hypothesis H1 and suggests that higher-quality ASL presentation leads to more face-focused gaze. No significant difference was found between animations with and without facial expressions (H2 not supported), possibly because the facial expression synthesis model was too simplistic to meaningfully affect viewing behavior, or because participants mentally grouped both animation types as the "same" virtual character. The most practically useful finding was that, for animations, both eye-tracking metrics correlated significantly with subjective quality ratings (H3 supported): signers who rated animations as more grammatical, understandable, and natural also spent more time looking at the face and made fewer face-to-hand transitions. This is reported as the first published result linking eye-tracker metrics to subjective judgments of sign language animation quality. However, the eye-tracking metrics did not correlate with whether participants noticed facial expressions (H4) or with comprehension accuracy (H5), suggesting that these metrics capture perceived quality rather than information extraction.
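The H3 result relates a per-stimulus eye-tracking metric to an ordinal subjective rating. This summary does not state which correlation statistic the authors used. The sketch below assumes a rank correlation (Spearman's rho), a common choice for Likert-style ratings, and implements it with the standard library only. The function names are hypothetical, and the code says nothing about statistical significance.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(metric: Sequence[float], ratings: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(metric) != len(ratings) or len(metric) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    return pearson_r(average_ranks(metric), average_ranks(ratings))
```

Called with one face fixation proportion and one rating per animation viewing, a positive `spearman_rho` would match the direction of the reported relationship for time on the face, and a negative value would match it for the transition metric.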
Relevance
This research offers methodological guidance for researchers developing sign language animation technology. The observed correlation between eye-gaze patterns and subjective quality ratings suggests that eye tracking can supplement, or partially replace, traditional questionnaire-based evaluation. This is particularly valuable in contexts where interrupting participants with questions would disrupt natural viewing or artificially draw attention to specific animation features. For accessibility practitioners, the study reinforces that facial expressions are integral to ASL communication and that animation quality matters for deaf users' engagement with content. The practical recommendations for conducting eye-tracking studies with deaf participants (embedding instructions and questionnaires in the stimuli application, positioning researchers behind the screen, and carefully sizing stimuli) address important considerations for inclusive research design. As automated ASL synthesis continues to improve, evaluation methods of this kind can help ensure that animation quality meets the needs of deaf users.
Tags: sign language · American Sign Language · animation · eye tracking · deaf accessibility · facial expression · user study · sign language synthesis