Evaluating Intelligibility and Battery Drain of Mobile Sign Language Video Transmitted at Low Frame Rates and Bit Rates

Jessica J. Tran, Eve A. Riskin, Richard E. Ladner, Jacob O. Wobbrock · 2015 · ACM Transactions on Accessible Computing (TACCESS) · doi:10.1145/2797142

Summary

This paper investigates the lower limits of intelligible sign language video on mobile devices, seeking the minimum frame rate and bit rate at which deaf users can still understand ASL video conversations. Mobile video communication is essential for deaf and hard-of-hearing people who use sign language as their primary language, but high transmission rates cause network congestion, delayed video, and rapid battery drain, all of which degrade or prevent communication. The authors conducted a web-based study with 1,140 fluent ASL respondents who viewed sign language videos transmitted at four low frame rates (1, 5, 10, and 15 fps) crossed with four low bit rates (15, 30, 60, and 120 kbps) at a constant resolution of 320×240 pixels. Participants rated perceived intelligibility on 5-point Likert scales and answered comprehension questions. The study was designed using the Human Signal Intelligibility Model (HSIM), a new conceptual framework the authors developed that distinguishes between signal intelligibility (the physical clarity of the video signal) and signal comprehension (the viewer's ability to understand the meaning, influenced by language proficiency and context). The HSIM guided decisions about which experimental variables to hold constant and which to manipulate. The survey itself was designed to be linguistically accessible to deaf participants: all instructions were presented in ASL video alongside English text, and demographic questions about ASL fluency were used to filter for fluent signers. A separate battery drain experiment measured power consumption on a Samsung Galaxy S3 while transmitting video at reduced frame rates and bit rates.

Key findings

The study found an "intelligibility ceiling effect": increasing the frame rate above 10 fps did not improve perceived intelligibility (10 fps actually received higher mean Likert scores than 15 fps across all bit rates), and increasing the bit rate above 60 kbps produced diminishing returns with no statistically significant improvement. The surprising result that 10 fps outperformed 15 fps is explained by the fixed bit rate constraint: at 10 fps more bits are allocated to each frame, and the resulting higher per-frame quality compensates for the reduced temporal resolution. At 1 fps, video "bottomed out" and was rated significantly worse. Comprehension accuracy was affected by bit rate (chi-square p < 0.0001) but not by frame rate when averaged across bit rates. Together, these results suggest that relaxing transmission parameters to 10 fps at 60 kbps, which is 40% of the frame rate and 60% of the bit rate in the ITU-T recommended standard of 25 fps at 100 kbps, would provide intelligible video while substantially reducing bandwidth consumption. The battery experiment confirmed that battery life increases monotonically as transmission rates are reduced: at 5 fps/25 kbps the estimated battery life was 258 minutes, compared to 177 minutes at 30 fps/150 kbps, a 46% improvement. Even so, video transmission depleted a full battery in roughly 3 to 4 hours at every setting, demonstrating the computational intensity of mobile video.
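The per-frame bit budget and the battery improvement reported above can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, 1 kbps is assumed to mean 1,000 bits per second, and the average bits per frame ignores codec details such as keyframes and rate control.

```python
# Illustrative arithmetic only; names are hypothetical and not from the paper.
# Assumption: 1 kbps = 1000 bits per second; codec overhead is ignored.


def bits_per_frame(bit_rate_kbps: float, frame_rate_fps: float) -> float:
    """Average number of bits available to encode one frame."""
    return bit_rate_kbps * 1000 / frame_rate_fps


def percent_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100


# Per-frame budget at 60 kbps for the two frame rates being compared.
budget_10fps = bits_per_frame(60, 10)
budget_15fps = bits_per_frame(60, 15)

# Battery life figures as reported in the review, in minutes.
battery_gain = percent_gain(baseline=177, improved=258)

print(f"10 fps at 60 kbps: {budget_10fps:.0f} bits per frame")
print(f"15 fps at 60 kbps: {budget_15fps:.0f} bits per frame")
print(f"Battery life gain: {battery_gain:.0f}%")
```

Under these assumptions, each frame at 10 fps gets 1.5 times the bits of a frame at 15 fps for the same bit rate, which is the trade-off the review describes, and the battery figures reproduce the stated 46% improvement.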

Relevance

This research has direct practical implications for mobile video communication platforms used by the deaf community, such as Sorenson VRS, Convo, and ZVRS, as well as for general-purpose apps like FaceTime and Skype when used for sign language. The finding that intelligible ASL communication is possible well below standard transmission rates (10 fps at 60 kbps against the recommended 25 fps at 100 kbps) opens significant opportunities: reduced data costs for deaf users, who rely on video calls as their equivalent of voice calls; extended battery life for longer conversations; and improved call quality in low-bandwidth areas or on limited data plans. For telecom policy, these findings support arguments that deaf users should receive data plan accommodations equivalent to the unlimited voice minutes provided to hearing users. The HSIM framework is a reusable conceptual contribution applicable to any evaluation of video or audio signal quality for communication. The linguistically accessible web survey methodology, which presents all content bilingually in ASL video and English, serves as a model for inclusive research design when working with deaf populations.

Tags: deaf and hard of hearing · sign language · video communication · mobile accessibility · bandwidth · video compression · intelligibility

Standards referenced: ITU-T