Effect of Caption Width on the TV User Experience by Deaf and Hard of Hearing Viewers

Abraham Glasser, Joseline Garcia, Chang Hwang, Christian Vogler, Raja Kushalnagar · 2021 · Proceedings of the 18th International Web for All Conference (W4A) · doi:10.1145/3430263.3452435

Summary

This study from Rochester Institute of Technology and Gallaudet University investigates how caption line width affects the viewing experience for Deaf and hard of hearing (DHH) viewers. Captioning technology has not kept pace with the shift from broadcast TV to personal devices with widely varying screen sizes, and current guidelines are based on decades-old research. Two examples are the six-seconds rule (a two-line subtitle displayed for six seconds, approximately 140-150 WPM or 12 characters per second) and the DCMP Captioning Key recommendation of 32 characters per line. The authors tested three caption widths: 1 word per line (rapid serial visual presentation, or RSVP, style), 6 words per line (the previously suggested optimal width), and 12 words per line (closer to traditional full-width captions). These widths were crossed with three speech speeds (140, 180, and 240 WPM), producing 27 stimulus videos that 14 DHH college students from Gallaudet University viewed in a counterbalanced between-subjects design. After each 30-second clip, participants rated ease of following the video action, ease of following the captions, caption width preference, perceived speed, and comfort on 5-point Likert scales.

Key findings

DHH viewers showed no significant preference difference between 6-word and 12-word caption lines: both were rated similarly for ease of following the video, ease of following the captions, and overall comfort. However, single-word (RSVP-style) captions were significantly dispreferred compared to both 6-word and 12-word lines across all speed conditions (p < .001 to p < .0001 for comfort ratings). With single-word captions, participants found it harder to follow the video action, especially at 180 and 240 WPM, because the rapidly flashing words demanded constant attention and forced a difficult split between reading captions and watching the video. Participants also perceived single-word captions as too fast at all speeds, even though the actual WPM at each speed level was the same across caption widths. At 6-word and 12-word widths, perceived speed was neutral (around 3 on the 5-point scale). The authors suggest RSVP may be more suitable for short text where split attention is not an issue. Viewers appeared to adjust their reading to caption width: they read 12-word lines more slowly, leaving more time to switch to the video, and read 6-word lines more quickly with less switching, which resulted in a similar overall experience.
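To make the "rapidly flashing" effect concrete, the following minimal sketch computes how long one caption line would stay on screen for each width and speed tested. It assumes, purely for illustration, that each line is displayed for exactly as long as it takes to speak its words (seconds = 60 × words per line / WPM); the paper's actual timing method may differ, and the function names are hypothetical.

```python
# Illustrative arithmetic only. Assumption: a caption line stays on screen
# for exactly the time needed to speak it at the given speech rate.
WIDTHS = (1, 6, 12)        # words per line tested in the study
SPEEDS = (140, 180, 240)   # speech speeds (WPM) tested in the study


def line_duration_s(words_per_line: int, wpm: int) -> float:
    """Seconds one caption line is visible under the stated assumption."""
    return 60.0 * words_per_line / wpm


def duration_table() -> dict:
    """Map each (width, speed) pair to its per-line display time in seconds."""
    return {(w, s): line_duration_s(w, s) for w in WIDTHS for s in SPEEDS}


if __name__ == "__main__":
    for (w, s), d in duration_table().items():
        print(f"{w:>2} words/line at {s} WPM: {d:.2f} s per line")
```

Under this assumption, a single-word line at 240 WPM changes every 0.25 seconds, whereas a 12-word line at the same speed stays up for 3 seconds, which is consistent with participants reporting that single-word captions left little time to look at the video.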

Relevance

This research has direct practical implications for anyone implementing captions on video content across different devices and screen sizes. The key takeaway is that the current guideline of approximately 6 words per line remains a solid minimum, and wider captions of up to 12 words per line are equally acceptable. This gives designers flexibility across device sizes without degrading the DHH viewing experience. The strong rejection of single-word RSVP-style captions is particularly important: the approach has been promoted for faster reading in some contexts, but it clearly fails for video captioning, where viewers must divide attention between text and visual content. The finding that viewers self-regulate their reading pace based on caption width suggests that captioning systems should focus on providing adequate line length rather than trying to optimize display timing. For practitioners implementing customizable captions, offering width options between 6 and 12 words per line appears to cover the preference range without introducing the attention-splitting problems of narrower presentations.
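As one way to act on this, a customizable caption renderer could clamp the viewer's requested width to the 6-to-12-word range before breaking text into lines. The sketch below is a hypothetical illustration, not code from the paper: the names `clamp_width` and `split_into_lines` are assumptions, and it breaks on whitespace only, ignoring the phrase-boundary and character-limit rules that real captioning guidelines also apply.

```python
from typing import List

# Range of line widths (in words) that the study found equally acceptable.
MIN_WORDS, MAX_WORDS = 6, 12


def clamp_width(requested: int) -> int:
    """Keep a user-requested width inside the 6-12 word range."""
    return max(MIN_WORDS, min(MAX_WORDS, requested))


def split_into_lines(text: str, words_per_line: int) -> List[str]:
    """Break caption text into lines of at most the clamped word count."""
    width = clamp_width(words_per_line)
    words = text.split()
    return [" ".join(words[i:i + width]) for i in range(0, len(words), width)]


if __name__ == "__main__":
    sample = "viewers adjusted their reading pace to the width of each caption line"
    for line in split_into_lines(sample, 1):  # a request for 1 is raised to 6
        print(line)
```

Clamping rather than rejecting out-of-range requests keeps the setting forgiving while still preventing the single-word presentation that participants disliked.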

Tags: captioning · deaf and hard of hearing · video accessibility · user experience · media accessibility · video customization

Standards referenced: DCMP Captioning Key