Helping students keep up with real-time captions by pausing and highlighting

Walter S. Lasecki, Raja Kushalnagar, Jeffrey P. Bigham · 2014 · Proceedings of the 11th Web for All Conference (W4A) · doi:10.1145/2596695.2596701

Summary

This paper addresses a fundamental problem with real-time captioning for deaf and hard of hearing (DHH) students: the mismatch between speaking rates (approximately 170 words per minute) and reading rates, which causes students to fall progressively behind the live content. The problem is compounded in classroom settings, where students must split attention between captions and visual content such as slides and demonstrations. Eye-tracking data showed that DHH students spend the majority of their time reading captions and have 3-4 times less time than hearing peers to process visual materials. When students look away from captions to view referenced visual content, they lose their place and must search through the scrolling transcript to find where they left off, often missing additional content during this recovery. The authors developed and tested two tools. The first is a pausing caption player that lets users temporarily stop caption playback (via hold-to-pause or toggle), with controls to fast-forward or jump to live. The second is a highlighting player that marks the last word read with a yellow highlight when the user presses a key, while captions continue to update normally.
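
The behavioral difference between the two tools can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, method names, and the use of square brackets to stand in for the yellow highlight are all assumptions made for illustration. The point it shows is that the highlighting player never stops the live text, whereas the pausing player holds back words that arrive during a pause.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HighlightingPlayer:
    """Hypothetical model: captions keep arriving; a key press only sets a bookmark."""
    words: List[str] = field(default_factory=list)
    marked: Optional[int] = None  # index of the word the reader marked, if any

    def receive(self, word: str) -> None:
        # Live captions are always appended and always visible.
        self.words.append(word)

    def mark(self, index: int) -> None:
        # Record the last word read; ignore out-of-range indices.
        if 0 <= index < len(self.words):
            self.marked = index

    def render(self) -> str:
        # Brackets stand in for the yellow highlight (an assumption of this sketch).
        return " ".join(
            f"[{w}]" if i == self.marked else w
            for i, w in enumerate(self.words)
        )


@dataclass
class PausingPlayer:
    """Hypothetical model: while paused, new words are buffered but not shown."""
    words: List[str] = field(default_factory=list)
    shown: int = 0        # number of words currently displayed
    paused: bool = False

    def receive(self, word: str) -> None:
        self.words.append(word)
        if not self.paused:
            self.shown = len(self.words)

    def pause(self) -> None:
        self.paused = True

    def fast_forward(self, count: int = 1) -> None:
        # Reveal buffered words without overshooting the live position.
        self.shown = min(self.shown + count, len(self.words))

    def jump_to_live(self) -> None:
        self.paused = False
        self.shown = len(self.words)

    def render(self) -> str:
        return " ".join(self.words[: self.shown])


# Usage: the reader marks "the", looks away, and the live text keeps growing.
h = HighlightingPlayer()
for w in "look at the next slide".split():
    h.receive(w)
h.mark(2)
h.receive("please")
print(h.render())  # look at [the] next slide please
```

In this sketch the pausing player's `render` output stops growing after `pause()` until `fast_forward` or `jump_to_live` is called, which mirrors the participants' complaint that pausing hid what was being said at that moment.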

Key findings

Testing with 25 DHH students from the National Technical Institute for the Deaf showed that both tools improved comprehension test scores, but highlighting was significantly more effective. Highlighting yielded a 14.56% improvement over baseline (p < 0.001), while pausing yielded a 7.32% improvement that was not statistically significant. The difference between the two was also significant (the highlighting improvement was 98.79% larger, p < 0.01). No participant's score decreased with the highlighting tool, and only two participants' scores decreased with the pausing tool, suggesting the tools are reliably beneficial. Users strongly preferred highlighting over pausing: pausing prevented them from seeing what was currently being said, and the fast-forwarded text shown after a pause was difficult to read. As one participant noted: "Pausing captions meant that I could not see what was being said NOW." When offered both tools simultaneously, most users selected one (usually highlighting) and stuck with it. The highlighting approach maps onto a familiar mental model: using a placeholder in one information source while attending to another, similar to keeping a finger in a book while answering the phone.

Relevance

This research addresses a problem that affects millions of DHH students and extends to anyone using captions, including people in noisy environments, non-native speakers, and online learners. The key insight is that caption accessibility is not just about providing text; it is about giving users control over the temporal flow of that text relative to other information sources. The highlighting solution is elegant because it preserves the real-time nature of captions (users can still see new content arriving) while providing an anchor point for gaze-switching. This has direct implications for the design of video players, remote learning platforms, and live event captioning interfaces. The eye-tracking data documenting how DHH students actually read captions in classroom settings, spending most of their time on captions with little left for slides, provides compelling evidence that multimodal learning environments need caption-specific interaction tools, not just caption provision.

Tags: deaf and hard of hearing · captioning · real-time captioning · education · inclusive classrooms · reading speed · gaze switching