Personalized and Accessible TV Interaction for People with Visual Impairments

Daniel Costa, Carlos Duarte · 2019 · Proceedings of the 16th International Web for All Conference (W4A) · doi:10.1145/3315002.3317566

Summary

This paper presents the design and implementation of a system that makes Connected TV applications accessible to people with visual impairments by using a smartphone as an accessible second-screen controller. Connected TVs and set-top boxes now offer interactive features beyond broadcast content, including electronic programme guides, catch-up services, and app stores with applications such as Netflix, YouTube, and Facebook, but these features are heavily visual and provide no feedback that blind users can perceive. TV remote controls add barriers of their own: small text, poor labeling, and a lack of tactile feedback.

The system architecture distributes components between the set-top box (STB) and a mobile device. On the STB, a segmentation module and a UIML (User Interface Markup Language) builder extract the TV application's user interface structure and convert it into a standardized description that is sent to the mobile device. The Android mobile application parses this UIML document and conveys the TV interface content to the user through speech synthesis, remaining compatible with TalkBack. Users control TV navigation through the smartphone, a device they already know and one with robust accessibility features, using buttons arranged in a cross layout (directional keys, OK, Localize, Read Screen, Speech Command) with high contrast and generous sizing. Multiple input modalities are supported: TalkBack gestures, on-screen touch, mid-air gestures, and speech commands, all unified through a multimodal component. Two feedback modes are available: Concise (e.g., "Favorite Videos, 3 of 5") and Verbose (adding orientation and full menu contents).
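To make the UIML-to-speech step concrete, here is a minimal Python sketch of how a mobile client might turn a UI description into Concise or Verbose feedback. It is an illustration, not the authors' implementation (their client is an Android app). The sample document, the `label` and `orientation` property names, the `load_menu` and `describe` helpers, and the exact Verbose wording are all assumptions; only the Concise string format comes from the paper's example.

```python
import xml.etree.ElementTree as ET

# Invented UIML-like fragment for illustration; the review does not
# reproduce the documents the STB actually sends.
SAMPLE_UIML = """
<uiml>
  <interface>
    <structure>
      <part id="main_menu" class="Menu">
        <part id="item1" class="MenuItem"/>
        <part id="item2" class="MenuItem"/>
        <part id="item3" class="MenuItem"/>
        <part id="item4" class="MenuItem"/>
        <part id="item5" class="MenuItem"/>
      </part>
    </structure>
    <style>
      <property part-name="main_menu" name="orientation">horizontal</property>
      <property part-name="item1" name="label">Live TV</property>
      <property part-name="item2" name="label">Guide</property>
      <property part-name="item3" name="label">Favorite Videos</property>
      <property part-name="item4" name="label">Apps</property>
      <property part-name="item5" name="label">Settings</property>
    </style>
  </interface>
</uiml>
"""


def load_menu(uiml_text, menu_id):
    """Return (item labels, orientation) for one menu in the document."""
    root = ET.fromstring(uiml_text.strip())
    # Index every style property by (part id, property name).
    props = {
        (p.get("part-name"), p.get("name")): (p.text or "").strip()
        for p in root.iter("property")
    }
    menu = next(p for p in root.iter("part") if p.get("id") == menu_id)
    labels = [props[(child.get("id"), "label")] for child in menu.findall("part")]
    return labels, props.get((menu_id, "orientation"), "unknown")


def describe(labels, orientation, focused, verbose=False):
    """Build the text handed to the speech synthesizer.

    `focused` is the zero-based index of the focused item.
    """
    concise = f"{labels[focused]}, {focused + 1} of {len(labels)}"
    if not verbose:
        return concise
    # Verbose adds orientation and the full menu contents (wording assumed).
    return f"{concise}. {orientation.capitalize()} menu containing: {', '.join(labels)}"


labels, orientation = load_menu(SAMPLE_UIML, "main_menu")
print(describe(labels, orientation, focused=2))
print(describe(labels, orientation, focused=2, verbose=True))
```

Because the client works only from the parsed description, the same `describe` step would apply to any TV application whose interface can be expressed this way.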

Key findings

The system includes an adaptive personalization component based on real-time analysis of interaction event logs. It detects four behavioral patterns that indicate user difficulties: Irrelevant Actions (actions that do not change the focused element, suggesting confusion), Action Re-occurrence (repeated actions, suggesting the user is not perceiving feedback correctly), Quick Scroll (rapid navigation that may indicate either expertise or difficulty finding content), and Lost Awareness (frequent use of the Localize feature, suggesting the user is disoriented in the UI). Pattern analysis triggers after a threshold number of actions, with more frequent users triggering analysis sooner. When patterns are detected, the Adaptation Component suggests modifications (switching between Concise and Verbose feedback modes, adjusting speech synthesizer speed and pitch, changing font size and contrast on the TV application), but adaptations are applied only after user confirmation, which preserves user agency. Initial user information is also gathered explicitly through a first-use survey about assistive technology experience and visual impairment severity. The system was successfully demonstrated with Opera TV Store applications, providing audio descriptions of TV app content that was previously completely inaccessible.
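The four patterns can be sketched as simple checks over a window of logged events. The Python below is a hedged illustration only. The `Event` schema, the action names, every threshold value, and the choice to count Action Re-occurrence only for non-navigation commands (so it does not overlap with Quick Scroll) are assumptions made for the example; the review does not give the paper's actual rules or values.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One logged interaction (hypothetical schema)."""
    action: str        # e.g. "DOWN", "OK", "LOCALIZE", "READ_SCREEN"
    focus_before: str  # id of the focused element before the action
    focus_after: str   # id of the focused element after the action
    timestamp: float   # seconds since the session started


NAV_ACTIONS = {"UP", "DOWN", "LEFT", "RIGHT"}

# Illustrative thresholds, not values from the paper.
IRRELEVANT_RATIO = 0.3    # share of navigation actions that moved nothing
LOCALIZE_RATIO = 0.2      # share of actions that were Localize requests
REPEAT_RUN = 3            # same command issued this many times in a row
QUICK_GAP_SECONDS = 0.5   # maximum gap between "rapid" navigation steps
QUICK_RUN = 5             # rapid navigation steps in a row


def detect_patterns(events):
    """Return the set of difficulty patterns found in a window of events."""
    found = set()
    if not events:
        return found
    total = len(events)

    # Irrelevant Actions: navigation that leaves the focus where it was.
    stuck = sum(
        1 for e in events
        if e.action in NAV_ACTIONS and e.focus_before == e.focus_after
    )
    if stuck / total >= IRRELEVANT_RATIO:
        found.add("irrelevant_actions")

    # Lost Awareness: frequent use of the Localize feature.
    localize = sum(1 for e in events if e.action == "LOCALIZE")
    if localize / total >= LOCALIZE_RATIO:
        found.add("lost_awareness")

    repeat_run = quick_run = 1
    for prev, cur in zip(events, events[1:]):
        # Action Re-occurrence: the same non-navigation command repeated.
        same_command = cur.action == prev.action and cur.action not in NAV_ACTIONS
        repeat_run = repeat_run + 1 if same_command else 1
        if repeat_run >= REPEAT_RUN:
            found.add("action_reoccurrence")

        # Quick Scroll: consecutive navigation steps with short gaps.
        fast_nav = (
            cur.action in NAV_ACTIONS
            and prev.action in NAV_ACTIONS
            and cur.timestamp - prev.timestamp <= QUICK_GAP_SECONDS
        )
        quick_run = quick_run + 1 if fast_nav else 1
        if quick_run >= QUICK_RUN:
            found.add("quick_scroll")

    return found
```

In the system described, a non-empty result would lead only to a suggestion, for example offering Verbose mode after Lost Awareness, with the change applied once the user confirms it.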

Relevance

This research addresses an often-overlooked accessibility domain: television. As TVs evolve from simple broadcast receivers into complex interactive platforms, their visual interfaces create growing barriers for people with visual impairments. The approach of using the smartphone as a second-screen accessible controller is pragmatic — it leverages the mature accessibility ecosystems of mobile platforms (TalkBack, VoiceOver) rather than trying to build screen reader functionality into TV platforms that lack it. For accessibility practitioners, the adaptive personalization approach is particularly instructive: rather than forcing users to manually configure accessibility settings, the system observes interaction patterns to infer when the user is struggling and proactively suggests adjustments. This same pattern-detection approach could be applied to web and mobile accessibility — detecting when users are lost, confused, or unable to find content and dynamically simplifying the interface. The system also demonstrates the value of the UIML abstraction: by extracting a standardized UI description from any TV application regardless of its underlying technology (Java, HTML), the accessibility layer becomes application-agnostic rather than requiring each app to implement its own accessibility support.

Tags: visual impairment · connected TV · personalization · adaptive interface · multimodal interaction · speech synthesis · TalkBack · second screen · audio description · interaction logging

Standards referenced: UIML