VeasyGuide: Personalized Visual Guidance for Low-vision Learners on Instructor Actions in Presentation Videos

Mohamad Elayyan Sechayk, Himanshu Singh, Jan Smeddinck, Chakkrit Tantithamthavorn · 2025 · ASSETS 2025: 27th International ACM SIGACCESS Conference on Computers and Accessibility · doi:10.1145/3663547.3746372

Summary

This paper introduces VeasyGuide, a tool that makes presentation videos more accessible to low-vision (LV) learners by automatically detecting and highlighting instructor actions such as pointing, marking, and sketching. The authors identify a gap in current video accessibility research, which tends to focus on captions and audio descriptions and overlooks the visual actions instructors perform during presentations. These actions are critical for understanding content but difficult for LV learners to detect and follow. VeasyGuide uses computer vision techniques, specifically motion detection via frame differencing, to identify when and where instructors interact with presentation materials. When an action is detected, the system provides real-time visual guidance through configurable highlights and a magnified inset view that enlarges the region of activity. A key design principle is personalization: users can adjust highlight colors, magnification levels, inset size and position, and sensitivity thresholds to match their individual visual needs and preferences. The tool was developed through an iterative co-design process involving low-vision stakeholders and accessibility experts. The system architecture processes video frames to detect motion regions, classifies detected actions, and overlays visual cues in real time. The researchers evaluated VeasyGuide in a controlled study with 8 low-vision and 8 sighted participants across different presentation video scenarios.
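
The frame-differencing idea described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `GuidanceSettings` fields, and the default values are all hypothetical, and frames are modeled as plain nested lists of grayscale values so the example runs with the Python standard library alone. A real pipeline would more likely use an array or computer vision library. The sketch compares two consecutive frames, takes the bounding box of pixels that changed by more than a user-set sensitivity, and enlarges that region as a stand-in for the magnified inset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Frame = List[List[int]]          # grayscale frame: rows of 0-255 intensities
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


@dataclass
class GuidanceSettings:
    """Hypothetical user preferences; names are illustrative, not VeasyGuide's."""
    sensitivity: int = 30    # minimum per-pixel intensity change counted as motion
    magnification: int = 2   # integer zoom factor for the inset view


def detect_motion_box(prev: Frame, curr: Frame, sensitivity: int) -> Optional[Box]:
    """Return the bounding box of pixels that changed by more than `sensitivity`."""
    rows, cols = [], []
    for y, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if abs(c - p) > sensitivity:
                rows.append(y)
                cols.append(x)
    if not rows:
        return None  # no motion detected between the two frames
    return (min(rows), min(cols), max(rows), max(cols))


def magnified_inset(frame: Frame, box: Box, factor: int) -> Frame:
    """Crop `box` from `frame` and enlarge it by nearest-neighbour repetition."""
    top, left, bottom, right = box
    inset: Frame = []
    for row in frame[top:bottom + 1]:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide = [pixel for pixel in row[left:right + 1] for _ in range(factor)]
        inset.extend(list(wide) for _ in range(factor))
    return inset


# Usage: a 6x8 dark slide on which a 2x3 bright pen stroke appears.
settings = GuidanceSettings()
previous = [[0] * 8 for _ in range(6)]
current = [row[:] for row in previous]
for y in range(2, 4):
    for x in range(3, 6):
        current[y][x] = 255

box = detect_motion_box(previous, current, settings.sensitivity)
inset = magnified_inset(current, box, settings.magnification)
```

Raising `sensitivity` makes the detector ignore small intensity changes such as compression noise, while lowering it catches fainter strokes at the cost of more false detections. That trade-off is one plausible reason a per-user threshold is useful, though the summary does not state how VeasyGuide exposes it.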

Key findings

The evaluation showed that VeasyGuide significantly improved instructor action detection for low-vision participants. LV participants using VeasyGuide detected 94% of instructor actions, compared to 62% without the tool, a gain of 32 percentage points. Response times were also faster, with LV participants responding on average 2.3 seconds quicker when VeasyGuide was active. Participants reported reduced cognitive load when using the tool, as measured by the NASA-TLX workload assessment, with mental demand scores dropping significantly. The personalization features were rated highly, and participants spent considerable time adjusting settings to their preferences, which supports the view that a one-size-fits-all approach is insufficient for LV users. Qualitative feedback highlighted the magnified inset as the most valued feature, because it let participants see fine details of instructor writing and pointing without losing the context of the overall slide. Some participants preferred subtle highlights while others needed high-contrast overlays, which further supports the personalization approach. Sighted participants also found the tool helpful for maintaining focus, though the improvement was less dramatic. The motion detection algorithm achieved 89% accuracy in identifying instructor actions across varied video conditions.

Relevance

VeasyGuide addresses an overlooked but important dimension of video accessibility for education. While captions and audio descriptions have received significant attention, the visual actions instructors perform (pointing to specific content, annotating slides, sketching diagrams) carry meaning that these traditional accessibility features do not capture. As online and recorded lectures become increasingly central to education, ensuring that LV learners can access all visual information is critical. The personalization-first approach offers a model for other accessibility tools: rather than imposing fixed accommodations, VeasyGuide lets users configure the experience to their specific visual profile. This matches the fact that low vision is highly heterogeneous, with different conditions affecting visual function in different ways. The tool has practical implications for educational institutions seeking to make their video content more inclusive, and the motion detection approach could extend to other video contexts such as instructional tutorials, demonstrations, and conference presentations.

Tags: low vision · video accessibility · e-learning · motion detection · visual guidance · personalization · presentation videos · magnification

Standards referenced: WCAG 2.1