A Sound Understanding — An In-Situ Deployment of an Accessible Audio-Media Player with People Living with Aphasia

Filip Bircanin, Alexandre Nevsky, Madeline N Cruice, Ognjen Markovic, Timothy Neate · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3791419

Summary

Bircanin and colleagues argue that audio-only media (radio, podcasts, audiobooks) has been largely ignored in accessibility research and practice, even though it structures the daily lives of millions. It poses a significant barrier for people living with aphasia (PWA), a communication disability that disrupts auditory comprehension while leaving intelligence, opinion formation, and many cognitive capacities intact. The paper reports a three-week, in-situ deployment of Re-Connect, a progressive web app audio-media player co-designed with an aphasia charity over four months of pre-deployment workshops with PWA and speech-language therapists. Re-Connect bundles twelve accessibility interventions layered on a curated library of BBC and LibriVox content, including synchronised read-along transcripts with word-level highlighting, adjustable playback speed, chapter navigation, multi-level Flesch-Kincaid-graded Full Summaries, a Live Recap of the last 30-300 seconds, GenAI follow-up questions, Word LookUp, a Story Map visualisation, character lists, and read-aloud TTS. Ten adults living with aphasia (5 women, 5 men; 42-82 years; stroke onset 1-15+ years prior; aphasia severity 1-3) installed the app on their own Android or iOS devices. The study used a Research-through-Design methodology and triangulated in-app telemetry with reflexive thematic analysis of semi-structured exit interviews scaffolded by log-guided vignettes. The work is framed explicitly as platform-level design guidance rather than a single-app prescription.
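
To make the read-along highlighting and Live Recap ideas concrete, here is a minimal sketch of how a player could select the word to highlight and the span to recap from a word-timestamped transcript. It is not Re-Connect's implementation: the `TimedWord` type and both function names are hypothetical, and the only detail taken from the summary is the 30-300 second recap range.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    """Hypothetical transcript unit: one word with its audio timing."""
    text: str
    start: float  # seconds from the start of the episode
    end: float


def word_index_at(words, position):
    """Index of the most recently started word at `position`, or None
    if playback has not yet reached the first word."""
    starts = [w.start for w in words]
    i = bisect_right(starts, position) - 1
    return i if i >= 0 else None


def recap_window(words, position, seconds):
    """Words overlapping the last `seconds` before `position`.

    `seconds` is clamped to the 30-300 s range mentioned in the summary.
    """
    seconds = max(30.0, min(300.0, seconds))
    window_start = position - seconds
    return [w for w in words if w.end > window_start and w.start <= position]
```

A recap feature would pass the text of the returned words to whatever summariser it uses; that step is left out here because the summary does not describe it.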

Key findings

Participants did not use every feature; instead, they assembled small, stable personal repertoires of two to three interventions. Full Summary dominated (40% of all events), followed by Highlight Toggle (14.4%), Follow-Up Suggestions (6.8%), Playback Speed Change (6.7%) and Word LookUp (5.8%). Two distinct strategies emerged: front-loading, where users read the summary before pressing play to prime comprehension, and back-filling, where they returned to it afterward to repair gaps. The simplified summary, pitched at a Flesch-Kincaid 5th-grade reading level, was the usual fallback. Users repeatedly departed from the default to adjust complexity, speed, and read-along behaviour mid-episode, which is evidence that listener agency matters more than any single 'aphasia-friendly' preset. Recognition-first wayfinding (cover images, clean hierarchy, plain titles) determined whether people even started an episode: familiar, low-risk content ('fit before fix') was the gateway to engagement. Audiobooks were the most-consumed category (31%) because single-narrator, rhythmically regular prosody lowered turn-taking cost. Six of ten participants repurposed the app for therapy-adjacent practice alongside speech-language workbooks, and eight of ten rated the accessibility features 'highly useful.' Challenges included intimidation by text-heavy initial screens, confusion over unfamiliar labels, and occasional LLM hallucinations in summaries and follow-up suggestions.
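
The event shares and per-person repertoires reported above are simple aggregates over an interaction log. The sketch below shows one way such figures could be derived, assuming a hypothetical log of `(participant, feature)` pairs. The paper's actual telemetry schema is not described in this summary, so the function names and log shape are illustrative only.

```python
from collections import Counter, defaultdict


def event_shares(events):
    """Percentage of all logged events per feature, largest first."""
    counts = Counter(feature for _, feature in events)
    total = sum(counts.values())
    return [(f, round(100 * n / total, 1)) for f, n in counts.most_common()]


def repertoires(events, size=3):
    """Each participant's `size` most-used features (their 'repertoire')."""
    per_user = defaultdict(Counter)
    for participant, feature in events:
        per_user[participant][feature] += 1
    return {p: [f for f, _ in c.most_common(size)] for p, c in per_user.items()}
```

Reporting both views matters for the finding: a feature can dominate the pooled event count while individual repertoires still differ from person to person.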

Relevance

For practitioners, this paper is one of very few in-situ, multi-week deployments of an accessible audio player with a marginalised user group, and it shifts the audio-accessibility conversation from 'can we provide a transcript' to 'can we support sense-making alongside the audio'. The platform-level recommendations apply directly to Spotify, BBC Sounds, Apple Podcasts, and any long-form audio product: high-fidelity downloadable transcripts with speaker diarisation as the default, multi-level summaries, source-proximate repair tools on the main player screen rather than buried in menus, progressive disclosure of features, and regularised rather than slowed prosody. The paper is also a useful case study for product teams using generative AI features in accessible interfaces: the authors pre-generated LLM output offline and hand-checked it for tone and aphasia-friendliness, a practical pattern for avoiding hallucination harms. Limitations include the small, purposively sampled cohort with relatively mild-to-moderate aphasia, a three-week window that captures early adoption rather than long-term habituation, and no formal pre/post comprehension measures.

Tags: aphasia · audio accessibility · podcasts · audiobooks · complex communication needs · in-situ deployment · research through design · generative AI · transcripts · co-design