Enhancing Accessibility in Webtoons: Investigating Audio Effect Placement Strategies for Visually Impaired Users
Heewon Lee, Juwon Cheong, Minsung Kim, Jia Kim, Hyunjung Kim · 2025 · ASSETS 2025: 27th International ACM SIGACCESS Conference on Computers and Accessibility · doi:10.1145/3663547.3759420
Summary
This extended abstract investigates how the timing of audio effect (AE) placement relative to narration (before it, overlapping with it, or after it) affects the user experience of audio-described webtoons for visually impaired users. Webtoons are vertical-scrolling comics that originated in Korea and are optimized for mobile; they are rich in visual and emotional cues but poorly served by standard assistive technologies such as screen readers and text-to-speech. The study used the popular action webtoon Wind Breaker (Naver Webtoon) as a case study, producing three audio versions of Episodes 0-2, one per AE timing strategy. A controlled experiment was conducted in South Korea with 28 blind or low-vision adults (visual acuity 0.02-0.04 or severely restricted visual fields). It measured four UX metrics (immersion, comprehension, usability, and auditory satisfaction) through 5-point Likert surveys, short-answer comprehension questions, and semi-structured interviews. Audio descriptions followed Audio Description Coalition guidelines and Korean Communications Commission accessibility standards, and used AI-based TTS (Naver Clova Dubbing, Papago TTS) with character-specific voice selections for narration and character dialogue. Sound effects were categorized into five types: environmental ambience, character actions, object interactions, expressive/emotional cues, and comic exaggerations.
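The three placement strategies can be pictured as different cue timelines for a single panel. The following is a minimal sketch, not the authors' production pipeline: the names `Placement`, `Cue`, and `schedule_panel`, the 0.2-second gap, and the example durations are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    PRE = "pre"          # effect plays first, narration follows
    OVERLAP = "overlap"  # effect and narration start together
    POST = "post"        # narration plays first, effect follows


@dataclass
class Cue:
    name: str
    start: float  # seconds from the start of the panel
    end: float


def schedule_panel(narration_s: float, effect_s: float,
                   placement: Placement, gap_s: float = 0.2) -> list[Cue]:
    """Return narration and effect cues for one panel.

    gap_s is an assumed short pause between sequential cues; the
    source does not specify one.
    """
    if placement is Placement.PRE:
        effect_start = 0.0
        narration_start = effect_s + gap_s
    elif placement is Placement.OVERLAP:
        effect_start = 0.0
        narration_start = 0.0
    else:  # Placement.POST
        narration_start = 0.0
        effect_start = narration_s + gap_s
    return [
        Cue("narration", narration_start, narration_start + narration_s),
        Cue("effect", effect_start, effect_start + effect_s),
    ]


def panel_duration(cues: list[Cue]) -> float:
    """Total playback time of a panel is the latest cue end."""
    return max(cue.end for cue in cues)


if __name__ == "__main__":
    for placement in Placement:
        cues = schedule_panel(narration_s=4.0, effect_s=1.0, placement=placement)
        print(placement.value, cues, panel_duration(cues))
```

In this sketch the overlapping strategy yields the shortest panel, since the two cues share the same time window, while the pre and post strategies play them in sequence.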
Key findings
Kruskal-Wallis tests did not reveal statistically significant differences between the three conditions, possibly because the sample of 28 fell below the roughly 50 participants estimated to be needed to detect an effect. Even so, consistent trends emerged across both quantitative and qualitative data: (1) Overlapping AE placement scored highest across all immersion dimensions (spatial presence 14.607, content engagement 13.179, ecological validity 12.357, and emotional engagement 7.429) and received the highest overall auditory satisfaction (4.357/5). Participants described it as creating "one integrated experience" where narration and effects felt simultaneous and natural. However, overlapping placement scored lowest on comprehension (13.179 vs. 13.571 for pre-placement), with some participants noting that narration could be drowned out if volume balance was poor. (2) Pre-placement AE scored highest on comprehension (13.571) and clarity (8.615), functioning as an advance cue that helped listeners build mental models before narration confirmed them. (3) Post-placement AE was least effective overall, creating disconnection and unnatural pauses. From these results, the study derives guidelines in three design domains: audio-effect management (distinguish between sustained ambient BGM and discrete SFX, and apply automatic volume ducking of 3-6 dB when narration starts), speech delivery (use a hybrid AI/actor dubbing pipeline with AI narration as the default and actor recordings for emotional scenes), and audio description practices (follow a concise order of visual focus → background context → key action, with brief pauses and transition cues between panels).
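To make the 3-6 dB ducking guideline concrete, the sketch below converts a decibel reduction into a linear gain and applies it to effect samples while narration is active. It is an illustration under stated assumptions, not the study's implementation: it assumes the decibel figure refers to amplitude (gain = 10^(dB/20)), it switches gain abruptly instead of ramping as a real mixer would, and the function names `db_to_gain` and `duck_effect` are hypothetical.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude gain."""
    return 10 ** (db / 20)


def duck_effect(samples: list[float], narration_active: list[bool],
                duck_db: float = 6.0) -> list[float]:
    """Attenuate effect samples by duck_db wherever narration is active.

    samples and narration_active are parallel per-sample sequences.
    A production mixer would smooth the transition with attack and
    release ramps; this sketch applies a hard switch for clarity.
    """
    gain = db_to_gain(-duck_db)
    return [
        sample * gain if active else sample
        for sample, active in zip(samples, narration_active)
    ]


if __name__ == "__main__":
    # Gains at the two ends of the 3-6 dB range named in the guideline.
    for db in (3.0, 6.0):
        print(f"-{db} dB -> gain {db_to_gain(-db):.3f}")
    print(duck_effect([1.0, 1.0, 1.0], [False, True, True], duck_db=6.0))
```

Under the amplitude assumption, a 3 dB duck leaves the effect at about 71% of its original amplitude and a 6 dB duck at about 50%, which gives a sense of how much headroom the guideline reserves for narration.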
Relevance
This research addresses an underexplored intersection of media accessibility and digital entertainment: making webtoons, one of the most popular digital media formats globally (especially in East Asia), accessible to visually impaired users through sophisticated sound design rather than simple text description. Although the differences did not reach statistical significance, the observed trends suggest that AE placement timing may affect both immersion and comprehension, with practical implications beyond webtoons for any audio-described media, including comics, graphic novels, educational materials, and interactive narratives. The tension between immersion (favoring overlapping placement) and comprehension (favoring pre-placement) suggests that accessible audio media should offer customizable AE timing, allowing users to choose their preferred balance. The proposed hybrid AI/actor dubbing pipeline offers a potentially cost-effective production model: AI-generated narration for most content, with professional voice acting reserved for emotionally critical scenes. This approach could make accessible webtoon production economically viable at scale.
Tags: blindness · low vision · audio description · webtoons · digital comics · sound design · immersion · user experience · media accessibility