Deaf and Hard-of-Hearing Users' Preferences for Hearing Speakers' Behavior during Technology-Mediated In-Person and Remote Conversations

Matthew Seita, Sarah Andrew, Matt Huenerfauth · 2021 · Proceedings of the 18th International Web for All Conference (W4A) · doi:10.1145/3430263.3452430

Summary

This paper presents the first quantitative evidence of Deaf and hard-of-hearing (DHH) individuals' preferences for specific speech and non-verbal behaviors of hearing conversational partners during technology-mediated communication. The researchers conducted two experimental studies at Rochester Institute of Technology: an in-person study (N=20) using Google Live Transcribe, an ASR-based captioning smartphone app, and a remote study (N=23) using Zoom with text chat. In both studies, DHH participants had brief conversational exchanges with a trained hearing actor who systematically varied seven categories of behavior, each at three levels (high, medium, low): speech rate, voice intensity, enunciation (lip movement clarity), intonation dynamics (tone inflection), eye contact, gesturing, and intermittent pausing. Participants rated their satisfaction on a 1-10 scale after each exchange and assigned each behavior a priority score (1-7) to indicate which behaviors hearing people should focus on.

The study was motivated by the inequitable burden placed on DHH individuals to manage communication: teaching hearing colleagues how to behave, sacrificing accuracy by lipreading rather than burdening others, and adapting to hearing norms rather than the reverse. Nearly 20% of US adults are DHH, and communication barriers contribute to lower educational attainment, higher unemployment, and lower salaries.

Key findings

In the in-person study with ASR captioning, enunciation (p=0.0051) and intonation (p=0.0045) had significant effects on DHH participants' satisfaction. Medium enunciation was preferred over low: participants found that under-enunciation made the speaker hard to understand, while over-enunciation was perceived as rude or condescending. High (dynamic) intonation was preferred over low (monotone), because dynamic tone conveyed emotion, kept participants engaged, and supported connection. In the priority ranking, enunciation and eye contact were rated significantly higher than intermittent pausing.

In the remote Zoom study, more behaviors showed significant effects: speech rate (p=0.0077), voice intensity (p=0.00098), enunciation (p=0.00059), intonation (p=0.043), and eye contact (p=0.00002). Medium speech rate was preferred: speech that was too fast made lipreading difficult, and speech that was too slow appeared patronizing. Medium-to-high voice intensity was preferred because louder speech produced larger, more visible lip movements, even for participants who could not hear the difference in volume. Eye contact was rated the single highest-priority behavior in the remote context (significantly above speech rate and intermittent pausing), reflecting the visual orientation of Deaf culture and ASL.

Notably, gesturing and intermittent pausing showed no significant effect on satisfaction in either study.

Relevance

This research shifts the accessibility conversation from technology design alone to the behavior of hearing communication partners, an overlooked but critical factor. The findings provide actionable guidance for workplace training, educational settings, and videoconferencing best practices when hearing people communicate with DHH colleagues. Key recommendations: enunciate normally (without exaggeration), use dynamic intonation rather than a monotone, maintain natural eye contact (especially in video calls), speak at a medium rate, and use adequate volume.

The finding that preferences differ between in-person and remote contexts is particularly important as hybrid work becomes the norm: eye contact and speech rate matter more remotely because of the constrained visual field of video. The paper also motivates future technology that could prompt or nudge hearing speakers toward beneficial behaviors in real time, potentially redistributing the communication burden more equitably. For accessibility practitioners, the research highlights that technical solutions such as ASR captioning are necessary but not sufficient, since the human behaviors that feed into those systems significantly affect the DHH user experience.

Tags: deaf and hard of hearing · automatic speech recognition · videoconferencing · communication accessibility · speechreading · deaf accessibility · social interaction