Accessibility for Deaf and Hard of Hearing Users: Sign Language Conversational User Interfaces

Abraham Glasser, Vaishnavi Mande, Matt Huenerfauth · 2020 · Proceedings of the 2nd Conference on Conversational User Interfaces (CUI '20) · doi:10.1145/3405755.3406158

Summary

This short CUI 2020 position paper (3 pages, presented at the CUI@CHI workshop) lays out a research agenda for making voice-based conversational user interfaces (CUIs), such as Alexa, Google Assistant, and similar personal assistant devices, accessible to Deaf and Hard-of-Hearing (DHH) users via sign-language input and output. The authors, from Rochester Institute of Technology's Center for Accessibility and Inclusion Research (CAIR), argue that the rapid consumer adoption of smart speakers is creating new accessibility barriers, which hearing household members who adopt these devices inadvertently impose on DHH family members. They identify five motivating challenges: (1) Automatic Speech Recognition (ASR) often fails to understand DHH users' voices even when hearing listeners can; (2) text-input workarounds are not functionally equivalent (no hands-free use, and English-literacy assumptions that exclude some users whose primary language is American Sign Language, ASL); (3) universal design demands sign-language input and output, not just captioned output; (4) current claims of ASL recognition in personal assistants are restricted to small fixed command sets or unnaturally performed signs; and (5) sign-recognition datasets are too small for modern deep learning. The authors then outline a multi-phase research programme: semi-structured interviews with ~30 DHH ASL users (already nearly complete at time of writing), a large online survey of ~200 DHH people across the US to identify priority interaction scenarios, and Wizard-of-Oz lab studies in which DHH users sign commands to a personal assistant device while an interpreter voices them.

Key findings

Because this is a position paper, its contribution is an agenda rather than empirical results, but it documents several substantive claims. First, the authors' own prior ASSETS'17 study (winning the CHI 2019 undergraduate research competition) found that modern ASR failed on DHH voices even when professional speech pathologists and naive hearing listeners rated those voices as highly understandable, so human intuition about which DHH voices will work with ASR is not predictive. Second, text input is not a functionally equivalent fallback for DHH users: it breaks spontaneous cross-room use, fails when hands are occupied (e.g., cooking), and presumes English literacy that some ASL-primary users do not have. Third, existing sign-recognition demos are media-hyped but fragile: they work only for small fixed command sets or when signs are performed unnaturally. Fourth, fundamental HCI design questions are entirely open: how should a DHH user 'wake up' a CUI, how should the system visually acknowledge a signed command, what vocabulary and linguistic structures do signers prefer, and how should output be rendered (signing avatar? captioned text?). Fifth, the team plans to publish a video dataset of DHH ASL users interacting with devices to benefit the computer-vision community.

Relevance

For accessibility practitioners, this paper is useful less for new data than as a concise framing of why 'just add a text input' is not an accessible design response to the CUI trend. It makes the case for sign-language-first interaction as a universal-design requirement, names the specific HCI design questions that will need answers (wake-up, feedback, vocabulary, output modality), and points to the dataset bottleneck that still limits practical sign-recognition deployment. Limitations are intrinsic to the format: no empirical findings are reported, the agenda items have since been pursued in follow-on papers by the same team (e.g., Mande et al. 2021 on wake-up approaches), and the recommendations are scoped narrowly to ASL and American English rather than the global multiplicity of signed languages. The paper is best read alongside those later publications as the framing document for the research programme.

Tags: deaf and hard of hearing · sign language · conversational user interfaces · personal assistants · american sign language · automatic speech recognition · universal design · voice interface