
Accessible Nonverbal Cues to Support Conversations in VR for Blind and Low Vision People

Crescentia Jung, Jazmin Collins, Ricardo E. Gonzalez Penuela, Jonathan Isaac Segal, Andrea Stevenson Won, Shiri Azenkot · 2024 · Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '24) · doi:10.1145/3663548.3675663

Summary

This paper addresses the inaccessibility of nonverbal communication for blind and low-vision (BLV) users in social virtual reality environments. Social VR platforms like VRChat, Rec Room, and Meta Horizon Worlds rely heavily on avatar-mediated nonverbal cues (eye contact, head nods, head shakes, gestures, and spatial proximity) to facilitate natural conversation. For sighted users, these cues provide essential social information about attention, agreement, and turn-taking, but they are entirely visual and therefore inaccessible to BLV users. The researchers designed accessible alternatives for three specific nonverbal behaviours: eye contact (indicating someone is looking at you), head nodding (agreement), and head shaking (disagreement). Each behaviour was mapped to both an audio cue and a haptic cue: a spatial audio earcon positioned in 3D space to indicate the direction of the person performing the behaviour, and a controller vibration pattern with a distinct rhythm for each cue type. The cues were evaluated in a within-subjects study with 16 BLV participants who engaged in real-time three-person conversations in VR, comparing conditions with and without the accessible cues. The VR environment was built in Unity using the Normcore networking framework, and the experimental setup used Meta Quest 2 headsets with spatialized audio via the Meta XR Audio SDK.
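
As a rough illustration of the mapping described above, the sketch below turns a detected behaviour into a cue with a direction (for placing the earcon) and a vibration rhythm. It is not the authors' implementation, which was built in Unity. The names `HAPTIC_PATTERNS`, `Cue`, `azimuth_deg`, and `make_cue`, the pulse timings, and the coordinate convention are all assumptions made for this example.

```python
import math
from dataclasses import dataclass

# Hypothetical vibration rhythms as (on_ms, off_ms) pulses.
# The timings actually used in the paper are not given in this review.
HAPTIC_PATTERNS = {
    "eye_contact": [(200, 0)],
    "head_nod": [(80, 80), (80, 0)],
    "head_shake": [(40, 40), (40, 40), (40, 0)],
}


@dataclass
class Cue:
    behaviour: str
    azimuth_deg: float    # where to place the earcon relative to the listener
    haptic_pattern: list  # rhythm to play on the controller


def azimuth_deg(listener_pos, listener_yaw_deg, sender_pos):
    """Horizontal angle from the listener's facing direction to the sender.

    Assumed convention: positions are (x, z) on the ground plane, yaw 0 faces
    +z, and yaw grows when turning right. The result is in [-180, 180), with
    0 straight ahead and positive values to the listener's right.
    """
    dx = sender_pos[0] - listener_pos[0]
    dz = sender_pos[1] - listener_pos[1]
    bearing = math.degrees(math.atan2(dx, dz))
    return (bearing - listener_yaw_deg + 180.0) % 360.0 - 180.0


def make_cue(behaviour, listener_pos, listener_yaw_deg, sender_pos):
    """Build the audio direction and haptic rhythm for one detected behaviour."""
    if behaviour not in HAPTIC_PATTERNS:
        raise ValueError(f"no cue defined for behaviour: {behaviour!r}")
    angle = azimuth_deg(listener_pos, listener_yaw_deg, sender_pos)
    return Cue(behaviour, angle, HAPTIC_PATTERNS[behaviour])
```

Under these assumptions, a partner standing directly to the listener's right who nods would yield a `head_nod` cue at an azimuth of about 90 degrees, which a spatial audio engine could use to position the earcon while the controller plays the two-pulse rhythm.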

Key findings

Participants were significantly more accurate at detecting which conversational partner was paying attention to them with cues enabled (83.3%) than without (53.1%, close to the 50% chance level for a two-person discrimination task). Confidence ratings for attention detection also increased significantly with cues. Qualitatively, participants reported that the cues transformed their VR social experience from isolating to engaging: several noted that without cues they felt like they were "talking into a void", with no feedback about whether anyone was listening. Participants had diverse preferences for audio versus haptic modalities: some preferred audio cues because they conveyed spatial direction naturally through binaural positioning, while others preferred haptics because audio cues could interfere with the conversation itself, especially in multi-person settings. Several participants advocated for both modalities simultaneously, using audio for direction and haptics for confirmation. A key finding was that participants used the cues not only for real-time conversation support but also as a tool for learning social norms they had never had access to, such as how frequently sighted people make eye contact and how head-nodding patterns signal agreement. Participants also requested additional cues for behaviours like hand-raising, facial expressions, and proximity, but noted the risk of information overload, leading the authors to propose customisable "cue banks" in which users select which nonverbal behaviours they want represented.
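
One way to picture the "cue bank" idea is as a per-user table of enabled behaviours and the modalities each should use. The sketch below is only an interpretation of the concept as summarised here, not a design from the paper; the `CueBank` class, its method names, and the two modality labels are hypothetical.

```python
from dataclasses import dataclass, field

# Modalities discussed in the review; the labels themselves are assumptions.
MODALITIES = {"audio", "haptic"}


@dataclass
class CueBank:
    """Per-user selection of which nonverbal behaviours get rendered, and how."""

    # behaviour name -> set of modalities the user has switched on for it
    enabled: dict = field(default_factory=dict)

    def enable(self, behaviour, *modalities):
        """Turn a behaviour on for one or more modalities."""
        if not modalities:
            raise ValueError("at least one modality is required")
        unknown = set(modalities) - MODALITIES
        if unknown:
            raise ValueError(f"unknown modalities: {sorted(unknown)}")
        self.enabled[behaviour] = set(modalities)

    def disable(self, behaviour):
        """Turn a behaviour off entirely; a no-op if it was not enabled."""
        self.enabled.pop(behaviour, None)

    def route(self, behaviour):
        """Return the modalities to render, sorted; empty if the cue is off."""
        return sorted(self.enabled.get(behaviour, ()))
```

For instance, a user who wants direction and confirmation for eye contact but only vibration for nods could call `bank.enable("eye_contact", "audio", "haptic")` and `bank.enable("head_nod", "haptic")`; any behaviour left out, such as head shakes, would route to no output, which is how such a structure could limit information overload.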

Relevance

This paper makes a compelling case that social VR accessibility extends far beyond screen reader compatibility or audio descriptions: it requires making the embodied, nonverbal dimensions of social interaction accessible. For accessibility practitioners working on immersive platforms, the design patterns here are directly applicable: using spatial audio to convey directional social information, using haptic patterns to represent distinct behavioural categories, and giving users granular control over which cues they receive. The "cue bank" concept is a promising framework for managing the complexity of nonverbal communication without overwhelming users. The finding that BLV participants used the cues to learn social norms highlights an underexplored benefit of accessible technology: not just providing equivalent access to information, but enabling new forms of social learning. Limitations include the controlled lab setting (not a naturalistic social VR environment), the small sample of 16 participants, and the restriction to only three nonverbal behaviours, though the authors provide a clear roadmap for expanding the system. The work also raises important questions about AI-driven cue detection and the privacy implications of tracking users' nonverbal behaviours.

Tags: virtual reality · blind and low vision · nonverbal communication · haptic feedback · social VR · spatial audio