
Accessing Passersby Proxemic Signals through a Head-Worn Camera: Opportunities and Limitations for the Blind

Kyungjun Lee, Daisuke Sato, Saki Asakawa, Chieko Asakawa, Hernisa Kacorri · 2021 · Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21) · doi:10.1145/3441852.3471232

Summary

This paper explores the potential and limitations of using head-worn cameras with computer vision to help blind people access proxemic signals — the spatial behaviour of nearby people that sighted individuals perceive visually, such as someone's presence, distance, relative position, and whether they are looking at you. These signals are critical for blind individuals to initiate social interactions, preserve personal space, and practice social distancing. The researchers built a testbed called GlAccess using Vuzix Blade smart glasses with an 80-degree field of view, a Bluetooth earphone, and a remote server running pedestrian detection algorithms. The system detects a passerby's face using Multi-task Cascaded Convolutional Networks (MTCNNs) and estimates four proxemic signals: presence, distance (using face bounding box height as a proxy), relative position (left/middle/right), and head pose (looking at user or not). These estimates are communicated via text-to-speech. The study involved 10 blind participants (nine totally blind, one legally blind, average age 63.6) and 40 sighted participants serving as passersby. In a controlled corridor scenario, blind participants wearing the smart glasses walked down a 66-foot (20-meter) corridor while sighted participants approached from the opposite direction, with the blind participant tasked with asking for a nearby office number. Each blind participant completed eight walks — four with stranger sighted participants and four with lab members whose faces the system could recognise. Data collection included 183 minutes of stationary camera recordings, over 1,700 smart glasses camera frames, proxemic signal logs, and 259 minutes of post-study interview audio.
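As a rough illustration of how the four proxemic signals described above could be derived from a single detected face, here is a minimal Python sketch. It is not the authors' implementation: the `FaceBox` type, the pinhole-camera distance formula, the assumed real-world face height, the yaw threshold, the equal-thirds split for position, and the reading of the 80-degree field of view as horizontal are all assumptions made for this example. Only the general idea (face box height as a distance proxy, a left/middle/right position, a looking/not-looking head pose) comes from the paper as summarised here.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaceBox:
    """Hypothetical face bounding box in pixels (x, y is the top-left corner)."""
    x: float
    y: float
    w: float
    h: float


# All three constants are assumptions for this sketch, not values from the paper.
ASSUMED_FACE_HEIGHT_M = 0.22    # rough real-world height of an adult face
ASSUMED_HFOV_DEG = 80.0         # treats the 80-degree field of view as horizontal
ASSUMED_YAW_LIMIT_DEG = 20.0    # head yaw within this range counts as "looking at user"


def focal_length_px(frame_w: float, hfov_deg: float = ASSUMED_HFOV_DEG) -> float:
    """Pinhole-camera focal length in pixels for a given horizontal field of view."""
    return (frame_w / 2) / math.tan(math.radians(hfov_deg) / 2)


def estimate_signals(box: Optional[FaceBox], frame_w: float, yaw_deg: float = 0.0) -> dict:
    """Map one face detection to presence, distance, position, and head pose."""
    if box is None:
        return {"present": False}

    # Distance: a taller face box means a closer passerby (similar triangles).
    distance_m = focal_length_px(frame_w) * ASSUMED_FACE_HEIGHT_M / box.h

    # Relative position: split the frame into equal thirds by face centre.
    center_x = box.x + box.w / 2
    third = frame_w / 3
    if center_x < third:
        position = "left"
    elif center_x > 2 * third:
        position = "right"
    else:
        position = "middle"

    return {
        "present": True,
        "distance_m": distance_m,
        "position": position,
        "looking_at_user": abs(yaw_deg) <= ASSUMED_YAW_LIMIT_DEG,
    }
```

Calling `estimate_signals(FaceBox(200, 50, 80, 100), frame_w=480)` returns a dictionary whose `position` is `"middle"`. Halving the box height doubles the estimated distance, which is the behaviour one would expect from a height-based proxy.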

Key findings

The analysis of camera frames revealed challenges specific to blind users. Passersby were reliably captured at distances greater than 6 feet (98% inclusion rate for any body part), but within 6 feet, the critical distance for social interaction, inclusion rates dropped to an average of 56% for any body part and 66% for the head. The main cause was blind participants' head movements: many moved their head from side to side while walking, and especially while interacting, so the narrow field-of-view camera missed the passerby even when that person was directly beside the user. Presence detection had perfect precision (1.0), meaning every passerby the system reported was actually there, but very low recall (0.33–0.34): it missed most frames in which a passerby was present, predominantly because people were too far away for reliable face detection. Position estimation was strong when a passerby was detected (precision = 0.90, recall = 0.84), but distance estimation (precision = 0.62, recall = 0.66) and head pose detection (precision = 0.77, recall = 0.77) were less reliable. Qualitative feedback from blind participants was largely positive: seven of ten appreciated having access to proxemic information, noting that it could help with social engagement, with deciding whether to interact, and with knowing who is nearby. Participants envisioned using person recognition to greet family and friends by name. However, they raised concerns about the verbosity of the feedback, latency (estimates were stale by the time they were spoken when walking at a normal pace), the inability to verify whether visual estimates were correct, and the need for bone conduction headphones to keep their ears open to environmental sounds. Several participants who had been blind since birth noted that they were unfamiliar with processing visual spatial information and might need training to benefit from such a system.
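To make the precision/recall pattern for presence detection concrete, the following Python sketch computes both metrics from per-frame labels. The frame labels are invented for illustration and are not data from the study; the function name `presence_precision_recall` is likewise an assumption of this example.

```python
from typing import Sequence, Tuple


def presence_precision_recall(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Frame-level precision and recall for a binary 'passerby present' signal."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)        # correctly reported
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)    # false alarms
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)    # missed passersby
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up example: a passerby is in 6 of 8 frames, but the detector only
# fires on the 2 closest ones and never raises a false alarm.
truth = [True, True, True, True, True, True, False, False]
predicted = [False, False, False, False, True, True, False, False]

precision, recall = presence_precision_recall(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=1.00 recall=0.33
```

A detector that only fires on close, clearly visible faces never raises a false alarm (high precision) yet stays silent for most frames in which someone is present (low recall), which matches the pattern reported above.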

Relevance

This paper addresses a largely unexplored dimension of blind navigation and social interaction: access to the spatial behaviour of other people, rather than to physical obstacles or landmarks. For assistive technology developers, the findings reveal that head-worn cameras for blind users face unique challenges compared to sighted users — blind people's head movements are idiosyncratic and often exclude the very people they are near, suggesting that wider field-of-view cameras, multiple cameras, or supplementary proximity sensors are needed. The finding that the 1 frame-per-second sampling rate combined with head movements resulted in missed detections calls for dynamic, motion-aware sampling approaches. The study also highlights important design tensions: blind participants wanted detailed proxemic information but found continuous feedback overwhelming, suggesting context-aware delivery that adjusts verbosity based on the user's activity. The privacy implications of always-on wearable cameras for person recognition remain significant, though prior research suggests bystanders are generally more accepting when the technology serves an assistive purpose. For the accessibility community, this work establishes proxemics as a concrete, measurable domain where assistive technology can meaningfully expand blind people's social autonomy.
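One way to picture the motion-aware sampling mentioned above is a rule that shortens the interval between analysed frames as the head turns faster. The Python sketch below is purely hypothetical: the paper, as summarised here, does not specify such a rule, and the function name, the gyroscope-style angular speed input, the reference speed, and the minimum interval are all assumptions. Only the 1 frame-per-second baseline comes from the study.

```python
BASE_INTERVAL_S = 1.0        # the study's 1 frame-per-second baseline
MIN_INTERVAL_S = 0.1         # assumed floor, e.g. limited by server latency
REFERENCE_SPEED_DPS = 30.0   # assumed head speed (deg/s) at which the rate doubles


def sampling_interval_s(head_speed_dps: float) -> float:
    """Return seconds to wait before analysing the next frame.

    A still head keeps the baseline interval; faster head rotation shrinks
    the interval so a passerby is less likely to fall between samples.
    """
    speed = abs(head_speed_dps)
    interval = BASE_INTERVAL_S / (1.0 + speed / REFERENCE_SPEED_DPS)
    return max(MIN_INTERVAL_S, interval)
```

With these assumed constants, a still head gives `sampling_interval_s(0.0) == 1.0`, a 30 deg/s turn halves the interval to 0.5 seconds, and very fast turns are clamped at the 0.1 second floor.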

Tags: blind and low vision · proxemics · wearable camera · smart glasses · pedestrian detection · computer vision · social interaction · social distancing · assistive technology