TouchPilot: Designing a Guidance System that Assists Blind People in Learning Complex 3D Structures

Xiyue Wang, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa · 2023 · Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2023) · doi:10.1145/3597638.3608426

Summary

This paper introduces TouchPilot, a step-by-step guidance system designed to help blind people independently learn complex 3D structures through interactive 3D printed models (I3Ms). Existing I3Ms let blind users trigger audio labels by pointing at specific elements, but this pinpointing approach presents only single-layered information and struggles with complex, multi-layered structures that have many components. The authors first conducted an observational study (Study 1) with six blind participants and three expert explainers at a science museum, examining how learners and explainers naturally interact while exploring complex 3D printed models of the International Space Station (ISS) and a Falcon 9 rocket. They identified 18 distinct interaction activities across experts and participants. Experts predominantly led the sessions with hierarchical explanations, introducing composites before basic elements in a top-down manner, and supplemented them with navigation support, specification of locations, and extension information. Participants tended to follow expert guidance rather than initiate questions, and all participants praised the combination of models with expert explanations. Based on these findings, the researchers designed TouchPilot with three core functions: audio guidance that introduces hierarchical elements step by step, navigation support that uses directional voice cues to help users locate target elements, and confirmation through a tapping sound when the user's finger enters the correct area. The system uses a depth camera (Intel RealSense D435) and MediaPipe Hands for optical hand tracking, mapping finger positions to a virtual 3D point cloud representation of the physical model. Users activate audio labels using a "number one" gesture (index finger pointing up) and control pacing with next/previous buttons.
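The guidance and navigation functions described above can be pictured with a small Python sketch. It is not the authors' implementation. The `Element` class, the spherical target areas, the axis convention (x to the right, y away from the user, z up), the cue words, and the example coordinates are all assumptions made for illustration, and the fingertip position is taken as given rather than read from a camera.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Element:
    """Hypothetical model element: a named region that may contain sub-elements."""
    name: str
    center: Point          # assumed position in model coordinates (metres)
    radius: float          # assumed spherical target area around the centre
    children: List["Element"] = field(default_factory=list)


def guidance_order(root: Element) -> List[Element]:
    """Top-down (pre-order) walk: each composite comes before its own parts."""
    order = [root]
    for child in root.children:
        order.extend(guidance_order(child))
    return order


def navigation_cue(finger: Point, target: Element) -> str:
    """Return 'tap' once the finger is inside the target area,
    otherwise a direction word along the axis with the largest offset."""
    offset = tuple(t - f for f, t in zip(finger, target.center))
    if math.sqrt(sum(d * d for d in offset)) <= target.radius:
        return "tap"  # stands in for the confirmation sound
    words = (("left", "right"), ("back", "forward"), ("down", "up"))
    axis = max(range(3), key=lambda i: abs(offset[i]))
    return words[axis][1] if offset[axis] > 0 else words[axis][0]


# Hypothetical two-level model, loosely inspired by a space station layout.
array = Element("solar array", (0.20, 0.00, 0.05), 0.04)
truss = Element("truss", (0.00, 0.00, 0.05), 0.05, [array])
module = Element("module", (0.00, 0.10, 0.00), 0.03)
station = Element("station", (0.00, 0.00, 0.00), 0.30, [truss, module])

for element in guidance_order(station):
    print(element.name)
print(navigation_cue((0.00, 0.00, 0.05), array))
```

In this sketch the next/previous buttons would simply move an index through the list returned by `guidance_order`, and `navigation_cue` would be re-evaluated on every tracked frame until it returns "tap".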

Key findings

In the comparative study (Study 2) with eight blind participants, the guidance system produced significantly better learning outcomes than the pinpointing system. The mean correct answer rate was 70.8% with the guidance system versus 47.1% with the pinpointing system, a statistically significant difference of 23.7 percentage points. The guidance system outperformed pinpointing across all question subcategories (composite-related textual, basic element textual, spatial, area-related, and location-related), with significant differences found for composite-related textual questions, all spatial questions, area-related questions, and location-related questions. Participants rated the guidance system higher for both independence (median 6 vs. 4 on a 7-point scale) and enjoyment (median 7 vs. 6). Most notably, all eight participants preferred a two-phase approach: first using the guidance system to build a structured understanding of the overall model, then switching to free pinpointing to review and explore elements of personal interest. Participants spent more time with the guidance system (mean 17.75 minutes vs. 6.88 minutes), reflecting deeper engagement, while pinpointing users often finished without discovering all elements. Participants identified potential applications including museum artifacts, architecture and maps, objects too large or dangerous to touch directly, and complex interfaces in daily life.

Relevance

This research addresses a gap in how blind people access complex spatial and structural information. The finding that systematic, hierarchical guidance produces substantially better learning outcomes than free pinpointing has broad implications for museum accessibility, STEM education, and any context where blind users need to understand multi-layered spatial relationships. The two-phase preference, structured guidance followed by free exploration, provides a practical design pattern for interactive tactile systems. The work also demonstrates that computer vision-based hand tracking can replace physical buttons and labels on 3D models, making it possible to annotate 3D printed objects without embedded electronics. For science museums and educational institutions, TouchPilot offers a model for making complex exhibits genuinely accessible rather than simply providing basic labels. The system's limitations around camera occlusion and tracking accuracy point toward future improvements, and the authors note the potential for LLMs to generate guide content dynamically, reducing the authoring burden.

Tags: 3D printed models · tactile learning · blindness · computer vision · guidance systems · interactive models · science education · museum accessibility · hand tracking