VRML-Based Representations of ASL Fingerspelling on the World Wide Web

S. Augustine Su, Richard K. Furuta · 1998 · Proceedings of the Third International ACM Conference on Assistive Technologies (Assets '98) · doi:10.1145/274497.274506

Summary

This paper presents techniques for representing American Sign Language (ASL) fingerspelling on the World Wide Web using 3D hand models in VRML 2.0 (Virtual Reality Modeling Language). The authors argue that VRML documents sign language online more effectively than traditional 2D media such as drawings and video, because users can view a 3D model from any angle and interact with it. The hand model is constructed from VRML Cylinder and Sphere geometry nodes, with 28 degrees of freedom (DOFs): 20 DOFs for the fingers and thumb (distal, middle, and proximal bending plus proximal deviation for each digit), plus 6 DOFs for the wrist (yaw, pitch) and forearm (x, y, z position and roll). To simplify interaction, the model reduces the controllable DOFs to 24 by linking distal bending to middle bending at a fixed ratio. Users create handshapes through a control panel, itself built in VRML, whose wheels and balls each correspond to a DOF. A CGI script called GestureMaker receives the DOF values and generates VRML files for static handshapes. For dynamic gestures (such as the letters J and Z, which require movement), key-frame animation uses the VRML 2.0 PositionInterpolator and OrientationInterpolator nodes to create smooth transitions between handshapes.
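
The DOF coupling and key-frame animation described above can be illustrated with a minimal Python sketch. It is not the authors' GestureMaker code. The ratio value, the joint names, and both function names are assumptions made for illustration; the summary states only that distal bending follows middle bending at "a fixed ratio". The sketch models a single finger and emits one OrientationInterpolator node in VRML 2.0 text syntax.

```python
from typing import Dict, List, Sequence

# Assumed coupling ratio; the summary does not give the actual value.
DISTAL_TO_MIDDLE_RATIO = 2.0 / 3.0


def expand_finger_pose(controlled: Dict[str, float]) -> Dict[str, float]:
    """Return the finger pose with the dependent distal angle filled in.

    `controlled` holds the user-controlled joint angles in radians and
    must contain a "middle_bend" entry (a hypothetical joint name).
    """
    full = dict(controlled)
    full["distal_bend"] = DISTAL_TO_MIDDLE_RATIO * controlled["middle_bend"]
    return full


def orientation_interpolator(
    name: str, angles: List[float], axis: Sequence[float] = (1, 0, 0)
) -> str:
    """Emit a VRML 2.0 OrientationInterpolator node as text.

    Key frames are spaced evenly over the fraction range [0, 1]; each
    key value is an axis-angle rotation about `axis`.
    """
    if len(angles) < 2:
        raise ValueError("need at least two key frames")
    keys = [i / (len(angles) - 1) for i in range(len(angles))]
    key_text = ", ".join(f"{k:.3f}" for k in keys)
    axis_text = " ".join(str(c) for c in axis)
    value_text = ", ".join(f"{axis_text} {a:.4f}" for a in angles)
    return (
        f"DEF {name} OrientationInterpolator {{\n"
        f"  key [ {key_text} ]\n"
        f"  keyValue [ {value_text} ]\n"
        f"}}"
    )


if __name__ == "__main__":
    # Two key frames for one finger: straight, then curled.
    straight = expand_finger_pose({"proximal_bend": 0.0, "middle_bend": 0.0})
    curled = expand_finger_pose({"proximal_bend": 1.0, "middle_bend": 1.2})
    print(
        orientation_interpolator(
            "IndexDistalInterp",
            [straight["distal_bend"], curled["distal_bend"]],
        )
    )
```

In a complete VRML 2.0 scene, a TimeSensor would drive the interpolator's fraction through a ROUTE statement, and a second ROUTE would send the interpolated rotation to the joint's Transform node. Those parts are omitted here.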

Key findings

The system demonstrated all 26 letters of the ASL manual alphabet and the single-digit numbers as interactive 3D VRML models on the web. Users could view individual letter or number files, or enter arbitrary strings to obtain animated fingerspelling sequences generated on the fly by the GestureMaker CGI script. For adjacent identical handshapes in a string, the system simulated the repeated-letter convention of fingerspelling by inserting a slight backward movement between repetitions. The authors identified several limitations: fingers and thumb collided with or penetrated each other during transitions between some letters (e.g., A to B), which required additional intermediate key frames; the fingerspelling speed was reported as slow compared to real-world fingerspelling; and the model lacked the second hand, face, and body needed for full ASL signs beyond fingerspelling. As a future direction, the authors proposed decomposing signs into sub-handshape elements based on Stokoe's sign writing system, so that signs could be expressed as strings of tokens. This could enable an ASL dictionary to be compiled in VRML by following the notation system and translating its symbols into 3D tokens.
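
The repeated-letter handling can be sketched as a small key-frame builder in Python. This is an illustrative assumption, not the paper's implementation: the offset magnitude, the use of a negative z value to mean "backward", and the names `BACKWARD_OFFSET` and `fingerspelling_keyframes` are all hypothetical.

```python
import string
from typing import List, Optional, Tuple

# Assumed size and direction of the "slight backward movement".
BACKWARD_OFFSET = -0.02

# A key frame pairs a handshape label with a z offset of the hand.
KeyFrame = Tuple[str, float]

# Only the manual alphabet and single digits are covered.
SUPPORTED = set(string.ascii_uppercase + string.digits)


def fingerspelling_keyframes(text: str) -> List[KeyFrame]:
    """Build a key-frame list for `text`.

    When a handshape repeats immediately, an extra frame with the same
    handshape pulled backward is inserted before the repetition.
    """
    frames: List[KeyFrame] = []
    previous: Optional[str] = None
    for ch in text.upper():
        if ch not in SUPPORTED:
            previous = None  # unsupported characters break a repetition
            continue
        if ch == previous:
            frames.append((ch, BACKWARD_OFFSET))
        frames.append((ch, 0.0))
        previous = ch
    return frames


if __name__ == "__main__":
    for frame in fingerspelling_keyframes("bee"):
        print(frame)
```

Each z offset in the resulting list could then be used as a key value of a PositionInterpolator, with the handshape labels selecting the joint angles for the matching orientation key frames.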

Relevance

This 1998 paper represents an early effort to make sign language content accessible and interactive on the web using 3D technology, addressing a challenge that persists today. While VRML has been superseded by technologies like WebGL and three.js, the underlying approach of using parameterized 3D hand models with degrees of freedom to generate sign language animations procedurally anticipated modern avatar-based sign language systems. The concept of generating fingerspelling animations from text input on the fly is directly relevant to current work on sign language avatars and machine translation systems. For accessibility practitioners, the paper highlights the ongoing tension between video-based and avatar-based sign language representation: video captures the full richness of signing but is expensive to produce and not searchable, while 3D models can be generated procedurally but lack naturalness. The proposed decomposition of signs into Stokoe-based tokens presaged modern computational approaches to sign language representation.

Tags: sign language · American Sign Language · fingerspelling · VRML · virtual reality · web accessibility · deaf accessibility · 3D modeling · animation

Standards referenced: VRML 2.0