A Syntactic Analysis of Accessibility to a Corpus of Statistical Graphs

Leo Ferres, Petro Verkhogliad, Livia Sumegi, Louis Boucher, Martin Lachance, Gitte Lindgaard · 2008 · Proceedings of the 2008 International Cross-Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/1368044.1368053

Summary

This paper from Carleton University and Statistics Canada tackles a persistent accessibility challenge: making statistical graphs accessible to blind and visually impaired users. The authors analysed a corpus of 120 real-world statistical graphs from Statistics Canada's daily publication "The Daily", checking them for structural accessibility problems against a formal context-free grammar they designed to describe graph components (titles, axes, series, content boxes, geometry). The key insight driving the research is that graphs published as raster images (GIF files) with alt text are only superficially accessible. Even when the underlying structured data exists in a graphing application such as MS Excel, WYSIWYG authoring practices frequently break the semantic structure. For example, graph authors commonly position titles visually with floating text boxes rather than entering them in the application's designated title field, so the text's role as a title is invisible to any tool querying the graph's object model. The work builds on the authors' prior iGraph-Lite system, which automatically generates natural language descriptions of statistical graphs.

Key findings

The corpus analysis revealed widespread structural accessibility failures. All 120 graphs had titles placed in text boxes rather than in the application's title field, making them programmatically unidentifiable. Fifty-five percent (66 graphs) had missing category axis values: months or years had been omitted from the object model to reduce visual clutter on the horizontal axis, leaving gaps that assistive technologies cannot fill. Six graphs had missing data values in their series. Ninety-eight of the 120 graphs (about 82%) were classified as "custom" type rather than a specific chart type, so assistive technology could not tell users whether a graph was a line chart, a bar chart, and so on, information that research shows is critical for correct mental encoding. The authors developed curation algorithms to repair these issues but found them inherently brittle and domain-specific. This led to their central contribution, the OM (Object Model) Principle: any digital object is made more accessible by using the application's designated model for that object rather than visual workarounds. For instance, title text belongs in the TITLE field, not in a positioned text box.
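The four problem classes above lend themselves to a mechanical check. The following is a minimal Python sketch of such an audit, not the authors' curation algorithms or grammar: the `GraphModel` and `Series` classes, their field names, and the problem codes are all hypothetical stand-ins for whatever object model a real graphing application exposes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Series:
    """One data series; a None entry marks a missing data value."""
    name: Optional[str]
    values: List[Optional[float]]


@dataclass
class GraphModel:
    """Hypothetical, simplified stand-in for a graph's object model."""
    chart_type: str                      # e.g. "line", "bar", or "custom"
    title: Optional[str]                 # the designated title field
    text_boxes: List[str] = field(default_factory=list)        # floating text boxes
    categories: List[Optional[str]] = field(default_factory=list)  # category axis values
    series: List[Series] = field(default_factory=list)


def audit(graph: GraphModel) -> List[str]:
    """Return codes for the structural problems described in the findings."""
    problems = []
    # A title typed into a text box leaves the designated field empty.
    if not (graph.title or "").strip():
        problems.append("empty-title-field")
    # Category values dropped from the model to reduce visual clutter.
    if any(c is None or not str(c).strip() for c in graph.categories):
        problems.append("missing-category-values")
    # Gaps in the series data itself.
    if any(v is None for s in graph.series for v in s.values):
        problems.append("missing-data-values")
    # A "custom" type says nothing about whether this is a line or bar chart.
    if graph.chart_type.strip().lower() == "custom":
        problems.append("custom-chart-type")
    return problems


# Illustrative graph (invented data) exhibiting all four problems.
example = GraphModel(
    chart_type="custom",
    title=None,
    text_boxes=["Retail sales"],
    categories=["J", "", "", "A"],
    series=[Series("Sales", [1.0, None, 3.0, 4.0])],
)
print(audit(example))
```

Note that the sketch can only report that the title field is empty. Deciding which floating text box, if any, was meant as the title requires heuristics, which is the kind of brittleness the authors report in their own repair attempts.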

Relevance

This paper addresses a problem that remains largely unsolved nearly two decades later: the accessibility of data visualisations. The OM Principle is simple and applies well beyond graphs. In essence, it argues that authoring tools and practices should preserve semantic structure rather than rely on visual appearance alone, which directly parallels the broader accessibility argument for semantic HTML over purely visual styling. For practitioners creating charts and graphs today, the specific findings remain actionable: always use the designated title, axis label, and series name fields in charting tools rather than floating text boxes; include all category values in the data model even if not all are displayed; and specify chart types explicitly. The collaboration between Carleton University and Statistics Canada demonstrates the value of working with real-world data publishers to understand practical accessibility barriers at scale.
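The advice to keep every category value in the data model while displaying only some of them can be illustrated with a short Python sketch. It assumes a charting tool that accepts display labels separately from the underlying category values; the function name `visible_labels` and its `every` parameter are hypothetical and not taken from the paper or from any particular charting library.

```python
from typing import List


def visible_labels(categories: List[str], every: int) -> List[str]:
    """Return display labels showing every `every`-th category, blank otherwise.

    The input list is left untouched, so the data model keeps all values.
    """
    if every < 1:
        raise ValueError("every must be at least 1")
    return [c if i % every == 0 else "" for i, c in enumerate(categories)]


# The model holds all twelve months; only the axis display is thinned.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
labels = visible_labels(months, 3)
print(labels)
```

The design point is the separation of concerns: visual clutter is handled in the display layer, so a tool querying the model still finds a category value for every data point.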

Tags: data visualization · graph accessibility · blind and low vision · alt text · knowledge representation · automated accessibility · statistical graphics

Standards referenced: WCAG 1.0