Measuring the Impact of Automated Evaluation Tools on Alternative Text Quality: A Web Translation Study

Silvia Rodríguez Vázquez · 2016 · Proceedings of the 13th International Web for All Conference (W4A) · doi:10.1145/2899475.2899484

Summary

This paper presents the first empirical study on web accessibility conducted around a translation task, investigating how automated evaluation tools affect the quality of image text alternatives produced by web translators. Twenty-eight professional French translators were asked to translate a mock development campaign website (containing 130 images across three web pages) from English to French and check for image accessibility. The experiment used a split-plot design with two independent variables: web accessibility knowledge (two groups of 14) and use of tools (three levels — no tool, one tool, or two tools). In the experimental conditions, translators used two different evaluation tools: aDesigner, a general web accessibility conformance checker that provides clues about alt text appropriateness beyond simply detecting missing alt attributes, and Acrolinx, a controlled-language authoring tool adapted with 40 accessibility-oriented style rules specifically developed for French alt text quality checking. Acrolinx could flag inappropriate alt text patterns (such as "Facebook" as alt text for a social sharing icon) and suggest improvements based on linguistic rules for descriptive, functional, and uninformative content. Translation versions were cumulative — T1 served as baseline (no tools), T2 was checked with the first tool, and T3 with the second tool — to examine how iterative tool use improved alt text quality.
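To make the idea of a language-based alt text rule concrete, here is a minimal Python sketch. It is not Acrolinx's rule set or API: the rule names, the patterns, the `check_alt` function, and the extra brand name are all hypothetical, and the examples are in English for readability although the study's rules targeted French.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Finding:
    rule: str
    message: str


# Hypothetical patterns for illustration only; not taken from the paper or from Acrolinx.
FILENAME = re.compile(r"\.(png|jpe?g|gif|svg)$", re.IGNORECASE)
REDUNDANT_PREFIX = re.compile(r"^(image|picture|photo) of\b", re.IGNORECASE)
BARE_BRAND_NAMES = {"facebook", "twitter"}


def check_alt(alt: Optional[str], is_link: bool = False) -> List[Finding]:
    """Return rule findings for one image's alt text (None means no alt attribute)."""
    if alt is None:
        return [Finding("missing-alt", "Image has no alt attribute.")]

    text = alt.strip()
    findings = []
    if FILENAME.search(text):
        findings.append(Finding("filename", "Alt text looks like a file name."))
    if REDUNDANT_PREFIX.match(text):
        findings.append(Finding("redundant-prefix", "Drop 'image of' style prefixes."))
    if is_link and text.lower() in BARE_BRAND_NAMES:
        # A linked icon should describe its function, not just name the brand.
        findings.append(Finding("bare-brand-name", "Describe what the link does."))
    return findings


for finding in check_alt("Facebook", is_link=True):
    print(finding.rule, "-", finding.message)
```

A functional alternative such as "Share this campaign on Facebook" passes all three hypothetical rules, while the bare "Facebook" on a sharing icon is flagged. An empty alt string is deliberately left unflagged here, on the assumption that it marks a decorative image.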

Key findings

Seven screen reader users (JAWS and VoiceOver users from Switzerland, Canada, and France) evaluated 2,189 unique text alternatives using a four-level appropriateness scale. Statistical analysis (repeated-measures ANOVA, N=76,440 observations) showed highly significant effects. Using both tools produced significantly better alt text than no tools (p<0.001), and using either tool alone also improved quality significantly over no tools (p<0.001). Critically, Acrolinx produced significantly better results than aDesigner (p<0.001) when used as the sole tool, supporting the hypothesis that controlled-language style rules offering language-based repair recommendations are more effective than general conformance checkers for improving alt text quality. When both tools were used, the order mattered: using Acrolinx first followed by aDesigner yielded better results than the reverse order, likely because addressing linguistic quality issues first produced a stronger foundation that aDesigner could then complement by catching remaining structural issues like missing alt attributes. The number of "not appropriate" ratings was lower whenever tools were used: 14,006 observations with no tool, against 5,252 with Acrolinx alone and 10,841 when both tools were applied. The Acrolinx-alone count covers only the translators who used Acrolinx as their first tool, so these raw counts are not directly comparable as rates.
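The reported N can be reproduced from the figures in the summary, assuming every evaluator rated every image in every translation version. The short Python sketch below checks that arithmetic and, under the further assumption that each version contributes an equal share of observations, turns the two counts that cover all translators into rough rates. The per-version denominator and the resulting shares are derived here, not figures taken from the paper. The Acrolinx-alone count is left out because the number of translators who used Acrolinx first is not given here.

```python
# Figures taken from the review text.
translators = 28
images = 130
versions = 3       # T1 (no tool), T2 (first tool), T3 (second tool)
evaluators = 7

# Assumption: every evaluator rated every image in every version.
total_observations = translators * images * versions * evaluators

# Assumption: each version contributes an equal share of the observations.
per_version = total_observations // versions

# "Not appropriate" counts for the two conditions that cover all translators.
not_appropriate = {"no tool": 14_006, "both tools": 10_841}

print(f"total observations: {total_observations}")
print(f"observations per version: {per_version}")
for condition, count in not_appropriate.items():
    share = count / per_version
    print(f"{condition}: {share:.1%} rated not appropriate")
```

The product matches the reported N of 76,440, which suggests the observation count is the full translator by image by version by evaluator grid rather than the 2,189 unique alternatives.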

Relevance

This research opens an important and largely unexplored area: the role of web translators and localizers in web accessibility. As content reaches global audiences through translation, translators become de facto gatekeepers of image accessibility in localized websites, because they can preserve, improve, or break the accessibility of alt text during the localization process. The finding that automated tools significantly improve translators' alt text quality, even without formal accessibility training, has a direct practical implication: integrating accessibility checking into computer-assisted translation (CAT) tools could improve image accessibility across multilingual websites at scale. The controlled-language approach embodied by Acrolinx is particularly promising because it provides actionable, language-specific guidance rather than generic conformance warnings. For accessibility practitioners, this study reinforces that alt text quality is not just a binary question of presence or absence. Appropriateness, linguistic quality, and contextual accuracy also matter, and these dimensions require human judgment supported by intelligent tooling.

Tags: alternative text · image accessibility · web translation · localization · automated testing · controlled language · screen readers · multilingual accessibility

Standards referenced: WCAG 2.0 · ISO/IEC TS 20071-11:2012