Information overload in non-visual web transaction: context analysis spells relief
Jalal Mahmud · 2007 · Proceedings of the 9th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '07) · doi:10.1145/1296843.1296905
Summary
This short ASSETS 2007 poster paper by Jalal Mahmud of Stony Brook University describes ongoing PhD research into reducing the information overload that blind users experience when completing multi-step web transactions (shopping, registrations, bill payments) with a screen reader. The argument is that because screen readers process pages sequentially, users must wade through large amounts of irrelevant content on every page in a transaction flow before reaching the few elements they actually need (a search box, a result list, an "Add to cart" button, a checkout form). Sighted users can skim past this content visually; non-visual users cannot. Mahmud's system extends his earlier CSurf context-directed browser and is composed of five modules. A Browser Object fetches pages. A Geometric Analyzer partitions each page into spatially coherent segments. A Context Analyzer captures the textual context surrounding the link the user just followed and uses a Support Vector Machine to identify the most relevant segment on the next page. A Concept Extractor recognises transaction-relevant "semantic concepts" such as Search Result, Item Detail, AddToCart, and SearchForm using a small hand-curated keyword base. An Interface Manager delivers only the relevant segments to the user via VoiceXML dialogues. The combination of geometric segmentation, link-context analysis, and shallow concept matching is intended to be domain-scalable rather than tied to per-site rules.
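To make the Concept Extractor idea concrete, here is a minimal Python sketch of shallow keyword matching against a hand-curated keyword base. It is an illustration only, not the author's implementation: the poster names the concepts but does not publish its keywords, so the `KEYWORD_BASE` entries, the `extract_concepts` function, and the `min_hits` threshold are all assumptions introduced here.

```python
import re

# Hypothetical keyword base. The concept names come from the paper;
# the keywords themselves are invented for illustration.
KEYWORD_BASE = {
    "SearchForm": {"search", "find", "keyword"},
    "SearchResult": {"results", "matches", "sort"},
    "ItemDetail": {"description", "price", "specifications"},
    "AddToCart": {"add", "cart", "basket", "buy"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def extract_concepts(segment_text, min_hits=2):
    """Return concepts whose keywords occur at least `min_hits` times
    (counting distinct keywords) in the segment, strongest match first."""
    tokens = tokenize(segment_text)
    scores = {name: len(tokens & kws) for name, kws in KEYWORD_BASE.items()}
    hits = [name for name, score in scores.items() if score >= min_hits]
    return sorted(hits, key=lambda name: -scores[name])


if __name__ == "__main__":
    print(extract_concepts("Add to cart - buy now"))
    print(extract_concepts("Copyright 2007. All rights reserved."))
```

Requiring at least two distinct keyword hits is one simple way to keep a single stray word such as "search" in a footer from triggering a concept; the poster does not say how its matcher handles such cases.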
Key findings
The paper reports that the geometric segmentation and context-analysis components, published in detail in the author's WWW 2007 CSurf paper, achieve "reasonably high" precision and recall, and that a preliminary evaluation of the Concept Extractor across 12 commerce websites is encouraging. Extensive evaluation was still in progress, however, and the poster reports no numeric results. The methodological contribution Mahmud claims over prior segmentation work (e.g. Embley et al.'s record-boundary discovery and Takagi et al.'s site-wide annotation) is that geometric segmentation does not depend on manually specified rules or site-specific knowledge and is therefore scalable across domains. The core insight is that the context surrounding the link the user just followed is a strong predictor of which segment of the next page is relevant, so the system can present only that segment, along with detected transactional concepts, rather than the full page. User studies with blind and deaf-blind students were planned at Helen Keller Services for the Blind and the Helen Keller National Center but had not been completed at the time of writing.
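The link-context insight can be sketched in a few lines of Python. Note the substitution: the paper's Context Analyzer uses a trained Support Vector Machine, whereas this sketch ranks segments by plain bag-of-words cosine similarity so that it runs without training data. The function names and the sample page segments are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count the alphabetic tokens in the lowercased text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_relevant_segment(link_context, segments):
    """Return (index, score) of the segment most similar to the text
    surrounding the link the user just followed."""
    if not segments:
        raise ValueError("segments must not be empty")
    context = bag_of_words(link_context)
    scores = [cosine(context, bag_of_words(s)) for s in segments]
    best = max(range(len(segments)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Hypothetical segments of the page reached after following a product link.
    segments = [
        "Home Electronics Books Sign in",
        "Canon PowerShot digital camera, 8 megapixel, price $299 Add to cart",
        "Copyright terms privacy",
    ]
    index, score = most_relevant_segment("Canon digital camera 8 megapixel", segments)
    print(index, round(score, 3))
```

In this toy example the navigation bar and footer share no words with the link context, so only the product segment would be read out first. A learned classifier such as the paper's SVM could in principle weigh additional features, but the poster does not detail them.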
Relevance
For accessibility practitioners, this paper is a useful pointer to a still-relevant problem: even on modern, ARIA-rich websites, screen-reader users completing multi-step transactions are forced to traverse large amounts of layout, navigation, and promotional content on every page. Mahmud's combination of page segmentation, link-context analysis, and shallow concept extraction prefigures the approach taken by later screen-reader features (e.g. landmark and region navigation, content-extraction reader modes) and by AI-driven assistants that summarise transactional pages. The limitations of the work as reported are substantial: it is a two-page poster, no quantitative end-to-end results are given, no user studies with disabled participants are reported, and the system is built on a hand-curated set of transactional keywords that would need expansion to cover non-commerce domains. The paper is best read alongside the author's full CSurf paper for the technical details, and as a reminder that focused, task-relevant content delivery, rather than literal sequential rendering, is a necessary direction for non-visual web access.
Tags: screen readers · non-visual web access · web page segmentation · context analysis · web transactions · information overload · blindness and low vision · VoiceXML · machine learning · support vector machine · web accessibility · JAWS · IBM Home Page Reader