Semantic Bookmarking for Non-Visual Web Access
Saikat Mukherjee, I. V. Ramakrishnan, Michael Kifer · 2004 · Proceedings of the 6th International ACM SIGACCESS Conference on Computers and Accessibility (Assets 04) · doi:10.1145/1028630.1028663
Summary
This paper introduces semantic bookmarking, a technique for non-visual web access that allows blind users to bookmark meaningful content segments on web pages using domain ontologies rather than structural HTML positions. The research is built on HearSay, a speech-driven assistive web browser developed at Stony Brook University that automatically generates VoiceXML dialog interfaces from web page content. The core problem is that traditional bookmarks in assistive browsers are tied to specific HTML structures (tag sequences, positions in the DOM tree), so they break when web pages are redesigned even if the content remains the same. Semantic bookmarks instead associate saved locations with conceptual labels from a domain ontology. The system works through two processes: semantic partitioning, which groups HTML elements into semantically related clusters based on spatial locality and sequential patterns in the DOM tree, and semantic labeling, which assigns ontology-based concept labels to these partitions using classifiers. For example, on a news site, partitions might be labeled "Major Headlines," "Category News," or "Taxonomy News." The paper also introduces voicemarking, a speech-based creation and retrieval mechanism for semantic bookmarks where users can say the name of a concept and a keyword to jump directly to relevant content. Because semantic bookmarks are tied to concepts rather than page structure, they work across multiple websites in the same domain — a "Major Headlines" bookmark works on both the New York Times and CNN.
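The difference between a structure-bound bookmark and a concept-bound one can be illustrated with a minimal sketch. This is not HearSay's implementation: the `Partition` and `SemanticBookmark` classes, the `resolve` function, the DOM paths, and the sample page contents are all hypothetical, and the partitions are assumed to have already been produced and labeled by some upstream partitioning and labeling step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Partition:
    """Hypothetical stand-in for one labeled content segment of a page."""
    concept: str   # ontology concept label, e.g. "Major Headlines"
    dom_path: str  # structural location; differs per site and per redesign
    text: str      # text content of the segment


@dataclass(frozen=True)
class SemanticBookmark:
    """Hypothetical bookmark keyed by concept (plus optional keyword), not by DOM position."""
    concept: str
    keyword: Optional[str] = None


def resolve(bookmark: SemanticBookmark, partitions: List[Partition]) -> Optional[Partition]:
    """Return the first partition matching the bookmark's concept and keyword, or None."""
    for p in partitions:
        if p.concept != bookmark.concept:
            continue
        if bookmark.keyword and bookmark.keyword.lower() not in p.text.lower():
            continue
        return p
    return None


# Two made-up pages from the same domain with different structure (assumed data).
site_a = [
    Partition("Category News", "/html/body/div[1]/ul", "Sports, Business, Science"),
    Partition("Major Headlines", "/html/body/div[2]/table/tr[1]", "Election results announced"),
]
site_b = [
    Partition("Major Headlines", "/html/body/main/section[1]", "Election results announced overnight"),
    Partition("Taxonomy News", "/html/body/nav", "World, Politics, Tech"),
]

bookmark = SemanticBookmark("Major Headlines", keyword="election")
hit_a = resolve(bookmark, site_a)
hit_b = resolve(bookmark, site_b)
print(hit_a.dom_path)
print(hit_b.dom_path)
```

The same bookmark resolves on both pages even though the matching segments sit at unrelated DOM paths. A bookmark that stored only a DOM path would need to be recreated for each site and after each redesign.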
Key findings
Experimental evaluation compared access times across three systems: BrookesTalk (a baseline assistive browser), HearSay without voicemarks, and HearSay with voicemarks. Testing with four subjects on three news sites (New York Times, CNN, Google News) showed that HearSay substantially reduced access times for the "Major Headlines" concept relative to BrookesTalk. On the New York Times, access time fell from 215.5 seconds with BrookesTalk to 86 seconds with HearSay and to 53.25 seconds with HearSay+VoiceMark. Similar patterns held across CNN and Google News. The voicemarking feature provided further efficiency gains, with the percentage reduction in time increasing with question difficulty: easy questions saw a 4.58% reduction, medium questions 6.95%, and hard questions (requiring cross-site information correlation) 9.44%. This suggests that semantic bookmarks are most effective for complex information retrieval tasks across multiple sources. The semantic partitioning algorithm achieved over 90% accuracy for identifying concept instances across more than 100 web pages from a dozen different e-commerce product portals. The system also demonstrated effective handling of HTML tables by using ontologies to assign semantic labels to table rows and columns, enabling more comprehensible aural rendition than simple linearization.
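As a quick check on scale, the relative reductions implied by the three New York Times access times quoted above can be worked out directly. The percentages computed here are derived from those three figures only and are not reported in the summary. They cover a single concept on a single site, so they are not comparable to the per-difficulty voicemarking reductions (4.58%, 6.95%, 9.44%), which are measured differently. The helper name `percent_reduction` is an assumption for illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage of `before`."""
    return (before - after) / before * 100.0


# Access times in seconds for "Major Headlines" on the New York Times,
# as quoted in the summary.
brookestalk = 215.5
hearsay = 86.0
hearsay_voicemark = 53.25

hearsay_vs_baseline = percent_reduction(brookestalk, hearsay)
voicemark_vs_hearsay = percent_reduction(hearsay, hearsay_voicemark)
voicemark_vs_baseline = percent_reduction(brookestalk, hearsay_voicemark)

print(f"HearSay vs BrookesTalk:           {hearsay_vs_baseline:.1f}%")    # 60.1%
print(f"HearSay+VoiceMark vs HearSay:     {voicemark_vs_hearsay:.1f}%")   # 38.1%
print(f"HearSay+VoiceMark vs BrookesTalk: {voicemark_vs_baseline:.1f}%")  # 75.3%
```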
Relevance
This research addresses a fundamental problem in non-visual web access: the inefficiency of sequential content consumption that forces screen reader users to wade through entire pages to find desired information. For accessibility practitioners, the semantic approach offers important insights. First, content-based rather than structure-based navigation is more robust and maintainable — this principle applies broadly to accessible web design where semantic HTML and ARIA landmarks serve similar conceptual functions. Second, the cross-site applicability of domain ontologies demonstrates the value of standardized content semantics, anticipating concepts later formalized in schema.org and structured data. Third, the voicemarking interaction pattern — where users create personalized shortcuts through speech — illustrates how accessibility tools can go beyond merely making content available to making it efficiently retrievable. Key limitations include the dependency on manually constructed domain ontologies (the system had ontologies for news, education, and e-commerce only), the small evaluation sample size (four subjects), and the focus on content-rich structured sites where partitioning algorithms work best.
Tags: web accessibility · visual impairment · screen reader · semantic web · web navigation · bookmarking · personalization · information retrieval · ontology
Standards referenced: VoiceXML