Web Information Extraction
Also known as: WIE, Web Data Extraction, Web Scraping
Web Information Extraction (WIE) is a set of techniques for automatically identifying and extracting structured data from web pages. In the context of accessibility, WIE methods analyze the visual rendering of a web page to infer document structure, semantic roles, and content relationships that are not explicitly marked up in the HTML. For example, WIE-based enrichers can detect headings from font size and style patterns even when heading tags are missing, identify navigation menus from the spatial clustering of links, or determine reading order from geometric layout analysis. These techniques are particularly valuable because they do not depend on correct semantic HTML markup: they work from the page as the browser renders it and extract the same structural cues that sighted users perceive visually.
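The heading-detection idea can be illustrated with a minimal sketch. It is only one possible heuristic, not a description of how any particular WIE enricher works. It assumes the rendered text blocks and their computed font sizes have already been collected from the browser. The names TextBlock and infer_headings and the 1.2 size ratio are hypothetical choices made for this illustration, and style cues such as bold weight are ignored for brevity.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TextBlock:
    """A rendered text block (hypothetical structure for this sketch)."""
    text: str
    font_size: float  # computed font size in CSS pixels


def infer_headings(blocks, ratio=1.2):
    """Return (level, text) pairs for blocks that look like headings.

    Assumption: the font size that covers the most characters is body
    text, and any block at least `ratio` times larger is a heading.
    """
    if not blocks:
        return []

    # Weight each font size by how many characters are rendered in it.
    weight = Counter()
    for block in blocks:
        weight[block.font_size] += len(block.text)
    body_size = weight.most_common(1)[0][0]

    # Distinct heading sizes, largest first; rank maps to levels 1..6.
    heading_sizes = sorted(
        {b.font_size for b in blocks if b.font_size >= ratio * body_size},
        reverse=True,
    )
    level_of = {size: min(rank + 1, 6) for rank, size in enumerate(heading_sizes)}

    # Keep document order so the result can feed a heading outline.
    return [(level_of[b.font_size], b.text) for b in blocks if b.font_size in level_of]


page = [
    TextBlock("Site title", 32),
    TextBlock("A long paragraph of ordinary body text on the page.", 16),
    TextBlock("Section", 24),
    TextBlock("Another paragraph of ordinary body text.", 16),
]
print(infer_headings(page))
```

Ranking the distinct sizes, rather than mapping fixed pixel values to levels, keeps the sketch independent of any one site's typography. A real enricher would likely combine this with weight, spacing, and position cues.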
Category: Web Development · Artificial Intelligence
Related: Page Segmentation · Navigation Axis · DOM · Screen Reader