Document Layout Analysis

Also known as: DLA, page layout analysis

A computer-vision task that identifies and classifies the visual regions of a document page (headings, paragraphs, tables, figures, captions, lists, headers, and footers), typically using object-detection models trained on datasets such as DocLayNet, PubLayNet, or DocBank. Document layout analysis underpins many automated accessibility remediation tools, because tagging a PDF first requires locating each content region and determining what kind of element it represents. Its accuracy therefore directly limits how well automated taggers can assign correct PDF structure tags such as <H1>, <P>, <Figure>, or <Table>: a region that is missed or misclassified cannot receive the right tag downstream.
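The dependency between detection and tagging can be illustrated with a minimal sketch. It assumes a layout detector has already produced labelled bounding boxes using DocLayNet-style class names. The Region type, the assign_tags function, the 0.5 confidence threshold, the top-to-bottom ordering, and the label-to-tag table are all hypothetical choices made for illustration, not the method of any particular tool.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative (assumed) mapping from DocLayNet-style class names to PDF
# structure types. Real taggers infer heading levels, list nesting, and
# table structure with far more care than a flat lookup.
LABEL_TO_TAG = {
    "Title": "H1",
    "Section-header": "H2",   # assumption: every section header is level 2
    "Text": "P",
    "List-item": "LI",
    "Table": "Table",
    "Picture": "Figure",
    "Caption": "Caption",
    "Formula": "Formula",
    "Footnote": "Note",
    "Page-header": "Artifact",  # running headers/footers are usually
    "Page-footer": "Artifact",  # marked as artifacts, not tagged content
}


@dataclass
class Region:
    """One detected layout region (hypothetical detector output)."""
    label: str
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1), origin top-left
    score: float                             # detector confidence in [0, 1]


def assign_tags(regions: List[Region], min_score: float = 0.5):
    """Drop low-confidence detections, order the rest top-to-bottom then
    left-to-right (a naive single-column reading order), and map each
    label to a structure tag. Unknown labels fall back to "P"."""
    kept = [r for r in regions if r.score >= min_score]
    kept.sort(key=lambda r: (r.bbox[1], r.bbox[0]))
    return [(LABEL_TO_TAG.get(r.label, "P"), r.bbox) for r in kept]


if __name__ == "__main__":
    page = [
        Region("Text", (50, 120, 500, 300), 0.93),
        Region("Title", (50, 40, 500, 90), 0.98),
        Region("Picture", (50, 320, 300, 500), 0.41),  # below threshold
        Region("Page-footer", (50, 760, 500, 780), 0.88),
    ]
    for tag, bbox in assign_tags(page):
        print(tag, bbox)
```

In this sketch the low-confidence Picture detection is discarded, so no <Figure> tag is emitted for that region. This is the sense in which layout analysis accuracy bounds tagging quality: the tagger can only label what the detector finds and classifies correctly.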

Category: PDF Accessibility · Document Accessibility · Machine Learning · Automated Accessibility

Related: Tagged PDF · PDF Remediation · Optical Character Recognition · Vision-Language Model

Sources

https://doi.org/10.1145/3772318.3790289