Architecture

Paradise parses HTML, JavaScript, and CSS into three specialised semantic models, merges the three through CSS selectors into a single integrated DocumentModel, and runs its analysers over that. The architecture is what makes cross-file accessibility analysis tractable: a handler in handlers.js, an element in index.html, and a class in styles.css become one element with one set of behaviours that the analysers can reason about together.

Three semantic models, merged

The three model types are independent of each other: the DOMModel captures HTML structure and ARIA attributes, the ActionLanguage tree captures JavaScript behaviours, and the CSSModel captures style declarations and selector specificity. Each is built by a parser that knows only its own source language. The integration happens in the DocumentModel, which uses CSS selectors as the joining key — the same selectors that the browser would use to decide which styles apply to which element.

┌─────────────┐     ┌──────────────────┐     ┌─────────────┐
│  DOMModel   │────▶│  DocumentModel   │◀────│  CSSModel   │
│  (HTML)     │     │   (Integration)  │     │  (Styles)   │
└─────────────┘     └──────────────────┘     └─────────────┘
                             ▲
                             │
                    ┌────────┴────────┐
                    │ ActionLanguage  │
                    │  (JavaScript)   │
                    └─────────────────┘

DOMModel and CSSModel feed the integrated DocumentModel directly; the ActionLanguage tree feeds it through the selectors its handlers target. The diagram is a visual aid only; this paragraph conveys the same structure for screen-reader users.

DOMModel

The DOMModel captures the static structure of the HTML — elements, their attributes, their parent-child relationships, their ARIA roles, properties, and states. It is parsed from the source HTML, not from a rendered browser DOM. That distinction matters: the source DOM is what an author wrote, before JavaScript has had a chance to mutate it. Source-level analysis catches issues the author can fix at the source level, where they belong.

The DOMModel also tracks the accessibility tree implications of each element — the role each element would expose to assistive technology, the accessible name and description it would carry, the keyboard interactions it would respond to by default. That computed view is what the analysers reason about when they ask “does this element behave the way an assistive-tech user would expect?”.
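A minimal sketch of that computed view, assuming a simplified element record. The record shape, the IMPLICIT_ROLES table, and both function names are illustrative, not Paradise's actual DOMModel API; only a handful of tags are covered.

```javascript
// Hypothetical, heavily simplified element record; not Paradise's real DOMModel.
const IMPLICIT_ROLES = {
  button: "button",
  a: "link", // only when href is present, handled in exposedRole
  nav: "navigation",
  main: "main",
};

// Role the element would expose to assistive technology.
function exposedRole(el) {
  // An explicit role attribute wins over the tag's implicit role.
  if (el.attrs.role) return el.attrs.role;
  if (el.tag === "a" && !("href" in el.attrs)) return null;
  return IMPLICIT_ROLES[el.tag] ?? null;
}

// Keys that activate the element without any author-written JavaScript.
function defaultKeyboardActivation(el) {
  if (el.tag === "button") return ["Enter", " "];
  if (el.tag === "a" && "href" in el.attrs) return ["Enter"];
  return []; // a div with role="button" gets no keyboard behaviour for free
}

const saveDiv = { tag: "div", attrs: { id: "save", role: "button", tabindex: "0" } };
const nativeButton = { tag: "button", attrs: {} };

console.log(exposedRole(saveDiv), defaultKeyboardActivation(saveDiv));
console.log(exposedRole(nativeButton), defaultKeyboardActivation(nativeButton));
```

Both elements expose the button role, but only the native button responds to Enter and Space by default. That gap is the kind of expectation mismatch the analysers look for.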

ActionLanguage

The ActionLanguage tree captures the semantic behaviour of the JavaScript — what each statement does, not just how it’s spelled. A loop becomes an iteration node. A closure becomes a binding node. A call to addEventListener becomes a registration node tied to the selector it targets. Two JavaScript fragments that have the same effect collapse to the same ActionLanguage tree, even when their syntax differs.
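To illustrate the collapsing, here is a sketch that lowers two spellings of a click registration to one node. The input statement shapes and the registration node's fields are assumptions made for this example; the real IR is not reproduced here, and the sketch ignores the fact that an on* property holds only one handler at a time.

```javascript
// Hypothetical node shapes; stand-ins for parsed JavaScript statements.
function toRegistration(stmt) {
  // Form 1: target.addEventListener("click", handler)
  if (stmt.type === "call" && stmt.method === "addEventListener") {
    return {
      kind: "registration",
      selector: stmt.target,
      event: stmt.args[0],
      handler: stmt.args[1],
    };
  }
  // Form 2: target.onclick = handler
  if (stmt.type === "assign" && stmt.property.startsWith("on")) {
    return {
      kind: "registration",
      selector: stmt.target,
      event: stmt.property.slice(2), // "onclick" -> "click"
      handler: stmt.value,
    };
  }
  return null; // not an event registration
}

const viaListener = {
  type: "call",
  target: "#save",
  method: "addEventListener",
  args: ["click", "doSave"],
};
const viaProperty = { type: "assign", target: "#save", property: "onclick", value: "doSave" };

const a = toRegistration(viaListener);
const b = toRegistration(viaProperty);
console.log(JSON.stringify(a) === JSON.stringify(b)); // true
```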

The shape of the intermediate representation (IR) descends directly from work I did on adaptive user interfaces in 2010; the ActionLanguage IR page (in progress) treats the form in detail, and the Lineage page covers the research lineage.

CSSModel

The CSSModel captures the cascade: every selector, every declaration block, every media query, with specificity and source order preserved. It can answer questions of the form “under condition X, what styles apply to element Y?” without rendering the page. That matters for accessibility because declarations like display: none, visibility: hidden, and pointer-events: none change whether an element can be reached or operated by keyboard, pointer, and screen-reader users, and those declarations can sit in a different file from the element they affect.
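The "under condition X, what styles apply to element Y?" query can be sketched as below. This is a toy cascade under stated assumptions: it handles compound selectors only (tag, #id, .class, :pseudo-class), treats the condition as a set of active pseudo-classes, and ignores combinators, !important, and media queries. All function names are illustrative.

```javascript
// Split a compound selector such as "div.save:focus" into its parts.
function parseCompound(selector) {
  const parts = { tag: null, id: null, classes: [], pseudos: [] };
  for (const [, prefix, name] of selector.matchAll(/([#.:]?)([\w-]+)/g)) {
    if (prefix === "#") parts.id = name;
    else if (prefix === ".") parts.classes.push(name);
    else if (prefix === ":") parts.pseudos.push(name);
    else parts.tag = name;
  }
  return parts;
}

// Specificity as [ids, classes + pseudo-classes, types].
function specificity(parts) {
  return [parts.id ? 1 : 0, parts.classes.length + parts.pseudos.length, parts.tag ? 1 : 0];
}

function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) if (a[i] !== b[i]) return a[i] - b[i];
  return 0;
}

function matches(parts, el, state) {
  return (
    (!parts.tag || parts.tag === el.tag) &&
    (!parts.id || parts.id === el.id) &&
    parts.classes.every((c) => el.classes.includes(c)) &&
    parts.pseudos.every((p) => state.has(p))
  );
}

// Declarations that win for `el` when the pseudo-classes in `state` are active.
function stylesFor(rules, el, state = new Set()) {
  const applicable = rules
    .map((rule, order) => ({ ...rule, order, parts: parseCompound(rule.selector) }))
    .filter((r) => matches(r.parts, el, state))
    // Lowest priority first, so later entries overwrite earlier ones.
    .sort(
      (x, y) =>
        compareSpecificity(specificity(x.parts), specificity(y.parts)) || x.order - y.order
    );
  return Object.assign({}, ...applicable.map((r) => r.declarations));
}

const rules = [
  { selector: ".save", declarations: { cursor: "pointer", padding: "0.5rem 1rem" } },
  { selector: ".save:focus", declarations: { display: "none" } },
];
const saveEl = { tag: "div", id: "save", classes: ["save"] };

console.log(stylesFor(rules, saveEl));
console.log(stylesFor(rules, saveEl, new Set(["focus"])));
```

The second call includes display: "none" while the first does not, without any page being rendered.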

The merge step: DocumentModel

The three models are integrated into a single DocumentModel by walking the DOMModel and resolving each element against the other two. For an element, the merge produces:

The element’s computed style at default viewport / media-query state, from the CSSModel.
The handlers attached to it — directly via on* attributes, via JavaScript in the ActionLanguage tree, or indirectly via event delegation on an ancestor — from selector resolution.
The ARIA relationships into and out of it: which elements name it via aria-labelledby, which it controls via aria-controls, which its aria-describedby targets, all matched against the actual elements in the DOMModel.
The focus path it sits on — its tabindex, its visible/focusable state under CSS, its position in source order.

Once an element carries all of that information, the analysers can ask cross-cutting questions in plain terms. “Is this click handler also reachable by keyboard?” Yes, if the same selector also has a keydown handler in the ActionLanguage tree. “Does this aria-labelledby point at an element that exists?” Yes, if the target id resolves against the DOMModel.
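A rough sketch of the merge and of the two questions above. The domModel and registrations arrays are hypothetical stand-ins for the DOMModel and the ActionLanguage registrations, and selectorMatches handles only bare #id, .class, and tag selectors.

```javascript
// Hypothetical inputs standing in for two of the three models.
const domModel = [
  {
    id: "save",
    tag: "div",
    classes: ["save"],
    attrs: { role: "button", tabindex: "0", "aria-labelledby": "save-label" },
  },
];
const registrations = [
  { selector: "#save", event: "click" },
  { selector: ".save", event: "keydown" },
];

function selectorMatches(selector, el) {
  if (selector.startsWith("#")) return el.id === selector.slice(1);
  if (selector.startsWith(".")) return el.classes.includes(selector.slice(1));
  return el.tag === selector;
}

// Attach handlers and ARIA resolution results to every element.
function merge(dom, regs) {
  const ids = new Set(dom.map((el) => el.id).filter(Boolean));
  return dom.map((el) => {
    const handlers = regs
      .filter((r) => selectorMatches(r.selector, el))
      .map((r) => r.event);
    const labelledby = (el.attrs["aria-labelledby"] ?? "").split(/\s+/).filter(Boolean);
    return { ...el, handlers, unresolvedLabels: labelledby.filter((id) => !ids.has(id)) };
  });
}

// "Is this click handler also reachable by keyboard?"
function clickWithoutKeyboard(el) {
  const hasKey = el.handlers.some((e) => e === "keydown" || e === "keyup");
  return el.handlers.includes("click") && !hasKey;
}

const [save] = merge(domModel, registrations);
console.log(save.handlers, clickWithoutKeyboard(save), save.unresolvedLabels);
```

Here the click and keydown handlers arrive through different selectors yet land on the same element, and the aria-labelledby target is reported as unresolved because no element with id "save-label" exists in the sample DOM.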

Worked example: a handler split across files

This is the example that motivates the multi-model architecture more than any other. It spans three files:

<!-- index.html -->
<div id="save" class="save" role="button" tabindex="0">
  Save
</div>

// handlers.js
const save = document.getElementById("save");

save.addEventListener("click", () => {
  doSave();
});

save.addEventListener("keydown", (e) => {
  if (e.key === "Enter" || e.key === " ") {
    e.preventDefault();
    doSave();
  }
});

/* styles.css */
.save {
  cursor: pointer;
  padding: 0.5rem 1rem;
}

.save:focus {
  /* Bug: hides the button the moment it receives focus. */
  display: none;
}

An AST-pattern linter sees three files independently. The HTML linter flags the <div role="button"> as a non-native button. The JavaScript linter sees a file of event-handler code with no associated HTML. The CSS linter sees a class definition with no consumer. Each warning fires; the user weighs three independent complaints; no warning has the context to decide whether the situation is actually a problem.

A rendered-DOM scanner does better: it sees the click handler attached, the keydown handler attached, the role="button" applied. But it cannot tell whether the keydown handler is there because the source had it or because some test harness attached it, and it cannot tell that the element breaks under .save:focus { display: none; }, because that state is one user action away from the snapshot it captured.

Paradise sees all three files at once. The DOMModel records the <div> with id="save" and role="button". The ActionLanguage tree records two registrations on #save: a click handler and a keydown handler that fires on Enter or Space. The CSSModel records that .save:focus sets display: none. The DocumentModel composes these and the analysers report one issue, in plain terms: the keyboard-equivalent handler is in place, but the element disappears the moment it receives focus, so it is not actually keyboard-reachable. That issue cannot be detected in any single file.
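The final check can be sketched as follows, assuming the merge has already produced a record for #save that holds its handlers and its resolved style per state. The record shape, the styleByState field, and the rule id are illustrative, not Paradise's actual output.

```javascript
// Hypothetical merged record for the #save element from the three files.
const saveElement = {
  selector: "#save",
  role: "button",
  tabindex: 0,
  handlers: ["click", "keydown"],
  styleByState: {
    default: { cursor: "pointer", padding: "0.5rem 1rem" },
    focus: { cursor: "pointer", padding: "0.5rem 1rem", display: "none" },
  },
};

function isHidden(style) {
  return style.display === "none" || style.visibility === "hidden";
}

// Report an element that can take focus but is hidden once it has it.
function checkFocusVisibility(el) {
  const focusable = Number.isInteger(el.tabindex) && el.tabindex >= 0;
  // An element hidden from the start is a different finding; skip it here.
  if (!focusable || isHidden(el.styleByState.default)) return null;
  if (isHidden(el.styleByState.focus)) {
    return {
      rule: "focus-hides-element", // illustrative rule id
      selector: el.selector,
      message: "Element is focusable but is hidden by CSS as soon as it receives focus.",
    };
  }
  return null;
}

console.log(checkFocusVisibility(saveElement));
```

The check needs the tabindex from the HTML and the :focus style from the CSS at the same time, which is why no single-file tool can run it.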

What the architecture enables

The kinds of analysis Paradise can run that single-file or rendered-DOM tools struggle with fall into six categories:

Cross-file event-handler validation. Click + keyboard equivalents that aren’t in the same source file.
ARIA relationship validation. aria-labelledby, aria-describedby, aria-controls, aria-owns all checked against the actual elements that should exist.
Visibility-focus conflicts. Elements that take focus but are hidden by CSS — by display: none, by visibility: hidden, by zero-size dimensions, or by being clipped off-screen.
Focus-order reasoning. tabindex values across the page, evaluated as a global ordering rather than per-element local values.
Framework-aware patterns. React hooks and portals, Vue reactivity, Svelte directives, Angular bindings — patterns where the same accessibility rule needs different evidence to verify.
WAI-ARIA widget patterns. All twenty-one canonical patterns — combobox, dialog, tree, grid, etc. — checked end-to-end including the JavaScript that actually wires them up.
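As one illustration from the list above, focus-order reasoning can be sketched as a global sort. The ordering rule is the browser's sequential focus navigation: positive tabindex values come first in ascending order, then tabindex 0 in source order, and negative values are skipped. The element records are hypothetical and assume focusability has already been resolved to a numeric tabindex.

```javascript
// Compute the page-wide Tab order from per-element tabindex values.
function tabOrder(elements) {
  const indexed = elements.map((el, sourceIndex) => ({ ...el, sourceIndex }));
  const reachable = indexed.filter((el) => el.tabindex >= 0);
  const positive = reachable
    .filter((el) => el.tabindex > 0)
    .sort((a, b) => a.tabindex - b.tabindex || a.sourceIndex - b.sourceIndex);
  const natural = reachable.filter((el) => el.tabindex === 0); // already in source order
  return [...positive, ...natural].map((el) => el.id);
}

// Elements listed in source order.
const page = [
  { id: "nav", tabindex: 0 },
  { id: "skip", tabindex: 1 },
  { id: "offscreen", tabindex: -1 },
  { id: "search", tabindex: 2 },
  { id: "save", tabindex: 0 },
];

console.log(tabOrder(page)); // [ 'skip', 'search', 'nav', 'save' ]
```

Each tabindex looks reasonable in isolation; only the global ordering shows that "search" is visited before "nav" even though it comes later in the source.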

Confidence is a first-class concept

Every issue Paradise reports carries a confidence level alongside its severity: one of HIGH, MEDIUM, or LOW, plus a short human-readable reason (“all three sources present”, “handler resolution depends on dynamically-bound this”, “CSS rule applies through a selector that may be outscored at runtime”). Confidence reflects the engine’s certainty given the source it actually has — not the severity of the underlying issue. A HIGH-confidence info finding is often more actionable than a LOW-confidence error, because the engine is telling you it’s sure about the smaller thing and guessing about the larger one.

The level resolves to a numeric percentage that surfaces expose to users: in the Playground, every issue card shows a confidence percentage; in the VS Code plugin, the hover popup carries the same number. The percentage is derived from the level and the document context the analyser had available — a finding that runs over a complete HTML document gets a higher percentage than the same finding over a body-only fragment, which gets a higher percentage than the same finding over a bare fragment with no <body>. A full document at HIGH is 100%; a fragment at LOW is 40%. The mapping is calibrated against the engine’s evaluation corpus so the numbers carry information rather than reading as decoration.
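The shape of that mapping can be sketched as a level base minus a context penalty. Only two data points come from the text above (a full document at HIGH is 100%, a fragment at LOW is 40%); every other number in the tables below is a placeholder chosen to fit those endpoints and the stated ordering, not the engine's calibrated values. The context names are also assumed.

```javascript
// ASSUMED values: only HIGH/full-document = 100 and LOW/bare-fragment = 40
// are taken from the article; the rest are placeholders.
const LEVEL_BASE = { HIGH: 100, MEDIUM: 80, LOW: 60 };
const CONTEXT_PENALTY = { "full-document": 0, "body-fragment": 10, "bare-fragment": 20 };

function confidencePercent(level, context) {
  if (!Object.hasOwn(LEVEL_BASE, level)) throw new RangeError(`unknown level: ${level}`);
  if (!Object.hasOwn(CONTEXT_PENALTY, context)) throw new RangeError(`unknown context: ${context}`);
  return LEVEL_BASE[level] - CONTEXT_PENALTY[context];
}

console.log(confidencePercent("HIGH", "full-document")); // 100
console.log(confidencePercent("LOW", "bare-fragment")); // 40
```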

Most accessibility tools suppress uncertainty: a finding is either reported or it isn’t, with no signal of how sure the tool was. The hidden cost is that everything reported reads as equally weighted, so users triage by severity alone — and the noisiest analysers (low-precision rules with high recall) drown out the signals. By exposing confidence as a first-class field, Paradise lets users sort, filter, and judge findings the way the engine actually saw them. Filter to HIGH-confidence-only on a triage pass; sweep through LOW-confidence findings as a separate audit; never see the two collapsed into one undifferentiated stream.
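A consumer-side sketch of that triage, assuming each issue is a plain object with rule, severity, confidence, and reason fields. The field names, the rule ids, and the "warning" severity are assumptions; the confidence levels and example reasons are the ones described above.

```javascript
// Hypothetical issue list as a consumer might receive it.
const issues = [
  { rule: "focus-hides-element", severity: "error", confidence: "HIGH", reason: "all three sources present" },
  { rule: "missing-keyboard-handler", severity: "error", confidence: "LOW", reason: "handler resolution depends on dynamically-bound this" },
  { rule: "redundant-role", severity: "info", confidence: "HIGH", reason: "all three sources present" },
];

const CONFIDENCE_RANK = { HIGH: 0, MEDIUM: 1, LOW: 2 };
const SEVERITY_RANK = { error: 0, warning: 1, info: 2 };

// Keep only findings at one confidence level, e.g. for a HIGH-only triage pass.
function byConfidence(list, level) {
  return list.filter((issue) => issue.confidence === level);
}

// Order by confidence first and severity second, without mutating the input.
function sortForTriage(list) {
  return [...list].sort(
    (a, b) =>
      CONFIDENCE_RANK[a.confidence] - CONFIDENCE_RANK[b.confidence] ||
      SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
  );
}

console.log(byConfidence(issues, "HIGH").map((i) => i.rule));
console.log(sortForTriage(issues).map((i) => `${i.confidence}/${i.severity} ${i.rule}`));
```

Sorting by confidence before severity puts the HIGH-confidence info finding ahead of the LOW-confidence error, matching the argument that the former is often the more actionable of the two.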

Suggested fixes alongside diagnostics

For many issues, the engine emits a suggested fix alongside the diagnostic — a short description of the change, a code suggestion, and (when known) the file the suggestion belongs in. Fixes are engine-emitted, surface-applied: the Playground renders them in a Fix dialog with Apply-to-editor and Copy buttons; the VS Code plugin exposes them as Quick Fixes via the standard Code Actions / lightbulb affordance; a CI consumer can iterate over issue.fix programmatically and apply in batch.

The fix payload is a starting point, not a guaranteed correction. Paradise reports what to write but doesn’t always know where to write it: the engine emits the corrective code and the file it probably belongs in, but it doesn’t indicate whether to insert, replace, or append at a specific line. Each surface applies the fix on a best-effort basis (the Playground currently appends to the named file) and states that limitation in its UI text so users review before committing. Fixes are most reliable for self-contained changes, such as an aria-label to add to a button, a keydown handler to mirror an existing click, or a CSS rule to delete. They are less reliable when the correction depends on surrounding context the engine can’t resolve from source alone.
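A sketch of how a batch consumer might apply fixes under the same append-only limitation. The issue.fix field is named above; its inner fields (description, code, file) and the in-memory Map of file contents are assumptions made to keep the example self-contained.

```javascript
// Append each fix to the file it names; collect the ones that cannot be placed.
function applyFixes(issues, files) {
  const updated = new Map(files); // copy, so the caller's map is untouched
  const skipped = [];
  for (const issue of issues) {
    const fix = issue.fix;
    if (!fix) continue; // diagnostic without a suggested fix
    if (!fix.file || !updated.has(fix.file)) {
      skipped.push(issue); // no known target file: leave for manual review
      continue;
    }
    const current = updated.get(fix.file);
    const separator = current.endsWith("\n") ? "" : "\n";
    updated.set(fix.file, current + separator + fix.code + "\n");
  }
  return { files: updated, skipped };
}

const sources = new Map([["styles.css", ".save { cursor: pointer; }"]]);
const reported = [
  {
    rule: "missing-focus-style",
    fix: {
      description: "Add a visible focus style",
      code: ".save:focus { outline: 2px solid; }",
      file: "styles.css",
    },
  },
  { rule: "unlabelled-button", fix: { description: "Add aria-label", code: 'aria-label="Save"' } },
  { rule: "no-fix-available" },
];

const { files, skipped } = applyFixes(reported, sources);
console.log(files.get("styles.css"));
console.log(skipped.map((i) => i.rule));
```

Appending is only safe for additions such as a new CSS rule. The second fix has no target file and is returned for review rather than guessed at, in keeping with the best-effort framing above.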

Honest framing matters here. A “one-click autofix” promise that lands the wrong code in the wrong place is worse than no autofix — it amplifies user mistakes rather than reducing them. Paradise reports the fix it knows, names the limitation in the same UI element, and lets the user choose whether to accept it.

What’s hard, what’s deferred

The architecture has limits, and the honest framing is important. Source-level analysis cannot see content fetched at runtime — third-party widgets injected into iframes, dynamic modules whose source isn’t present at scan time, content streamed in from a server in response to user interaction. For those, runtime tools like autoA11y are the right answer, not Paradise.

Within the source-level scope, the harder problems Paradise still works on are: dynamic CSS class assignment (a handler that adds .is-open to an element changes its visibility, but only conditionally); event delegation through complex parent chains (a handler on document.body that switches on e.target); template-driven HTML (React JSX, Vue templates) where the rendered structure is itself a function of state. Each of these has partial coverage in the current analysers; each is being tightened over time.
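To make the delegation case concrete, here is a sketch that collects the handlers reaching an element either directly or through an ancestor that filters on e.target. The node records and the targetClass field are hypothetical, and the sketch deliberately ignores unfiltered ancestor handlers and anything other than a class test, which is where the real difficulty lies.

```javascript
// Hypothetical flattened tree: each node names its parent and lists the
// handlers registered on it. A handler with targetClass is delegated.
const nodes = [
  { id: "body", parent: null, classes: [], handlers: [{ event: "click", targetClass: "save" }] },
  { id: "toolbar", parent: "body", classes: [], handlers: [] },
  { id: "save", parent: "toolbar", classes: ["save"], handlers: [{ event: "keydown" }] },
];

function effectiveHandlers(allNodes, id) {
  const byId = new Map(allNodes.map((n) => [n.id, n]));
  const target = byId.get(id);
  if (!target) return [];
  const found = [];
  // Walk from the element up through its ancestors.
  for (let node = target; node; node = byId.get(node.parent)) {
    for (const h of node.handlers) {
      const direct = node === target && !h.targetClass;
      const delegated = Boolean(h.targetClass) && target.classes.includes(h.targetClass);
      if (direct || delegated) {
        found.push({ event: h.event, via: direct ? "direct" : `delegated:${node.id}` });
      }
    }
  }
  return found;
}

console.log(effectiveHandlers(nodes, "save"));
```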

Reading on

ActionLanguage IR — the form of the tree, with worked example (in progress).
Lineage — where this architecture came from in the PhD-era research.