Accessible maps

Seven years of work on something the accessibility field has effectively abandoned: real spatial cognition for non-sighted users, not turn-by-turn navigation. Three working demos across building, subdivision, and city-neighbourhood scales — deliberately different from one another, because there is no one right way to render a map; it depends on what the map is for. What they share is an approach to spatial cognition, and one theoretical contribution about how coordinate systems collapse under modality conversion.

The position

Maps, just like websites — and any other modal experience — need CISNA. You’re trying to give the spatial knowledge of a place to a person who cannot see; the inventory of features, the navigation between them, and the semantics of each are all in play, and the existing field has barely got past “turn-by-turn directions if you happen to be on this exact bus.” I do not do these things by half.

The tagline is “do Google Maps right.” The commercial mapping companies have made only small-beer progress on accessibility over a decade. The accessibility-focused alternatives have invested heavily in step-by-step navigation while leaving spatial cognition essentially unsolved. The maps work here does the part the field has abandoned.

Origin: this is not a retrofit

“Maps need CISNA” is not a contemporary framing applied to a new project. The doctoral Design Language chapter from around 2009 cites Google Maps as the worked example of CISNA’s composite-content handling: “The maps presented on Google Maps would be a good example of this, as each map is a composite of images and text.” Howell’s 2005 paper on spatial metaphors for speech-based mobile city-guide services is cited alongside it as precedent reading. The CISNA architecture was being mapped onto interactive geographic content in the working papers fifteen years before the current SVG-tile platform shipped. What follows is the worked example of a 2009 claim.

The field critique

Before naming where the field has stalled, an integrity note about the evidence: Bob’s positions on Audiom, GoodMaps, and BlindSquare are observer-grade, based on the academic literature, published material, and direct field interaction. CNIB Access Labs has not formally evaluated any of them. The only competitor Bob has tested in a structured way is Navilens, via a small-scale installation usability test with two lived-experience testers plus Bob trying it out (CNIB Access Labs engagement; not a formal audit). The phrase Bob uses about it: “I wouldn’t be prepared to call it an audit.” That asymmetry of evidence matters in both directions: the Navilens view is more grounded than observer-only commentary, but it should not be inflated into formal-audit language.

The four products below are CNIB Access Labs partners, recommended case-by-case depending on the environment and the kind of movement the user needs. Each represents a distinct class of navigation-and-wayfinding tool with its own pros and cons; they are not like-for-like alternatives to one another.

The field map:

  • Audiom (XR Navigation): the closest existing work and the most accomplished commercial team in the space. Pin-as-datum, arrow-key movement, configurable step size, surface-underfoot announcement. Backed by 13 academic studies, 150 blind + 40 sighted co-design participants, and a third-party VPAT; deployed at the Wisconsin Geological Survey, Georgia Tech, NASA, and the University of Washington. Genuine strengths in empirical validation and procurement readiness that the work here does not yet have.
  • Navilens: real-world signage augmentation via proprietary visual codes, not a digital map at all. Massive deployment scale (MTA, Barcelona Metro, Heathrow, Coca-Cola packaging, hundreds of brands). The structural limit: Navilens cannot give spatial knowledge of a place you haven’t visited yet. The codes are physically placed, so the product augments a route only once a user is already walking it.
  • GoodMaps: indoor wayfinding for venues mapped with their LiDAR-based 3D point-cloud technology, deployed at airports (MidAmerica St. Louis), university campuses (York University’s Glendon Campus among others), and other commercial venues. Three surfaces: a mobile app for in-venue turn-by-turn with foot-level positioning, a web platform offering interactive 3D venue maps that can be previewed before a visit, and an SDK letting venue partners embed the positioning in their own apps. The map exploration is real but venue-bounded: the user gets a map of the venue they are entering, not a cognitive model of general space or unmapped places.
  • BlindSquare: positional awareness in real time. As the user moves, the app announces nearby points of interest, intersections, and venue features, letting them build a mental picture of the world immediately around them. Outdoors, the positioning is GPS plus OpenStreetMap and Foursquare data; indoors, it is Apple iBeacons that venues install, each beacon programmed to describe its location (door, service counter, washroom, vestibule). Every Service Canada location in Canada is BlindSquare-enabled, alongside the Yonge & St. Clair neighbourhood deployment in Toronto and other sites. Not turn-by-turn routing and not a pre-built spatial map: the user assembles the model from in-the-moment announcements about what is right here, right now.

The gap, summarised: the most frustrating thing about accessible maps is how little real progress there has been on spatial cognition specifically. Navigation gets the attention. Cognition gets the concession.

The research literature shows the same split. Manaswi Saha, Jon Froehlich, and colleagues’ 2022 CHI study of multi-stakeholder accessibility-map visualizations is careful, empirical, top-venue work: it examines how policymakers, department officials, advocates, caregivers, and people with mobility impairments make sense of sidewalk-accessibility data across seven map types. Its own stated limitation is the tell: the visualizations, the authors note, were not designed to support people with different visual abilities, a gap they name explicitly and defer to future work. The data is about accessibility; the map is not accessible to a non-sighted reader. That deferred piece, a map a non-sighted person can actually read and reason over, is where this work starts.

Three working demos

Each demo is a full-screen interactive map at its own URL; the page here is the brief that frames the demo and links out to it. What the three share is the approach to spatial cognition — pin-as-datum at viewport centre, dual-mode interaction (Cartesian via touch, polar via keyboard) — not the rendering or the feature set, which differ on purpose.

The difference is fit-for-purpose. On the Groves, what matters are the pinned points of interest (where the properties are), not the detail of the streets around them; so the Groves renders a raster base with an interactive pin overlay drawn on top, and only the pins need to be addressable. The East Toronto streetmap and the terminal map are about exploring the detailed space itself, so there everything is drawn as addressable SVG, and the richer affordances (ARIA landmarks, category filters, the rotor, the F6 region cycle) live there rather than on the Groves. There is no one perfect solution; the right rendering follows the job the map is doing.

  • The Groves subdivision — the simplest, and the demo that produced the theoretical finding. By far the most stripped-down: residential streets, no interior detail. The simplicity is what exposed the asymmetry between visual scanning and blind navigation.
  • East Toronto streetmap — the earliest OSM-rendering demo, first shown at a 45-minute in-person session at the 2019 Guelph Accessibility Conference. The conceptual model the family of maps shares — ARIA Landmarks, filters, rotor — originated here. Rendering is deliberately basic; the contribution is the SVG structure for screen-reader navigation.
  • Terminal map — interior airport-terminal wayfinding (the worked example is YVR’s Level 3 departures). The most feature-rich demo: gates, security, washrooms, retail, services. The terminal-grade demonstration that the approach scales beyond residential subdivisions.

Spatial cognition under modality conversion

The theoretical contribution. When spatial information is rendered through a modality that is sequential rather than parallel (audio, screen reader, haptic) and that the user occupies rather than observes, the spatial reference frame collapses from Cartesian to first-person polar coordinates centred on the user. Cartesian space is a sighted observer’s frame; polar space is an embodied user’s frame. The modality shift forces the frame shift.

The finding is the same one the audio Tetris work produced in different vocabulary. Converting a visual game to audio shifted the player from third-person observational to first-person immersive; converting a visual map to screen-reader-mediated audio shifted the coordinate system from Cartesian to polar centred on a chosen reference point. POIs became (name, distance, compass direction) arranged in onion-skin order from a chosen centre. Same asymmetry expressed in coordinate-system terms.
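
The conversion described above, from Cartesian map positions to (name, distance, compass direction) triples in onion-skin order, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the platform's code: the `Poi` and `PolarPoi` types, the function names, the spherical-Earth approximation, and the eight-point compass are all hypothetical choices made for the example.

```typescript
// Hypothetical data model: the source does not specify one.
interface LatLon { lat: number; lon: number; }
interface Poi extends LatLon { name: string; }
interface PolarPoi { name: string; distanceM: number; compass: string; }

const EARTH_RADIUS_M = 6371000; // mean Earth radius, spherical approximation
const toRad = (deg: number): number => (deg * Math.PI) / 180;

// Great-circle (haversine) distance between two points, in metres.
function distanceMetres(a: LatLon, b: LatLon): number {
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Initial bearing from a to b, in degrees clockwise from north (0 to 360).
function bearingDegrees(a: LatLon, b: LatLon): number {
  const phi1 = toRad(a.lat);
  const phi2 = toRad(b.lat);
  const dLon = toRad(b.lon - a.lon);
  const y = Math.sin(dLon) * Math.cos(phi2);
  const x =
    Math.cos(phi1) * Math.sin(phi2) -
    Math.sin(phi1) * Math.cos(phi2) * Math.cos(dLon);
  return ((Math.atan2(y, x) * 180) / Math.PI + 360) % 360;
}

// Eight-point compass names, 45 degrees per sector (an assumed granularity).
const COMPASS = [
  "north", "north-east", "east", "south-east",
  "south", "south-west", "west", "north-west",
];
function compassName(bearing: number): string {
  return COMPASS[Math.round(bearing / 45) % 8];
}

// Re-express POIs relative to a chosen centre, nearest ring first.
function toOnionSkin(centre: LatLon, pois: Poi[]): PolarPoi[] {
  return pois
    .map((p) => ({
      name: p.name,
      distanceM: Math.round(distanceMetres(centre, p)),
      compass: compassName(bearingDegrees(centre, p)),
    }))
    .sort((a, b) => a.distanceM - b.distanceM);
}
```

Sorting by distance is what yields the onion-skin order: a screen reader walking the resulting list moves outward from the chosen centre one ring at a time.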

It is not modality alone — it is modality plus interaction model. Touch as input mode preserves Cartesian even when output is audio, because the finger gives direct spatial reference. The fuller picture:

  • Visual + Cartesian: trivially the sighted user’s case.
  • Audio + sequential traversal (keyboard / screen-reader-only): polar, centred on a chosen reference. The original finding.
  • Audio + touch exploration: Cartesian via touch (the finger is the spatial reference; each location announces what is under it) plus polar on tap (when the user interrogates a specific POI, the polar coordinates describe its surroundings).
  • Audio + live egocentric (in-situ navigation): polar centred on the user’s actual GPS location, with compass orientation. Two distinct polar systems exist: allocentric (centred on a chosen reference, declarative, exploratory) and egocentric (centred on the user, dynamic, navigational).
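
The difference between the two polar systems in the last item comes down to one subtraction: an allocentric bearing is measured from north, while an egocentric bearing is measured from the way the user is facing. The sketch below illustrates that step only. The function names and the clock-face wording are assumptions made for the example; the source does not say how the demos phrase egocentric directions.

```typescript
// Normalise any angle to the range [0, 360).
function normaliseDegrees(deg: number): number {
  return ((deg % 360) + 360) % 360;
}

// Bearing of a POI relative to the way the user is facing.
// 0 = straight ahead, 90 = to the right, 180 = behind, 270 = to the left.
function relativeBearing(compassBearingDeg: number, headingDeg: number): number {
  return normaliseDegrees(compassBearingDeg - headingDeg);
}

// Clock-face rendering of a relative bearing (an assumed output idiom).
function clockPosition(relativeDeg: number): number {
  const hour = Math.round(normaliseDegrees(relativeDeg) / 30) % 12;
  return hour === 0 ? 12 : hour;
}

// Build an egocentric announcement from an allocentric bearing plus heading.
function announceEgocentric(
  name: string,
  distanceM: number,
  compassBearingDeg: number,
  headingDeg: number,
): string {
  const hour = clockPosition(relativeBearing(compassBearingDeg, headingDeg));
  return `${name}, ${Math.round(distanceM)} metres, ${hour} o'clock`;
}
```

A POI due east of the user (compass bearing 90) is at 3 o'clock when the user faces north and at 9 o'clock when the user faces south. The allocentric description stays fixed while the egocentric one changes with every turn, which is why the second system is dynamic and navigational.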

The pin-as-datum is the embodiment of all of this in the UI. In all three demos the pin sits at the centre of the viewport; the map orbits the pin. That makes the pin the visible signifier of four things at once: the visual marker (sighted users see it at centre); the polar origin (all distances and directions are relative to it); the datum (fixed reference the map orbits); and the user’s agent in the multi-agent / Community-of-Practice framing — negotiating on behalf of user capability and preference. Wheelchair users have agents that prioritise gradients, ramps, accessible washrooms; blind users have agents that prioritise accessible crossings and green spaces for guide-dog rest breaks. Same OSM data, same pin, same datum — but the map adapts differently because the agent at the centre is negotiating differently. That is CISNA plus the four-model capability framework plus the multi-agent CoP framing, applied to spatial cognition.
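
One way to picture “same data, different agent” is as a weighting step between the shared feature set and what gets surfaced around the pin. The sketch below is purely illustrative: the tag names, weights, and types are hypothetical, and the source does not describe how (or whether) the agents are implemented.

```typescript
// Hypothetical feature and agent shapes; tag names are illustrative only.
interface Feature { name: string; tags: string[]; }
interface Agent { label: string; weights: Record<string, number>; }

const wheelchairAgent: Agent = {
  label: "wheelchair user",
  weights: { ramp: 3, accessible_washroom: 3, gentle_gradient: 2 },
};

const blindAgent: Agent = {
  label: "blind user",
  weights: { accessible_crossing: 3, green_space: 2 },
};

// Sum the agent's weights over the tags a feature carries.
function score(agent: Agent, feature: Feature): number {
  return feature.tags.reduce((sum, tag) => sum + (agent.weights[tag] ?? 0), 0);
}

// Keep only features the agent cares about, most relevant first.
function rankFor(agent: Agent, features: Feature[]): Feature[] {
  return features
    .filter((f) => score(agent, f) > 0)
    .sort((a, b) => score(agent, b) - score(agent, a));
}
```

Given one shared list containing a park, a signalised crossing, and a ramped entrance, `rankFor(blindAgent, …)` returns the crossing and then the park, while `rankFor(wheelchairAgent, …)` returns only the ramped entrance. The dataset and the pin are unchanged; only the agent at the centre differs.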

This is paper-shaped substance that has not yet been written up. Working title: “Maps need CISNA: applying capability modelling and multi-agent communities of practice to accessible cartography.” A research direction, not a published claim.

Technical foundation

  • Addressable rendering where the goal is to explore the space. Commercial maps moved to raster tiles for performance; raster is opaque to screen readers. SVG elements are individually addressable, focusable, semantically labellable, and scalable without resampling. Where the job is to explore the detailed space (the East Toronto streetmap, the terminal map, the multi-tile Toronto streetmap), everything is drawn as SVG, the opposite of the field’s performance-driven raster choice. Where the job is to find pinned points of interest rather than explore the surrounding detail (the Groves), a raster base carries an addressable pin overlay, and only the pins need to be vector. The accessible layer is always addressable; whether the base is SVG follows the map’s purpose, not dogma.
  • Pre-rendered SVG, no runtime spatial-database queries. Nothing on the platform queries OpenStreetMap (or an Overpass endpoint, or any spatial database) at runtime. The published demos use one-time static OSM pulls, rendered offline and served as plain assets; the East Toronto streetmap, for instance, is a single SVG generated from one long-ago OSM extract, and the data isn’t refreshed. The multi-tile Toronto streetmap currently in development extends the same principle to a city: OSM data is processed offline into 0.01° geographic squares (~1 km² each), each rendered as a compressed SVG.gz file with ARIA labels pre-built at generation time, served from a tile server Bob maintains. The viewer fetches tiles from that server as the viewport pans; the spatial database is touched only at tile-generation time, never at view time.
  • CSS-based filtering for clutter management. Visibility toggles run at CSS speed, not JavaScript speed.
  • OpenStreetMap as the data source. Community-maintained, openly licensed, with the fine-grained tagging the indoor and pedestrian pieces of the maps work depend on.
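
Because the tiles in the list above are fixed 0.01° squares, the viewer can work out which pre-rendered files it needs from the viewport bounds alone, with no spatial query. The following is a minimal sketch of that arithmetic. The key format and function names are assumptions made for illustration; the source does not describe the tile server's actual naming scheme.

```typescript
const TILE_DEG = 0.01;          // tile edge in degrees, from the text
const KM_PER_DEG_LAT = 111.195; // spherical approximation: 2 * PI * 6371 km / 360

// Integer index of the tile containing a coordinate. Rounding to
// micro-degrees first avoids floating-point surprises on tile edges.
function tileIndex(deg: number): number {
  return Math.floor(Math.round(deg * 1e6) / 1e4);
}

// Hypothetical tile key: "<latIndex>_<lonIndex>".
function tileKey(lat: number, lon: number): string {
  return `${tileIndex(lat)}_${tileIndex(lon)}`;
}

// Every tile key touched by a viewport, row by row from south to north.
function tilesForViewport(
  south: number, west: number, north: number, east: number,
): string[] {
  const keys: string[] = [];
  for (let row = tileIndex(south); row <= tileIndex(north); row++) {
    for (let col = tileIndex(west); col <= tileIndex(east); col++) {
      keys.push(`${row}_${col}`);
    }
  }
  return keys;
}

// Approximate ground size of one tile at a given latitude.
function tileSizeKm(lat: number): { northSouthKm: number; eastWestKm: number; areaKm2: number } {
  const northSouthKm = TILE_DEG * KM_PER_DEG_LAT;
  const eastWestKm = northSouthKm * Math.cos((lat * Math.PI) / 180);
  return { northSouthKm, eastWestKm, areaKm2: northSouthKm * eastWestKm };
}
```

At Toronto's latitude (about 43.7° N) a 0.01° square works out to roughly 1.1 km north to south by 0.8 km east to west, a little under 1 km², which matches the approximate figure in the text. A degree of longitude shrinks with the cosine of latitude, so the tiles are slightly taller than they are wide.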

Universal-design discipline across four user populations

Not the usual one or two. Across the body of work, the interaction model addresses four user populations — screen-reader users, keyboard users, voice-control users (via Dragon NaturallySpeaking), and touch users — each with first-class affordances rather than a fallback experience.

The concepts below are distributed across the demos: this is the current state-of-play of the accessible-maps work as a whole, not a feature list any single demo implements end-to-end. Each demo carries some subset, and each new demo has been the surface on which one or another of these ideas was first expressed in code.

  • Rotor (iOS VoiceOver style) for narrowing tab order to a chosen POI class. Borrowed directly from the idiom users already know.
  • F6 landmark cycle. Three-position cycle (sidebar → map → controls), with last-position memory at each landmark. Two F6 taps from a selected map POI return the user to the sidebar where they were. Borrowed from Windows / Microsoft Office.
  • Voice control via Dragon NaturallySpeaking. Rotor includes a Dragon-optimised mode with voice-friendly category names. The voice population is often skipped; not skipped here.
  • Context-adapted skip-links. Standard skip-to-content / skip-to-map-controls augmented with domain-specific landmarks (e.g. “skip to Pier A / B / C / D / E” in the terminal map, with focus moving to the lowest-numbered gate in that pier).
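
The F6 cycle with last-position memory described in the list above is a small state machine, and modelling it without any DOM makes the behaviour easy to check. This is a sketch under assumptions, not the demos' implementation: the type names, the element ids, and the idea of returning a `restoreTo` id for the caller to focus are all hypothetical.

```typescript
type Landmark = "sidebar" | "map" | "controls";

// Cycle order from the text: sidebar, then map, then controls, then wrap.
const F6_ORDER: readonly Landmark[] = ["sidebar", "map", "controls"];

interface FocusState {
  current: Landmark;
  // Last focused element id per landmark; null if never visited.
  lastFocused: Record<Landmark, string | null>;
}

function initialState(): FocusState {
  return {
    current: "sidebar",
    lastFocused: { sidebar: null, map: null, controls: null },
  };
}

// Record that an element inside the current landmark received focus.
function focusElement(state: FocusState, elementId: string): FocusState {
  const lastFocused = { ...state.lastFocused };
  lastFocused[state.current] = elementId;
  return { current: state.current, lastFocused };
}

// Advance to the next landmark and report which element to restore there.
function pressF6(state: FocusState): { state: FocusState; restoreTo: string | null } {
  const next = F6_ORDER[(F6_ORDER.indexOf(state.current) + 1) % F6_ORDER.length];
  return {
    state: { current: next, lastFocused: state.lastFocused },
    restoreTo: state.lastFocused[next],
  };
}
```

Starting in the sidebar on a filter, pressing F6 once, and selecting a POI on the map, two further F6 presses pass through the controls and land back in the sidebar with `restoreTo` set to the original filter. In a real page the caller would call `focus()` on that element, falling back to the landmark container when `restoreTo` is null.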

The seven-year arc

The arc begins with the Groves subdivision, built out of a client request for accessible spatial information about a residential development; this is the work that produced the polar-coordinate finding. The East Toronto streetmap followed: first publicly shown at a 45-minute in-person session at the 2019 Guelph Accessibility Conference (a low-fidelity, black-and-white, file:///-served rendering of an east Toronto streetmap), it is the demo that introduced the ARIA Landmarks + filters + rotor model the family of maps now shares. The multi-tile Toronto streetmap followed as the direct architectural successor of East Toronto, scaling the single-tile pipeline to a full city; it is currently in active development, with no public demo live yet. Most recently, the terminal map carries the conceptual model into an indoor airport surface (worked example: YVR’s Level 3 departures). Same design vocabulary throughout; materially improved engineering and visual quality at each step.

Known gaps

  • Surface-under-foot announcement — Audiom has it (Esri facility data carries surface metadata); OSM doesn’t carry surface tags consistently for pedestrian-relevant features. A data-source limitation, not a design oversight.
  • Configurable step size on arrow-key movement — currently a TypeScript constant; should be user-configurable (city scale needs 50–100m steps; building scale needs 1–2m steps).
  • Direction-of-flow indication for unidirectional corridors on the terminal map — needed for any traveller who shouldn’t have to discover the direction by walking it. Known gap.
  • Right-click menu for non-drag pin placement — designed but not implemented.
  • No third-party VPAT or empirical usability validation at scale — Audiom has both; the work here has neither. Honest gap; the demos are working evidence, not procurement-ready artefacts.
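
Of the gaps above, the configurable step size is the most mechanical to close. A rough sketch of one possible shape follows, assuming per-scale defaults drawn from the ranges in the list and an optional user override. All names here are hypothetical, and the flat-Earth offset is an approximation that is adequate only for steps of this size.

```typescript
type MapScale = "building" | "city";
type Direction = "north" | "east" | "south" | "west";
interface Pin { lat: number; lon: number; }

// Defaults picked from the ranges named in the text
// (building scale 1 to 2 m, city scale 50 to 100 m).
const DEFAULT_STEP_M: Record<MapScale, number> = { building: 2, city: 50 };

// A positive user preference wins; otherwise fall back to the scale default.
function resolveStepMetres(
  scale: MapScale,
  userPrefs: Partial<Record<MapScale, number>> = {},
): number {
  const preferred = userPrefs[scale];
  return preferred !== undefined && preferred > 0 ? preferred : DEFAULT_STEP_M[scale];
}

const M_PER_DEG_LAT = 111195; // spherical approximation

// Move the pin one step in a compass direction (local flat-Earth offset).
function movePin(pin: Pin, direction: Direction, stepM: number): Pin {
  const dLat = stepM / M_PER_DEG_LAT;
  const dLon = stepM / (M_PER_DEG_LAT * Math.cos((pin.lat * Math.PI) / 180));
  switch (direction) {
    case "north": return { lat: pin.lat + dLat, lon: pin.lon };
    case "south": return { lat: pin.lat - dLat, lon: pin.lon };
    case "east":  return { lat: pin.lat, lon: pin.lon + dLon };
    case "west":  return { lat: pin.lat, lon: pin.lon - dLon };
  }
}
```

Replacing the single constant with a lookup such as `resolveStepMetres` would let one arrow-key handler serve both the terminal map and the city-scale tiles, with the user's preference taking priority over the default for each scale.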

Reading on