AI That Moves With You: A Review of Interactive Technologies Powered by Large Foundation Models for Mobility Impairment

Duosi Dai, Yuchong Zhang, Yong Ma, Danica Kragic · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3791239

Summary

This paper is a scoping review (PRISMA-ScR) of how large foundation models (FMs), meaning LLMs, large vision models, and vision-language models (VLMs), are being integrated into interactive assistive technologies for people with mobility impairments. It covers publications from January 2020 to May 2025. The authors adopt a deliberately functional definition of 'mobility impairment' that includes not only motor disability and wheelchair use but also blindness and low vision, deaf/hard-of-hearing (DHH), ALS, Parkinson's, and age-related chronic conditions, on the grounds that all of these constrain independent movement through the environment. Searches across five databases (ACM DL, IEEE Xplore, PubMed, Scopus, Web of Science) returned 6,249 records; after de-duplication, screening, full-text assessment, and snowballing, 26 papers formed the final corpus. The authors coded each paper with a six-dimension codebook: research objective, evaluation, large model, interactive technology, mobility impairment, and challenges. The analysis yields three contributions: a conceptual design space organising FM-enabled assistance by model family, integration pattern, interaction paradigm, and problem domain; a tabulated corpus with a reproducible codebook; and a forward-looking research agenda for CHI. The paper is especially valuable as an entry point for HCI researchers who want to understand how fragmented current FM-accessibility work is and where coordinated effort could have outsized impact.

Key findings

Research output is concentrated and very recent: 24 of 26 papers appeared in 2024-2025, with ACM CHI and ASSETS the most common venues. 53.8% of systems target blind/low-vision (BLV) users, versus only 3.8% for wheelchair users and 7.7% for deaf/hard-of-hearing (DHH) users, so 'mobility' in the literature remains heavily skewed toward sensory-driven mobility. Problem domains cluster around information access & comprehension (33.3%), navigation & environment understanding (29.6%), and health self-management & coaching (29.6%), with physical assistance at just 7.4% despite its importance. On the technical side, general-purpose LLMs (GPT-4 family) appeared in 51.9% of papers, VLMs in 33.3%, and hybrid LLM+VLM in 11.5%; prompt engineering and role conditioning dominated (92.3%), followed by safety-oriented techniques (34.6%) and RAG (23.1%). Four FM integration patterns recur: FM as Reasoner (59%, typically perception-to-LLM-reasoner chains), Orchestrator (19%, routing across sub-models), Accelerator (11%, e.g. SpeakFaster expanding abbreviated eye-gaze text for ALS users), and Planner-Actuator Bridge (4%, LLM-to-robot-action for physical assistance). Contribution types skew heavily toward artefacts (88.5% primary) and empirical studies (73.1% secondary), with methodology (53.8% secondary) and datasets (30.8% secondary) underrepresented, a gap the authors flag as a standards/reproducibility risk. Recurring technical challenges include latency and compute on edge devices, cloud reliance, data scarcity and dataset bias, reasoning fragility, and brittleness in long-horizon tasks. Ethical challenges include bystander privacy, data governance, cultural fit, reliability in high-stakes contexts, and the tension between AI autonomy and user agency.
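Because the corpus is small, each percentage above corresponds to only a handful of papers. The sketch below back-calculates paper counts from the quoted percentages. It assumes the denominator is the 26-paper corpus, which this summary does not state explicitly; the function name `implied_count` and the dictionary labels are illustrative only and do not come from the paper.

```python
# Back-calculate paper counts from the percentages quoted in the summary.
# ASSUMPTION: each share is a fraction of the 26-paper corpus.
CORPUS_SIZE = 26

reported = {
    "blind/low-vision users": 53.8,
    "wheelchair users": 3.8,
    "DHH users": 7.7,
    "prompt engineering / role conditioning": 92.3,
    "safety-oriented techniques": 34.6,
    "RAG": 23.1,
    "artefact (primary contribution)": 88.5,
    "empirical study (secondary)": 73.1,
    "methodology (secondary)": 53.8,
    "dataset (secondary)": 30.8,
}


def implied_count(percent, total=CORPUS_SIZE):
    """Return the whole-paper count whose share of `total` rounds to
    `percent` (one decimal place), or None if no whole count matches."""
    count = round(percent * total / 100)
    return count if round(100 * count / total, 1) == percent else None


counts = {label: implied_count(p) for label, p in reported.items()}
for label, n in counts.items():
    print(f"{label}: {n} of {CORPUS_SIZE}")
```

Under that assumption the BLV share is 14 papers, wheelchair users 1, and DHH 2. The problem-domain shares (33.3%, 29.6%, 7.4%) do not correspond to whole counts out of 26: `implied_count(33.3)` returns None, while `implied_count(33.3, total=27)` returns 9. One possible reading is that some categories were coded per instance rather than per paper, but this summary does not confirm that.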

Relevance

For accessibility practitioners and researchers, this review is the current best map of where FM-based assistive systems are being built and where they are not. Three implications stand out. (1) The physical-assistance domain and the wheelchair-user population are visibly underserved relative to BLV (blind/low-vision) work: if you are building in those areas, you are near the frontier and the evidence base is thin, so expect to contribute both systems and methodology. (2) The dominant Perception-to-Reasoner pattern is well established and reproducible; building one more such pipeline is unlikely to advance the field as much as work on evaluation frameworks, standardised benchmarks tailored to interactive FM assistance, or methods for safe orchestration and planner-actuator bridges. (3) The authors' ethical taxonomy (privacy, bias, autonomy, reliability, cultural fit) is useful as a checklist for product reviews and research ethics applications in this space. Limitations: the functional definition of mobility impairment is defensible but contestable (including BLV users may frustrate readers expecting a pure motor-disability focus); English-only inclusion under-represents non-Western work; and the 26-paper corpus is small relative to the 6,249 records retrieved, so several highly relevant preprints may have been missed. Still, this is a foundational reference for any team planning FM-based assistive technology work in 2026.

Tags: foundation model · large language model · vision-language model · literature review · scoping review · mobility impairment · motor disability · blind and low vision · wheelchair users · ALS · assistive technology · conversational agent · robotics · wearable technology · accessibility research

Standards referenced: PRISMA-ScR