I, Robot? Exploring Ultra-Personalized AI-Powered AAC; an Autoethnographic Account

Tobias M Weinberg, Ricardo E. Gonzalez Penuela, Stephanie Valencia, Thijs Roumen · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3790310

Summary

This paper is a multi-month autoethnographic study by an AAC (Augmentative and Alternative Communication) user-researcher who fine-tuned a personalized large language model on his own communication data and then used it as the suggestion engine in his everyday speech-generating device. The work directly challenges the prevailing assumption that personalization in AI-mediated communication is purely a technical problem of accuracy and fluency, arguing instead that ultra-personalization continually renegotiates the user's agency, identity, and privacy. The study unfolds in three phases. In Phase 1, the lead author spent seven months collecting all of his face-to-face AAC communication through a custom iOS app modeled on the iOS Notes interface, ultimately gathering 247 conversation threads spanning 19,764 messages with diary reflections logged throughout. In Phase 2, after toxicity filtering removed 269 messages and the author further excluded utterances he was uncomfortable delegating to a model, the team fine-tuned several candidate LLMs (Gemma 3, DeepSeek-R1-Distill-Qwen-7B, Llama 3.5, Cohere Command-R-08-2024, and GPT-4.1-mini) and selected GPT-4.1-mini for its English/Spanish bilingual fluency, training on 22.4M tokens at a cost of approximately USD 120. In Phase 3, he used the resulting model in-line in his AAC device for three months (June-August 2025), with structured diary entries every few days, Likert ratings, and detailed interaction logs. Two researchers performed two rounds of affinity diagramming over the diaries to surface themes around contextual fit, identity construction, and privacy.
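The Phase 2 curation step described above (automatic toxicity filtering followed by the author's own exclusions) can be sketched roughly as follows. This is an illustrative sketch only: the summary does not say which toxicity classifier or threshold the team used, so the `toxicity` score, the `threshold` value, and the `Message` and `curate` names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    text: str
    toxicity: float      # hypothetical score in [0, 1] from some upstream classifier
    user_excluded: bool  # True if the author chose not to delegate this utterance


def curate(messages: List[Message], threshold: float = 0.5) -> List[Message]:
    """Keep messages that pass the (assumed) toxicity threshold
    and that the author has not manually excluded."""
    return [
        m for m in messages
        if m.toxicity < threshold and not m.user_excluded
    ]


# Toy data, not taken from the study.
sample = [
    Message("see you at the lab", 0.01, False),
    Message("a private remark", 0.02, True),   # dropped: excluded by the author
    Message("an angry outburst", 0.93, False),  # dropped: above the threshold
]
kept = curate(sample)
```

Note that both filters remove registers the author actually uses, which is one way the curated dataset can end up muting authentic speech.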

Key findings

Of the 6,564 messages composed during the deployment period, the author accepted suggestions in only 153 instances (2.3%), taking on average 7.5 words per accepted suggestion (SD=8.14); in 114 of the 153 acceptances he took 100% of the suggested words. Communication speed-ups were modest (M=41.7 WPM with suggestions vs 31.4 WPM baseline), but daily Likert self-reports showed moderate helpfulness (M=3.61/5, SD=0.89) and a notably strong sense of authorship control (M=3.94/5, SD=0.84). Three qualitative findings dominate. First, even before training, simply knowing his speech was being logged drove sustained self-censorship: the author dropped swearing, dark humor, and gossip from his everyday speech, and the curated dataset systematically muted authentic registers. Second, the personalized model amplified some identity facets - notably bilingual code-switching between English, Spanish, and Argentine Spanish, and culturally specific vocabulary like 'knish' or 'doradito' - while flattening others. Third, contextual integrity broke down repeatedly: the model resurfaced rare biographical details (family history, religious background) in unrelated professional or social settings, producing what prior work calls 'forced intimacy.' Because conversation partners could read suggestions off the screen, they sometimes attributed a suggestion to the author before he had committed to it, blurring authorship. Reliability also mattered: in low-connectivity environments (the subway, a camping trip), the cloud-hosted model was unusable.
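The headline figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this summary; the share of full acceptances and the relative speed-up are derived here and are not values reported by the paper, and the variable names are illustrative.

```python
# Figures quoted in the summary above.
total_messages = 6564
accepted = 153
full_acceptances = 114
wpm_with, wpm_base = 41.7, 31.4

# Share of composed messages in which a suggestion was accepted.
acceptance_rate = accepted / total_messages

# Share of acceptances that took every suggested word.
full_share = full_acceptances / accepted

# Relative gain in words per minute over the baseline.
relative_speedup = wpm_with / wpm_base - 1

print(f"acceptance rate:  {acceptance_rate:.1%}")   # 2.3%
print(f"full acceptances: {full_share:.1%}")        # 74.5%
print(f"relative speedup: {relative_speedup:.1%}")  # 32.8%
```

Read together, the numbers suggest that suggestions were rarely taken, but when they were, they were usually taken whole.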

Relevance

For practitioners building AI-assisted AAC, predictive text, or any voice-substituting communication tool, this paper is a concrete reminder that personalization is not a free win. Three design takeaways carry over directly: ultra-personalized AAC should decide WHEN to surface suggestions (suppress in rapid topic shifts, public settings, or unfamiliar audiences), WHAT to surface based on the user's multi-faceted identity rather than a flattened profile, and HOW the user can tune both the model and the suggestion behavior over time. The author's specific recommendations - partial word-level acceptance to preserve fine-grained authorship, modular per-context personalization rather than a single profile, on-device or hybrid inference to survive connectivity gaps, and explicit interfaces for users to retrain or reshape the model - are all actionable. Limitations to flag: this is N=1 by design, and the author is a literate, technically skilled, full-keyboard user without significant motor impairment, so findings may not transfer to users with cognitive impairments or to symbol-based AAC. Even so, the lived account surfaces dynamics (self-censorship under logging, contextual integrity violations by a model trained on the user's own words) that lab studies routinely miss.
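Two of these recommendations, deciding WHEN to surface a suggestion and partial word-level acceptance, are simple enough to sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Context` fields, `should_surface`, and `accept_prefix` are hypothetical names, and a real system would have to infer context signals rather than receive them as booleans.

```python
from dataclasses import dataclass


@dataclass
class Context:
    public_setting: bool     # e.g. screen visible to bystanders
    familiar_audience: bool  # conversation partner is known to the user
    rapid_topic_shift: bool  # conversation just changed subject


def should_surface(ctx: Context) -> bool:
    """Suppress suggestions in the three situations the summary lists:
    rapid topic shifts, public settings, and unfamiliar audiences."""
    return not (
        ctx.rapid_topic_shift
        or ctx.public_setting
        or not ctx.familiar_audience
    )


def accept_prefix(suggestion: str, n_words: int) -> str:
    """Partial word-level acceptance: keep only the first n_words
    of a suggestion so the user authors the rest."""
    words = suggestion.split()
    return " ".join(words[:max(0, n_words)])


# Example: a private chat with a friend, mid-topic.
ctx = Context(public_setting=False, familiar_audience=True, rapid_topic_shift=False)
if should_surface(ctx):
    draft = accept_prefix("see you at the lab tomorrow", 3)  # "see you at"
```

Accepting a prefix rather than the whole suggestion keeps the final wording under the user's control, which is the authorship concern the paper raises.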

Tags: AAC · autoethnography · large language models · personalization · agency · identity · privacy · fine-tuning · speech impairment · AI-mediated communication