Computer-Using Agent

Also known as: CUA

An AI agent, typically built on a Large Multimodal Model, that perceives a computer's graphical user interface through screenshots, reasons about the on-screen context, and directly manipulates the interface by clicking, typing, scrolling, and navigating between applications. Unlike script-based or template-driven automation, a CUA can interpret layouts it has never seen before and adapt its actions in real time, so tasks such as online shopping, form-filling, or booking can be delegated in natural language. For accessibility, a CUA offers the potential to replace inaccessible visual interfaces rather than narrate them, but it also introduces new risks around hallucination, verification, and user oversight when disabled users cannot directly see what the agent did.
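The perceive, reason, act cycle described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not how any particular CUA is implemented: `FakeScreen`, `scripted_policy`, and `run_agent` are hypothetical names, the "screenshot" is a text description rather than pixels, and a hard-coded rule stands in for the multimodal model. The returned action log hints at one way an agent's steps could be made reviewable by a user who cannot see the screen.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str         # "type", "click", or "done"
    target: str = ""  # description of the on-screen element
    text: str = ""    # text to enter, for "type" actions


class FakeScreen:
    """Hypothetical stand-in for a GUI: a search page that leads to a results page."""

    def __init__(self) -> None:
        self.page = "search"
        self.query = ""

    def screenshot(self) -> str:
        # A real agent would capture pixels; a text description is used here.
        return f"page={self.page} query={self.query!r}"

    def apply(self, action: Action) -> None:
        if action.kind == "type" and self.page == "search":
            self.query = action.text
        elif action.kind == "click" and action.target == "search button":
            self.page = "results"


def scripted_policy(goal: str, observation: str) -> Action:
    """Placeholder for the model: chooses the next action from the observation."""
    if "page=results" in observation:
        return Action("done")
    if "query=''" in observation:
        return Action("type", target="search box", text=goal)
    return Action("click", target="search button")


def run_agent(
    goal: str,
    screen: FakeScreen,
    policy: Callable[[str, str], Action],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, decide, act; return every chosen action so it can be reviewed."""
    log: List[Action] = []
    for _ in range(max_steps):
        observation = screen.screenshot()    # perceive
        action = policy(goal, observation)   # reason
        log.append(action)
        if action.kind == "done":
            break
        screen.apply(action)                 # act
    return log


if __name__ == "__main__":
    for step in run_agent("blue kettle", FakeScreen(), scripted_policy):
        print(step)
```

The `max_steps` cap and the returned log are design choices for this sketch: the cap bounds how long the agent can act without a check-in, and the log gives a record that could be read back to the user for verification.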

Category: AI · Artificial Intelligence · Generative AI · Assistive Technology · Accessibility

Related: Large Multimodal Model · Large Language Model · Voice User Interface · Generative AI