Open-Vocabulary Detection
Also known as: Open-Vocabulary Object Detection, OVD
A class of object detection models in computer vision that accept arbitrary text queries at inference time rather than being restricted to a fixed set of classes defined at training time. Instead of recognizing only, for example, the 80 COCO categories, an open-vocabulary detector (such as YOLO-World) takes user-supplied text prompts like 'cup', 'wheelchair ramp', or 'sparrow' and returns bounding boxes for matching objects. In accessibility tools for blind and low vision users, open-vocabulary detection matters because the system can be tuned to task- and environment-specific vocabularies. This reduces irrelevant announcements and auditory clutter, and lets the user control exactly what the system reports.
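The prompt-driven behaviour described above can be illustrated with a minimal sketch. It assumes, as is common for detectors of this kind, that each candidate image region and each text prompt is mapped to a vector in a shared embedding space, and that a region is reported only when it is sufficiently similar to one of the prompts. The function names, the toy three-dimensional embeddings, and the 0.8 threshold are hypothetical and chosen for illustration; they are not taken from YOLO-World or any other library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_regions(regions, prompt_embeddings, threshold=0.8):
    """Match candidate regions against user-supplied prompts.

    regions: list of (box, embedding) pairs, box = (x1, y1, x2, y2).
    prompt_embeddings: dict mapping prompt text to its embedding.
    Returns (prompt, box, score) for each region whose best-matching
    prompt reaches the threshold; all other regions are dropped.
    """
    results = []
    for box, region_emb in regions:
        best_prompt, best_score = None, threshold
        for prompt, prompt_emb in prompt_embeddings.items():
            score = cosine(region_emb, prompt_emb)
            if score >= best_score:
                best_prompt, best_score = prompt, score
        if best_prompt is not None:
            results.append((best_prompt, box, best_score))
    return results


# Hypothetical toy embeddings; a real system would obtain these from
# a text encoder (prompts) and an image backbone (regions).
PROMPTS = {"cup": [1.0, 0.0, 0.0], "sparrow": [0.0, 1.0, 0.0]}
REGIONS = [
    ((10, 20, 50, 80), [0.9, 0.1, 0.0]),     # close to the "cup" prompt
    ((100, 40, 160, 90), [0.0, 0.2, 0.98]),  # close to neither prompt
]

detections = label_regions(REGIONS, PROMPTS)
for prompt, box, score in detections:
    print(f"{prompt} at {box} (similarity {score:.2f})")
```

Changing the contents of `PROMPTS` changes what is reported without retraining anything, which is the property that lets an assistive tool restrict its announcements to the vocabulary the user has asked for.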
Category: Computer Vision · AI and accessibility · Machine Learning · Assistive Technology
Related: YOLO · Object recognition · Scene Description · Assistive technology