Red Teaming
Also known as: Generative Red-Teaming, AI Red Teaming
A structured evaluation practice in which an adversarial team probes a system — traditionally a network or application, increasingly an AI model or conversational agent — with realistic attack scenarios to find failures before malicious actors do. Generative red-teaming specifically targets LLM and generative-AI outputs with jailbreak, prompt-injection, instruction-hierarchy, and role-play attacks. In accessible AI products, red-teaming should explicitly cover disability-relevant misuse patterns (impersonation of a caregiver, extraction of health disclosures, misinformation that exploits information asymmetries).
Category: Security · AI · Research Methods
Related: Jailbreak · Prompt Injection · Threat Modeling