Glossary

AI Red Teaming

Also known as: LLM red teaming, AI adversarial testing

Definition

AI red teaming is the practice of probing AI systems — particularly LLMs — for jailbreaks, prompt injection, policy bypass, and unsafe outputs before they ship to production. It applies adversarial-mindset security testing to the model layer, with reproducer prompts and severity tagging on findings.

Why it matters

LLMs introduce attack surfaces traditional security tooling does not cover. A jailbreak that bypasses a content policy can expose an organization to brand and regulatory risk. A prompt injection that exfiltrates data through a tool call can become a breach. A model that leaks PII in adversarial prompts can violate GDPR and HIPAA simultaneously.

NIST AI RMF MANAGE-2.1 explicitly calls for adversarial testing of AI systems. The EU AI Act Article 15 requires accuracy, robustness, and cybersecurity testing for high-risk AI. SR 11-7's effective challenge pillar is increasingly interpreted to include red-teaming for LLM-based models. Pre-deployment red teaming is no longer optional.

In practice

PRISM Red Teaming runs a curated catalog of jailbreaks, prompt injection variants, and policy-bypass tests against any model or agent before it reaches production. Findings are severity-tagged with reproducer prompts. Cookbooks let teams re-run the same tests after each fix lands, providing the audit-grade evidence regulators expect.

PRISM Red Teaming

PRISM Guardrails

Production-side complement to red teaming

PRISM Model Audits

NIST AI RMF compliance

EU AI Act compliance

More glossary terms

Shadow AI AI Observability LLM Observability AI Guardrail Model Drift Prompt Injection All terms →

Start tracing in 5 minutes

One SDK. Five minutes. Full audit trails, PII redaction, and guardrail enforcement, from day one.

Tamper-proof traces, sealed before storage

Zero PII in storage, redacted at ingestion

Multi-cloud: Databricks, Snowflake, AWS, Azure

Request Demo

Enterprise Ready

Trace Latency

80%

PII Redacted

65%

Audit Time

90%

Agents Traced

70%

Trace IngestionActive

Audit ReportsReady in <60s

PII Status100% Redacted

AI Red Teaming

Also known as: LLM red teaming, AI adversarial testing

Definition