
A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

Industry-leading jailbreak classification model for protection against adversarial attempts.

Multi-modal model to classify the safety of input prompts as well as output responses.

Leading content safety model for enhancing the safety and moderation capabilities of LLMs.

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior.

Guardrail model to ensure that responses from LLMs are appropriate and safe.
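
In practice, models like these are applied as checks on both the user prompt and the generated response. The sketch below illustrates that flow with a hypothetical `classify_safety` helper; the function name, labels, threshold, and refusal messages are illustrative assumptions, not the API of any model listed above.

```python
# Minimal sketch of a prompt/response moderation flow.
# `classify_safety` is a hypothetical stand-in for a hosted safety model;
# its signature, labels, and scores are assumptions for illustration only.

from typing import Callable, Tuple


def classify_safety(text: str) -> Tuple[str, float]:
    """Return an (illustrative) safety label and confidence for `text`.

    A real deployment would call one of the safety models above here.
    """
    # Placeholder logic so the sketch runs end to end.
    return ("safe", 0.99)


def guarded_generate(prompt: str, generate: Callable[[str], str]) -> str:
    """Check the prompt, generate a response, then check the response."""
    label, _score = classify_safety(prompt)
    if label != "safe":
        return "Sorry, I can't help with that request."

    response = generate(prompt)

    label, _score = classify_safety(response)
    if label != "safe":
        return "Sorry, I can't share that response."

    return response


if __name__ == "__main__":
    # Stand-in generator used only to demonstrate the wrapper.
    echo_llm = lambda p: f"Echo: {p}"
    print(guarded_generate("Summarize cloud security best practices.", echo_llm))
```

Checking both sides of the exchange reflects the split in the listing above: some models screen inputs (jailbreak and topic control), while others moderate the generated outputs.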