Multi-modal model to classify safety for input prompts as well output responses.
Improve safety, security, and privacy of AI systems at build, deploy and run stages.
Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior
Guardrail model to ensure that responses from LLMs are appropriate and safe