Multi-modal model that classifies the safety of input prompts as well as output responses.
Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior.
Guardrail model to ensure that responses from LLMs are appropriate and safe.
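A minimal sketch of how such a guardrail model might sit around an LLM call, screening both the input prompt and the generated response. The `classify_safety` helper, the category labels, and the refusal messages are hypothetical placeholders, not the model's published API.

```python
# Hypothetical guardrail flow: screen the prompt, generate, then screen the response.
from typing import Callable, List

# Unsafe categories named in the model description; exact label strings are assumed.
UNSAFE_LABELS = {
    "jailbreaking", "bias", "violence",
    "profanity", "sexual content", "unethical behavior",
}

def classify_safety(text: str) -> List[str]:
    """Hypothetical classifier call: returns the unsafe categories detected in `text`.

    In practice this would invoke the hosted guardrail model (e.g. over HTTP)
    and parse its category predictions; stubbed out here.
    """
    raise NotImplementedError("replace with a call to the guardrail model")

def guarded_generate(prompt: str, generate: Callable[[str], str]) -> str:
    # 1. Screen the input prompt before it reaches the LLM.
    if classify_safety(prompt):
        return "Sorry, I can't help with that request."
    # 2. Generate a candidate response with the underlying LLM.
    response = generate(prompt)
    # 3. Screen the output response before returning it to the user.
    if classify_safety(response):
        return "Sorry, I can't share that response."
    return response
```

The same classifier is applied on both sides of the LLM call, which matches the description above: prompts are checked for attacks such as jailbreaking before generation, and responses are checked for unsafe content before being returned.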