Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments
Deploy this model now on your endpoint provider of choice.