Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments