High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.
High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
Built for agentic workflows, this model excels in coding, instruction following, and function calling.
Multimodal vision-language model that understands text and images and generates informative responses.
State-of-the-art open model for reasoning, code, math, and tool calling, suitable for edge agents.
Superior inference efficiency with the highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.
Build a custom deep researcher powered by state-of-the-art models that continuously process and synthesize multimodal enterprise data, enabling reasoning, planning, and refinement to generate comprehensive reports.
High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
Model with leading accuracy on reasoning and agentic AI tasks, for PC and edge.
Power fast, accurate semantic search across multimodal enterprise data with NVIDIA’s RAG Blueprint—built on NeMo Retriever and Nemotron models—to connect your agents to trusted, authoritative sources of knowledge.
Multimodal vision-language model that understands text, images, and video and generates informative responses.
A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.
Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.
SLM optimized for on-device inference and fine-tuned for roleplay, RAG, and function calling.