Achieves near-parity with o4-mini on core reasoning benchmarks, while running efficiently on a single 80 GB GPU.
The leading open models built by the community, optimized and accelerated by NVIDIA's enterprise-ready inference runtime.
DeepSeek V3.1 Instruct is a hybrid model that switches between thinking and non-thinking modes, offering fast reasoning, 128K context, and strong tool use.
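As a minimal sketch of how tool use could be exercised through an OpenAI-compatible chat completions endpoint (the base URL, model id, and the `get_weather` tool below are illustrative assumptions, not confirmed values):

```python
# Sketch: tool calling against an OpenAI-compatible endpoint.
# base_url, model id, and the get_weather tool are placeholders for illustration.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek-ai/deepseek-v3.1",  # placeholder id; check the catalog for the exact name
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```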
Compact Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math.
High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.
Reasoning vision language model (VLM) for physical AI and robotics.
Get started with workflows and code samples to build AI applications from the ground up.
Build a custom deep researcher powered by state-of-the-art models that continuously process and synthesize multimodal enterprise data, enabling reasoning, planning, and refinement to generate comprehensive reports.
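The following is an illustrative plan-research-synthesize loop, not the blueprint's actual code; the endpoint URL and model id are assumptions, and the research step simply prompts the model where the real blueprint would query a retriever over enterprise data.

```python
# Sketch of a deep-researcher style loop: plan, gather notes, synthesize a report.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")
MODEL = "nvidia/llama-3.3-nemotron-super-49b-v1"  # placeholder reasoning model id

def ask(prompt: str) -> str:
    r = client.chat.completions.create(model=MODEL, messages=[{"role": "user", "content": prompt}])
    return r.choices[0].message.content

topic = "Impact of hybrid Mamba architectures on inference cost"
plan = ask(f"Draft a 3-point research plan for: {topic}")

notes = []
for step in plan.splitlines():
    if step.strip():
        # The real blueprint retrieves and refines against multimodal enterprise data here.
        notes.append(ask(f"Write a short research note for: {step}"))

report = ask("Synthesize these notes into a concise report:\n\n" + "\n\n".join(notes))
print(report)
```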
Ingest massive volumes of live or archived video and extract insights for summarization and interactive Q&A.
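A minimal sketch of the underlying idea, sampling a few frames and asking a vision language model to summarize them; the full blueprint handles streaming ingest, indexing, and Q&A, and the endpoint, model id, and video path below are assumptions.

```python
# Sketch: sample frames from a video and summarize them with a VLM
# through an OpenAI-compatible vision chat endpoint.
import base64
import cv2
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

cap = cv2.VideoCapture("warehouse.mp4")  # placeholder video file
frames = []
while len(frames) < 4:
    cap.set(cv2.CAP_PROP_POS_MSEC, len(frames) * 10_000)  # one frame every 10 seconds
    ok, frame = cap.read()
    if not ok:
        break
    _, buf = cv2.imencode(".jpg", frame)
    frames.append(base64.b64encode(buf.tobytes()).decode())

content = [{"type": "text", "text": "Summarize what happens in these frames."}]
content += [{"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{f}"}} for f in frames]

resp = client.chat.completions.create(
    model="nvidia/cosmos-reason1-7b",  # placeholder VLM id
    messages=[{"role": "user", "content": content}],
)
print(resp.choices[0].message.content)
```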
Continuously extract, embed, and index multimodal data for fast, accurate semantic search. Built on world-class NeMo Retriever models, the RAG blueprint connects AI applications to multimodal enterprise data wherever it resides.
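As a small sketch of the embed-and-search step only: documents and a query are embedded through an OpenAI-compatible embeddings endpoint and ranked by cosine similarity. The base URL and model id are assumptions, and the production blueprint uses NeMo Retriever models with a vector database rather than this in-memory index.

```python
# Sketch: embed passages and a query, then rank by cosine similarity.
import numpy as np
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8001/v1", api_key="not-used")
MODEL = "nvidia/nv-embedqa-e5-v5"  # placeholder embedding model id

docs = [
    "Quarterly revenue grew 12%.",
    "The new GPU ships in Q3.",
    "Support hours are 9-5 CET.",
]

def embed(texts):
    out = client.embeddings.create(model=MODEL, input=texts)
    return np.array([d.embedding for d in out.data])

doc_vecs = embed(docs)
query_vec = embed(["When does the GPU launch?"])[0]

scores = doc_vecs @ query_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
print(docs[int(np.argmax(scores))])  # most relevant passage
```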
Improve the safety, security, and privacy of AI systems across the build, deploy, and run stages.
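Assuming this refers to a guardrails-style runtime such as NeMo Guardrails, a minimal run-stage sketch looks like the following; it presumes a local ./config directory with a config.yml and rail definitions, which is an assumption rather than part of this description.

```python
# Sketch: wrap LLM calls with programmable guardrails at run time.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")  # assumed directory with config.yml and rails
rails = LLMRails(config)

response = rails.generate(messages=[{"role": "user", "content": "How do I reset my password?"}])
print(response["content"])
```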