Achieves near-parity with o4-mini on core reasoning benchmarks, while running efficiently on a single 80 GB GPU.
Get started with workflows and code samples to build AI applications from the ground up.
Build artificial general agents (AGA) powered by AGI models that continuously process and synthesize multimodal enterprise data, enabling reasoning, planning, and refinement to generate comprehensive reports.
Ingest massive volumes of live or archived video and extract insights for summarization and interactive Q&A.
Continuously extract, embed, and index multimodal data for fast, accurate semantic search. Built on world-class NeMo Retriever models, the RAG blueprint connects AI applications to multimodal enterprise data wherever it resides.
Improve the safety, security, and privacy of AI systems at the build, deploy, and run stages.
The leading open models built by the community, optimized and accelerated by NVIDIA's enterprise-ready inference runtime.
Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within a single 80 GB GPU.
Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math.
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
State-of-the-art open Mixture of Experts (MoE) model with strong reasoning, coding, and agentic capabilities.