
An AI-powered, multi-agent system designed to optimize warehouse operations through intelligent automation, real-time monitoring, and natural language interaction.

Vision-language model that excels in understanding the physical world using structured reasoning on videos or images.

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more.

Translation model supporting 12 languages, with few-shot example prompting capability.

State-of-the-art open code model with deep reasoning, 256k context, and unmatched efficiency.

A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.

Accelerate post-training of end-to-end autonomous vehicle stacks with vector search and retrieval for large video datasets.

Open Mixture of Experts LLM (230B, 10B active) for reasoning, coding, and tool-use/agent workflows.

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs.

Securely extract, embed, and index multimodal data with encryption in-use for fast, accurate semantic search.

DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.

Follow-on version of Kimi-K2-Instruct with a longer context window and enhanced reasoning capabilities.

State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.

Record-setting accuracy and performance for Mandarin and English transcriptions.

Transform your scene idea into ready-to-use 3D assets using Llama 3.1 8B, NV SANA, and Microsoft TRELLIS.

Excels in agentic coding and browser use, supports 256K context, and delivers top results.

Elevate shopping experiences online and in stores.

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

FLUX.1 Kontext is a multimodal model that enables in-context image generation and editing.

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math.

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80 GB GPU.

Multilingual 7B LLM, instruction-tuned on all 24 EU languages for stable, culturally aligned output.

Generates high-quality numerical embeddings from text inputs.

Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A.

English text embedding model for question-answering retrieval.

Lightweight reasoning model for applications in latency-bound, memory/compute-constrained environments.

State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities.

An MoE LLM that follows instructions, completes requests, and generates creative text.

An MoE LLM that follows instructions, completes requests, and generates creative text.

Improve the safety, security, and privacy of AI systems at the build, deploy, and run stages.

State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.

Build a custom enterprise research assistant powered by state-of-the-art models that process and synthesize multimodal data, enabling reasoning, planning, and refinement to generate comprehensive reports.

An edge-computing AI model that accepts text, audio, and image input, ideal for resource-constrained environments.

An edge-computing AI model that accepts text, audio, and image input, ideal for resource-constrained environments.

Create intelligent virtual assistants for customer service across every industry.

Use the multi-LLM compatible NIM container to deploy a broad range of LLMs from Hugging Face.
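As a minimal sketch of calling such a deployment: NIM containers expose an OpenAI-compatible HTTP API, so a chat-completion request can be assembled and sent with only the Python standard library. The base URL, port, and model id below are illustrative assumptions for a hypothetical local deployment, not fixed values.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Assemble an OpenAI-compatible /v1/chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def send_chat_request(base_url: str, payload: dict) -> dict:
    """POST the payload to the deployed endpoint and decode the JSON response."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # The model id and endpoint here are assumptions; substitute the values
    # for the model you actually deployed from Hugging Face.
    payload = build_chat_request("meta/llama-3.1-8b-instruct", "Hello!")
    print(json.dumps(payload, indent=2))
    # send_chat_request("http://localhost:8000", payload)  # requires a running container
```

Because the API is OpenAI-compatible, existing OpenAI client libraries can also be pointed at the same endpoint by overriding their base URL.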

Power fast, accurate semantic search across multimodal enterprise data with NVIDIA’s RAG Blueprint—built on NeMo Retriever and Nemotron models—to connect your agents to trusted, authoritative sources of knowledge.

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

Design, test, and optimize a new generation of intelligent manufacturing data centers using digital twins.

Build a data flywheel with NVIDIA NeMo microservices that continuously optimizes AI agents for latency and cost while maintaining accuracy targets.

State-of-the-art open model for reasoning, code, math, and tool calling, suitable for edge agents.

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.

Natural and expressive voices in multiple languages, for voice agents and brand ambassadors.

Enable smooth global interactions in 36 languages.

SOTA LLM pre-trained for instruction following and proficiency in the Indonesian language and its dialects.

Automate and optimize the configuration of radio access network (RAN) parameters using agentic AI and a large language model (LLM)-driven framework.

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

State-of-the-art accuracy and speed for English transcriptions.

Enhance speech by correcting common audio degradations to create studio quality speech output.

Create high-quality images using FLUX.1 in ComfyUI, guided by 3D.

FLUX.1 is a state-of-the-art suite of image generation models.

FLUX.1-schnell is a distilled image generation model, producing high-quality images at fast speeds.

Built for agentic workflows, this model excels in coding, instruction following, and function calling.

Updated version of DeepSeek-R1 with enhanced reasoning, coding, math, and reduced hallucination.

Investigate, understand, and interpret single-cell data in minutes, not days, by leveraging RAPIDS-singlecell, powered by NVIDIA RAPIDS.

Streamline evaluation, monitoring, and optimization of an AI data flywheel with Weights & Biases.

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

Simulate, test, and optimize physical AI and robotic fleets at scale in industrial digital twins before real-world deployment.

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Cutting-edge vision-language model excelling in high-quality reasoning from images.

A generative model of protein backbones for protein binder design.

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

Generate exponentially large amounts of synthetic motion trajectories for robot manipulation from just a few human demonstrations.

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

Lightweight multilingual LLM powering AI applications in latency-bound, memory/compute-constrained environments.

Instruction-tuned LLM achieving SoTA performance on reasoning, math, and general-knowledge tasks.

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

LLM for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

Multilingual LLM trained on NVIDIA DGX Cloud, designed for mission-critical use cases in regulated industries including financial services, government, and heavy industry.

Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.

Multilingual LLM with an emphasis on European languages, supporting regulated use cases including financial services, government, and heavy industry.

Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.

Distilled version of Qwen 2.5 7B using reasoning data generated by DeepSeek R1 for enhanced performance.

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

Cutting-edge lightweight open language model excelling in high-quality reasoning.

Cutting-edge lightweight open language model excelling in high-quality reasoning.

Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.

Cutting-edge lightweight open language model excelling in high-quality reasoning.

State-of-the-art open model trained on open datasets, excelling in reasoning, math, and science.

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

Model for writing and interacting with code across a wide range of programming languages and tasks.

Sovereign AI model finetuned on Traditional Mandarin and English data using the Llama-3 architecture.

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

Sovereign AI model trained on Japanese language that understands regional nuances.

Sovereign AI model trained on Japanese language that understands regional nuances.

Sovereign AI model trained on Japanese language that understands regional nuances.

Generate detailed, structured reports on any topic using LangGraph and the Llama 3.3 70B NIM.

Easily run essential genomics workflows and save time by leveraging Parabricks.

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Transform PDFs into AI podcasts for engaging on-the-go audio content.

High accuracy and optimized performance for transcription in 25 languages.

Enable smooth global interactions in 36 languages.

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

Grounding DINO is an open-vocabulary, zero-shot object detection model.

Run computational fluid dynamics (CFD) simulations.

Leading content safety model for enhancing the safety and moderation capabilities of LLMs.

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

FourCastNet predicts global atmospheric dynamics for a range of weather and climate variables.

Predicts the 3D structure of a protein from its amino acid sequence.

Predicts the 3D structure of a protein from its amino acid sequence.

Estimate the gaze angles of a person in a video and redirect them to appear frontal.

Generates future frames of a physics-aware world state from just an image or short video prompt, for physical AI development.

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

Lightweight multilingual LLM powering AI applications in latency-bound, memory/compute-constrained environments.

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia.

Verify compatibility of OpenUSD assets with instant RTX render and rule-based validation.

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

Optimized SLM for on-device inference, fine-tuned for roleplay, RAG, and function calling.

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.