Copyright © 2025 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Sorting by Most Recent

minimaxai / minimax-m2

Open Mixture of Experts LLM (230B, 10B active) for reasoning, coding, and tool-use/agent workflows

Conversational, Reasoning, chat, Long Context, Function Calling

nvidia / llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation, llm safety, multilingual guard model, multilingual content safety, nemoguard

deepseek-ai / deepseek-v3.1-terminus

DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.

tool calling, chat, advanced reasoning, agentic

bytedance / seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

thinking budget, chat, reasoning, text-generation

nvidia / nvidia-nemotron-nano-9b-v2

High-efficiency LLM with hybrid Transformer-Mamba design, excelling in reasoning and agentic tasks.

thinking budget, chat, reasoning

nvidia / cosmos-reason1-7b

Reasoning vision language model (VLM) for physical AI and robotics.

video understanding, Synthetic Data Generation, autonomous vehicles, industrial, Physical AI, vision language model, reasoning, robotics, smart cities

openai / gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

text-to-text, chat, reasoning, math

openai / gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.

text-to-text, chat, reasoning, math

opengpt-x / teuken-7b-instruct-commercial-v0.4

Multilingual 7B LLM, instruction-tuned on all 24 EU languages for stable, culturally aligned output.

sovereign ai, text-to-text, chat, european, Multilingual

meta / llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

LLM Multimodal Safety, Content Safety, Guardrail, Content Moderator

nvidia / llama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

nemo retriever, embedding, Retrieval Augmented Generation, Text-to-Embedding

gotocompany / gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in the Indonesian language and its dialects.

Sovereign AI, chat, Indonesian, Text-to-Text, Regional Language Generation

google / gemma-3-1b-it

A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications

Translation, chat, Text-to-Text, Language Generation

microsoft / phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments

chat, Code Generation, Text-to-Text, Language Generation

deepseek-ai / deepseek-r1

State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.

chat, Math, advanced reasoning

nvidia / llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

Dialogue Safety, LLM safety, Guard Model, Content safety

nvidia / nemoguard-jailbreak-detect

Industry-leading jailbreak classification model for protection from adversarial attempts

LLM Security, Jailbreak Detection, Prompt Injection, NVIDIA NIM

nvidia / llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

LLM safety, content moderation, Guard model, Content safety

igenius / colosseum_355b_instruct_16k

Multilingual LLM trained on NVIDIA DGX Cloud, designed for mission-critical use cases in regulated industries including financial services, government, and heavy industry

Heavy industry, Government, chat, Highly regulated use case support, Financial services

tiiuae / falcon3-7b-instruct

Instruction-tuned LLM achieving SoTA performance on reasoning, math, and general knowledge

Coding, chat, Code Generation, Language Generation, Improved reasoning, Math, Scientific knowledge

igenius / italia_10b_instruct_16k

Multilingual LLM with emphasis on European languages, supporting regulated use cases including financial services, government, and heavy industry

Heavy industry, Government, chat, Highly regulated use case support, Financial services

qwen / qwen2.5-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and related tasks.

Chinese Language Generation, chat, Text-to-Text, Large Language Models

nvidia / cosmos-nemotron-34b

Multi-modal vision-language model that understands text, images, and video and creates informative responses

VLM, Vision language model, image caption, image to text

qwen / qwen2.5-coder-32b-instruct

Advanced LLM for code generation, reasoning, and fixing across popular programming languages.

code completion, code generation, chat, text-to-code

nvidia / usdcode

State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

OpenUSD, Synthetic Data Generation, Digital Twin, chat, Code Generation

meta / llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

Reasoning, chat, Code Generation, Text-to-Text, Instruction following, Math

nvidia / nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

Indic, chat, Text-to-Text, Language Generation

qwen / qwen2-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and related tasks.

Chinese Language Generation, chat, Text-to-Text, Large Language Models

nvidia / vila

Multi-modal vision-language model that understands text, images, and video and creates informative responses

VLM, Vision language model, image caption, image to text

tokyotech-llm / llama-3-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

Large Language Model, chat, Regional Language Generation

ai21labs / jamba-1.5-mini-instruct

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

chat, Language Generation, Text-to-text

nvidia / nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat, Text-to-Text, Language Generation

microsoft / phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments

chat, Code Generation, Text-to-Text, Language Generation, Large Language Models

rakuten / rakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat, Text-to-Text, Language Generation, Large Language Models

rakuten / rakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat, Text-to-Text, Language Generation, Large Language Models

google / shieldgemma-9b

Guardrail model to ensure that responses from LLMs are appropriate and safe

Guardrail, Text-to-Text

meta / llama-3.1-405b-instruct

Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks.

synthetic data generation, chat, Code Generation

nvidia / llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text, chat, Non-Commercial Use Only

mistralai / mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

chat, Text-to-Text, Language Generation

nvidia / ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition, respectively.

Optical Character Recognition, image, Optical Character Detection, cv, vlm, computer vision, TAO Toolkit, video

mediatek / breeze-7b-instruct

LLM for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

chat, Text-to-Text, Regional Language Generation

nvidia / visual-changenet

Visual Changenet detects pixel-level change maps between two images and outputs a semantic change segmentation mask

image, Image Generation, cv, Image Segmentation, vlm, computer vision, TAO Toolkit, video, NVIDIA NIM

nvidia / retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

Object Detection, image, cv, vlm, computer vision, TAO Toolkit, video, NVIDIA NIM

google / paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image, cv, Vision Assistant, vlm, Visual Question Answering, computer vision, Language Generation, Image-to-Text, video

aisingapore / sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

Chat, Text-to-Text, Regional Language Generation, Large Language Models

microsoft / phi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat, Code Generation, Text-to-Text, Language Generation, Large Language Models

microsoft / phi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat, Code Generation, Text-to-Text, Language Generation, Large Language Models

mistralai / mixtral-8x22b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning, chat, Code Generation, Text-to-Text, Large Language Models

meta / llama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat, Code Generation, Text-to-Text, Language Generation, Large Language Models

mistralai / mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

chat, Text-to-Text, Language Generation, NVIDIA NIM

mistralai / mixtral-8x7b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning, chat, Code Generation, Text-to-Text, Large Language Models