NVIDIA
Explore
Models
Blueprints
GPUs
Docs
Terms of Use
Privacy Policy
Your Privacy Choices
Contact

Copyright © 2025 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIALaunch from Hugging FaceBeta
Publisher
Use Case
NIM Type
Sorting by Most Recent

deepseek-aideepseek-r1-distill-llama-70b

Distilled version of Llama 3.3 70B using reasoning data generated by DeepSeek R1 for enhanced performance.

minimaxaiminimax-m2

Open Mixture of Experts LLM (230B, 10B active) for reasoning, coding, and tool-use/agent workflows

deepseek-aideepseek-v3.1-terminus

DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.

moonshotaikimi-k2-instruct-0905

Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities

qwenqwen3-next-80b-a3b-thinking

80B parameter AI model with hybrid reasoning, MoE architecture, support for 119 languages.

bytedanceseed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

deepseek-aideepseek-v3.1

DeepSeek V3.1 Instruct is a hybrid AI model with fast reasoning, 128K context, and strong tool use.

nvidianvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

nvidiacosmos-reason1-7b

Reasoning vision language model (VLM) for physical AI and robotics.

openaigpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

openaigpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

nvidiallama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

sarvamaisarvam-m

Multilingual, hybrid-reasoning model optimized for Indian language tasks, programming, mathematical reasoning capabilities.

microsoftphi-4-mini-flash-reasoning

Lightweight reasoning model for applications in latency bound, memory/compute constrained environments

moonshotaikimi-k2-instruct

State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities

mistralaimagistral-small-2506

High performance reasoning model optimized for efficiency and edge deployment

deepseek-aideepseek-r1-0528

Updated version of DeepSeek-R1 with enhanced reasoning, coding, math, and reduced hallucination.

nvidiallama-3.1-nemotron-nano-4b-v1.1

State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents

marinmarin-8b-instruct

State-of-the-art open model trained on open datasets, excelling in reasoning, math, and science.

ibmgranite-3.3-8b-instruct

Small language model fine-tuned for improved reasoning, coding, and instruction-following

qwenqwen3-235b-a22b

Advanced reasoing MOE mode excelling at reasoning, multilingual tasks, and instruction following

mistralaimistral-medium-3-instruct

Powerful, multimodal language model designed for enterprise applications, including software development, data analysis, and reasoning.

nvidiallama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

qwenqwq-32b

Powerful reasoning model capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems.

nvidiallama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

nvidiallama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

deepseek-aideepseek-r1-distill-llama-8b

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

googlegemma-3-27b-it

Cutting-edge open multimodal model exceling in high-quality reasoning from images.

deepseek-aideepseek-r1-distill-qwen-32b

Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.

deepseek-aideepseek-r1-distill-qwen-14b

Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.

deepseek-aideepseek-r1-distill-qwen-7b

Distilled version of Qwen 2.5 7B using reasoning data generated by DeepSeek R1 for enhanced performance.

microsoftphi-4-multimodal-instruct

Cutting-edge open multimodal model exceling in high-quality reasoning from image and audio inputs.

mistralaimistral-small-24b-instruct

Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.

deepseek-aideepseek-r1

State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.

tiiuaefalcon3-7b-instruct

Instruction tuned LLM achieving SoTA performance on reasoning, math and general knowledge capabilities

qwenqwen2.5-7b-instruct

Chinese and English LLM targeting for language, coding, mathematics, reasoning, etc.

qwenqwen2.5-coder-32b-instruct

Advanced LLM for code generation, reasoning, and fixing across popular programming languages.

metallama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

metallama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

metallama-3.2-11b-vision-instruct

Cutting-edge vision-language model exceling in high-quality reasoning from images.

metallama-3.2-90b-vision-instruct

Cutting-edge vision-Language model exceling in high-quality reasoning from images.

metallama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

qwenqwen2-7b-instruct

Chinese and English LLM targeting for language, coding, mathematics, reasoning, etc.

microsoftphi-3.5-vision-instruct

Cutting-edge open multimodal model exceling in high-quality reasoning from images.

rakutenrakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

rakutenrakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

metallama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

metallama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

nv-mistralaimistral-nemo-12b-instruct

Most advanced language model for reasoning, code, multilingual tasks; runs on a single GPU.

microsoftphi-3-medium-128k-instruct

Cutting-edge lightweight open language model exceling in high-quality reasoning.

upstagesolar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

microsoftphi-3-small-8k-instruct

Cutting-edge lightweight open language model exceling in high-quality reasoning.

microsoftphi-3-small-128k-instruct

Long context cutting-edge lightweight open language model exceling in high-quality reasoning.

microsoftphi-3-medium-4k-instruct

Cutting-edge lightweight open language model exceling in high-quality reasoning.

microsoftphi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

microsoftphi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

mistralaimixtral-8x22b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

metallama3-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

metallama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

mistralaimixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.