Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
20 models
Sorted by: Most Recent
DeepSeek AI
deepseek-v3.1-terminus
DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.
tool calling
+3
11.9M
5mo
Moonshotai
kimi-k2-instruct-0905
Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
long-context
+4
10.54M
5mo
NVIDIA
llama-3.3-nemotron-super-49b-v1.5
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
chat
+4
4.01M
7mo
Moonshotai
kimi-k2-instruct
State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities
coding
+3
19.22M
7mo
Mistral AI
magistral-small-2506
High performance reasoning model optimized for efficiency and edge deployment
coding
+4
3.21M
7mo
Qwen
qwen3-235b-a22b
Advanced reasoning MOE model excelling at reasoning, multilingual tasks, and instruction following
chat
+3
17.73M
7mo
NVIDIA
llama-3.1-nemotron-ultra-253b-v1
Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.
chat
+4
6.3M
7mo
Qwen
qwq-32b
Powerful reasoning model that achieves significantly enhanced performance in downstream tasks, especially hard problems.
coding
+3
3.2M
8mo
NVIDIA
llama-3.3-nemotron-super-49b-v1
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
chat
+4
1.08M
7mo
NVIDIA
llama-3.1-nemotron-nano-8b-v1
Leading reasoning and agentic AI accuracy model for PC and edge.
chat
+4
529K
8mo
Qwen
qwen2.5-coder-32b-instruct
Advanced LLM for code generation, reasoning, and fixing across popular programming languages.
code completion
+3
4.48M
8mo
Meta
llama-3.3-70b-instruct
Advanced LLM for reasoning, math, general knowledge, and function calling
Reasoning
+5
23.33M
8mo
Meta
llama-3.2-3b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
chat
+3
11.39K
585K
9mo
Meta
llama-3.2-1b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
chat
+3
15.46K
357K
9mo
Rakuten
rakutenai-7b-instruct
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
chat
+3
413K
9mo
Rakuten
rakutenai-7b-chat
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
chat
+3
406K
9mo
Meta
llama-3.1-8b-instruct
Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.
chat
+4
4.15M
7mo
Mistral AI
mixtral-8x22b-instruct-v0.1
An MOE LLM that follows instructions, completes requests, and generates creative text.
Advanced Reasoning
+4
3.85M
7mo
Meta
llama3-8b-instruct
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
chat
+4
1.07M
9mo
Mistral AI
mixtral-8x7b-instruct-v0.1
An MOE LLM that follows instructions, completes requests, and generates creative text.
Advanced Reasoning
+4
598K
7mo