Explore
Models
Blueprints
GPUs
Docs
⌘K
Ctrl+K
?
Login
Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
Launch from Hugging Face
Beta
Filters
19 models
Sort By
dateCreated:DESC
Most Recent
DeepSeek AI
Free Endpoint
deepseek-v3.1-terminus
DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.
tool calling
+3
11.82M
6mo
Moonshotai
Free Endpoint
kimi-k2-instruct-0905
Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
long-context
+3
14.06M
7mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1.5
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
math
+3
2.74M
8mo
Moonshotai
Free Endpoint
kimi-k2-instruct
State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities
coding
+3
21.14M
9mo
Mistral AI
Free Endpoint
magistral-small-2506
High performance reasoning model optimized for efficiency and edge deployment
coding
+3
1.25M
9mo
NVIDIA
Deprecation in 4d
Downloadable
llama-3.1-nemotron-ultra-253b-v1
Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.
math
+3
6.57M
9mo
Qwen
Deprecated
Free Endpoint
qwq-32b
Powerful reasoning model capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems.
coding
+3
989K
9mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
math
+3
1.33M
9mo
NVIDIA
Downloadable
llama-3.1-nemotron-nano-8b-v1
Leading reasoning and agentic AI accuracy model for PC and edge.
math
+3
743K
9mo
Qwen
Downloadable
qwen2.5-coder-32b-instruct
Advanced LLM for code generation, reasoning, and fixing across popular programming languages.
code completion
+2
2.81M
9mo
Meta
Downloadable
llama-3.3-70b-instruct
Advanced LLM for reasoning, math, general knowledge, and function calling
Instruction following
+4
12.68M
10mo
Meta
Downloadable
llama-3.2-3b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Chat
+3
17.63K
1.01M
11mo
Meta
Downloadable
llama-3.2-1b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
chat
+3
16.24K
324K
11mo
Rakuten
Deprecated
Free Endpoint
rakutenai-7b-instruct
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
Chat
+3
44.78K
11mo
Rakuten
Deprecated
Free Endpoint
rakutenai-7b-chat
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
Chat
+3
45.01K
11mo
Meta
Downloadable
llama-3.1-8b-instruct
Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.
Chat
+4
13.67M
9mo
Mistral AI
Downloadable
mixtral-8x22b-instruct-v0.1
An MOE LLM that follows instructions, completes requests, and generates creative text.
Advanced Reasoning
+4
2.38M
9mo
Meta
Deprecated
Downloadable
llama3-8b-instruct
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
Chat
+4
712K
11mo
Mistral AI
Downloadable
mixtral-8x7b-instruct-v0.1
An MOE LLM that follows instructions, completes requests, and generates creative text.
Advanced Reasoning
+4
454K
9mo
Items per page
24
1
1
of 1 pages