Explore
Models
Blueprints
GPUs
Docs
⌘K
Ctrl+K
?
Login
Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
Launch from Hugging Face
Beta
Filters (2)
5 models
Sort By
dateCreated:DESC
Most Recent
NVIDIA
Downloadable
nemotron-3-super-120b-a12b
Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
MoE
+4
Items per page
24
1
1
of 1 pages
43.29M
1mo
DeepSeek AI
Deprecation in 3d
Free Endpoint
deepseek-v3.2
State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.
long context
+2
8.24M
4mo
NVIDIA
Downloadable
nemotron-3-nano-30b-a3b
Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more
MoE
+3
9.35M
4mo
Moonshotai
Deprecation in 11d
Free Endpoint
kimi-k2-thinking
Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use.
Conversational
+3
2.94M
4mo
Moonshotai
Deprecation in 4d
Free Endpoint
kimi-k2-instruct-0905
Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
long-context
+3
8.1M
7mo