Explore
Models
Blueprints
GPUs
Docs
⌘K
Ctrl+K
?
Login
Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
Launch from Hugging Face
Beta
Filters (1)
12 models
Sort By
dateCreated:DESC
Most Recent
Mistral AI
Downloadable
mistral-small-4-119b-2603
Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context
chat
+3
1.14M
1w
NVIDIA
Downloadable
nemotron-3-super-120b-a12b
Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
chat
+5
14.59M
1w
Qwen
Free Endpoint
qwen3.5-122b-a10b
122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
chat
+4
3.98M
2w
Z.ai
Downloadable
glm-5
GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
MoE
+3
22.19M
1mo
Stepfun-ai
Free Endpoint
step-3.5-flash
200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
chat
+3
8.98M
1mo
Moonshotai
Downloadable
kimi-k2.5
1T multimodal MoE for high‑capacity video and image understanding with efficient inference.
Multimodal
+4
31.47M
1mo
NVIDIA
Downloadable
nemotron-3-nano-30b-a3b
Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more
chat
+4
13.62M
3mo
Qwen
Downloadable
qwen3-next-80b-a3b-thinking
80B parameter AI model with hybrid reasoning, MoE architecture, support for 119 languages.
chat
+2
4.61M
6mo
OpenAI
Downloadable
gpt-oss-20b
Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math
reasoning
+4
9.4M
7mo
OpenAI
Downloadable
gpt-oss-120b
Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.
reasoning
+4
43.97M
7mo
Mistral AI
Downloadable
mixtral-8x22b-instruct-v0.1
An MOE LLM that follows instructions, completes requests, and generates creative text.
chat
+5
5.23M
8mo
Mistral AI
Downloadable
mixtral-8x7b-instruct-v0.1
An MOE LLM that follows instructions, completes requests, and generates creative text.
chat
+5
692K
8mo
Items per page
24
1
1
of 1 pages