Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
12 models
Mistral AI
Downloadable
mistral-small-4-119b-2603
Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context
chat
+3
847K
1w
NVIDIA
Downloadable
nemotron-3-super-120b-a12b
Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
chat
+5
12.57M
1w
Qwen
Free Endpoint
qwen3.5-122b-a10b
122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
chat
+4
3.51M
2w
Z.ai
Downloadable
glm-5
GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
MoE
+3
19.68M
1mo
Stepfun-ai
Free Endpoint
step-3.5-flash
200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
chat
+3
8.85M
1mo
Moonshotai
Downloadable
kimi-k2.5
1T multimodal MoE for high‑capacity video and image understanding with efficient inference.
Multimodal
+4
29.65M
1mo
NVIDIA
Downloadable
nemotron-3-nano-30b-a3b
Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more
chat
+4
13.77M
3mo
Qwen
Downloadable
qwen3-next-80b-a3b-thinking
80B-parameter AI model with hybrid reasoning, an MoE architecture, and support for 119 languages.
chat
+2
4.72M
6mo
OpenAI
Downloadable
gpt-oss-20b
Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math
reasoning
+4
9.21M
7mo
OpenAI
Downloadable
gpt-oss-120b
Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.
reasoning
+4
43.53M
7mo
Mistral AI
Downloadable
mixtral-8x22b-instruct-v0.1
An MoE LLM that follows instructions, completes requests, and generates creative text.
chat
+5
5.31M
8mo
Mistral AI
Downloadable
mixtral-8x7b-instruct-v0.1
An MoE LLM that follows instructions, completes requests, and generates creative text.
chat
+5
748K
8mo