Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices.
Optimized by NVIDIA
11 models, sorted by most recent
Qwen
qwen3.5-397b-a17b
Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
MoE
+4
4.66M
2w
Z.ai
glm5
GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
MoE
+3
5.4M
2w
Minimaxai
minimax-m2.1
MiniMax M2.1 excels in multi-language coding, app/web dev, office AI, and agent integration.
Agentic
+3
7.8M
1mo
Stepfun-ai
step-3.5-flash
200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
Agentic
+3
6.25M
1mo
Mistral AI
devstral-2-123b-instruct-2512
State-of-the-art open code model with deep reasoning, 256K context, and unmatched efficiency.
coding
+4
4.46M
2mo
Mistral AI
mistral-large-3-675b-instruct-2512
A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.
language generation
+4
4.43M
3mo
DeepSeek AI
deepseek-v3.1-terminus
DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.
tool calling
+3
11.35M
4mo
Qwen
qwen3-next-80b-a3b-instruct
Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.
chat
+2
8.81M
5mo
Moonshotai
kimi-k2-instruct-0905
Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
long-context
+4
10.27M
5mo
Qwen
qwen3-coder-480b-a35b-instruct
Excels in agentic coding and browser use and supports 256K context, delivering top results.
agentic coding
+4
2.89M
6mo
Moonshotai
kimi-k2-instruct
State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities.
coding
+3
18.28M
7mo