Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
112 models
Sort by: Most Recent
Mistral AI
Downloadable
mistral-small-4-119b-2603
Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context
chat
+3
1.97M
2w
NVIDIA
Free Endpoint
nemotron-voicechat
Nemotron 3 Voicechat
English
+2
4.05K
2w
NVIDIA
Downloadable
nemotron-3-super-120b-a12b
Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
chat
+5
22.91M
2w
Qwen
Downloadable
qwen3.5-122b-a10b
122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
chat
+4
5.64M
3w
Minimaxai
Downloadable
minimax-m2.5
MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
reasoning
+3
9.12M
1mo
Qwen
Downloadable
qwen3.5-397b-a17b
Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
chat
+4
14.18M
1mo
Z.ai
Downloadable
glm-5
GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
MoE
+3
30.45M
1mo
Stepfun-ai
Free Endpoint
step-3.5-flash
200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
chat
+3
9.48M
1mo
Moonshotai
Downloadable
kimi-k2.5
1T multimodal MoE for high‑capacity video and image understanding with efficient inference.
Multimodal
+4
38.57M
2mo
Z.ai
Free Endpoint
glm-4.7
GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.
Tool Calling
+4
15.53M
2mo
DeepSeek AI
Free Endpoint
deepseek-v3.2
State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.
chat
+3
16.67M
3mo
NVIDIA
Downloadable
nemotron-3-nano-30b-a3b
Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more
chat
+4
13.52M
3mo
Mistral AI
Free Endpoint
devstral-2-123b-instruct-2512
State-of-the-art open code model with deep reasoning, 256k context, and unmatched efficiency.
coding
+4
5.56M
3mo
Moonshotai
Free Endpoint
kimi-k2-thinking
Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use.
Conversational
+4
3.57M
3mo
Mistral AI
Free Endpoint
mistral-large-3-675b-instruct-2512
A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.
chat
+4
7.42M
3mo
Mistral AI
Downloadable
ministral-14b-instruct-2512
A general-purpose VLM ideal for chat and instruction-based use cases.
chat
+4
3.94M
3mo
NVIDIA
Downloadable
nemotron-nano-12b-v2-vl
Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.
chat
+4
1.18M
5mo
DeepSeek AI
Free Endpoint
deepseek-v3.1-terminus
DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.
chat
+4
14M
5mo
Stockmark
Downloadable
stockmark-2-100b-instruct
Japanese-specialized large language model for enterprises to read and understand complex business documents.
sovereign ai
+4
3.76M
6mo
Qwen
Downloadable
qwen3-next-80b-a3b-instruct
Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.
chat
+2
19.8M
6mo
Moonshotai
Free Endpoint
kimi-k2-instruct-0905
Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
long-context
+4
12.96M
6mo
Speakleash
Free Endpoint
bielik-11b-v2.6-instruct
State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.
chat
+4
392K
6mo
Qwen
Downloadable
qwen3-next-80b-a3b-thinking
80B-parameter AI model with hybrid reasoning, MoE architecture, and support for 119 languages.
chat
+2
3.92M
6mo
ByteDance
Free Endpoint
seed-oss-36b-instruct
ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.
chat
+3
3.68M
6mo