Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
10 models, sorted by most recent
Stepfun-ai
Downloadable
Free Endpoint
step-3.7-flash
A sparse MoE multimodal reasoning model suited to enterprise, agentic, and coding tasks.
B200
+5
2.13M
7d
Mistral AI
Downloadable
Free Endpoint
mistral-medium-3.5-128b
A high-performing model for text generation, coding, and agentic use cases.
coding
+3
3.24M
1mo
DeepSeek AI
Downloadable
Free Endpoint
deepseek-v4-flash
DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.
B200
+6
13.41M
1mo
DeepSeek AI
Downloadable
deepseek-v4-pro
DeepSeek V4 scales to 1M-token context windows with efficient MoE architecture for coding tasks.
B200
+4
7.99M
1mo
Z.ai
Downloadable
Free Endpoint
glm-5.1
GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
B200
+5
25.39M
1mo
Minimaxai
Downloadable
Free Endpoint
minimax-m2.7
MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
B200
+5
13.47M
1mo
Google
Downloadable
Free Endpoint
gemma-4-31b-it
Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.
B200
+6
5.48M
2mo
Stepfun-ai
Free Endpoint
step-3.5-flash
200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
Agentic
+2
11.5M
4mo
Qwen
Free Endpoint
qwen3-coder-480b-a35b-instruct
Excels at agentic coding and browser use, with support for a 256K context.
agentic coding
+3
4.89M
9mo
Sarvamai
Downloadable
Free Endpoint
sarvam-m
Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.
coding
+5
277K
10mo