Explore
Models
Blueprints
GPUs
Docs
⌘K
Ctrl+K
?
Login
Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
Launch from Hugging Face
Beta
Filters (2)
15 models
Sort By
dateCreated:DESC
Most Recent
Google
Downloadable
gemma-4-31b-it
Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.
coding
+4
Today
Minimaxai
Downloadable
minimax-m2.5
MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
reasoning
+3
9.77M
1mo
Stepfun-ai
Free Endpoint
step-3.5-flash
200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
chat
+3
8.99M
2mo
Z.ai
Free Endpoint
glm-4.7
GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.
Tool Calling
+4
14.39M
2mo
Mistral AI
Free Endpoint
devstral-2-123b-instruct-2512
State-of-the-art open code model with deep reasoning, 256k context, and unmatched efficiency.
coding
+4
4.65M
3mo
Moonshotai
Free Endpoint
kimi-k2-instruct-0905
Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
long-context
+4
14M
6mo
Sarvamai
Downloadable
sarvam-m
Multilingual, hybrid-reasoning model optimized for Indian language tasks, programming, mathematical reasoning capabilities.
coding
+6
253K
8mo
Moonshotai
Free Endpoint
kimi-k2-instruct
State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities
coding
+4
20.61M
8mo
Mistral AI
Free Endpoint
magistral-small-2506
High performance reasoning model optimized for efficiency and edge deployment
coding
+4
2.09M
8mo
IBM
Free Endpoint
granite-3.3-8b-instruct
Small language model fine-tuned for improved reasoning, coding, and instruction-following
coding
+3
78.41K
8mo
Qwen
Free Endpoint
qwq-32b
Powerful reasoning model capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems.
coding
+4
2.23M
9mo
DeepSeek AI
Downloadable
deepseek-r1-distill-llama-8b
Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.
Distillation
+5
2.31M
8mo
DeepSeek AI
Downloadable
deepseek-r1-distill-qwen-32b
Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.
coding
+4
2.49K
2.54M
10mo
DeepSeek AI
Downloadable
deepseek-r1-distill-qwen-14b
Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.
coding
+4
1.88K
2.17M
10mo
Tiiuae
Free Endpoint
falcon3-7b-instruct
Instruction tuned LLM achieving SoTA performance on reasoning, math and general knowledge capabilities
chat
+6
1.74M
10mo
Items per page
24
1
1
of 1 pages