Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
18 models
OpenAI
Downloadable
gpt-oss-20b
Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math
reasoning
+3
9.07M
8mo
OpenAI
Downloadable
gpt-oss-120b
Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.
reasoning
+3
39.37M
8mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1.5
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
math
+3
2.81M
8mo
Sarvamai
Downloadable
sarvam-m
Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.
coding
+5
165K
8mo
Microsoft
Deprecated
Free Endpoint
phi-4-mini-flash-reasoning
Lightweight reasoning model for applications in latency-bound, memory- and compute-constrained environments
edge
+3
156K
9mo
Mistral AI
Free Endpoint
magistral-small-2506
High-performance reasoning model optimized for efficiency and edge deployment
coding
+3
1.26M
9mo
Marin
Deprecated
Free Endpoint
marin-8b-instruct
State-of-the-art open model trained on open datasets, excelling in reasoning, math, and science.
Reasoning
+3
145K
11mo
NVIDIA
Deprecation in 3d
Downloadable
llama-3.1-nemotron-ultra-253b-v1
Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.
math
+3
6.68M
9mo
Qwen
Deprecated
Free Endpoint
qwq-32b
Powerful reasoning model that achieves significantly enhanced performance on downstream tasks, especially hard problems.
coding
+3
931K
9mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
math
+3
1.4M
9mo
NVIDIA
Downloadable
llama-3.1-nemotron-nano-8b-v1
Leading reasoning and agentic AI accuracy model for PC and edge.
math
+3
776K
9mo
Mistral AI
Deprecated
Downloadable
mistral-small-24b-instruct
Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.
code
+3
189K
9mo
Qwen
Deprecated
Downloadable
qwen2.5-7b-instruct
Chinese and English LLM targeting language, coding, mathematics, reasoning, and related tasks.
Chinese Language Generation
+3
6.98M
11mo
Meta
Downloadable
llama-3.3-70b-instruct
Advanced LLM for reasoning, math, general knowledge, and function calling
Instruction following
+4
12.24M
10mo
Qwen
Deprecated
Free Endpoint
qwen2-7b-instruct
Chinese and English LLM targeting language, coding, mathematics, reasoning, and related tasks.
Chinese Language Generation
+3
129K
11mo
Upstage
Free Endpoint
solar-10.7b-instruct
Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.
Non-Commercial Use Only
+4
161K
1y
Microsoft
Deprecated
Downloadable
phi-3-mini-4k-instruct
Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.
Chat
+4
76.3K
11mo
Microsoft
Deprecated
Free Endpoint
phi-3-mini-128k-instruct
Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.
Chat
+4
102K
11mo