Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
9 models
NVIDIA
Downloadable
nemotron-3-super-120b-a12b
Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
chat
+5
8.75M
1w
NVIDIA
Downloadable
nemotron-3-nano-30b-a3b
Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more
chat
+4
13M
3mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1.5
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
chat
+4
5.37M
7mo
Mistral AI
Free Endpoint
mistral-nemotron
Built for agentic workflows, this model excels in coding, instruction following, and function calling
chat
+3
867K
9mo
IBM
Free Endpoint
granite-3.3-8b-instruct
Small language model fine-tuned for improved reasoning, coding, and instruction-following
coding
+3
137K
8mo
NVIDIA
Downloadable
llama-3.1-nemotron-ultra-253b-v1
Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.
chat
+4
8.28M
8mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
chat
+4
1.11M
8mo
NVIDIA
Downloadable
llama-3.1-nemotron-nano-8b-v1
Leading reasoning and agentic AI accuracy model for PC and edge.
chat
+4
631K
8mo
Meta
Downloadable
llama-3.3-70b-instruct
Advanced LLM for reasoning, math, general knowledge, and function calling
Instruction following
+5
20.39M
9mo