Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
214 models
Qwen
qwen3.5-122b-a10b
122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
tool calling
+4
1d
NVIDIA
nemotron-table-structure-v1
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Object Detection
+4
1
2d
NVIDIA
nemotron-page-elements-v3
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Object Detection
+3
1
2d
NVIDIA
nemotron-graphic-elements-v1
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Object Detection
+4
1
2d
NVIDIA
llama-nemotron-embed-1b-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Retrieval Augmented Generation
+2
2
2d
NVIDIA
gliner-pii
GLiNER PII detects Personally Identifiable Information in text.
PII Detection
+1
15.52K
3d
Minimaxai
minimax-m2.5
MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
coding
+3
1.8M
1w
NVIDIA
cosmos-transfer2.5-2b
Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.
Synthetic Data Generation
+4
1w
Qwen
qwen3.5-397b-a17b
Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
MoE
+4
5.14M
2w
Z.ai
glm5
GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
MoE
+3
5.86M
2w
NVIDIA
llama-nemotron-embed-vl-1b-v2
Multimodal question-answer retrieval representing user queries as text and documents as images.
nemo retriever
+3
591K
3w
Minimaxai
minimax-m2.1
MiniMax M2.1 excels in multi-language coding, app/web dev, office AI, and agent integration.
Agentic
+3
8.1M
1mo
Stepfun-ai
step-3.5-flash
200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
Agentic
+3
6.67M
1mo
Moonshotai
kimi-k2.5
1T multimodal MoE for high‑capacity video and image understanding with efficient inference.
Multimodal
+4
19.52M
1mo
Z.ai
glm4.7
GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.
Tool Calling
+4
17.55M
1mo
NVIDIA
nemotron-content-safety-reasoning-4b
A context‑aware safety model that applies reasoning to enforce domain‑specific policies.
NeMo Guardrails
+3
441K
1mo
NVIDIA
cosmos-reason2-8b
Vision language model that excels in understanding the physical world using structured reasoning on videos or images.
video understanding
+8
186K
2mo
NVIDIA
nemoretriever-page-elements-v3
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Object Detection
+4
687K
2mo
DeepSeek AI
deepseek-v3.2
State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.
long context
+3
14.58M
2mo
NVIDIA
nemotron-3-nano-30b-a3b
Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more.
MoE
+4
11.1M
2mo
NVIDIA
riva-translate-4b-instruct-v1_1
Translation model supporting 12 languages, with few-shot example prompting.
nvidia nim
+2
437K
2mo
Mistral AI
devstral-2-123b-instruct-2512
State-of-the-art open code model with deep reasoning, 256k context, and unmatched efficiency.
coding
+4
4.84M
2mo
Moonshotai
kimi-k2-thinking
Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use.
Conversational
+4
2.99M
2mo
Mistral AI
mistral-large-3-675b-instruct-2512
A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.
language generation
+4
4.89M
3mo