Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices.
225 models, sorted by most recent
Minimaxai
Free Endpoint
minimax-m2.7
MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
coding
+3
Today
NVIDIA
Deprecation in 3d
Free Endpoint
audio2face-3d-claire-notongue
Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.
Digital Humans
+2
21
2d
NVIDIA
Deprecation in 3d
Free Endpoint
audio2face-3d-james-notongue
Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.
Digital Humans
+2
2d
NVIDIA
Deprecation in 3d
Free Endpoint
audio2face-3d-james
Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.
Digital Humans
+2
147
2d
Google
Downloadable
gemma-4-31b-it
Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.
reasoning
+4
865K
1w
NVIDIA
Downloadable
llama-nemotron-rerank-vl-1b-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever
+2
522
1w
NVIDIA
Enterprise
Build A Generative Protein Binder Design Pipeline
This blueprint shows how generative AI and accelerated NIM microservices can design protein binders smarter and faster.
NVIDIA BioNemo
+4
2.98K
2w
Mistral AI
Downloadable
mistral-small-4-119b-2603
Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context.
chat
+3
6.51M
3w
NVIDIA
Free Endpoint
nemotron-voicechat
Nemotron 3 Voicechat
English
+2
6.36K
3w
NVIDIA
Downloadable
nemotron-asr-streaming
Real-time speech recognition for English
Automatic Speech Recognition
+2
18.79K
4w
Black-forest-labs
Downloadable
flux.2-klein-4b
FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lightning speed.
Text-to-Image
+2
95.87K
4w
NVIDIA
Downloadable
nemotron-ocr-v1
Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.
Table Extraction
+4
1.06M
1mo
NVIDIA
Downloadable
nemotron-3-super-120b-a12b
Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more.
chat
+5
44.54M
1mo
NVIDIA
Downloadable
llama-nemotron-rerank-1b-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever
+2
100K
1mo
Qwen
Downloadable
qwen3.5-122b-a10b
122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
chat
+4
7.92M
1mo
NVIDIA
Downloadable
nemotron-table-structure-v1
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Object Detection
+5
15.27K
1mo
NVIDIA
Downloadable
nemotron-page-elements-v3
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Object Detection
+4
38.68K
1mo
NVIDIA
Downloadable
nemotron-graphic-elements-v1
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Object Detection
+5
15.69K
1mo
NVIDIA
Downloadable
llama-nemotron-embed-1b-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Text-to-Embedding
+2
1.12M
1mo
NVIDIA
Free Endpoint
gliner-pii
GLiNER PII detects Personally Identifiable Information in text.
PII Detection
+1
51.21K
1mo
Minimaxai
Downloadable
minimax-m2.5
MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
reasoning
+3
10.95M
1mo
NVIDIA
Free Endpoint
cosmos-transfer2.5-2b
Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.
Synthetic Data Generation
+4
1mo
Qwen
Downloadable
qwen3.5-397b-a17b
Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
chat
+4
13.84M
1mo
Z.ai
Downloadable
glm-5
GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
MoE
+3
39.55M
1mo