73 results for
Models (57)
Blueprints (7)
Other (9)
DGX Spark
20 MIN
Live VLM WebUI
Real-time Vision Language Model interaction with webcam streaming
Playbook
Vision AI
+4
2mo
NVIDIA
Downloadable
llama-3.2-nemoretriever-1b-vlm-embed-v1
Multimodal question-answer retrieval representing user queries as text and documents as images.
Model
nemo retriever
+3
271K
8mo
DGX Spark
1 HR
Vision-Language Model Fine-tuning
Fine-tune Vision-Language Models for image and video understanding tasks using Qwen2.5-VL and InternVL3
Playbook
DGX
+6
5mo
Qwen
Downloadable
qwen3.5-397b-a17b
Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
Model
chat
+4
8.02M
3w
NVIDIA
Launchable
LLM Router
Route LLM requests to the best model for the task at hand.
Blueprint
NVIDIA AI
+1
3w
NVIDIA
Free Endpoint
cosmos-nemotron-34b
Multi-modal vision-language model that understands text, images, and video and creates informative responses
Model
VLM
+3
6
1y
Google
Free Endpoint
paligemma
Vision language model adept at comprehending text and visual inputs to produce informative responses
Model
image
+8
335K
1y
NVIDIA
Free Endpoint
retail-object-detection
EfficientDet-based object detection network to detect 100 specific retail objects from an input video.
Model
Object Detection
+8
363
1y
NVIDIA
Free Endpoint
visual-changenet
Visual Changenet detects pixel-level change maps between two images and outputs a semantic change segmentation mask
Model
image
+8
640
1y
DGX Spark
vLLM for Inference
Install and use vLLM on DGX Spark
Playbook
DGX
+2
5d
Z.ai
Free Endpoint
glm-4.7
GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.
Model
Tool Calling
+4
17.73M
1mo
Z.ai
Downloadable
glm-5
GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
Model
chat
+3
9.8M
4w
DGX Spark
TRT LLM for Inference
Install and use TensorRT-LLM on DGX Spark
Playbook
DGX
+2
5mo
NVIDIA
Multi-LLM NIM
Use the multi-LLM compatible NIM container to deploy a broad range of LLMs from Hugging Face.
Blueprint
nim
3w
NVIDIA
Free Endpoint
ocdrnet
OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition, respectively.
Model
Optical Character Recognition
+8
736
1y
DGX Spark
LM Studio on DGX Spark
Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.
Playbook
Inference
+4
1mo
NVIDIA
Downloadable
nemotron-nano-12b-v2-vl
Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.
Model
chat
+4
1.4M
4mo
NVIDIA
Downloadable
llama-3.1-nemotron-nano-vl-8b-v1
Multi-modal vision-language model that understands text and images and creates informative responses
Model
chat
+3
9.15M
8mo
NVIDIA
Downloadable
llama-nemotron-embed-vl-1b-v2
Multimodal question-answer retrieval representing user queries as text and documents as images.
Model
nemo retriever
+3
883K
1mo
DGX Spark
30 MIN
NIM on Spark
Deploy a NIM on Spark
Playbook
DGX
+1
5mo
Mistral AI
Downloadable
ministral-14b-instruct-2512
A general-purpose VLM ideal for chat and instruction-based use cases
Model
chat
+4
4.67M
3mo
NVIDIA
Downloadable
cosmos-reason1-7b
Reasoning vision language model (VLM) for physical AI and robotics.
Model
video understanding
+8
15.93K
7mo
DGX Spark
30 MIN
Nemotron-3-Nano with llama.cpp
Run Nemotron-3-Nano-30B model using llama.cpp on DGX Spark
Playbook
Nemotron
+3
2mo
NVIDIA
Launchable
Ambient Healthcare Agents
Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM
Blueprint
NVIDIA AI
+3
3w