vila Model by NVIDIA | NVIDIA NIM
Deprecated
Free Endpoint
Multimodal vision-language model that understands text, images, and video and generates informative responses
VLM
Vision language model
image caption
image to text
Accelerated by DGX Cloud
This NIM Endpoint has been deprecated
Please transition to another model to avoid any service interruptions.
For more information about available models, visit our API Reference.