Copyright © 2025 NVIDIA Corporation

nvidia

vila

PREVIEW

Multi-modal vision-language model that understands text, images, and video and generates informative responses

vlm, vision language model, image caption, image to text
Accelerated by DGX Cloud