Cutting-edge vision-language model excelling in high-quality reasoning from images.
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Multi-modal vision-language model that understands text and images and generates informative responses.
An enterprise-grade text-to-image model, trained on a compliant dataset, that produces high-quality images.
AI-powered search for OpenUSD data, 3D models, images, and assets using text- or image-based inputs.
Shutterstock Early Access preview of Generative 3D service for 360 HDRi generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.
Shutterstock Generative 3D service for 3D asset generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.
NV-CLIP is a multimodal embeddings model for image and text.
Advanced text-to-image model for generating high-quality images.
Vision-language model adept at comprehending text and visual inputs to produce informative responses.
One-shot visual language understanding model that translates images of plots into tables.
Multi-modal vision-language model that understands text and images and generates informative responses.
A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.