Multi-modal vision-language model that understands text, images, and video and generates informative responses.
Generates physics-aware video world states from text and image prompts for physical AI development.
Context-aware chart extraction model that detects 18 classes of basic chart elements, excluding plot elements.
Table extraction model that takes an image as input, runs OCR on it, and returns the detected text along with its bounding boxes.
Shutterstock Generative 3D service for 360 HDRi generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Multi-modal vision-language model that understands text, images, and video and generates informative responses.
An enterprise-grade text-to-image model, trained on a compliant dataset, that produces high-quality images.
AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.
Shutterstock Generative 3D service for 3D asset generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.
Advanced text-to-image model for generating high-quality images.
A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.