Run computational fluid dynamics (CFD) simulations
Cutting-edge open multimodal model excelling in high-quality reasoning from images.
Cutting-edge vision-language model excelling in retrieving text and metadata from images.
Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments
Multimodal vision-language model that understands text, images, and video and creates informative responses
This NVIDIA Omniverse™ Blueprint demonstrates how commercial software vendors can create interactive digital twins.
Advanced AI model that detects faces and identifies deepfake images.
Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Robust image classification model for detecting and managing AI-generated content.
Cutting-edge open multimodal model excelling in high-quality reasoning from images.
Advanced LLM based on a Mixture of Experts architecture to deliver compute-efficient content generation
Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments
Grounding DINO is an open-vocabulary, zero-shot object detection model.
Vision foundation model capable of performing diverse computer vision and vision-language tasks.
Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask
EfficientDet-based object detection network to detect 100 specific retail objects from an input video.
Cutting-edge open multimodal model excelling in high-quality reasoning from images.