Copyright © 2026 NVIDIA Corporation

16 results for

NVIDIA
Downloadable
Free Endpoint

synthetic-video-detector

NVIDIA Synthetic Video Detector is an AI-powered microservice for detecting AI‑generated (synthetic) videos.
Model
broadcast
298
1w
NVIDIA
Launchable
Enterprise

Build a Video Search and Summarization (VSS) Agent

Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A
Blueprint
NVIDIA AI
2mo
DGX Spark

Build a Video Search and Summarization (VSS) Agent

Run the VSS Blueprint on your Spark
Playbook
DGX
6mo
DGX Spark
1 HR

Vision-Language Model Fine-tuning

Fine-tune Vision-Language Models for image and video understanding tasks using Qwen2.5-VL and InternVL3
Playbook
DGX
6mo
NVIDIA
Free Endpoint

cosmos-predict1-5b

Generates future frames of a physics-aware world state from just an image or short video prompt for physical AI development.
Model
Synthetic Data Generation
868
1y
NVIDIA
Free Endpoint

cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.
Model
Synthetic Data Generation
220
9mo
NVIDIA
Free Endpoint

cosmos-transfer2.5-2b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.
Model
Synthetic Data Generation
1mo
Google
Free Endpoint

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses
Model
image
38.52K
1y
NVIDIA
Downloadable

cosmos-reason2-8b

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.
Model
video understanding
164K
4mo
NVIDIA
Downloadable
Free Endpoint

Active Speaker Detection

Detect and track speaker identities across video frames.
Model
localization
167
1w
NVIDIA
Deprecation in 21d
Downloadable

eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.
Model
telepresence
1.55K
1y
Moonshotai
Deprecation in 6d
Downloadable

kimi-k2.5

1T multimodal MoE for high‑capacity video and image understanding with efficient inference.
Model
Multimodal
47.21M
2mo
NVIDIA
Downloadable

LipSync

Generative lip dubbing that syncs lips in a video to input audio.
Model
lipsync
1w
NVIDIA
Downloadable

NVIDIA AI for Media Relighting

Re-illuminate people in video to match target lighting from a 360 HDRI environment map.
Model
HDRI
160
1w
NVIDIA
Enterprise

Cosmos Dataset Search

Accelerate post-training of end-to-end autonomous vehicle stacks with vector search and retrieval for large video datasets.
Blueprint
NVIDIA AI
2mo
NVIDIA
Downloadable

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.
Model
language generation
4.68M
5mo