NVIDIA

Copyright © 2026 NVIDIA Corporation

38 results for

Filters

  • Free Endpoint
    10
  • Partner Endpoint
    3
  • Download Available
    10
  • Enterprise Blueprint
    2
  • Launchable
    1
  • Synthetic Data Generation
    5
  • Image-to-Text
    4
  • Deepinfra
    3
  • Bitdeer
    2
  • Vultr
    2
  • Eigen AI
    1
  • GMI Cloud
    1
  • NVIDIA
    36
  • Google
    1
  • Moonshotai
    1
  • Application Developer
    15
  • AI Engineer
    13
  • ML Engineer
    10
  • Developer
    8
  • Solutions Architect
    8
  • NVIDIA AI
    2
  • AI And Machine Learning
    16
  • Physical AI
    1
  • B200
    1
  • H100 80GB HBM3
    1
  • H200
    1
  • Video Search and Summarization (VSS)
    11
  • TAO Toolkit
    2
  • Cosmos
    1
  • DeepStream SDK
    1
  • Physical AI Dataset
    1
  • NVIDIA
    Downloadable, Free Endpoint

    synthetic-video-detector

    NVIDIA Synthetic Video Detector is an AI-powered micro-service for detecting AI‑generated (synthetic) videos.
    Model
    broadcast
    90.31K
    2mo

    Use this skill when deploying, operating, or integrating the VSS 3.2 GA RT-Embed Video Embedding microservice. Covers Docker Compose bring-up, GPU and storage prerequisites, the `/v1` REST API (file uploads, text and video embeddings, live RTSP streams, h
    Skill
    Video Search and Summarization (VSS)
    391
    4d

    Use this skill when producing a VSS analysis report — Mode A per-clip VLM, Mode B incident-range via video-analytics. Not for standalone video summarization, real-time alerts, or ad-hoc Q&A.
    Skill
    Video Search and Summarization (VSS)
    392
    4d
    RTX Workstation
    18 MIN

    NVIDIA Video Generation Guide

    Learn how to create videos using LTX-2 in ComfyUI, accelerated on RTX. Learn how to take control of visual generative AI, creating high resolution video on RTX.
    Playbook
    ComfyUI
    15d

    Use to summarize a recorded video via the LVS summarization microservice (HITL-gated) with a VLM fallback. Not for report generation or live RTSP captioning.
    Skill
    Developer
    396
    4d

    Use this skill to ask the VSS agent's video_understanding tool a fresh visual question about a recorded clip. Not for prior tool output, search hits, or metadata-answerable questions.
    Skill
    Developer
    395
    4d

    Use to run AutoMagicCalib on local MP4s, RTSP, or the bundled sample dataset, and to deploy vss-auto-calibration when needed. Do not use for non-AMC calibration or runtime analytics.
    Skill
    Video Search and Summarization (VSS)
    390
    4d

    Use when running video data augmentation and auto-labeling workflows on OSMO: flow selection, preflight, submit-time interpolation, monitoring, and output retrieval. Trigger keywords: video data augmentation, data enrichment, auto labeling, VDA demo, OSMO
    Skill
    Developer
    435
    16d

    Use to deploy the vss-video-analytics-api REST service standalone (config-source, data-log bind, Elasticsearch, optional Kafka). Not for full warehouse deploy.
    Skill
    Video Search and Summarization (VSS)
    397
    4d

    Multi-step video annotation pipeline that turns raw videos into Chain-of-Thought training data — multi-level captions, structured descriptions, and QA pairs (MCQ, binary, open-ended) with reasoning traces, via VLM/LLM distillation. Use when the user wants
    Skill
    TAO
    178
    4d

    Use to call the VIOS REST API (sensor list, timelines, clip extraction, snapshots, add/delete sensors and streams). Not for VLM inference or search.
    Skill
    Developer
    387
    4d
    DGX Station
    45 MIN

    Image & Video Generation with ComfyUI

    Generate images and videos with FLUX, Wan 2.1, HunyuanVideo, and Cosmos on DGX Station
    Playbook
    Image Generation
    21d
    General
    Launchable, Enterprise

    Build a Video Search and Summarization (VSS) Agent

    Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A
    Blueprint
    NVIDIA AI
    3mo
    DGX Spark

    Build a Video Search and Summarization (VSS) Agent

    Run the VSS Blueprint on your Spark
    Playbook
    DGX
    8mo
    NVIDIA
    Downloadable, Free Endpoint

    nemotron-3-nano-omni-30b-a3b-reasoning

    Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.
    Model
    Image-to-Text
    7.54M
    1mo
    DGX Spark
    1 HR

    Vision-Language Model Fine-tuning

    Fine-tune Vision-Language Models for image and video understanding tasks using Qwen2.5-VL and InternVL3
    Playbook
    DGX
    8mo
    NVIDIA
    Free Endpoint

    cosmos-transfer1-7b

    Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.
    Model
    Synthetic Data Generation
    250
    11mo
    NVIDIA
    Free Endpoint

    cosmos-transfer2.5-2b

    Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.
    Model
    Synthetic Data Generation
    3mo
    Google
    Free Endpoint

    paligemma

    Vision language model adept at comprehending text and visual inputs to produce informative responses
    Model
    image
    10.22K
    1y
    NVIDIA
    Downloadable

    cosmos-reason2-8b

    Vision language model that excels in understanding the physical world using structured reasoning on videos or images.
    Model
    video understanding
    191K
    5mo
    NVIDIA
    Downloadable, Free Endpoint

    cosmos3-nano-reasoner

    Vision language model that excels in understanding the physical world using structured reasoning on videos or images.
    Model
    video understanding
    1.94K
    15d

    Cosmos-Embed1 video-text embedding for text-to-video retrieval, video-to-video search, semantic deduplication, and fine-tuning. Use when the user asks to "fine-tune Cosmos-Embed1", "run cosmos-embed inference", "export Cosmos-Embed1", "embed videos", or "
    Skill
    AI Engineer
    Today
    NVIDIA
    Downloadable, Free Endpoint

    Active Speaker Detection

    Detect and track speaker identities across video frames.
    Model
    broadcast
    473
    2mo

    Use to run top-level VSS fusion search on archived video, or to ingest video files / RTSP streams for search. Do NOT use for ad-hoc visual Q&A (use vss-ask-video), live captioning (use vss-deploy-dense-captioning), or video summarization and reports (use
    Skill
    Developer
    392
    4d