Live VLM WebUI

20 MIN

Real-time Vision Language Model interaction with webcam streaming

Basic idea

Live VLM WebUI is a universal web interface for real-time Vision Language Model (VLM) interaction and benchmarking. It enables you to stream your webcam directly to any VLM backend (Ollama, vLLM, SGLang, or cloud APIs) and receive live AI-powered analysis. The tool is well suited to testing VLMs, benchmarking performance across different hardware configurations, and exploring vision AI capabilities.

The interface provides WebRTC-based video streaming, integrated GPU monitoring, customizable prompts, and support for multiple VLM backends. It runs on the Blackwell GPU in your DGX Spark, enabling real-time vision inference.

What you'll accomplish

You'll set up a complete real-time vision AI testing environment on your DGX Spark that allows you to:

  • Stream webcam video and get instant VLM analysis through a web browser
  • Test and compare different vision language models (Gemma 3, Llama Vision, Qwen VL, etc.)
  • Monitor GPU and system performance in real time while models process video frames
  • Customize prompts for various use cases (object detection, scene description, OCR, safety monitoring)
  • Access the interface from any device on your network with a web browser

What to know before starting

  • Basic familiarity with Linux command line and terminal operations
  • Basic knowledge of Python package installation with pip
  • Basic knowledge of REST APIs and how services communicate via HTTP (a sample request is sketched after this list)
  • Familiarity with web browsers and network access (IP addresses, ports)
  • Optional: Knowledge of Vision Language Models and their capabilities (helpful but not required)
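
As a concrete illustration of the REST interaction mentioned above, the sketch below sends one chat request to a locally running Ollama server. It assumes Ollama's default port (11434) and its OpenAI-compatible /v1/chat/completions endpoint; the gemma3 model tag is only an example and should match a model you have actually pulled.

    # Assumes Ollama is running locally on its default port 11434.
    # "gemma3" is an example tag; substitute a model you have pulled.
    curl http://localhost:11434/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gemma3",
        "messages": [{"role": "user", "content": "Describe this scene."}]
      }'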

Prerequisites

Hardware Requirements:

  • Webcam (laptop built-in camera, USB camera, or remote browser with camera)
  • At least 10GB available storage space for Python packages and model downloads

Software Requirements:

  • DGX Spark with DGX OS installed
  • Python 3.10 or later (verify with python3 --version)
  • pip package manager (verify with pip --version; both checks are shown after this list)
  • Network access to download Python packages from PyPI
  • A VLM backend running locally (Ollama is the easiest to set up) or cloud API access
  • Web browser access to https://<SPARK_IP>:8090
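
To confirm the Python requirements above, run the two checks below on your Spark; these are standard commands, and only the minimum version number comes from the list above.

    # Verify the interpreter and package manager noted in the requirements.
    python3 --version   # should report 3.10 or later
    pip --version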

VLM Backend Options:

  1. Ollama (recommended for beginners) - Easy to install and use; a setup sketch follows this list
  2. vLLM - Higher performance for production workloads
  3. SGLang - Alternative high-performance backend
  4. NIM - NVIDIA Inference Microservices for optimized performance
  5. Cloud APIs - NVIDIA API Catalog, OpenAI, or other OpenAI-compatible APIs
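
If you go with Ollama, the minimal setup sketch below assumes its standard Linux install script and uses gemma3 as an example vision-capable model tag; substitute another vision model if you prefer.

    # Install Ollama via its standard Linux install script
    # (assumption: you are comfortable piping a remote script to your shell).
    curl -fsSL https://ollama.com/install.sh | sh

    # Pull a vision-capable model; "gemma3" is one example tag.
    ollama pull gemma3

    # Confirm the Ollama API is responding on its default port.
    curl http://localhost:11434/api/tags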

Ancillary files

All source code and documentation can be found at the Live VLM WebUI GitHub repository.

The package will be installed directly via pip, so no additional files are required for basic installation.
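
The install itself is one pip command; the package name comes from the rollback note below. The launch command shown after it is an assumption based on that package name, so check the repository README for the exact entry point.

    # Install into user space (no system-level changes, per the risk notes).
    pip install live-vlm-webui

    # Launch the web UI (entry-point name is an assumption; see the README).
    live-vlm-webui

Once running, the interface should be reachable at https://<SPARK_IP>:8090 from a browser on your network.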

Time & risk

  • Estimated time: 20-30 minutes (including Ollama installation and model download)
    • 5 minutes to install Live VLM WebUI via pip
    • 10-15 minutes to install Ollama and download a model (varies by model size)
    • 5 minutes to configure and test
  • Risk level: Low
    • Python packages are installed in user space, isolated from the system
    • No system-level changes required
    • Port 8090 must be accessible for web interface functionality
    • Self-signed SSL certificate requires browser security exception
  • Rollback: Uninstall the Python package with pip uninstall live-vlm-webui. Ollama can be uninstalled with standard package removal (a sketch follows this list). No persistent changes to DGX Spark configuration.
  • Last Updated: 01/02/2026
    • First Publication
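
For reference, the rollback described above amounts to the following; the Ollama removal steps are an assumption based on its standard Linux install layout.

    # Remove the Python package (user space only).
    pip uninstall live-vlm-webui

    # Stop and remove Ollama (assumption: installed via the standard script).
    sudo systemctl stop ollama
    sudo rm /usr/local/bin/ollama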

Resources

  • Live VLM WebUI GitHub Repository
  • Live VLM WebUI Documentation
  • DGX Spark Documentation
  • DGX Spark Forum