Basic idea
ComfyUI is a node-based visual interface for building image and video generation workflows using diffusion models. Instead of a single text box, you connect processing nodes — model loaders, text encoders, samplers, decoders — into a graph that gives full control over every generation step.
- Node-based workflows let you build, modify, and share complex generation pipelines visually.
- Multi-model support covers the latest architectures: FLUX for images, Wan 2.1 and HunyuanVideo for video, and NVIDIA Cosmos for world generation.
- Full precision on GB300 — with 252 GB of HBM3e, you can run 12–17B image models and 13–14B video models at bf16 with no quantization or offloading, which is impossible on consumer hardware.
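To make the memory claim above concrete, here is a rough back-of-the-envelope sketch. It assumes only that bf16 stores each parameter in 2 bytes, and it counts model weights alone; text encoders, the VAE, and activations add to the total, so treat the numbers as a lower bound rather than a measured footprint. The function name is illustrative.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
BYTES_PER_PARAM_BF16 = 2  # bf16 stores each parameter in 16 bits


def weight_memory_gb(params_billions: float,
                     bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Return the approximate size of the weights in GB (10^9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the 12-17B / 13-14B ranges quoted above.
    for size_b in (12, 13, 14, 17):
        print(f"{size_b}B parameters: ~{weight_memory_gb(size_b):.0f} GB of weights at bf16")
```

Even the largest of these is a fraction of 252 GB, which leaves room for the additional encoders and for activations at high resolutions.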
What you'll accomplish
Deploy ComfyUI on DGX Station and run image and video generation workflows using six state-of-the-art models:
- FLUX.1 [dev] (12B) — high-quality text-to-image generation
- HiDream-I1 Full (17B) — the largest open image model, with four text encoders including Llama-3.1-8B
- Wan 2.1 T2V 14B and I2V 14B — text-to-video and image-to-video at 720p (counted as two models)
- HunyuanVideo (13B) — 1080p video generation leveraging the full GB300 memory (~100–120 GB VRAM)
- NVIDIA Cosmos-Predict2 (14B) — NVIDIA's world foundation model for video-to-world generation
You will also learn advanced techniques including ControlNet-guided generation and combined image-to-video pipelines.
What to know before starting
- Basic Docker container usage
- Familiarity with generative AI concepts (prompts, diffusion models) is helpful but not required
Prerequisites
- NVIDIA DGX Station with GB300 GPU
- Docker installed:
  docker --version
- NVIDIA Container Toolkit configured (the output should show the GB300):
  nvidia-smi
- Hugging Face account with access token: https://huggingface.co/settings/tokens
- At least 200 GB free disk space for model weights
- Network access to Hugging Face and GitHub
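The checks in this list can be scripted. The sketch below is one possible preflight helper, not part of the playbook assets: it only confirms that the docker and nvidia-smi executables are on PATH and that the current directory has at least the 200 GB of free space listed above. The function names are illustrative, and the script does not verify the GPU model, the access token, or network access.

```python
import shutil

REQUIRED_TOOLS = ("docker", "nvidia-smi")
MIN_FREE_GB = 200  # from the prerequisites list


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def free_disk_gb(path="."):
    """Return the free space at `path` in GB (10^9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    free = free_disk_gb()
    if free < MIN_FREE_GB:
        print(f"Only {free:.0f} GB free; at least {MIN_FREE_GB} GB is recommended")
    if not missing and free >= MIN_FREE_GB:
        print("Basic prerequisites look fine")
```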
Ancillary files
All required assets can be found in the ComfyUI playbook repository.
- assets/Dockerfile — Builds the ComfyUI container image from the NGC PyTorch base image (ARM64)
- assets/scripts/download-models.sh — Downloads all model weights from Hugging Face using the hf CLI (huggingface-hub package)
- assets/workflows/*.json — Eight UI-format workflows (ComfyUI 0.4 graphs with nodes and links) to open with Load in the web UI
- assets/workflow_api/*.api.json — The same eight graphs in API format, for the /prompt endpoint and automation (curl, scripts)
- assets/scripts/api_to_ui_workflow.py — Regenerates the UI JSON from the API JSON if you edit a graph programmatically
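As an illustration of what the API-format files are for, the following is a minimal sketch of queueing one of them through the /prompt endpoint using only the Python standard library. It assumes ComfyUI is reachable at http://localhost:8188 and that /prompt accepts a JSON body with the graph under a "prompt" key plus an optional "client_id"; the helper names and the workflow path in the usage note are placeholders, not files or functions from the repository.

```python
import json
import urllib.request
import uuid
from typing import Optional

COMFYUI_URL = "http://localhost:8188"  # assumption: ComfyUI on the local host, port 8188


def build_prompt_request(workflow: dict, client_id: Optional[str] = None) -> bytes:
    """Wrap an API-format graph in the JSON body sent to POST /prompt."""
    body = {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}
    return json.dumps(body).encode("utf-8")


def queue_workflow(path: str, base_url: str = COMFYUI_URL) -> dict:
    """Load an API-format workflow file, queue it, and return the server's JSON reply."""
    with open(path, encoding="utf-8") as f:
        workflow = json.load(f)
    request = urllib.request.Request(
        f"{base_url}/prompt",
        data=build_prompt_request(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the container running, a call such as queue_workflow("assets/workflow_api/<name>.api.json") would queue that graph, where <name> stands for one of the eight workflow files.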
Time & risk
- Duration: 45 minutes (excluding model downloads, which may take 30–60 minutes depending on network speed)
- Risks:
  - Model downloads require Hugging Face authentication and substantial bandwidth (~150 GB total)
  - Port 8188 must be accessible for the ComfyUI web interface
- Rollback: Stop and remove the Docker container. Delete the models/ directory to reclaim disk space.
- Last Updated: 05/26/2026
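The port requirement listed under Risks can be checked with a short TCP probe. This is a generic sketch, assuming the web interface listens on localhost:8188; the function name is illustrative, and a successful connection only shows that something is listening on the port, not that it is ComfyUI.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8188, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"Port 8188 is {state}")
```

Run it from the machine where you plan to open the browser, passing the DGX Station's hostname as host if that machine is remote.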