Image & Video Generation with ComfyUI

45 MIN

Generate images and videos with FLUX, Wan 2.1, HunyuanVideo, and Cosmos on DGX Station

ComfyUI, Cosmos, DGX Station, Docker, FLUX, GB300, HunyuanVideo, Image Generation, Video Generation, Wan 2.1
Troubleshooting

Common issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| "permission denied" when running docker | User not in docker group | Run `sudo usermod -aG docker $USER && newgrp docker` |
| Container fails to start with GPU error | NVIDIA Container Toolkit not configured | Run `nvidia-ctk runtime configure --runtime=docker` and restart Docker |
| ComfyUI web UI not accessible | Firewall blocking port or wrong IP | Check the container with `docker logs comfyui`, confirm that port 8188 is open, and browse to `http://<STATION_IP>:8188` |
| "Model file not found" when running workflow | Model not downloaded or wrong path | Verify that the models are in `./models/` and that the volume mount is correct in the `docker run` command |
| Hugging Face download fails with 401 | Invalid or missing HF token | Verify that `HF_TOKEN` is exported and valid at huggingface.co/settings/tokens |
| CUDA out of memory during video generation | Frame count or resolution too high | Reduce frame count or resolution. At 720p with Wan 2.1 14B, keep clips under 5 seconds initially |
| CUDA out of memory during 1080p HunyuanVideo | Model + video tensors exceed GPU memory | Use fewer frames (e.g., 49 instead of 97). HunyuanVideo at 1080p needs ~100-120 GB |
| Workflow loads but nodes show red "missing" | Custom node not installed | Use ComfyUI-Manager (click Manager → Install Missing Custom Nodes) or rebuild the Docker image |
| Video output is a black screen | VAE decode issue or wrong model variant | Ensure you are using the correct model variant (T2V vs I2V) and that the VAE is loaded |
| Very slow generation, GPU utilization low | PyTorch not using GPU or wrong CUDA version | Run `nvidia-smi` inside the container (`docker exec comfyui nvidia-smi`) and confirm that the GPU is visible |
| "No module named ..." error on startup | Custom node dependency not installed | Install the module inside the container with `docker exec comfyui pip install <module>`, then restart |
| Docker build fails on ARM64 with `Could not find a version that satisfies the requirement onnxruntime-gpu` | `onnxruntime-gpu` has no aarch64 wheel on PyPI | Already handled by the shipped Dockerfile, which sed-substitutes `onnxruntime-gpu` → `onnxruntime` (CPU build) in every custom_node `requirements.txt` before `pip install`. If you see this error, you are building from a Dockerfile predating that fix; pull the latest assets and rebuild. |
| Docker build fails on ARM64 (other packages) | Some custom-node dependencies have no aarch64 wheel | Find the failing package in the build log. The custom-node install loop is wrapped in `\|\| true`, so the build still completes but the affected node will be missing modules at runtime. Either skip the node (remove its directory from `custom_nodes/` in the Dockerfile clone block) or install via ComfyUI-Manager after launch with a manually built wheel. |
| NGC image pull requires authentication | NGC registry needs login | Run `docker login nvcr.io` with your NGC API key |
| `device >= 0 && device < num_gpus INTERNAL ASSERT FAILED` on startup | Using `--gpus all` on a multi-GPU system causes a PyTorch assertion | Use `--gpus '"device=N"'` to target the GB300 specifically (check the index with `nvidia-smi`) |
| `No HiDream models available` warning on startup | HiDream custom node reports no models found | This is a warning, not an error. It clears once HiDream model files are downloaded (Tier 2) |
| Web UI: "Error: the workflow does not contain any nodes" when using Load | The file is API format (flat node_id → {class_type, inputs}), not a UI workflow | In the playbook, use `assets/workflows/<name>.json` in the Load dialog (under `user/default/workflows` inside the container). For curl / HTTP API, use `assets/workflow_api/<name>.api.json` inside `{"prompt": ...}`. |
| `huggingface-cli: command not found` or download script errors | Deprecated CLI name | Install `huggingface_hub` and use `hf download` (the script does this automatically). |
| Download script exits but `models/diffusion_models/` is empty | Silent failure in older scripts or wrong token | Re-run with `bash -x assets/scripts/download-models.sh 1`; confirm `HF_TOKEN` and license acceptance on Hugging Face. The script now fails fast if a file is missing after `hf download`. |
| Container exits on startup with `ModuleNotFoundError: torchaudio` | Container was built from a Dockerfile predating the torchaudio shim | Rebuild the image with `docker build -t comfyui-gb300 -f assets/Dockerfile .` (the trailing dot is the build context). The shipped Dockerfile creates an import-only torchaudio stub (NGC PyTorch's custom NVFP4 ABI is incompatible with PyPI torchaudio wheels). Lightricks audio VAE workflows are not supported in this image; no other workflow needs torchaudio. |
| `OSError: ... undefined symbol: torch_dtype_float4_e2m1fn_x2` from torchaudio | Real torchaudio installed on top of NGC PyTorch | Same fix as above: rebuild from the shipped Dockerfile. Do not `pip install torchaudio` manually inside the container. |
| `DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device` | Expected on aarch64. PyPI has no `onnxruntime-gpu` wheel for arm64; the Dockerfile substitutes the CPU `onnxruntime` package | Informational warning, not an error. DWPose preprocessing runs on CPU (slower than GPU) but produces correct output. |
| `aimdo: ... funchook_prepare(cuMemFree_v2) failed: 8 Failed to allocate memory in unused regions` at startup | NGC PyTorch's CUDA-hooks diagnostic tool (aimdo) cannot install hooks under default container caps and falls back to no-op | Benign. ComfyUI works normally; the message is informational from the NGC base image. No action required. |
| `RequestsDependencyWarning: urllib3 (...) or charset_normalizer (...) doesn't match a supported version!` at startup | Version skew between requests and the NGC-pinned urllib3 / charset_normalizer wheels | Benign. ComfyUI's HTTP traffic still works. Suppress with `PYTHONWARNINGS=ignore::requests.RequestsDependencyWarning` if it bothers you. |
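The "workflow does not contain any nodes" row turns on the difference between UI-format and API-format workflow files. The following is a minimal sketch, not part of the playbook assets, for checking which format a file is before loading it. It assumes that UI workflows carry a top-level `nodes` list and that API workflows are a flat mapping from node id to an object with a `class_type` key, as the table describes. The helper names `workflow_format` and `api_payload` are hypothetical.

```python
import json


def workflow_format(data):
    """Classify parsed workflow JSON as 'ui', 'api', or 'unknown'."""
    if not isinstance(data, dict):
        return "unknown"
    # Assumption: workflows saved from the web UI have a top-level "nodes" list.
    if isinstance(data.get("nodes"), list):
        return "ui"
    # API format: flat mapping node_id -> {"class_type": ..., "inputs": ...}.
    if data and all(
        isinstance(node, dict) and "class_type" in node for node in data.values()
    ):
        return "api"
    return "unknown"


def api_payload(workflow):
    """Wrap an API-format workflow in the {"prompt": ...} request body."""
    if workflow_format(workflow) != "api":
        raise ValueError("expected an API-format workflow")
    return json.dumps({"prompt": workflow})
```

A file that classifies as `ui` belongs in the Load dialog; one that classifies as `api` is the kind to send in an HTTP request body. To check a file on disk, parse it with `json.load` and pass the result to `workflow_format`.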

NOTE

To follow the ComfyUI logs, run `docker logs -f comfyui`. Most errors (missing models, node failures) are reported in these logs with clear messages.
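Several startup messages in the table are described as benign, so it can help to filter them out when reading the logs. The sketch below is an illustration rather than a shipped tool: it splits captured log text into known-benign lines and everything else. The marker substrings are copied from the table; the function name `triage` and the `BENIGN_MARKERS` mapping are hypothetical, and real log lines may be worded differently from the excerpts shown here.

```python
# Substrings taken from the troubleshooting table, each mapped to a short reason.
BENIGN_MARKERS = {
    "No HiDream models available": "clears once HiDream models are downloaded",
    "Onnxruntime not found or doesn't come with acceleration providers":
        "expected on aarch64; DWPose runs on CPU",
    "funchook_prepare(cuMemFree_v2) failed": "aimdo falls back to no-op",
    "RequestsDependencyWarning": "version skew with NGC-pinned wheels",
}


def triage(log_text):
    """Split log text into (benign_lines, other_lines), skipping blank lines."""
    benign, other = [], []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        if any(marker in line for marker in BENIGN_MARKERS):
            benign.append(line)
        else:
            other.append(line)
    return benign, other
```

Pass it the text captured from `docker logs comfyui` (both stdout and stderr); the lines returned in the second list are the ones worth reading first.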

Resources

  • ComfyUI (GitHub)
  • ComfyUI Examples
  • FLUX.1 on HuggingFace
  • Wan 2.1 on HuggingFace
  • NVIDIA Cosmos-Predict2

Copyright © 2026 NVIDIA Corporation