Open WebUI with Ollama

15 MIN

Install Open WebUI and use Ollama to chat with models on your Spark

Common issues with setting up via NVIDIA Sync

Symptom | Cause | Fix
Permission denied on docker ps | User not in docker group | Run Step 1 completely, including terminal restart
Browser doesn't open automatically | Auto-open setting disabled | Manually navigate to localhost:12000
Model download fails | Network connectivity issues | Check internet connection, retry download
GPU not detected in container | Missing --gpus=all flag | Recreate container with correct start script
Port 12000 already in use | Another application using port | Change port in Custom App settings or stop conflicting service
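
To tell whether a "Permission denied on docker ps" error comes from missing group membership or from a shell that was started before the membership changed, a check along these lines may help. This is a minimal sketch, not part of the official setup steps; the helper name has_group is hypothetical, and it assumes the standard id, tr, and grep utilities are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the space-separated group list on stdin
# contains exactly the group named in $1.
has_group() {
  tr ' ' '\n' | grep -qx -- "$1"
}

user="$(id -un)"

# `id -nG` with no argument lists the groups of the current process;
# `id -nG USER` looks the user up in the group database instead.
if id -nG | has_group docker; then
  echo "current shell: docker group is active"
elif id -nG "$user" | has_group docker; then
  echo "${user} is in the docker group, but this shell predates the change; restart the terminal or log out and back in"
else
  echo "${user} is not in the docker group; rerun Step 1"
fi
```

The second branch corresponds to the case where the group was added but the terminal has not yet been restarted.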

Common issues with manual setup

Symptom | Cause | Fix
Permission denied on docker ps | User not in docker group | Run Step 1 completely, including logging out and logging back in, or use sudo
Model download fails | Network connectivity issues | Check internet connection, retry download
GPU not detected in container | Missing --gpus=all flag | Recreate container with correct command
Port 8080 already in use | Another application using port | Change port in docker command or stop conflicting service
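
Before changing a port, it can be useful to confirm that something is actually listening on it. The sketch below probes ports 12000 and 8080 on the loopback address using bash's /dev/tcp redirection. The function name port_in_use is hypothetical, the script assumes bash rather than a plain POSIX sh, and a service bound only to a non-loopback interface would not be detected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to 127.0.0.1:$1 is accepted.
port_in_use() {
  local port="$1"
  # bash treats /dev/tcp/HOST/PORT as a request to open a TCP connection.
  # The subshell keeps the file descriptor from leaking into the caller.
  (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

for port in 12000 8080; do
  if port_in_use "$port"; then
    echo "port ${port}: in use"
  else
    echo "port ${port}: free"
  fi
done
```

A port reported as "in use" while Open WebUI is not running points to a conflicting service.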

NOTE

DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. Because many applications have not yet been updated to take advantage of UMA, you may encounter memory issues even when usage is within the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:

sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
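
To see how much memory the flush releases, one option is to read the Buffers and Cached fields from /proc/meminfo before and after running the command above. This is an illustrative sketch only; the function name cache_mib is hypothetical, and it assumes a Linux /proc filesystem and awk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the sum of the Buffers and Cached fields, in MiB,
# from a meminfo-style file (defaults to /proc/meminfo).
cache_mib() {
  awk '/^(Buffers|Cached):/ { kb += $2 } END { printf "%d\n", kb / 1024 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
  echo "buffers + page cache: $(cache_mib) MiB"
fi
```

Running it once before and once after the flush gives a rough indication of how much cache was dropped.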