| Symptom | Cause | Fix |
|---|---|---|
| `ollama: command not found` | Ollama not installed or `PATH` not updated | Rerun `curl -fsSL https://ollama.com/install.sh \| sh` and open a new shell |
| `ollama launch` reports `unknown command` | Ollama is older than v0.15 | Update Ollama: `curl -fsSL https://ollama.com/install.sh \| sh` |
| Model load fails with a version error or HTTP 412 | Ollama version is too old for the model | Update Ollama: `curl -fsSL https://ollama.com/install.sh \| sh` |
| `model not found` when launching an agent | Model was not pulled | Run `ollama pull qwen3.6` and retry |
| `connection refused` to `localhost:11434` | Ollama service not running | Start it with `ollama serve` or `sudo systemctl start ollama` |
| `ollama launch <agent>` exits immediately | Agent integration failed to initialize | Re-run `ollama launch <agent>`; if it persists, check `journalctl -u ollama` |
| Slow responses or OOM errors | Model variant too large for GPU memory | Switch to `qwen3.6:35b-a3b-nvfp4` or close other GPU workloads |
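The first few rows of the table can be checked in order with a small triage script. This is a minimal sketch, not an official tool: the function name `diagnose_ollama` is illustrative, and it assumes the default API port `11434` and the model tag used above.

```shell
#!/bin/sh
# diagnose_ollama: report the first failing check from the table above,
# or confirm that the basics look healthy. Illustrative helper only.
diagnose_ollama() {
  model="${1:-qwen3.6}"   # model tag is an assumption; pass your own
  if ! command -v ollama >/dev/null 2>&1; then
    # Matches the "command not found" row: binary missing or PATH stale.
    echo "ollama not installed or not on PATH"
  elif ! curl -fs http://localhost:11434/ >/dev/null 2>&1; then
    # Matches the "connection refused" row: service is not answering.
    echo "service not running; try: ollama serve"
  elif ! ollama list 2>/dev/null | grep -q "$model"; then
    # Matches the "model not found" row: model has not been pulled.
    echo "model missing; try: ollama pull $model"
  else
    echo "all basic checks passed"
  fi
}

diagnose_ollama qwen3.6
```

Each branch prints the matching fix from the table; run it again after applying a fix until the basic checks pass.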
> **NOTE:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. If you see memory pressure, flush the buffer cache with:
>
> ```shell
> sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
> ```
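Because the GPU and CPU draw from the same pool under UMA, it can help to confirm memory is actually scarce before dropping caches. A small sketch, assuming a Linux `/proc/meminfo` and a hypothetical 10% threshold:

```shell
#!/bin/sh
# Report MemAvailable as a percentage of MemTotal (both in kB in /proc/meminfo).
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
pct=$((100 * avail_kb / total_kb))
echo "available: ${pct}% of total memory"

# The 10% cutoff is an arbitrary example, not a DGX Spark recommendation.
if [ "$pct" -lt 10 ]; then
  echo "low memory; consider: sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'"
fi
```

Dropping caches is safe (the kernel repopulates them on demand) but discards warm file caches, so reserve it for genuine memory pressure rather than running it routinely.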