Basic idea
Use Ollama on DGX Spark to run a local coding model and connect a CLI coding agent. This
playbook supports three options: Claude Code, OpenCode, and Codex CLI. Each
agent is wired up with Ollama's built-in launch method (ollama launch <agent>), so you
can work without environment variables, provider config files, or external cloud APIs.
Choose your CLI agent
Pick the tab that matches the CLI agent you want to use:
- Claude Code: Fastest path to a working CLI agent with a local Ollama model.
- OpenCode: Open-source CLI launched directly from Ollama.
- Codex CLI: OpenAI Codex CLI launched directly from Ollama against the local model.
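As a rough illustration of the launch method described above, the sketch below maps each option to an ollama launch command line. The agent identifiers (claude, opencode, codex) and the helper names are assumptions that follow the ollama launch <agent> pattern; confirm the exact identifiers against your own Ollama install before relying on them.

```python
import shlex

# Assumed agent identifiers for `ollama launch <agent>`; verify on your system.
AGENTS = {
    "Claude Code": "claude",
    "OpenCode": "opencode",
    "Codex CLI": "codex",
}


def launch_command(agent_name):
    """Return the argv list that would start the chosen agent through Ollama."""
    try:
        agent_id = AGENTS[agent_name]
    except KeyError:
        choices = ", ".join(sorted(AGENTS))
        raise ValueError(f"unknown agent {agent_name!r}; choose from: {choices}") from None
    return ["ollama", "launch", agent_id]


def as_shell(argv):
    """Render an argv list as a single shell-safe command string."""
    return shlex.join(argv)
```

To start an agent from an interactive terminal, the list can be handed to subprocess.run(launch_command("Claude Code")); as_shell is only there to print the command for copy and paste.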
What you'll accomplish
You will run a local coding model (Qwen3.6) on your DGX Spark with Ollama, launch your chosen CLI agent against it with a single command, and complete a small coding task end-to-end.
What to know before starting
- Comfort with Linux command line basics
- Experience running terminal-based tools and editors
- Familiarity with Python for the short coding task
Prerequisites
- DGX Spark access with NVIDIA DGX OS 7.3.1 (Ubuntu 24.04.3 LTS base)
- Internet access to download model weights
- Ollama v0.15 or newer (required for ollama launch)
- GPU memory depends on the Qwen3.6 variant you choose:
  - qwen3.6:latest (35B-a3b, MoE): ~24GB, 256K context
  - qwen3.6:35b-a3b-nvfp4: ~22GB, NVIDIA FP4 build tuned for Blackwell (DGX Spark)
  - qwen3.6:35b-a3b-q8_0: ~39GB, higher-quality quant
  - qwen3.6:35b-a3b-bf16: ~71GB, full precision (fits Spark's unified memory)
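The version and memory prerequisites can be checked with a few lines of code. The sketch below is a minimal example under stated assumptions: it assumes the text printed by ollama --version contains a dotted version number (for example "ollama version is 0.15.0"), it treats the approximate sizes listed above as exact, and the function names are hypothetical.

```python
import re

# Approximate memory footprints in GB, copied from the prerequisites list.
QWEN36_VARIANTS = {
    "qwen3.6:35b-a3b-nvfp4": 22,
    "qwen3.6:latest": 24,
    "qwen3.6:35b-a3b-q8_0": 39,
    "qwen3.6:35b-a3b-bf16": 71,
}

# First Ollama release that supports `ollama launch`, per this playbook.
MIN_OLLAMA = (0, 15)


def parse_version(text):
    """Extract (major, minor, patch) from text that contains a dotted version."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def supports_launch(version_text):
    """True if the reported version is at least MIN_OLLAMA."""
    return parse_version(version_text)[:2] >= MIN_OLLAMA


def largest_variant_that_fits(budget_gb):
    """Return the largest listed variant within the memory budget, or None."""
    fitting = {tag: gb for tag, gb in QWEN36_VARIANTS.items() if gb <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

The budget passed to largest_variant_that_fits should leave headroom for the context window and for other processes, since the listed sizes are approximate.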
Time & risk
- Duration: ~15-25 minutes (mostly model download time)
- Risk level: Low
  - Large model downloads can fail if network connectivity is unstable
  - Ollama versions older than 0.15 do not support ollama launch
- Rollback: Stop Ollama and delete the downloaded model from ~/.ollama/models
- Last Updated: 04/16/2026
  - Switched to ollama launch method and upgraded the default model to Qwen3.6
- Switched to