CLI Coding Agent

20 MINS

Build local CLI coding agents with Ollama

Claude Code · Codex · Coding · LLM · Ollama · OpenCode · Qwen
Overview · Claude Code · OpenCode · Codex CLI · Troubleshooting

Basic idea

Use Ollama on DGX Spark to run a local coding model and connect a CLI coding agent. This playbook supports three options: Claude Code, OpenCode, and Codex CLI. Each agent is wired up with Ollama's built-in launch method (ollama launch <agent>), so you can work without environment variables, provider config files, or external cloud APIs.
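
Before launching anything, it helps to confirm the Ollama version and pre-pull the model. A minimal sketch, using the qwen3.6 tag of the default variant listed under Prerequisites:

    # Confirm the installed Ollama supports the launch subcommand (v0.15+)
    ollama --version

    # Pre-pull the default Qwen3.6 coding model (~24 GB download)
    ollama pull qwen3.6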

Choose your CLI agent

Pick the tab that matches the CLI agent you want to use; launch commands for all three are sketched after this list:

  • Claude Code: Fastest path to a working CLI agent with a local Ollama model.
  • OpenCode: Open-source CLI launched directly from Ollama.
  • Codex CLI: OpenAI Codex CLI launched directly from Ollama against the local model.
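
Whichever agent you pick, the launch itself is one command. The agent identifiers below are assumptions (this playbook only specifies the ollama launch <agent> form), so confirm the exact names against the Ollama Launch Method page linked under Resources:

    # Agent names here are illustrative, not confirmed identifiers
    ollama launch claude-code   # Claude Code
    ollama launch opencode      # OpenCode
    ollama launch codex         # Codex CLI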

What you'll accomplish

You will run a local coding model (Qwen3.6) on your DGX Spark with Ollama, launch your chosen CLI agent against it with a single command, and complete a small coding task end-to-end.
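
As an illustration of that final step, any small, self-contained task works once the agent is attached to the local model, for example:

    # Example request to type at the agent prompt (not the shell)
    Write a Python function that reverses the words in a sentence,
    plus a pytest test covering punctuation and extra whitespace.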

What to know before starting

  • Comfort with Linux command line basics
  • Experience running terminal-based tools and editors
  • Familiarity with Python for the short coding task

Prerequisites

  • DGX Spark access with NVIDIA DGX OS 7.3.1 (Ubuntu 24.04.3 LTS base)
  • Internet access to download model weights
  • Ollama v0.15 or newer (required for ollama launch)
  • GPU memory depends on the Qwen3.6 variant you choose (pull commands are sketched after this list):
    • qwen3.6:latest (35B-a3b, MoE) — ~24GB, 256K context
    • qwen3.6:35b-a3b-nvfp4 — ~22GB, NVIDIA FP4 build tuned for Blackwell (DGX Spark)
    • qwen3.6:35b-a3b-q8_0 — ~39GB, higher-quality quant
    • qwen3.6:35b-a3b-bf16 — ~71GB, full precision (fits Spark's unified memory)
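
To use one of the variants above, pull it by its full tag before launching; for example, the NVFP4 build tuned for DGX Spark:

    # Pull a specific variant (tags come from the list above)
    ollama pull qwen3.6:35b-a3b-nvfp4

    # Confirm the download and its size on disk
    ollama list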

Time & risk

  • Duration: ~15-25 minutes (mostly model download time)
  • Risk level: Low
    • Large model downloads can fail if network connectivity is unstable
    • Ollama versions older than 0.15 do not support ollama launch
  • Rollback: Stop Ollama and delete the downloaded model from ~/.ollama/models (see the sketch after this list)
  • Last Updated: 04/16/2026
    • Switched to ollama launch method and upgraded the default model to Qwen3.6
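
A minimal rollback sketch, assuming the standard Linux install where Ollama runs as a systemd service named ollama:

    # Remove the downloaded model (frees the space under ~/.ollama/models)
    ollama rm qwen3.6            # or the specific variant tag you pulled

    # Stop the Ollama service
    sudo systemctl stop ollama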

Resources

  • Ollama Documentation
  • Ollama Launch Method
  • Qwen3.6 Model Page
  • Claude Code + Ollama Guide
  • OpenCode Ollama Provider
  • Codex + Ollama Guide
  • DGX Spark Documentation
  • DGX Spark Forum