CLI Coding Agent

20 MINS

Build local CLI coding agents with Ollama

Claude Code, Codex, Coding, LLM, Ollama, OpenCode, Qwen
Overview | Claude Code | OpenCode | Codex CLI | Troubleshooting

Step 1
Confirm your environment

Description: Verify the OS version and GPU are visible before installing anything.

head -n 2 /etc/os-release
nvidia-smi

Expected output should show Ubuntu 24.04.3 LTS (DGX OS 7.3.1 base) and a detected GPU.
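
The model variants used later in this playbook range from roughly 22GB to 71GB on disk, so it is worth confirming free space and memory up front as well. A quick check with standard utilities (assuming the default model store on the root filesystem):

df -h /     # free disk space where Ollama stores models by default
free -h     # available system memory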

Step 2
Install or update Ollama

Description: Install Ollama or ensure it is recent enough to support ollama launch.

curl -fsSL https://ollama.com/install.sh | sh
ollama --version

If Ollama is already installed, just verify the version:

ollama --version

Expected output should show Ollama v0.15 or newer.
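
The install script also sets Ollama up as a systemd service listening on localhost:11434. To confirm the server itself is running, not just the CLI, you can check the unit and the local API (assuming the default service install):

systemctl status ollama --no-pager
curl -s http://localhost:11434/api/version    # returns the server version as JSON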

Step 3
Pull Qwen3.6

Description: Download the Qwen3.6 model weights to your Spark node.

ollama pull qwen3.6

Optional variants if you want different memory footprints or precision:

ollama pull qwen3.6:35b-a3b-nvfp4   # NVIDIA FP4 build tuned for Blackwell (~22GB)
ollama pull qwen3.6:35b-a3b-q8_0    # Higher-quality 8-bit quant (~39GB)
ollama pull qwen3.6:35b-a3b-bf16    # Full precision (~71GB)

Expected output should show qwen3.6 (and any optional variants) in ollama list.
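
To inspect what was actually downloaded, ollama list shows names and sizes, and ollama show prints per-model metadata such as parameter count, quantization, and context length (exact fields vary by Ollama version):

ollama list
ollama show qwen3.6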

Step 4
Test local inference (optional)

Description: Run a quick prompt to confirm the model loads.

ollama run qwen3.6

Try a prompt like:

Write a short README checklist for a Python project.

Expected output should show the model responding in the terminal. When you are done, type /bye or press Ctrl+D to exit the interactive session before continuing.
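
If you prefer a scriptable smoke test over the interactive session, the same model is reachable through Ollama's local REST API; a minimal sketch, assuming the default endpoint on port 11434:

curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen3.6",
  "prompt": "Write a one-sentence README tip.",
  "stream": false
}'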

Step 5
Launch Claude Code with Ollama

Description: Use Ollama's built-in launch method to start Claude Code against your local model. No environment variables or config files are required.

ollama launch claude

Expected output should show Claude Code starting and using the local Qwen3.6 model. Qwen3.6 ships with a 256K context window by default; adjust context length through Ollama's settings if you need to tune it further.
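
While Claude Code is running, you can confirm from a second terminal that requests are being served locally rather than by a remote endpoint; ollama ps lists loaded models and their memory footprint:

ollama ps    # qwen3.6 should appear here with its VRAM usage while the agent is active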

Step 6
Complete a small coding task

Description: Create a tiny repo with a stub function and a failing test, then let Claude Code implement the function so the test passes.

mkdir -p ~/cli-agent-demo
cd ~/cli-agent-demo

printf 'def add(a, b):\n    """Return the sum of a and b."""\n    pass\n' > math_utils.py
printf 'import math_utils\n\n\ndef test_add():\n    assert math_utils.add(1, 2) == 3\n' > test_math_utils.py
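
Optionally, snapshot this starting state with git so the agent's edits are easy to review or revert later; this is a convenience sketch, not part of the playbook proper:

git init -q
git add -A
git commit -qm "baseline: stub function and failing test"
# After Claude Code finishes, `git diff` shows exactly what the agent changed.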

If you do not already have pytest installed:

python3 -m pip install -U pytest
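
If that command is refused because the system Python is marked externally managed (the default on Ubuntu 24.04), a disposable virtual environment avoids the problem; skip this if you already manage Python environments another way:

python3 -m venv .venv
source .venv/bin/activate
pip install -U pytest    # pip inside the venv installs into .venv, not the system Python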

In Claude Code:

Please implement add() in math_utils.py and make sure the test passes.

Run the test:

python3 -m pytest -q

Expected output should show the test passing.

Step 7
Cleanup and rollback

Description: Remove the model and stop services if you no longer need them.

To stop the service:

sudo systemctl stop ollama

To remove the model:

WARNING

This will delete the downloaded model files.

ollama rm qwen3.6
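
To roll back the remaining pieces, standard systemd and shell commands cover the rest; none of this is specific to the playbook:

sudo systemctl disable ollama    # keep Ollama installed but stop it starting at boot
rm -rf ~/cli-agent-demo          # delete the demo repo created in Step 6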

Step 8
Next steps

  • Try the qwen3.6:35b-a3b-nvfp4 or bf16 variants for different quality/VRAM tradeoffs
  • Use Claude Code on multi-file refactors or test-generation tasks
  • Explore the full 256K context window on larger codebases

Resources

  • Ollama Documentation
  • Ollama Launch Method
  • Qwen3.6 Model Page
  • Claude Code + Ollama Guide
  • OpenCode Ollama Provider
  • Codex + Ollama Guide
  • DGX Spark Documentation
  • DGX Spark Forum