Description: Verify the OS version and GPU are visible before installing anything.
head -n 2 /etc/os-release
nvidia-smi
Expected output should show Ubuntu 24.04.3 LTS (DGX OS 7.3.1 base) and a detected GPU.
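If you want to script this check rather than eyeball it, a minimal sketch is to parse the PRETTY_NAME field out of an os-release style file; the sample file below stands in for the real /etc/os-release so the snippet runs anywhere:

```shell
# Pull PRETTY_NAME out of an os-release style file.
get_pretty_name() {
  sed -n 's/^PRETTY_NAME="\(.*\)"$/\1/p' "$1"
}

# Sample file standing in for /etc/os-release; point at the real
# path on your Spark node instead.
printf 'PRETTY_NAME="Ubuntu 24.04.3 LTS"\n' > /tmp/os-release-sample
get_pretty_name /tmp/os-release-sample
```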
Description: Install Ollama, or confirm an existing install is recent enough to support ollama launch.
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
If Ollama is already installed, just verify the version:
ollama --version
Expected output should show Ollama v0.15 or newer.
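If you want to gate a setup script on the version check rather than read it by hand, one hedged sketch is a version comparison built on GNU sort -V (the installed version is hard-coded below; in a real script you would parse it from the ollama --version output):

```shell
# True when the first argument is >= the second under version sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | tail -n 1)" = "$1" ]
}

installed="0.15.0"   # in a real script: parsed from `ollama --version`
version_ge "$installed" "0.15.0" && echo "Ollama is new enough"
```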
Description: Download the Qwen3.6 model weights to your Spark node.
ollama pull qwen3.6
Optional variants if you want different memory footprints or precision:
ollama pull qwen3.6:35b-a3b-nvfp4 # NVIDIA FP4 build tuned for Blackwell (~22GB)
ollama pull qwen3.6:35b-a3b-q8_0 # Higher-quality 8-bit quant (~39GB)
ollama pull qwen3.6:35b-a3b-bf16 # Full precision (~71GB)
Expected output should show qwen3.6 (and any optional variants) in ollama list.
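To script the ollama list check, a small sketch: grep the listing for the tag you expect. The sample string below stands in for real ollama list output so the snippet runs without a live daemon; substitute "$(ollama list)" on your node.

```shell
# Succeeds when the given tag appears at the start of a listing line.
has_model() {
  printf '%s\n' "$2" | grep -q "^$1"
}

# Stand-in for `ollama list` output (hypothetical size/ID values).
sample_list="qwen3.6:latest    abc123    22 GB    2 minutes ago"
has_model "qwen3.6" "$sample_list" && echo "model present"
```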
Description: Run a quick prompt to confirm the model loads.
ollama run qwen3.6
Try a prompt like:
Write a short README checklist for a Python project.
Expected output should show the model responding in the terminal. When you are done, type /bye or press Ctrl+D to exit the interactive session before continuing.
Description: Use Ollama's built-in launch method to start Claude Code against your local model. No environment variables or config files are required.
ollama launch claude
Expected output should show Claude Code starting and using the local Qwen3.6 model. Qwen3.6 ships with a 256K context window by default; adjust context length through Ollama's settings if you need to tune it further.
Description: Create a tiny repo and let Claude Code implement a function and tests.
mkdir -p ~/cli-agent-demo
cd ~/cli-agent-demo
printf 'def add(a, b):\n    """Return the sum of a and b."""\n    pass\n' > math_utils.py
printf 'import math_utils\n\n\ndef test_add():\n    assert math_utils.add(1, 2) == 3\n' > test_math_utils.py
If you do not already have pytest installed:
python -m pip install -U pytest
In Claude Code:
Please implement add() in math_utils.py and make sure the test passes.
Run the test:
python -m pytest -q
Expected output should show the test passing.
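For reference, the finished math_utils.py that satisfies the test is essentially one line; a sketch of what the agent should produce:

```python
def add(a, b):
    """Return the sum of a and b."""
    return a + b
```

If the agent's version differs in style but the pytest run above passes, that is fine; the test is the contract.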
Description: Remove the model and stop services if you no longer need them.
To stop the service:
sudo systemctl stop ollama
WARNING: The following command deletes the downloaded model files.
ollama rm qwen3.6
If you also pulled optional variants, remove each tag the same way, e.g. ollama rm qwen3.6:35b-a3b-nvfp4 or ollama rm qwen3.6:35b-a3b-bf16.
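To clean up every tag from this guide in one go, a dry-run sketch that only prints the removal commands (pipe the output to sh, or drop the echo, to actually delete):

```shell
# Build the list of removal commands for each tag pulled above.
rm_cmds=$(for tag in qwen3.6 qwen3.6:35b-a3b-nvfp4 qwen3.6:35b-a3b-q8_0 qwen3.6:35b-a3b-bf16; do
  echo "ollama rm $tag"
done)
printf '%s\n' "$rm_cmds"
```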