| Symptom | Cause | Fix |
|---|---|---|
| `ollama: command not found` | Ollama not installed or PATH not updated | Rerun the install script (`curl -fsSL https://ollama.com/install.sh \| sh`) and restart the terminal. |
| Model load fails with version error | Ollama is older than 0.15.0 | Update Ollama to 0.15.0 or newer (required for GLM-4.7-Flash). Do not pin to 0.14.3. |
| `model not found` in Claude Code | Model was not pulled | Run `ollama pull glm-4.7-flash` or `ollama pull hf.co/unsloth/GLM-4.7-GGUF:Q8_0` and retry. Use the same model name in `claude --model ...`. |
| `connection refused` to `localhost:11434` | Ollama service not running | Start it with `ollama serve` or `sudo systemctl start ollama`. |
| Slow responses or OOM | Insufficient GPU memory or fragmentation | On DGX Station (NVIDIA GB300), ensure no other heavy GPU workloads are running. If OOM persists, use a smaller variant (e.g. `glm-4.7-flash:q8_0` or `glm-4.7-flash:q4_K_M`) or set `OLLAMA_MAX_LOADED_MODELS=1`. |
| `claude: command not found` after install | CLI not on PATH or install script did not complete | Restart the terminal or run `source ~/.bashrc` (or your shell profile). Check the install script output for the install path and add it to `PATH`. |
| Claude Code install fails (Node.js / network) | Node.js missing or install script cannot download | Ensure Node.js is installed (`node --version`). If the install script fails with a network error, retry from a stable connection or download the Claude Code CLI from the official site. See the Claude Code documentation for alternatives. |
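The first three rows above boil down to two checks: is the `ollama` binary on PATH, and is the API answering on the default port? A minimal health-check sketch (assumes the default port 11434; adjust if you changed `OLLAMA_HOST`):

```shell
# Quick local diagnosis for the most common failure modes in the table above.
check_ollama() {
  # 1) Binary on PATH? (covers "ollama: command not found")
  if command -v ollama >/dev/null 2>&1; then
    echo "ollama binary: OK ($(ollama --version 2>/dev/null || echo 'version unknown'))"
  else
    echo "ollama binary: MISSING (rerun the install script or fix PATH)"
  fi
  # 2) Service reachable? (covers "connection refused to localhost:11434")
  if curl -fsS --max-time 2 http://localhost:11434/api/version >/dev/null 2>&1; then
    echo "API on :11434: OK"
  else
    echo "API on :11434: UNREACHABLE (run 'ollama serve' or 'sudo systemctl start ollama')"
  fi
}

check_ollama
```

Run this before digging further; it tells you whether the problem is install, PATH, or the service itself.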
> **NOTE:** DGX Station with NVIDIA GB300 provides ample GPU memory for `glm-4.7-flash` (fast testing) and the larger `hf.co/unsloth/GLM-4.7-GGUF` variants (e.g. `glm-4.7-flash:bf16`). Use `OLLAMA_MAX_LOADED_MODELS=1` if you hit memory limits with multiple models.
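For a one-off run, the memory limit can be passed inline; for a systemd-managed service it belongs in a drop-in override instead. A sketch of both, assuming the stock `ollama` systemd unit name:

```shell
# One-off: keep at most one model resident for this server process.
OLLAMA_MAX_LOADED_MODELS=1 ollama serve

# systemd-managed install: persist the setting in a drop-in override.
# sudo systemctl edit ollama
#   [Service]
#   Environment="OLLAMA_MAX_LOADED_MODELS=1"
# sudo systemctl restart ollama
```

The inline form only affects that shell invocation; the override survives reboots.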