Text to Knowledge Graph on DGX Station

30 MIN

Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization

Common issues

| Symptom | Cause | Fix |
|---|---|---|
| Ollama performance issues | Suboptimal settings for GB300 | Set environment variables: `OLLAMA_FLASH_ATTENTION=1` (enables flash attention for better performance), `OLLAMA_KEEP_ALIVE=30m` (keeps the model loaded for 30 minutes), `OLLAMA_MAX_LOADED_MODELS=1` (avoids VRAM contention), `OLLAMA_KV_CACHE_TYPE=q8_0` (reduces KV cache VRAM with minimal performance impact) |
| VRAM exhausted or memory pressure (e.g. when switching between Ollama models) | GPU memory fragmentation | Clear GPU memory: `nvidia-smi --gpu-reset` or restart the Docker containers |
| Slow triple extraction | Large model or large context window | Reduce the document chunk size or use a faster model |
| ArangoDB connection refused | Service not fully started | Wait 30s after `start.sh`; verify with `docker ps` |
| Container fails to start with GPU error | NVIDIA Container Toolkit not configured | Run `nvidia-ctk runtime configure --runtime=docker` and restart Docker |
| Port already in use | Previous instance still running | Run `./stop.sh` first or use `docker compose down` |
| Default is vLLM; need Ollama instead | The ArangoDB + Ollama stack must be selected explicitly | Start with `./start.sh --ollama` |
| vLLM takes long to become ready | Model load can take 30+ minutes | The start script waits and shows elapsed time; the UI shows a banner and "vLLM (Local) – Initializing…" until ready. Check progress: `docker logs vllm-service -f` |
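The Ollama tuning variables from the first row can be set in one place before the service starts. A minimal sketch, assuming you launch Ollama from a shell session (if you run it in a container, put the same variables in the service's `environment:` section instead):

```shell
# Recommended Ollama settings for GB300 (values from the table above)
export OLLAMA_FLASH_ATTENTION=1      # enable flash attention
export OLLAMA_KEEP_ALIVE=30m         # keep the model loaded for 30 minutes
export OLLAMA_MAX_LOADED_MODELS=1    # avoid VRAM contention between models
export OLLAMA_KV_CACHE_TYPE=q8_0     # quantize the KV cache to save VRAM

# Verify the values before (re)starting the Ollama service
env | grep '^OLLAMA_'
```

Note that the variables must be visible to the Ollama server process itself, not just to the client shell, so restart the service after setting them.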

NOTE

DGX Station with GB300 Ultra provides massive GPU memory capacity, enabling you to run larger models (70B+) for higher-quality knowledge extraction. If you encounter memory issues with very large models, try reducing the context window size or using quantized model variants.
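One way to apply both suggestions at once with Ollama is a Modelfile that starts from a quantized variant and caps `num_ctx` (Ollama's context-window parameter). A sketch; the model tag is an assumption, so substitute one available in your registry:

```shell
# Sketch: smaller context window on a quantized 70B variant.
# The FROM tag below is an assumption -- use a tag you have pulled.
cat > Modelfile <<'EOF'
FROM llama3.1:70b-instruct-q4_K_M
PARAMETER num_ctx 8192
EOF

# Then, on the DGX Station:
#   ollama create llama3.1-70b-8k -f Modelfile
#   ollama run llama3.1-70b-8k
grep 'num_ctx' Modelfile
```

A smaller `num_ctx` shrinks the KV cache proportionally, which is usually the fastest way to relieve memory pressure without switching models.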