Text to Knowledge Graph

30 MIN

Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization

Symptom: Ollama performance issues
Cause: Suboptimal settings for DGX Spark
Fix: Set environment variables:
  OLLAMA_FLASH_ATTENTION=1 (enables flash attention for better performance)
  OLLAMA_KEEP_ALIVE=30m (keeps model loaded for 30 minutes)
  OLLAMA_MAX_LOADED_MODELS=1 (avoids VRAM contention)
  OLLAMA_KV_CACHE_TYPE=q8_0 (reduces KV cache VRAM with minimal performance impact)

Symptom: VRAM exhausted or memory pressure (e.g. when switching between Ollama models)
Cause: Linux buffer cache consuming GPU memory
Fix: Flush buffer cache: sudo sync; sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

Symptom: Slow triple extraction
Cause: Large model or large context window
Fix: Reduce document chunk size or use faster models

Symptom: ArangoDB connection refused
Cause: Service not fully started
Fix: Wait 30s after start.sh, verify with docker ps
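
Instead of waiting a fixed 30 seconds for ArangoDB, a readiness check can be polled until it succeeds. The following is a minimal sketch, not part of the playbook: the `retry` helper is a hypothetical name, and the commented usage assumes ArangoDB is published on its default port 8529 on localhost and that the `/_api/version` endpoint is reachable without authentication. If authentication is enabled in your setup, substitute a check that suits it.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example usage (assumed host and port; uncomment once start.sh has been run):
# retry 15 2 curl -sf http://localhost:8529/_api/version \
#   && echo "ArangoDB is up" \
#   || echo "ArangoDB did not respond; check 'docker ps'"
```

With 15 attempts and a 2-second delay, the check gives up after roughly 30 seconds, which matches the wait suggested above while returning sooner when the service starts quickly.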

NOTE

DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. Because many applications are still being updated to take advantage of UMA, you may encounter memory issues even when usage is within the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:

sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
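
To see whether the flush helped, the available memory can be compared before and after. This is a rough sketch under assumptions: the function names `mem_available_kb` and `flush_and_report` are hypothetical, and it treats the `MemAvailable` field of `/proc/meminfo` as a reasonable proxy for memory the GPU can use on a unified-memory system.

```shell
# Print the MemAvailable value (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo when no path is given.
mem_available_kb() {
  awk '/^MemAvailable:/ {print $2}' "${1:-/proc/meminfo}"
}

# Hypothetical wrapper: flush the buffer cache with the command shown above
# and report available memory before and after. Requires sudo.
flush_and_report() {
  before=$(mem_available_kb)
  sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
  after=$(mem_available_kb)
  echo "MemAvailable: ${before} kB -> ${after} kB"
}

# Example usage:
# flush_and_report
```

If the reported value barely changes, the buffer cache was probably not the source of the memory pressure.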