---
title: "Text to Knowledge Graph"
publisher: "nvidia"
type: "playbook"
updated: "2025-10-13T17:51:53.596Z"
description: "Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization"
canonical: "https://build.nvidia.com/spark/txt2kg.md"
---

# Basic idea

This playbook demonstrates how to build and deploy a knowledge graph generation and visualization solution that serves as a reference implementation for knowledge graph extraction.
The unified memory architecture of DGX Spark makes it possible to run larger, more accurate models, which produce higher-quality knowledge graphs and improve downstream GraphRAG performance.

This txt2kg playbook transforms unstructured text documents into structured knowledge graphs using:
- **Knowledge Triple Extraction**: Ollama with GPU acceleration for local LLM inference that extracts subject-predicate-object relationships
- **Graph Database Storage**: ArangoDB for storing and querying knowledge triples with relationship traversal
- **GPU-Accelerated Visualization**: Three.js WebGPU rendering for interactive 2D/3D graph exploration
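
To make the subject-predicate-object format concrete, the following minimal Python sketch shows one way an LLM response could be turned into triples. It assumes, purely for illustration, that the model is prompted to return a JSON array of objects with `subject`, `predicate`, and `object` fields. The `Triple` class, the `parse_triples` helper, and the sample text are hypothetical and are not taken from the playbook's code.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One knowledge triple (hypothetical container for this sketch)."""
    subject: str
    predicate: str
    obj: str


def parse_triples(llm_output: str) -> list[Triple]:
    """Parse a JSON array of {"subject", "predicate", "object"} records.

    The JSON shape is an assumption; the playbook's real output format
    may differ. Malformed or incomplete records are skipped.
    """
    try:
        records = json.loads(llm_output)
    except json.JSONDecodeError:
        return []
    if not isinstance(records, list):
        return []

    triples = []
    for rec in records:
        if not isinstance(rec, dict):
            continue
        parts = (rec.get("subject"), rec.get("predicate"), rec.get("object"))
        # Keep only records where all three fields are non-empty strings.
        if all(isinstance(p, str) and p.strip() for p in parts):
            triples.append(Triple(*(p.strip() for p in parts)))
    return triples


# Example model response: one complete record and one incomplete record.
sample_output = (
    '[{"subject": "Marie Curie", "predicate": "discovered", "object": "polonium"},'
    ' {"subject": "polonium"}]'
)
triples = parse_triples(sample_output)
```

Skipping incomplete records rather than raising an error is a design choice for this sketch: LLM output is not guaranteed to be well formed, so a tolerant parser keeps one bad record from discarding an entire document.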

> **Future Enhancements**: Vector embeddings and GraphRAG capabilities are planned.

# What you'll accomplish

You will have a fully functional system that processes documents, generates and edits knowledge graphs, and supports querying, all accessible through an interactive web interface.
The setup includes:
- **Local LLM Inference**: Ollama for GPU-accelerated LLM inference with no API keys required
- **Graph Database**: ArangoDB for storing and querying triples with relationship traversal
- **Interactive Visualization**: GPU-accelerated graph rendering with Three.js WebGPU
- **Modern Web Interface**: Next.js frontend with document management and query interface
- **Fully Containerized**: Reproducible deployment with Docker Compose and GPU support

# Prerequisites

- DGX Spark with the latest NVIDIA drivers
- Docker installed and configured with NVIDIA Container Toolkit
- Docker Compose

# Time & risk

- **Duration**:
  - 2-3 minutes for initial setup and container deployment
  - 5-10 minutes for Ollama model download (depending on model size)
  - Document processing and knowledge graph generation can start immediately afterward

- **Risks**:
  - GPU memory requirements depend on chosen Ollama model size
  - Document processing time scales with document size and complexity

- **Rollback**: Stop and remove Docker containers, delete downloaded models if needed
- **Last Updated**: 01/08/2025
  - Migrated from Pinecone to Qdrant for ARM64 compatibility
  - Added vLLM support with Neo4j
  - Added Palette UI components with accessibility improvements
  - Added CPU-only mode for development (`./start.sh --cpu`)
  - Optimized ArangoDB with deterministic keys and BM25 search
  - Added GNN preprocessing scripts for knowledge graph training
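
The changelog mentions deterministic keys for ArangoDB without showing the scheme. As a hedged illustration of the general idea, the Python sketch below derives a stable document key by hashing a normalized triple, so re-inserting the same triple maps to the same key instead of creating a duplicate. The `triple_key` helper, the normalization rules, and the 32-character key length are assumptions for this example, not the playbook's actual implementation.

```python
import hashlib


def triple_key(subject: str, predicate: str, obj: str) -> str:
    """Return a stable key for a triple (hypothetical scheme).

    Parts are trimmed and lowercased, joined with a unit-separator
    character so that ("ab", "c") and ("a", "bc") cannot collide,
    then hashed. The hex digest uses only characters that are valid
    in an ArangoDB document key.
    """
    normalized = "\x1f".join(p.strip().lower() for p in (subject, predicate, obj))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:32]


# The same triple with different casing and whitespace yields the same key.
key_a = triple_key("Marie Curie", "discovered", "polonium")
key_b = triple_key("  marie curie", "Discovered", "Polonium ")
key_c = triple_key("Marie Curie", "discovered", "radium")
```

With a key like this, an insert that overwrites or ignores existing keys makes repeated processing of the same document idempotent.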

## More

- [Instructions](/spark/txt2kg/instructions.md)
- [Troubleshooting](/spark/txt2kg/troubleshooting.md)