Spark & Reachy Photo Booth
AI-augmented photo booth using the DGX Spark and Reachy Mini.
Basic idea
Spark & Reachy Photo Booth is an interactive and event-driven photo booth demo that combines the DGX Spark™ with the Reachy Mini robot to create an engaging multimodal AI experience. The system showcases:
- A multi-modal agent built with the NeMo Agent Toolkit
- A ReAct loop driven by the openai/gpt-oss-20b LLM powered by TensorRT-LLM
- Voice interaction based on nvidia/riva-parakeet-ctc-1.1B and hexgrad/Kokoro-82M
- Image generation with black-forest-labs/FLUX.1-Kontext-dev for image-to-image restyling
- User position tracking built with facebookresearch/detectron2 and FoundationVision/ByteTrack
- MinIO for storing captured/generated images as well as sharing them via QR-code
The demo is based on several services that communicate through a message bus.
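To make the event-driven pattern concrete, here is a minimal in-process sketch of services exchanging messages over topics. It is only an illustration of the publish/subscribe idea: the `InMemoryBus` class, the topic names, and the two handler functions are hypothetical and are not taken from the repository, which uses a real message bus rather than in-memory callbacks.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBus:
    """Toy stand-in for a message bus: each topic maps to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to every handler registered for this topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


bus = InMemoryBus()
log: List[str] = []


def on_capture_requested(msg: dict) -> None:
    # Hypothetical camera service: reacts to a request and emits a new event.
    log.append(f"camera: capturing for {msg['user']}")
    bus.publish("photo.captured", {"user": msg["user"], "image": "photo-001.jpg"})


def on_photo_captured(msg: dict) -> None:
    # Hypothetical restyling service: reacts to the captured photo.
    log.append(f"restyler: restyling {msg['image']}")


bus.subscribe("photo.capture_requested", on_capture_requested)
bus.subscribe("photo.captured", on_photo_captured)
bus.publish("photo.capture_requested", {"user": "guest-1"})

for line in log:
    print(line)
```

Neither service calls the other directly; each only knows the topics it listens to and publishes on, which is what lets services be added or replaced independently.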
NOTE
This playbook applies to both the Reachy Mini and Reachy Mini Lite robots. For simplicity, we’ll refer to the robot as Reachy throughout this playbook.
What you'll accomplish
You'll deploy a complete photo booth system on DGX Spark running multiple inference models locally — LLM, image generation, speech recognition, speech generation, and computer vision — all without cloud dependencies. The Reachy robot interacts with users through natural conversation, captures photos, and generates custom images based on prompts, demonstrating real-time multimodal AI processing on edge hardware.
What to know before starting
- Basic Docker and Docker Compose knowledge
- Basic network configuration skills
Prerequisites
Hardware Requirements:
- NVIDIA DGX Spark
- A monitor, a keyboard, and a mouse to run this playbook directly on the DGX Spark.
- Reachy Mini or Reachy Mini Lite robot
TIP
Make sure your Reachy robot firmware is up to date. You can find instructions to update it here.
Software Requirements:
- The official DGX Spark OS image including all required utilities such as Git, Docker, NVIDIA drivers, and the NVIDIA Container Toolkit
- An internet connection for the DGX Spark
- NVIDIA NGC Personal API Key (NVIDIA_API_KEY). Create a key if necessary. Make sure to enable the NGC Catalog scope when creating the key.
- Hugging Face access token (HF_TOKEN). Create a token if necessary. Make sure to create a token with the "Read access to contents of all public gated repos you can access" permission.
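Before starting, it can help to confirm that both credentials are actually set. The following is a minimal sketch that assumes the two values are exported as shell environment variables under the names listed above; how the repository itself reads them (for example from an environment file) is not stated here, and the helper name `missing_credentials` is hypothetical.

```python
import os
from typing import List, Mapping

# Variable names come from the prerequisites list above.
REQUIRED_VARS = ("NVIDIA_API_KEY", "HF_TOKEN")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials: " + ", ".join(missing))
    else:
        print("Both credentials are set.")
```

The check only verifies presence; it does not validate the NGC Catalog scope or the Hugging Face token permissions.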
Ancillary files
All required assets can be found in the Spark & Reachy Photo Booth repository.
- The Docker Compose application
- Various configuration files
- Source code for all the services
- Detailed documentation
Time & risk
- Estimated time: 2 hours including hardware setup, container building, and model downloads
- Risk level: Medium
- Rollback: Docker containers can be stopped and removed to free resources. Downloaded models can be deleted from cache directories. Robot and peripheral connections can be safely disconnected. Network configurations can be reverted by removing custom settings.
- Last Updated: 01/27/2026
- 1.0.0 First Publication
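The container part of the rollback described above can be sketched as a small helper around `docker compose down`. This assumes the command is run from the directory containing the repository's Compose file; the function names are hypothetical, and the sketch deliberately leaves model caches alone because their locations are not given here.

```python
import subprocess
from typing import List


def compose_down_command(remove_volumes: bool = False) -> List[str]:
    """Build the argument list that stops and removes the Compose containers."""
    cmd = ["docker", "compose", "down"]
    if remove_volumes:
        # Also removes named volumes declared in the Compose file.
        cmd.append("--volumes")
    return cmd


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default so nothing is removed by accident.
    run(compose_down_command(), dry_run=True)
```

Removing volumes deletes any data stored in them, so the default keeps them in place.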
Governing terms
Your use of the Spark Playbook scripts is governed by Apache License, Version 2.0 and enables use of separate open source and proprietary software governed by their respective licenses: Flux.1-Kontext NIM, Parakeet 1.1b CTC en-US ASR NIM, TensorRT-LLM, minio/minio, arizephoenix/phoenix, grafana/otel-lgtm, Python, Node.js, nginx, busybox, UV Python Packager, Redpanda, Redpanda Console, gpt-oss-20b, FLUX.1-Kontext-dev, FLUX.1-Kontext-dev-onnx.
NOTE
FLUX.1-Kontext-dev and FLUX.1-Kontext-dev-onnx are models released for non-commercial use. Contact sales@blackforestlabs.ai for commercial terms. You are responsible for accepting the applicable License Agreements and Acceptable Use Policies, and for ensuring your HF token has the correct permissions.

