Copyright © 2025 NVIDIA Corporation

Connect Two Sparks

1 HR

Connect two Spark devices and set them up for inference and fine-tuning

Basic idea

Configure two DGX Spark systems for high-speed inter-node communication over a direct 200GbE QSFP connection. Establishing network connectivity and SSH authentication between the two nodes lets you run distributed workloads across both of them.

What you'll accomplish

You will physically connect two DGX Spark devices with a QSFP cable, configure network interfaces for cluster communication, and establish passwordless SSH between nodes to create a functional distributed computing environment.
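As a rough illustration of the passwordless SSH part, the sketch below uses the standard OpenSSH tools (ssh-keygen, ssh-copy-id). It is not the playbook's own method: the listed discover-sparks.sh script is described as handling key distribution automatically. The function name setup_passwordless_ssh, the DRY_RUN switch, the default key path, and the user@peer address are all assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: manual passwordless SSH setup from this node to a peer node.
# Assumptions: setup_passwordless_ssh, DRY_RUN, and the key path are
# illustrative names, not part of the playbook.

setup_passwordless_ssh() {
    local peer="$1"                          # e.g. user@<peer address>
    local key="${2:-$HOME/.ssh/id_ed25519}"  # assumed default key location
    local run="${DRY_RUN:+echo}"             # with DRY_RUN set, print commands only

    # Generate a key pair without a passphrase if none exists yet
    [ -f "$key" ] || $run ssh-keygen -t ed25519 -N "" -f "$key"

    # Install the public key on the peer (prompts for the password once)
    $run ssh-copy-id -i "$key.pub" "$peer"

    # Verify: BatchMode makes ssh fail instead of prompting for a password
    $run ssh -o BatchMode=yes "$peer" hostname
}
```

Calling it with DRY_RUN=1, for example DRY_RUN=1 setup_passwordless_ssh user@peer, prints the commands without running them, which is a low-risk way to review the steps first. The function has to be run on each node, pointing at the other, for SSH to work in both directions.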

What to know before starting

  • Basic understanding of distributed computing concepts
  • Working with network interface configuration and netplan
  • Experience with SSH key management

Prerequisites

  • Two DGX Spark systems
  • One QSFP cable for direct 200GbE connection between two devices
  • SSH access available to both systems
  • Root or sudo access on both systems (verify with: sudo whoami, which should print root)
  • The same username on both systems
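The sudo and username prerequisites can be checked with a short pre-flight script. This is a minimal sketch: PEER_HOST is a hypothetical variable holding the other Spark's hostname or IP address, and the helper names same_username and has_sudo are invented for this example.

```shell
#!/usr/bin/env bash
# Sketch only: pre-flight checks for the sudo and username prerequisites.
# Assumption: PEER_HOST is a placeholder for the other Spark's address.

# Succeeds if both usernames are non-empty and identical
same_username() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Succeeds if the current user can become root via sudo
# (sudo may prompt for a password)
has_sudo() {
    [ "$(sudo whoami 2>/dev/null)" = "root" ]
}

# Run the checks only when a peer host was supplied
if [ -n "${PEER_HOST:-}" ]; then
    if has_sudo; then echo "sudo: ok"; else echo "sudo: missing"; fi

    remote_user="$(ssh "$PEER_HOST" whoami)"
    if same_username "$(whoami)" "$remote_user"; then
        echo "username: ok"
    else
        echo "username: mismatch"
    fi
fi
```

Run it on one node with PEER_HOST set to the other node. Without PEER_HOST the script only defines the helpers and performs no checks.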

Ancillary files

All required files for this playbook are available on GitHub:

  • discover-sparks.sh script for automatic node discovery and SSH key distribution

Time & risk

  • Duration: 1 hour including validation

  • Risk level: Medium - involves network reconfiguration

  • Rollback: Network changes can be reversed by removing netplan configs or IP assignments
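The rollback described above could look roughly like the following sketch. The netplan file path, interface name, and address are not given in this section, so every value passed to the function is a hypothetical placeholder, and rollback_network and DRY_RUN are names invented for this example. The underlying commands (netplan apply, ip addr del) are standard.

```shell
#!/usr/bin/env bash
# Sketch only: undo the cluster network configuration.
# Assumptions: the file path, interface, and CIDR are placeholders that
# must be replaced with the values actually used during setup.

rollback_network() {
    local netplan_file="${1:-}"   # netplan config added for the link, if any
    local iface="${2:-}"          # interface that received a manual IP, if any
    local cidr="${3:-}"           # the manually assigned address, e.g. a.b.c.d/24
    local run="${DRY_RUN:+echo}"  # with DRY_RUN set, print commands only

    # Remove the netplan config and re-apply the remaining configuration
    if [ -n "$netplan_file" ]; then
        $run sudo rm -f "$netplan_file"
        $run sudo netplan apply
    fi

    # Remove a manually assigned IP address from the interface
    if [ -n "$iface" ] && [ -n "$cidr" ]; then
        $run sudo ip addr del "$cidr" dev "$iface"
    fi
}
```

Running it first with DRY_RUN=1 shows exactly which commands would be executed, which is worth doing before changing network settings on a machine you reach over SSH.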

Resources

  • DGX Spark Documentation
  • DGX Spark Forum
  • Terms of Service