NVIDIA
Explore
Models
Blueprints
GPUs
Docs
⌘KCtrl+K
View All Playbooks
View All Playbooks

onboarding

  • Set Up Local Network Access
  • Open WebUI with Ollama

data science

  • Single-cell RNA Sequencing
  • Portfolio Optimization
  • CUDA-X Data Science
  • Text to Knowledge Graph
  • Optimized JAX

tools

  • DGX Dashboard
  • Comfy UI
  • Connect Three DGX Spark in a Ring Topology
  • Connect Multiple DGX Spark through a Switch
  • RAG Application in AI Workbench
  • Set up Tailscale on Your Spark
  • VS Code

fine tuning

  • FLUX.1 Dreambooth LoRA Fine-tuning
  • LLaMA Factory
  • Fine-tune with NeMo
  • Fine-tune with Pytorch
  • Unsloth on DGX Spark

use case

  • NemoClaw with Nemotron 3 Super and Telegram on DGX Spark
  • Secure Long Running AI Agents with OpenShell on DGX Spark
  • OpenClaw 🦞
  • Live VLM WebUI
  • Install and Use Isaac Sim and Isaac Lab
  • Vibe Coding in VS Code
  • Build and Deploy a Multi-Agent Chatbot
  • Connect Two Sparks
  • NCCL for Two Sparks
  • Build a Video Search and Summarization (VSS) Agent
  • Spark & Reachy Photo Booth

inference

  • Run models with llama.cpp on DGX Spark
  • vLLM for Inference
  • Nemotron-3-Nano with llama.cpp
  • Speculative Decoding
  • SGLang for Inference
  • TRT LLM for Inference
  • NVFP4 Quantization
  • Multi-modal Inference
  • NIM on Spark
  • LM Studio on DGX Spark

NCCL for Two Sparks

30 MIN

Install and test NCCL on two Sparks

DGXSpark
OverviewOverviewRun on two SparksRun on two SparksTroubleshootingTroubleshooting

Basic idea

NCCL (NVIDIA Collective Communication Library) enables high-performance GPU-to-GPU communication across multiple nodes. This walkthrough sets up NCCL for multi-node distributed training on DGX Spark systems with Blackwell architecture. You'll configure networking, build NCCL from source with Blackwell support, and validate communication between nodes.

What you'll accomplish

You'll have a working multi-node NCCL environment that enables high-bandwidth GPU communication across DGX Spark systems for distributed training workloads, with validated network performance and proper GPU topology detection.

What to know before starting

  • Working with Linux network configuration and netplan
  • Basic understanding of MPI (Message Passing Interface) concepts
  • SSH key management and passwordless authentication setup

Prerequisites

  • Two DGX Spark systems
  • Completed the Connect two Sparks playbook
  • NVIDIA driver installed: nvidia-smi
  • CUDA toolkit available: nvcc --version
  • Root/sudo privileges: sudo whoami

Time & risk

  • Duration: 30 minutes for setup and validation
  • Risk level: Medium - involves network configuration changes
  • Rollback: The NCCL & NCCL Tests repositories can be deleted from DGX Spark
  • Last Updated: 12/15/2025
    • Use nccl latest version v2.28.9-1

Resources

  • NCCL Documentation
  • DGX Spark Documentation
  • DGX Spark Forum
  • DGX Spark User Performance Guide
Terms of Use
Privacy Policy
Your Privacy Choices
Contact

Copyright © 2026 NVIDIA Corporation