

CUDA-X Data Science

30 MIN

Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes

Tags: DGX Spark, clustering, data analytics, data science, dimensionality reduction, machine learning, pandas
View on GitHub

Step 1
Verify system requirements

  • Verify that the system has CUDA 13 installed using nvcc --version or nvidia-smi
  • Install conda using these instructions
  • Create a Kaggle API key using these instructions and place the kaggle.json file in the same folder as the notebook
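
As a quick sanity check, the CUDA major version can be parsed out of the nvcc version banner. The sketch below runs against a sample string standing in for real output; on the Spark itself, pipe the actual nvcc --version output instead:

```shell
# Parse the "release X.Y" field from nvcc's version banner.
# The sample string mimics real `nvcc --version` output; replace it with
# `nvcc --version` on the DGX Spark itself.
sample='Cuda compilation tools, release 13.0, V13.0.48'
cuda_major=$(printf '%s\n' "$sample" | sed -n 's/.*release \([0-9]*\)\..*/\1/p')
echo "CUDA major version: $cuda_major"
```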

Step 2
Install the data science libraries

Use the following command to install the CUDA-X libraries (this creates a new conda environment):

  conda create -n rapids-test -c rapidsai -c conda-forge -c nvidia  \
  rapids=25.10 python=3.12 'cuda-version=13.0' \
  jupyter hdbscan umap-learn

Step 3
Activate the conda environment

  conda activate rapids-test

Step 4
Clone the playbook repository

  • Clone the GitHub repository and go to the assets folder inside the cuda-x-data-science folder
      git clone https://github.com/NVIDIA/dgx-spark-playbooks
    
  • Place the kaggle.json file created in Step 1 in the assets folder
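
The Kaggle client expects the key file to be readable only by you. A minimal sketch of locking it down, where the placeholder JSON and the assets path are illustrative (use your real kaggle.json and the cloned repository's assets folder):

```shell
# Create the assets folder and place a (placeholder) kaggle.json in it,
# then restrict permissions — the Kaggle client warns if the key file is
# readable by other users.
mkdir -p assets
printf '{"username":"YOUR_USER","key":"YOUR_KEY"}\n' > assets/kaggle.json
chmod 600 assets/kaggle.json
ls -l assets/kaggle.json
```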

Step 5
Run the notebooks

There are two notebooks in the GitHub repository. The first demonstrates a large-scale string-processing workflow that runs pandas code on the GPU.

  • Run the cudf_pandas_demo.ipynb notebook and open localhost:8888 in your browser to access it
      jupyter notebook cudf_pandas_demo.ipynb
    

The second walks through machine learning algorithms, including UMAP and HDBSCAN.

  • Run the cuml_sklearn_demo.ipynb notebook and open localhost:8888 in your browser to access it
      jupyter notebook cuml_sklearn_demo.ipynb
    

If you are accessing your DGX Spark remotely, make sure to forward the necessary port so you can open the notebook in your local browser. Use the following command for port forwarding:

  ssh -N -L YYYY:localhost:XXXX username@remote_host 
  • YYYY: The local port you want to use (e.g. 8888)
  • XXXX: The port you specified when starting Jupyter Notebook on the remote machine (e.g. 8888)
  • -N: Prevents SSH from executing a remote command
  • -L: Specifies local port forwarding
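
Filling in the template for the common case where Jupyter listens on 8888 remotely and you want the same port locally (username and remote_host are placeholders for your actual Spark login):

```shell
# Build the port-forwarding command from the template above.
local_port=8888
remote_port=8888
remote="username@remote_host"   # placeholder — substitute your real login
cmd="ssh -N -L ${local_port}:localhost:${remote_port} ${remote}"
echo "$cmd"
```

Leave the resulting ssh session running while you use the notebook; closing it tears down the tunnel.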

Resources

  • NVIDIA RAPIDS Documentation
  • DGX Spark Documentation
  • DGX Spark DevZone Forum
  • DGX Spark User Performance Guide
Copyright © 2026 NVIDIA Corporation