Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes
Verify the CUDA installation with nvcc --version or nvidia-smi.
Use the following command to install the CUDA-X libraries (this will create a new conda environment):
conda create -n rapids-test -c rapidsai-nightly -c conda-forge -c nvidia \
rapids=25.10 python=3.12 'cuda-version=13.0' \
jupyter hdbscan umap-learn
conda activate rapids-test
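With the environment active, a quick check that the expected tools are on PATH can catch a missing driver or an incomplete install early. The following is a minimal sketch; `have_tool` is a hypothetical helper name and not part of conda or the CUDA-X libraries, and the list of tools is an assumption based on the commands used in this guide.

```shell
# Hypothetical helper: succeeds if the named command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report each tool this guide relies on (the list is an assumption).
for tool in nvidia-smi nvcc conda jupyter; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

A "missing" line for nvidia-smi usually points at the driver or PATH setup, while a missing jupyter suggests the environment was not created or activated.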
git clone https://github.com/NVIDIA/dgx-spark-playbooks
There are two notebooks in the GitHub repository. The first runs a large string-processing workflow with pandas code on the GPU.
Start the notebook with the command below, then open localhost:8888 in your browser to access it:
jupyter notebook cudf_pandas_demo.ipynb
The second walks through machine learning algorithms, including UMAP and HDBSCAN.
Start the notebook with the command below, then open localhost:8888 in your browser to access it:
jupyter notebook cuml_sklearn_demo.ipynb
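Outside of notebooks, the same "zero code changes" idea applies to plain Python scripts: cuDF and cuML each ship a module that can be run with `python -m` in front of an unmodified script. The sketch below only prints the command for each case rather than running it; `accelerated_cmd` is a hypothetical helper and `my_script.py` is a placeholder file name.

```shell
# Hypothetical helper: prints the command that runs an unmodified script
# through one of the accelerator modules.
#   pandas  -> python -m cudf.pandas  (pandas code on the GPU)
#   sklearn -> python -m cuml.accel   (scikit-learn, umap-learn, hdbscan on the GPU)
accelerated_cmd() {
    case "$1" in
        pandas)  printf 'python -m cudf.pandas %s\n' "$2" ;;
        sklearn) printf 'python -m cuml.accel %s\n' "$2" ;;
        *)       echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# my_script.py is a placeholder, not a file in the repository.
accelerated_cmd pandas my_script.py
accelerated_cmd sklearn my_script.py
```

Running the printed command inside the rapids-test environment is what actually executes the script with acceleration enabled.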
If you are accessing your DGX Spark remotely, forward the necessary port so you can open the notebook in your local browser. Use the following command for port forwarding:
ssh -N -L YYYY:localhost:XXXX username@remote_host
YYYY: The local port you want to use (e.g. 8888)
XXXX: The port you specified when starting Jupyter Notebook on the remote machine (e.g. 8888)
-N: Prevents SSH from executing a remote command
-L: Specifies local port forwarding
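To make the placeholder substitution concrete, the sketch below assembles the port-forwarding command from its four parts and prints it. `jupyter_tunnel_cmd` is a hypothetical helper name; the user and host values are the same placeholders as above, and the default of 8888 for both ports is an assumption taken from the examples in this guide.

```shell
# Hypothetical helper: prints the ssh port-forwarding command for a Jupyter tunnel.
# Arguments (all optional): local port, remote port, user, host.
jupyter_tunnel_cmd() {
    local_port="${1:-8888}"
    remote_port="${2:-8888}"
    user="${3:-username}"
    host="${4:-remote_host}"
    printf 'ssh -N -L %s:localhost:%s %s@%s\n' \
        "$local_port" "$remote_port" "$user" "$host"
}

# Example: local port 9999 forwarded to Jupyter listening on remote port 8888.
jupyter_tunnel_cmd 9999 8888 username remote_host
# prints: ssh -N -L 9999:localhost:8888 username@remote_host
```

Choosing a different local port, as in the example, is useful when 8888 is already taken on the local machine; the notebook is then reachable at localhost:9999.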