Investigate, understand, and interpret single-cell data in minutes, not days, by leveraging RAPIDS-singlecell, powered by NVIDIA RAPIDS™
For single-cell analysis, scientists can test near-real-time data analysis and visualization, achieving speedups of up to 938X over CPU by using RAPIDS-singlecell, developed by scverse. This blueprint is for scientists who understand single-cell analysis and want to apply RAPIDS to single-cell data.
It is strongly recommended that users review the README in this blueprint before working through the notebooks.
For this blueprint, two possible deployments are provided:
Please use the table in the Notebook Overview below to determine which size is right for you.
The workflow is as follows:
The outline below is a suggested exploration flow. Unless otherwise noted, users can choose any notebook to get started, as long as the GPU resources are available to run the notebook.
For those new to basic single-cell analysis, the end-to-end 01_demo_gpu_e2e notebook is the best place to start. It walks users through data preprocessing, cleanup, visualization, and investigation.
Notebook | Description | Min GPU Size / Instance |
---|---|---|
01_demo_gpu_e2e | End-to-end workflow, where we understand the cells, run ETL on the dataset, then visualize and explore the results. This tutorial is good for all users. | 24GB / Standard Instance |
02_decoupler | This notebook continues from the outputs of 01_demo_gpu_e2e as an overview of methods that can be used to investigate transcriptional regulation. | 24GB / Standard Instance |
demo_gpu_e2e_with_PR | End-to-end workflow, like 01_demo_gpu_e2e, but uses Pearson residuals for normalization. | 24GB / Standard Instance |
spatial_autocorr | An introduction to spatial transcriptomics analysis and visualization. | 24GB / Standard Instance |
out-of-core_processing | This notebook uses Dask to scale the analysis to as many as 11M cells. Requires a 48GB GPU. | 48GB / Standard Instance |
multi_gpu_large_data_showcase | This notebook extends the Dask-based analysis of the 11M-cell dataset to multiple GPUs. It scales to use all available GPUs, uses chunk-based execution, and manages memory efficiently so that memory limits are not exceeded. Requires 8x H100s or better. For all other GPU systems, please run out-of-core_processing instead. | 8x 80GB / Large Instance |
demo_gpu-seuratv3 | This notebook shows the breadth of capability by running a workflow similar to 01_demo_gpu_e2e, but on brain cells. | 24GB / Standard Instance |
demo_gpu-seuratv3-brain-1M | This notebook scales up the analysis of demo_gpu-seuratv3 to 1 million brain cells. Requires an 80GB GPU, such as an H100. | 80GB / Large Instance |
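
The table above can be read as a simple lookup. The following is a minimal sketch, not part of the blueprint, that encodes the listed minimum GPU sizes and returns the notebooks that fit a given machine. The names `Requirement`, `NOTEBOOKS`, and `runnable_notebooks` are hypothetical, and the sketch assumes each requirement is a per-GPU memory size in GB plus a GPU count.

```python
from typing import List, NamedTuple


class Requirement(NamedTuple):
    notebook: str
    min_gpu_gb: int  # minimum memory per GPU, transcribed from the table
    gpu_count: int   # number of GPUs required, transcribed from the table


# Hypothetical encoding of the Notebook Overview table.
NOTEBOOKS = [
    Requirement("01_demo_gpu_e2e", 24, 1),
    Requirement("02_decoupler", 24, 1),
    Requirement("demo_gpu_e2e_with_PR", 24, 1),
    Requirement("spatial_autocorr", 24, 1),
    Requirement("out-of-core_processing", 48, 1),
    Requirement("multi_gpu_large_data_showcase", 80, 8),
    Requirement("demo_gpu-seuratv3", 24, 1),
    Requirement("demo_gpu-seuratv3-brain-1M", 80, 1),
]


def runnable_notebooks(gpu_gb: int, gpu_count: int = 1) -> List[str]:
    """Return the notebooks whose minimum GPU size and count are satisfied."""
    return [
        req.notebook
        for req in NOTEBOOKS
        if gpu_gb >= req.min_gpu_gb and gpu_count >= req.gpu_count
    ]


if __name__ == "__main__":
    # Example: a single 48GB GPU.
    for name in runnable_notebooks(gpu_gb=48):
        print(name)
```

The lookup only checks hardware. It does not capture ordering constraints such as 02_decoupler continuing from the outputs of 01_demo_gpu_e2e.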
The following containers are used in this blueprint:
Additional software, including RAPIDS-singlecell, developed by scverse, is available on GitHub alongside these notebooks.
The single-cell analysis blueprint recommends using L40s with a minimum of 24GB of VRAM, unless otherwise stated in the tutorial. Users may have to wait 5–10 minutes for the instance to start, depending on cloud availability.
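
To check the VRAM recommendation on an instance that is already running, one option is to query `nvidia-smi`. The sketch below is an illustration rather than part of the blueprint: the helper names `parse_memory_mib` and `gpu_memory_gib` are hypothetical, and it assumes `nvidia-smi` is on the `PATH`. It returns an empty list when the tool is missing or fails.

```python
import subprocess
from typing import List

# Ask nvidia-smi for total memory per GPU: one integer (MiB) per output line.
QUERY = ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"]


def parse_memory_mib(output: str) -> List[int]:
    """Parse nvidia-smi output into per-GPU memory totals in MiB."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def gpu_memory_gib() -> List[float]:
    """Query local GPUs; return an empty list if nvidia-smi is unavailable."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return [mib / 1024 for mib in parse_memory_mib(result.stdout)]


if __name__ == "__main__":
    totals = gpu_memory_gib()
    if not totals:
        print("No GPUs reported (is nvidia-smi installed?)")
    for index, gib in enumerate(totals):
        print(f"GPU {index}: {gib:.1f} GiB total")
```

Reported totals are typically slightly below a card's nominal size, so a GPU marketed as 24GB may show a little under 24 GiB. Compare against the table's sizes with that in mind.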
The blueprint supports:
Hardware Requirements
Software Requirements
Governing Terms:
NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy subcards. Please report security vulnerabilities or NVIDIA AI concerns here.