7 results for
DGX Spark
60 MIN
cuTile Kernels
Run cuTile kernel benchmarks, FMHA implementation, and LLM inference on DGX Spark and B300
Playbook
FMHA
+10
1mo
DGX Station
30 MIN
LLM Inference with SGLang
Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional), covering prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance
Playbook
RadixAttention
+6
20d
DGX Spark
30 MIN
LM Studio on DGX Spark
Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.
Playbook
Inference
+3
4mo
DGX Spark
30 MIN
Nemotron-3-Nano with llama.cpp
Run the Nemotron-3-Nano-30B model using llama.cpp on DGX Spark
Playbook
Nemotron
+3
6mo
DGX Spark
30 MIN
Run models with llama.cpp on DGX Spark
Build llama.cpp with CUDA and serve models via an OpenAI-compatible API
Playbook
DGX Spark
+3
2mo
RTX Workstation
30 MIN
vLLM for Inference
Install and use vLLM on NVIDIA RTX Pro 6000
Playbook
vLLM
+1
5d
DGX Station
30 MIN
vLLM for Inference
Install and use vLLM on DGX Station
Playbook
vLLM
+1
3mo