Copyright © 2026 NVIDIA Corporation
DGX Spark
30 MIN
Run models with llama.cpp on DGX Spark
Build llama.cpp with CUDA and serve models via an OpenAI-compatible API (Nemotron 3 Nano Omni as example)
Playbook
DGX Spark
1mo
DGX Spark
30 MIN
Speculative Decoding
Learn how to set up speculative decoding for fast inference on Spark
Playbook
DGX
7mo