5 results
Serve Qwen3-235B with vLLM
  Playbook · DGX Station · 20 min · Tags: vLLM, +1 · updated 1mo ago
  Set up a vLLM server with Qwen3-235B on DGX Station.
LM Studio on DGX Spark
  Playbook · DGX Spark · 30 min · Tags: Inference, +3 · updated 2mo ago
  Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.
Nemotron-3-Nano with llama.cpp
  Playbook · DGX Spark · 30 min · Tags: Nemotron, +3 · updated 4mo ago
  Run the Nemotron-3-Nano-30B model using llama.cpp on DGX Spark.
Run models with llama.cpp on DGX Spark
  Playbook · DGX Spark · 30 min · Tags: DGX Spark, +3 · updated 3w ago
  Build llama.cpp with CUDA support and serve models via an OpenAI-compatible API (using Nemotron 3 Nano Omni as an example).
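A server exposing an OpenAI-compatible API, as this playbook describes, accepts the standard chat-completions request shape. A minimal sketch of building such a request with only the Python standard library; the host, port, and model name below are placeholders, not taken from the playbook:

```python
import json
import urllib.request

def chat_completion_request(url, model, messages, max_tokens=128):
    """Build an HTTP request carrying an OpenAI-compatible
    chat-completions payload (the format llama.cpp's server accepts)."""
    body = json.dumps({
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical local endpoint and model name for illustration only.
req = chat_completion_request(
    "http://localhost:8080/v1/chat/completions",
    "nemotron-3-nano",
    [{"role": "user", "content": "Hello"}],
)
```

Sending it is then `urllib.request.urlopen(req)`; any OpenAI-style client library pointed at the same base URL works equally well.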
cuTile Kernels
  Playbook · DGX Spark · 60 min · Tags: FMHA, +10 · updated 1d ago
  Run cuTile kernel benchmarks, an FMHA implementation, and LLM inference on DGX Spark and B300.