Try NVIDIA NIM APIs

Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional)—prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance

Playbook

RadixAttention

17d

DGX Spark

30 MIN

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Spark using NemoClaw and Ollama in a secure sandbox, with optional Telegram.

Playbook

2mo

DGX Station

30 MIN

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Station using NemoClaw in a secure sandbox, with optional Telegram.

Playbook

1mo

RTX Workstation

8 MIN

How to Fine-Tune an LLM on NVIDIA GPUs With Unsloth

Fine-tune popular AI models faster in Unsloth with NVIDIA RTX AI PCs, RTX PRO workstations, and DGX Spark—plus explore the new Nemotron Nano 3 family of open models.

Playbook

Fine-Tuning

12d

DGX Spark

30 MIN

NIM on Spark

Deploy a NIM on Spark

Playbook

DGX

8mo

DGX Spark

30 MIN

Nemotron-3-Nano with llama.cpp

Run the Nemotron-3-Nano-30B model using llama.cpp on DGX Spark

Playbook

Nemotron

5mo

DGX Spark

30 MIN

Run Hermes Agent with Local Models

Install and run the Hermes self-improving AI agent on DGX Spark.

Playbook

Nous Research

1mo

DGX Spark

30 MIN

Run models with llama.cpp on DGX Spark

Build llama.cpp with CUDA and serve models via an OpenAI-compatible API

Playbook

DGX Spark

2mo

NVIDIA

Downloadable

nemoguard-jailbreak-detect

Industry-leading jailbreak classification model for protection from adversarial attempts

Model

nemo guardrails

13.5K

11mo

Google

Downloadable

Free Endpoint

diffusiongemma-26b-a4b-it

Diffusion-based 26B parameter LLM enabling parallel token generation for real-time text apps

Model

diffusion-llm

14.2K

Healthcare & Life Sciences

Launchable

Developer Example

Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

Blueprint

NVIDIA AI

3mo

DGX Spark

20 MINS

CLI Coding Agent

Build local CLI coding agents with Ollama

Playbook

Coding

1mo

DGX Station

30 MINS

Local Coding Agent

Run local CLI coding agents with Ollama on DGX Station (NVIDIA GB300) using glm-4.7-flash (fast) or unsloth/GLM-4.7-GGUF:Q8_0 (best quality)

Playbook

Coding

2mo

DGX Spark

30 MINS

OpenClaw 🦞

Run OpenClaw locally on DGX Spark with a vLLM-served local model

Playbook

DGX

3mo

Z.ai

Downloadable

Free Endpoint

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

Model

Agentic AI

27.59M

1mo

DGX Spark

30 MIN

vLLM for Inference

Install and use vLLM on DGX Spark

Playbook

DGX

3mo

RTX Workstation

30 MIN

vLLM for Inference

Install and use vLLM on NVIDIA RTX Pro 6000

Playbook

vLLM

DGX Station

30 MIN

vLLM for Inference

Install and use vLLM on DGX Station

Playbook

vLLM

3mo

DGX Spark

20 MIN

Live VLM WebUI

Real-time Vision Language Model interaction with webcam streaming

Playbook

Vision AI

5mo

Run Megatron-LM (MLM) and Megatron Bridge training with mock or real data. Covers correlation testing, available recipes, and multi-GPU examples.

Skill

Developer

351

14d

Extract false-positive and false-negative gaps from VLM binary-classification-question (BCQ, yes/no) predictions. Use after running VLM evaluation when you have a predictions JSON and need to identify failure cases for DEFT root cause analysis on a binary

Skill

Developer

148

Today

DGX Station

30 MIN

Nanochat Training

Train a small ChatGPT-style LLM (nanochat) with tokenizer, pretraining, midtraining, and SFT on DGX Station with GB300 Ultra

Playbook

Training

3mo