Skip to main content
NVIDIA
Explore
Models
Skills
Blueprints
GPUs
Docs
⌘KCtrl+K
View All Playbooks
View All Playbooks

onboarding

  • MIG on DGX Station

data science

  • Topic Modeling
  • Text to Knowledge Graph on DGX Station

tools

  • NVFP4 Quantization

fine tuning

  • NVFP4 Pretraining with Megatron Bridge
  • Nanochat Training

use case

  • Run NemoClaw with a Local LLM
  • DGX Station AI Skills for Coding Agents
  • Profiler-Driven Kernel Optimization for Fine-Tuning
  • Local Healthcare Agent on DGX Station
  • Secure Long Running AI Agents with OpenShell on DGX Station
  • Local Coding Agent

inference

  • vLLM for Inference
  • Image & Video Generation with ComfyUI
  • Isaac GR00T N1.6 Fine-Tuning
  • LLM Inference with SGLang

DGX Station AI Skills for Coding Agents

15 MIN

Give your coding agent (Claude Code, Codex, Gemini CLI, Cursor) DGX Station expertise via an AGENTS.md and on-demand Agent Skills

AGENTS.mdAI AgentsAgent SkillsBlackwellClaude CodeCodexCursorDGX StationGB300Gemini CLIMIGMixed CoherencySGLangvLLM
View on GitHub
OverviewOverviewInstructionsInstructionsTroubleshootingTroubleshooting

Step 1
Install your coding agent

Pick whichever agent you prefer — the rest of this playbook works the same regardless. Install commands:

AgentInstall
Claude Codecurl -fsSL https://claude.ai/install.sh | sh
OpenAI Codex CLInpm i -g @openai/codex
Gemini CLInpm i -g @google/gemini-cli
CursorDownload from https://cursor.com/

Verify with claude --version, codex --version, gemini --version, or by launching Cursor.

Step 2
Install the skills into your project

Navigate to the project where you want DGX Station expertise, then run the installer with the harness you use:

cd ~/your-project

# Pick one:
/path/to/this/playbook/assets/install.sh claude
/path/to/this/playbook/assets/install.sh codex
/path/to/this/playbook/assets/install.sh gemini
/path/to/this/playbook/assets/install.sh cursor

# Or install for all four at once:
/path/to/this/playbook/assets/install.sh all

If you downloaded the playbook as a zip, the path is relative to the extracted directory:

station-ai-skills/assets/install.sh claude ~/your-project

The installer is additive for skill directories (won't clobber existing skills you've written) and refuses to overwrite an existing context file (AGENTS.md, CLAUDE.md, GEMINI.md) unless you pass --force.

Resulting layout (per harness):

your-project/
  AGENTS.md   or  CLAUDE.md   or  GEMINI.md      # context file (named for your agent)
  .claude/skills/<name>/SKILL.md                  # claude
  .codex/prompts/<name>.md                        # codex
  .gemini/commands/<name>.md                      # gemini
  .cursor/rules/<name>.mdc                        # cursor

Where <name> is each of vllm-setup, sglang-setup, mig-configure, dgx-diagnose.

NOTE

Every supported agent automatically reads the context file from the working directory at startup. Skills/prompts/rules in the harness-specific directory are discovered automatically — no additional configuration needed.

Step 3
Verify the setup

Start your agent in the project directory and ask a question that requires constraint knowledge:

Can I use --gpus all to run my CUDA workload on DGX Station?

The agent should immediately warn about the mixed-coherency constraint and recommend --gpus '"device=N"' targeting. If you don't get the warning, the context file isn't being loaded — see Troubleshooting.

Then verify the skills are discoverable:

AgentHow to check
Claude CodeType / — vllm-setup, sglang-setup, mig-configure, dgx-diagnose should appear in the autocomplete
Codex CLIType /prompts: — same four names appear
Gemini CLIType / — same four names appear
CursorOpen the Rules panel — same four rules appear

Step 4
Use vllm-setup to deploy an inference server

Invoke the skill in your agent:

AgentInvocation
Claude Code/vllm-setup (slash command) or just describe the task ("deploy vllm with Qwen3-8B")
Codex CLI/prompts:vllm-setup
Gemini CLI/vllm-setup
CursorIn chat: "use the vllm-setup rule to deploy a vllm server"

The agent will walk you through deploying a vLLM server with a validated container image, correct GPU targeting, and recommended parameters. It will check your GPU index, ask which model you want to serve, and generate the full docker run command.

Step 5
Use sglang-setup to deploy SGLang

Same invocation pattern, but for SGLang with the cu130 container, RadixAttention prefix caching, and structured JSON output support.

Step 6
Use mig-configure to partition the GB300

The agent will query your current MIG state, show available profiles, help you choose a layout for your workloads, and execute the partitioning commands.

Step 7
Use dgx-diagnose to troubleshoot issues

If you encounter problems, invoke dgx-diagnose. The agent will check GPU status, driver version, running processes, MIG state, and Fabric Manager to identify the issue.

Step 8
Customize

Both the AGENTS.md and the skills are plain markdown — extend them freely.

Add project-specific constraints to AGENTS.md (or your harness-specific context file):

## Project-specific

- Our production MIG layout is 3g.139gb + 2g.70gb + 2g.70gb
- Always use port 8080 for inference (nginx proxy on 443)
- Model weights are cached at /data/models, mount with -v /data/models:/root/.cache/huggingface/hub

Create new skills by adding a directory and SKILL.md to assets/skills/, then re-run install.sh:

mkdir -p assets/skills/run-benchmarks
cat > assets/skills/run-benchmarks/SKILL.md << 'EOF'
---
name: run-benchmarks
description: Run our standard inference benchmark suite against the running vLLM or SGLang server and compare against the baseline.
---

# Run benchmarks

1. Check which inference server is running (vLLM on port 8000 or SGLang on port 30000)
2. Run the appropriate benchmark script from ./benchmarks/
3. Report throughput (tokens/sec), latency (TTFT, ITL), and memory utilization
4. Compare against the baseline in ./benchmarks/baseline.json
EOF

TIP

Keep AGENTS.md focused on constraints and pitfalls (things that break). Put procedural workflows in skills (things you do step-by-step).

Resources

  • Anthropic Agent Skills Overview
  • AGENTS.md Standard
  • Claude Code Documentation
  • OpenAI Codex AGENTS.md Guide
  • Gemini CLI Custom Commands
  • Cursor Rules Documentation
  • vLLM Documentation
  • SGLang Documentation
  • MIG User Guide
Terms of Use
Privacy Policy
Your Privacy Choices
Contact

Copyright © 2026 NVIDIA Corporation