DGX Station AI Skills for Coding Agents

Step 1
Install your coding agent

Pick whichever agent you prefer — the rest of this playbook works the same regardless. Install commands:

Agent	Install
Claude Code	`curl -fsSL https://claude.ai/install.sh | bash`
OpenAI Codex CLI	`npm i -g @openai/codex`
Gemini CLI	`npm i -g @google/gemini-cli`
Cursor	Download from `https://cursor.com/`

NOTE

On a stock DGX Station the global npm prefix (/usr/lib/node_modules) is root-owned, so npm i -g … fails with EACCES. Either prefix the command with sudo, or configure a user-local prefix first (npm config set prefix ~/.npm-global and add ~/.npm-global/bin to your PATH) and run npm i -g … without sudo.
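The user-local prefix route from the note can be sketched as follows. This is a minimal sketch that assumes a POSIX-style shell. Persisting the PATH change through `~/.bashrc` is an assumption about your login shell, so it is left as a comment.

```shell
# Sketch: give npm a user-writable global prefix so `npm i -g` needs no sudo.
NPM_PREFIX="$HOME/.npm-global"
mkdir -p "$NPM_PREFIX/bin"

# Only touch npm's config if npm is actually installed.
if command -v npm >/dev/null 2>&1; then
  npm config set prefix "$NPM_PREFIX"
fi

# Prepend the prefix's bin directory to PATH once for this shell session.
case ":$PATH:" in
  *":$NPM_PREFIX/bin:"*) ;;                        # already present
  *) PATH="$NPM_PREFIX/bin:$PATH"; export PATH ;;
esac

# To persist (assumption: bash is your login shell), add this line to ~/.bashrc:
#   export PATH="$HOME/.npm-global/bin:$PATH"
```

After this, the `npm i -g` commands from the table should install into `~/.npm-global` without sudo.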

Verify with claude --version, codex --version, gemini --version, or by launching Cursor.
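To check all three CLI agents in one pass, a small loop along these lines may help. The helper name `agent_status` is hypothetical. Cursor is a desktop application and is not covered here.

```shell
# Report whether a CLI agent is on PATH and, if so, print its version line.
agent_status() {
  # $1: command name, e.g. claude, codex, or gemini
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: %s\n' "$1" "$("$1" --version 2>&1 | head -n 1)"
  else
    printf '%s: not installed\n' "$1"
    return 1
  fi
}

# You only need one of these; a missing agent is not an error.
for agent in claude codex gemini; do
  agent_status "$agent" || true
done
```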

Step 2
Install the skills into your project

Navigate to the project where you want DGX Station expertise, then run the installer with the harness you use:

cd ~/your-project

# Pick one:
/path/to/this/playbook/assets/install.sh claude
/path/to/this/playbook/assets/install.sh codex
/path/to/this/playbook/assets/install.sh gemini
/path/to/this/playbook/assets/install.sh cursor

# Or install for all four at once:
/path/to/this/playbook/assets/install.sh all

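If you downloaded the playbook as a zip, invoke the installer by its path relative to the extracted directory. This form passes the target project as a second argument instead of running from inside it: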

station-ai-skills/assets/install.sh claude ~/your-project

The installer is additive for skill directories (won't clobber existing skills you've written) and refuses to overwrite an existing context file (AGENTS.md, CLAUDE.md, GEMINI.md) unless you pass --force.

Resulting layout (per harness):

your-project/
  AGENTS.md   or  CLAUDE.md   or  GEMINI.md      # context file (named for your agent)
  .claude/skills/<name>/SKILL.md                  # claude
  .agents/skills/<name>/SKILL.md                  # codex
  .gemini/commands/<name>.md                      # gemini
  .cursor/rules/<name>.mdc                        # cursor

Here, <name> is each of vllm-setup, sglang-setup, mig-configure, and dgx-diagnose.
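To confirm that the installer produced the layout above, a check like the following can be run against a project directory. It is a sketch built only from the paths listed in the layout. The function names `skill_path` and `check_skills` are hypothetical and are not part of install.sh.

```shell
# Print the expected relative path of one skill for a given harness.
skill_path() {
  # $1: harness (claude|codex|gemini|cursor), $2: skill name
  case "$1" in
    claude) printf '.claude/skills/%s/SKILL.md\n' "$2" ;;
    codex)  printf '.agents/skills/%s/SKILL.md\n' "$2" ;;
    gemini) printf '.gemini/commands/%s.md\n' "$2" ;;
    cursor) printf '.cursor/rules/%s.mdc\n' "$2" ;;
    *)      echo "unknown harness: $1" >&2; return 2 ;;
  esac
}

# Check that all four skills exist under a project directory.
check_skills() {
  # $1: project directory, $2: harness
  missing=0
  for name in vllm-setup sglang-setup mig-configure dgx-diagnose; do
    rel=$(skill_path "$2" "$name") || return 2
    if [ -f "$1/$rel" ]; then
      echo "ok       $rel"
    else
      echo "MISSING  $rel"
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_skills ~/your-project claude
```

The function returns non-zero if any of the four files is absent, so it can also gate a script.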

NOTE

Every supported agent automatically reads the context file from the working directory at startup. Skills/prompts/rules in the harness-specific directory are discovered automatically — no additional configuration needed.

Step 3
Verify the setup

Start your agent in the project directory and ask a question that requires constraint knowledge:

Can I use --gpus all to run my CUDA workload on DGX Station?

The agent should immediately warn about the mixed-coherency constraint and recommend --gpus '"device=N"' targeting. If you don't get the warning, the context file isn't being loaded — see Troubleshooting.
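The quoting in the recommended flag is easy to get wrong, so here is a small sketch that builds it for one GPU index and prints a docker run command for review. The image name is a hypothetical placeholder and index 0 is only an example. Use `nvidia-smi -L` to see the real indices on your machine.

```shell
# Build the --gpus flag for a single device index.
# The outer single quotes keep the shell from stripping the inner double quotes.
gpu_flag() {
  # $1: GPU index as reported by `nvidia-smi -L`
  printf '%s\n' "--gpus '\"device=$1\"'"
}

# List GPUs and their indices, if the NVIDIA driver tools are available.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L || true
fi

# Placeholder image name (hypothetical); substitute the container you actually run.
IMAGE="your-cuda-image:tag"

# Print the command for review instead of executing it.
echo "docker run --rm $(gpu_flag 0) $IMAGE nvidia-smi"
```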

Then verify the skills are discoverable:

Agent	How to check
Claude Code	Type `/` — `vllm-setup`, `sglang-setup`, `mig-configure`, `dgx-diagnose` should appear in the autocomplete
Codex CLI	Run `/skills` (or type `$`) — same four names appear
Gemini CLI	Type `/` — same four names appear
Cursor	Open the Rules panel — same four rules appear

Step 4
Use vllm-setup to deploy an inference server

Invoke the skill in your agent:

Agent	Invocation
Claude Code	`/vllm-setup` (slash command) or just describe the task ("deploy vllm with Qwen3-8B")
Codex CLI	`$vllm-setup` (mention), or run `/skills` and pick it
Gemini CLI	`/vllm-setup`
Cursor	In chat: "use the vllm-setup rule to deploy a vllm server"

The agent will walk you through deploying a vLLM server with a validated container image, correct GPU targeting, and recommended parameters. It will check your GPU index, ask which model you want to serve, and generate the full docker run command.
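Once the container is running, a quick probe can confirm that the server answers. This sketch assumes the server exposes vLLM's OpenAI-compatible API on port 8000. The skill may choose a different port, so adjust the URL to match the generated command. The function name `probe_server` is hypothetical.

```shell
# Probe an OpenAI-compatible endpoint and report whether it answers.
probe_server() {
  # $1: base URL, e.g. http://localhost:8000
  if curl -fsS --max-time 5 "$1/v1/models" >/dev/null 2>&1; then
    echo "up: $1"
  else
    echo "down: $1"
    return 1
  fi
}

# Assumption: the server listens on port 8000 on this host.
probe_server "http://localhost:8000" || true
```

A model can take a while to load, so a "down" result right after startup does not necessarily mean the deployment failed.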

Step 5
Use sglang-setup to deploy SGLang

The invocation pattern is the same as for vllm-setup (for example, `/sglang-setup` in Claude Code or Gemini CLI). The skill deploys SGLang with the cu130 container, RadixAttention prefix caching, and structured JSON output support.

Step 6
Use mig-configure to partition the GB300

The agent will query your current MIG state, show available profiles, help you choose a layout for your workloads, and execute the partitioning commands.
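For orientation, the read-only part of that workflow corresponds roughly to the nvidia-smi queries below. This is a sketch of what you could run by hand, not necessarily the exact commands the skill issues. GPU index 0 and the `<profile>` placeholders in the comments are assumptions.

```shell
# Read-only look at MIG state; nothing here changes the GPU configuration.
mig_report() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found"
    return 1
  fi
  # Current MIG mode per GPU.
  nvidia-smi --query-gpu=index,name,mig.mode.current --format=csv
  # GPU instance profiles the driver offers.
  nvidia-smi mig -lgip
  # GPU instances that already exist (may be none).
  nvidia-smi mig -lgi
}

mig_report || true

# Changing the layout is disruptive and needs root, so it is shown as comments only:
#   sudo nvidia-smi -i 0 -mig 1                             # enable MIG mode on GPU 0
#   sudo nvidia-smi mig -i 0 -cgi <profile>,<profile> -C    # create GPU and compute instances
```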

Step 7
Use dgx-diagnose to troubleshoot issues

If you encounter problems, invoke dgx-diagnose. The agent will check GPU status, driver version, running processes, MIG state, and Fabric Manager to identify the issue.
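The checks listed above map onto a handful of standard commands, sketched below for manual use. The helper `run_check` is hypothetical, and the service name `nvidia-fabricmanager` is an assumption about how Fabric Manager is installed on your system.

```shell
# Run one diagnostic command under a heading; keep going if it fails.
run_check() {
  # $1: label, remaining args: the command to run
  label=$1; shift
  echo "== $label"
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "skipped: $1 not found"
    return 0
  fi
  "$@" || echo "failed with exit code $?"
}

run_check "GPU status"        nvidia-smi
run_check "Driver version"    nvidia-smi --query-gpu=driver_version --format=csv,noheader
run_check "Compute processes" nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
run_check "MIG mode"          nvidia-smi --query-gpu=index,mig.mode.current --format=csv
run_check "Fabric Manager"    systemctl status nvidia-fabricmanager --no-pager
```

Pasting this output into the agent alongside the dgx-diagnose invocation gives it the same starting facts.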

Step 8
Customize

Both AGENTS.md and the skills are plain Markdown, so extend them freely.

Add project-specific constraints to AGENTS.md (or your harness-specific context file):

## Project-specific

- Our production MIG layout is 3g.139gb + 2g.70gb + 2g.70gb
- Always use port 8080 for inference (nginx proxy on 443)
- Model weights are cached at /data/models, mount with -v /data/models:/root/.cache/huggingface/hub
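If you prefer to script this rather than edit the file by hand, an append-once helper could look like the sketch below. The function name `add_project_section` is hypothetical, and the single bullet is only an example taken from the list above.

```shell
# Append a project-specific section to the context file, but only once.
add_project_section() {
  # $1: path to the context file (AGENTS.md, CLAUDE.md, or GEMINI.md)
  if grep -q '^## Project-specific$' "$1" 2>/dev/null; then
    echo "section already present in $1"
    return 0
  fi
  cat >> "$1" << 'EOF'

## Project-specific

- Always use port 8080 for inference (nginx proxy on 443)
EOF
  echo "section added to $1"
}

# Example: add_project_section ~/your-project/AGENTS.md
```

Because the helper checks for the heading first, running it repeatedly does not duplicate the section.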

Create new skills by adding a directory and SKILL.md to assets/skills/, then re-run install.sh:

mkdir -p assets/skills/run-benchmarks
cat > assets/skills/run-benchmarks/SKILL.md << 'EOF'
---
name: run-benchmarks
description: Run our standard inference benchmark suite against the running vLLM or SGLang server and compare against the baseline.
---

# Run benchmarks

1. Check which inference server is running (vLLM on port 8000 or SGLang on port 30000)
2. Run the appropriate benchmark script from ./benchmarks/
3. Report throughput (tokens/sec), latency (TTFT, ITL), and memory utilization
4. Compare against the baseline in ./benchmarks/baseline.json
EOF
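Before re-running install.sh, it can be worth checking that the new file has the same frontmatter shape as the example. The sketch below assumes that `name` and `description` are the fields that matter, which is inferred from the example above rather than from a published schema. The function name `check_frontmatter` is hypothetical.

```shell
# Check that a SKILL.md starts with frontmatter containing name and description.
check_frontmatter() {
  # $1: path to SKILL.md
  [ "$(head -n 1 "$1")" = "---" ] || { echo "no frontmatter: $1"; return 1; }
  # Take the lines between the first and second '---' delimiters.
  front=$(awk 'NR > 1 && /^---$/ { exit } NR > 1 { print }' "$1")
  for key in name description; do
    if ! printf '%s\n' "$front" | grep -q "^$key: "; then
      echo "missing $key: $1"
      return 1
    fi
  done
  echo "ok: $1"
}

# Example: check_frontmatter assets/skills/run-benchmarks/SKILL.md
```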

TIP

Keep AGENTS.md focused on constraints and pitfalls (things that break). Put procedural workflows in skills (things you do step-by-step).
