DGX Station AI Skills for Coding Agents

Basic idea

Modern coding agents — Claude Code, OpenAI Codex CLI, Gemini CLI, Cursor — all support two extension mechanisms: a project-level context file that's loaded into every conversation, and on-demand procedural workflows (called skills, prompts, commands, or rules depending on the harness). This playbook ships both for DGX Station:

An AGENTS.md with the critical DGX Station constraints your agent should always know (mixed coherency, GPU targeting, common pitfalls). AGENTS.md is the cross-harness standard; install.sh installs it as CLAUDE.md, GEMINI.md, or AGENTS.md, depending on the agent you use.
Four Agent Skills — vllm-setup, sglang-setup, mig-configure, dgx-diagnose — authored once in the Anthropic Agent Skills format and installed into the right per-harness location (.claude/skills/, .agents/skills/, .gemini/commands/, or .cursor/rules/).

This approach keeps your agent's context lean in every conversation while giving it deep procedural knowledge on demand, regardless of which agent you use.
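
The per-harness layout described above can be sketched as a pair of lookup functions. This is a minimal sketch, not the contents of the shipped install.sh: the function names context_file_for and skill_dir_for are hypothetical, the pairing of each harness with its file and directory is inferred from the names listed above, and treating Cursor as an AGENTS.md reader is an assumption.

```shell
# Hypothetical helpers; the shipped install.sh may be organized differently.

# Print the context file name assumed for a given harness.
context_file_for() {
  case "$1" in
    claude) echo "CLAUDE.md" ;;
    gemini) echo "GEMINI.md" ;;
    codex|cursor) echo "AGENTS.md" ;;   # cursor: assumption
    *) echo "unknown harness: $1" >&2; return 1 ;;
  esac
}

# Print the skill directory assumed for a given harness.
skill_dir_for() {
  case "$1" in
    claude) echo ".claude/skills" ;;
    codex)  echo ".agents/skills" ;;
    gemini) echo ".gemini/commands" ;;
    cursor) echo ".cursor/rules" ;;
    *) echo "unknown harness: $1" >&2; return 1 ;;
  esac
}

# Show the assumed layout for every harness.
for h in claude codex gemini cursor; do
  printf '%s: %s + %s/\n' "$h" "$(context_file_for "$h")" "$(skill_dir_for "$h")"
done
```

Keeping the mapping in one place means a new harness needs only one extra case arm in each function.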

AGENTS.md vs Agent Skill — why split?

	AGENTS.md	Agent Skill
Loaded	Every conversation, automatically	Only when invoked by name (or matched by description, in Claude Code)
Best for	Constraints, pitfalls, "never do X" rules	Step-by-step workflows, deployment procedures
Context cost	Consumed every time	Zero until invoked

The DGX Station mixed-coherency constraint (--gpus all will crash) should be in every conversation. The full vLLM deployment procedure should not.
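
As an illustration of the kind of always-on rule that belongs in AGENTS.md, the check below flags a command line that passes --gpus all. It is a sketch only: the function name uses_gpus_all and the image name my-image are hypothetical, and the device index in the message is a placeholder, since the correct GPU target is defined in AGENTS.md, not here.

```shell
# Hypothetical guard: succeeds (exit 0) when a command line requests every GPU.
uses_gpus_all() {
  case " $* " in
    *" --gpus all "*|*" --gpus=all "*) return 0 ;;
    *) return 1 ;;
  esac
}

cmd="docker run --rm --gpus all my-image"   # my-image is a placeholder
if uses_gpus_all "$cmd"; then
  echo "refusing: target a specific GPU instead, e.g. --gpus device=0 (index is a placeholder)"
fi
```

A check like this is cheap enough to state in every conversation, which is why it fits the context file and not a skill.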

What you'll accomplish

Install the AGENTS.md and four Agent Skills into your project directory for your chosen agent (Claude Code, Codex, Gemini CLI, or Cursor).
Verify the agent loads the constraints automatically and the skills on demand.
Invoke vllm-setup to deploy a vLLM inference server with validated configuration.
Invoke sglang-setup to deploy an SGLang inference server.
Invoke mig-configure to partition the GB300 into MIG instances.
Invoke dgx-diagnose to troubleshoot common DGX Station issues.
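
Before testing the agent itself, a file-level check can confirm that the install step finished. This is a minimal sketch that assumes the Claude Code layout (CLAUDE.md plus .claude/skills/<name>/SKILL.md); the function name check_claude_install is hypothetical, and other harnesses may store the skills in a different file format.

```shell
# Hypothetical check for a Claude Code install; prints each missing path
# and returns non-zero if anything is missing.
check_claude_install() {
  project_dir="${1:-.}"
  missing=0
  for rel in CLAUDE.md \
      .claude/skills/vllm-setup/SKILL.md \
      .claude/skills/sglang-setup/SKILL.md \
      .claude/skills/mig-configure/SKILL.md \
      .claude/skills/dgx-diagnose/SKILL.md; do
    if [ ! -f "$project_dir/$rel" ]; then
      echo "missing: $rel"
      missing=1
    fi
  done
  return "$missing"
}

# Check the current directory.
check_claude_install . || echo "install incomplete"
```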

What to know before starting

Basic familiarity with one supported coding agent (running it, giving it prompts, using slash commands or rule references)
General understanding of DGX Station (two GPUs, Docker-based workflows)

Prerequisites

NVIDIA DGX Station with GB300
One of the supported coding agents installed:
- Claude Code: curl -fsSL https://claude.ai/install.sh | bash
- OpenAI Codex CLI: npm i -g @openai/codex
- Gemini CLI: npm i -g @google/gemini-cli
- Cursor: download from https://cursor.com/
A project directory where you do DGX Station work
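
To see which of the command-line agents above are already installed, a short PATH check is enough. This is a sketch: the function name detect_agents is hypothetical, and the binary names (claude, codex, gemini) are the ones these installers normally create. Cursor is a desktop application, so it is not checked here.

```shell
# Print the name of each supported agent CLI found on PATH, one per line.
detect_agents() {
  for bin in claude codex gemini; do
    if command -v "$bin" >/dev/null 2>&1; then
      echo "$bin"
    fi
  done
}

found=$(detect_agents)
if [ -n "$found" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "agents found:" $found
else
  echo "no supported agent CLI found on PATH"
fi
```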

Ancillary files

assets/AGENTS.md — canonical context file with critical constraints, GPU targeting, software versions, and common pitfalls. Cross-harness standard.
assets/skills/vllm-setup/SKILL.md — skill: deploy vLLM with validated configuration.
assets/skills/sglang-setup/SKILL.md — skill: deploy SGLang with validated configuration.
assets/skills/mig-configure/SKILL.md — skill: configure MIG partitions on the GB300.
assets/skills/dgx-diagnose/SKILL.md — skill: troubleshoot common DGX Station issues.
assets/install.sh — per-harness installer (claude, codex, gemini, cursor, or all).
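
Each SKILL.md listed above follows the Agent Skills layout: YAML frontmatter with a name and a description, followed by the procedure in markdown. The snippet below writes a skeleton to a scratch directory to show the shape. It is a sketch: the skill name example-skill, the description text, and the body are placeholders, not the contents of the shipped skills.

```shell
# Write a placeholder SKILL.md into a scratch directory (not into assets/).
skill_dir="$(mktemp -d)/example-skill"
mkdir -p "$skill_dir"

cat > "$skill_dir/SKILL.md" <<'EOF'
---
name: example-skill
description: Placeholder. One or two sentences saying what the skill does and when to use it.
---

# Example skill

1. First step of the procedure.
2. Second step of the procedure.
EOF

echo "wrote $skill_dir/SKILL.md"
```

The description field is what an agent can match against when deciding whether to load the skill, so it is worth stating both what the skill does and when it applies.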

Time & risk

Duration: 10-15 minutes
Risk level: Low — this playbook copies markdown files into your project directory
Rollback: Delete the context file (AGENTS.md / CLAUDE.md / GEMINI.md) and the harness-specific skill directory (.claude/skills/, .agents/skills/, .gemini/commands/, or .cursor/rules/) from your project directory
Last Updated: 05/18/2026
- Restructured as harness-agnostic Agent Skills (Claude Code, Codex, Gemini CLI, Cursor)
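
The rollback described above can be sketched as a single function. This is a hedged sketch, not part of the shipped playbook: the function name rollback is hypothetical, the harness-to-path pairing is inferred from the names in this document, and the Cursor context file is an assumption. Review the paths first, because a skill directory may also hold files you added yourself.

```shell
# Hypothetical rollback for one harness.
# Usage: rollback <claude|codex|gemini|cursor> [project_dir]
rollback() {
  harness="$1"
  project_dir="${2:-.}"
  case "$harness" in
    claude) ctx="CLAUDE.md"; dir=".claude/skills" ;;
    codex)  ctx="AGENTS.md"; dir=".agents/skills" ;;
    gemini) ctx="GEMINI.md"; dir=".gemini/commands" ;;
    cursor) ctx="AGENTS.md"; dir=".cursor/rules" ;;   # context file: assumption
    *) echo "unknown harness: $harness" >&2; return 1 ;;
  esac
  # Remove the context file and the harness-specific skill directory only.
  rm -f  "$project_dir/$ctx"
  rm -rf "$project_dir/$dir"
}
```

The function is defined but not called here; run it explicitly, for example as rollback claude followed by the path to your project.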