Prerequisites and environment
This playbook is for DGX Station (single node). Ensure your DGX Station has Docker with NVIDIA runtime, GPU access, and required API keys. Nanochat uses Weights & Biases (W&B) for training visualization and a Hugging Face token for evaluation datasets.
# Verify GPU and Docker
nvidia-smi
docker run --rm --gpus all nvcr.io/nvidia/pytorch:26.01-py3 nvidia-smi
Expected output should show your GPU(s) and driver version. Create a W&B account and a Hugging Face token if you do not have them.
export WANDB_API_KEY=<YOUR_WANDB_API_KEY>
export HF_TOKEN=<YOUR_HF_TOKEN>
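Before invoking the launch scripts, it can help to fail fast if either key is missing. A minimal sketch (the `check_env` helper is illustrative, not part of nanochat):

```shell
# Illustrative helper: verify that required environment variables are non-empty.
check_env() {
  for v in "$@"; do
    if [ -z "$(eval echo "\${$v:-}")" ]; then
      echo "missing: $v"
      return 1
    fi
  done
  echo "ok"
}

# Warn (without aborting the shell) if a key is missing.
check_env WANDB_API_KEY HF_TOKEN || echo "Set the missing variables before launching."
```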
Clone the playbook and set up nanochat
Clone the playbook repository and run the setup script to clone the nanochat repo and build the Docker image.
git clone https://github.com/NVIDIA/dgx-station-playbooks.git
cd dgx-station-playbooks/nvidia/station-nanochat/assets
From the assets directory, run the setup script. It clones nanochat, checks out the supported commit, and builds the nanochat Docker image (PyTorch NGC base with tiktoken, tokenizers, datasets, wandb, etc.).
./setup.sh
Setup may take several minutes while the image builds. Verify the image:
docker images | grep nanochat
You should see the nanochat image listed.
Launch full training
NOTE
The default launch.sh uses cache directories under /nanochat_cache. If that path does not exist on your DGX Station, edit launch.sh and replace those paths with your own (e.g. $(pwd)/nanochat_cache and $(pwd)/hf_cache), and create the directories before running.
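If you opt for local cache paths, the directories can be created up front; a sketch matching the example substitution in the note:

```shell
# Create local cache directories before running launch.sh
# (these paths are the example substitution from the note above; adjust as needed).
mkdir -p "$(pwd)/nanochat_cache" "$(pwd)/hf_cache"
ls -d "$(pwd)/nanochat_cache" "$(pwd)/hf_cache"
```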
To run full training (d20 model, 240 shards, midtraining, SFT, report) for higher-quality results, use the full launcher. On a DGX Station with GB300 Ultra this can take on the order of 16 hours:
export WANDB_API_KEY=<YOUR_WANDB_API_KEY>
export HF_TOKEN=<YOUR_HF_TOKEN>
./launch_full.sh
This runs speedrun_full.sh inside the container: full FineWeb download (240 shards), 561M-parameter (d20) pretraining, midtraining, supervised fine-tuning, and report generation.
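Because the full run downloads 240 FineWeb shards and writes checkpoints, a quick free-space check before launching can save an aborted run. A sketch (the 500 GB threshold is an illustrative assumption, not a measured requirement):

```shell
# Pre-flight disk check on the filesystem backing the current directory.
FREE_GB=$(df -BG --output=avail . | tail -n 1 | tr -dc '0-9')
echo "Free space: ${FREE_GB} GB"
# 500 GB is an illustrative threshold; size it to your cache location.
if [ "${FREE_GB}" -lt 500 ]; then
  echo "Warning: low free space; the full run may exhaust disk." >&2
fi
```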
Verify and use the model
After training completes, checkpoints and the tokenizer are under ~/.cache/nanochat/ (or the cache path used in launch.sh). Run inference from the nanochat directory (e.g. assets/nanochat) on your DGX Station.
Web UI (recommended):
cd nanochat
source ../.venv/bin/activate # only if running outside the container with a local venv; otherwise run these commands inside the nanochat container
python -m scripts.chat_web
Open a browser to http://<STATION_IP>:8000 where <STATION_IP> is your DGX Station’s IP address.
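If you are unsure of the station's address, `hostname -I` usually lists it (an assumption about the host's network configuration; the sketch falls back to localhost):

```shell
# Print the URL to open; assumes the first address from `hostname -I`
# is the station's reachable IP, falling back to localhost.
STATION_IP=$(hostname -I 2>/dev/null | awk '{print $1}')
STATION_IP=${STATION_IP:-localhost}
echo "Web UI: http://${STATION_IP}:8000"
```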
CLI:
cd nanochat
python -m scripts.chat_cli -p "Why is the sky blue?"
python -m scripts.chat_cli -i sft -p "Write a haiku about machine learning"
A full report is generated at nanochat/report.md after the run. You can also monitor training at wandb.ai under your project.
Cleanup
To stop training early, interrupt the launch script or stop the container:
WARNING
This stops the training run and any in-progress work in the container.
# If launch.sh or launch_full.sh is running: press Ctrl+C
# Or stop the container by name
docker stop $(docker ps -q --filter ancestor=nanochat)
To free disk space after training (adjust the paths below if you set NANOCHAT_CACHE or HF_CACHE to custom locations):
rm -rf ./nanochat_cache ./hf_cache
docker system prune -a
Next steps and customization
- Small-scale run: ./launch.sh can run a lite training; follow the customization guide to make changes to speedrun_station.sh. This can substantially reduce training time.
- Custom cache paths: Set NANOCHAT_CACHE and HF_CACHE before launching (e.g. export NANOCHAT_CACHE=/path/to/nanochat_cache) if you want the cache outside the assets directory.
- Monitoring: Use nvidia-smi and the W&B dashboards to watch GPU utilization and training metrics (loss, throughput).
- Inference: Try the web UI and CLI with different checkpoints (base, mid, sft) and prompts; see sample prompts in assets/README.md.
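Putting the custom-cache option into practice, a minimal sketch (the paths below are illustrative; substitute your own):

```shell
# Illustrative: point nanochat's caches at directories outside the repo,
# then launch. Paths here are examples only.
export NANOCHAT_CACHE="$(pwd)/nanochat_cache"
export HF_CACHE="$(pwd)/hf_cache"
mkdir -p "${NANOCHAT_CACHE}" "${HF_CACHE}"
echo "Caches: ${NANOCHAT_CACHE} ${HF_CACHE}"
# ./launch.sh   # start the run with these caches
```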