Basic idea
LM Studio is an application for discovering, running, and serving large language models entirely on your own hardware. You can run local LLMs such as gpt-oss, Qwen3, Gemma3, DeepSeek, and many others, privately and for free.
This playbook shows you how to deploy LM Studio on an NVIDIA DGX Spark device to run LLMs locally with GPU acceleration. Running LM Studio on DGX Spark enables Spark to act as your own private, high-performance LLM server.
LM Link (optional) lets you use your Spark’s models from another machine as if they were local. You can link your DGX Spark and your laptop (or other devices) over an end-to-end encrypted connection, so you can load and run models on the Spark from your laptop without being on the same LAN or opening network access. See LM Link and Step 3b in the Instructions.
What you'll accomplish
You'll deploy LM Studio on an NVIDIA DGX Spark device to run gpt-oss 120B, and use the model from your laptop. More specifically, you will:
- Install llmster, a fully headless, terminal-native build of LM Studio, on the Spark
- Run LLM inference locally on DGX Spark via API
- Interact with models from your laptop using the LM Studio SDK
- Optionally use LM Link to connect Spark and laptop over an encrypted link so remote models appear as local (no same-network or bind setup required)
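Once the server is running on the Spark, interacting with it via API looks like the following minimal Python sketch. It assumes LM Studio's OpenAI-compatible endpoint on port 1234 (the default); the host placeholder and the model key `openai/gpt-oss-120b` are examples you would adjust for your setup.

```python
import json
import urllib.request

# Placeholder host: use your Spark's IP, or "localhost" when using LM Link.
SPARK_HOST = "localhost"
BASE_URL = f"http://{SPARK_HOST}:1234/v1/chat/completions"

def build_request(prompt, model="openai/gpt-oss-120b"):
    """Build an OpenAI-style chat completion payload for LM Studio."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask(prompt):
    """Send a prompt to the server and return the reply text."""
    data = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        BASE_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (requires the server to be running with a model loaded):
# reply = ask("Say hello from DGX Spark")
```

This uses only the standard library; the LM Studio SDK offers a higher-level interface for the same server.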
What to know before starting
- Set Up Local Network Access to your DGX Spark device
- Working with terminal/command line interfaces
- Understanding of REST API concepts
Prerequisites
Hardware Requirements:
- DGX Spark device with ARM64 processor and Blackwell GPU architecture
- Minimum 65GB of GPU memory (70GB or more recommended)
- At least 65GB of available storage (70GB or more recommended)
Software Requirements:
- NVIDIA DGX OS
- Client device (Mac, Windows, or Linux)
- Laptop and DGX Spark on the same local network (not required if you use LM Link)
- Network access to download packages and models
LM Link (optional)
LM Link lets you use your local models remotely. You link machines (e.g. your DGX Spark and your laptop), then load models on the Spark and use them from the laptop as if they were local.
- End-to-end encrypted — Built on Tailscale mesh VPNs; devices are not exposed to the public internet.
- Works with the local server — Any tool that connects to LM Studio’s local API (e.g. localhost:1234) can use models from your Link, including Codex, Claude Code, OpenCode, and the LM Studio SDK.
- Preview — Free for up to 2 users, 5 devices each (10 devices total). Create your Link at lmstudio.ai/link.
If you use LM Link, you can skip binding the server to 0.0.0.0 and using the Spark’s IP; once devices are linked, point your laptop at localhost:1234 and remote models appear in the model loader.
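To check that linked models are visible, you can query the local server's model list. The sketch below assumes an OpenAI-style `/v1/models` response on the default port; the model ids shown in the test are illustrative examples, not guaranteed names.

```python
import json
import urllib.request

def model_ids(payload):
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]

def list_remote_models(base="http://localhost:1234/v1"):
    """List models visible through the laptop's local server.

    With LM Link active, models loaded on the Spark appear here too,
    so no Spark IP or 0.0.0.0 bind is needed.
    """
    with urllib.request.urlopen(f"{base}/models") as resp:
        return model_ids(json.load(resp))

# Usage (requires the local server to be running):
# print(list_remote_models())
```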
Ancillary files
All required assets can be found below. These sample scripts can be used in Step 6 of the Instructions.
- run.js - JavaScript script for sending a test prompt to Spark
- run.py - Python script for sending a test prompt to Spark
- run.sh - Bash script for sending a test prompt to Spark
Time & risk
- Estimated time: 15-30 minutes (including model download time, which may vary depending on your internet connection and the model size)
- Risk level: Low
- Large model downloads may take significant time depending on network speed
- Rollback:
- Downloaded models can be removed manually from the models directory.
- Uninstall LM Studio or llmster
- Last Updated: 03/12/2026
- Latest change: added instructions for LM Link features