LM Studio is an application for discovering, running, and serving large language models entirely on your own hardware. You can run local LLMs like gpt-oss, Qwen3, Gemma3, DeepSeek, and many more models privately and for free.
This playbook shows you how to deploy LM Studio on an NVIDIA DGX Spark device to run LLMs locally with GPU acceleration. Running LM Studio on DGX Spark enables the Spark to act as your own private, high-performance LLM server.
LM Link (optional) lets you use your Spark’s models from another machine as if they were local. You can link your DGX Spark and your laptop (or other devices) over an end-to-end encrypted connection, so you can load and run models on the Spark from your laptop without being on the same LAN or opening network access. See LM Link and Step 3b in the Instructions.
You'll deploy LM Studio on an NVIDIA DGX Spark device to run Nemotron 3 Nano Omni (nvidia/nemotron-3-nano-omni), and use the model from your laptop. More specifically, you will:
Hardware Requirements:
Software Requirements:
To explore all supported models in LM Studio, check out the LM Studio model catalog page.
| Model | Support Status | Model Path |
|---|---|---|
| Nemotron 3 Nano Omni | ✅ | nvidia/nemotron-3-nano-omni |
| Qwen3.6-35B-A3B | ✅ | qwen/qwen3.6-35b-a3b |
| GPT-OSS-120B | ✅ | openai/gpt-oss-120b |
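Once one of the models above is downloaded and the LM Studio server is running, you can query it through the server's OpenAI-compatible chat-completions endpoint. The sketch below is a minimal example using only the Python standard library; it assumes the server is listening on its default address (`localhost:1234`) and that the model path matches the table.

```python
import json
from urllib import request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default local server address


def build_chat_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build an OpenAI-style chat-completions request for LM Studio's server."""
    payload = {
        "model": model,  # model path as listed in the table above
        "messages": [{"role": "user", "content": prompt}],
    }
    return f"{BASE_URL}/chat/completions", json.dumps(payload).encode()


def ask(model: str, prompt: str) -> str:
    """POST the request and return the assistant's reply text."""
    url, body = build_chat_request(model, prompt)
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask("nvidia/nemotron-3-nano-omni", "Say hello in one sentence."))
```

Because the endpoint follows the OpenAI API shape, the same call works unchanged with any OpenAI-compatible client library pointed at the same base URL.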
LM Link lets you use your local models remotely. You link machines (e.g. your DGX Spark and your laptop), then load models on the Spark and use them from the laptop as if they were local.
Any OpenAI-compatible client pointed at LM Studio's local server (localhost:1234) can use models from your Link, including Codex, Claude Code, OpenCode, and the LM Studio SDK. If you use LM Link, you can skip binding the server to 0.0.0.0 and using the Spark's IP; once devices are linked, point your laptop at localhost:1234 and remote models appear in the model loader.
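To check which models are visible from your laptop, including any served over LM Link, you can list the server's `/v1/models` endpoint. This is a minimal stdlib-only sketch assuming the local server is running on its default port:

```python
import json
from urllib import request

SERVER = "http://localhost:1234"  # default local LM Studio server address


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]


def list_models(base: str = SERVER) -> list[str]:
    """List every model this server can see; once devices are linked,
    models loaded on the Spark appear here alongside local ones."""
    with request.urlopen(f"{base}/v1/models") as resp:
        return parse_model_ids(json.loads(resp.read()))


if __name__ == "__main__":
    for model_id in list_models():
        print(model_id)
```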
All required assets can be found below. These sample scripts can be used in Step 7 of the Instructions.