LM Studio is an application for discovering, running, and serving large language models entirely on your own hardware. You can run local LLMs like gpt-oss, Qwen3, Gemma3, DeepSeek, and many more models privately and for free.
This playbook shows you how to deploy LM Studio on an NVIDIA DGX Spark device to run LLMs locally with GPU acceleration. Running LM Studio on DGX Spark enables the Spark to act as your own private, high-performance LLM server.
LM Link (optional) lets you use your Spark’s models from another machine as if they were local. You can link your DGX Spark and your laptop (or other devices) over an end-to-end encrypted connection, so you can load and run models on the Spark from your laptop without being on the same LAN or opening network access. See LM Link and Step 3b in the Instructions.
You'll deploy LM Studio on an NVIDIA DGX Spark device to run Nemotron 3 Nano Omni (nvidia/nemotron-3-nano-omni), and use the model from your laptop. More specifically, you will:
Hardware Requirements:
Software Requirements:
To explore all supported models in LM Studio, check out the LM Studio model catalog page.
| Model | Support Status | Model Path |
|---|---|---|
| Nemotron 3 Nano Omni | ✅ | nvidia/nemotron-3-nano-omni |
| Qwen3.6-35B-A3B | ✅ | qwen/qwen3.6-35b-a3b |
| GPT-OSS-120B | ✅ | openai/gpt-oss-120b |
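Once one of the models above is downloaded and the LM Studio server is running, you can query it through the server's OpenAI-compatible chat-completions endpoint. The sketch below is a minimal example using only the Python standard library; it assumes the server is listening on its default address (`localhost:1234`) and that the model path matches the table.

```python
import json
from urllib import request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default local server address


def build_chat_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build an OpenAI-style chat-completions request for LM Studio's server."""
    payload = {
        "model": model,  # model path as listed in the table above
        "messages": [{"role": "user", "content": prompt}],
    }
    return f"{BASE_URL}/chat/completions", json.dumps(payload).encode()


def ask(model: str, prompt: str) -> str:
    """POST the request and return the assistant's reply text."""
    url, body = build_chat_request(model, prompt)
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask("nvidia/nemotron-3-nano-omni", "Say hello in one sentence."))
```

Because the endpoint follows the OpenAI API shape, the same call works unchanged with any OpenAI-compatible client library pointed at the same base URL.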
LM Link lets you use your local models remotely. You link machines (e.g. your DGX Spark and your laptop), then load models on the Spark and use them from the laptop as if they were local.
Any OpenAI-compatible client pointed at LM Studio's local server (localhost:1234) can use models from your Link, including Codex, Claude Code, OpenCode, and the LM Studio SDK. If you use LM Link, you can skip binding the server to 0.0.0.0 and using the Spark's IP; once devices are linked, point your laptop at localhost:1234 and remote models appear in the model loader.
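To check which models are visible from your laptop, including any served over LM Link, you can list the server's `/v1/models` endpoint. This is a minimal stdlib-only sketch assuming the local server is running on its default port:

```python
import json
from urllib import request

SERVER = "http://localhost:1234"  # default local LM Studio server address


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]


def list_models(base: str = SERVER) -> list[str]:
    """List every model this server can see; once devices are linked,
    models loaded on the Spark appear here alongside local ones."""
    with request.urlopen(f"{base}/v1/models") as resp:
        return parse_model_ids(json.loads(resp.read()))


if __name__ == "__main__":
    for model_id in list_models():
        print(model_id)
```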
All required assets can be found below. These sample scripts can be used in Step 7 of the Instructions.