
DeepSeek-R1-Distill-Qwen-14B is a distilled version of Qwen 2.5 14B, fine-tuned on reasoning data generated by DeepSeek R1 for enhanced performance. The following are the key system requirements and supported features to consider when self-hosting the deepseek-r1-distill-qwen-14b model.

| Precision | Minimum GPU Memory | Recommended GPU Memory |
|---|---|---|
| fp8 | 14 GB | 18 GB |
| bf16 | 29 GB | 35 GB |
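The minimum figures in the table are dominated by the model weights themselves, while the recommended figures add headroom for the KV cache and activations. A rough sanity check, assuming about 14.8 billion parameters (the published Qwen 2.5 14B size, which is an assumption here, not taken from this page):

```python
# Back-of-the-envelope check of the memory table: weight memory is
# roughly parameter_count * bytes_per_weight.
PARAMS = 14.8e9  # assumed parameter count for the 14B model

def weight_memory_gb(bytes_per_weight: float) -> float:
    """Approximate GPU memory for the weights alone, in GiB."""
    return PARAMS * bytes_per_weight / 1024**3

fp8_gb = weight_memory_gb(1)   # fp8 stores 1 byte per weight
bf16_gb = weight_memory_gb(2)  # bf16 stores 2 bytes per weight
```

This yields roughly 14 GB for fp8 and 28 GB for bf16, in line with the minimum-memory column above.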
Deploying this NIM with less than the recommended amount of GPU memory requires setting the environment variable `NIM_RELAX_MEM_CONSTRAINTS=1`.
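As a sketch, the variable can be passed to the container at launch. The image path and tag below are assumptions for illustration, not taken from this page; check the NIM catalog for the exact values.

```shell
# Hypothetical deployment command; image path/tag are assumptions.
# NIM_RELAX_MEM_CONSTRAINTS=1 permits running with less than the
# recommended GPU memory.
docker run --gpus all \
  -e NGC_API_KEY \
  -e NIM_RELAX_MEM_CONSTRAINTS=1 \
  -p 8000:8000 \
  nvcr.io/nim/deepseek-ai/deepseek-r1-distill-qwen-14b:latest
```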
| Feature | Supported |
|---|---|
| LoRA Customization | ❌ |
| Fine-tuning Customization | ✅ |
| Tool Calling | ❌ |
| TensorRT-LLM Local Engine Building | ✅ |
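Once deployed, the NIM serves an OpenAI-compatible chat completions endpoint. A minimal sketch of building a request, assuming the default port 8000 and the model name `deepseek-ai/deepseek-r1-distill-qwen-14b` (both assumptions, not stated on this page); note that no `tools` field is included, since tool calling is unsupported for this model:

```python
# Minimal sketch of an OpenAI-compatible chat completion request to a
# locally running NIM. The URL and model name are assumptions.
import json
import urllib.request

URL = "http://localhost:8000/v1/chat/completions"

def build_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": "deepseek-ai/deepseek-r1-distill-qwen-14b",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Why is the sky blue?")
# Send with urllib.request.urlopen(req) once the NIM is up.
```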