---
title: "deepseek-r1-distill-qwen-14b"
publisher: "deepseek-ai"
type: "endpoint"
updated: "2025-05-22T19:14:30.187Z"
description: "Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance."
canonical: "https://build.nvidia.com/deepseek-ai/deepseek-r1-distill-qwen-14b"
---

## **Model Overview**

### **Background** 

DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning tasks. Through RL, numerous powerful and interesting reasoning behaviors emerged naturally in DeepSeek-R1-Zero. However, DeepSeek-R1-Zero suffers from endless repetition, poor readability, and language mixing. DeepSeek-R1 addresses these issues and further improves reasoning performance by incorporating cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

### **Description**

DeepSeek-R1-Distill-Qwen-14B is a distilled model in the DeepSeek-R1 series, built on the Qwen2.5-14B architecture. It is designed for reasoning, math, and code tasks. By distilling knowledge from the larger DeepSeek-R1 model, it aims to retain strong reasoning accuracy at reduced computational cost.

This model is ready for commercial use. For more details, visit the [DeepSeek website](https://www.deepseek.com/).

## **Third-Party Community Consideration**

This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to the [DeepSeek-R1-Distill-Qwen-14B Model Card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B).

## **License/Terms of Use**

Governing NVIDIA Download Terms: The NIM container is governed by the [NVIDIA Software License Agreement](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf) and [Product-Specific Terms](https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/) for AI Products; and the use of this model is governed by the [NVIDIA Community Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-community-models-license/). ADDITIONAL INFORMATION: [MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE) and [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-1.5B/blob/main/LICENSE).

**Model Developer**  
DeepSeek AI

## **Model Architecture**

Architecture Type: Dense decoder-only Transformer, distilled from DeepSeek-R1 (a Mixture of Experts model)  
Network Architecture: Qwen  
Version: 2.5

### **Input**

**Input Type:** Text  
**Input Format:** String  
**Input Parameters:** 1D  
**Other Properties Related to Input:**  
DeepSeek recommends the following settings when using DeepSeek-R1 series models, including for benchmarking, to achieve the expected performance:

1. Set the temperature within the range of 0.5-0.7 (0.6 is recommended) to prevent endless repetitions or incoherent outputs.  
2. Avoid adding a system prompt; all instructions should be contained within the user prompt.  
3. For mathematical problems, it is advisable to include a directive in your prompt such as: "Please reason step by step, and put your final answer within \\boxed{}."  
4. When evaluating model performance, it is recommended to conduct multiple tests and average the results.
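The recommendations above can be sketched as a small request builder. This is a minimal illustration, not part of DeepSeek's or NVIDIA's tooling: the names `build_request`, `average_score`, and `MATH_DIRECTIVE` are hypothetical, and the returned dictionary simply holds keyword arguments that a chat-completions client would accept.

```python
from statistics import mean

# Directive quoted from recommendation 3.
MATH_DIRECTIVE = "Please reason step by step, and put your final answer within \\boxed{}."


def build_request(user_prompt, instructions="", is_math=False, temperature=0.6):
    """Build chat-completion keyword arguments that follow the recommendations.

    Hypothetical helper for illustration only.
    """
    # Recommendation 1: keep the temperature within 0.5-0.7.
    if not 0.5 <= temperature <= 0.7:
        raise ValueError("temperature should stay within 0.5-0.7")
    # Recommendation 2: no system message; fold instructions into the user prompt.
    parts = [p for p in (instructions, user_prompt) if p]
    # Recommendation 3: append the step-by-step directive for math problems.
    if is_math:
        parts.append(MATH_DIRECTIVE)
    return {
        "messages": [{"role": "user", "content": "\n\n".join(parts)}],
        "temperature": temperature,
    }


def average_score(run_scores):
    """Recommendation 4: average a metric over several independent runs."""
    return mean(run_scores)


request = build_request("What is 17 * 24?", instructions="Answer concisely.", is_math=True)
print(request["messages"][0]["content"])
print(average_score([0.5, 0.75, 1.0]))
```

The returned dictionary can be unpacked into a client call together with a model name, for example `client.chat.completions.create(model=..., **request)`.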

Additionally, the DeepSeek-R1 series models tend to bypass the thinking pattern (i.e., output "\<think\>\\n\\n\</think\>") when responding to certain queries, which can adversely affect the model's performance. To ensure that the model engages in thorough reasoning, DeepSeek recommends forcing the model to begin every response with "\<think\>\\n".
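One way to work with the thinking pattern on the client side is sketched below. The helper names (`split_reasoning`, `bypassed_thinking`, `with_think_prefix`) are hypothetical. The sketch assumes the reasoning appears in a single `<think>...</think>` block at the start of the output, and that the prefix can only be forced when you control the raw prompt text (for example, with a locally hosted model); whether a hosted chat endpoint allows this is not stated here.

```python
import re

THINK_PREFIX = "<think>\n"
# Matches one reasoning block at the very start of the output.
THINK_BLOCK = re.compile(r"^\s*<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is None when no block is found."""
    match = THINK_BLOCK.match(text)
    if match is None:
        return None, text.strip()
    return match.group(1).strip(), text[match.end():].strip()


def bypassed_thinking(text):
    """True when the output has no reasoning block or only an empty one."""
    reasoning, _ = split_reasoning(text)
    return not reasoning


def with_think_prefix(prompt_text):
    """Append the recommended prefix to an already-templated prompt string.

    Assumes raw (non-chat) prompting where the caller controls the full prompt.
    """
    if prompt_text.endswith(THINK_PREFIX):
        return prompt_text
    return prompt_text + THINK_PREFIX


print(bypassed_thinking("<think>\n\n</think>\n\nThe answer is 4."))
print(split_reasoning("<think>\n2 + 2 = 4\n</think>\n\n4"))
```

A caller could use `bypassed_thinking` to flag or retry responses whose reasoning block is empty, and `split_reasoning` to show only the final answer to end users.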

### **Output**

**Output Type:** Text  
**Output Format:** String  
**Output Parameters:** 1D

## **Software Integration**

**Runtime Engine:** TensorRT-LLM  
**Supported Hardware Microarchitecture Compatibility:** NVIDIA Hopper, NVIDIA Lovelace  
**Preferred/Supported Operating System(s):** Linux

### **Training Dataset**

**Data Collection Method by dataset:** Automated  
**Labeling Method by dataset:** Automated  
**Properties:** 800k samples curated with DeepSeek-R1

### **Testing Dataset**

**Data Collection Method by dataset:** Automated. Reasoning data generated by DeepSeek-R1.  
**Labeling Method by dataset:** Automated

### **Evaluation Dataset**

Please see the Evaluation section of the [DeepSeek-R1-Distill-Qwen-14B Model Card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) for more information.  
**Data Collection Method by dataset:** Hybrid: Human, Automated  
**Labeling Method by dataset:** Hybrid: Human, Automated

## **Inference**

**Engine:** TensorRT-LLM  
**Test Hardware:** H20, L20

## **Ethical Considerations**

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

## **Model Limitations**

The base model was trained on data originally crawled from the internet that contains toxic language and societal biases. Therefore, the model may amplify those biases and return toxic responses, especially when given toxic prompts. The model may generate answers that are inaccurate, omit key information, or include irrelevant or redundant text, and it may produce socially unacceptable or undesirable text even if the prompt itself does not include anything explicitly offensive.

**You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.**

## Prototype

```python
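import os

from openai import OpenAI

# Read the API key from the environment instead of hard-coding it.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

completion = client.chat.completions.create(
    # Model ID taken from this page's catalog path.
    model="deepseek-ai/deepseek-r1-distill-qwen-14b",
    # No system message: all instructions go in the user prompt.
    messages=[{"role": "user", "content": "Which number is larger, 9.11 or 9.8?"}],
    temperature=0.6,  # recommended value (range 0.5-0.7)
    top_p=0.7,  # example value, not specified on this page
    max_tokens=4096,  # example value, not specified on this page
    stream=False,
)

print(completion.choices[0].message.content)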
```

```python
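import os

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Read the API key from the environment instead of hard-coding it.
client = ChatNVIDIA(
    model="deepseek-ai/deepseek-r1-distill-qwen-14b",  # from this page's catalog path
    api_key=os.environ["NVIDIA_API_KEY"],
    temperature=0.6,  # recommended value (range 0.5-0.7)
    top_p=0.7,  # example value, not specified on this page
    max_tokens=4096,  # example value, not specified on this page
)

response = client.invoke([{"role": "user", "content": "Which number is larger, 9.11 or 9.8?"}])
print(response.content)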
```

```javascript
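import OpenAI from 'openai';

// Read the API key from the environment instead of hard-coding it.
const openai = new OpenAI({
  apiKey: process.env.NVIDIA_API_KEY,
  baseURL: 'https://integrate.api.nvidia.com/v1',
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: 'deepseek-ai/deepseek-r1-distill-qwen-14b', // from this page's catalog path
    messages: [{ role: 'user', content: 'Which number is larger, 9.11 or 9.8?' }],
    temperature: 0.6, // recommended value (range 0.5-0.7)
    top_p: 0.7, // example value, not specified on this page
    max_tokens: 4096, // example value, not specified on this page
    stream: false,
  });

  // Fall back to an empty string so write() never receives undefined.
  process.stdout.write(completion.choices[0]?.message?.content ?? '');
}

main();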
```

```bash
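invoke_url='https://integrate.api.nvidia.com/v1/chat/completions'

# Expects NVIDIA_API_KEY to be set in the environment.
authorization_header="Authorization: Bearer $NVIDIA_API_KEY"
accept_header='Accept: application/json'
content_type_header='Content-Type: application/json'

# temperature follows the recommended 0.6; top_p and max_tokens are example values.
data=$'{
  "model": "deepseek-ai/deepseek-r1-distill-qwen-14b",
  "messages": [
    {
      "role": "user",
      "content": "Which number is larger, 9.11 or 9.8?"
    }
  ],
  "temperature": 0.6,
  "top_p": 0.7,
  "max_tokens": 4096,
  "stream": false
}'

response=$(curl --silent -i -w "\n%{http_code}" --request POST \
  --url "$invoke_url" \
  --header "$authorization_header" \
  --header "$accept_header" \
  --header "$content_type_header" \
  --data "$data"
)

echo "$response"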
```