---
title: "deepseek-v3.1-terminus"
publisher: "deepseek-ai"
type: "endpoint"
updated: "2025-10-07T00:01:10.196Z"
description: "DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling."
canonical: "https://build.nvidia.com/deepseek-ai/deepseek-v3_1-terminus"
---

# DeepSeek-V3.1-Terminus Overview

## **Model Summary**

DeepSeek-V3.1-Terminus is an updated checkpoint of the DeepSeek-V3 family and refines model stability, multilingual consistency, and agent behavior. It improves agent and code/search capabilities while addressing mixed-language and character issues seen in earlier checkpoints.

This model is ready for commercial/non-commercial use.

**Third-Party Community Consideration:** <br>
This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party's requirements for this application and use case; see link to [Non-NVIDIA DeepSeek-V3.1-Terminus Model Card](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus).

**GOVERNING TERMS:** The trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/). Additional Information: [MIT](https://choosealicense.com/licenses/mit/).

### Deployment Geography:
Global <br>

### Use Case:
Reasoning, search/agent workflows, code assistance, multilingual QA, and knowledge retrieval for research and enterprise applications.

### Release Date:
build.nvidia.com 10/06/2025 via [link](https://build.nvidia.com/deepseek-ai/deepseek-v3_1-terminus) <br>
Huggingface 08/21/2025 via [link](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus) <br>

### Reference(s):
References: [Link](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus)

## **Model Architecture**

**Architecture Type:** Transformer <br>
**Network Architecture:** DeepSeek-V3 Mixture-of-Experts variant with MLA <br>
**Total Parameters:** ~685B<br>
**Base Model:** DeepSeek-V3 family <br>

## **Input**

**Input Types:** Text <br>
**Input Formats:** String <br>
**Input Parameters:** One Dimensional (1D) <br>
**Other Input Properties:** Undisclosed <br>

## **Output**

**Output Types:** Text <br>
**Output Format:** String <br>
**Output Parameters:** One Dimensional (1D) <br>
**Other Output Properties:** Undisclosed <br>

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions. 

## **Software Integration**

**Runtime Engines:** <br>
SGLang

**Supported Hardware Microarchitecture Compatibility:** <br>
NVIDIA Blackwell <br>
NVIDIA Hopper <br>

**Operating Systems:** Linux <br>
The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.

## **Model Version(s)**

**Model Version(s)** <br>
DeepSeek-V3.1-Terminus <br>

## **Training, Testing, and Evaluation Datasets**

### Training Dataset
**Training Data Collection:** Undisclosed <br>
**Training Labeling:** Undisclosed <br>
**Training Properties:** Undisclosed <br>
**Data Modality:** Text
**Text Training Data Size:** [1 Billion to 10 Trillion Tokens]

### Testing Dataset
**Testing Data Collection:** Undisclosed <br>
**Testing Labeling:** Undisclosed <br>
**Testing Properties:** Undisclosed <br>

### Evaluation Dataset
**Evaluation Benchmark Score:** Undisclosed <br>
**Evaluation Data Collection:** Undisclosed <br>
**Evaluation Labeling:** Undisclosed <br>
**Evaluation Properties:** Undisclosed <br>

## **Inference**

**Acceleration Engine:** SGLang <br>
**Test Hardware:** B200 <br>

## **Ethical Considerations**

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/)

## Prototype

```python
from openai import OpenAI

client = OpenAI(
base_url = "https://integrate.api.nvidia.com/v1",
api_key = "$NVIDIA_API_KEY"
)

completion = client.chat.completions.create(
model="",
messages=[{"role":"user","content":""}],
temperature=,
top_p=,
max_tokens=,

stream=NaN
)

print(completion.choices[0].message.content)
```

```python
from langchain_nvidia_ai_endpoints import ChatNVIDIA

client = ChatNVIDIA(
model="",
api_key="$NVIDIA_API_KEY", 
temperature=,
top_p=,
max_tokens=,

)

response = client.invoke([{"role":"user","content":""}])
print(response.content)
```

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
apiKey: '$NVIDIA_API_KEY',
baseURL: 'https://integrate.api.nvidia.com/v1',
})

async function main() {
const completion = await openai.chat.completions.create({
model: "",
messages: [{"role":"user","content":""}],
temperature: ,
top_p: ,
max_tokens: ,

stream: 
})

process.stdout.write(completion.choices[0]?.message?.content);

}

main();
```

```bash
invoke_url='https://integrate.api.nvidia.com/v1/chat/completions'

authorization_header='Authorization: Bearer '
accept_header='Accept: application/json'
content_type_header='Content-Type: application/json'

data=$'{
"messages": [
{
"role": "user",
"content": ""
}
]
}'

response=$(curl --silent -i -w "\n%{http_code}" --request POST \
--url "$invoke_url" \
--header "$authorization_header" \
--header "$accept_header" \
--header "$content_type_header" \
--data "$data"
)

echo "$response"
```