---
title: "mixtral-8x7b-instruct-v0.1"
publisher: "mistralai"
type: "endpoint"
updated: "2025-07-18T20:14:20.037Z"
description: "An MOE LLM that follows instructions, completes requests, and generates creative text."
canonical: "https://build.nvidia.com/mistralai/mixtral-8x7b-instruct"
---

# Model Overview

## Description:

Mixtral 8x7B Instruct is a language model that can follow instructions, complete requests, and generate creative text formats. Mixtral 8x7B is a high-quality sparse mixture-of-experts (SMoE) model with open weights.<br>
This model has been optimized through supervised fine-tuning and direct preference optimization (DPO) for careful instruction following. On MT-Bench, it reaches a score of 8.30, which made it the best open-source model at the time of its release, with performance comparable to GPT-3.5.<br>
Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference. At release, it was the strongest open-weight model with a permissive license and the best model overall in terms of cost/performance trade-offs. In particular, it matches or outperforms GPT-3.5 on most standard benchmarks.<br>
Mixtral has the following capabilities:

* It gracefully handles a context of 32k tokens.
* It handles English, French, Italian, German and Spanish.
* It shows strong performance in code generation.
* It can be finetuned into an instruction-following model that achieves a score of 8.3 on MT-Bench.

## Third-Party Community Consideration:

This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see [Mistral's 8x7B Instruct Hugging Face Model Card](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).

## Terms of use

By using this software or model, you are agreeing to the [terms and conditions](https://mistral.ai/terms-of-service/) of the license, the acceptable use policy, and Mistral's privacy policy. Mixtral-8x7B is released under the Apache 2.0 license.

## Reference(s):

Mixtral 8x7B Instruct [Model Card](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on Hugging Face <br>
[Mixtral of experts | Mistral AI | Open source models](https://mistral.ai/news/mixtral-of-experts/) <br>

## Model Architecture:

**Architecture Type:** Transformer <br>
**Network Architecture:** Sparse Mixture of GPT-based experts <br>
**Model Version:** 0.1 <br>
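
To make the "sparse mixture of experts" idea concrete, the sketch below shows top-k routing for a single token in plain Python. It assumes 8 experts with 2 selected per token, as described in the Mistral announcement referenced above; the function names, the toy experts, and the router logits are hypothetical and are not taken from the model's actual implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest router logits (ties keep index order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Normalize only over the selected experts so their weights sum to 1.
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(x, router_logits, experts, k=2):
    """Weighted sum of the outputs of the selected experts for one token."""
    out = [0.0] * len(x)
    for idx, w in top_k_route(router_logits, k):
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + w * yi for o, yi in zip(out, y)]
    return out


# Toy setup (hypothetical): 8 "experts" that scale the input by 1..8.
experts = [lambda v, s=s: [s * vi for vi in v] for s in range(1, 9)]
logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]

routes = top_k_route(logits)
output = moe_layer([1.0, -2.0], logits, experts)
print(routes)
print(output)
```

Because only the selected experts run for each token, the compute per token is much smaller than the total parameter count would suggest, which is the motivation for the sparse design.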

## Input:
**Input Format:** Text <br>
**Input Parameters:** Temperature, Top P, Max Output Tokens<br>
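
As a rough illustration of what the Temperature and Top P parameters control, the following sketch applies the commonly used definitions of temperature scaling and nucleus (top-p) filtering to a toy set of logits. The exact sampling code behind this endpoint is not documented here, so treat the helper names and numbers as assumptions made for illustration only.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax of logits / temperature; lower values sharpen the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Toy logits for a 4-token vocabulary (hypothetical values).
logits = [2.0, 1.0, 0.5, -1.0]

sharp = apply_temperature(logits, 0.5)   # more peaked on the top token
flat = apply_temperature(logits, 2.0)    # closer to uniform
nucleus = top_p_filter(apply_temperature(logits, 1.0), 0.8)

print(sharp, flat, nucleus)
```

Max Output Tokens is simpler: it caps how many tokens the model may generate in a single response.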

## Output:
**Output Format:** Text <br>
**Output Parameters:** None <br>

## Software Integration:
**Supported Hardware Platform(s):** Hopper, Ampere, Turing, Ada <br>
**Supported Operating System(s):** Linux <br>

# Inference:

**Engine:** [Triton](https://developer.nvidia.com/triton-inference-server) <br>
**Test Hardware:** Other <br>

## Prototype

```python
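import os

from openai import OpenAI

# Read the API key from the environment rather than hard-coding it.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

# The prompt and sampling values below are illustrative placeholders; adjust as needed.
completion = client.chat.completions.create(
    model="mistralai/mixtral-8x7b-instruct-v0.1",
    messages=[{"role": "user", "content": "Write a limerick about GPU computing."}],
    temperature=0.5,
    top_p=1,
    max_tokens=1024,
    stream=False,  # the print below expects a single, non-streamed response
)

print(completion.choices[0].message.content)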
```

```python
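import os

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# The prompt and sampling values below are illustrative placeholders; adjust as needed.
client = ChatNVIDIA(
    model="mistralai/mixtral-8x7b-instruct-v0.1",
    api_key=os.environ["NVIDIA_API_KEY"],  # read the key from the environment
    temperature=0.5,
    top_p=1,
    max_tokens=1024,
)

response = client.invoke([{"role": "user", "content": "Write a limerick about GPU computing."}])
print(response.content)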
```

```javascript
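import OpenAI from 'openai';

// Read the API key from the environment rather than hard-coding it.
const openai = new OpenAI({
  apiKey: process.env.NVIDIA_API_KEY,
  baseURL: 'https://integrate.api.nvidia.com/v1',
});

async function main() {
  // The prompt and sampling values below are illustrative placeholders; adjust as needed.
  const completion = await openai.chat.completions.create({
    model: 'mistralai/mixtral-8x7b-instruct-v0.1',
    messages: [{ role: 'user', content: 'Write a limerick about GPU computing.' }],
    temperature: 0.5,
    top_p: 1,
    max_tokens: 1024,
    stream: false, // the write below expects a single, non-streamed response
  });

  process.stdout.write(completion.choices[0]?.message?.content ?? '');
}

main();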
```

```bash
# The prompt and sampling values below are illustrative placeholders; adjust as needed.
curl https://integrate.api.nvidia.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NVIDIA_API_KEY" \
  -d '{
    "model": "mistralai/mixtral-8x7b-instruct-v0.1",
    "messages": [{"role": "user", "content": "Write a limerick about GPU computing."}],
    "temperature": 0.5,
    "top_p": 1,
    "max_tokens": 1024,
    "stream": false
  }'
```
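
The examples above all send the same JSON body. As a minimal sketch, a small helper can build and sanity-check that body before it is sent. `build_chat_payload` is a hypothetical name, and both the default values and the accepted ranges are assumptions chosen for illustration, not limits documented for this endpoint.

```python
import json

# Model identifier taken from the curl example above.
MODEL_ID = "mistralai/mixtral-8x7b-instruct-v0.1"


def build_chat_payload(prompt, temperature=0.5, top_p=1.0,
                       max_tokens=1024, stream=False):
    """Hypothetical helper: validate inputs and build a chat completion body.

    The ranges checked here are assumptions, not documented endpoint limits.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "stream": stream,
    }


payload = build_chat_payload("Explain sparse mixture-of-experts in one sentence.")
body = json.dumps(payload)  # string suitable for the -d argument of curl
print(body)
```

Validating locally catches an empty prompt or a missing parameter before the request is made, which is the kind of gap that leaves a request body malformed.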