Copyright © 2025 NVIDIA Corporation

mistralai

mixtral-8x7b-instruct-v0.1

Run Anywhere

An MOE LLM that follows instructions, completes requests, and generates creative text.

Deploying your application in production? Get started with a 90-day evaluation of NVIDIA AI Enterprise

Follow the steps below to download and run the NVIDIA NIM inference microservice for this model on your infrastructure of choice.

Step 1
Get API Key and Install the NIM Operator

Install the NVIDIA GPU Operator first if the cluster does not already have it, then install the NIM Operator with Helm:

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
  && helm repo update
helm install nim-operator nvidia/k8s-nim-operator --create-namespace -n nim-operator

Step 2
Create Image Pull Secrets

kubectl create ns nim-service
kubectl create secret -n nim-service docker-registry ngc-secret \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password=<PASTE_API_KEY_HERE>
kubectl create secret -n nim-service generic ngc-api-secret \
  --from-literal=NGC_API_KEY=<PASTE_API_KEY_HERE>
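As an alternative to pasting the key into each command, the sketch below reads it from an environment variable and refuses to run while the variable is empty or still holds the placeholder. The function names check_ngc_api_key and create_ngc_secrets are hypothetical helpers, not part of kubectl or the NIM Operator; the kubectl invocations themselves mirror the commands above.

```shell
# Hypothetical helper: fail early if NGC_API_KEY is unset, empty,
# or still the literal placeholder from the instructions.
check_ngc_api_key() {
  case "${NGC_API_KEY:-}" in
    ""|"<PASTE_API_KEY_HERE>")
      echo "NGC_API_KEY is not set to a real key" >&2
      return 1
      ;;
  esac
  return 0
}

# Hypothetical wrapper: create both secrets from the same variable.
# Assumes the nim-service namespace already exists.
create_ngc_secrets() {
  check_ngc_api_key || return 1
  kubectl create secret -n nim-service docker-registry ngc-secret \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password="$NGC_API_KEY" || return 1
  kubectl create secret -n nim-service generic ngc-api-secret \
    --from-literal=NGC_API_KEY="$NGC_API_KEY"
}
```

With the key exported in the current shell (export NGC_API_KEY=...), running create_ngc_secrets creates both secrets. The single quotes around $oauthtoken are deliberate: the username is that literal string, not a shell variable.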

Step 3
Create a NIM Service

Ensure that a default StorageClass exists in the cluster. If none is present, create an appropriate StorageClass before proceeding.
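One way to check for a default StorageClass is to look for the "(default)" marker that kubectl prints next to the class name. The sketch below wraps that check in a hypothetical helper, has_default_storageclass, which reads the listing on standard input; it is an illustration under that assumption, not an operator feature.

```shell
# Hypothetical helper: reads `kubectl get storageclass` output on stdin
# and succeeds only if some class is marked "(default)".
has_default_storageclass() {
  grep -qF '(default)'
}

# Example use against a live cluster (left commented so the helper can be
# sourced without cluster access):
#   kubectl get storageclass | has_default_storageclass \
#     || echo "No default StorageClass found" >&2
#
# An existing class can be marked as default with the standard annotation:
#   kubectl patch storageclass <name> -p \
#     '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```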

NOTE:

  • Replace "model-size" with a PVC size suited to the model and GPU type, as described here.
  • Adjust nvidia.com/gpu: 1 to match the number of GPUs the model requires.

apiVersion: apps.nvidia.com/v1alpha1
kind: NIMService
metadata:
  name: mixtral-8x7b-instruct-v01
  namespace: nim-service
spec:
  image:
    repository: nvcr.io/nim/mistralai/mixtral-8x7b-instruct-v01
    tag: latest
    pullPolicy: IfNotPresent
    pullSecrets:
      - ngc-secret
  authSecret: ngc-api-secret
  storage:
    pvc:
      create: true
      size: "model-size"
      volumeAccessMode: "ReadWriteOnce"
  replicas: 1
  resources:
    limits:
      nvidia.com/gpu: 1
  expose:
    service:
      type: ClusterIP
      port: 8000
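The two values called out in the note can be filled in at apply time rather than by editing the file. The following is a minimal sketch, assuming the manifest has been saved as nimservice.yaml (a hypothetical file name) and that it still contains the literal "model-size" and nvidia.com/gpu: 1 entries. The render_manifest helper and the example values 100Gi and 2 are illustrative placeholders only, not recommended settings for this model.

```shell
# Hypothetical helper: render_manifest SIZE GPUS
# Reads the manifest on stdin and writes a copy on stdout with the PVC size
# placeholder and the GPU limit substituted.
render_manifest() {
  sed -e "s/\"model-size\"/\"$1\"/" \
      -e "s|nvidia.com/gpu: 1[[:space:]]*\$|nvidia.com/gpu: $2|"
}

# Example use against a live cluster (commented out; values are placeholders):
#   render_manifest 100Gi 2 < nimservice.yaml | kubectl apply -f -
```

Piping into kubectl apply -f - keeps the saved file unchanged, so the same manifest can be reused for a different GPU type by passing different arguments.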

Step 4
Test the Deployed NIM

Start a temporary pod with an interactive shell:

kubectl run --rm -it -n default curl --image=curlimages/curl:latest -- ash

Then, from the shell inside that pod, send a chat completion request to the service:

curl -X "POST" \
  'http://mixtral-8x7b-instruct-v01.nim-service:8000/v1/chat/completions' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "mistralai/mixtral-8x7b-instruct-v0-1",
    "messages": [
      {
        "content": "What should I do for a 4 day vacation at Cape Hatteras National Seashore?",
        "role": "user"
      }
    ],
    "top_p": 1,
    "n": 1,
    "max_tokens": 1024,
    "stream": false,
    "frequency_penalty": 0.0,
    "stop": ["STOP"]
  }'
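The request can fail if it is sent before the service has finished starting, for example while the model is still being downloaded into the PVC. A small polling helper is one way to wait for it. The sketch below is plain POSIX shell, so it should also work in the ash shell of the curl pod; wait_until is a hypothetical helper, and the /v1/health/ready path is an assumption about the NIM container's readiness endpoint that should be checked against the NIM documentation.

```shell
# Hypothetical helper: wait_until MAX_TRIES DELAY_SECONDS CMD [ARGS...]
# Reruns CMD until it succeeds or MAX_TRIES attempts have been made.
wait_until() {
  tries=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example use inside the curl pod (commented out; the endpoint path is an assumption):
#   wait_until 60 10 curl -sf \
#     http://mixtral-8x7b-instruct-v01.nim-service:8000/v1/health/ready
```

With the example arguments the helper makes at most 60 attempts, 10 seconds apart, and returns non-zero if the service never responds successfully.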

For more details on getting started with this NIM, visit the NVIDIA NIM Operator Docs.