---
title: "nv-embedcode-7b-v1"
publisher: "nvidia"
type: "endpoint"
updated: "2025-05-29T23:43:27.966Z"
description: "The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries."
canonical: "https://build.nvidia.com/nvidia/nv-embedcode-7b-v1"
---

# Model Overview

## Description:

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

Code retrieval is a critical task in many domains including coding assistance, code explanation, summarization, and documentation search. NV-EmbedCode transforms the input code or textual data into dense vector representations, known as embeddings, enabling effective retrieval and search.
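As a rough illustration of how embeddings enable retrieval, the sketch below ranks documents by cosine similarity to a query vector. The toy 3-dimensional vectors stand in for the model's 4096-dimensional outputs, and the helper names `cosine_similarity` and `rank_documents` are hypothetical; cosine similarity is a common choice for dense retrieval, not a scoring function this model card specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): real embeddings would come from the model.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly parallel to the query
    [-1.0, 0.0, 0.0],  # opposite direction
]
ranking = rank_documents(query, docs)
print(ranking)
```

In a real system the document vectors would be computed once and stored in a vector index, so that only the query is embedded at search time.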

This model is ready for commercial use.

NV-EmbedCode is part of NVIDIA's effort to provide state-of-the-art, commercially-ready models and microservices, optimized for the lowest latency and highest throughput. The models that form the core of this solution have been trained using responsibly selected, auditable data sources. <br>

## Intended use

The NV-EmbedCode model is most suitable for users who want to build a code retrieval system over a large text or code corpus, leveraging the latest dense retrieval technologies. <br>

### License/Terms of Use

The use of this model is governed by the [NVIDIA AI Foundation Models Community License Agreement](https://developer.nvidia.com/downloads/nv-ai-foundation-models-license) and the [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/).

Technology can have a profound impact on people and the world, and NVIDIA is committed to enabling trust and transparency in AI development. NVIDIA encourages users to adopt principles of AI ethics and trustworthiness to guide their business decisions by following the guidelines in the NVIDIA AI Foundation Models Community License Agreement. <br>

## Model Architecture:

**Architecture Type:** Transformer <br>
**Network Architecture:** Fine-tuned NVIDIA Retrieval QA Mistral 7B Embedding model <br>
**Embedding Dimension:** 4096 <br>
**Parameter Count:** 7.1 billion <br>

The NV-EmbedCode model is a transformer encoder: a fine-tuned version of the [NVIDIA Retrieval QA Mistral 7B Embedding model](https://build.nvidia.com/nvidia/nv-embedqa-mistral-7b-v2), with 32 layers and an embedding size of 4096, trained on public datasets. Mistral models are pre-trained with causal attention. Because our [research](https://arxiv.org/abs/2405.17428) demonstrated that bi-directional attention improves performance, the NV-Embed series of models uses bi-directional attention. Embedding models for retrieval are typically trained using a bi-encoder architecture. This involves encoding a query and its chunked passages independently using the embedding model. Contrastive learning is used to maximize the similarity between the query and its relevant (positive) passage, while minimizing the similarity to irrelevant (negative) passages.
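The contrastive objective described above can be sketched with an InfoNCE-style loss. This model card does not state the exact loss used in training, so the InfoNCE form, the dot-product similarity, the `temperature` value, and the function name `info_nce_loss` are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(query, positive, negatives, temperature=0.05):
    """Cross-entropy of picking the positive passage among all candidates.

    The loss is low when the query is more similar to the positive passage
    than to every negative passage.
    """
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp over all candidate logits.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]


# Toy unit vectors (assumption): real inputs would be model embeddings.
query = [1.0, 0.0]
aligned_loss = info_nce_loss(query, positive=[1.0, 0.0], negatives=[[0.0, 1.0]])
misaligned_loss = info_nce_loss(query, positive=[0.0, 1.0], negatives=[[1.0, 0.0]])
print(aligned_loss, misaligned_loss)
```

Minimizing this quantity pulls the query toward its positive passage and pushes it away from the negatives, which matches the behavior the paragraph describes.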

## Model Version(s):

NVIDIA Code Embedding v1

Short name: NV-EmbedCode-v1 <br>

### Input

**Input Type:** Code or text <br>
**Input Format:** List of strings (any list length, any string length) <br>
**Other Properties Related to Input:** The model was trained with documents of up to 512 tokens; however, similar to Mistral-7b, it has a theoretical attention span of approximately 131K tokens. <br>
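Because training used documents of up to 512 tokens, one option for longer files is to split them into overlapping windows before embedding. The sketch below is a minimal illustration under stated assumptions: whitespace splitting is only a rough proxy for the model's tokenizer, and the `overlap` value and the helper names `chunk_tokens` and `chunk_text` are hypothetical.

```python
def chunk_tokens(tokens, max_len=512, overlap=64):
    """Split a token list into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that content near a
    boundary appears in both neighbouring chunks.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("require max_len > 0 and 0 <= overlap < max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


def chunk_text(text, max_len=512, overlap=64):
    """Whitespace-based chunking; a stand-in for real tokenization."""
    return [" ".join(c) for c in chunk_tokens(text.split(), max_len, overlap)]


example_chunks = chunk_tokens(list(range(1000)))
print([len(c) for c in example_chunks])
```

Each chunk can then be embedded as a separate passage; how chunk-level results are merged back to file level is left to the application.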

### Output

**Output Type:** Floats <br>
**Output Format:** List of float arrays (same length as input list, 4096 dimensions per float array) <br>
**Other Properties Related to Output:** Model outputs embedding vectors of dimension 4096 for each text string. <br>

# Training Dataset & Evaluation:

## Training Dataset:

Our training dataset is a carefully curated blend of multiple sources. It includes publicly available code retrieval datasets with commercial licenses, issue description–code pairs sourced from public GitHub repositories with commercial licenses, and synthetic data generated in response to coding questions. We prefix the queries with task-specific instructions following our research in [NV-Embed](https://arxiv.org/abs/2405.17428). For general tasks, we used `Instruct: Retrieve code or text based on user query.\nQuery:`. The instruction can be changed based on the retrieval task.
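The instruction prefix can be illustrated with a small formatting helper. The template text comes from the paragraph above; the single space after `Query:`, the helper name `format_query`, and the alternative instruction in the usage line are assumptions. Whether a hosted endpoint adds this prefix automatically is not stated here, so treat the sketch as a description of the training format rather than a required client-side step.

```python
DEFAULT_INSTRUCTION = "Retrieve code or text based on user query."


def format_query(query, instruction=DEFAULT_INSTRUCTION):
    """Prefix a query with a task instruction in the Instruct/Query layout."""
    # Assumption: one space separates "Query:" from the query text.
    return f"Instruct: {instruction}\nQuery: {query}"


general = format_query("reverse a linked list in Go")
# Hypothetical task-specific instruction, shown only to illustrate the swap.
custom = format_query(
    "TypeError when calling parse()",
    instruction="Retrieve the file that must be edited to resolve the issue.",
)
print(general)
print(custom)
```

Only queries carry the prefix in this sketch; passages are passed through unchanged, which follows the statement that queries are prefixed.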

The training dataset details are as follows:

**Use Case:** Code retrieval from text or code data. <br>
**Data Sources:** Public datasets licensed for commercial use and synthetically-generated data. <br>
**Language:** English (US), programming languages including Python, C/C++, Java, JavaScript, SQL, Go, Ruby, PHP. <br>
**Volume:** 534k query-positive document pairs. <br>
**Data Collection Method by dataset:** Unknown <br>
**Labeling Method by dataset:** The synthetic data is generated using [DeepSeek-V2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5). <br>

## Evaluation Dataset:

We evaluated the NV-EmbedCode model using the [CoIR benchmark](https://arxiv.org/html/2407.02883v1) and a curated set based on [SWE-bench](https://arxiv.org/abs/2310.06770). CoIR consists of 10 code datasets across four retrieval tasks: (1) Text-to-Code Retrieval, (2) Code-to-Code Retrieval, (3) Code-to-Text Retrieval, and (4) Hybrid Code Retrieval. The default evaluation metric for CoIR is average NDCG@10 across all datasets. SWE-bench originally consists of real-world software engineering problems from GitHub issues and their corresponding pull requests. We adapted it into a retrieval task, where the goal is to identify the files that need to be edited to resolve an issue. These files are identified using the pull request that solved the issue. For SWE-bench Lite, we use Recall@1 to measure whether the top retrieved file is the correct one for resolving the issue, as each instance typically involves editing just one file.
<br>

| Retrieval Method         | CoIR Main Score (NDCG@10) | SWE-bench Lite (Recall@1) |
| :----------------------- | :-----------------------: | ------------------------: |
| NV-EmbedCode             |          72.45%           |                    70.33% |
| NV-EmbedQA-Mistral-7B-v2 |          60.08%           |                    61.33% |
| SFR-Embedding-Code-2B_R  |          67.41%           |                    47.00% |
| SFR-Mistral-2_R          |          61.85%           |                    60.33% |
| BM25                     |             -             |                    42.33% |
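For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute NDCG@k and Recall@k for a single query. It is not the benchmark's own evaluation code: the linear gain, the assumption that the ranked list contains every judged document, and the function names are illustrative choices.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_relevances, k=10):
    """DCG normalized by the DCG of the ideal ordering.

    Assumption: ranked_relevances lists the labels of all judged documents
    in retrieved order, so sorting it gives the ideal ranking.
    """
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


def recall_at_k(retrieved_ids, relevant_ids, k=1):
    """Fraction of relevant items that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


# Toy example (assumption): one relevant file per issue, as in SWE-bench Lite.
hit = recall_at_k(["src/parser.py", "src/lexer.py"], {"src/parser.py"}, k=1)
miss = recall_at_k(["src/lexer.py", "src/parser.py"], {"src/parser.py"}, k=1)
print(hit, miss, ndcg_at_k([0, 1, 0]))
```

Benchmark scores such as those in the table are averages of these per-query values over all queries in a dataset.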

## Technical Details

### Software Integration

**Runtime:** NeMo Retriever Text Embedding NIM <br>
**Supported Hardware Microarchitecture Compatibility:** NVIDIA Ampere, NVIDIA Hopper, NVIDIA Lovelace <br>
**Supported Operating System(s):** Linux <br>
**Engine:** [TensorRT](https://developer.nvidia.com/tensorrt-getting-started) <br>
**Test Hardware:** See Support Matrix from [NIM documentation](https://docs.nvidia.com/nim/nemo-retriever/text-embedding/latest/overview.html). <br>

## Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ [Explainability](./explainability.md), [Bias](./bias.md), [Safety & Security](./safety.md), and [Privacy](./privacy.md) Subcards.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

## Bias

| Field                                                                                                                                                            | Response |
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- |
| Participation considerations from adversely impacted groups [protected classes](https://calcivilrights.ca.gov/disputeresolution/protected-characteristics/) in model design and testing | None     |
| Measures taken to mitigate against unwanted bias                                                                                                                 | None     |

## Explainability

| Field                          | Response                                                                                                                                                                                                                                              |
| ------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Intended Application & Domain: | Embedding passages and queries for retrieval in coding tasks                                                                                                                                                                                          |
| Model Type:                    | Transformer encoder                                                                                                                                                                                                                                   |
| Intended User:                 | Generative AI creators working with conversational AI models for coding tasks.                                                                                                                                                                        |
| Output:                        | Array of float numbers (Dense Vector Representation for the input text)                                                                                                                                                                               |
| Describe how the model works:  | Model transforms the tokenized input string into a dense vector representation.                                                                                                                                                                       |
| Performance Metrics:           | Accuracy, Throughput, and Latency                                                                                                                                                                                                                     |
| Potential Known Risks:         | This model may not always retrieve the correct passage(s) for a given query.                                                                                                                                                            |
| Licensing & Terms of Use:      | The use of this model is governed by [NVIDIA AI Foundation Models Community License Agreement](https://developer.nvidia.com/downloads/nv-ai-foundation-models-license) and the [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/). |
| Technical Limitations          | The model was trained with input lengths of up to 512 tokens; it may therefore perform poorly on specialized longer inputs.                                                                                                                           |

## Privacy

| Field                                                                                                                                           | Response                                       |
| ----------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------- |
| Generatable or reverse engineerable personally-identifiable information (PII)?                                                                  | None                                           |
| Was consent obtained for any personal data used?                                                                                                | Not Applicable                                 |
| Personal data used to create this model?                                                                                                                  | None                                           |
| How often is the dataset reviewed?                                                                                                              | Before Every Release                           |
| Is there provenance for all datasets used in training?                                                                                          | Yes                                            |
| Does data labeling (annotation, metadata) comply with privacy laws?                                                                             | Yes                                            |
| Is data compliant with data subject requests for data correction or removal, if such a request was made?                                        | No, not possible with externally-sourced data. |

## Safety & Security

| Field                                             | Response                                                                                                                                                                                                          |
| ------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Model Application(s):                             | Code Embedding for Retrieval                                                                                                                                                                                      |
| Describe the physical safety impact (if present). | None Known                                                                                                                                                                                                     |
| Use Case Restrictions:                            | Abide by [NVIDIA AI Foundation Models Community License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-community-models-license/).                                                 |
| Model and dataset restrictions:                   | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to. |

## Prototype

```bash
invoke_url='https://integrate.api.nvidia.com/v1/embeddings'

# Assumes the API key is exported in the NVIDIA_API_KEY environment variable.
authorization_header="Authorization: Bearer $NVIDIA_API_KEY"
accept_header='Accept: application/json'
content_type_header='Content-Type: application/json'

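# The embeddings endpoint takes "input" (a list of strings) rather than chat-style "messages".
# "input" and "input_type" are left empty in this template; fill them in before sending.
data=$'{
"model": "nvidia/nv-embedcode-7b-v1",
"encoding_format": "float",
"truncate": "NONE",
"input": [""],
"input_type": ""
}'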

response=$(curl --silent -i -w "\n%{http_code}" --request POST \
--url "$invoke_url" \
--header "$authorization_header" \
--header "$accept_header" \
--header "$content_type_header" \
--data "$data"
)

echo "$response"
```

```python
from openai import OpenAI

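client = OpenAI(
    api_key="$NVIDIA_API_KEY",  # placeholder: substitute your NVIDIA API key
    base_url="https://integrate.api.nvidia.com/v1",
)

response = client.embeddings.create(
    input=[""],  # placeholder: list of strings to embed
    model="nvidia/nv-embedcode-7b-v1",
    encoding_format="float",
    # input_type is left empty in this template; set it before sending.
    extra_body={"input_type": "", "truncate": "NONE"},
)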

print(response.data[0].embedding)
```

```python
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

client = NVIDIAEmbeddings(
    model="nvidia/nv-embedcode-7b-v1",
    api_key="$NVIDIA_API_KEY",
    truncate="NONE",
)

embedding = client.embed_query("")
print(embedding)
```