nvidia/nv-embed-v1

PREVIEW

Generates high-quality numerical embeddings from text inputs.

Model Overview

Description

The NV-Embed Model is a generalist embedding model that excels across the 56 tasks of the Massive Text Embedding Benchmark (MTEB), including retrieval, reranking, classification, clustering, and semantic textual similarity tasks. Within this benchmark, NV-Embed achieves the highest score of 59.36 on the 15 retrieval tasks.

NV-Embed features several innovative designs, such as latent vectors for improved pooled embedding output and a two-stage instruction tuning method, enhancing the accuracy of both retrieval and non-retrieval tasks. For more technical details, refer to the paper: NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models.

Terms of use

The use of this model is governed by the license.

References

For more details, refer to the NV-Embed paper.

Intended use

The NV-Embed Model is designed for users who need a high-performance generalist embedding model for tasks such as text retrieval, reranking, classification, clustering, and semantic textual similarity.

Model Architecture

Architecture Type: Decoder-only LLM
Network Architecture: Mistral-7B-v0.1 with Latent-Attention pooling
Embedding Dimension: 4096
Max Input Tokens: 32k
Parameter Count: 7.1 billion

The NV-Embed Model is based on the Mistral-7B-v0.1 architecture with a unique Latent-Attention pooling mechanism. This allows the model to generate more expressive pooled embeddings by having the LLM attend to latent vectors. It employs a two-stage instruction tuning method to improve performance across various tasks.
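The pooling idea can be illustrated with a minimal NumPy sketch. This is a simplification for intuition only, not the model's actual implementation: it omits the separate key/value projections, multi-head splitting, and the MLP that follow the latent-attention step in the paper. Token hidden states attend to a small trainable latent array, and the mixed outputs are mean-pooled into one embedding.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def latent_attention_pool(hidden, latents):
    """Simplified latent-attention pooling.

    hidden:  (seq_len, d) last-layer token hidden states from the decoder LLM.
    latents: (num_latents, d) trainable latent array, used here as both
             keys and values (the real model projects them separately).
    Returns a single (d,) pooled embedding.
    """
    d = hidden.shape[-1]
    scores = hidden @ latents.T / np.sqrt(d)   # (seq_len, num_latents)
    attn = softmax(scores, axis=-1)            # each token attends to latents
    mixed = attn @ latents                     # (seq_len, d)
    return mixed.mean(axis=0)                  # mean-pool over tokens

rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 8))               # 5 tokens, toy dim 8
latents = rng.normal(size=(4, 8))              # 4 latent vectors
embedding = latent_attention_pool(hidden, latents)
```

Compared with plain mean pooling of the last token states, the latent array gives the model a learned basis to mix into the pooled representation, which the paper reports improves both retrieval and non-retrieval accuracy.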

Input

Input Type: text
Input Format: list of strings with task-specific instructions
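As a sketch of how such inputs are typically assembled (the exact prompt template is an assumption here; check the model documentation for the canonical format), queries carry a task-specific instruction prefix while passages to be retrieved are embedded as-is:

```python
def format_query(instruction: str, query: str) -> str:
    # Hypothetical NV-Embed-style prompt: queries carry a task
    # instruction, while passages/documents are embedded without one.
    return f"Instruct: {instruction}\nQuery: {query}"

task = "Given a question, retrieve passages that answer the question"
queries = [
    format_query(task, q)
    for q in ["how do transformers work?", "what is latent attention?"]
]
passages = [
    "Transformers use self-attention to relate tokens in a sequence.",
    "Latent attention pools token states against a learned latent array.",
]
# `queries` and `passages` are the lists of strings passed to the model.
```

The asymmetry (instructed queries, plain passages) matches the instruction-tuned retrieval setup described above.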

Output

Output Type: floats
Output Format: list of float arrays, each array containing the embeddings for the corresponding input string
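A common way to consume these float arrays is to compare them with cosine similarity, for example to rank passages against a query. A minimal standard-library sketch, using toy 4-dimensional vectors in place of the model's 4096-dimensional outputs:

```python
import math

def cosine_similarity(a, b):
    # Embeddings arrive as plain lists of floats, one list per input string.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dim vectors standing in for 4096-dim NV-Embed embeddings.
emb_a = [0.1, 0.3, -0.2, 0.7]
emb_b = [0.1, 0.3, -0.2, 0.7]
score = cosine_similarity(emb_a, emb_b)  # identical vectors → similarity of 1.0 (up to float rounding)
```

Higher scores indicate closer semantic similarity between the corresponding input strings.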

Model Version(s)

NV-Embed-v1

Training Dataset & Evaluation

Training Dataset

The NV-Embed model was trained on a diverse mixture of publicly available datasets, including various retrieval and non-retrieval tasks. The training data did not include any synthetic data from proprietary models like GPT-4, ensuring the model's accessibility and reproducibility.

Evaluation Results

NV-Embed was evaluated using the Massive Text Embedding Benchmark (MTEB), achieving a record-high score of 69.32 across 56 tasks. It significantly outperforms previous leading embedding models, particularly excelling in retrieval tasks.

Performance on MTEB benchmark:

  • Overall Score: 69.32
  • Score on Retrieval Tasks: 59.36

Ethical Considerations

Bias, Safety & Security, and Privacy

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards here. Please report security vulnerabilities or NVIDIA AI Concerns here.

Special Training Data Considerations

The model was trained on publicly available data, which may contain toxic language and societal biases. Therefore, the model may amplify those biases, such as associating certain genders with specific social stereotypes.