nvidia/nv-embed-v1
Generates high-quality numerical embeddings from text inputs.
Model Overview
Description
The NV-Embed Model is a generalist embedding model that excels across the 56 tasks of the Massive Text Embedding Benchmark (MTEB), which span retrieval, reranking, classification, clustering, and semantic textual similarity. NV-Embed achieves the highest score, 59.36, on the 15 retrieval tasks within this benchmark.
NV-Embed features several innovative designs, such as latent vectors for improved pooled embedding output and a two-stage instruction tuning method, enhancing the accuracy of both retrieval and non-retrieval tasks. For more technical details, refer to the paper: NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models.
Terms of use
The use of this model is governed by the license.
Reference(s)
For more details, refer to the NV-Embed paper.
Intended use
The NV-Embed Model is designed for users who need a high-performance generalist embedding model for tasks such as text retrieval, reranking, classification, clustering, and semantic textual similarity.
Model Architecture
Architecture Type: Decoder-only LLM
Network Architecture: Mistral-7B-v0.1 with Latent-Attention pooling
Embedding Dimension: 4096
Max Input Tokens: 32k
Parameter Count: 7.1 billion
The NV-Embed Model is based on the Mistral-7B-v0.1 architecture with a unique Latent-Attention pooling mechanism. This allows the model to generate more expressive pooled embeddings by having the LLM attend to latent vectors. It employs a two-stage instruction tuning method to improve performance across various tasks.
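The pooling idea described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the model's actual implementation: it assumes the token hidden states act as queries and a trainable latent array serves as both keys and values, uses a single attention head with standard scaling, skips any feed-forward layer, and finishes with a masked mean over tokens. The function name, the tiny dimensions, and the random inputs are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def latent_attention_pool(hidden, latents, mask):
    """Pool token states into one vector by attending to latent vectors.

    hidden:  (seq_len, d) last-layer token states, used as queries (assumed)
    latents: (r, d) latent array, used as keys and values (assumed)
    mask:    (seq_len,) 1 for real tokens, 0 for padding
    """
    d = hidden.shape[-1]
    scores = hidden @ latents.T / np.sqrt(d)       # (seq_len, r)
    attended = softmax(scores, axis=-1) @ latents  # (seq_len, d)
    weights = mask[:, None].astype(attended.dtype)
    # Masked mean over the sequence gives the pooled embedding.
    return (attended * weights).sum(axis=0) / weights.sum()

# Toy sizes for illustration only; the real model uses d = 4096.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(6, 8))
latents = rng.normal(size=(4, 8))
mask = np.array([1, 1, 1, 1, 0, 0])

embedding = latent_attention_pool(hidden, latents, mask)
print(embedding.shape)
```

Because each token attends to the latent array independently, padded positions do not influence the pooled vector once they are masked out of the mean.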
Input
Input Type: text
Input Format: list of strings with task-specific instructions
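As an illustration of what "strings with task-specific instructions" might look like, the sketch below prepends an instruction to each query. The `Instruct: ...\nQuery: ` template, the helper name `format_queries`, and the example instruction are assumptions made for this example; this card does not specify the exact template, so check the model's usage documentation before relying on it.

```python
def format_queries(instruction, queries):
    """Prefix every query with a task instruction (template is assumed)."""
    prefix = f"Instruct: {instruction}\nQuery: "
    return [prefix + query for query in queries]

# Hypothetical retrieval instruction and queries.
instruction = "Given a question, retrieve passages that answer the question"
queries = [
    "how do text embeddings work?",
    "what is latent attention pooling?",
]

model_inputs = format_queries(instruction, queries)
for text in model_inputs:
    print(repr(text))
```

The result is a plain list of strings, matching the input format listed above.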
Output
Output Type: floats
Output Format: list of float arrays, each array containing the embeddings for the corresponding input string
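One common way to consume output in this shape is to rank passages by cosine similarity to a query embedding. The sketch below assumes the embeddings have already been obtained and stands in toy 4-dimensional vectors for the model's 4096-dimensional ones; `rank_passages` is a hypothetical helper, not part of any NVIDIA API.

```python
import numpy as np

def rank_passages(query_embedding, passage_embeddings):
    """Return passage indices sorted by cosine similarity, plus the scores."""
    q = np.asarray(query_embedding, dtype=np.float32)
    p = np.asarray(passage_embeddings, dtype=np.float32)
    # Normalize so the dot product equals cosine similarity.
    q = q / np.linalg.norm(q)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    scores = p @ q
    order = np.argsort(-scores)  # highest similarity first
    return order.tolist(), scores.tolist()

# Toy vectors standing in for real model output.
query_embedding = [1.0, 0.0, 0.0, 0.0]
passage_embeddings = [
    [0.0, 1.0, 0.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0, 0.0],  # partially aligned
]

order, scores = rank_passages(query_embedding, passage_embeddings)
print(order)  # [1, 2, 0]
```

Normalizing first means vector length does not affect the ranking, which is why the second passage scores highest despite its larger magnitude.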
Model Version(s)
NV-Embed-v1
Training Dataset & Evaluation
Training Dataset
The NV-Embed model was trained on a diverse mixture of publicly available datasets, including various retrieval and non-retrieval tasks. The training data did not include any synthetic data from proprietary models like GPT-4, ensuring the model's accessibility and reproducibility.
Evaluation Results
NV-Embed was evaluated using the Massive Text Embedding Benchmark (MTEB), achieving a record-high score of 69.32 across 56 tasks. It significantly outperforms previous leading embedding models, particularly excelling in retrieval tasks.
Performance on MTEB benchmark:
- Overall Score: 69.32
- Score on Retrieval Tasks: 59.36
Ethical Considerations
Bias, Safety & Security, and Privacy
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards here. Please report security vulnerabilities or NVIDIA AI Concerns here.
Special Training Data Considerations
The model was trained on publicly available data, which may contain toxic language and societal biases. Therefore, the model may amplify those biases, such as associating certain genders with specific social stereotypes.