Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
12 models, sorted by date created (most recent first)
NVIDIA
llama-nemotron-embed-vl-1b-v2
Multimodal question-answer retrieval representing user queries as text and documents as images.
nemo retriever (+3 more)
3w
NVIDIA
llama-3_2-nemoretriever-300m-embed-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Retrieval Augmented Generation (+2 more)
5mo
NVIDIA
llama-3_2-nemoretriever-300m-embed-v1
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Retrieval Augmented Generation (+2 more)
7mo
NVIDIA
llama-3.2-nemoretriever-500m-rerank-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever (+2 more)
8mo
NVIDIA
llama-3.2-nemoretriever-1b-vlm-embed-v1
Multimodal question-answer retrieval representing user queries as text and documents as images.
nemo retriever (+3 more)
8mo
NVIDIA
nv-embedcode-7b-v1
The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.
nemo retriever (+2 more)
9mo
NVIDIA
llama-3.2-nv-embedqa-1b-v2
Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
nemo retriever (+3 more)
7mo
NVIDIA
llama-3.2-nv-rerankqa-1b-v2
Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
nemo retriever (+2 more)
7mo
NVIDIA
nv-embedqa-e5-v5
English text embedding model for question-answering retrieval.
Embedding (+4 more)
7mo
NVIDIA
nv-embed-v1
Generates high-quality numerical embeddings from text inputs.
Non-Commercial Use Only (+2 more)
7mo
BAAI
bge-m3
Embedding model for text retrieval tasks, excelling in dense, multi-vector, and sparse retrieval.
Embeddings (+2 more)
10mo
NVIDIA
rerank-qa-mistral-4b
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
Ranking (+1 more)
1y