Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
9 models
NVIDIA
llama-nemotron-rerank-1b-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever
+2
118
2d
NVIDIA
llama-nemotron-embed-1b-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Retrieval Augmented Generation
+2
202K
4d
NVIDIA
llama-nemotron-embed-vl-1b-v2
Multimodal question-answer retrieval representing user queries as text and documents as images.
nemo retriever
+3
687K
3w
NVIDIA
llama-3_2-nemoretriever-300m-embed-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Retrieval Augmented Generation
+2
134K
5mo
NVIDIA
llama-3_2-nemoretriever-300m-embed-v1
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Retrieval Augmented Generation
+2
68.84K
7mo
NVIDIA
llama-3.2-nemoretriever-500m-rerank-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever
+2
1.17K
8mo
NVIDIA
llama-3.2-nemoretriever-1b-vlm-embed-v1
Multimodal question-answer retrieval representing user queries as text and documents as images.
nemo retriever
+3
269K
8mo
NVIDIA
llama-3.2-nv-embedqa-1b-v2
Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
nemo retriever
+3
6.67M
7mo
NVIDIA
llama-3.2-nv-rerankqa-1b-v2
Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
nemo retriever
+2
151K
7mo