Explore

Models

Skills

Blueprints

GPUs

Docs

Your Privacy Choices

Contact

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA Launch from Hugging FaceBeta

Filters (1)

Free Endpoint

Partner Endpoint

Download Available

Use Case

Retrieval Augmented Generation

Text-to-Embedding

Drug Discovery

Image-to-Text

Speech-to-Text

Inference Providers

Deepinfra

OpenRouter

Together AI

GMI Cloud

Bitdeer

Publisher

NVIDIA

llama-nemotron-rerank-vl-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nemo retriever

Items per page

of 1 pages

84K

3mo

NVIDIA

Downloadable

llama-nemotron-rerank-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nemo retriever

501K

4mo

NVIDIA

Downloadable

llama-nemotron-embed-1b-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Text-to-Embedding

4mo

NVIDIA

Downloadable

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

nemo retriever

5mo