Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
30 models
NVIDIA
Downloadable
llama-nemotron-rerank-vl-1b-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever
+2
3.13K
3w
NVIDIA
Downloadable
llama-nemotron-rerank-1b-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever
+2
179K
1mo
NVIDIA
Downloadable
llama-nemotron-embed-1b-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Text-to-Embedding
+2
1.95M
1mo
NVIDIA
Downloadable
llama-nemotron-embed-vl-1b-v2
Multimodal question-answer retrieval representing user queries as text and documents as images.
nemo retriever
+3
9.71M
2mo
NVIDIA
Free Endpoint
llama-3.1-nemotron-safety-guard-8b-v3
Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs
content moderation
+4
128K
5mo
NVIDIA
Downloadable
llama-3_2-nemoretriever-300m-embed-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Text-to-Embedding
+2
123
6mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1.5
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
math
+3
2.75M
9mo
NVIDIA
Free Endpoint
llama-3_2-nemoretriever-300m-embed-v1
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Text-to-Embedding
+2
357K
9mo
Meta
Free Endpoint
llama-guard-4-12b
Multi-modal model to classify safety for input prompts as well as output responses.
LLM Multimodal Safety
+3
197K
9mo
NVIDIA
Downloadable
llama-3.2-nemoretriever-500m-rerank-v2
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
nemo retriever
+2
7.16K
10mo
NVIDIA
Deprecation in 3d
Downloadable
llama-3.2-nemoretriever-1b-vlm-embed-v1
Multimodal question-answer retrieval representing user queries as text and documents as images.
nemo retriever
+3
52.65K
10mo
NVIDIA
Downloadable
llama-3.1-nemotron-nano-vl-8b-v1
Multi-modal vision-language model that understands text and images and generates informative responses.
doc intelligence
+2
8.88M
9mo
NVIDIA
Deprecation in 1d
Downloadable
llama-3.1-nemotron-ultra-253b-v1
Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.
math
+3
6.64M
9mo
Meta
Free Endpoint
llama-4-maverick-17b-128e-instruct
A general-purpose multimodal, multilingual mixture-of-experts (MoE) model with 128 experts and 17B active parameters.
language generation
+3
10.92M
9mo
NVIDIA
Downloadable
llama-3.3-nemotron-super-49b-v1
High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
math
+3
1.57M
9mo
NVIDIA
Downloadable
llama-3.1-nemotron-nano-8b-v1
Model with leading reasoning and agentic AI accuracy for PC and edge.
math
+3
863K
9mo
NVIDIA
Downloadable
llama-3.1-nemoguard-8b-topic-control
Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.
nemo guardrails
+4
136K
1y
NVIDIA
Downloadable
llama-3.1-nemoguard-8b-content-safety
Leading content safety model for enhancing the safety and moderation capabilities of LLMs
nemo guardrails
+4
144K
1y
NVIDIA
Downloadable
llama-3.2-nv-embedqa-1b-v2
Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
nemo retriever
+3
2.57M
9mo
NVIDIA
Downloadable
llama-3.2-nv-rerankqa-1b-v2
Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
nemo retriever
+2
172K
9mo
Meta
Downloadable
llama-3.3-70b-instruct
Advanced LLM for reasoning, math, general knowledge, and function calling
Instruction following
+4
11.07M
10mo
Meta
Downloadable
llama-3.2-3b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Chat
+3
12.93K
923K
11mo
Meta
Downloadable
llama-3.2-11b-vision-instruct
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Image-Text Retrieval
+4
1.04M
10mo
Meta
Downloadable
llama-3.2-90b-vision-instruct
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Image-Text Retrieval
+4
1.41M
10mo