Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

25 models

Downloadable

llama-nemotron-rerank-vl-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

84.41K

2mo

Downloadable

llama-nemotron-rerank-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

501K

3mo

Downloadable

llama-nemotron-embed-1b-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Text-to-Embedding

4.45M

3mo

Downloadable

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

7.59M

4mo

Downloadable

nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

text and table extraction

218K

7mo

Free Endpoint

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation

336K

7mo

Downloadable | Free Endpoint

llama-3.3-nemotron-super-49b-v1.5

High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

advanced reasoning

3.17M

10mo

Free Endpoint

llama-guard-4-12b

Multimodal model that classifies the safety of input prompts as well as output responses.

LLM Multimodal Safety

222K

11mo

Downloadable | Free Endpoint

llama-3.1-nemotron-nano-vl-8b-v1

Multimodal vision-language model that understands text and images and creates informative responses.

doc intelligence

10.15M

11mo

Free Endpoint

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual mixture-of-experts (MoE) model with 128 experts and 17B active parameters.

language generation

20.32M

11mo

Downloadable | Free Endpoint

llama-3.3-nemotron-super-49b-v1

High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

advanced reasoning

4.93M

11mo

Downloadable | Free Endpoint

llama-3.1-nemotron-nano-8b-v1

Model with leading reasoning and agentic AI accuracy for PC and edge.

advanced reasoning

1.47M

11mo

Downloadable

nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

optical character recognition

85.79K

1y

Downloadable

llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

nemo guardrails

149K

1y

Downloadable

llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

nemo guardrails

160K

1y

Downloadable | Free Endpoint

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

Instruction following

18.79M

1y

Downloadable | Free Endpoint

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Language Generation

28.5K 1.22M

1y

Downloadable | Free Endpoint

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval

1.67M

1y

Downloadable | Free Endpoint

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval

2.69M

1y

Downloadable | Free Endpoint

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Language Generation

44.06K 290K

1y

Free Endpoint

dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

Code Generation

783K

1y

Free Endpoint

esm2-650m

Generates embeddings of proteins from their amino acid sequences.

128K

1y

Downloadable | Free Endpoint

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

3.9M

1y

Downloadable | Free Endpoint

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

25.09M

11mo