Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

39 models

Downloadable

nemotron-ocr-v2

Nemotron OCR v2 is a state-of-the-art multilingual text recognition model designed for robust end-to-end optical character recognition (OCR) on complex real-world images.

Table Extraction

338K

18d

Downloadable, Free Endpoint

nemotron-3-ultra-550b-a55b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

8M

1mo

Downloadable, Free Endpoint

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

2M

1mo

Downloadable, Free Endpoint

mistral-medium-3.5-128b

A high-performing model for text generation, coding, and agentic use cases

4M

2mo

Downloadable, Free Endpoint

nemotron-3-nano-omni-30b-a3b-reasoning

Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.

8M

2mo

Deprecation in 3d, Free Endpoint

nemotron-3-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

295K

2mo

Downloadable

llama-nemotron-rerank-vl-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

84K

3mo

Downloadable, Free Endpoint

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

code generation

13M

3mo

Free Endpoint

nemotron-voicechat

Nemotron 3 Voicechat

2K

3mo

Downloadable

nemotron-asr-streaming

Real-time speech recognition for English

Automatic Speech Recognition

6K

4mo

Downloadable

nemotron-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

Table Extraction

341K

4mo

Downloadable, Free Endpoint

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

60M

4mo

Downloadable

llama-nemotron-rerank-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

501K

4mo

Downloadable

nemotron-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection

157K

4mo

Downloadable

nemotron-page-elements-v3

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection

433K

4mo

Downloadable

nemotron-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection

40K

4mo

Downloadable

llama-nemotron-embed-1b-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Text-to-Embedding

4M

4mo

Downloadable

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

8M

5mo

Deprecation in 3d, Free Endpoint

nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NeMo Guardrails

504K

5mo

Downloadable, Free Endpoint

nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more

12M

6mo

Free Endpoint

mistral-large-3-675b-instruct-2512

A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.

language generation

3M

7mo

Downloadable

nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

text and table extraction

218K

8mo

Downloadable, Free Endpoint

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

language generation

2M

8mo

Free Endpoint

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation

336K

8mo