Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA, Launch from Hugging Face (Beta)

Filters

Free Endpoint (73)

Partner Endpoint (41)

Download Available (108)

Use Case

Drug Discovery (13), Retrieval Augmented Generation (10), Image-to-Text (9), Speech-to-Text (9), Image Generation (8)

Inference Providers

Deepinfra (32), OpenRouter (28), Together AI (20), GMI Cloud (15), Bitdeer (7)

Publisher

NVIDIA (76), Meta (11), Google (6), Mistral AI (6), Black Forest Labs (4)

NIM Container GPUs

H100 80GB HBM3 (12), A100 SXM4 80GB (11), L40S (11), A10G (9), B200 (9)

139 models

Downloadable

Video Super Resolution NIM

Upscale encoded or ST 2110 video to higher resolutions with NVIDIA Video Super Resolution.

video upscaling, streaming, nvidia ai for media, video super resolution

Last updated on July 21, 2026

Free Endpoint

nemotron-3-embed-1b

1B embedding model for semantic search, retrieval, and RAG applications.

Nemotron Retriever

Agentic Retrieval, Code Retrieval, Text-to-Embedding, Retrieval Augmented Generation

Last updated on July 16, 2026

Thinking Machines

Downloadable, Free Endpoint

inkling

Inkling is a multimodal (text + image) reasoning model from Thinking Machines — a Mamba-hybrid, 256-expert Mixture-of-Experts architecture with tool use and switchable reasoning.

reasoning, image-to-text, multimodal

Last updated on July 16, 2026

Free Endpoint

laguna-xs-2.1

Efficient 33B MoE for local, long-horizon agentic coding and terminal tasks

Coding, Reasoning, Tool Use

Last updated on July 15, 2026

Downloadable, Free Endpoint

glm-5.2

GLM-5.2 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

Coding, Reasoning, Tool Use

8M API calls in the last 30 days

Last updated on July 3, 2026

Downloadable

qwen-image-edit-nvpcb-ovsl2sl

An image editing model specialized for translating Omniverse synthetic images into the photographic solder-light style captured at NVIDIA PCB inspection stations.

Synthetic Data Generation

Image Generation, Physical AI

Last updated on July 3, 2026

Downloadable

nemotron-ocr-v2

Nemotron OCR v2 is a state-of-the-art multilingual text recognition model designed for robust end-to-end optical character recognition (OCR) on complex real-world images.

Table Extraction

nemo retriever, data ingestion, extraction, Optical Character Recognition

338K API calls in the last 30 days

Last updated on June 24, 2026

Free Endpoint

minimax-m3

MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.

text-to-text, reasoning

10M API calls in the last 30 days

Last updated on June 12, 2026

Downloadable, Free Endpoint

diffusiongemma-26b-a4b-it

Diffusion-based 26B-parameter LLM enabling parallel token generation for real-time text apps

text-to-text, reasoning

4M API calls in the last 30 days

Last updated on June 10, 2026

Downloadable, Free Endpoint

nemotron-3-ultra-550b-a55b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

MoE, Frontier, Reasoning, Long Context

52M API calls in the last 30 days

Last updated on June 4, 2026

Downloadable

chatterbox-multilingual-tts

Natural and expressive voices in 23 languages. For voice agents and brand ambassadors.

Chatterbox, Speech Generation, multilingual, Text-to-Speech

7K API calls in the last 30 days

Last updated on June 3, 2026

Downloadable, Free Endpoint

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

safety and moderation, multilingual, content safety, ai safety, nemo guardrails

2M API calls in the last 30 days

Last updated on June 2, 2026

Free Endpoint

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

autonomous vehicles

Physical AI, robotics, text-to-world, image-to-world, Synthetic Data Generation

2K API calls in the last 30 days

Last updated on June 1, 2026

Downloadable, Free Endpoint

cosmos3-nano-reasoner

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

video understanding

autonomous vehicles, industrial, Physical AI, vision language model, reasoning, robotics, smart cities, Synthetic Data Generation

2K API calls in the last 30 days

Last updated on June 1, 2026

Downloadable, Free Endpoint

step-3.7-flash

A sparse MoE multimodal reasoning model suited to enterprise, agentic, and coding tasks.

7M API calls in the last 30 days

Last updated on May 29, 2026

Downloadable, Free Endpoint

kimi-k2.6

1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.

Mixture-of-Experts, Reasoning, Image-to-Text

16M API calls in the last 30 days

Last updated on May 1, 2026

Downloadable

qwen-image

Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.

Image Generation

Last updated on May 1, 2026

Downloadable

qwen-image-edit

Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.

Image Generation

Last updated on May 1, 2026

Downloadable, Free Endpoint

mistral-medium-3.5-128b

A high-performing model for text generation, coding, and agentic use cases

reasoning, text, agentic

5M API calls in the last 30 days

Last updated on April 29, 2026

Downloadable, Free Endpoint

nemotron-3-nano-omni-30b-a3b-reasoning

Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.

VLM, Video, Omni, OCR

8M API calls in the last 30 days

Last updated on April 28, 2026

Downloadable, Free Endpoint

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

coding, fast, agentic

15M API calls in the last 30 days

Last updated on April 24, 2026

Downloadable, Free Endpoint

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.

reasoning, coding, agentic

8M API calls in the last 30 days

Last updated on April 24, 2026

Deprecation in 7 days, Downloadable

Relighting

Re-illuminate people in video to match target lighting from a 360 HDRI environment map.

remote contribution, lighting, nvidia ai for media

242 API calls in the last 30 days

Last updated on April 17, 2026

Downloadable, Free Endpoint

synthetic-video-detector

NVIDIA Synthetic Video Detector is an AI-powered microservice for detecting AI-generated (synthetic) videos.

media2, forensics, nvidia ai for media, diffusion models

90K API calls in the last 30 days

Last updated on April 16, 2026