Explore

Models

Skills

Blueprints

GPUs

Docs

Your Privacy Choices

Contact

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA Launch from Hugging FaceBeta

Filters (1)

Free Endpoint

Partner Endpoint

Download Available

Use Case

Synthetic Data Generation

Code Generation

Drug Discovery

Image-to-Text

Retrieval Augmented Generation

Inference Providers

OpenRouter

Deepinfra

Together AI

GMI Cloud

CoreWeave

Publisher

NVIDIA

OpenAI

Mistral AI

Qwen

cosmos3-nano-reasoner

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

video understanding

Items per page

of 1 pages

29d

NVIDIA

DownloadableFree Endpoint

ising-calibration-1-35b-a3b

Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.

Quantum

332K

2mo

NVIDIA

DownloadableFree Endpoint

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

MoE

60M

3mo

Qwen

DownloadableFree Endpoint

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

tool calling

10M

3mo

NVIDIA

Free Endpoint

nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NeMo Guardrails

145K

5mo

NVIDIA

Downloadable

cosmos-reason2-8b

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

video understanding

191K

6mo

OpenAI

DownloadableFree Endpoint

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

reasoning

18M

10mo

OpenAI

DownloadableFree Endpoint

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

reasoning

57M

10mo

NVIDIA

DownloadableFree Endpoint

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

advanced reasoning

11mo

NVIDIA

DownloadableFree Endpoint

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

advanced reasoning

11mo

Mistral AI

DownloadableFree Endpoint

mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning

996K

11mo