NVIDIA
Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
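Hosted catalog endpoints of this kind typically expose an OpenAI-compatible chat-completions API. As a minimal sketch, the helper below builds such a request; the base URL, model identifier, and `$NVIDIA_API_KEY` placeholder are assumptions for illustration, not values confirmed by this page — consult the individual model card for the exact endpoint and model name.

```python
import json

# Assumed OpenAI-compatible base URL for hosted endpoints; verify against
# the model card before use.
BASE_URL = "https://integrate.api.nvidia.com/v1"

def build_chat_request(model: str, prompt: str, api_key: str):
    """Construct the URL, headers, and JSON body for a chat completion."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return url, headers, json.dumps(body)

# Example with a hypothetical model id drawn from the listing below.
url, headers, payload = build_chat_request(
    "qwen3.5-122b-a10b",
    "Summarize mixture-of-experts routing in two sentences.",
    api_key="$NVIDIA_API_KEY",
)
print(url)
```

Sending the request is then a single POST of `payload` to `url` with `headers`, using any HTTP client.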


Active filter: reasoning (12 models)
  • mistral-small-4-119b-2603 (Mistral AI · Downloadable)
    Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context.
    chat · 847K · 1w

  • nemotron-3-super-120b-a12b (NVIDIA · Downloadable)
    Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more.
    chat · 12.57M · 1w

  • qwen3.5-122b-a10b (Qwen · Free Endpoint)
    122B MoE LLM (10B active) for coding, reasoning, and multimodal chat; agent-ready.
    chat · 3.51M · 2w

  • glm-5 (Z.ai · Downloadable)
    GLM-5, a 744B MoE, enables efficient reasoning for complex systems and long-horizon agentic tasks.
    MoE · 19.68M · 1mo

  • step-3.5-flash (Stepfun-ai · Free Endpoint)
    200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
    chat · 8.85M · 1mo

  • kimi-k2.5 (Moonshotai · Downloadable)
    1T multimodal MoE for high-capacity video and image understanding with efficient inference.
    Multimodal · 29.65M · 1mo

  • nemotron-3-nano-30b-a3b (NVIDIA · Downloadable)
    Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more.
    chat · 13.77M · 3mo

  • qwen3-next-80b-a3b-thinking (Qwen · Downloadable)
    80B-parameter model with hybrid reasoning, an MoE architecture, and support for 119 languages.
    chat · 4.72M · 6mo

  • gpt-oss-20b (OpenAI · Downloadable)
    Smaller Mixture of Experts (MoE) text-only LLM for efficient reasoning and math.
    reasoning · 9.21M · 7mo

  • gpt-oss-120b (OpenAI · Downloadable)
    Mixture of Experts (MoE) text-only reasoning LLM designed to fit within a single 80 GB GPU.
    reasoning · 43.53M · 7mo

  • mixtral-8x22b-instruct-v0.1 (Mistral AI · Downloadable)
    An MoE LLM that follows instructions, completes requests, and generates creative text.
    chat · 5.31M · 8mo

  • mixtral-8x7b-instruct-v0.1 (Mistral AI · Downloadable)
    An MoE LLM that follows instructions, completes requests, and generates creative text.
    chat · 748K · 8mo