NVIDIA
Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
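Hosted catalog endpoints of this kind are typically called through an OpenAI-compatible chat-completions API. As a minimal sketch, the request body can be assembled as below; the base URL, model id, and parameter values are illustrative assumptions (not taken from this page), and a real call additionally needs an `Authorization: Bearer <key>` header.

```python
import json

# Assumed OpenAI-compatible chat-completions endpoint for hosted NIMs.
BASE_URL = "https://integrate.api.nvidia.com/v1/chat/completions"  # assumption

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble the JSON body for an OpenAI-style chat-completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }

# Example payload for one of the catalog's models (id is a placeholder).
payload = build_chat_request(
    "mistralai/mistral-medium-3.5-128b",
    "Summarize what a NIM inference microservice is.",
)
print(json.dumps(payload, indent=2))
```

The same payload shape works for any of the models listed below; only the `model` field changes.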

Optimized by NVIDIA · Launch from Hugging Face (Beta)

Filters (1): agentic · 10 models

Availability: Free Endpoint (3) · Partner Endpoint (8) · Download Available (7)
Use case: Image-to-Text (1)
Endpoint partner: Deep Infra (8) · Together AI (5) · GMI Cloud (5) · Lightning AI (2) · Vultr (2)
Publisher: Qwen (3) · Mistral AI (2) · DeepSeek AI (2) · Google (1) · Z.ai (1)
Mistral AI · Downloadable

mistral-medium-3.5-128b

A high-performing model for text generation, coding, and agentic use cases.
coding · 1.73M downloads · 2w
DeepSeek AI · Downloadable

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.
coding · 8.96M downloads · 2w
DeepSeek AI · Downloadable

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.
coding · 7.15M downloads · 2w
Z.ai · Downloadable

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
Agentic AI · 19.47M downloads · 3w
Google · Downloadable

gemma-4-31b-it

A dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.
coding · 6.68M downloads · 1mo
Qwen · Downloadable

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
MoE · 11.27M downloads · 2mo
Stepfun-ai · Free Endpoint

step-3.5-flash

A 200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
Agentic · 12.17M downloads · 3mo
Mistral AI · Free Endpoint

mistral-large-3-675b-instruct-2512

A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.
language generation · 4.52M downloads · 5mo
Qwen · Downloadable

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long-context AI.
text-generation · 23.85M downloads · 7mo
Qwen · Free Endpoint

qwen3-coder-480b-a35b-instruct

Excels at agentic coding and browser use, supports 256K context, and delivers top results.
agentic coding · 5.11M downloads · 8mo