Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices.

Labels (2): agentic, coding

9 models

DeepSeek AI

Downloadable

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

coding

361K

DeepSeek AI

Downloadable

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.

coding

781K

Z.ai

Downloadable

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

Agentic AI

2.53M

Google

Downloadable

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

coding

3.46M

Stepfun-ai

Free Endpoint

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

Agentic

9.07M

2mo

Mistral AI

Deprecation in 14d

Free Endpoint

devstral-2-123b-instruct-2512

State-of-the-art open code model with deep reasoning, 256K context, and unmatched efficiency.

coding

2.81M

4mo

Moonshotai

Free Endpoint

kimi-k2-instruct-0905

Follow-on version of Kimi-K2-Instruct with a longer context window and enhanced reasoning capabilities.

long-context

9.75M

7mo

Qwen

Free Endpoint

qwen3-coder-480b-a35b-instruct

Excels in agentic coding and browser use, supports 256K context, and delivers top results.

agentic coding

3.27M

7mo

Moonshotai

Free Endpoint

kimi-k2-instruct

State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities.

coding

15.32M

9mo