Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Label: coding

12 models

Stepfun-ai

Downloadable, Free Endpoint

step-3.7-flash

A sparse MoE multimodal reasoning model good for enterprise, agentic and coding tasks.

B200

1.21M

Mistral AI

Downloadable, Free Endpoint

mistral-medium-3.5-128b

A high-performing model for text generation, coding, and agentic use cases.

coding

3.05M

1mo

DeepSeek AI

Downloadable, Free Endpoint

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

B200

13.19M

1mo

DeepSeek AI

Downloadable

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.

B200

8.1M

1mo

Z.ai

Downloadable, Free Endpoint

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

B200

24.9M

1mo

Minimaxai

Downloadable, Free Endpoint

minimax-m2.7

MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

B200

13.46M

1mo

Google

Downloadable, Free Endpoint

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

B200

5.76M

2mo

Minimaxai

Deprecated, Downloadable

minimax-m2.5

MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

B200

2.5M

3mo

Stepfun-ai

Free Endpoint

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

Agentic

11.52M

4mo

Qwen

Free Endpoint

qwen3-coder-480b-a35b-instruct

Excels at agentic coding and browser use, with support for a 256K-token context.

agentic coding

5.13M

9mo

Sarvamai

Downloadable, Free Endpoint

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian language tasks, programming, and mathematical reasoning.

coding

286K

10mo

Mistral AI

Deprecated, Free Endpoint

magistral-small-2506

High-performance reasoning model optimized for efficiency and edge deployment.

coding

364K

10mo