Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Labels: reasoning, coding

9 models

Mistral AI

Downloadable

mistral-medium-3.5-128b

A high-performing model for text generation, coding, and agentic use cases.

coding

2.74M

DeepSeek AI

Downloadable

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.

coding

9.49M

1mo

Z.ai

Downloadable

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

Agentic AI

26.16M

1mo

Minimaxai

Free Endpoint

minimax-m2.7

MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

coding

14.29M

1mo

Google

Downloadable

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

coding

6.82M

1mo

Minimaxai

Deprecated, Downloadable

minimax-m2.5

MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

coding

4.72M

2mo

Stepfun-ai

Free Endpoint

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

Agentic

12.79M

3mo

Sarvamai

Downloadable

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.

coding

317K

10mo

Mistral AI

Deprecated, Free Endpoint

magistral-small-2506

High-performance reasoning model optimized for efficiency and edge deployment.

coding

776K

10mo