Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

Labels (2): coding, reasoning

15 models

Downloadable

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

Today

Downloadable

minimax-m2.5

MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

9.77M

1mo

Free Endpoint

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

8.99M

2mo

Free Endpoint

glm-4.7

GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.

14.39M

2mo

Free Endpoint

devstral-2-123b-instruct-2512

State-of-the-art open code model with deep reasoning, 256k context, and unmatched efficiency.

4.65M

3mo

Free Endpoint

kimi-k2-instruct-0905

Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.

14M

6mo

Downloadable

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian language tasks, programming, and mathematical reasoning.

253K

8mo

Free Endpoint

kimi-k2-instruct

State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities

20.61M

8mo

Free Endpoint

magistral-small-2506

High-performance reasoning model optimized for efficiency and edge deployment

2.09M

8mo

Free Endpoint

granite-3.3-8b-instruct

Small language model fine-tuned for improved reasoning, coding, and instruction-following

78.41K

8mo

Free Endpoint

qwq-32b

Powerful reasoning model that thinks before answering and can achieve significantly enhanced performance on downstream tasks, especially hard problems.

2.23M

9mo

Downloadable

deepseek-r1-distill-llama-8b

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

2.31M

8mo

Downloadable

deepseek-r1-distill-qwen-32b

Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.

2.49K · 2.54M

10mo

Downloadable

deepseek-r1-distill-qwen-14b

Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.

1.88K · 2.17M

10mo

Free Endpoint

falcon3-7b-instruct

Instruction-tuned LLM achieving SoTA performance on reasoning, math, and general knowledge

1.74M

10mo
