Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices.

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

math

2.74M

8mo

Mistral AI

Free Endpoint

magistral-small-2506

High-performance reasoning model optimized for efficiency and edge deployment.

coding

1.25M

9mo

NVIDIA

Deprecation in 4d, Downloadable

llama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

math

6.57M

9mo

Qwen

Deprecated, Free Endpoint

qwq-32b

Powerful reasoning model that can achieve significantly enhanced performance on downstream tasks, especially hard problems.

coding

989K

9mo

NVIDIA

Downloadable

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

math

1.33M

9mo

NVIDIA

Downloadable

llama-3.1-nemotron-nano-8b-v1

Model with leading reasoning and agentic AI accuracy for PC and edge.

math

743K

9mo

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling.

Instruction following

12.68M

10mo
