Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

6 models

Qwen

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

chat

5mo

ByteDance

seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

thinking budget

5mo

Qwen

qwen2.5-coder-7b-instruct

Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.

code completion

9mo

NVIDIA

mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

language generation

THUDM

chatglm3-6b

Supports Chinese and English for tasks including chatbots, content generation, coding, and translation.

Text Translation

7mo

Microsoft

phi-3-small-128k-instruct

Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.

chat

9mo
