Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Text-to-text

52 models

Free Endpoint

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

1.49M

1w

Downloadable

minimax-m2.5

MiniMax M2.5 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

4.19M

2w

Free Endpoint

deepseek-v3.2

State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.

16.35M

2mo

Downloadable

qwen3-next-80b-a3b-thinking

80B-parameter AI model with hybrid reasoning, an MoE architecture, and support for 119 languages.

4.24M

6mo

Free Endpoint

deepseek-v3.1

DeepSeek V3.1 Instruct is a hybrid AI model with fast reasoning, 128K context, and strong tool use.

13.99M

6mo

Downloadable

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

8.46M

7mo

Downloadable

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.

41.01M

7mo

Downloadable

teuken-7b-instruct-commercial-v0.4

Multilingual 7B LLM, instruction-tuned on all 24 EU languages for stable, culturally aligned output.

530K

7mo

Downloadable

eurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

4.72K / 530K

8mo

Downloadable

gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in the Indonesian language and its dialects.

531K

8mo

Downloadable

gemma-3-1b-it

A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications.

4.34K / 554K

9mo

Downloadable

phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory/compute-constrained environments.

2.79M

9mo

Downloadable

qwen2.5-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, etc.

Chinese Language Generation

1.77M

9mo

Downloadable

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

23.28M

9mo

Free Endpoint

nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

536K

9mo

Free Endpoint

granite-guardian-3.0-8b

Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior

494K

1y

Free Endpoint

llama-3.1-nemotron-70b-reward

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

438K

1y

Downloadable

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

15.84K / 690K

9mo

Downloadable

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

16K / 330K

9mo

Free Endpoint

qwen2-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, etc.

Chinese Language Generation

728K

9mo

Free Endpoint

dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

643K

9mo

Free Endpoint

jamba-1.5-mini-instruct

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

571K

9mo

Free Endpoint

nemotron-mini-4b-instruct

SLM optimized for on-device inference and fine-tuned for roleplay, RAG, and function calling.

574K

1y

Free Endpoint

mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

language generation

4.75K

1y
