Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

code generation

8.24M

1mo

Qwen

Downloadable

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

tool calling

7.65M

1mo

Qwen

Downloadable

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

MoE

9.6M

2mo

Microsoft

Downloadable

TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

text-to-3d

2.56K

7mo

NVIDIA

Downloadable

llama-3.1-nemotron-nano-vl-8b-v1

Multimodal vision-language model that understands text and images and generates informative responses

doc intelligence

7.32M

10mo

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval

908K

11mo

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval

1.38M

11mo

NVIDIA

Downloadable

nvclip

NV-CLIP is a multimodal embeddings model for image and text.

Computer vision

57.45K

10mo

Google

Free Endpoint

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image

28.56K

NVIDIA

Downloadable

vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomy.

Interactive Annotation

757