Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

NVIDIA

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

autonomous vehicles

1mo

Mistral AI

Downloadable, Free Endpoint

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

code generation

13M

3mo

Black-forest-labs

Downloadable

flux.2-klein-4b

FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lightning speed

image editing

271K

4mo

Qwen

Downloadable, Free Endpoint

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

tool calling

10M

4mo

Qwen

Downloadable, Free Endpoint

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

MoE

13M

4mo

Microsoft

Downloadable

TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

text-to-3d

10mo

NVIDIA

Downloadable, Free Endpoint

llama-3.1-nemotron-nano-vl-8b-v1

Multimodal vision-language model that understands text and images and generates informative responses

doc intelligence

10M

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval

Google

Free Endpoint

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image

10K

NVIDIA

Downloadable

vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

Interactive Annotation

824