Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Labels: VLM, image

3 models

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

6.11M

3w

cosmos-nemotron-34b

Multimodal vision-language model that understands text, images, and video and generates informative responses.

6

1y

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

330K

1y