Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA · Launch from Hugging Face (Beta)

NVIDIA

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

language generation

4.73M

5mo

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual 128-expert MoE model with 17B active parameters.

language generation

11.03M

9mo

Google

Free Endpoint

gemma-3-27b-it

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

Vision Assistant

5.98M

11mo

Microsoft

Deprecated · Free Endpoint

phi-3.5-vision-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

Vision Assistant

1.2M

Google

Free Endpoint

paligemma

Vision-language model adept at comprehending text and visual inputs to produce informative responses.

image

41.63K
