Try NVIDIA NIM APIs


11 results for

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Model

Image-Text Retrieval

2.8M

Qwen

Downloadable, Free Endpoint

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

Model

MoE

12.46M

3mo

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Model

Image-Text Retrieval

1.71M

NVIDIA

Free Endpoint

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

Model

autonomous vehicles

1.58K

Black-forest-labs

Downloadable

flux.2-klein-4b

FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lightning speed.

Model

image editing

270K

2mo

Microsoft

Downloadable

TRELLIS

Microsoft TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

Model

text-to-3d

3.99K

9mo

NVIDIA

Downloadable, Free Endpoint

llama-3.1-nemotron-nano-vl-8b-v1

Multimodal vision-language model that understands text and images and produces informative responses.

Model

doc intelligence

9.93M

11mo

Mistral AI

Downloadable, Free Endpoint

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

Model

code generation

16.19M

2mo

Google

Free Endpoint

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

Model

image

10.29K

NVIDIA

Downloadable

vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

Model

Interactive Annotation

803

Qwen

Downloadable, Free Endpoint

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

Model

B200

10.34M

3mo