Try NVIDIA NIM APIs

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Model

Image-Text Retrieval

1.67M

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Model

Image-Text Retrieval

2.69M

DGX Spark

1 HR

Vision-Language Model Fine-tuning

Fine-tune vision-language models for image and video understanding tasks using Qwen2.5-VL and InternVL3.

Playbook

DGX

8mo

Google

Free Endpoint

paligemma

Vision-language model adept at comprehending text and visual inputs to produce informative responses.

Model

image

10.22K

NVIDIA

Downloadable, Free Endpoint

llama-3.1-nemotron-nano-vl-8b-v1

Multimodal vision-language model that understands text and images and produces informative responses.

Model

doc intelligence

10.15M

11mo

Qwen

Downloadable, Free Endpoint

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

Model

MoE

13.15M

3mo