Try NVIDIA NIM APIs

Copyright © 2026 NVIDIA Corporation

5 results for

Filters (2)

Free Endpoint: 5
Partner Endpoint: 4
Download Available: 1

Use Case

Image-to-Text: 5
Synthetic Data Generation: 0

Inference Providers

Fireworks AI: 4
Deep Infra: 2
Bitdeer AI: 2
Together AI: 1
CoreWeave: 1

Publisher

Meta: 2
Google: 2
Microsoft: 1
NVIDIA: 0

Labels (2)

Vision Assistant

language generation

Free Endpoint

phi-3.5-vision-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

Vision Assistant

625K

1y

Free Endpoint

paligemma

Vision-language model adept at comprehending text and visual inputs to produce informative responses.

41.97K

1y

Free Endpoint

gemma-3-27b-it

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

6.39M

10mo

Free Endpoint

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual MoE model with 128 experts and 17B active parameters.

7.03M

8mo

Downloadable, Free Endpoint

llama-4-scout-17b-16e-instruct

A multimodal, multilingual MoE model with 16 experts and 17B active parameters.

language generation

22.2K

8mo
