Try NVIDIA NIM APIs

5 results

Google

Free Endpoint

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

Model

image

10K

Google

Free Endpoint

gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

Model

language generation

34M

Microsoft

Deprecation in 2d

Free Endpoint

phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

Model

Speech Recognition

173K

Google

Free Endpoint

gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

Model

language generation

NVIDIA

Downloadable

Free Endpoint

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

Model

language generation

8mo