Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Filters

Free Endpoint

Partner Endpoint

Download Available

Use Case

Image-to-Text

Drug Discovery

Retrieval Augmented Generation

Speech-to-Text

Code Generation

Inference Providers

Deepinfra

Together AI

Bitdeer

GMI Cloud

CoreWeave

Publisher

Multimodal vision-language model that understands text and images and generates informative responses

10.15M

11mo

Cutting-edge vision-language model excelling in high-quality reasoning from images.

1.67M

Cutting-edge vision-language model excelling in high-quality reasoning from images.

2.69M

Free Endpoint

Vision-language model adept at comprehending text and visual inputs to produce informative responses

10.22K