Copyright © 2024 NVIDIA Corporation

Sorting by Most Recent

phi-3.5-vision-instruct

Cutting-edge open multimodal model excelling at high-quality reasoning from images.

Vision Assistant

Visual Question Answering

Vision foundation model capable of performing diverse computer vision and vision language tasks.

Image Classification

NvClip generates vector embeddings for the given image or text.

Image Classification

phi-3-vision-128k-instruct

Cutting-edge open multimodal model excelling at high-quality reasoning from images.

Groundbreaking multimodal model designed to understand and reason about visual elements in images.

One-shot visual language understanding model that translates images of plots into tables.

Visual Language Understanding

Multimodal model for a wide range of tasks, including image understanding and language generation.