Try NVIDIA NIM APIs

7 results for

cosmos-nemotron-34b

Multi-modal vision-language model that understands text, images, and video and generates informative responses.

Model

VLM

DGX Spark

20 MIN

Live VLM WebUI

Real-time Vision Language Model interaction with webcam streaming

Playbook

Vision AI

2mo

NVIDIA

ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition, respectively.

Model

Optical Character Recognition

798

Google

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

Model

image

327K

Qwen

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

Model

MoE

5.42M

NVIDIA

retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

Model

Object Detection

794

NVIDIA

visual-changenet

Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask.

Model

image

615
