Try NVIDIA NIM APIs

9 results for

cosmos-predict1-5b

Generates future frames of a physics-aware world state from just an image or short video prompt, for physical AI development.

Model

Synthetic Data Generation

22.25K

11mo

NVIDIA

cosmos-reason1-7b

Reasoning vision language model (VLM) for physical AI and robotics.

Model

video understanding

15.93K

6mo

NVIDIA

cosmos-reason2-8b

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

Model

video understanding

194K

2mo

NVIDIA

cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

Model

Synthetic Data Generation

15.87K

8mo

NVIDIA

cosmos-transfer2.5-2b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

Model

Synthetic Data Generation

NVIDIA

ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition, respectively.

Model

Optical Character Recognition

798

Google

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses.

Model

image

327K

NVIDIA

retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

Model

Object Detection

794

NVIDIA

visual-changenet

Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask.

Model

image

615
