Publisher
Use Case
NIM Type
Sorting by Most Recent
meta / 
llama-3.2-11b-vision-instruct

Cutting-edge vision-language model exceling in high-quality reasoning from images.

meta / 
llama-3.2-90b-vision-instruct

Cutting-edge vision-Language model exceling in high-quality reasoning from images.

nvidia / 
vila

Multi-modal vision-language model that understands text/images and generates informative responses

briaai / 
BRIA-2.3

An enterprise-grade text-to-image model trained on a compliant dataset produces high quality images.

nvidia / 
usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

shutterstock / 
edify-360-hdri-early-access

Shutterstock Early Access preview of Generative 3D service for 360 HDRi generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.

Shutterstock / 
edify-3d

Shutterstock Generative 3D service for 3D asset generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries

nvidia / 
nvclip

NV-CLIP is a multimodal embeddings model for image and text.

stabilityai / 
stable-diffusion-3-medium

Advanced text-to-image model for generating high quality images

google / 
paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

google / 
deplot

One-shot visual language understanding model that translates images of plots into tables.

nvidia / 
neva-22b

Multi-modal vision-language model that understands text/images and generates informative responses

stabilityai / 
sdxl-turbo

A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation