Try NVIDIA NIM APIs

17 results for

Qwen

Downloadable

qwen-image

Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.

Model

Text-to-Image

1mo

Runs the DEFT embed-then-mine workflow for VCN AOI iterations — embeds the gap-analysis target parquet, embeds a source pool, and mines nearest-neighbour source images for downstream augmentation. Use as the immediate next step after `tao-route-visual-cha

Skill

Developer

647

17d

Mistral AI

Downloadable

Free Endpoint

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

Model

code generation

13M

3mo

Qwen

Downloadable

Free Endpoint

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

Model

tool calling

10M

3mo

DGX Spark

1 HR

FLUX.1 Dreambooth LoRA Fine-tuning

Fine-tune FLUX.1-dev 12B model using Dreambooth LoRA for custom image generation

Playbook

Image Generation

8mo

NVIDIA

Free Endpoint

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

Model

autonomous vehicles

29d

Microsoft

Downloadable

TRELLIS

Microsoft TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

Model

text-to-3d

10mo

Robotics

Enterprise

Synthetic Manipulation Motion Generation for Robotics

Generate exponentially large amounts of synthetic motion trajectories for robot manipulation from just a few human demonstrations.

Blueprint

synthetic data

4mo

Qwen

Downloadable

Free Endpoint

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

Model

MoE

13M

4mo

Google

Free Endpoint

paligemma

Vision-language model that interprets text and visual inputs to produce informative responses.

Model

image

10K

CLIP vision-language model for image-text retrieval, zero-shot classification, embedding extraction, ONNX export, and TensorRT deployment. Use when fine-tuning or training CLIP, running zero-shot classification, computing image embeddings, or deploying CL

Skill

AI Engineer

651

17d

OCRNet for scene text recognition. Recognizes text content from cropped text-region images and supports CTC and attention-based decoders. Use when training, evaluating, exporting, pruning, quantizing, retraining, or running inference for a TAO OCRNet mode

Skill

Developer

649

17d

OCDNet for scene text detection. Detects arbitrary-oriented text regions in natural images using a differentiable binarization approach. Use when training, evaluating, exporting, pruning, quantizing, retraining, or running inference for a TAO OCDNet model

Skill

AI Engineer

646

17d

Stability AI

Downloadable

stable-diffusion-3.5-large

Stable Diffusion 3.5 is a popular text-to-image generation model.

Model

Text-to-Image

10mo

NVIDIA

Downloadable

nemotron-ocr-v2

Nemotron OCR v2 is a state-of-the-art multilingual text recognition model designed for robust end-to-end optical character recognition (OCR) on complex real-world images.

Model

Table Extraction

Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or

Skill

Developer

947

29d

Build a per-target knowledge-base markdown next to the active profile by walking the BSP root and source tree. Use after init-image / init-source; not for editing profile fields.

Skill

Developer

211