Extract false-positive and false-negative gaps from VLM binary-classification-question (BCQ, yes/no) predictions. Use when the user asks to "analyze VLM BCQ gaps", "extract VLM false positives and false negatives", or identify failure cases from a predict

Skill

Developer

1mo

Practical guidance for training MoE VLMs in Megatron Bridge. Compares FSDP and 3D-parallel approaches, using rounded lessons from Qwen3-VL, Qwen3-Next, and other multimodal experiments.

Skill

Developer

1mo

DGX Spark

1 HR

Vision-Language Model Fine-tuning

Fine-tune Vision-Language Models for image and video understanding tasks using Qwen2.5-VL and InternVL3

Playbook

DGX

9mo

NVIDIA

Downloadable

Free Endpoint

nemotron-3-nano-omni-30b-a3b-reasoning

Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.

Model

Image-to-Text

2mo

Google

Free Endpoint

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

Model

image

12K

Qwen

Downloadable

Free Endpoint

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

Model

MoE

16M

5mo

General

Launchable

Developer Example

LLM Router

Route LLM requests to the best model for the task at hand.

Blueprint

NVIDIA AI

4mo

Z.ai

Downloadable

Free Endpoint

glm-5.2

GLM-5.2 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

Model

Agentic AI

13d

DGX Spark

30 MIN

vLLM for Inference

Install and use vLLM on DGX Spark

Playbook

DGX

4mo

RTX Workstation

30 MIN

vLLM for Inference

Install and use vLLM on NVIDIA RTX Pro 6000

Playbook

vLLM

1mo

DGX Station

30 MIN

vLLM for Inference

Install and use vLLM on DGX Station

Playbook

vLLM

4mo

DGX Spark

1 HR

TRT LLM for Inference

Install and use TensorRT-LLM on DGX Spark

Playbook

DGX

9mo

Benchmark Jetson LLM/VLM serving performance across vLLM, llama.cpp, and Ollama with structured JSON output.

Skill

Developer

813

24d

Stand up vLLM or SGLang serving on Jetson, using upstream vLLM on Thor and Orin JetPack 7.2+, and NVIDIA-AI-IOT vLLM on older Orin.

Skill

AI Engineer

822

24d

DGX Station

30 MIN

LLM Inference with SGLang

Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional): prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance

Playbook

RadixAttention

1mo

Run Megatron-LM (MLM) and Megatron Bridge training with mock or real data. Covers correlation testing, available recipes, and multi-GPU examples.

Skill

Developer

1mo

DGX Spark

30 MIN