Try NVIDIA NIM APIs

Copyright © 2026 NVIDIA Corporation

72 results for

General

Launchable, Developer Example

LLM Router

Route LLM requests to the best model for the task at hand.

4mo

1 HR

TRT LLM for Inference

Install and use TensorRT-LLM on DGX Spark

9mo

Benchmark Jetson LLM/VLM serving performance across vLLM, llama.cpp, and Ollama with structured JSON output.

813

24d

Stand up vLLM or SGLang serving on Jetson, using upstream vLLM on Thor and Orin JetPack 7.2+, and NVIDIA-AI-IOT vLLM on older Orin.

822

24d

30 MIN

LLM Inference with SGLang

Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional)—prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance

1mo

30 MIN

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Spark using NemoClaw and vLLM in a secure sandbox, with optional Telegram.

3mo

30 MIN

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Station using NemoClaw in a secure sandbox, with optional Telegram.

2mo

RTX Workstation

8 MIN

How to Fine-Tune an LLM on NVIDIA GPUs With Unsloth

Fine-tune popular AI models faster in Unsloth with NVIDIA RTX AI PCs, RTX PRO workstations, and DGX Spark—plus explore the new Nemotron Nano 3 family of open models.

1mo

30 MIN

NIM on Spark

Deploy a NIM on Spark

9mo

Downloadable

nemoguard-jailbreak-detect

Industry-leading jailbreak classification model for protection against adversarial attempts

nemo guardrails

13K

1y

30 MIN

Run Hermes Agent with Local Models

Install and run the Hermes self-improving AI agent on DGX Spark.

2mo

30 MIN

Run models with llama.cpp on DGX Spark

Build llama.cpp with CUDA and serve models via an OpenAI-compatible API

3mo

Downloadable, Free Endpoint

diffusiongemma-26b-a4b-it

Diffusion-based 26B parameter LLM enabling parallel token generation for real-time text apps

4M

1mo

Healthcare & Life Sciences

Launchable, Developer Example

Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

4mo

20 MIN

CLI Coding Agent

Build local CLI coding agents with Ollama

2mo

30 MIN

Local Coding Agent

Run local CLI coding agents with Claude Code and Ollama on DGX Station (NVIDIA GB300) using qwen3.6:27b

3mo

30 MIN

OpenClaw 🦞

Run OpenClaw locally on DGX Spark with a vLLM-served local model

4mo

Downloadable, Free Endpoint

glm-5.2

GLM-5.2 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

8M

13d

30 MIN

Nanochat Training

Train a small ChatGPT-style LLM (nanochat) with tokenizer, pretraining, midtraining, and SFT on DGX Station with GB300 Ultra

4mo

Downloadable

llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

nemo guardrails

160K

1y

Free Endpoint

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation

336K

8mo

Downloadable

llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

nemo guardrails

149K

1y

Free Endpoint

llama-guard-4-12b

Multimodal model that classifies the safety of input prompts as well as output responses.

LLM Multimodal Safety

357K

1y

Downloadable, Free Endpoint

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

2M

1mo