Try NVIDIA NIM APIs

Copyright © 2026 NVIDIA Corporation

20 results for

Labels (1)

LLM

Financial Services

Launchable, Developer Example

AI Model Distillation for Financial Data

Distill and deploy domain-specific AI models from unstructured financial data to generate market signals efficiently—scaling your workflow with the NVIDIA Data Flywheel Blueprint for high-performance, cost-efficient experimentation.

algorithmic trading

4mo

Healthcare & Life Sciences

Launchable, Developer Example

Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM.

4mo

20 MINS

CLI Coding Agent

Build local CLI coding agents with Ollama

1mo

60 MIN

cuTile Kernels

Run cuTile kernel benchmarks, FMHA implementation, and LLM inference on DGX Spark and B300

1mo

Downloadable, Free Endpoint

diffusiongemma-26b-a4b-it

Diffusion-based 26B parameter LLM enabling parallel token generation for real-time text apps

171K

7d

RTX Workstation

8 MIN

How to Fine-Tune an LLM on NVIDIA GPUs With Unsloth

Fine-tune popular AI models faster in Unsloth with NVIDIA RTX AI PCs, RTX PRO workstations, and DGX Spark—plus explore the new Nemotron Nano 3 family of open models.

16d

Downloadable

llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

nemo guardrails

160K

1y

Downloadable

llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

nemo guardrails

149K

1y

Free Endpoint

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation

336K

7mo

Free Endpoint

llama-guard-4-12b

Multimodal model that classifies the safety of input prompts as well as output responses.

LLM Multimodal Safety

222K

11mo

General

Launchable, Developer Example

LLM Router

Route LLM requests to the best model for the task at hand.

4mo

30 MINS

Local Coding Agent

Run local CLI coding agents with Ollama on DGX Station (NVIDIA GB300) using glm-4.7-flash (fast) or unsloth/GLM-4.7-GGUF:Q8_0 (best quality)

2mo

30 MIN

Nanochat Training

Train a small ChatGPT-style LLM (nanochat) with tokenizer, pretraining, midtraining, and SFT on DGX Station with GB300 Ultra

3mo

Downloadable

nemoguard-jailbreak-detect

Industry-leading jailbreak classification model for protection against adversarial attempts

nemo guardrails

13.5K

11mo

Free Endpoint

nemotron-3-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

230K

2mo

30 MIN

Nemotron-3-Nano with llama.cpp

Run Nemotron-3-Nano-30B model using llama.cpp on DGX Spark

6mo

Free Endpoint

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

337K

15d

30 MINS

OpenClaw 🦞

Run OpenClaw locally on DGX Spark with a vLLM-served local model

3mo

30 MIN

Run Hermes Agent with Local Models

Install and run the Hermes self-improving AI agent on DGX Spark.

1mo

30 MIN

Run models with llama.cpp on DGX Spark

Build llama.cpp with CUDA and serve models via an OpenAI-compatible API

2mo