Copyright © 2026 NVIDIA Corporation

7 results

Publisher: NVIDIA (7)

Labels (1)

Inference

60 MIN

cuTile Kernels

Run cuTile kernel benchmarks, an FMHA implementation, and LLM inference on DGX Spark and B300

1mo

30 MIN

LLM Inference with SGLang

Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional), covering prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance

20d

30 MIN

LM Studio on DGX Spark

Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.

4mo

30 MIN

Nemotron-3-Nano with llama.cpp

Run the Nemotron-3-Nano-30B model using llama.cpp on DGX Spark

6mo

30 MIN

Run models with llama.cpp on DGX Spark

Build llama.cpp with CUDA and serve models via an OpenAI-compatible API

2mo

RTX Workstation

30 MIN

vLLM for Inference

Install and use vLLM on NVIDIA RTX Pro 6000

5d

30 MIN

vLLM for Inference

Install and use vLLM on DGX Station

3mo