Try NVIDIA NIM APIs

Skip to main content

Your Privacy Choices

Copyright © 2026 NVIDIA Corporation

7 results for

Filters (1)

Free Endpoint

0

Partner Endpoint

0

Download Available

0

Developer Example

0

Launchable

0

Use Case

Image-to-Text

0

Code Generation

0

Retrieval Augmented Generation

0

Text-to-Embedding

0

Inference Providers

Deepinfra

0

Together AI

0

Bitdeer

0

GMI Cloud

0

CoreWeave

0

Publisher

NVIDIA

7

Mistral AI

0

Google

0

Meta

0

OpenAI

0

Audience

AI Engineer

0

Developer

0

Ml Engineer

0

Application Developer

0

Data Scientist

0

Blueprint Type

NVIDIA AI

0

Domain

AI And Machine Learning

0

NIM Container GPUs

B200

0

H200

0

L40S

0

A100 PG509 200

0

A100 SXM4 80GB

0

Library

Video Search and Summarization (VSS)

0

NeMo Megatron Bridge

0

TAO Toolkit

0

Megatron Core

0

NeMoClaw

0

Labels (1)

Inference

Sort By

30 MIN

vLLM for Inference

Install and use vLLM on DGX Station

3mo

Items per page

of 1 pages

RTX Workstation

30 MIN

vLLM for Inference

Install and use vLLM on NVIDIA RTX Pro 6000

7d

30 MIN

LLM Inference with SGLang

Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional)—prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance

23d

30 MIN

LM Studio on DGX Spark

Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.

4mo

30 MIN

Nemotron-3-Nano with llama.cpp

Run Nemotron-3-Nano-30B model using llama.cpp on DGX Spark

6mo

30 MIN

Run models with llama.cpp on DGX Spark

Build llama.cpp with CUDA and serve models via an OpenAI-compatible API

2mo

60 MIN

cuTile Kernels

Run cuTile kernel benchmarks, FMHA implementation, and LLM inference on DGX Spark and B300

1mo