Try NVIDIA NIM APIs

Deploy an AI-powered coding assistant on DGX Spark that delivers expert CUDA-aware chat, real-time code completion, and retrieval-augmented generation grounded in authoritative GPU programming knowledge—powered by NVIDIA NIM microservices.

Blueprint

Last updated on June 10, 2026

DGX Spark

30 MIN

Vibe Coding in VS Code

Use DGX Spark as a local or remote Vibe Coding assistant with Ollama and Continue

Playbook

VibeCoding
Spark

Last updated on October 10, 2025

DGX Spark

5 MIN

VS Code

Install and use VS Code locally or remotely

Playbook

Spark

Last updated on October 7, 2025

Google

Downloadable
Free Endpoint

diffusiongemma-26b-a4b-it

Diffusion-based 26B parameter LLM enabling parallel token generation for real-time text apps

Model

diffusion-llm

text-to-text
reasoning

4M API calls in the last 30 days

Last updated on June 10, 2026

NVIDIA

Free Endpoint

nemotron-mini-4b-instruct

SLM optimized for on-device inference and fine-tuned for roleplay, RAG, and function calling

Model

Chat

Text-to-Text
Language Generation

3M API calls in the last 30 days

Last updated on August 26, 2024

NVIDIA

Free Endpoint

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

Model

autonomous vehicles

Physical AI
robotics
text-to-world
image-to-world
Synthetic Data Generation

2K API calls in the last 30 days

Last updated on June 1, 2026

DGX Spark

1 HR

FLUX.1 Dreambooth LoRA Fine-tuning

Fine-tune FLUX.1-dev 12B model using Dreambooth LoRA for custom image generation

Playbook

Image Generation

ComfyUI
DGX
LoRA
Spark
Fine-tuning
Text-to-Image

Last updated on October 7, 2025

Google

Downloadable
Free Endpoint

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

Model

reasoning

coding
text-to-text
agentic

6M API calls in the last 30 days

Last updated on April 2, 2026

Thinkingmachines

Downloadable
Free Endpoint

inkling

Inkling is a multimodal (text + image) reasoning model from Thinking Machines — a Mamba-hybrid, 256-expert Mixture-of-Experts architecture with tool use and switchable reasoning.

Model

text-to-text

reasoning
image-to-text
multimodal

Last updated on July 16, 2026

Meta

Downloadable
Free Endpoint

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

Model

Chat

Text-to-Text
Language Generation
Code Generation

5M API calls in the last 30 days

Last updated on June 12, 2025

Meta

Downloadable
Free Endpoint

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Model

chat

Text-to-Text
Language Generation
Code Generation

40K downloads in the last 30 days

545K API calls in the last 30 days

Last updated on May 21, 2025

Meta

Downloadable
Free Endpoint

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Model

Chat

Text-to-Text
Language Generation
Code Generation

27K downloads in the last 30 days

1M API calls in the last 30 days

Last updated on May 22, 2025

Minimaxai

Free Endpoint

minimax-m3

MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.

Model

coding

text-to-text
reasoning

10M API calls in the last 30 days

Last updated on June 12, 2026

Upstage

Deprecated
Free Endpoint

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

Model

Non-Commercial Use Only

chat
Text-to-Text
Language Generation
Large Language Models

527K API calls in the last 30 days

Last updated on April 10, 2025

Microsoft

Downloadable

TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

Model

text-to-3d

Run-on-RTX
image-to-3d

18K API calls in the last 30 days

Last updated on September 3, 2025

OpenAI

Downloadable
Free Endpoint

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.

Model

reasoning

text-to-text
chat
math

45M API calls in the last 30 days

Last updated on August 5, 2025

OpenAI

Downloadable
Free Endpoint

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

Model

reasoning

text-to-text
chat
math

19M API calls in the last 30 days

Last updated on August 5, 2025

Meta

Downloadable
Free Endpoint

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

Model

Chat

Text-to-Text
Language Generation
Run-on-RTX
Code Generation

19M API calls in the last 30 days

Last updated on July 9, 2025

Meta

Downloadable
Free Endpoint

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

Model

Instruction following

Math
Reasoning
Text-to-Text
Code Generation

27M API calls in the last 30 days

Last updated on June 12, 2025

Mistral AI

Downloadable
Free Endpoint

mixtral-8x7b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

Model

Advanced Reasoning

Chat
Text-to-Text
Large Language Models
Code Generation

1M API calls in the last 30 days

Last updated on July 18, 2025

General

Launchable
Developer Example

PDF to Podcast

Transform PDFs into AI podcasts for engaging on-the-go audio content.

Blueprint

NVIDIA AI

AI Agent
Text-to-Speech
Multi-modal
Conversational AI
PDF-to-Podcast

Last updated on February 17, 2026