Try NVIDIA NIM APIs

Free Endpoint

nemotron-voicechat

Nemotron 3 Voicechat

English

1.91K

2mo

Items per page

of 2 pages

Google

Free Endpoint

gemma-2-2b-it

Advanced small language generative AI model for edge applications

Chat

1.02M

Google

Free Endpoint

gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

43.86M

10mo

Free Endpoint

nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

Chat

1.72M

Free Endpoint

usdcode

State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

Digital Twin

11mo

General

LaunchableEnterprise

Build a Video Search and Summarization (VSS) Agent

Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A

Blueprint

NVIDIA AI

3mo

Google

Free Endpoint

gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

3.74M

10mo

DownloadableFree Endpoint

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

MoE

61.03M

2mo

Microsoft

DownloadableFree Endpoint

phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments

Chat

476K

Upstage

Free Endpoint

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

Non-Commercial Use Only

480K

OpenAI

DownloadableFree Endpoint

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

reasoning

54.33M

10mo

OpenAI

DownloadableFree Endpoint

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

reasoning

20.07M

10mo

DownloadableFree Endpoint

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

44.55K318K

DownloadableFree Endpoint

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

26.08K1.24M

Mistral AI

DownloadableFree Endpoint

mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

983K

10mo

DownloadableFree Endpoint

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

3.95M

DownloadableFree Endpoint

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

28.97M

11mo

DownloadableFree Endpoint

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

5.19M

10mo

DownloadableFree Endpoint

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

3.21M

10mo

Mistral AI

DownloadableFree Endpoint

ministral-14b-instruct-2512

A general purpose VLM ideal for chat and instruction based use cases

3.51M

6mo

Qwen

DownloadableFree Endpoint

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

10.34M

3mo

Qwen

DownloadableFree Endpoint

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

MoE

12.46M

3mo

Mistral AI

Free Endpoint

mistral-large-3-675b-instruct-2512

A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.

3.23M

6mo

Microsoft

Free Endpoint

phi-4-multimodal-instruct

Cutting-edge open multimodal model exceling in high-quality reasoning from image and audio inputs.