Try NVIDIA NIM APIs

Free Endpoint

nemotron-voicechat

Nemotron 3 Voicechat

English

1mo

Items per page

of 2 pages

2.61K

Google

Free Endpoint

gemma-2-2b-it

Advanced small language generative AI model for edge applications

524K

11mo

Google

Free Endpoint

gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation

495K

9mo

Moonshotai

Deprecation in 7dFree Endpoint

kimi-k2-instruct

State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities

coding

11.05M

9mo

Free Endpoint

nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

648K

Free Endpoint

usdcode

State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

Digital Twin

10mo

Google

Free Endpoint

gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation

1.74M

9mo

OpenAI

Downloadable

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

reasoning

28.78M

9mo

Downloadable

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

2.58M

10mo

Downloadable

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat

26.83K466K

11mo

Downloadable

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

17.29K1.02M

11mo

Downloadable

mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

483K

11mo

Downloadable

mixtral-8x22b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning

2.2M

9mo

Downloadable

mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning

576K

9mo

Downloadable

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

MoE

46.68M

1mo

Microsoft

Downloadable

phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments

576K

11mo

Upstage

Free Endpoint

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

Non-Commercial Use Only

303K

OpenAI

Downloadable

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

reasoning

11.98M

9mo

Downloadable

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

18.41M

10mo

Downloadable

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

math

2.43M

9mo

Downloadable

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

math

2.82M

9mo

Downloadable

ministral-14b-instruct-2512

A general purpose VLM ideal for chat and instruction based use cases

language generation

2.18M

5mo

Qwen

Downloadable

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

tool calling

9.24M

2mo

Qwen

Downloadable

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.