Try NVIDIA NIM APIs

Copyright © 2026 NVIDIA Corporation

11 results for

Free Endpoint

kimi-k2-instruct-0905

Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.

9.06M

5mo

Free Endpoint

kimi-k2-thinking

Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use.

2.88M

3mo

Free Endpoint

qwen3-coder-480b-a35b-instruct

Excels in agentic coding and browser use and supports 256K context, delivering top results.

3.61M

6mo

Downloadable

nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more.

11.55M

3mo

Downloadable

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more.

1.68M

5d

Free Endpoint

deepseek-v3.2

State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.

15.35M

3mo

Downloadable

llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

146K

7mo

Free Endpoint

phi-3-small-128k-instruct

Cutting-edge, lightweight, long-context open language model excelling in high-quality reasoning.

643K

9mo

Downloadable

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

11.85M

5mo

Free Endpoint

seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

3.75M

6mo

Downloadable

llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

6.2M

7mo
