Try NVIDIA NIM APIs

10 results for

DeepSeek AI

deepseek-v3.2

State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.

Model

long context

15.64M

2mo

Moonshotai

kimi-k2-instruct-0905

Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.

Model

long-context

10.04M

5mo

Moonshotai

kimi-k2-thinking

Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use.

Model

Conversational

3.22M

3mo

Qwen

qwen3-coder-480b-a35b-instruct

Excels in agentic coding and browser use and supports 256K context, delivering top results.

Model

agentic coding

3.83M

6mo

NVIDIA

nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more.

Model

MoE

12.32M

2mo

NVIDIA

llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

Model

nemo retriever

151K

7mo

Microsoft

phi-3-small-128k-instruct

Cutting-edge, lightweight, long-context open language model excelling in high-quality reasoning.

Model

chat

590K

9mo

Qwen

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

Model

chat

11.15M

5mo

ByteDance

seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

Model

thinking budget

3.46M

6mo

NVIDIA

llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

Model

nemo retriever

6.63M

7mo
