Try NVIDIA NIM APIs

⌘KCtrl+K

Your Privacy Choices

Contact

Explore

⌘KCtrl+K

8 results for

Filters

Free Endpoint

Partner Endpoint

Download Available

Use Case

Retrieval Augmented Generation

Inference Providers

Together AI

Bitdeer AI

Deep Infra

GMI Cloud

CoreWeave

Publisher

NVIDIA

Qwen

DeepSeek AI

Mistral AI

Sarvamai

Sort By

Sarvamai

Downloadable

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian language tasks, programming, mathematical reasoning capabilities.

Model

coding

Items per page

of 1 pages

144K

9mo

Mistral AI

Downloadable

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

Model

code generation

8.35M

1mo

NVIDIA

Downloadable

nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

Model

thinking budget

377K

8mo

Qwen

Downloadable

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

Model

text-generation

17.92M

7mo

Qwen

Downloadable

qwen3-next-80b-a3b-thinking

80B parameter AI model with hybrid reasoning, MoE architecture, support for 119 languages.

Model

Reasoning

1.73M

7mo

DeepSeek AI

Deprecation in 5dFree Endpoint

deepseek-v3.1-terminus

DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.

Model

tool calling

6.4M

6mo

NVIDIA

Downloadable

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

Model

MoE

42.51M

1mo

NVIDIA

Free Endpoint

nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

Model

nemo retriever

118K

11mo