Try NVIDIA NIM APIs

DeepSeek AI

Downloadable, Free Endpoint

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

Model

MoE

15M

2mo

DeepSeek AI

Downloadable, Free Endpoint

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with efficient MoE architecture for coding tasks.

Model

MoE

2mo

Google

Downloadable, Free Endpoint

diffusiongemma-26b-a4b-it

Diffusion-based 26B-parameter LLM enabling parallel token generation for real-time text apps.

Model

diffusion-llm

27d

Abacus.AI

Free Endpoint

dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

Model

Code Generation

783K

Google

Free Endpoint

gemma-2-2b-it

Advanced small generative language model for edge applications.

Model

Chat

Google

Free Endpoint

gemma-3n-e2b-it

An edge computing AI model that accepts text, audio, and image input, ideal for resource-constrained environments.

Model

language generation

34M

11mo

Google

Free Endpoint

gemma-3n-e4b-it

An edge computing AI model that accepts text, audio, and image input, ideal for resource-constrained environments.

Model

language generation

11mo

Google

Downloadable, Free Endpoint

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

Model

reasoning

3mo

Z.ai

Downloadable, Free Endpoint

glm-5.2

GLM-5.2 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

Model

Agentic AI

OpenAI

Downloadable, Free Endpoint

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within a single 80GB GPU.

Model

reasoning

46M

11mo

OpenAI

Downloadable, Free Endpoint

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient reasoning and math.

Model

reasoning

18M

11mo

NVIDIA

Downloadable, Free Endpoint

ising-calibration-1-35b-a3b

Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.

Model

Quantum

332K

2mo

Moonshotai

Downloadable, Free Endpoint

kimi-k2.6

1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.

Model

Multimodal

16M

2mo

Meta

Downloadable, Free Endpoint

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

Model

Chat

Meta

Downloadable, Free Endpoint

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

Model

Chat

20M

NVIDIA

Downloadable, Free Endpoint

llama-3.1-nemotron-nano-8b-v1

Model with leading reasoning and agentic AI accuracy for PC and edge.

Model

advanced reasoning

NVIDIA

Downloadable, Free Endpoint

llama-3.1-nemotron-nano-vl-8b-v1

Multimodal vision-language model that understands text and images and creates informative responses.

Model

doc intelligence

10M

Meta

Downloadable, Free Endpoint

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Model

Image-Text Retrieval

Meta

Downloadable, Free Endpoint

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Model

Code Generation

40K, 290K

Meta

Downloadable, Free Endpoint

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Model

Language Generation

27K, 1M

Meta

Downloadable, Free Endpoint

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Model

Image-Text Retrieval

Meta

Downloadable, Free Endpoint

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling.

Model

Instruction following

19M

NVIDIA

Downloadable, Free Endpoint

llama-3.3-nemotron-super-49b-v1

High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

Model

advanced reasoning

11mo

NVIDIA

Downloadable, Free Endpoint

llama-3.3-nemotron-super-49b-v1.5

High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

Model

advanced reasoning

11mo