Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

2 models

Downloadable, Free Endpoint

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

coding, MoE, fast, agentic

17M API calls in the last 30 days

Last updated on April 24, 2026

Downloadable, Free Endpoint

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.

MoE, reasoning, coding, agentic

7M API calls in the last 30 days

Last updated on April 24, 2026