Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
Launch from Hugging Face
Beta
7 models
Qwen
qwen3-next-80b-a3b-instruct
Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.
chat
+2
5mo
ByteDance
seed-oss-36b-instruct
ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.
thinking budget
+3
5mo
Black-forest-labs
FLUX.1-Kontext-dev
FLUX.1 Kontext is a multimodal model that enables in-context image generation and editing.
Image Generation
+2
6mo
Qwen
qwen2.5-coder-7b-instruct
Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.
code completion
+3
9mo
NVIDIA
mistral-nemo-minitron-8b-base
State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.
language generation
+3
1y
THUDM
chatglm3-6b
Supports Chinese and English for tasks including chatbots, content generation, coding, and translation.
Text Translation
+4
7mo
Microsoft
phi-3-small-128k-instruct
Cutting-edge, lightweight, long-context open language model excelling in high-quality reasoning.
chat
+4
9mo